{"id":7395,"date":"2023-09-07T10:21:30","date_gmt":"2023-09-07T18:21:30","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=7395"},"modified":"2025-04-24T17:14:21","modified_gmt":"2025-04-24T17:14:21","slug":"the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/","title":{"rendered":"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II)"},"content":{"rendered":"\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\">\n\n\n\n<div class=\"eo ep eq er es\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg\" alt=\"\" width=\"2000\" height=\"1324\"><\/figure><div class=\"ld bg\">\n<figure class=\"le lf lg lh li ld bg paragraph-image\"><picture><\/picture><\/figure>\n<\/div>\n<div class=\"ab ca\">\n<div class=\"ch bg dx dy dz ea\">\n<p id=\"f126\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">In <a href=\"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-i\/\">part 1<\/a>, I introduced the field of Natural Language Processing (NLP) and the deep learning movement that\u2019s powered it. I also walked you through 3 critical concepts in NLP: <strong class=\"be mj\">text embeddings<\/strong> (vector representations of strings), <strong class=\"be mj\">machine translation<\/strong> (using neural networks to translate languages), and <strong class=\"be mj\">dialogue &amp; conversation<\/strong> (tech that can hold conversations with humans in real time). 
In part 2, I\u2019ll cover four more important NLP techniques that you should pay attention to if you want to keep up with this fast-growing research field.<\/p>\n<h1 id=\"5b69\" class=\"mk ml ev be mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh bj\" data-selectable-paragraph=\"\">Technique 4: Sentiment Analysis<\/h1>\n<p id=\"133d\" class=\"pw-post-body-paragraph ll lm ev be b ln ni lp lq lr nj lt lu lv nk lx ly lz nl mb mc md nm mf mg mh eo bj\" data-selectable-paragraph=\"\">Human communication isn\u2019t just words and their explicit meanings. Instead, it\u2019s nuanced and complex. You can tell from the way a friend asks you a question whether they\u2019re bored, angry, or curious. You can tell from word choice and punctuation whether a customer is getting furious, even in a completely text-based conversation.<\/p>\n<p id=\"b23b\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">You can read an Amazon review for a product and understand whether the reviewer liked or disliked it even if they never directly said so. For computers to truly understand the way humans communicate every day, they need to understand more than the objective definitions of words; they need to understand our sentiments, what we really mean. 
<a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/training-a-sentiment-analysis-core-ml-model-28823b21322c\" target=\"_blank\" rel=\"noopener ugc nofollow\">Sentiment analysis<\/a> is this process of interpreting the meaning of larger text units (entities, descriptive terms, facts, arguments, stories) by the semantic composition of smaller elements.<\/p>\n<p id=\"ea6a\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">The traditional approach to sentiment analysis is to treat a sentence as a bag-of-words and to consult a curated list of \u201cpositive\u201d and \u201cnegative\u201d words to determine the sentiment of that particular sentence. This would require hand-designed features to capture the sentiment, which is extremely time-consuming and unscalable.<\/p>\n<p id=\"c3b2\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">The modern deep learning approach for sentiment analysis can be used for morphology, syntax, and logical semantics, of which the most effective one is <a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/detecting-the-language-of-a-persons-name-using-pytorch-rnn-29a9090c20f2\" target=\"_blank\" rel=\"noopener ugc nofollow\">Recursive Neural Networks<\/a>. As the name implies, the main assumption for Recursive Neural Net development is such that recursion is a natural way for describing language. 
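To make the traditional lexicon-based approach concrete, here is a minimal bag-of-words scorer. The word lists are invented for illustration only, not a real curated sentiment lexicon:

```python
# Minimal lexicon-based sentiment scorer: count hits against
# hand-curated "positive" and "negative" word lists (toy lists).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "angry"}

def lexicon_sentiment(sentence):
    """Label a sentence by the balance of positive vs. negative word hits."""
    words = sentence.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

The brittleness is obvious: the scorer only sees words someone thought to list, ignores negation ("not great"), and misses every synonym, which is exactly why hand-designed features don\u2019t scale.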
Recursion is useful in disambiguation, helpful for some tasks to refer to specific phrases, and works extremely well for tasks that use a grammatical tree structure.<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*21mXicYKLNvqAuu1\" alt=\"\" width=\"700\" height=\"311\"><\/figure><div class=\"nn no np\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*21mXicYKLNvqAuu1 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*21mXicYKLNvqAuu1 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*21mXicYKLNvqAuu1 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*21mXicYKLNvqAuu1 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*21mXicYKLNvqAuu1 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*21mXicYKLNvqAuu1 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*21mXicYKLNvqAuu1 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*21mXicYKLNvqAuu1 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*21mXicYKLNvqAuu1 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*21mXicYKLNvqAuu1 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*21mXicYKLNvqAuu1 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*21mXicYKLNvqAuu1 828w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*21mXicYKLNvqAuu1 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*21mXicYKLNvqAuu1 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"7c5e\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Recursive Neural Networks are perfect for settings that have a nested hierarchy and an intrinsic recursive structure. If we think about a sentence, doesn\u2019t this have such a structure? Take the sentence \u201cA big crowd violently attacks the unarmed police.\u201d First, we break apart the sentence into its respective Noun Phrase and Verb Phrase \u2014 \u201cA big crowd\u201d and \u201cviolently attacks the unarmed police.\u201d But there\u2019s a noun phrase within that verb phrase, right? \u201cviolently attacks\u201d and \u201cunarmed police.\u201d Seems pretty recursive to me.<\/p>\n<p id=\"930e\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">The syntactic rules of language are highly recursive. So we take advantage of that recursive structure with a model that respects it! 
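To make the recursive idea concrete, here is a toy sketch of the composition step at the heart of a Recursive Neural Network: the same weight matrix combines the two child vectors at every node of the parse tree. The dimensions and random weights are arbitrary, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                  # toy embedding dimension
W = rng.standard_normal((d, 2 * d))    # composition weights, shared by all nodes

def compose(left, right):
    """Parent vector = tanh(W [left; right]) -- the same function at every node."""
    return np.tanh(W @ np.concatenate([left, right]))

# "A big crowd": compose leaf vectors bottom-up, following the parse tree
a, big, crowd = (rng.standard_normal(d) for _ in range(3))
noun_phrase = compose(a, compose(big, crowd))
print(noun_phrase.shape)  # (4,)
```

Note that the parent vector has the same dimensionality as each child, so a phrase of any length collapses to one fixed-size vector by repeated composition.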
Another benefit of modeling sentences with RNNs is that we can now input sentences of arbitrary length. Variable-length input was long a major obstacle to using neural nets in NLP, and working around it required very clever tricks to squeeze sentences of different lengths into input vectors of equal size.<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*VcovzyASsk3LQCHf\" alt=\"\" width=\"700\" height=\"175\"><\/figure><div class=\"nn no nz\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*VcovzyASsk3LQCHf 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*VcovzyASsk3LQCHf 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*VcovzyASsk3LQCHf 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*VcovzyASsk3LQCHf 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*VcovzyASsk3LQCHf 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*VcovzyASsk3LQCHf 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*VcovzyASsk3LQCHf 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*VcovzyASsk3LQCHf 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*VcovzyASsk3LQCHf 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*VcovzyASsk3LQCHf 750w, 
https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*VcovzyASsk3LQCHf 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*VcovzyASsk3LQCHf 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*VcovzyASsk3LQCHf 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*VcovzyASsk3LQCHf 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"dcf0\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">The <a class=\"af mi\" href=\"https:\/\/ai.stanford.edu\/~ang\/papers\/icml11-ParsingWithRecursiveNeuralNetworks.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Standard RNN<\/a> is the most basic version of a Recursive Neural Network. Its max-margin structure-prediction architecture can successfully recover such structure in both complex scene images and sentences, and it provides a competitive syntactic parser for natural language sentences from the <a class=\"af mi\" href=\"https:\/\/catalog.ldc.upenn.edu\/ldc99t42\" target=\"_blank\" rel=\"noopener ugc nofollow\">Penn Treebank<\/a>. For reference, the Penn Treebank was the first large-scale treebank dataset: 2,499 stories selected for syntactic annotation from a three-year Wall Street Journal (WSJ) collection of 98,732 stories. 
Additionally, it outperforms alternative approaches for semantic scene segmentation, annotation, and classification.<\/p>\n<p id=\"9aa7\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">However, the standard RNN captures neither the full syntactic nor semantic richness of linguistic phrases. The <a class=\"af mi\" href=\"https:\/\/nlp.stanford.edu\/pubs\/SocherBauerManningNg_ACL2013.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Syntactically Untied RNN<\/a>, otherwise known as Compositional Vector Grammar (CVG), is a major upgrade that addresses this issue. It uses a syntactically untied recursive neural network that learns syntactic-semantic and compositional vector representations. The model is fast to train and implemented as efficiently as the standard RNN. It learns a soft notion of head words and improves performance on the types of ambiguities that require semantic information.<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*JDPFqECmMbQ4J4TU\" alt=\"\" width=\"700\" height=\"394\"><\/figure><div class=\"nn no oa\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*JDPFqECmMbQ4J4TU 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*JDPFqECmMbQ4J4TU 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*JDPFqECmMbQ4J4TU 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*JDPFqECmMbQ4J4TU 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*JDPFqECmMbQ4J4TU 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*JDPFqECmMbQ4J4TU 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*JDPFqECmMbQ4J4TU 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*JDPFqECmMbQ4J4TU 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*JDPFqECmMbQ4J4TU 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*JDPFqECmMbQ4J4TU 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*JDPFqECmMbQ4J4TU 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*JDPFqECmMbQ4J4TU 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*JDPFqECmMbQ4J4TU 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*JDPFqECmMbQ4J4TU 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"7acd\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Another evolution is the <a class=\"af mi\" href=\"https:\/\/ai.stanford.edu\/~ang\/papers\/emnlp12-SemanticCompositionalityRecursiveMatrixVectorSpaces.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Matrix-Vector RNN<\/a>, which is capable of capturing the compositional meaning of even much longer phrases. 
The model assigns a vector and a matrix to every node in a parse tree: the vector captures the inherent meaning of the constituent, while the matrix captures how it changes the meaning of neighboring words or phrases. This matrix-vector RNN can learn the meaning of operators in propositional logic and natural language.<\/p>\n<p id=\"cd75\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">As a result, the model obtains state of the art performance on three different experiments:<\/p>\n<ul class=\"\">\n<li id=\"1434\" class=\"ll lm ev be b ln lo lp lq lr ls lt lu ob lw lx ly oc ma mb mc od me mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Predicting fine-grained sentiment distributions of adverb-adjective pairs.<\/li>\n<li id=\"44eb\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Classifying sentiment labels of movie reviews.<\/li>\n<li id=\"b539\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Classifying semantic relationships such as cause-effect or topic-message between nouns using the syntactic path between them.<\/li>\n<\/ul>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*pnf9LImvAZ4zKJBv\" alt=\"\" width=\"700\" height=\"394\"><\/figure><div class=\"nn no oa\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*pnf9LImvAZ4zKJBv 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*pnf9LImvAZ4zKJBv 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*pnf9LImvAZ4zKJBv 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*pnf9LImvAZ4zKJBv 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*pnf9LImvAZ4zKJBv 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*pnf9LImvAZ4zKJBv 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*pnf9LImvAZ4zKJBv 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*pnf9LImvAZ4zKJBv 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*pnf9LImvAZ4zKJBv 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*pnf9LImvAZ4zKJBv 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*pnf9LImvAZ4zKJBv 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*pnf9LImvAZ4zKJBv 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*pnf9LImvAZ4zKJBv 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*pnf9LImvAZ4zKJBv 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"fca7\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">The most powerful RNN 
model for sentiment analysis developed thus far is <a class=\"af mi\" href=\"https:\/\/nlp.stanford.edu\/~socherr\/EMNLP2013_RNTN.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Recursive Neural Tensor Network<\/a>, which has a tree structure with a neural net at each node. This model can be used for boundary segmentation to determine which word groups are positive and which are negative. The same applies to sentences as a whole. When trained on the <a class=\"af mi\" href=\"https:\/\/nlp.stanford.edu\/sentiment\/treebank.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">Sentiment Treebank<\/a>, this model outperformed all previous methods on several metrics by more than 5%. Currently, it\u2019s the only model that can accurately capture the effects of negation and its scope at various tree levels for both positive and negative phrases.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"eo ep eq er es\">\n<div class=\"ab ca\">\n<div class=\"ch bg dx dy dz ea\">\n<blockquote class=\"ou\"><p id=\"8311\" class=\"ov ow ev be ox oy oz pa pb pc pd mh hb\" data-selectable-paragraph=\"\">The latest in deep learning \u2014 from a source you can trust. 
<a class=\"af mi\" href=\"https:\/\/www.deeplearningweekly.com\/?utm_campaign=dlweekly-newsletter-expertise1&amp;utm_source=heartbeat\" target=\"_blank\" rel=\"noopener ugc nofollow\">Sign up for a weekly dive into all things deep learning<\/a>, curated by experts working in the field.<\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"eo ep eq er es\">\n<div class=\"ab ca\">\n<div class=\"ch bg dx dy dz ea\">\n<h1 id=\"333c\" class=\"mk ml ev be mm mn pe mp mq mr pf mt mu mv pg mx my mz ph nb nc nd pi nf ng nh bj\" data-selectable-paragraph=\"\">Technique 5: Question Answering<\/h1>\n<p id=\"52de\" class=\"pw-post-body-paragraph ll lm ev be b ln ni lp lq lr nj lt lu lv nk lx ly lz nl mb mc md nm mf mg mh eo bj\" data-selectable-paragraph=\"\">The idea of a <strong class=\"be mj\">Question Answering (QA) system<\/strong> is to extract information directly from documents, conversations, online searches, and elsewhere to meet a user\u2019s information needs. Rather than make the user read through an entire document, a QA system is designed to give a short, concise answer. Nowadays, a QA system can be combined very easily with other NLP systems like chatbots, and some QA systems even go beyond searching text documents, extracting information from collections of pictures.<\/p>\n<p id=\"1e6f\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">In fact, many NLP problems can be framed as question answering. The paradigm is simple: we issue a query, and the machine provides a response. By reading through a document, or a set of instructions, an intelligent system should be able to answer a wide variety of questions. 
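As a bare-bones illustration of that query-response paradigm (far simpler than the neural architectures this section discusses), a system can score each document sentence by its word overlap with the question and return the best match. Everything here is invented for the example:

```python
def answer(question, document_sentences):
    """Return the document sentence sharing the most words with the question."""
    q_words = set(question.lower().rstrip("?").split())
    def overlap(sent):
        return len(q_words & set(sent.lower().rstrip(".").split()))
    return max(document_sentences, key=overlap)

doc = [
    "The Penn Treebank was built from Wall Street Journal stories.",
    "Recursive networks model sentences as trees.",
]
print(answer("What was the Penn Treebank built from?", doc))
```

Such keyword matching only ever extracts verbatim sentences; the memory-based models below instead learn to reason over the facts and generate an answer.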
So naturally, we\u2019d like to design a model that can be used for general QA.<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*y8-7H0vNOtRpM-r9\" alt=\"\" width=\"700\" height=\"359\"><\/figure><div class=\"nn no nz\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*y8-7H0vNOtRpM-r9 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*y8-7H0vNOtRpM-r9 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*y8-7H0vNOtRpM-r9 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*y8-7H0vNOtRpM-r9 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*y8-7H0vNOtRpM-r9 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*y8-7H0vNOtRpM-r9 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*y8-7H0vNOtRpM-r9 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*y8-7H0vNOtRpM-r9 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*y8-7H0vNOtRpM-r9 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*y8-7H0vNOtRpM-r9 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*y8-7H0vNOtRpM-r9 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*y8-7H0vNOtRpM-r9 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*y8-7H0vNOtRpM-r9 1100w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*y8-7H0vNOtRpM-r9 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"6ca3\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">A powerful deep learning architecture, known as <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1506.07285.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">dynamic memory network<\/a>(DMN), has been developed and optimized specifically for QA problems. Given a training set of input sequences (knowledge) and questions, it can form episodic memories, and use them to generate relevant answers. The architecture has the following components:<\/p>\n<ul class=\"\">\n<li id=\"f101\" class=\"ll lm ev be b ln lo lp lq lr ls lt lu ob lw lx ly oc ma mb mc od me mf mg mh oe of og bj\" data-selectable-paragraph=\"\">The <strong class=\"be mj\">Semantic Memory Module<\/strong> (analogous to a knowledge base) consists of pre-trained GloVe vectors that are used to create sequences of word embeddings from input sentences. These vectors will act as inputs to the model.<\/li>\n<li id=\"9e1f\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">The <strong class=\"be mj\">Input Module<\/strong> processes the input vectors associated with a question into a set of vectors termed <em class=\"pj\">facts<\/em>. 
This module is implemented using a <a class=\"af mi\" href=\"https:\/\/towardsdatascience.com\/understanding-gru-networks-2ef37df6c9be\" target=\"_blank\" rel=\"noopener\">Gated Recurrent Unit<\/a>. The GRU enables the network to learn if the sentence currently under consideration is relevant or has nothing to do with the answer.<\/li>\n<li id=\"2b0e\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">The <strong class=\"be mj\">Question Module<\/strong> processes the question word by word, and outputs a vector using the same GRU as the input module, and the same weights. Both facts and questions are encoded as embeddings.<\/li>\n<li id=\"d14b\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">The <strong class=\"be mj\">Episodic Memory Module<\/strong> receives the fact and question vectors extracted from the input and encoded as embeddings. This uses a process inspired by the brain\u2019s hippocampus, which can retrieve temporal states that are triggered by some response, like sights or sounds.<\/li>\n<li id=\"0541\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Finally, the <strong class=\"be mj\">Answer Module<\/strong> generates an appropriate response. By the final pass, the episodic memory should contain all the information required to answer the question. 
This module uses another GRU, trained with the cross-entropy error classification of the correct sequence, which can then be converted back to natural language.<\/li>\n<\/ul>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*5K06B3-K08ypyZjB\" alt=\"\" width=\"700\" height=\"297\"><\/figure><div class=\"nn no pk\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*5K06B3-K08ypyZjB 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*5K06B3-K08ypyZjB 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*5K06B3-K08ypyZjB 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*5K06B3-K08ypyZjB 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*5K06B3-K08ypyZjB 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*5K06B3-K08ypyZjB 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*5K06B3-K08ypyZjB 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*5K06B3-K08ypyZjB 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*5K06B3-K08ypyZjB 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*5K06B3-K08ypyZjB 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*5K06B3-K08ypyZjB 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*5K06B3-K08ypyZjB 828w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*5K06B3-K08ypyZjB 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*5K06B3-K08ypyZjB 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"5b1a\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">DMN not only did extremely well for QA tasks, but also outperformed other architectures for sentiment analysis and part-of-speech tagging. Since its inception, there have been major improvements to Dynamic Memory Networks to further improve their accuracy on question answering tasks, including:<\/p>\n<ul class=\"\">\n<li id=\"36ba\" class=\"ll lm ev be b ln lo lp lq lr ls lt lu ob lw lx ly oc ma mb mc od me mf mg mh oe of og bj\" data-selectable-paragraph=\"\"><a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1603.01417.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Dynamic Memory Networks for Visual and Textual Question Answering<\/a> is basically DMN being applied to images. Its memory and input modules are upgraded in order to be able to answer visual questions. 
This model improves the state of the art on many benchmark Visual Question Answering datasets without supporting fact supervision.<\/li>\n<li id=\"10f9\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\"><a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1611.01604.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Dynamic Coattention Networks for Question Answering<\/a> addresses the problem of recovering from local maxima corresponding to incorrect answers. It first fuses co-dependent representations of the question and the document in order to focus on relevant parts of both. Then, a dynamic pointing decoder iterates over potential answer spans; this iterative procedure is what allows the model to escape initial local maxima that correspond to incorrect answers.<\/li>\n<\/ul>\n<h1 id=\"0f05\" class=\"mk ml ev be mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh bj\" data-selectable-paragraph=\"\">Technique 6: Text Summarization<\/h1>\n<p id=\"b132\" class=\"pw-post-body-paragraph ll lm ev be b ln ni lp lq lr nj lt lu lv nk lx ly lz nl mb mc md nm mf mg mh eo bj\" data-selectable-paragraph=\"\">It\u2019s very difficult for human beings to manually summarize large documents of text. Text summarization is the problem in NLP of creating short, accurate, and fluent summaries for source documents. It\u2019s become an important and timely tool for assisting and interpreting text information in today\u2019s fast-growing information age. 
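Before looking at neural approaches, it helps to see how little machinery the simplest form of summarization needs. Here is a minimal Python sketch of classic frequency-based extractive summarization (the regex-based sentence splitting and raw-count scoring are simplifying assumptions):

```python
import re
from collections import Counter

def summarize(text, num_sentences=3):
    # Classic frequency-based extractive summarization: count word
    # frequencies over the whole document, score each sentence by the
    # frequency of the words it contains, then return the top-scoring
    # sentences in their original order.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'[a-z]+', text.lower()))

    def score(sentence):
        return sum(freq[w] for w in re.findall(r'[a-z]+', sentence.lower()))

    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    return ' '.join(s for s in sentences if s in top)
```

A production version would also drop stop words and cap the vocabulary at the most common words, but the core scoring idea stays the same.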
With push notifications and article digests gaining more and more traction, the demand for intelligent and accurate summaries of long pieces of text grows every day.<\/p>\n<p id=\"33f0\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Automatic summarization of text works by first calculating the word frequencies for the entire text document. Then, the 100 most common words are stored and sorted. Each sentence is then scored based on how many high-frequency words it contains, with higher-frequency words being worth more. Finally, the top X sentences are taken and sorted based on their position in the original text.<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*CS0NqiXahggQjabL\" alt=\"\" width=\"700\" height=\"438\"><\/figure><div class=\"nn no nz\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*CS0NqiXahggQjabL 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*CS0NqiXahggQjabL 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*CS0NqiXahggQjabL 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*CS0NqiXahggQjabL 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*CS0NqiXahggQjabL 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*CS0NqiXahggQjabL 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*CS0NqiXahggQjabL 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 
2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*CS0NqiXahggQjabL 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*CS0NqiXahggQjabL 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*CS0NqiXahggQjabL 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*CS0NqiXahggQjabL 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*CS0NqiXahggQjabL 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*CS0NqiXahggQjabL 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*CS0NqiXahggQjabL 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"ec0c\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Because it keeps things simple and general-purpose, the automatic text summarization algorithm is able to function in a variety of situations that other implementations might struggle with, such as documents containing foreign languages or unique word associations that aren\u2019t found in standard English-language corpora.<\/p>\n<p id=\"247b\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">There are two fundamental approaches to text summarization: <strong class=\"be mj\">extractive 
<\/strong>and <strong class=\"be mj\">abstractive<\/strong>. The former extracts words and word phrases from the original text to create a summary. The latter learns an internal language representation to generate more human-like summaries, paraphrasing the intent of the original text.<\/p>\n<p id=\"ef8d\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">The methods in <a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/extractive-text-summarization-using-neural-networks-5845804c7701\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be mj\">extractive summarization<\/strong><\/a> work by selecting a subset of the phrases or sentences in the actual article to form the summary. <mark class=\"aeb aec ao\"><strong class=\"be mj\">LexRank<\/strong><\/mark><mark class=\"aeb aec ao\"> and <\/mark><mark class=\"aeb aec ao\"><strong class=\"be mj\">TextRank<\/strong><\/mark><mark class=\"aeb aec ao\"> are well-known extractive summarization algorithms.<\/mark> Both of them use a variation of the Google PageRank algorithm.<\/p>\n<ul class=\"\">\n<li id=\"f72d\" class=\"ll lm ev be b ln lo lp lq lr ls lt lu ob lw lx ly oc ma mb mc od me mf mg mh oe of og bj\" data-selectable-paragraph=\"\"><a class=\"af mi\" href=\"https:\/\/pypi.org\/project\/lexrank\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">LexRank <\/a>is an unsupervised graph-based algorithm that uses IDF-modified Cosine as the similarity measure between two sentences. This similarity is used as the weight of the graph edge between the two sentences. 
LexRank also incorporates an intelligent post-processing step that makes sure top sentences chosen for the summary are not too similar to each other.<\/li>\n<li id=\"6699\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\"><a class=\"af mi\" href=\"https:\/\/www.analyticsvidhya.com\/blog\/2018\/11\/introduction-text-summarization-textrank-python\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">TextRank<\/a> is a similar algorithm to LexRank with a few enhancements, such as using lemmatization instead of stemming, incorporating Part-Of-Speech tagging and <a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/natural-language-in-ios-12-customizing-tag-schemes-and-named-entity-recognition-caf2da388a9f\" target=\"_blank\" rel=\"noopener ugc nofollow\">Named Entity Resolution<\/a>, extracting key phrases from the article, and extracting summary sentences based on those phrases. Along with a summary of the article, TextRank also extracts meaningful key phrases from the article.<\/li>\n<\/ul>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*73TZw9Ltv1G4MQVa\" alt=\"\" width=\"700\" height=\"211\"><\/figure><div class=\"nn no nz\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*73TZw9Ltv1G4MQVa 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*73TZw9Ltv1G4MQVa 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*73TZw9Ltv1G4MQVa 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*73TZw9Ltv1G4MQVa 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*73TZw9Ltv1G4MQVa 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*73TZw9Ltv1G4MQVa 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*73TZw9Ltv1G4MQVa 1400w\" 
type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*73TZw9Ltv1G4MQVa 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*73TZw9Ltv1G4MQVa 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*73TZw9Ltv1G4MQVa 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*73TZw9Ltv1G4MQVa 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*73TZw9Ltv1G4MQVa 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*73TZw9Ltv1G4MQVa 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*73TZw9Ltv1G4MQVa 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"4d3e\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Models for <strong class=\"be mj\">abstractive summarization<\/strong> fall under the larger umbrella of deep learning. There have been certain breakthroughs in text summarization using deep learning. 
Below are some of the most notable published results by some of the biggest companies in the field of NLP:<\/p>\n<ul class=\"\">\n<li id=\"8784\" class=\"ll lm ev be b ln lo lp lq lr ls lt lu ob lw lx ly oc ma mb mc od me mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Facebook\u2019s <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1509.00685.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Neural Attention<\/a> is a neural network architecture that utilizes a local attention-based model capable of generating each word of the summary conditioned on the input sentence.<\/li>\n<li id=\"4794\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Google Brain\u2019s <a class=\"af mi\" href=\"https:\/\/ai.googleblog.com\/2016\/08\/text-summarization-with-tensorflow.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">Sequence-to-Sequence<\/a> model follows an encoder-decoder architecture. The encoder is responsible for reading the source document and encoding it to an internal representation. 
The decoder is a language model responsible for generating each word in the output summary using the encoded representation of the source document.<\/li>\n<li id=\"0b79\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">IBM Watson uses a similar <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1602.06023.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Sequence-to-Sequence<\/a> model, but with attention and bidirectional recurrent neural network features.<\/li>\n<\/ul>\n<h1 id=\"f682\" class=\"mk ml ev be mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh bj\" data-selectable-paragraph=\"\">Technique 7: Attention Mechanism<\/h1>\n<p id=\"36cf\" class=\"pw-post-body-paragraph ll lm ev be b ln ni lp lq lr nj lt lu lv nk lx ly lz nl mb mc md nm mf mg mh eo bj\" data-selectable-paragraph=\"\">Attention Mechanisms in Neural Networks are loosely based on the visual attention mechanism found in humans. Human visual attention is well studied, and while different models exist, all of them essentially come down to being able to focus on a certain region of an image with \u201chigh resolution\u201d while perceiving the surrounding image in \u201clow resolution,\u201d and then adjusting the focal point over time.<\/p>\n<p id=\"d8d7\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Imagine you\u2019re reading a whole essay: instead of going through each word or character sequentially, you subconsciously focus on a few sentences of highest information density and filter out the rest. Your attention effectively captures contextual information in a hierarchical manner, such that it\u2019s sufficient for decision making while reducing overheads. 
It is exactly this mechanism that attention in neural networks tries to emulate.<\/p>\n<p id=\"76ae\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">So why is this important? Models such as <a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/a-beginners-guide-to-implementing-long-short-term-memory-networks-lstm-eb7a2ff09a27\" target=\"_blank\" rel=\"noopener ugc nofollow\">LSTM<\/a> and GRU rely on reading a complete sentence and compressing all the information into a fixed-length vector. This requires sophisticated <a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/introduction-to-automated-feature-engineering-using-deep-feature-synthesis-dfs-3feb69a7c00b\" target=\"_blank\" rel=\"noopener ugc nofollow\">feature engineering<\/a> based on the statistical properties of text. Squeezing a sentence of hundreds of words into one fixed-length vector will surely lead to information loss, inadequate translation, and so on.<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:666\/0*E5xgyeKg3Ga-HPPT\" alt=\"\" width=\"666\" height=\"495\"><\/figure><div class=\"nn no pl\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*E5xgyeKg3Ga-HPPT 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*E5xgyeKg3Ga-HPPT 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*E5xgyeKg3Ga-HPPT 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*E5xgyeKg3Ga-HPPT 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*E5xgyeKg3Ga-HPPT 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*E5xgyeKg3Ga-HPPT 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1332\/0*E5xgyeKg3Ga-HPPT 1332w\" type=\"image\/webp\" 
sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 666px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*E5xgyeKg3Ga-HPPT 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*E5xgyeKg3Ga-HPPT 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*E5xgyeKg3Ga-HPPT 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*E5xgyeKg3Ga-HPPT 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*E5xgyeKg3Ga-HPPT 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*E5xgyeKg3Ga-HPPT 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1332\/0*E5xgyeKg3Ga-HPPT 1332w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 666px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"95d3\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">With an attention mechanism, we no longer try encode the full-surge sentence into a fixed-length vector. Rather, we allow the decoder to attend to different parts of the source sentence at each step of the output generation. 
We let the model learn what to attend to based on the input sentence and what it has produced so far.<\/p>\n<p id=\"f428\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">In the image above, from <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1508.04025.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Effective Approaches to Attention-Based Neural Machine Translation<\/a>, blue represents the encoder and red the decoder. We can see that the context vector takes the outputs of all encoder cells as input to compute a probability distribution over source-language words for each word the decoder generates. By utilizing this mechanism, the decoder can capture global information rather than inferring solely from one hidden state.<\/p>\n<p id=\"d1f8\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Besides Machine Translation, the attention model works on a variety of other NLP tasks. In <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1502.03044.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Show, Attend and Tell: Neural Image Caption Generation with Visual Attention<\/a>, the authors apply attention mechanisms to the problem of generating image descriptions. They use a <a class=\"af mi\" href=\"https:\/\/heartbeat.comet.ml\/a-beginners-guide-to-convolutional-neural-networks-cnn-cf26c5ee17ed\" target=\"_blank\" rel=\"noopener ugc nofollow\">Convolutional Neural Network<\/a> to encode the image, and a Recurrent Neural Network with attention mechanisms to generate a description. 
By visualizing the attention weights, they interpret what the model is looking at while generating a word:<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*WnDU6Uss-t9el013\" alt=\"\" width=\"700\" height=\"313\"><\/figure><div class=\"nn no pm\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*WnDU6Uss-t9el013 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*WnDU6Uss-t9el013 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*WnDU6Uss-t9el013 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*WnDU6Uss-t9el013 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*WnDU6Uss-t9el013 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*WnDU6Uss-t9el013 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*WnDU6Uss-t9el013 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*WnDU6Uss-t9el013 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*WnDU6Uss-t9el013 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*WnDU6Uss-t9el013 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*WnDU6Uss-t9el013 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*WnDU6Uss-t9el013 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*WnDU6Uss-t9el013 1100w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*WnDU6Uss-t9el013 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"ee23\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">In <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1412.7449.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Grammar as a Foreign Language<\/a>, the authors use a Recurrent Neural Network with an attention mechanism to generate sentence parse trees. 
The visualized attention matrix gives insight into how the network generates those trees:<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*bp7ba7jEeaot3Ru_\" alt=\"\" width=\"700\" height=\"651\"><\/figure><div class=\"nn no pn\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*bp7ba7jEeaot3Ru_ 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*bp7ba7jEeaot3Ru_ 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*bp7ba7jEeaot3Ru_ 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*bp7ba7jEeaot3Ru_ 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*bp7ba7jEeaot3Ru_ 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*bp7ba7jEeaot3Ru_ 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*bp7ba7jEeaot3Ru_ 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*bp7ba7jEeaot3Ru_ 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*bp7ba7jEeaot3Ru_ 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*bp7ba7jEeaot3Ru_ 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*bp7ba7jEeaot3Ru_ 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*bp7ba7jEeaot3Ru_ 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*bp7ba7jEeaot3Ru_ 1100w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*bp7ba7jEeaot3Ru_ 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"af4b\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">In <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1506.03340.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Teaching Machines to Read and Comprehend<\/a>, the authors use a Recurrent Neural Network to read a text, read a question, and then produce an answer. 
By visualizing the attention matrix, they show where the network looks while trying to find the answer to the question:<\/p>\n<figure class=\"nq nr ns nt nu ld nn no paragraph-image\">\n<div class=\"nv nw go nx bg ny\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg lj lk c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*mp6k1YAaMxaL5Gx-\" alt=\"\" width=\"700\" height=\"306\"><\/figure><div class=\"nn no nz\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*mp6k1YAaMxaL5Gx- 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*mp6k1YAaMxaL5Gx- 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*mp6k1YAaMxaL5Gx- 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*mp6k1YAaMxaL5Gx- 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*mp6k1YAaMxaL5Gx- 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*mp6k1YAaMxaL5Gx- 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*mp6k1YAaMxaL5Gx- 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*mp6k1YAaMxaL5Gx- 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*mp6k1YAaMxaL5Gx- 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*mp6k1YAaMxaL5Gx- 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*mp6k1YAaMxaL5Gx- 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*mp6k1YAaMxaL5Gx- 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*mp6k1YAaMxaL5Gx- 1100w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*mp6k1YAaMxaL5Gx- 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<\/figure>\n<p id=\"1004\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">Attention does come at a cost, however. We need to calculate an attention value for each combination of input and output word. If you have a 100-word input sequence and generate a 100-word output sequence, that would be 10,000 attention values. If you do character-level computations and deal with sequences consisting of hundreds of tokens, the above mechanisms can become prohibitively expensive.<\/p>\n<h1 id=\"6b88\" class=\"mk ml ev be mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh bj\" data-selectable-paragraph=\"\">Natural Language Processing Obstacles<\/h1>\n<p id=\"794f\" class=\"pw-post-body-paragraph ll lm ev be b ln ni lp lq lr nj lt lu lv nk lx ly lz nl mb mc md nm mf mg mh eo bj\" data-selectable-paragraph=\"\">It should be noted that for each of the 7 NLP techniques I have discussed over these 2 posts, researchers have had to deal with a variety of obstacles: limits of the algorithms, scalability of the models, our still-vague understanding of human language\u2026 The good news is that the development of this field seems like a giant open-source project: researchers keep building better models to solve the existing problems and sharing their results with the community. 
Here are the major obstacles in NLP that have been addressed by recent academic research progress:<\/p>\n<ul class=\"\">\n<li id=\"7464\" class=\"ll lm ev be b ln lo lp lq lr ls lt lu ob lw lx ly oc ma mb mc od me mf mg mh oe of og bj\" data-selectable-paragraph=\"\">There is no single model architecture with consistent state-of-the-art results <strong class=\"be mj\">across tasks<\/strong>. For example, in Question Answering, we have <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1503.08895.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Strongly Supervised End-to-End Memory Networks<\/a>; in Sentiment Analysis, we have <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1503.00075.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tree-LSTMs<\/a>; and in Sequence Tagging, we have <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1508.01991.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Bidirectional LSTM-CRF<\/a>. The <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1506.07285.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Dynamic Memory Network<\/a> I mentioned earlier in the Question Answering section went some way toward addressing this challenge, as it could perform consistently well across multiple domains.<\/li>\n<li id=\"ab71\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">A powerful approach in machine learning is <strong class=\"be mj\">multi-task learning<\/strong>, which shares representations between related tasks in order to enable the model to generalize better on the original task. However, fully-joint multi-task learning is hard: it\u2019s usually restricted to lower layers, it\u2019s useful only if the tasks are related (and often hurts performance if they are not), and it uses the same decoder\/classifier in the proposed model. 
In <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1611.01587.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">A Joint Many-Task Model: Growing a NN for Multiple NLP Tasks<\/a>, the authors pre-define a hierarchical architecture consisting of several NLP tasks as a joint model for multi-task learning. The model includes character n-grams and short-circuits as well as a state-of-the-art, purely feedforward parser, capable of performing dependency parsing, multi-sentence tasks, and joint training.<\/li>\n<li id=\"6e8d\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\"><strong class=\"be mj\">Zero-shot learning<\/strong> is the ability to solve a task despite not having received any training examples of that task. There aren\u2019t many models capable of doing zero shot learning for NLP, as answers can only be predicted if they were seen during training and as part of the softmax function. In order to tackle this obstacle, the authors of <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1609.07843.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Pointer Sentinel Mixture Models<\/a> have combined a standard LSTM softmax with Pointer Networks in a mixture model. The pointer networks help with rare words and long-term dependencies, while the standard softmax can refer to words that are not in the input.<\/li>\n<li id=\"cc9e\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Another challenge is the problem of <strong class=\"be mj\">duplicate word representations<\/strong>, where different encodings for the encoder and decoder in a model result in duplicate parameters \/ meanings. 
The simplest solution is to tie the word vectors together and train a single set of weights jointly, as demonstrated in <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1611.01462.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling<\/a>.<\/li>\n<li id=\"ae89\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Another big obstacle is that Recurrent Neural Networks, the basic building block of most deep NLP techniques, are quite slow compared to, say, Convolutional or Feedforward Neural Nets. <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1611.01576.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Quasi-Recurrent Neural Networks<\/a> take the best parts of RNNs and CNNs to speed up training, using convolutions for parallelism across time and element-wise gated recurrence for parallelism across channels. This approach trains faster than comparable LSTM models while matching or exceeding their accuracy on language modeling and sentiment analysis.<\/li>\n<li id=\"85e9\" class=\"ll lm ev be b ln oh lp lq lr oi lt lu ob oj lx ly oc ok mb mc od ol mf mg mh oe of og bj\" data-selectable-paragraph=\"\">Finally, in NLP, designing neural architectures by hand is slow and requires a lot of expertise. What if we could use AI to find the right architecture for any problem? <strong class=\"be mj\">Architecture search<\/strong> \u2014 using machine learning to automate the design of artificial neural networks \u2014 aims to do exactly that, and <a class=\"af mi\" href=\"https:\/\/arxiv.org\/pdf\/1611.01578.pdf\" target=\"_blank\" rel=\"noopener ugc nofollow\">Neural architecture search with reinforcement learning<\/a> from Google Brain is the most viable solution developed so far. 
The authors use a recurrent network to generate model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.<\/li>\n<\/ul>\n<h1 id=\"3633\" class=\"mk ml ev be mm mn mo mp mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh bj\" data-selectable-paragraph=\"\">Conclusion<\/h1>\n<p id=\"1b12\" class=\"pw-post-body-paragraph ll lm ev be b ln ni lp lq lr nj lt lu lv nk lx ly lz nl mb mc md nm mf mg mh eo bj\" data-selectable-paragraph=\"\">So there you go! I\u2019ve given you a basic rundown of the major natural language processing techniques that can help a computer extract, analyze, and understand useful information from a single text or sequence of texts.<\/p>\n<p id=\"045e\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">From machine translation that connects humans across cultures, to conversational chatbots that help with customer service; from sentiment analysis that deeply understands a human\u2019s mood, to attention mechanisms that can mimic our visual attention, the field of NLP is too expansive to cover completely, so I\u2019d encourage you to explore it further, whether through online courses, blog tutorials, or research papers.<\/p>\n<p id=\"e3aa\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\">I\u2019d highly recommend <a class=\"af mi\" href=\"http:\/\/web.stanford.edu\/class\/cs224n\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Stanford\u2019s CS224n<\/a> for starters, as you\u2019ll learn to implement, train, debug, visualize, and invent your own neural network models for NLP tasks. 
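To make the pointer-sentinel idea from the list above concrete, here is a toy sketch of my own (an illustration under assumed toy inputs, not the paper's implementation): a gate splits probability mass between the vocabulary softmax and a pointer distribution over words seen in the recent context.

```python
import numpy as np

def pointer_sentinel(vocab_probs, pointer_probs, gate):
    """Mixture p(w) = g * p_vocab(w) + (1 - g) * p_ptr(w).

    `gate` is the mass the sentinel assigns to the vocabulary softmax;
    the remainder goes to copying a word from the recent context.
    """
    return gate * vocab_probs + (1.0 - gate) * pointer_probs

# Toy 5-word vocabulary: the softmax is uniform, while the pointer puts
# all of its mass on word 3, a rare word that appeared in the context.
vocab_probs = np.full(5, 0.2)
pointer_probs = np.array([0.0, 0.0, 0.0, 1.0, 0.0])
mixed = pointer_sentinel(vocab_probs, pointer_probs, gate=0.6)
print(mixed)  # word 3 gets 0.6*0.2 + 0.4*1.0 = 0.52; the rest get 0.12
```

Because both inputs are valid distributions and the gate is a convex weight, the mixture is itself a valid distribution, which is what lets the model smoothly trade off copying against generating.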
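The weight-tying fix for duplicate word representations is simple enough to sketch as well. This is a hypothetical NumPy toy (my own illustration, not the paper's code): a single matrix serves both as the input embedding table and, transposed, as the output classifier, so the two roles share one set of parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10, 4

# One shared matrix: rows are input embeddings, and the same matrix
# (transposed) scores a hidden state against every output word.
E = rng.normal(size=(vocab_size, dim))

def embed(token_ids):
    return E[token_ids]      # input lookup

def output_logits(hidden):
    return hidden @ E.T      # reuse E as the output classifier weights

h = embed(np.array([3, 7]))  # stand-in for RNN hidden states
logits = output_logits(h)
print(logits.shape)          # (2, 10): one score per vocabulary word
```

Tying removes the duplicated vocabulary-sized parameter block, and per the paper it also acts as a regularizer, since output-layer gradients flow directly into the input embeddings.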
As a bonus, you can get all the lecture slides, assignment guidelines, and source code from <a class=\"af mi\" href=\"https:\/\/github.com\/khanhnamle1994\/natural-language-processing\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be mj\">my GitHub repo<\/strong><\/a>. I hope it\u2019ll guide you on the quest to change how we\u2019ll communicate in the future!<\/p>\n<p id=\"b5f9\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\"><em class=\"pj\">If you enjoyed this piece, I\u2019d love it if you hit the clap button<\/em> \ud83d\udc4f <em class=\"pj\">so others might stumble upon it. You can find my own code on<\/em> <a class=\"af mi\" href=\"https:\/\/github.com\/khanhnamle1994\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"pj\">GitHub<\/em><\/a><em class=\"pj\">, and more of my writing and projects at<\/em> <a class=\"af mi\" href=\"https:\/\/jameskle.com\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"pj\">https:\/\/jameskle.com\/<\/em><\/a><em class=\"pj\">. You can also follow me on <\/em><a class=\"af mi\" href=\"https:\/\/twitter.com\/@james_aka_yale\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"pj\">Twitter<\/em><\/a><em class=\"pj\">, email me directly, or <\/em><a class=\"af mi\" href=\"http:\/\/www.linkedin.com\/in\/khanhnamle94\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"pj\">find me on LinkedIn<\/em><\/a><em class=\"pj\">. 
<\/em><a class=\"af mi\" href=\"http:\/\/eepurl.com\/deWjzb\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"pj\">Sign up for my newsletter<\/em><\/a><em class=\"pj\"> to receive my latest thoughts on data science, machine learning, and artificial intelligence right at your inbox!<\/em><\/p>\n<p id=\"87ea\" class=\"pw-post-body-paragraph ll lm ev be b ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh eo bj\" data-selectable-paragraph=\"\"><strong class=\"be mj\">Discuss the post on <\/strong><a class=\"af mi\" href=\"https:\/\/news.ycombinator.com\/item?id=17451198\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"be mj\">Hacker News<\/strong><\/a><strong class=\"be mj\">.<\/strong><\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In part 1, I introduced the field of Natural Language Processing (NLP) and the deep learning movement that\u2019s powered it. I also walked you through 3 critical concepts in NLP: text embeddings (vector representations of strings), machine translation (using neural networks to translate languages), and dialogue &amp; conversation (tech that can hold conversations with humans [&hellip;]<\/p>\n","protected":false},"author":39,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[6],"tags":[],"coauthors":[150],"class_list":["post-7395","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II) - Comet<\/title>\n<meta name=\"robots\" 
content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II)\" \/>\n<meta property=\"og:description\" content=\"In part 1, I introduced the field of Natural Language Processing (NLP) and the deep learning movement that\u2019s powered it. I also walked you through 3 critical concepts in NLP: text embeddings (vector representations of strings), machine translation (using neural networks to translate languages), and dialogue &amp; conversation (tech that can hold conversations with humans [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-09-07T18:21:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:14:21+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg\" \/>\n<meta name=\"author\" content=\"James Le\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"James Le\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"18 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II) - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/","og_locale":"en_US","og_type":"article","og_title":"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II)","og_description":"In part 1, I introduced the field of Natural Language Processing (NLP) and the deep learning movement that\u2019s powered it. I also walked you through 3 critical concepts in NLP: text embeddings (vector representations of strings), machine translation (using neural networks to translate languages), and dialogue &amp; conversation (tech that can hold conversations with humans [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-09-07T18:21:30+00:00","article_modified_time":"2025-04-24T17:14:21+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg","type":"","width":"","height":""}],"author":"James Le","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"James Le","Est. 
reading time":"18 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/"},"author":{"name":"James Le","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/9ea207111d311668f59477646ffd469a"},"headline":"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II)","datePublished":"2023-09-07T18:21:30+00:00","dateModified":"2025-04-24T17:14:21+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/"},"wordCount":3449,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg","articleSection":["Machine Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/","url":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/","name":"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II) - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg","datePublished":"2023-09-07T18:21:30+00:00","dateModified":"2025-04-24T17:14:21+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#primaryimage","url":"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg","contentUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:2000\/1*wWg2hy2Tum8rHuy-Dmfl_g.jpeg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/the-7-nlp-techniques-that-will-change-how-you-communicate-in-the-future-part-ii\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II)"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/9ea207111d311668f59477646ffd469a","name":"James Le","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/e9faebcdd7afdaff187857dc289b23ba","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1678305362870-96x96.jpg","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1678305362870-96x96.jpg","caption":"James 
Le"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/khanhle-1013gmail-com\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7395","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=7395"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7395\/revisions"}],"predecessor-version":[{"id":15557,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7395\/revisions\/15557"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=7395"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=7395"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=7395"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=7395"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}