{"id":4587,"date":"2022-11-10T17:48:17","date_gmt":"2022-11-11T01:48:17","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=4587"},"modified":"2025-04-24T17:16:36","modified_gmt":"2025-04-24T17:16:36","slug":"deep-learning-how-it-works","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/","title":{"rendered":"Deep Learning: How it Works"},"content":{"rendered":"\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg\" alt=\"\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p class=\"has-text-align-center\">Photo by&nbsp;<a class=\"au kj\" href=\"https:\/\/unsplash.com\/@jjying?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noopener ugc nofollow\">JJ Ying<\/a>&nbsp;on&nbsp;<a class=\"au kj\" href=\"https:\/\/unsplash.com\/s\/photos\/neural-networks?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noopener ugc nofollow\">Unsplash<\/a><\/p>\n\n\n\n<div class=\"ir is it iu iv\">\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"1975\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Our lives have transitioned to revolve around Artificial Intelligence (AI) and Machine Learning (ML). Everybody is talking about it and it\u2019s been implemented in our day-to-day tasks and actions without us even realizing sometimes.<\/p>\n<p id=\"7f2e\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">They are the hottest topics right now, and everybody wants to know more. 
Everyone throws the term \u201cAI\u201d around, from developers and companies to people with no technical understanding who are nonetheless living in a tech-driven world.<\/p>\n<h1 id=\"ecda\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">Deep Learning<\/h1>\n<p id=\"cc88\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Deep Learning<\/strong>&nbsp;is a Machine Learning method that teaches computers to do what comes naturally to humans. It trains an algorithm to predict outputs, given a set of inputs. Both Supervised and Unsupervised Learning can be used with Deep Learning.<\/p>\n<p id=\"0c0f\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The hype around Deep Learning comes from the fact that Deep Learning models have achieved higher levels of recognition accuracy than ever before. Recent advances in deep learning have exceeded human-level performance in tasks such as image recognition.<\/p>\n<h1 id=\"448f\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">So how does this method that outperforms humans actually work?<\/h1>\n<p id=\"30e8\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">The majority of Deep Learning methods use neural network architectures. You may sometimes hear Deep Learning referred to as Deep Neural Networks. 
The term \u2018Deep\u2019 relates to the number of hidden layers in the neural network.<\/p>\n<h1 id=\"fa52\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">Neural Network<\/h1>\n<p id=\"e0d7\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">In biology, a&nbsp;<strong class=\"bm mk\">Neural Network<\/strong>&nbsp;is a network of neurons. In AI, an Artificial Neural Network contains artificial neurons, or nodes.<\/p>\n<p id=\"9bcd\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">If we refer back to the definition of AI:&nbsp;<em class=\"ml\">the ability of a computer or a computer-controlled robot to perform tasks that are usually done by humans as they require human intelligence.&nbsp;<\/em>We can connect the dots: a Neural Network is modeled on the structure of biological neurons, similar to the human brain.<\/p>\n<p id=\"e7ed\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">So let\u2019s dive into the brain of an AI.<\/p>\n<p id=\"03be\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Artificial Neural Networks<\/strong>&nbsp;(ANNs) are made up of neurons organized into three kinds of layers: an input layer, one or more hidden layers, and an output layer. 
Each neuron is connected to other neurons, and these connections are where computation happens.<\/p>\n<ol class=\"\">\n<li id=\"a99f\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg mr ms mt mu ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Input Layer<\/strong>&nbsp;\u2014 receives the input data<\/li>\n<li id=\"4003\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Hidden Layer(s)<\/strong>&nbsp;\u2014 perform mathematical computations on the input data<\/li>\n<li id=\"de3c\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Output Layer<\/strong>&nbsp;\u2014 returns the output data.<\/li>\n<\/ol>\n<p id=\"7922\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The term \u201cDeep\u201d in Deep Learning refers to having more than one hidden layer.<\/p>\n<h2 id=\"606c\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">Perceptrons<\/h2>\n<p id=\"8c95\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Perceptrons were introduced by Frank Rosenblatt in 1957. There are two types of Perceptrons:<\/p>\n<ul class=\"\">\n<li id=\"a801\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg no ms mt mu ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Single-layer:<\/strong>&nbsp;This type of perceptron can only learn a linear function, and is the oldest type of neural network. 
It contains a single neuron and does not contain any hidden layers.<\/li>\n<li id=\"debb\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg no ms mt mu ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Multilayer:<\/strong>&nbsp;This type of perceptron consists of two or more layers. They are primarily used to learn more about the data and the relationships between the features on a non-linear level.<\/li>\n<\/ul>\n<p id=\"f3fd\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Consider each node as an individual linear regression model, which consists of input data, weights, a bias, and an output. The formula:<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<div class=\"ka kb do kc ce kd\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/700\/1*bZRJ_XzYDkyzY2UBdazrwA.png\" alt=\"\" width=\"700\" height=\"134\"><\/figure><div class=\"gl gm np\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*bZRJ_XzYDkyzY2UBdazrwA.png 640w, https:\/\/miro.medium.com\/max\/720\/1*bZRJ_XzYDkyzY2UBdazrwA.png 720w, https:\/\/miro.medium.com\/max\/750\/1*bZRJ_XzYDkyzY2UBdazrwA.png 750w, https:\/\/miro.medium.com\/max\/786\/1*bZRJ_XzYDkyzY2UBdazrwA.png 786w, https:\/\/miro.medium.com\/max\/828\/1*bZRJ_XzYDkyzY2UBdazrwA.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*bZRJ_XzYDkyzY2UBdazrwA.png 1100w, https:\/\/miro.medium.com\/max\/1400\/1*bZRJ_XzYDkyzY2UBdazrwA.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.ibm.com\/cloud\/learn\/neural-networks#toc-what-are-n-2oQ5Vepe\" target=\"_blank\" rel=\"noopener ugc nofollow\">IBM<\/a><\/div>\n<\/div>\n<\/figure>\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"db55\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Where the output can be formulated as:<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<div class=\"ka kb do kc ce kd\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/700\/1*2HmFCsNS5KLXTWMaMY6sOQ.png\" alt=\"\" width=\"700\" height=\"168\"><\/figure><div class=\"gl gm nu\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 640w, https:\/\/miro.medium.com\/max\/720\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 720w, https:\/\/miro.medium.com\/max\/750\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 750w, https:\/\/miro.medium.com\/max\/786\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 786w, https:\/\/miro.medium.com\/max\/828\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 1100w, https:\/\/miro.medium.com\/max\/1400\/1*2HmFCsNS5KLXTWMaMY6sOQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 
2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.ibm.com\/cloud\/learn\/neural-networks#toc-what-are-n-2oQ5Vepe\" target=\"_blank\" rel=\"noopener ugc nofollow\">IBM<\/a><\/div>\n<\/div>\n<\/figure>\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"963d\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Perceptrons consist of:<\/p>\n<ol class=\"\">\n<li id=\"c680\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Input layer<\/li>\n<li id=\"a591\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Weights and Bias<\/li>\n<li id=\"6843\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Summation Function<\/li>\n<li id=\"2676\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Activation Function<\/li>\n<\/ol>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<div class=\"ka kb do kc ce kd\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/700\/1*bV-zR_GgCeSvB1aqdxxehA.png\" alt=\"\" width=\"700\" height=\"355\"><\/figure><div class=\"gl gm nv\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*bV-zR_GgCeSvB1aqdxxehA.png 640w, https:\/\/miro.medium.com\/max\/720\/1*bV-zR_GgCeSvB1aqdxxehA.png 720w, https:\/\/miro.medium.com\/max\/750\/1*bV-zR_GgCeSvB1aqdxxehA.png 750w, https:\/\/miro.medium.com\/max\/786\/1*bV-zR_GgCeSvB1aqdxxehA.png 786w, https:\/\/miro.medium.com\/max\/828\/1*bV-zR_GgCeSvB1aqdxxehA.png 828w, 
https:\/\/miro.medium.com\/max\/1100\/1*bV-zR_GgCeSvB1aqdxxehA.png 1100w, https:\/\/miro.medium.com\/max\/1400\/1*bV-zR_GgCeSvB1aqdxxehA.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/towardsdatascience.com\/whats-the-role-of-weights-and-bias-in-a-neural-network-4cf7e9888a0f\" target=\"_blank\" rel=\"noopener\">Towards Data Science<\/a><\/div>\n<\/div>\n<\/figure>\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"85b3\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">As we already know, the&nbsp;<strong class=\"bm mk\">Input Layer<\/strong>&nbsp;is the layer that receives the input data.<\/p>\n<p id=\"2947\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The&nbsp;<strong class=\"bm mk\">Weight<\/strong>&nbsp;controls the strength of the connection between two neurons. The weight is a big factor in deciding how much influence the input has on the output.<\/p>\n<p id=\"6f5a\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The<strong class=\"bm mk\">&nbsp;Bias&nbsp;<\/strong>is an additional input into the next layer whose input value is held constant at 1; what the network learns is the weight attached to it. 
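To make the roles of the weights and the bias concrete, here is a minimal sketch of a single artificial neuron in Python. The numbers and names are purely illustrative, not taken from the article's figures:

```python
import numpy as np

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs plus bias,
    passed through a binary step activation."""
    z = np.dot(inputs, weights) + bias  # summation: sum(w_i * x_i) + b
    return 1 if z > 0 else 0            # binary step: fire only above threshold

# Hypothetical inputs and hand-picked weights
x = np.array([0.5, 0.8])
w = np.array([0.4, 0.6])
print(neuron(x, w, bias=-0.5))  # 0.68 - 0.5 = 0.18 > 0, so the neuron fires: 1
```

Note how changing only the bias (say, to -1.0) shifts the firing threshold and silences the neuron for the same inputs.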
The bias shifts the activation function to the left or right, guaranteeing that a neuron can still activate even when all of its inputs are 0. Bias nodes are not influenced by the previous layer; they do, however, have outgoing connections with their own weights.<\/p>\n<p id=\"2c37\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The&nbsp;<strong class=\"bm mk\">Summation Function<\/strong>&nbsp;phase is when all the weighted inputs are summed up and the bias is added. This can be formulated as:<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/235\/1*61dp1qKESkyIzif1JOGomg.gif\" alt=\"\" width=\"235\" height=\"128\"><\/figure><div class=\"gl gm nw\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*61dp1qKESkyIzif1JOGomg.gif 640w, https:\/\/miro.medium.com\/max\/720\/1*61dp1qKESkyIzif1JOGomg.gif 720w, https:\/\/miro.medium.com\/max\/750\/1*61dp1qKESkyIzif1JOGomg.gif 750w, https:\/\/miro.medium.com\/max\/786\/1*61dp1qKESkyIzif1JOGomg.gif 786w, https:\/\/miro.medium.com\/max\/828\/1*61dp1qKESkyIzif1JOGomg.gif 828w, https:\/\/miro.medium.com\/max\/1100\/1*61dp1qKESkyIzif1JOGomg.gif 1100w, https:\/\/miro.medium.com\/max\/470\/1*61dp1qKESkyIzif1JOGomg.gif 470w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 235px\" data-testid=\"og\">Source: 
<\/picture><a class=\"au kj\" href=\"https:\/\/towardsdatascience.com\/whats-the-role-of-weights-and-bias-in-a-neural-network-4cf7e9888a0f\" target=\"_blank\" rel=\"noopener\">TDS<\/a><\/div>\n<\/figure>\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"7156\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Activation Functions&nbsp;<\/strong>decide whether a neuron should be activated (fired) or not. In other words, they decide whether an input is important enough to contribute to the prediction. They are essential for a neural network to converge; without them, the network would reduce to a series of linear combinations. Activation functions can be grouped into three main categories:<\/p>\n<ol class=\"\">\n<li id=\"3377\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Binary Step Function<\/li>\n<li id=\"7e91\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Linear Activation Function<\/li>\n<li id=\"ee0e\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Non-Linear Activation Functions<\/li>\n<\/ol>\n<p id=\"869e\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Non-Linear Activation Functions include the Sigmoid function, tanh function, and ReLU function.<\/p>\n<h2 id=\"925b\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">Binary Step Function<\/h2>\n<p id=\"693a\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">The Binary Step Function produces a binary output: 1 (true) when 
the input passes the threshold limit. It produces 0 (false) when the input does not pass the threshold limit.<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/283\/1*NSItZWKQ9XGpVjj-p-CY0A.png\" alt=\"\" width=\"283\" height=\"80\"><\/figure><div class=\"gl gm nx\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*NSItZWKQ9XGpVjj-p-CY0A.png 640w, https:\/\/miro.medium.com\/max\/720\/1*NSItZWKQ9XGpVjj-p-CY0A.png 720w, https:\/\/miro.medium.com\/max\/750\/1*NSItZWKQ9XGpVjj-p-CY0A.png 750w, https:\/\/miro.medium.com\/max\/786\/1*NSItZWKQ9XGpVjj-p-CY0A.png 786w, https:\/\/miro.medium.com\/max\/828\/1*NSItZWKQ9XGpVjj-p-CY0A.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*NSItZWKQ9XGpVjj-p-CY0A.png 1100w, https:\/\/miro.medium.com\/max\/566\/1*NSItZWKQ9XGpVjj-p-CY0A.png 566w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 283px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.v7labs.com\/blog\/neural-networks-activation-functions#3-activation-types\" target=\"_blank\" rel=\"noopener ugc nofollow\">v7labs<\/a><\/div>\n<\/figure>\n<h2 id=\"2e94\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">Linear Activation Function<\/h2>\n<p id=\"3cd1\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc 
mj le lf lg ir ga\" data-selectable-paragraph=\"\">The Linear Activation Function is also commonly known as the Identity Function. It is a straight-line function in which the activation is directly proportional to the input, i.e. the weighted sum of the neuron\u2019s inputs plus the bias.<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/263\/1*zywTyXM_p3EVsv3Bx1iH3Q.png\" alt=\"\" width=\"263\" height=\"87\"><\/figure><div class=\"gl gm ny\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 640w, https:\/\/miro.medium.com\/max\/720\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 720w, https:\/\/miro.medium.com\/max\/750\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 750w, https:\/\/miro.medium.com\/max\/786\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 786w, https:\/\/miro.medium.com\/max\/828\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 1100w, https:\/\/miro.medium.com\/max\/526\/1*zywTyXM_p3EVsv3Bx1iH3Q.png 526w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 263px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.v7labs.com\/blog\/neural-networks-activation-functions#3-activation-types\" target=\"_blank\" rel=\"noopener ugc nofollow\">v7labs<\/a><\/div>\n<\/figure>\n<h2 id=\"fedc\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" 
data-selectable-paragraph=\"\">Non-Linear Activation Functions<\/h2>\n<p id=\"4a21\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Non-Linear Activation Functions are the most popular types of activation functions. This is due to many drawbacks with using Binary Step Function and Linear Activation Function. They make it easier for models to generalize data and be able to differentiate between the outputs.<\/p>\n<ol class=\"\">\n<li id=\"6dff\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Sigmoid Function: The input is transformed into values between 0 and 1.<\/li>\n<\/ol>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/283\/1*pN6UkRfkMq81sIrn0YmvCg.png\" alt=\"\" width=\"283\" height=\"224\"><\/figure><div class=\"gl gm nx\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*pN6UkRfkMq81sIrn0YmvCg.png 640w, https:\/\/miro.medium.com\/max\/720\/1*pN6UkRfkMq81sIrn0YmvCg.png 720w, https:\/\/miro.medium.com\/max\/750\/1*pN6UkRfkMq81sIrn0YmvCg.png 750w, https:\/\/miro.medium.com\/max\/786\/1*pN6UkRfkMq81sIrn0YmvCg.png 786w, https:\/\/miro.medium.com\/max\/828\/1*pN6UkRfkMq81sIrn0YmvCg.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*pN6UkRfkMq81sIrn0YmvCg.png 1100w, https:\/\/miro.medium.com\/max\/566\/1*pN6UkRfkMq81sIrn0YmvCg.png 566w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 
2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 283px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.researchgate.net\/figure\/Commonly-used-activation-functions-a-Sigmoid-b-Tanh-c-ReLU-and-d-LReLU_fig3_335845675\" target=\"_blank\" rel=\"noopener ugc nofollow\">researchgate<\/a><\/div>\n<\/figure>\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"df3f\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">2. Tanh Function: The input is transformed into values between -1 and 1.<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/283\/1*N8hbTSf07mskZtJ9HAshnA.png\" alt=\"\" width=\"283\" height=\"206\"><\/figure><div class=\"gl gm nx\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*N8hbTSf07mskZtJ9HAshnA.png 640w, https:\/\/miro.medium.com\/max\/720\/1*N8hbTSf07mskZtJ9HAshnA.png 720w, https:\/\/miro.medium.com\/max\/750\/1*N8hbTSf07mskZtJ9HAshnA.png 750w, https:\/\/miro.medium.com\/max\/786\/1*N8hbTSf07mskZtJ9HAshnA.png 786w, https:\/\/miro.medium.com\/max\/828\/1*N8hbTSf07mskZtJ9HAshnA.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*N8hbTSf07mskZtJ9HAshnA.png 1100w, https:\/\/miro.medium.com\/max\/566\/1*N8hbTSf07mskZtJ9HAshnA.png 566w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) 
and (max-width: 700px) 100vw, 283px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.researchgate.net\/figure\/Commonly-used-activation-functions-a-Sigmoid-b-Tanh-c-ReLU-and-d-LReLU_fig3_335845675\" target=\"_blank\" rel=\"noopener ugc nofollow\">researchgate<\/a><\/div>\n<\/figure>\n<p data-selectable-paragraph=\"\">\n<\/p><p id=\"3056\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">3. ReLU function: Short for Rectified Linear Unit. If the input is positive, it outputs the input unchanged; if not, it outputs zero.<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/283\/1*a2vB0rmJz6Y5-__U77t50w.png\" alt=\"\" width=\"283\" height=\"180\"><\/figure><div class=\"gl gm nx\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*a2vB0rmJz6Y5-__U77t50w.png 640w, https:\/\/miro.medium.com\/max\/720\/1*a2vB0rmJz6Y5-__U77t50w.png 720w, https:\/\/miro.medium.com\/max\/750\/1*a2vB0rmJz6Y5-__U77t50w.png 750w, https:\/\/miro.medium.com\/max\/786\/1*a2vB0rmJz6Y5-__U77t50w.png 786w, https:\/\/miro.medium.com\/max\/828\/1*a2vB0rmJz6Y5-__U77t50w.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*a2vB0rmJz6Y5-__U77t50w.png 1100w, https:\/\/miro.medium.com\/max\/566\/1*a2vB0rmJz6Y5-__U77t50w.png 566w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 
2) and (max-width: 700px) 100vw, 283px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.researchgate.net\/figure\/Commonly-used-activation-functions-a-Sigmoid-b-Tanh-c-ReLU-and-d-LReLU_fig3_335845675\" target=\"_blank\" rel=\"noopener ugc nofollow\">researchgate<\/a><\/div>\n<\/figure>\n<h1 id=\"1c4e\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">The Process<\/h1>\n<p id=\"9c36\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Once an input layer receives the data, weights are assigned. Weights help determine the importance of any given variable. The inputs are multiplied by their respective weights and then summed up.<\/p>\n<p id=\"ab5c\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">At this point, each input has been multiplied by its respective weight and the bias has been added to the sum. This sum is then passed through an activation function, which determines the final output. If the output exceeds a given threshold, it activates the neuron, passing the data from one connected layer to another in the network.<\/p>\n<p id=\"4adb\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The result of one neuron then becomes the input for the next neuron. 
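The forward pass just described can be sketched in a few lines of Python. The layer sizes and weights below are made up for illustration, and ReLU stands in for the activation function:

```python
import numpy as np

def relu(z):
    # ReLU activation: pass positive values through, clamp negatives to 0
    return np.maximum(0, z)

def feedforward(x, layers):
    """Pass an input through each layer in turn: weighted sum plus bias,
    then activation. Each layer's output is the next layer's input."""
    a = x
    for W, b in layers:
        a = relu(W @ a + b)
    return a

# Hypothetical 2-input, 3-hidden-neuron, 1-output network with fixed weights
layers = [
    (np.array([[0.2, 0.8], [0.5, 0.1], [0.9, 0.4]]), np.array([0.1, 0.1, 0.1])),
    (np.array([[0.3, 0.7, 0.2]]), np.array([0.05])),
]
print(feedforward(np.array([1.0, 2.0]), layers))  # a single output value
```

In a trained network the weight matrices would be learned rather than hand-written, but the flow of data is the same.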
This process is called feedforward, as the information in a Neural Network always moves in one direction (forward).<\/p>\n<p id=\"2284\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">In a Neural Network, there is also a process called Backpropagation, sometimes abbreviated as \u201cbackprop.\u201d In layman\u2019s terms,&nbsp;<strong class=\"bm mk\">Backpropagation<\/strong>&nbsp;is the messenger that tells the Neural Network whether it made a mistake in its prediction.<\/p>\n<p id=\"0cdc\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The word Propagate in this context means to transmit something, such as light or sound, through a medium. To backpropagate is to send information back with the purpose of correcting an error. Backpropagation goes through these steps:<\/p>\n<ol class=\"\">\n<li id=\"d98d\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg mr ms mt mu ga\" data-selectable-paragraph=\"\">The Neural Network makes a guess about the data<\/li>\n<li id=\"49a0\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">The quality of the guess is measured with a loss function<\/li>\n<li id=\"3254\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">The error is backpropagated so the parameters can be adjusted and corrected<\/li>\n<\/ol>\n<p id=\"bcad\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">When the network makes a guess about the data and produces an error, backpropagation takes the error and adjusts the neural network\u2019s parameters in the direction of less error.<\/p>\n<\/div>\n\n\n\n<div class=\"o dx nz oa id ob\" role=\"separator\"><\/div>\n\n\n\n<div class=\"ir 
is it iu iv\">\n<blockquote class=\"og\"><p id=\"8ae8\" class=\"oh oi iy bm oj ok ol om on oo op lg cn\" data-selectable-paragraph=\"\">Want to get the most up-to-date news on all things Deep Learning?&nbsp;<a class=\"au kj\" href=\"https:\/\/www.deeplearningweekly.com\/about\" target=\"_blank\" rel=\"noopener ugc nofollow\">Subscribe to Deep Learning Weekly<\/a>&nbsp;for the latest research, resources, and industry news, delivered to your inbox.<\/p><\/blockquote>\n<\/div>\n\n\n\n<div class=\"o dx nz oa id ob\" role=\"separator\"><\/div>\n\n\n\n<div class=\"ir is it iu iv\">\n<p id=\"333e\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">How does the network know which direction to go in to find less error? Gradient Descent.<\/p>\n<h1 id=\"9254\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">Gradient Descent<\/h1>\n<p id=\"fb21\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">We need to understand the Cost Function before we dive into Gradient Descent.<\/p>\n<p id=\"a9eb\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The&nbsp;<strong class=\"bm mk\">Cost Function<\/strong>&nbsp;is a measure of how well a neural network performed with respect to a given training sample and the expected output. The Cost Function shows us how far the AI\u2019s outputs were from the correct outputs. 
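The post's formula image below gives the Cost Function; as a concrete sketch, here is mean squared error, one common choice (an assumption for illustration, with made-up outputs and labels):

```python
def mse_cost(predictions, targets):
    # Mean squared error: the average of the squared differences between
    # the network's outputs and the correct outputs.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

cost = mse_cost([0.9, 0.2, 0.8], [1.0, 0.0, 1.0])  # illustrative outputs vs. labels
perfect = mse_cost([1.0, 0.0], [1.0, 0.0])         # identical outputs give a cost of 0
```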
Ideally, we want a Cost Function of 0, which tells us that our AI\u2019s outputs are the same as the data set outputs.<\/p>\n<p id=\"1dab\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">The formula:<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<div class=\"ka kb do kc ce kd\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/700\/1*Pa3V6SF8iuQ73esZztGicw.png\" alt=\"\" width=\"700\" height=\"168\"><\/figure><div class=\"gl gm oq\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/1*Pa3V6SF8iuQ73esZztGicw.png 640w, https:\/\/miro.medium.com\/max\/720\/1*Pa3V6SF8iuQ73esZztGicw.png 720w, https:\/\/miro.medium.com\/max\/750\/1*Pa3V6SF8iuQ73esZztGicw.png 750w, https:\/\/miro.medium.com\/max\/786\/1*Pa3V6SF8iuQ73esZztGicw.png 786w, https:\/\/miro.medium.com\/max\/828\/1*Pa3V6SF8iuQ73esZztGicw.png 828w, https:\/\/miro.medium.com\/max\/1100\/1*Pa3V6SF8iuQ73esZztGicw.png 1100w, https:\/\/miro.medium.com\/max\/1400\/1*Pa3V6SF8iuQ73esZztGicw.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.ibm.com\/cloud\/learn\/neural-networks#toc-what-are-n-2oQ5Vepe\" target=\"_blank\" rel=\"noopener ugc nofollow\">IBM<\/a><\/div>\n<\/div>\n<\/figure>\n<p 
data-selectable-paragraph=\"\">\n<\/p><p id=\"6c11\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\"><strong class=\"bm mk\">Gradient Descent<\/strong>&nbsp;is an optimization algorithm commonly used to train models and neural networks, helping them learn over time by reducing the Cost Function. Gradient Descent acts as a barometer, measuring its accuracy with each iteration of parameter updates.<\/p>\n<p id=\"18dd\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">A gradient is a slope whose steepness we can measure. All slopes can be expressed as a relationship between two variables: \u201cy over x\u201d. In the case of Neural Networks, \u2018y\u2019 is the error produced and \u2018x\u2019 is the parameter of the Neural Network.<\/p>\n<p id=\"ad48\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">There is a relationship between the parameters and the error, therefore by changing the parameters we can either increase or decrease the error. We can use this relationship to compute the gradient and learn which direction reduces the error.<\/p>\n<p id=\"709d\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">It is implemented by changing the weights in small increments after each data set iteration. 
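That incremental update rule can be sketched on a toy one-parameter problem (illustrative only; a real network applies the same rule to many weights, with the gradients supplied by backpropagation):

```python
# Toy cost: (w - 3) ** 2, which is lowest at w = 3.
def gradient(w):
    return 2.0 * (w - 3.0)  # derivative of the toy cost: the slope at w

w = 0.0               # arbitrary starting weight
learning_rate = 0.1   # size of each small increment
for _ in range(100):
    # Step against the slope: downhill, toward lower error.
    w = w - learning_rate * gradient(w)
# After enough iterations, w sits near 3.0, the point of lowest cost.
```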
In Deep Learning, updating the weights is done automatically; that is the beauty of it. We can then see in which direction the error is lowest.<\/p>\n<figure class=\"nq nr ns nt gx jz gl gm paragraph-image\">\n<div class=\"ka kb do kc ce kd\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"ce ke kf c aligncenter\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/700\/0*qaKLf3rxgfwesr8U\" alt=\"\" width=\"700\" height=\"435\"><\/figure><div class=\"gl gm or\" style=\"text-align: center;\"><picture><source srcset=\"https:\/\/miro.medium.com\/max\/640\/0*qaKLf3rxgfwesr8U 640w, https:\/\/miro.medium.com\/max\/720\/0*qaKLf3rxgfwesr8U 720w, https:\/\/miro.medium.com\/max\/750\/0*qaKLf3rxgfwesr8U 750w, https:\/\/miro.medium.com\/max\/786\/0*qaKLf3rxgfwesr8U 786w, https:\/\/miro.medium.com\/max\/828\/0*qaKLf3rxgfwesr8U 828w, https:\/\/miro.medium.com\/max\/1100\/0*qaKLf3rxgfwesr8U 1100w, https:\/\/miro.medium.com\/max\/1400\/0*qaKLf3rxgfwesr8U 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\">Source: <\/picture><a class=\"au kj\" href=\"https:\/\/www.oreilly.com\/library\/view\/learn-arcore\/9781788830409\/e24a657a-a5c6-4ff2-b9ea-9418a7a5d24c.xhtml\" target=\"_blank\" rel=\"noopener ugc nofollow\">O\u2019Reilly<\/a><\/div>\n<\/div>\n<\/figure>\n<h1 id=\"5066\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">When should you use deep learning?<\/h1>\n<p id=\"a174\" 
class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">In order to understand if you should use deep learning, you should ask yourself these questions:<\/p>\n<h2 id=\"7535\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">What is the complexity of your problem?<\/h2>\n<p id=\"3a3a\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">One of the biggest uses of deep learning is being able to discover hidden patterns within the data and get a better understanding of the relationship between the different interdependent variables. Deep learning is widely used for complex tasks such as image classification, speech recognition, and natural language processing.<\/p>\n<p id=\"1d8a\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">However, some tasks are not as complex and don\u2019t require the need to process unstructured data. Therefore classic machine learning is a more effective approach.<\/p>\n<h2 id=\"9c71\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">Is interpretability more important than accuracy?<\/h2>\n<p id=\"82af\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Deep Learning eliminates the need for human feature engineering, however, model interpretability is one of the biggest challenges in deep learning. 
It is difficult for humans to understand and interpret the model.<\/p>\n<p id=\"7861\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Deep Learning achieves a high level of accuracy. But this is where you ask yourself: is accuracy or interpretability more important to you?<\/p>\n<h2 id=\"10b2\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">Is your data good enough?<\/h2>\n<p id=\"3d07\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Although deep learning does not require human feature engineering, it still requires labeled data. A deep learning model does two jobs: feature extraction and classification. It eliminates the manual Feature Extraction stage that a human performs in a typical Machine Learning workflow. To handle complex tasks, Deep Learning models require a lot of labeled data to learn complex patterns.<\/p>\n<h2 id=\"e35b\" class=\"na li iy bm lj nb nc nd ln ne nf ng lr ku nh ni lv ky nj nk lz lc nl nm md nn ga\" data-selectable-paragraph=\"\">Do you have the resources, time, and funds?<\/h2>\n<p id=\"d3c4\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Deep Learning is expensive. This is due to the complexity of models that use large amounts of data and many layers to produce accurate, effective outputs. Deep Learning models are also very slow to train, requiring heavy amounts of computational power. 
This makes them ineffective if your task is time and resource-sensitive.<\/p>\n<h1 id=\"dc89\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">What Deep Learning models already exist?<\/h1>\n<p id=\"427e\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Below are the top 10 deep learning algorithms you should know about:<\/p>\n<ol class=\"\">\n<li id=\"32b9\" class=\"mm mn iy bm b km kn kq kr ku mo ky mp lc mq lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Convolutional Neural Networks (CNNs)<\/li>\n<li id=\"7176\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Long Short Term Memory Networks (LSTMs)<\/li>\n<li id=\"5599\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Recurrent Neural Networks (RNNs)<\/li>\n<li id=\"4af8\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Generative Adversarial Networks (GANs)<\/li>\n<li id=\"d86f\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Autoencoders<\/li>\n<li id=\"7806\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Multilayer Perceptrons (MLPs)<\/li>\n<li id=\"2c40\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Restricted Boltzmann Machines (RBMs)<\/li>\n<li id=\"1fb6\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Radial Basis Function Networks (RBFNs)<\/li>\n<li id=\"3777\" class=\"mm mn iy bm b km mv kq mw ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Self Organizing Maps (SOMs)<\/li>\n<li id=\"d184\" class=\"mm mn iy bm b km mv kq mw 
ku mx ky my lc mz lg mr ms mt mu ga\" data-selectable-paragraph=\"\">Deep Belief Networks (DBNs)<\/li>\n<\/ol>\n<h1 id=\"7db6\" class=\"lh li iy bm lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me ga\" data-selectable-paragraph=\"\">Conclusion<\/h1>\n<p id=\"79dc\" class=\"pw-post-body-paragraph kk kl iy bm b km mf ko kp kq mg ks kt ku mh kw kx ky mi la lb lc mj le lf lg ir ga\" data-selectable-paragraph=\"\">Having a good grasp of the math behind Deep Learning will give you a better idea of when and when not to use Deep Learning. Appreciating the different Activation Functions and Cost Functions will help you produce the outputs you are looking for to solve your problem.<\/p>\n<p id=\"cb47\" class=\"pw-post-body-paragraph kk kl iy bm b km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf lg ir ga\" data-selectable-paragraph=\"\">Deep Learning is getting more popular by the day. DL models are achieving higher levels of accuracy than ever before, performing some tasks better than humans. Therefore, I believe that it is important that everyone, not only Data Scientists, Machine Learning Engineers, and other programmers, have an in-depth understanding of how Deep Learning works.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Photo by&nbsp;JJ Ying&nbsp;on&nbsp;Unsplash Our lives have transitioned to revolve around Artificial Intelligence (AI) and Machine Learning (ML). Everybody is talking about it and it\u2019s been implemented in our day-to-day tasks and actions without us even realizing sometimes. They are the hottest topics right now, and everybody wants to know more. 
People throw the term \u201cAI\u201d [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[6],"tags":[],"coauthors":[139],"class_list":["post-4587","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Deep Learning: How it Works - Comet<\/title>\n<meta name=\"description\" content=\"Deep Learning\u00a0is a Machine Learning method that teaches computers to do what comes naturally to humans. It trains an algorithm to predict outputs, given a set of inputs.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Deep Learning: How it Works\" \/>\n<meta property=\"og:description\" content=\"Deep Learning\u00a0is a Machine Learning method that teaches computers to do what comes naturally to humans. 
It trains an algorithm to predict outputs, given a set of inputs.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2022-11-11T01:48:17+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:16:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg\" \/>\n<meta name=\"author\" content=\"Nisha Arya Ahmed\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Nisha Arya Ahmed\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Deep Learning: How it Works - Comet","description":"Deep Learning\u00a0is a Machine Learning method that teaches computers to do what comes naturally to humans. It trains an algorithm to predict outputs, given a set of inputs.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/","og_locale":"en_US","og_type":"article","og_title":"Deep Learning: How it Works","og_description":"Deep Learning\u00a0is a Machine Learning method that teaches computers to do what comes naturally to humans. 
It trains an algorithm to predict outputs, given a set of inputs.","og_url":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2022-11-11T01:48:17+00:00","article_modified_time":"2025-04-24T17:16:36+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg","type":"","width":"","height":""}],"author":"Nisha Arya Ahmed","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Nisha Arya Ahmed","Est. reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/"},"author":{"name":"Team Comet Digital","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/6266601170c60a7a82b3e0043fbe8ddf"},"headline":"Deep Learning: How it Works","datePublished":"2022-11-11T01:48:17+00:00","dateModified":"2025-04-24T17:16:36+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/"},"wordCount":1918,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg","articleSection":["Machine Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/","url":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/","name":"Deep Learning: How it Works - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg","datePublished":"2022-11-11T01:48:17+00:00","dateModified":"2025-04-24T17:16:36+00:00","description":"Deep Learning\u00a0is a Machine Learning method that teaches computers to do what comes naturally to humans. It trains an algorithm to predict outputs, given a set of inputs.","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#primaryimage","url":"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg","contentUrl":"https:\/\/miro.medium.com\/max\/700\/1*ykKb1xNcvIPZa52h612hfg.jpeg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/deep-learning-how-it-works\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Deep Learning: How it Works"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/6266601170c60a7a82b3e0043fbe8ddf","name":"Team Comet Digital","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/4f0c0a8cc7c0e87c636ff6a420a6647c","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/Screen-Shot-2023-08-12-at-8.58.50-AM-96x96.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/Screen-Shot-2023-08-12-at-8.58.50-AM-96x96.png","caption":"Team Comet 
Digital"},"sameAs":["https:\/\/www.comet.ml\/"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/teamcometdigital\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/4587","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=4587"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/4587\/revisions"}],"predecessor-version":[{"id":15652,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/4587\/revisions\/15652"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=4587"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=4587"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=4587"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=4587"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}