{"id":7130,"date":"2023-08-14T05:01:20","date_gmt":"2023-08-14T13:01:20","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=7130"},"modified":"2025-04-24T17:14:48","modified_gmt":"2025-04-24T17:14:48","slug":"first-step-to-object-detection-algorithms","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/","title":{"rendered":"First Step to Object Detection Algorithms"},"content":{"rendered":"\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\">\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<figure class=\"lx ly lz ma mb mc lu lv paragraph-image\">\n<div class=\"md me eb mf bg mg\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mh mi c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg\" alt=\"\" width=\"700\" height=\"383\"><\/figure><div class=\"lu lv lw\"><picture><\/picture><\/div>\n<\/div><figcaption class=\"mj mk ml lu lv mm mn be b bf z dv\" data-selectable-paragraph=\"\">Photo by <a class=\"af mo\" href=\"https:\/\/unsplash.com\/@brechtdenil?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noopener ugc nofollow\">Brecht Denil<\/a> on <a class=\"af mo\" href=\"https:\/\/unsplash.com\/photos\/H54mZnQua8k?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\" rel=\"noopener ugc nofollow\">Unsplash<\/a><\/figcaption><\/figure>\n<p id=\"a543\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">Object detection is a field of computer vision used to identify and position objects within an image. 
Examples of object detection applications include detecting abnormal movement in security camera footage, obstacles in autonomous driving, and characters within documents.<\/p>\n<h2 id=\"1774\" class=\"nm nn fo be no np nq nr ns nt nu nv nw mz nx ny nz nd oa ob oc nh od oe of og bj\" data-selectable-paragraph=\"\">How do Object Detection Algorithms Work?<\/h2>\n<p id=\"c6bf\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">There are two main categories of object detection algorithms.<\/p>\n<ol class=\"\">\n<li id=\"9880\" class=\"mp mq fo be b mr ms mt mu mv mw mx my om na nb nc on ne nf ng oo ni nj nk nl op oq or bj\" data-selectable-paragraph=\"\"><strong class=\"be os\">Two-Stage Algorithms: <\/strong><br>\nTwo-stage object detection algorithms consist of two different stages. In the first stage, potential object regions in the image are determined. In the second stage, these candidate regions are classified and their boundaries are refined by the neural network model. Two-stage methods are more accurate than single-stage methods, but they work more slowly. R-CNN (Regions with Convolutional Neural Networks) and similar two-stage object detection algorithms are the most widely used in this regard.<\/li>\n<li id=\"b3b6\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl op oq or bj\" data-selectable-paragraph=\"\"><strong class=\"be os\">Single-Stage Algorithms:<\/strong><br>\nSingle-stage object detection algorithms predict object regions and classes directly from the image, performing the whole process with a single neural network model. These algorithms aim to predict all objects in the image at once, so their processing speed is very high. However, the accuracy rates of single-stage methods are lower than those of two-stage methods. 
Single-stage object detection algorithms such as YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector) are the most widely used algorithms in this regard. We will also examine these algorithms in the following stages of the article.<\/li>\n<\/ol>\n<blockquote class=\"oy oz pa\"><p id=\"48d8\" class=\"mp mq pb be b mr ms mt mu mv mw mx my om na nb nc on ne nf ng oo ni nj nk nl fh bj\" data-selectable-paragraph=\"\">This blog lists the workings of different object detection algorithms and compares them with similar algorithms.<\/p><\/blockquote>\n<h1 id=\"0e3b\" class=\"pc nn fo be no pd pe pf ns pg ph pi nw pj pk pl pm pn po pp pq pr ps pt pu pv bj\" data-selectable-paragraph=\"\">R-CNN (Regions with CNNs)<\/h1>\n<p id=\"b923\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">R-CNN (Regions with CNN or Region-based CNN) is an object detection algorithm that uses a CNN (Convolutional Neural Network) to identify objects within an image.<\/p>\n<figure class=\"px py pz qa qb mc lu lv paragraph-image\">\n<div class=\"md me eb mf bg mg\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mh mi c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*_SUBWSl3PHDcPPZz.jpg\" alt=\"\" width=\"700\" height=\"247\"><\/figure><div class=\"lu lv pw\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 1100w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/0*_SUBWSl3PHDcPPZz.jpg 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*_SUBWSl3PHDcPPZz.jpg 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*_SUBWSl3PHDcPPZz.jpg 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*_SUBWSl3PHDcPPZz.jpg 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*_SUBWSl3PHDcPPZz.jpg 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*_SUBWSl3PHDcPPZz.jpg 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*_SUBWSl3PHDcPPZz.jpg 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*_SUBWSl3PHDcPPZz.jpg 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mj mk ml lu lv mm mn be b bf z dv\" data-selectable-paragraph=\"\">Source: <a class=\"af mo\" href=\"https:\/\/paperswithcode.com\/method\/r-cnn\" target=\"_blank\" rel=\"noopener ugc nofollow\">https:\/\/paperswithcode.com\/method\/r-cnn<\/a><\/figcaption>\n<\/figure>\n<p 
id=\"f2fb\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">The R-CNN algorithm divides an image into parts that likely contain objects of interest and examines each of these parts separately. It then detects which objects are in these regions. The R-CNN algorithm can make precise detections and has a high accuracy rate. However, it is slower than most later algorithms.<\/p>\n<h1 id=\"90c3\" class=\"pc nn fo be no pd pe pf ns pg ph pi nw pj pk pl pm pn po pp pq pr ps pt pu pv bj\" data-selectable-paragraph=\"\">Mask R-CNN<\/h1>\n<p id=\"d021\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\"><strong class=\"be os\">Mask R-CNN (Masked Region-based Convolutional Neural Network)<\/strong> is an object detection and instance segmentation algorithm. It is an extension of the Faster R-CNN architecture and is therefore a <strong class=\"be os\">two-stage object detection<\/strong> algorithm.<\/p>\n<p id=\"c9a9\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">Mask R-CNN is more powerful than other object recognition models because it also supports object segmentation. 
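To make this concrete, here is a minimal, framework-free sketch (not from the Mask R-CNN implementation) showing why a segmentation mask localizes an object more exactly than a bounding box: the mask marks every object pixel, while the box only brackets them.

```python
# Illustrative toy: a binary mask pinpoints an object pixel-by-pixel,
# while a bounding box only encloses it. Real Mask R-CNN predicts
# masks with a small CNN head; this sketch just shows the relationship.

def mask_to_bbox(mask):
    """Return the tight (x_min, y_min, x_max, y_max) box around a binary mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (min(cols), min(rows), max(cols), max(rows))

def mask_area(mask):
    """Number of object pixels: the exact region the object occupies."""
    return sum(sum(row) for row in mask)

# A toy 5x5 mask for a diagonal object.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_bbox(mask))  # (1, 1, 3, 3)
print(mask_area(mask))     # 5 object pixels, vs. 9 pixels covered by the box
```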
This is a useful feature for pinpointing the exact location of the object in the image and is also used in image analysis applications.<\/p>\n<figure class=\"px py pz qa qb mc lu lv paragraph-image\">\n<div class=\"md me eb mf bg mg\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mh mi c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*fAk0wTAtrMN49WIY.png\" alt=\"\" width=\"700\" height=\"340\"><\/figure><div class=\"lu lv qc\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/0*fAk0wTAtrMN49WIY.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/0*fAk0wTAtrMN49WIY.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/0*fAk0wTAtrMN49WIY.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/0*fAk0wTAtrMN49WIY.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/0*fAk0wTAtrMN49WIY.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/0*fAk0wTAtrMN49WIY.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/0*fAk0wTAtrMN49WIY.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*fAk0wTAtrMN49WIY.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*fAk0wTAtrMN49WIY.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*fAk0wTAtrMN49WIY.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*fAk0wTAtrMN49WIY.png 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*fAk0wTAtrMN49WIY.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*fAk0wTAtrMN49WIY.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*fAk0wTAtrMN49WIY.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mj mk ml lu lv mm mn be b bf z dv\" data-selectable-paragraph=\"\">Source: <a class=\"af mo\" href=\"https:\/\/paperswithcode.com\/method\/mask-r-cnn\" target=\"_blank\" rel=\"noopener ugc nofollow\">https:\/\/paperswithcode.com\/method\/mask-r-cnn<\/a><\/figcaption>\n<\/figure>\n<ol class=\"\">\n<li id=\"a0b5\" class=\"mp mq fo be b mr ms mt mu mv mw mx my om na nb nc on ne nf ng oo ni nj nk nl op oq or bj\" data-selectable-paragraph=\"\">First Step: features of the input image (for example, object positions, dimensions, etc.) are extracted. This is done by the \u201cbackbone network\u201d part of the model, typically a customized CNN designed to extract the image\u2019s features.<\/li>\n<li id=\"f337\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl op oq or bj\" data-selectable-paragraph=\"\">Second Step: objects are detected and segmented using these features. 
At this stage, the position, size, and segmentation mask (the exact pixel area the object occupies in the image) are determined for each object.<\/li>\n<\/ol>\n<p id=\"6a20\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">The Mask R-CNN model is trained with a combination of supervised learning and fine-tuning techniques. The model is trained on a large dataset of annotated images, in which the objects of interest and their respective masks are labeled. During training, the model is presented with images and their ground-truth labels, and it learns to predict the class and position of each object in the image, as well as the corresponding mask.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"ab ca qd qe qf qg\" role=\"separator\"><\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<blockquote class=\"ql\"><p id=\"31c1\" class=\"qm qn fo be qo qp qq qr qs qt qu nl dv\" data-selectable-paragraph=\"\">When you\u2019re working on an enterprise scale, managing your ML models can be tricky. <a class=\"af mo\" href=\"https:\/\/www.comet.com\/site\/customers\/uber\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Learn how the team at Uber created a solution for their experiment management needs.<\/a><\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"ab ca qd qe qf qg\" role=\"separator\"><\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<h1 id=\"4c5f\" class=\"pc nn fo be no pd qv pf ns pg qw pi nw pj qx pl pm pn qy pp pq pr qz pt pu pv bj\" data-selectable-paragraph=\"\">Faster R-CNN<\/h1>\n<p id=\"debb\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">The Faster R-CNN algorithm is trained on datasets during the learning process. 
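Training and evaluating detectors like this depends on comparing predicted boxes against labeled ones, usually with an intersection-over-union (IoU) score. A minimal sketch (illustrative only, not Faster R-CNN's actual implementation):

```python
# Intersection over union (IoU) between two axis-aligned boxes given as
# (x_min, y_min, x_max, y_max). Detectors use a score of this kind to
# decide whether a predicted box matches a labeled one.

def iou(box_a, box_b):
    # Corners of the overlapping rectangle (empty if the boxes don't meet).
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

A prediction is typically counted as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5.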
The training datasets consist of pre-labeled images in which the position of every object has been annotated. After the algorithm is trained on these datasets, it scans input images and identifies the objects in them.<\/p>\n<ul class=\"\">\n<li id=\"02f1\" class=\"mp mq fo be b mr ms mt mu mv mw mx my om na nb nc on ne nf ng oo ni nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Faster R-CNN has a two-stage pipeline structure: feature extraction is performed in the first stage, and object recognition and positioning are performed in the second stage. This structure increases the efficiency and accuracy of the model and allows it to produce results faster.<\/li>\n<li id=\"b2fa\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Faster R-CNN uses a Region Proposal Network (RPN) in the region proposal step. This network helps identify potential object regions in the image and reduces unnecessary work for the object recognition step.<\/li>\n<li id=\"c9d8\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Multi-task learning: Faster R-CNN learns multiple tasks simultaneously. 
The model simultaneously performs tasks such as object recognition and positioning, which increases the model\u2019s efficiency and accuracy.<\/li>\n<\/ul>\n<figure class=\"px py pz qa qb mc lu lv paragraph-image\">\n<div class=\"md me eb mf bg mg\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mh mi c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*wmoMrPgVmnM5Uhtx.jpeg\" alt=\"\" width=\"700\" height=\"370\"><\/figure><div class=\"lu lv rb\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/0*wmoMrPgVmnM5Uhtx.jpeg 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*wmoMrPgVmnM5Uhtx.jpeg 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*wmoMrPgVmnM5Uhtx.jpeg 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*wmoMrPgVmnM5Uhtx.jpeg 750w, 
https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*wmoMrPgVmnM5Uhtx.jpeg 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*wmoMrPgVmnM5Uhtx.jpeg 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*wmoMrPgVmnM5Uhtx.jpeg 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*wmoMrPgVmnM5Uhtx.jpeg 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mj mk ml lu lv mm mn be b bf z dv\" data-selectable-paragraph=\"\">Source: towardsdatascience.com\/fast-r-cnn-for-object-detection-a-technical-summary-a0ff94faa022<\/figcaption>\n<\/figure>\n<p id=\"f3f6\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">The Faster R-CNN algorithm works similarly to the R-CNN algorithm, but works faster and makes more precise detections.<\/p>\n<h1 id=\"5ab1\" class=\"pc nn fo be no pd pe pf ns pg ph pi nw pj pk pl pm pn po pp pq pr ps pt pu pv bj\" data-selectable-paragraph=\"\">SSD (Single Shot Detector)<\/h1>\n<p id=\"2ede\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">Single Shot MultiBox Detector (SSD) is for object recognition and localization. 
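SSD's core idea is to tile several feature-map grids with default boxes (anchors) of different scales and aspect ratios, then predict offsets and classes for each. The sketch below is an illustrative toy with made-up grid sizes and ratios, not the actual SSD configuration:

```python
# Toy sketch of SSD-style anchor generation: for each cell of several
# feature-map grids (one grid per scale), place default boxes with a few
# aspect ratios. Grid sizes, ratios, and the 300px input are illustrative.

def make_anchors(grid_sizes, image_size=300, aspect_ratios=(1.0, 2.0, 0.5)):
    anchors = []  # each anchor is (cx, cy, w, h) in pixels
    for grid in grid_sizes:
        step = image_size / grid       # cell size at this scale
        box_scale = image_size / grid  # coarser grids -> bigger default boxes
        for gy in range(grid):
            for gx in range(grid):
                cx = (gx + 0.5) * step  # anchor centered in its cell
                cy = (gy + 0.5) * step
                for ar in aspect_ratios:
                    w = box_scale * ar ** 0.5
                    h = box_scale / ar ** 0.5
                    anchors.append((cx, cy, w, h))
    return anchors

anchors = make_anchors([4, 2, 1])  # three scales: 4x4, 2x2, and 1x1 grids
print(len(anchors))  # (16 + 4 + 1) cells * 3 ratios = 63 anchors
```

During training, each labeled object is matched to the anchors that overlap it best, and the network learns to adjust those anchors toward the true box.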
SSD aims to identify and localize multiple objects in a single pass.<\/p>\n<p id=\"c51b\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">SSD analyzes the input image at multiple scales simultaneously and uses multiple default detection boxes (anchors) at each scale. Each anchor is designed based on the expected dimensions of objects in the image and is used to estimate the position of an object.<\/p>\n<p id=\"54b3\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">Another way SSD differs from other object recognition models is that it performs a single classification step. In two-stage models, object recognition and localization are done separately, with candidate regions first produced by a region proposal network (RPN), while SSD performs a single classification step for each object.<\/p>\n<p id=\"a24d\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">Other advantages of SSD include speed and efficiency. The model runs faster than two-stage object recognition models while remaining accurate enough for many image analysis applications.<\/p>\n<h1 id=\"6f56\" class=\"pc nn fo be no pd pe pf ns pg ph pi nw pj pk pl pm pn po pp pq pr ps pt pu pv bj\" data-selectable-paragraph=\"\">YOLO (You Only Look Once)<\/h1>\n<p id=\"e020\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">The YOLO algorithm scans the given image in a single pass, dividing it into parts. In each of these parts, it detects whether objects are present and estimates the <strong class=\"be os\">position<\/strong> of each object. 
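The partitioning idea can be sketched as follows: the image is split into an S × S grid, and each object is assigned to the cell containing its center. This toy uses the grid size and input resolution of the original YOLO paper, but is purely illustrative; the real network predicts boxes and classes for every cell in one forward pass.

```python
# Toy sketch of YOLO's grid assignment: split the image into S x S cells
# and assign each object to the cell containing its center point.

S = 7            # grid size used in the original YOLO paper
IMAGE = 448      # input resolution used by YOLO v1

def responsible_cell(box):
    """box = (x_min, y_min, x_max, y_max) in pixels; return (col, row)."""
    cx = (box[0] + box[2]) / 2   # box center
    cy = (box[1] + box[3]) / 2
    cell = IMAGE / S             # side length of one grid cell
    return int(cx // cell), int(cy // cell)

print(responsible_cell((100, 200, 180, 300)))  # center (140, 250) -> (2, 3)
```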
YOLO is famous for its speed and high accuracy.<\/p>\n<ul class=\"\">\n<li id=\"a3e5\" class=\"mp mq fo be b mr ms mt mu mv mw mx my om na nb nc on ne nf ng oo ni nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Image Partitions: The image is divided into squares of certain sizes, and object recognition is performed for each cell.<\/li>\n<li id=\"e989\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Object Predictions: For each cell, object predictions are made by the neural network, and the region of the object is determined.<\/li>\n<li id=\"8730\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Class Predictions: A class estimate is made for each region, determining what the object is.<\/li>\n<li id=\"054b\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">Object Classification and Regional Positioning: Object recognition is completed by combining the class and location data for each object.<\/li>\n<\/ul>\n<figure class=\"px py pz qa qb mc lu lv paragraph-image\">\n<div class=\"md me eb mf bg mg\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mh mi c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*sUzuKJw1eA6whViW.png\" alt=\"\" width=\"700\" height=\"372\"><\/figure><div class=\"lu lv rc\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/0*sUzuKJw1eA6whViW.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/0*sUzuKJw1eA6whViW.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/0*sUzuKJw1eA6whViW.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/0*sUzuKJw1eA6whViW.png 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/0*sUzuKJw1eA6whViW.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/0*sUzuKJw1eA6whViW.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/0*sUzuKJw1eA6whViW.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*sUzuKJw1eA6whViW.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*sUzuKJw1eA6whViW.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*sUzuKJw1eA6whViW.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*sUzuKJw1eA6whViW.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*sUzuKJw1eA6whViW.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*sUzuKJw1eA6whViW.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*sUzuKJw1eA6whViW.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mj mk ml lu lv mm mn be b bf z dv\" data-selectable-paragraph=\"\"><a class=\"af mo\" 
href=\"https:\/\/pjreddie.com\/darknet\/yolo\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">https:\/\/pjreddie.com\/darknet\/yolo\/<\/a><\/figcaption>\n<\/figure>\n<h2 id=\"bdf4\" class=\"nm nn fo be no np nq nr ns nt nu nv nw mz nx ny nz nd oa ob oc nh od oe of og bj\" data-selectable-paragraph=\"\">YOLOv3 (You Only Look Once version 3)<\/h2>\n<p id=\"3ef2\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">YOLOv3 (YOLO version 3) is an object detection algorithm for detecting and classifying objects in images or video frames. Released in 2018, it is more accurate and efficient than the original YOLO, improving on it in several ways:<\/p>\n<ul class=\"\">\n<li id=\"3b3f\" class=\"mp mq fo be b mr ms mt mu mv mw mx my om na nb nc on ne nf ng oo ni nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">YOLOv3 uses anchor boxes, which are predefined boxes, to detect objects. 
These anchor boxes help YOLOv3 handle objects of different shapes and sizes more effectively.<\/li>\n<li id=\"018f\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">YOLOv3 uses a stronger convolutional neural network (CNN) architecture, allowing it to learn more complex patterns and perform better on tasks such as object detection.<\/li>\n<li id=\"973f\" class=\"mp mq fo be b mr ot mt mu mv ou mx my om ov nb nc on ow nf ng oo ox nj nk nl ra oq or bj\" data-selectable-paragraph=\"\">YOLOv3 also uses a multi-scale training and prediction approach, allowing it to detect objects at different scales in the input image.<\/li>\n<\/ul>\n<p id=\"3933\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">Overall, YOLOv3 is a more advanced object detection algorithm than YOLO and achieves higher accuracy and efficiency.<\/p>\n<h1 id=\"bd47\" class=\"pc nn fo be no pd pe pf ns pg ph pi nw pj pk pl pm pn po pp pq pr ps pt pu pv bj\" data-selectable-paragraph=\"\">DSSD (Deconvolutional Single Shot Detector)<\/h1>\n<p id=\"fea7\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">DSSD (Deconvolutional Single Shot Detector) is a single-stage object detection algorithm developed to improve the speed and accuracy of object detection. 
It is based on Single Shot Detector (SSD) architecture, which is a fast and effective object detection algorithm widely used in various applications.<\/p>\n<figure class=\"px py pz qa qb mc lu lv paragraph-image\">\n<div class=\"md me eb mf bg mg\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mh mi c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*RHlQCKkShmiPOge6pP0OzQ.png\" alt=\"\" width=\"700\" height=\"421\"><\/figure><div class=\"lu lv rd\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*RHlQCKkShmiPOge6pP0OzQ.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*RHlQCKkShmiPOge6pP0OzQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*RHlQCKkShmiPOge6pP0OzQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*RHlQCKkShmiPOge6pP0OzQ.png 
750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*RHlQCKkShmiPOge6pP0OzQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*RHlQCKkShmiPOge6pP0OzQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*RHlQCKkShmiPOge6pP0OzQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*RHlQCKkShmiPOge6pP0OzQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mj mk ml lu lv mm mn be b bf z dv\" data-selectable-paragraph=\"\"><strong class=\"be os\">Top:<\/strong> SSD, <strong class=\"be os\">Bottom:<\/strong> DSSD, <strong class=\"be os\">Source:<\/strong> Fu, Cheng-Yang, et al. \u201cDssd: Deconvolutional single shot detector.\u201d <em class=\"re\">arXiv preprint arXiv:1701.06659<\/em> (2017).<\/figcaption>\n<\/figure>\n<p id=\"1b8c\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">Like SSD, DSSD uses a convolutional neural network (CNN) to process the input image and predict the position and class of objects in the image. However, DSSD brings several improvements to improve the performance of SSD architecture.<\/p>\n<p id=\"ef3c\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">A major improvement in DSSD is the use of deconvolutional layers that upscale CNN-generated feature maps. 
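The upscaling step can be sketched in one dimension: a transposed convolution spreads a weighted copy of its kernel into the output for every input value, growing an n-element map to (n − 1) × stride + k elements. This is a hand-rolled illustration only; DSSD uses learned 2-D transposed convolutions.

```python
# Minimal 1-D sketch of a "deconvolution" (transposed convolution):
# each input value scatters a weighted copy of the kernel into the
# output, so the feature map grows from n to (n - 1) * stride + k.

def transposed_conv1d(feature_map, kernel, stride=2):
    n, k = len(feature_map), len(kernel)
    out = [0.0] * ((n - 1) * stride + k)
    for i, v in enumerate(feature_map):
        for j, w in enumerate(kernel):
            out[i * stride + j] += v * w  # scatter-add into the output
    return out

coarse = [1.0, 2.0, 3.0]  # low-resolution feature map
print(transposed_conv1d(coarse, [1.0, 1.0]))  # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

With stride 2 the output has roughly twice the resolution of the input, which is exactly the effect DSSD exploits to recover spatial detail.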
These layers increase the spatial resolution of the feature maps, which is essential for accurately localizing and identifying small objects in an image.<\/p>\n<h1 id=\"771d\" class=\"pc nn fo be no pd pe pf ns pg ph pi nw pj pk pl pm pn po pp pq pr ps pt pu pv bj\" data-selectable-paragraph=\"\">Conclusion<\/h1>\n<p id=\"a2ad\" class=\"pw-post-body-paragraph mp mq fo be b mr oh mt mu mv oi mx my mz oj nb nc nd ok nf ng nh ol nj nk nl fh bj\" data-selectable-paragraph=\"\">We\u2019ve introduced object detection and its basic logic, and we\u2019ve looked at the R-CNN, SSD, YOLO, Mask R-CNN, Faster R-CNN, YOLOv3, and DSSD algorithms.<\/p>\n<p id=\"5946\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">You can <a class=\"af mo\" href=\"https:\/\/iremkomurcu.medium.com\/\" rel=\"noopener\">follow my Medium account<\/a>, and if you enjoyed the article, you can show your appreciation with claps.<\/p>\n<p id=\"5818\" class=\"pw-post-body-paragraph mp mq fo be b mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl fh bj\" data-selectable-paragraph=\"\">You can also follow and reach me on social media. Thanks!<\/p>\n<p>https:\/\/iremkomurcu.com\/<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Photo by Brecht Denil on Unsplash Object detection is a field of computer vision used to identify and position objects within an image. Examples of object detection applications include detecting abnormal movement from security cameras, obstacle detection in autonomous driving, and character detection from within a document. How do Object Detection Algorithms Work? 
There are [&hellip;]<\/p>\n","protected":false},"author":74,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[6],"tags":[],"coauthors":[171],"class_list":["post-7130","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>First Step to Object Detection Algorithms - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"First Step to Object Detection Algorithms\" \/>\n<meta property=\"og:description\" content=\"Photo by Brecht Denil on Unsplash Object detection is a field of computer vision used to identify and position objects within an image. Examples of object detection applications include detecting abnormal movement from security cameras, obstacle detection in autonomous driving, and character detection from within a document. How do Object Detection Algorithms Work? 
There are [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-08-14T13:01:20+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:14:48+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg\" \/>\n<meta name=\"author\" content=\"Irem Komurcu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Irem Komurcu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"First Step to Object Detection Algorithms - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/","og_locale":"en_US","og_type":"article","og_title":"First Step to Object Detection Algorithms","og_description":"Photo by Brecht Denil on Unsplash Object detection is a field of computer vision used to identify and position objects within an image. Examples of object detection applications include detecting abnormal movement from security cameras, obstacle detection in autonomous driving, and character detection from within a document. How do Object Detection Algorithms Work? 
There are [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-08-14T13:01:20+00:00","article_modified_time":"2025-04-24T17:14:48+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg","type":"","width":"","height":""}],"author":"Irem Komurcu","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Irem Komurcu","Est. reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/"},"author":{"name":"Irem Komurcu","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/bd9f8cd6514218b6ad4a7620a68ba1eb"},"headline":"First Step to Object Detection Algorithms","datePublished":"2023-08-14T13:01:20+00:00","dateModified":"2025-04-24T17:14:48+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/"},"wordCount":1430,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg","articleSection":["Machine Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/","url":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/","name":"First Step to Object Detection Algorithms - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg","datePublished":"2023-08-14T13:01:20+00:00","dateModified":"2025-04-24T17:14:48+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#primaryimage","url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg","contentUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*lqph4TLQsUdiSeBNuvIMmw.jpeg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/first-step-to-object-detection-algorithms\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"First Step to Object Detection Algorithms"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/bd9f8cd6514218b6ad4a7620a68ba1eb","name":"Irem Komurcu","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/40f7738f502b3180fb9452d4b9ed449a","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1649927440797-96x96.jpg","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1649927440797-96x96.jpg","caption":"Irem 
Komurcu"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/iremkomurcubmgmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7130","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/74"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=7130"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7130\/revisions"}],"predecessor-version":[{"id":15582,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7130\/revisions\/15582"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=7130"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=7130"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=7130"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=7130"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}