{"id":5740,"date":"2023-05-03T11:22:36","date_gmt":"2023-05-03T19:22:36","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=5740"},"modified":"2025-04-24T17:15:34","modified_gmt":"2025-04-24T17:15:34","slug":"compare-object-detection-models-from-torchvision","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/","title":{"rendered":"Compare Object Detection Models From TorchVision"},"content":{"rendered":"\n<figure class=\"wp-block-image aligncenter size-full wp-image-5741\"><img loading=\"lazy\" decoding=\"async\" width=\"2062\" height=\"1188\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM.png\" alt=\"Comparing object detection model predictions with masks and bounding boxes.\" class=\"wp-image-5741\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM.png 2062w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM-300x173.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM-1024x590.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM-768x442.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM-1536x885.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.20.55-PM-2048x1180.png 2048w\" sizes=\"auto, (max-width: 2062px) 100vw, 2062px\" \/><figcaption class=\"wp-element-caption\"><a href=\"https:\/\/www.comet.com\/anmorgan24\/torchvision-object-detection\/view\/xhYVJ6hqqvNG537dFs9Q5Vxa0\/panels\">Comparing object detection predictions in Comet<\/a>; GIF by author<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Introduction<\/span><\/h2>\n\n\n\n<p 
class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Object detection is one of the most popular applications of machine learning for computer vision. A detection model predicts both the class types and locations of each distinct object in an image. Object detection models have a wide range of applications including manufacturing, surveillance, health care, and more.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">TorchVision is a Python package that extends the PyTorch framework for computer vision use cases. In TorchVision\u2019s detection module, developers can find pre-trained object detection models that are ready to be fine-tuned on their own datasets. But how can you systematically find the best model for a particular use-case? Here, we&#8217;ll explore how to use an experiment tracking tool like Comet to visually compare and evaluate object detection models.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Follow along with the full code in <\/span><a href=\"https:\/\/colab.research.google.com\/drive\/1_JDpLmluoN6BoR4WvbpnSibhMxXW82re#scrollTo=7-OqwEUj98VK\"><span style=\"font-weight: 400;\">this Colab<\/span><\/a><span style=\"font-weight: 400;\"> and check out <\/span><a href=\"https:\/\/www.comet.com\/anmorgan24\/torchvision-object-detection\/view\/i2d2OPoUb5tGTSPiaFckgwp6D\/panels\"><span style=\"font-weight: 400;\">the public project here<\/span><\/a><span style=\"font-weight: 400;\">!<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">What is Object Detection?<\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Object detection is a computer vision task that aims to identify instances of objects in images and assign them to specific classes. 
At a low level, object detection seeks to answer the question, \u201cwhat objects are where?\u201d<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5745\"><img loading=\"lazy\" decoding=\"async\" width=\"711\" height=\"484\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/turtle_detection.gif\" alt=\"Object detection of sea animals.\" class=\"wp-image-5745\"\/><figcaption class=\"wp-element-caption\">Detecting sea animals; GIF by author<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Object detection algorithms are generally separated into two categories: single-stage (RetinaNet, SSD, FCOS, YOLO, etc.) and two-stage (Fast RCNN, Mask RCNN, FPN, etc.). In two-stage detectors, one model is used to extract generalized regions of objects, and a second model is used to classify and further refine the location of an object. Single-stage detectors do all of this in one step. Single-stage detectors tend to be faster and less computationally expensive than two-stage detectors, but they\u2019re also less accurate.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5748\"><img loading=\"lazy\" decoding=\"async\" width=\"1086\" height=\"718\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-9.45.14-AM.png\" alt=\"Diagram of one stage and two stage object detection.\" class=\"wp-image-5748\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-9.45.14-AM.png 1086w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-9.45.14-AM-300x198.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-9.45.14-AM-1024x677.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-9.45.14-AM-768x508.png 768w\" sizes=\"auto, (max-width: 1086px) 100vw, 1086px\" \/><figcaption 
class=\"wp-element-caption\">Image from <em><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/2107.07153\">Semantic Image Cropping<\/a><\/em>, by Oriol Corcoll Andreu<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Finetuning Pre-trained Models<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">The best object detection models are trained on tens, if not hundreds, of thousands of labeled images. What\u2019s more, image datasets themselves are inherently computationally expensive to process. To train an object detection model from scratch requires a lot of time and resources that aren\u2019t always available. To train several object detection models for comparison requires even more time and resources. Thankfully, we don\u2019t have to. Instead, we can use transfer learning or fine-tune pre-trained models.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">In essence, both these methods allow us to take advantage of the weights and biases learned from one task and repurpose them on a new task. By leveraging feature representations from a pre-trained model, we don\u2019t have to train a new model from scratch, saving us time and compute resources. 
What\u2019s more, these methods can contribute to rapid boosts in model performance for little overhead.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5750\"><img loading=\"lazy\" decoding=\"async\" width=\"1380\" height=\"1102\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-10.01.29-AM.png\" alt=\"Difference in model performance with transfer learning and without transfer learning\" class=\"wp-image-5750\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-10.01.29-AM.png 1380w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-10.01.29-AM-300x240.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-10.01.29-AM-1024x818.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-10.01.29-AM-768x613.png 768w\" sizes=\"auto, (max-width: 1380px) 100vw, 1380px\" \/><figcaption class=\"wp-element-caption\">Image from <em><a href=\"https:\/\/www.researchgate.net\/figure\/A-graphic-representation-of-the-potential-benefits-of-transfer-learning_fig79_348512084\">A graphic representation of the potential benefits of transfer learning<\/a><\/em>, by Laura Aelenei<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Transfer learning and fine-tuning are similar processes but with one key difference. In transfer learning, all previously trained layers are frozen, and (optionally) additional layers are added for retraining. In fine-tuning, all previously trained layers are retrained, but at a very low learning rate. 
Both methods typically result in boosted initial performance, steeper improvement slopes, and elevated final performance.&nbsp;<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">TorchVision&#8217;s Pre-Trained Models<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">TorchVision\u2019s detection module comes with several pre-trained models already built in. For this tutorial we will be comparing <\/span><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/1504.08083\"><span style=\"font-weight: 400;\">Fast-RCNN<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/1506.01497\"><span style=\"font-weight: 400;\">Faster-RCNN<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/1703.06870\"><span style=\"font-weight: 400;\">Mask-RCNN<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/1708.02002\"><span style=\"font-weight: 400;\">RetinaNet<\/span><\/a><span style=\"font-weight: 400;\">, and <\/span><a href=\"https:\/\/ar5iv.labs.arxiv.org\/html\/1904.01355\"><span style=\"font-weight: 400;\">FCOS<\/span><\/a><span style=\"font-weight: 400;\">, with either <\/span><a href=\"https:\/\/pytorch.org\/vision\/master\/models\/generated\/torchvision.models.resnet50.html\"><span style=\"font-weight: 400;\">ResNet50<\/span><\/a><span style=\"font-weight: 400;\"> or 
<\/span><a href=\"https:\/\/pytorch.org\/vision\/main\/models\/mobilenetv2.html\"><span style=\"font-weight: 400;\">MobileNet v2<\/span><\/a><span style=\"font-weight: 400;\"> backbones. Each of these models was previously trained on the <\/span><a href=\"https:\/\/cocodataset.org\/#home\"><span style=\"font-weight: 400;\">COCO dataset<\/span><\/a><span style=\"font-weight: 400;\">. We will download the trained models, replace the classifier heads to reflect our target classes, and retrain the models on our own data.<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Image Data Formats<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">In computer vision, images are represented as matrices of pixel intensity values. Black and white (grayscale) images are usually two-dimensional, and color images are typically three-dimensional, with one \u201clayer\u201d each representing red, blue, and green pixels.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Just as there are several ways to represent images, there are also several ways we can represent our labels and predictions. In the <\/span><a href=\"https:\/\/colab.research.google.com\/drive\/1_JDpLmluoN6BoR4WvbpnSibhMxXW82re#scrollTo=wxYQgzSC8Z_n\"><span style=\"font-weight: 400;\">full code for this tutorial<\/span><\/a><span style=\"font-weight: 400;\">, we\u2019ll provide methods for logging bounding boxes, <\/span><a href=\"https:\/\/huggingface.co\/tasks\/image-segmentation\"><span style=\"font-weight: 400;\">segmentation masks<\/span><\/a><span style=\"font-weight: 400;\">, and <\/span><a href=\"https:\/\/humansintheloop.org\/services\/polygon-annotation\/\"><span style=\"font-weight: 400;\">polygon annotations<\/span><\/a><span style=\"font-weight: 400;\"> to Comet. 
But when comparing our TorchVision models we will only use bounding boxes, as not all of our models are able to calculate the other types of predictions.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">To further complicate things, not all algorithms format bounding box annotations in the same way. Below we\u2019ve listed a few of the most common bounding box formats you\u2019re likely to run into, but we\u2019ll be focusing on Pascal VOC and COCO formats in this tutorial.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5753\"><img loading=\"lazy\" decoding=\"async\" width=\"1638\" height=\"1110\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-7.55.33-AM.png\" alt=\"An image of a cat and a potted plant showing different bounding box formats, including COCO, YOLO, Albumentations, and Pascal VOC\" class=\"wp-image-5753\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-7.55.33-AM.png 1638w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-7.55.33-AM-300x203.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-7.55.33-AM-1024x694.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-7.55.33-AM-768x520.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-7.55.33-AM-1536x1041.png 1536w\" sizes=\"auto, (max-width: 1638px) 100vw, 1638px\" \/><figcaption class=\"wp-element-caption\">Image from <a href=\"https:\/\/albumentations.ai\/docs\/getting_started\/bounding_boxes_augmentation\/\">Albumentations<\/a><\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><b>Pascal VOC:<\/b><span style=\"font-weight: 400;\"> [xmin, ymin, xmax, ymax] \u2192 [98, 345, 420, 462]<\/span><\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\"><b>Albumentations:<\/b><span style=\"font-weight: 400;\"> normalized([x_min, y_min, x_max, y_max]) \u2192 [0.153125, 0.71875, 0.65625, 0.9625]<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><b>COCO:<\/b><span style=\"font-weight: 400;\"> [xmin, ymin, width, height] \u2192 [98, 345, 322, 117]<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><b>YOLO:<\/b><span style=\"font-weight: 400;\"> normalized([x_center, y_center, width, height]) \u2192 [0.4046875, 0.840625, 0.503125, 0.24375]<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Evaluation Metrics for Object Detection<\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">A naive approach to evaluating object detection models might be binary classification (\u201cmatch\u201d or \u201cno match\u201d, \u201c1\u201d or \u201c0\u201d), but this method leaves little room for nuance. We can do better!<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Intersection Over Union (IoU)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">The standard evaluation metric for comparing individual bounding boxes is Intersection over Union, or IoU. 
IoU evaluates the degree of overlap between the ground truth bounding box and the predicted bounding box with a value between 0 and 1.&nbsp;<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5755\"><img loading=\"lazy\" decoding=\"async\" width=\"1792\" height=\"962\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.45.51-PM.png\" alt=\"Image of multiple bounding boxes overlapping, explaining how to calculate intersection and union for object detection\" class=\"wp-image-5755\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.45.51-PM.png 1792w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.45.51-PM-300x161.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.45.51-PM-1024x550.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.45.51-PM-768x412.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-05-01-at-10.45.51-PM-1536x825.png 1536w\" sizes=\"auto, (max-width: 1792px) 100vw, 1792px\" \/><figcaption class=\"wp-element-caption\">Intersection over Union, image from Shivy Yohanandan in <a href=\"https:\/\/towardsdatascience.com\/map-mean-average-precision-might-confuse-you-5956f1bfa9e2\">Towards Data Science<\/a><\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5758\"><img loading=\"lazy\" decoding=\"async\" width=\"1239\" height=\"413\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/turtle_iou.gif\" alt=\"Selecting the intersection and union of bounding boxes for object detection on an image of a turtle underwater with bright yellow fish.\" class=\"wp-image-5758\"\/><figcaption class=\"wp-element-caption\">The difference between the intersection and union of a ground truth bounding box and a prediction bounding box; GIF by 
author<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">IoU is computed for each pair of predicted and ground truth bounding boxes in an image, and then a threshold is applied. If a prediction\u2019s IoU meets the threshold, it\u2019s marked as a \u201ctrue positive.\u201d All predictions not marked as \u201ctrue positives\u201d are marked as \u201cfalse positives,\u201d and any items left unmatched in our \u201cground truth\u201d annotations list are marked as \u201cfalse negatives.\u201d The decision to mark a detection as TP, FP, or FN is completely contingent on the choice of IoU threshold. The IoU threshold is commonly set at 0.5, but you may want to experiment with this number. Once we\u2019ve counted our true positives, false positives, and false negatives, we can compute precision and recall.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full wp-image-5759\"><img loading=\"lazy\" decoding=\"async\" width=\"2402\" height=\"788\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM.png\" alt=\"Image showing the difference between true positives (TP), false positives (FP), and false negatives (FN).\" class=\"wp-image-5759\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM.png 2402w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM-300x98.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM-1024x336.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM-768x252.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM-1536x504.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-4.54.29-PM-2048x672.png 2048w\" sizes=\"auto, (max-width: 2402px) 100vw, 2402px\" \/><figcaption class=\"wp-element-caption\">True positives, false 
positives, and false negatives; note that we do not calculate true negatives in object detection. Image by author<\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Mean Average Precision (mAP) and Mean Average Recall (mAR)<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Precision (also known as positive predictive value) measures how exact the model is in identifying only relevant objects: of all the detections it makes, what fraction are correct? The equation for precision is:<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"2544\" height=\"508\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM.png\" alt=\"Formula to calculate precision\" class=\"wp-image-5761\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM.png 2544w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM-300x60.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM-1024x204.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM-768x153.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM-1536x307.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.25.52-AM-2048x409.png 2048w\" sizes=\"auto, (max-width: 2544px) 100vw, 2544px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Recall (also known as sensitivity) measures the ability of the model to detect all ground truths. 
The equation for recall is:<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"2544\" height=\"488\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM.png\" alt=\"Formula for recall\" class=\"wp-image-5762\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM.png 2544w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM-300x58.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM-1024x196.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM-768x147.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM-1536x295.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.26.34-AM-2048x393.png 2048w\" sizes=\"auto, (max-width: 2544px) 100vw, 2544px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">In a perfect world, our perfect model would have a precision and recall of 1, meaning it predicted zero false negatives and zero false positives. But in the real world this isn\u2019t generally achievable. A precision-recall curve plots the value of precision against recall for different confidence thresholds. The area under this curve is also referred to as the Average&nbsp;<\/span><span style=\"font-weight: 400;\">Precision (AP). 
Average Recall (AR) is twice the area under the recall-IoU curve, taken over IoU thresholds from 0.5 to 1.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5763\"><img loading=\"lazy\" decoding=\"async\" width=\"2194\" height=\"1024\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM.png\" alt=\"Chart of a precision recall curve\" class=\"wp-image-5763\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM.png 2194w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM-300x140.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM-1024x478.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM-768x358.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM-1536x717.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-19-at-9.26.57-PM-2048x956.png 2048w\" sizes=\"auto, (max-width: 2194px) 100vw, 2194px\" \/><figcaption class=\"wp-element-caption\">Average precision is equal to the area under the PR curve; image by author.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Mean Average Precision (mAP) and mean Average Recall (mAR) are calculated by taking the mean of the AP or AR over all classes and\/or over all IoU thresholds. They are two of the most common evaluation metrics for object detection and are used to evaluate submissions in popular computer vision competitions like the COCO and Pascal VOC challenges. 
We can derive many other metrics from mAP and mAR, including mAP for objects at different scales, mAP at specific IoU thresholds, and mAR with different maximum numbers of detections per image.&nbsp;<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full wp-image-5764\"><img loading=\"lazy\" decoding=\"async\" width=\"1588\" height=\"660\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.20.19-AM.png\" alt=\"COCO's 12 metrics for evaluating object detection models.\" class=\"wp-image-5764\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.20.19-AM.png 1588w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.20.19-AM-300x125.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.20.19-AM-1024x426.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.20.19-AM-768x319.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-8.20.19-AM-1536x638.png 1536w\" sizes=\"auto, (max-width: 1588px) 100vw, 1588px\" \/><figcaption class=\"wp-element-caption\">The 12 metrics used for characterizing the performance of an object detector on COCO; image from COCO<\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Other metrics<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">If we were building a model to detect very large objects (relative to the image\u2019s <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Field_of_view\"><span style=\"font-weight: 400;\">field of view<\/span><\/a><span style=\"font-weight: 400;\">), we might be willing to consider models with poor \u201cAP_small\u201d scores, as this metric would be less relevant to our use case. 
If we were planning on using our model to aid in medical diagnoses, we might place a higher emphasis on mAR values than mAP values, since it would likely be more important not to miss any positive samples than to avoid flagging false positives.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">For this tutorial we use a combination of the mAP and mAR values calculated by the <\/span><a href=\"https:\/\/github.com\/pytorch\/vision\/blob\/main\/references\/detection\/coco_eval.py\"><span style=\"font-weight: 400;\">COCO evaluator<\/span><\/a><span style=\"font-weight: 400;\"> and the <\/span><a href=\"https:\/\/torchmetrics.readthedocs.io\/en\/stable\/detection\/mean_average_precision.html\"><span style=\"font-weight: 400;\">torchmetrics.detection<\/span><\/a><span style=\"font-weight: 400;\"> module. We\u2019ll log all relevant values to a DataFrame and then examine it more closely in the Comet Data Panel. This will help give us a full picture of how different models perform in different scenarios, and we\u2019ll choose our \u201cbest\u201d model accordingly. <\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Challenges of Comparing Object Detection Models<\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Comparing object detection models can be challenging for a number of reasons. Different models have different requirements when it comes to input size and shape, annotation formats, and other dataset attributes. Hyperparameters vary from algorithm to algorithm, and keeping track of which values produce which results can quickly become tedious and overwhelming.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Most computer vision pipelines incorporate image augmentation in some form, so dataset versioning becomes essential for reproducibility and explainability. 
What\u2019s more, performance metrics only tell part of the story when it comes to object detection. Often, it\u2019s necessary to visualize prediction annotations to understand where things are going right and where they\u2019re going wrong.&nbsp;<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5751\"><img loading=\"lazy\" decoding=\"async\" width=\"1412\" height=\"1278\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-9.34.33-AM.png\" alt=\"Idealized accuracy-speed tradeoff has a logarithmic relationship\" class=\"wp-image-5751\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-9.34.33-AM.png 1412w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-9.34.33-AM-300x272.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-9.34.33-AM-1024x927.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-16-at-9.34.33-AM-768x695.png 768w\" sizes=\"auto, (max-width: 1412px) 100vw, 1412px\" \/><figcaption class=\"wp-element-caption\">Idealized accuracy-speed tradeoff has a logarithmic relationship; image by Jeff Miller on <a href=\"https:\/\/www.researchgate.net\/figure\/Idealized-speed-accuracy-trade-off-function-tracing-out-the-relationship-between-reaction_fig1_5238422\">ResearchGate<\/a><\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">In real-world applications, we often make choices to balance 
accuracy and speed. The performance of a model under a given set of circumstances might not be relevant if we aren\u2019t able to replicate those circumstances in production. So when looking for the \u201cbest\u201d object detection model, it becomes essential to monitor a wide range of metrics pertaining to your particular use case. In this tutorial, we\u2019ll show you how Comet helps us do this.<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Using Comet for Object Detection<\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Clearly, comparing object detection models isn\u2019t nearly as simple as just minimizing a single loss function. We have a wide range of metrics to calculate and log, some of which need to be visualized to be fully understood, and each model has its own graph definition, set of hyperparameters, code output, and other features. To help keep track of all of these moving pieces, we\u2019ll log our inputs, metrics, and outputs to Comet, an experiment tracking tool. Comet has extensive auto-logging capabilities, but we\u2019ll also explore how to log custom metrics and outputs to Comet. By visualizing all of our data in the Comet UI, we\u2019ll be able to get a much more complete understanding of how each of our models behaves, under which circumstances, and with which data.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">For this tutorial, we\u2019ll be using the <\/span><a href=\"https:\/\/www.cis.upenn.edu\/~jshi\/ped_html\/\"><span style=\"font-weight: 400;\">Penn-Fudan dataset<\/span><\/a><span style=\"font-weight: 400;\">, which consists of 170 images labeled with 345 instances of pedestrians. Pedestrian detection has several applications, including surveillance, training self-driving cars, and other traffic safety applications. 
Since we\u2019re using PyTorch, we\u2019ll need to define a custom dataset class that inherits from the <\/span><a href=\"https:\/\/pytorch.org\/tutorials\/beginner\/basics\/data_tutorial.html\"><span style=\"font-weight: 400;\">torch.utils.data.Dataset<\/span><\/a><span style=\"font-weight: 400;\"> class.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5785\"><img loading=\"lazy\" decoding=\"async\" width=\"775\" height=\"574\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/PennFudan_bbox_mask_new.gif\" alt=\"Example image from the PennFudan dataset with bounding box and mask labels from our object detection model\" class=\"wp-image-5785\"\/><figcaption class=\"wp-element-caption\">Example image from the PennFudan dataset with bounding box and mask labels; GIF by author<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">All of the models in TorchVision\u2019s detection module use Pascal VOC format ([xmin, ymin, xmax, ymax]), so we\u2019ll format our bounding boxes accordingly. We\u2019ll then need to convert the model\u2019s predicted boxes from Pascal VOC to COCO format ([x, y, width, height]) for use with the COCO evaluator and Comet. If you\u2019re using this tutorial with your own model, check your specific model\u2019s annotation requirements. <\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Single Experiment View<\/span><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">In order to get a good understanding of how each of our models is performing, and with which hyperparameters, we\u2019ll start by examining our results at the experiment level. 
<\/span><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Store Hyperparameters<\/span><\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Keeping track of our hyperparameters is essential for reproducibility and explainability. 
Model hyperparameters can affect model performance, computational choices, and what information to retain for analysis. Hyperparameters vary from algorithm to algorithm, and some are more important than others, so this critical task can quickly become tedious and confusing.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Here, we log important hyperparameters with just a single command. For our project we\u2019ll be monitoring the following hyperparameters, which we can adjust by simply editing the relevant key-value pairs:<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><script src=\"https:\/\/gist.github.com\/anmorgan24\/187e89014606abb6ac93f5c88bf4d883.js\"><\/script><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Sometimes hyperparameter values that work well for one model may not work at all for another. For example, the FCOS model tends to struggle with exploding gradients. When using it, we have to significantly decrease the learning rate to compensate for this. If, however, we use the reduced learning rate on a model like Fast-RCNN (typically one of our best-performing models), it performs unusually poorly. 
This is because it fails to ever really \u201clearn\u201d the feature maps of our dataset.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5768\"><img loading=\"lazy\" decoding=\"async\" width=\"2358\" height=\"1258\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM.png\" alt=\"\" class=\"wp-image-5768\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM.png 2358w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM-300x160.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM-1024x546.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM-768x410.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM-1536x819.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-11.48.59-AM-2048x1093.png 2048w\" sizes=\"auto, (max-width: 2358px) 100vw, 2358px\" \/><figcaption class=\"wp-element-caption\">In this Comet experiment panel, green represents a Fast-RCNN model trained with a learning rate of 5e-4. Blue represents the same model trained on a learning rate of 5e-8, the rate needed to prevent exploding gradients in the FCOS model; image by author<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Since we are focusing on comparing different models in this tutorial, we will mostly be keeping the hyperparameters constant. However, if we were looking to optimize the hyperparameters of a single model, we could also pass a list of values to each key and use an optimizer to iterate through them. 
Comet has a <\/span><a href=\"https:\/\/www.comet.com\/docs\/v2\/api-and-sdk\/python-sdk\/introduction-optimizer\/\"><span style=\"font-weight: 400;\">built-in optimizer<\/span><\/a><span style=\"font-weight: 400;\"> that supports RandomSearch, GridSearch, Bayes&#8217; Optimization, and custom-built optimization algorithms.<\/span><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Visualizing Outputs<\/h4>\n\n\n\n<h5 class=\"wp-block-heading\">System Metrics<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Since object detection is such a resource-heavy task, we\u2019ll also want to monitor our system metrics, including CPU and GPU usage. Luckily, Comet automatically does this for us, so we don\u2019t need to add any additional code. This can also help diagnose bottlenecks in our pipeline, aid with reproducibility, and debug crashed experiments.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5769\"><img loading=\"lazy\" decoding=\"async\" width=\"1171\" height=\"512\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-11.21.32-AM.png\" alt=\"System metrics of our Fast RCNN experiment\" class=\"wp-image-5769\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-11.21.32-AM.png 1171w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-11.21.32-AM-300x131.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-11.21.32-AM-1024x448.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-11-at-11.21.32-AM-768x336.png 768w\" sizes=\"auto, (max-width: 1171px) 100vw, 1171px\" \/><figcaption class=\"wp-element-caption\">System metrics of our Fast RCNN experiment; image by author<\/figcaption><\/figure>\n\n\n\n<h5 class=\"wp-block-heading\">Evaluation Metrics<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 
400;\">Each of our PyTorch detection models comes with relevant evaluation metrics built in. Comet is integrated with PyTorch, so each of these pre-defined metrics will be automatically logged to the experiment. This is very helpful when comparing multiple runs of the same model, or different object detection models with the same evaluation metrics, but the PyTorch models we\u2019ve chosen don\u2019t all come with the same built-in metrics. We\u2019ll still use these auto-logged plots to get an initial impression of the performance of our models, but we\u2019ll want to log some of our own metrics for cross-experiment comparisons.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5771\"><img loading=\"lazy\" decoding=\"async\" width=\"2112\" height=\"1252\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM.png\" alt=\"Auto-logged metrics in the Comet UI\" class=\"wp-image-5771\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM.png 2112w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM-300x178.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM-1024x607.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM-768x455.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM-1536x911.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.35.39-PM-2048x1214.png 2048w\" sizes=\"auto, (max-width: 2112px) 100vw, 2112px\" \/><figcaption class=\"wp-element-caption\">Autologged metrics are super helpful when comparing multiple runs with the same model or multiple models with the same evaluation metrics. 
But as you can see in the plot above, not all of our models have the same default evaluation metrics, making these plots less relevant. Instead, we\u2019ll define our own.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">We have the ability to manually log just about any metric, asset, artifact, or graphic we want. In <\/span><span style=\"font-weight: 400;\">this tutorial, we\u2019ll track:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400;\">Mean Average Precision (mAP) of all validation images per epoch<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Mean Average Recall (mAR) of all validation images per epoch<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">TorchMetrics\u2019 12 metrics for characterizing the performance of an object detector (very similar to COCO\u2019s 12 metrics listed above), per image<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Relevant code files from torchvision (engine.py, transforms.py, etc.)<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Graph definitions of our various models<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400;\">Each image in the validation dataset, as well as our model\u2019s predicted bounding boxes, with their corresponding labels and confidence scores.<\/span><\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><script src=\"https:\/\/gist.github.com\/anmorgan24\/f568c1e503db06e05788449f520b1ebd.js\"><\/script><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5770\"><img loading=\"lazy\" decoding=\"async\" width=\"2130\" height=\"1260\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM.png\" alt=\"Custom-defined metrics in Comet UI\" class=\"wp-image-5770\" 
srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM.png 2130w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM-300x177.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM-1024x606.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM-768x454.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM-1536x909.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-14-at-1.41.56-PM-2048x1211.png 2048w\" sizes=\"auto, (max-width: 2130px) 100vw, 2130px\" \/><figcaption class=\"wp-element-caption\">For our experiment, we log epoch mAP, epoch mAR, epoch F1, and loss; image by author<\/figcaption><\/figure>\n\n\n\n<h5 class=\"wp-block-heading\">Graphics Tab<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Understanding where your model is going right and where it\u2019s going wrong can be especially difficult with image datasets. Loss metrics and other scalar values don\u2019t always tell the whole story and can be hard to visualize. So we\u2019ll also log each of our validation images, along with their predicted bounding boxes, per model per epoch. Flipping through a model\u2019s predictions can also be helpful to see how our models improve over time. 
<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5795\"><img loading=\"lazy\" decoding=\"async\" width=\"1529\" height=\"432\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/seg_masks_over_time_1.gif\" alt=\"Examining bounding box predictions over time in our object detection project\" class=\"wp-image-5795\"\/><figcaption class=\"wp-element-caption\">Under the Graphics tab of your experiment view, sort your images by step (ascending), then search for a particular image name (here we used \u201cimage id: 77\u201d). Then use the arrow to watch the model\u2019s predictions over time<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">To log images to Comet, we simply use the \u2018log_image\u2019 method:<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><script src=\"https:\/\/gist.github.com\/anmorgan24\/b93ebc473ca52389fd9473e0d1ee1d2c.js\"><\/script><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Alternatively, we can also pass the annotations to the metadata parameter:<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><script src=\"https:\/\/gist.github.com\/anmorgan24\/21368392bbc63c9fa365d26fefe9d1a6.js\"><\/script><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">In either case, the image annotations should be in JSON format and the bounding boxes should be in COCO format. Bounding boxes can either be passed as a dictionary (as shown below) or as a list of lists. Note that a new instance should be created for each bounding box. 
Polygon points are passed in the format [x1, y1, x2, y2, \u2026, xn, yn].<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><script src=\"https:\/\/gist.github.com\/anmorgan24\/31ce9d8aac10a7dff4ebe7208cb57e5a.js\"><\/script><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project-Level View<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Since we\u2019re comparing object detection models in this tutorial, one of the most important ways we can use Comet is to create a holistic project-level view. Comet automatically generates a basic model performance panel, but we also have the ability to customize our panels for our particular use case.<\/span><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Image Panel<\/span><\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Comet\u2019s new image panel allows us to visualize different models\u2019 predictions per experiment run, over time. Use the step slider to walk through each model\u2019s predictions, or click on an individual image for a closer look.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5773\"><img loading=\"lazy\" decoding=\"async\" width=\"1186\" height=\"675\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/image_slider-1.gif\" alt=\"Examining our object detection models' predictions over time (steps)\" class=\"wp-image-5773\"\/><figcaption class=\"wp-element-caption\">Examining our object detection models&#8217; predictions over time (steps); GIF by author<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">From there, choose to smooth your image or render it in grayscale, select which class labels you want to examine, and set confidence thresholds with a sliding bar.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1380\" height=\"467\" 
src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/seg_masks_gray_label_2.gif\" alt=\"Using the confidence score slider, grayscale converter, and selecting category classes for our object detection model\" class=\"wp-image-5774\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Data Panel<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Sometimes we really want a deep dive into the numbers. With Comet\u2019s Data Panel we can log any CSV, DataFrame or table to our project and explore it interactively in the UI. We logged all 12 evaluation metrics from TorchMetrics\u2019 mean_ap module, as shown below. An image receives a score of -1 for a given metric if that metric isn\u2019t relevant to it. For example, if the model doesn\u2019t predict any \u201clarge\u201d bounding boxes for an image, then mAP_large for that image will be -1. We can reorder columns, sort them, and filter values. Below, we compare our most basic mAP and mAR measures and then sort them to see where precision is very different from recall. Alternatively, we could also check the epoch f1-score that we logged as an additional tool in our toolbox. <\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1113\" height=\"610\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/data_panels.gif\" alt=\"Comet Data Panels \" class=\"wp-image-5775\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span style=\"font-weight: 400;\">Multiple Dashboards <\/span><\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Now that we\u2019ve built all of these panels, we need a way to keep them organized! For this, we build and save multiple dashboards, each of which we\u2019ll use for a different purpose. 
We\u2019ll keep the auto-generated dashboard that Comet built for us, and we\u2019ll organize the rest of our panels into four more dashboards. We have a project overview dashboard that gives us a very basic overview of our project\u2019s stats (parameters used, number of experiments run, and some of the best metrics achieved). We\u2019ll put our image panel and data panel into a Debugging dashboard and we\u2019ll store our plots and charts in a Metrics dashboard. Now we can easily navigate through all of our panels to find exactly what we\u2019re looking for!<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full wp-image-5776\"><img loading=\"lazy\" decoding=\"async\" width=\"1425\" height=\"763\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/multiple_dashboards.gif\" alt=\"Navigating through multiple dashboards in the Comet UI\" class=\"wp-image-5776\"\/><figcaption class=\"wp-element-caption\">Navigating through multiple dashboards in the Comet UI; GIF by author<\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Accuracy Speed Tradeoff<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">At the beginning of this tutorial, we briefly explored machine learning\u2019s accuracy-speed tradeoff. Models with higher precision and accuracy tend to consume more compute resources, and fast models tend to be less accurate. Depending on your use case, your definition of the \u201cbest\u201d model may vary. 
Circling back to this thought, we\u2019ll compare four of our models in terms of their general speed and accuracy in order to understand which models work \u201cbest\u201d for which scenarios.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5777\"><img loading=\"lazy\" decoding=\"async\" width=\"1225\" height=\"594\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/fast_faster_diff.gif\" alt=\"Diffing experiment hyperparameters in Comet\" class=\"wp-image-5777\"\/><figcaption class=\"wp-element-caption\">As its name suggests, Faster RCNN is faster than Fast_RCNN, but as you can see in <a href=\"https:\/\/www.comet.com\/anmorgan24\/torchvision-object-detection\/compare?experiment-tab=metrics&amp;experiments=d55e54c68b6d4a4eba59cdd9d6489f74,b1cd7cb73b2449ce8e298fbeee787882\">the experiment diffing view<\/a> above, it is also a lot less accurate.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">We create a final dashboard called \u201cAccuracy-Speed Tradeoff\u201d and plot some basic evaluation and system metrics for four different models: Mask RCNN, Fast RCNN, RetinaNet, and FCOS. Remember that both RCNN models are two-stage object detection models, which are generally more computationally expensive. 
RetinaNet and FCOS are both single-stage models.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter wp-image-5778\"><img loading=\"lazy\" decoding=\"async\" width=\"2768\" height=\"1276\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM.png\" alt=\"Accuracy-Speed Tradeoff dashboard in Comet UI\" class=\"wp-image-5778\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM.png 2768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM-300x138.png 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM-1024x472.png 1024w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM-768x354.png 768w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM-1536x708.png 1536w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2023-04-15-at-8.20.40-PM-2048x944.png 2048w\" sizes=\"auto, (max-width: 2768px) 100vw, 2768px\" \/><figcaption class=\"wp-element-caption\">Our <a href=\"https:\/\/www.comet.com\/anmorgan24\/torchvision-object-detection\/view\/6n7O0MIB81RzPBdJqZsB9vAL5\/panels\">Accuracy-Speed Tradeoff dashboard<\/a> for four of our base models<\/figcaption><\/figure>\n\n\n\n<h5 class=\"wp-block-heading\">Choosing the Best Model<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Both of our two-stage object detection models (in green and light blue above) far outperform the single-stage models in mean average precision, epoch f1-score, and loss. Shifting to the bottom row of charts, however, we can see that they are also much more computationally expensive. 
It may come as no surprise that Mask RCNN is the slowest model of all, since it\u2019s based on Fast RCNN but produces additional outputs (masks).&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">For a general-purpose object detection model, we might conclude that Fast RCNN performs the best with bounding box prediction. It has the highest mAP and f1, the lowest loss, and consumes far less memory than Mask RCNN. But the \u201cbest\u201d model is subjective and entirely dependent on your use case! If we were looking to deploy our model to a mobile device, Fast RCNN\u2019s memory requirements might disqualify it from our consideration. <\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Comparing and logging object detection models can be a tedious and overwhelming task, but when you have an experiment tracking tool like Comet, you can focus your attention where it really matters. Comet is a powerful tool for tracking your models, datasets, and metrics to keep your experiments organized, reproducible, and explainable.&nbsp;<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"font-weight: 400;\">Try out the code in this tutorial in <\/span><a href=\"https:\/\/colab.research.google.com\/drive\/1_JDpLmluoN6BoR4WvbpnSibhMxXW82re#scrollTo=LAI7OMNH35lu\"><span style=\"font-weight: 400;\">this Colab<\/span><\/a><span style=\"font-weight: 400;\"> and apply it to a dataset of your own! 
You can view the <\/span><a href=\"https:\/\/www.comet.com\/anmorgan24\/torchvision-object-detection\/view\/new\/panels\"><span style=\"font-weight: 400;\">public project here<\/span><\/a><span style=\"font-weight: 400;\"> or, to get started with your own project, <\/span><a href=\"\/signup\"><span style=\"font-weight: 400;\">create an account here for free<\/span><\/a><span style=\"font-weight: 400;\">!<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Additional Resources<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400;\"><a href=\"https:\/\/www.comet.com\/site\/blog\/debugging-classifiers-with-confusion-matrices\/\">Debugging Classifiers With Confusion Matrices<\/a> (for imbalanced datasets)<\/span><\/li>\n\n\n\n<li><a href=\"https:\/\/www.comet.com\/site\/blog\/fixing-object-detection-models-with-better-data\/\">Fixing Object Detection Models With Better Data<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.comet.com\/site\/blog\/kangas-visualize-multimedia-data-at-scale\/\">Kangas: Visualize Multimedia Data at Scale\u00a0<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.comet.com\/site\/blog\/us-comet-registry-to-track-your-machine-learning-models\/\">How to Use the Comet Registry to Track Your Machine Learning Models<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Object detection is one of the most popular applications of machine learning for computer vision. A detection model predicts both the class types and locations of each distinct object in an image. 
Object detection models have a wide range of applications including manufacturing, surveillance, health care, and more.&nbsp; TorchVision is a Python package that [&hellip;]<\/p>\n","protected":false},"author":22,"featured_media":9445,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[8,23,6,7],"tags":[29,30,35,36,37,38,39],"coauthors":[133],"class_list":["post-5740","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-comet-community-hub","category-integrations","category-machine-learning","category-tutorials","tag-computer-vision","tag-deep-learning","tag-image-classification","tag-image-panels","tag-object-detection","tag-pytorch","tag-torchvision"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Compare Object Detection Models From TorchVision<\/title>\n<meta name=\"description\" content=\"It can be very challenging to systematically compare different object detection models, unless you use an experiment tracking tool like Comet\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Compare Object Detection Models From TorchVision\" \/>\n<meta property=\"og:description\" content=\"It can be very challenging to systematically compare different object detection models, unless you use an experiment tracking tool like Comet\" \/>\n<meta 
property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-05-03T19:22:36+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:15:34+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1664\" \/>\n\t<meta property=\"og:image:height\" content=\"980\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Abby Morgan\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@anmorgan2414\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Abby Morgan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"22 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Compare Object Detection Models From TorchVision","description":"It can be very challenging to systematically compare different object detection models, unless you use an experiment tracking tool like Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/","og_locale":"en_US","og_type":"article","og_title":"Compare Object Detection Models From TorchVision","og_description":"It can be very challenging to systematically compare different object detection models, unless you use an experiment tracking tool like Comet","og_url":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-05-03T19:22:36+00:00","article_modified_time":"2025-04-24T17:15:34+00:00","og_image":[{"width":1664,"height":980,"url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png","type":"image\/png"}],"author":"Abby Morgan","twitter_card":"summary_large_image","twitter_creator":"@anmorgan2414","twitter_site":"@Cometml","twitter_misc":{"Written by":"Abby Morgan","Est. 
reading time":"22 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/"},"author":{"name":"Abby Morgan","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/826ee39a2e30cf9d8d73155de09bb7b2"},"headline":"Compare Object Detection Models From TorchVision","datePublished":"2023-05-03T19:22:36+00:00","dateModified":"2025-04-24T17:15:34+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/"},"wordCount":3736,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png","keywords":["Computer Vision","Deep Learning","Image Classification","Image Panels","Object Detection","PyTorch","TorchVision"],"articleSection":["Comet Community Hub","Integrations","Machine Learning","Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/","url":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/","name":"Compare Object Detection Models From 
TorchVision","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png","datePublished":"2023-05-03T19:22:36+00:00","dateModified":"2025-04-24T17:15:34+00:00","description":"It can be very challenging to systematically compare different object detection models, unless you use an experiment tracking tool like Comet","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#primaryimage","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png","width":1664,"height":980,"caption":"computer vision project"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/compare-object-detection-models-from-torchvision\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Compare Object Detection Models From TorchVision"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models 
Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/826ee39a2e30cf9d8d73155de09bb7b2","name":"Abby Morgan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/dbbf1ae921ee179c768f508340415946","url":"https:\/\/secure.gravatar.com\/avatar\/28d4934d14261b4afe12e226f0eaa57c4fb0c2761ad4586eb9a5bec3b8160bc9?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/28d4934d14261b4afe12e226f0eaa57c4fb0c2761ad4586eb9a5bec3b8160bc9?s=96&d=mm&r=g","caption":"Abby Morgan"},"description":"AI\/ML Growth Engineer @ 
Comet","sameAs":["https:\/\/www.comet.com\/","https:\/\/www.linkedin.com\/in\/anmorgan24\/","https:\/\/x.com\/anmorgan2414"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/abigailmcomet-com\/"}]}},"jetpack_featured_media_url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/05\/Screenshot-2024-03-15-at-4.50.14\u202fPM.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/5740","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/22"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=5740"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/5740\/revisions"}],"predecessor-version":[{"id":15623,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/5740\/revisions\/15623"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media\/9445"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=5740"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=5740"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=5740"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=5740"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}