{"id":7619,"date":"2023-09-22T12:17:15","date_gmt":"2023-09-22T20:17:15","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=7619"},"modified":"2025-04-24T17:13:51","modified_gmt":"2025-04-24T17:13:51","slug":"image-augmentation-for-computer-vision-tasks-using-pytorch","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/","title":{"rendered":"Image Augmentation for Computer Vision Tasks Using PyTorch"},"content":{"rendered":"\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\">\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<figure class=\"lw lx ly lz ma mb lt lu paragraph-image\">\n<div class=\"mc md eb me bg mf\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mg mh c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg\" alt=\"\" width=\"700\" height=\"467\"><\/figure><div class=\"lt lu lv\"><picture><\/picture><\/div>\n<\/div><figcaption class=\"mi mj mk lt lu ml mm be b bf z dv\" data-selectable-paragraph=\"\"><a class=\"af mn\" href=\"https:\/\/unsplash.com\/photos\/-f8ssjFhD1k\" target=\"_blank\" rel=\"noopener ugc nofollow\">Source: https:\/\/unsplash.com\/photos\/-f8ssjFhD1k<\/a><\/figcaption><\/figure>\n<p id=\"de40\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Data augmentation is the process of transforming training data to introduce randomness. This strategy is common for computer vision tasks. In this scenario, the training data in question are images. 
For example, you can scale, rotate, mirror, and\/or crop your images during training.<\/p>\n<p id=\"7475\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Image augmentation has two key benefits. First, it helps your neural network generalize by increasing the diversity of training examples, so the model makes more reliable predictions on new, never-before-seen input data. This reduces over-fitting, where the model is excessively adjusted to the training data; the opposite failure, under-fitting, is when the model cannot capture the patterns in the data at all.<\/p>\n<figure class=\"nm nn no np nq mb lt lu paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mg mh c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*STacecTPW0o_LZy_x8UnJg.png\" alt=\"\" width=\"700\" height=\"284\"><\/figure><div class=\"lt lu nl\"><picture><\/picture><\/div>\n<figcaption class=\"mi mj mk lt lu ml mm be b bf z dv\" data-selectable-paragraph=\"\">Src: <a class=\"af mn\" href=\"https:\/\/medium.com\/analytics-vidhya\/data-augmentation-in-deep-learning-3d7a539f7a28\" rel=\"noopener\">https:\/\/medium.com\/analytics-vidhya\/<\/a><\/figcaption>\n<\/figure>\n<p id=\"96f8\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Second, it can improve the accuracy of your trained models by adding new and varied examples to the training dataset. 
When the dataset is rich and varied enough, the model tends to perform better and predict more accurately.<\/p>\n<p id=\"07f4\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">In this tutorial, you are going to learn how to perform data augmentation using PyTorch, an open-source deep learning library, together with its companion package torchvision, which provides the image transforms used below.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<blockquote class=\"nz\"><p id=\"7449\" class=\"oa ob fo be oc od oe of og oh oi nk dv\" data-selectable-paragraph=\"\">Centralizing knowledge means being able to reproduce, extrapolate, and tailor experiments. <a class=\"af mn\" href=\"https:\/\/www.youtube.com\/watch?v=tIgya4PaCWM&amp;list=PLX9GmL8cVn_yout9BRYNj43XJco3gsZ3r&amp;index=10\" target=\"_blank\" rel=\"noopener ugc nofollow\">Learn how large scale companies like Uber share internal knowledge.<\/a><\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fh fi fj fk fl\">\n<div class=\"ab ca\">\n<div class=\"ch bg et eu ev ew\">\n<p id=\"27ef\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">To make the development experience smooth, let\u2019s use <a class=\"af mn\" href=\"https:\/\/colab.research.google.com\/\" target=\"_blank\" rel=\"noopener ugc nofollow\">Google Colab<\/a> to create a notebook. To create a code cell in your notebook, click <strong class=\"be oj\">+Code<\/strong>. Create a code cell to install the required PyTorch dependencies.<\/p>\n<pre class=\"nm nn no np nq ok ol om on ax oo bj\"><span id=\"63f6\" class=\"op oq fo ol b ho or os l ie ot\" data-selectable-paragraph=\"\">%pip install -q torch==1.4.0 torchvision==0.5.0
<\/span><\/pre>\n<p id=\"8de2\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">Add another code cell to import the libraries this tutorial requires.<\/p>\n<pre class=\"nm nn no np nq ok ol om on ax oo bj\"><span id=\"2438\" class=\"op oq fo ol b ho or os l ie ot\" data-selectable-paragraph=\"\"># Standard library\nimport os\nimport shutil\nimport urllib.request\nfrom pathlib import Path\n\n# Third-party: Pillow, PyTorch, and torchvision\nimport PIL\nimport torch\nimport torch.utils.data as data\nfrom torchvision import datasets, transforms<\/span><\/pre>\n<h1 id=\"c503\" class=\"ou oq fo be ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Variable Definitions<\/strong><\/h1>\n<p id=\"223c\" class=\"pw-post-body-paragraph mo mp fo be b mq pr ms mt mu ps mw mx my pt na nb nc pu ne nf ng pv ni nj nk fh bj\" data-selectable-paragraph=\"\">Create another code cell that will host the variable definitions for your script. 
Here, you define the download link for the flower dataset, the working paths, the image size, and the batch size.<\/p>\n<pre class=\"nm nn no np nq ok ol om on ax oo bj\"><span id=\"a1cb\" class=\"op oq fo ol b ho or os l ie ot\" data-selectable-paragraph=\"\">DATASET_LINK = 'https:\/\/storage.googleapis.com\/download.tensorflow.org\/example_images\/flower_photos.tgz'\nWORKING_DIR_PATH = Path('.')\nFLOWERS_PATH = WORKING_DIR_PATH \/ 'flower_photos'\nIMAGE_SIZE = 64\nBATCH_SIZE = 128<\/span><\/pre>\n<h1 id=\"7e27\" class=\"ou oq fo be ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Data Collection<\/strong><\/h1>\n<p id=\"1202\" class=\"pw-post-body-paragraph mo mp fo be b mq pr ms mt mu ps mw mx my pt na nb nc pu ne nf ng pv ni nj nk fh bj\" data-selectable-paragraph=\"\">Use the dataset link defined above to download the flower images to your Google Colab instance. Create a new code cell and add the code below.<\/p>\n<pre class=\"nm nn no np nq ok ol om on ax oo bj\"><span id=\"c2d3\" class=\"op oq fo ol b ho or os l ie ot\" data-selectable-paragraph=\"\">def download_and_unpack_file(link, target_name):\n    # Skip the download if the unpacked folder already exists\n    if (WORKING_DIR_PATH \/ target_name).exists():\n        return\n    archname = link.split('\/')[-1]\n    urllib.request.urlretrieve(link, archname)\n    urllib.request.urlcleanup()\n    # Extract the archive, then delete it to save disk space\n    shutil.unpack_archive(archname, WORKING_DIR_PATH)\n    os.remove(archname)\n\ndownload_and_unpack_file(DATASET_LINK, 'flower_photos')<\/span><\/pre>\n<h1 id=\"f595\" class=\"ou oq fo be ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Image Augmentation<\/strong><\/h1>\n<p id=\"75b7\" class=\"pw-post-body-paragraph mo mp fo be b mq pr ms mt mu ps mw mx my pt na nb nc pu ne nf ng pv ni nj nk fh bj\" data-selectable-paragraph=\"\">In computer vision tasks, there are classic image processing 
activities for augmenting images: vertical and horizontal flipping, padding, zooming, random rotation, adding <a class=\"af mn\" href=\"https:\/\/en.wikipedia.org\/wiki\/Image_noise\" target=\"_blank\" rel=\"noopener ugc nofollow\">noise<\/a>, random erasing, cropping, re-scaling, color modification, contrast changes, grayscale conversion, and translation (moving the image along the X or Y axis). These operations and many more are described in the torchvision <a class=\"af mn\" href=\"https:\/\/pytorch.org\/vision\/main\/transforms.html\" target=\"_blank\" rel=\"noopener ugc nofollow\">documentation<\/a>. Using the Compose interface, you can chain several of these operations into a pipeline. Which operations you include depends on how much variation you want to introduce into your training images.<\/p>\n<blockquote class=\"pw px py\"><p id=\"b075\" class=\"mo mp pz be b mq mr ms mt mu mv mw mx qa mz na nb qb nd ne nf qc nh ni nj nk fh bj\" data-selectable-paragraph=\"\">PyTorch also supports automatic augmentation, a common data augmentation technique that can improve the accuracy of image classification models.<\/p><\/blockquote>\n<figure class=\"nm nn no np nq mb lt lu paragraph-image\">\n<div class=\"mc md eb me bg mf\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mg mh c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*o4kJxuOpS-tGqmmN-3e4Aw.png\" alt=\"\" width=\"700\" height=\"361\"><\/figure><div class=\"lt lu qd\"><picture><\/picture><\/div>\n<\/div>\n<figcaption class=\"mi mj mk lt lu ml mm be b bf z dv\" data-selectable-paragraph=\"\">Source: Medium<\/figcaption>\n<\/figure>\n<p id=\"38fe\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">In this section, you are going to compose a transformation pipeline using PyTorch. For this tutorial, let\u2019s keep it simple and create the following pipeline in a new code cell:<\/p>\n<ul class=\"\">\n<li id=\"ca07\" class=\"mo mp fo be b mq mr ms mt mu mv mw mx my qe na nb nc qf ne nf ng qg ni nj nk qh qi qj bj\" data-selectable-paragraph=\"\">Resize the image to a fixed, square dimension<\/li>\n<li id=\"e20a\" class=\"mo mp fo be b mq qk ms mt mu ql mw mx my qm na nb nc qn ne nf ng qo ni nj nk qh qi qj bj\" data-selectable-paragraph=\"\">Randomly flip the image horizontally<\/li>\n<li id=\"64e4\" class=\"mo mp fo be b mq qk ms mt mu ql mw mx my qm na nb nc qn ne nf ng qo ni nj nk qh qi qj bj\" data-selectable-paragraph=\"\">Randomly jitter the brightness, contrast, saturation, and hue of the image<\/li>\n<li id=\"ef3e\" class=\"mo mp fo be b mq qk ms mt mu ql mw mx my qm na nb nc qn ne nf ng qo ni nj nk qh qi qj bj\" data-selectable-paragraph=\"\">Apply a small random affine transformation (rotation and scaling) to the image<\/li>\n<li id=\"9c43\" class=\"mo mp fo be b mq qk ms mt mu ql mw mx my qm na nb nc qn ne nf ng qo ni nj nk qh qi qj bj\" data-selectable-paragraph=\"\">Convert the image to a tensor<\/li>\n<\/ul>\n<pre class=\"nm nn no np nq ok ol om on ax oo bj\"><span id=\"17cd\" class=\"op oq fo ol b ho or os l ie ot\" data-selectable-paragraph=\"\">transform = transforms.Compose([\n    # Resize to a fixed square so images can be stacked into batches\n    transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),\n    # Mirror the image left-to-right with probability 0.5\n    transforms.RandomHorizontalFlip(),\n    transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.2, hue=0.1),\n    # Rotate by up to 3 degrees and scale by 95% to 105%\n    transforms.RandomAffine(3, scale=(0.95, 1.05)),\n    transforms.ToTensor()\n])<\/span><\/pre>\n<h1 id=\"0add\" class=\"ou oq fo be ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Load Data<\/strong><\/h1>\n<p id=\"df0f\" 
class=\"pw-post-body-paragraph mo mp fo be b mq pr ms mt mu ps mw mx my pt na nb nc pu ne nf ng pv ni nj nk fh bj\" data-selectable-paragraph=\"\">torchvision provides an intuitive interface, ImageFolder, for loading images arranged in one sub-folder per class. After loading them, you split the dataset into the training and validation sets you will use later on.<\/p>\n<blockquote class=\"pw px py\"><p id=\"94b0\" class=\"mo mp pz be b mq mr ms mt mu mv mw mx qa mz na nb qb nd ne nf qc nh ni nj nk fh bj\" data-selectable-paragraph=\"\">The random split function requires the training and validation sizes to add up to the total number of images in the dataset (3,000 + 670 = 3,670 here); otherwise it raises an error.<\/p><\/blockquote>\n<pre class=\"nm nn no np nq ok ol om on ax oo bj\"><span id=\"b8d0\" class=\"op oq fo ol b ho or os l ie ot\" data-selectable-paragraph=\"\">data_dir = '.\/flower_photos'\ndataset = datasets.ImageFolder(data_dir, transform=transform)\n# The split sizes must sum to len(dataset)\ntrain_set, val_set = data.random_split(dataset, [3000, 670])\n# Shuffle only the training batches\ntrainloader = data.DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)\ntestloader = data.DataLoader(val_set, batch_size=BATCH_SIZE)<\/span><\/pre>\n<h1 id=\"0c4f\" class=\"ou oq fo be ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq bj\" data-selectable-paragraph=\"\"><strong class=\"al\">Conclusion<\/strong><\/h1>\n<figure class=\"nm nn no np nq mb lt lu paragraph-image\">\n<div class=\"mc md eb me bg mf\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg mg mh c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*IqMmSSKiI7TSqP7UNDsX4Q.png\" alt=\"\" width=\"700\" height=\"325\"><\/figure><div class=\"lt lu lv\"><picture><\/picture><\/div>\n<\/div>\n<figcaption class=\"mi mj mk lt lu ml mm be b bf z dv\" data-selectable-paragraph=\"\">Sample from augmentation pipeline<\/figcaption>\n<\/figure>\n<p id=\"44b8\" class=\"pw-post-body-paragraph mo mp fo be b mq mr ms mt mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk fh bj\" data-selectable-paragraph=\"\">The PyTorch library simplifies image augmentation by providing a way to compose transformation pipelines. These pipelines plug directly into the PyTorch datasets you use when training your neural network. You can use this <a class=\"af mn\" href=\"https:\/\/colab.research.google.com\/drive\/1w4UOXJ26lp_aeFlTM04tM0NBEWeHVSgH?usp=sharing\" target=\"_blank\" rel=\"noopener ugc nofollow\">Google Colab notebook<\/a>, which contains all the working code from this tutorial, to speed up your experiments. Happy training!<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Source: https:\/\/unsplash.com\/photos\/-f8ssjFhD1k Data augmentation is the process of transforming training data to introduce randomness. This strategy is common for computer vision tasks. In this scenario, the training data in question are images. For example, you can scale, rotate, mirror, and\/or crop your images during training. 
Image augmentation has two key benefits: One, it helps your [&hellip;]<\/p>\n","protected":false},"author":65,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[6,7],"tags":[],"coauthors":[165],"class_list":["post-7619","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-tutorials"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Image Augmentation for Computer Vision Tasks Using PyTorch - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Image Augmentation for Computer Vision Tasks Using PyTorch\" \/>\n<meta property=\"og:description\" content=\"Source: https:\/\/unsplash.com\/photos\/-f8ssjFhD1k Data augmentation is the process of transforming training data to introduce randomness. This strategy is common for computer vision tasks. In this scenario, the training data in question are images. For example, you can scale, rotate, mirror, and\/or crop your images during training. 
Image augmentation has two key benefits: One, it helps your [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-09-22T20:17:15+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:13:51+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg\" \/>\n<meta name=\"author\" content=\"Klurdy Studios\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Klurdy Studios\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Image Augmentation for Computer Vision Tasks Using PyTorch - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/","og_locale":"en_US","og_type":"article","og_title":"Image Augmentation for Computer Vision Tasks Using PyTorch","og_description":"Source: https:\/\/unsplash.com\/photos\/-f8ssjFhD1k Data augmentation is the process of transforming training data to introduce randomness. This strategy is common for computer vision tasks. In this scenario, the training data in question are images. 
For example, you can scale, rotate, mirror, and\/or crop your images during training. Image augmentation has two key benefits: One, it helps your [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-09-22T20:17:15+00:00","article_modified_time":"2025-04-24T17:13:51+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg","type":"","width":"","height":""}],"author":"Klurdy Studios","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Klurdy Studios","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/"},"author":{"name":"Klurdy Studios","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/069e186ad4a5b6d6950292821ea0f37b"},"headline":"Image Augmentation for Computer Vision Tasks Using PyTorch","datePublished":"2023-09-22T20:17:15+00:00","dateModified":"2025-04-24T17:13:51+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/"},"wordCount":615,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg","articleSection":["Machine 
Learning","Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/","url":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/","name":"Image Augmentation for Computer Vision Tasks Using PyTorch - Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg","datePublished":"2023-09-22T20:17:15+00:00","dateModified":"2025-04-24T17:13:51+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#primaryimage","url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg","contentUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*gmPWALRLrIUVVQA70RWKpA.jpeg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/image-augmentation-for-computer-vision-tasks-using-pytorch\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Image Augmentation for Computer Vision Tasks Using 
PyTorch"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/069e186ad4a5b6d6950292821ea0f37b","name":"Klurdy Studios","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/b1a3bf3caaa793aaad2da005b3ba38ba","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1635710213869-96x96.jpg","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/1635710213869-96x96.jpg","caption":"Klurdy 
Studios"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/brianklurdy-com\/"}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7619","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/65"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=7619"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7619\/revisions"}],"predecessor-version":[{"id":15529,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7619\/revisions\/15529"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=7619"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=7619"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=7619"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=7619"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}