{"id":7880,"date":"2023-10-06T15:42:50","date_gmt":"2023-10-06T23:42:50","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=7880"},"modified":"2025-04-24T17:05:36","modified_gmt":"2025-04-24T17:05:36","slug":"dataset-tracking-with-comet-ml-artifacts","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/","title":{"rendered":"Dataset Tracking with Comet ML Artifacts"},"content":{"rendered":"\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\">\n\n\n\n<div class=\"fi fj fk fl fm\">\n<div class=\"ab ca\">\n<div class=\"ch bg eu ev ew ex\">\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX\" alt=\"\" width=\"700\" height=\"464\"><\/figure><div class=\"mf mg mh\"><picture><\/picture><\/div>\n<\/div><figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Photo by <a class=\"af mz\" href=\"https:\/\/unsplash.com\/@uxindo?utm_source=medium&amp;utm_medium=referral\" target=\"_blank\" rel=\"noopener ugc nofollow\">UX Indonesia<\/a> on <a class=\"af mz\" href=\"https:\/\/unsplash.com\/?utm_source=medium&amp;utm_medium=referral\" target=\"_blank\" rel=\"noopener ugc nofollow\">Unsplash<\/a><\/figcaption><\/figure>\n<p id=\"3dae\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Projects are often extensive, with intricacies that are difficult for a single individual to track. 
It is the same in machine learning and data science projects.<\/p>\n<p id=\"fa18\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">It is necessary to keep track of many aspects of a given project. A dataset is one of the most important but easily overlooked aspects of a machine learning project. As one engages in feature engineering and data cleaning, it is easy to forget that one needs to keep track of the changes to understand where improvements or declines in performance came from.<\/p>\n<p id=\"e751\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">In this article, I intend to show how someone can keep track of changes with Comet ML\u2019s dataset storage feature: Artifacts. Let\u2019s begin.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fi fj fk fl fm\">\n<div class=\"ab ca\">\n<div class=\"ch bg eu ev ew ex\">\n<h2 id=\"445e\" class=\"od oe fp be of og oh oi oj ok ol om on ni oo op oq nm or os ot nq ou ov ow ox bj\" data-selectable-paragraph=\"\">Requirements<\/h2>\n<p id=\"18f9\" class=\"pw-post-body-paragraph na nb fp be b gn oy nd ne gq oz ng nh ni pa nk nl nm pb no np nq pc ns nt nu fi bj\" data-selectable-paragraph=\"\">There are a few requirements that you\u2019ll need to perform the following project. They are:<\/p>\n<ol class=\"\">\n<li id=\"5afd\" class=\"na nb fp be b gn nc nd ne gq nf ng nh ni pd nk nl nm pe no np nq pf ns nt nu pg ph pi bj\" data-selectable-paragraph=\"\">A Comet ML account. 
You can get one <a class=\"af mz\" href=\"\/signup?utm_source=heartbeat&amp;utm_medium=referral&amp;utm_campaign=AMS_US_EN_SNUP_heartbeat_CTA\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a>.<\/li>\n<li id=\"d69f\" class=\"na nb fp be b gn pj nd ne gq pk ng nh ni pl nk nl nm pm no np nq pn ns nt nu pg ph pi bj\" data-selectable-paragraph=\"\">A Python 3.9+ installation.<\/li>\n<li id=\"870c\" class=\"na nb fp be b gn pj nd ne gq pk ng nh ni pl nk nl nm pm no np nq pn ns nt nu pg ph pi bj\" data-selectable-paragraph=\"\">An IDE, preferably Visual Studio Code or Jupyter Notebook.<\/li>\n<li id=\"c960\" class=\"na nb fp be b gn pj nd ne gq pk ng nh ni pl nk nl nm pm no np nq pn ns nt nu pg ph pi bj\" data-selectable-paragraph=\"\">The following Python libraries: comet_ml, scikit-learn, pandas, and NumPy.<\/li>\n<li id=\"9e63\" class=\"na nb fp be b gn pj nd ne gq pk ng nh ni pl nk nl nm pm no np nq pn ns nt nu pg ph pi bj\" data-selectable-paragraph=\"\">The passion to learn everything in this article.<\/li>\n<\/ol>\n<h2 id=\"9803\" class=\"od oe fp be of og oh oi oj ok ol om on ni oo op oq nm or os ot nq ou ov ow ox bj\" data-selectable-paragraph=\"\">Project<\/h2>\n<p id=\"3bdf\" class=\"pw-post-body-paragraph na nb fp be b gn oy nd ne gq oz ng nh ni pa nk nl nm pb no np nq pc ns nt nu fi bj\" data-selectable-paragraph=\"\">The dataset for this project is one that requires substantial data cleaning, as most real-world datasets do. This matters because it lets us monitor multiple changes over the lifetime of the project. It also mimics the challenges that a real-life project could have.<\/p>\n<p id=\"dd25\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">One particularly good dataset that simulates real-life problems in data is the famous Kaggle Titanic Dataset. 
This dataset is messy, noisy, and missing data in many places. It is important to experience such problems, as they reflect many of the issues that a data practitioner is bound to face in a business environment. <a class=\"af mz\" href=\"https:\/\/www.kaggle.com\/competitions\/titanic\/data?select=test.csv\" target=\"_blank\" rel=\"noopener ugc nofollow\">Here<\/a> is the link to the page with both training and test datasets.<\/p>\n<p id=\"0519\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">We first get a snapshot of our data by visually inspecting it and performing minimal <a class=\"af mz\" href=\"https:\/\/heartbeat.comet.ml\/exploratory-data-analysis-eda-for-categorical-data-870b37a79b65\" target=\"_blank\" rel=\"noopener ugc nofollow\">Exploratory Data Analysis<\/a> (EDA), which keeps this article easy to follow. In a real-life scenario you can expect to do more EDA, but for the sake of simplicity we\u2019ll do just enough to get a sense of the process.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"1112\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-keyword\">import<\/span> pandas <span class=\"hljs-keyword\">as<\/span> pd\n<span class=\"hljs-keyword\">from<\/span> sklearn.model_selection <span class=\"hljs-keyword\">import<\/span> train_test_split\n\ndf = pd.read_csv(<span class=\"hljs-string\">\"\/train.csv\"<\/span>)\ndf.head()<\/span><\/pre>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*FIXLZOGIx8JdVIO7KEQCiQ.png\" alt=\"\" width=\"700\" height=\"223\"><\/figure><div class=\"mf mg px\"><picture><source 
srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*FIXLZOGIx8JdVIO7KEQCiQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"d17a\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">The code written gives the result above. We can see the initial form of our data and what it looks like. We can go further and check if it has any null values and what datatype each column is.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"3bbf\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-comment\">#Gives us the datatypes of different features<\/span>\ndf.dtypes<\/span><\/pre>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:233\/1*0Msls33FoOww6Spj8WSwRg.png\" alt=\"\" width=\"233\" height=\"254\"><\/figure><div class=\"mf mg py\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 828w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:466\/format:webp\/1*0Msls33FoOww6Spj8WSwRg.png 466w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 233px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*0Msls33FoOww6Spj8WSwRg.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*0Msls33FoOww6Spj8WSwRg.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*0Msls33FoOww6Spj8WSwRg.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*0Msls33FoOww6Spj8WSwRg.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*0Msls33FoOww6Spj8WSwRg.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*0Msls33FoOww6Spj8WSwRg.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:466\/1*0Msls33FoOww6Spj8WSwRg.png 466w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 233px\" data-testid=\"og\"><\/picture><\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">screenshot by author<\/figcaption>\n<\/figure>\n<pre class=\"mi mj mk ml mm 
po pp pq bo pr ba bj\"><span id=\"ee7b\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-comment\">#Gives us the total number of null values per column<\/span>\ndf.isnull().<span class=\"hljs-built_in\">sum<\/span>()<\/span><\/pre>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:230\/1*fpYEdWW32Ryzi2BvTqaqfw.png\" alt=\"\" width=\"230\" height=\"275\"><\/figure><div class=\"mf mg pz\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:460\/format:webp\/1*fpYEdWW32Ryzi2BvTqaqfw.png 460w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 230px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*fpYEdWW32Ryzi2BvTqaqfw.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*fpYEdWW32Ryzi2BvTqaqfw.png 720w, 
https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*fpYEdWW32Ryzi2BvTqaqfw.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*fpYEdWW32Ryzi2BvTqaqfw.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*fpYEdWW32Ryzi2BvTqaqfw.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*fpYEdWW32Ryzi2BvTqaqfw.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:460\/1*fpYEdWW32Ryzi2BvTqaqfw.png 460w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 230px\" data-testid=\"og\"><\/picture><\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"93a3\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Finally, we can have a look at the general scope of the data to understand whether the missing values affect all the data extensively by using the <code class=\"cw qa qb qc pp b\">describe()<\/code> method.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"b094\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\">df.describe()<\/span><\/pre>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:699\/1*hiY-oap4kPt1Cp3DVaJd_A.png\" alt=\"\" width=\"699\" height=\"233\"><\/figure><div class=\"mf mg 
qd\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1398\/format:webp\/1*hiY-oap4kPt1Cp3DVaJd_A.png 1398w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 699px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*hiY-oap4kPt1Cp3DVaJd_A.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*hiY-oap4kPt1Cp3DVaJd_A.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*hiY-oap4kPt1Cp3DVaJd_A.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*hiY-oap4kPt1Cp3DVaJd_A.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*hiY-oap4kPt1Cp3DVaJd_A.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*hiY-oap4kPt1Cp3DVaJd_A.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1398\/1*hiY-oap4kPt1Cp3DVaJd_A.png 1398w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 699px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"b983\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">We can see that there are 891 entries in total and the \u201cCabin\u201d column has 687 nulls while the \u201cAge\u201d column has 177. The next step will be performing train-test splits to come up with validation data. After that, we will perform feature engineering in order to come up with the first version of our dataset that will be stored in an Artifact.<\/p>\n<h2 id=\"da89\" class=\"od oe fp be of og oh oi oj ok ol om on ni oo op oq nm or os ot nq ou ov ow ox bj\" data-selectable-paragraph=\"\">First Dataset Version<\/h2>\n<p id=\"537b\" class=\"pw-post-body-paragraph na nb fp be b gn oy nd ne gq oz ng nh ni pa nk nl nm pb no np nq pc ns nt nu fi bj\" data-selectable-paragraph=\"\">There are a few steps we will follow: We will perform the train-test split first and then we will perform one-hot encoding for categorical features. 
Finally, we will fill the null values in the \u201cAge\u201d column with the median and drop the columns we will not use.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"fd7f\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-comment\">#performing train-test split and one-hot encoding<\/span>\n<span class=\"hljs-keyword\">from<\/span> sklearn.model_selection <span class=\"hljs-keyword\">import<\/span> train_test_split\n<span class=\"hljs-keyword\">from<\/span> sklearn.compose <span class=\"hljs-keyword\">import<\/span> ColumnTransformer\n<span class=\"hljs-keyword\">from<\/span> sklearn.impute <span class=\"hljs-keyword\">import<\/span> SimpleImputer\n<span class=\"hljs-keyword\">from<\/span> sklearn.preprocessing <span class=\"hljs-keyword\">import<\/span> OneHotEncoder\n<span class=\"hljs-keyword\">import<\/span> numpy <span class=\"hljs-keyword\">as<\/span> np\n\n<span class=\"hljs-comment\">#Dropping two rows where \"Embarked\" has nulls<\/span>\ndf = df.dropna(subset=[<span class=\"hljs-string\">\"Embarked\"<\/span>])\n\ny = df[<span class=\"hljs-string\">\"Survived\"<\/span>]\nX = df.drop(columns=[<span class=\"hljs-string\">\"Survived\"<\/span>])\n\n\nX_train, X_val, y_train, y_val = train_test_split(X, y, test_size=<span class=\"hljs-number\">0.2<\/span>, random_state=<span class=\"hljs-number\">0<\/span>)\n\n<span class=\"hljs-comment\">#Columns to drop and columns to one-hot encode<\/span>\ncolumns = [<span class=\"hljs-string\">\"PassengerId\"<\/span>,<span class=\"hljs-string\">\"Ticket\"<\/span>, <span class=\"hljs-string\">\"Name\"<\/span>, <span class=\"hljs-string\">\"Cabin\"<\/span>]\ncategorical_columns = [<span class=\"hljs-string\">\"Sex\"<\/span>, <span class=\"hljs-string\">\"Embarked\"<\/span>]\n\n<span class=\"hljs-comment\">#Median imputer for missing values<\/span>\nimpute_median = SimpleImputer(missing_values=np.nan, strategy=<span class=\"hljs-string\">'median'<\/span>)\n\n<span 
class=\"hljs-comment\">#Transformation pipeline for training and validation data<\/span>\ndf_transformations = [(<span class=\"hljs-string\">'impute_median'<\/span>, impute_median, [<span class=\"hljs-string\">\"Age\"<\/span>]),\n                      (<span class=\"hljs-string\">'one-hot-encoder'<\/span>, OneHotEncoder(sparse_output=<span class=\"hljs-literal\">False<\/span>), categorical_columns),\n                      (<span class=\"hljs-string\">'drop'<\/span>, <span class=\"hljs-string\">'drop'<\/span>, columns)]\n\n\ndf_pipeline = ColumnTransformer(df_transformations, remainder=<span class=\"hljs-string\">'passthrough'<\/span>)\ndf_pipeline.set_output(transform=<span class=\"hljs-string\">'pandas'<\/span>)\n\n<span class=\"hljs-comment\">#Fitting on X_train only, then applying the fitted pipeline to X_val<\/span>\nX_train_transformed = df_pipeline.fit_transform(X_train)\nX_val_transformed = df_pipeline.transform(X_val)<\/span><\/pre>\n<p id=\"635d\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">The above code gives us adequately cleaned data that we can pass to a model for training and evaluation. 
We can check the outcome of the preprocessing by running:<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"0a1a\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\">X_train_transformed.<span class=\"hljs-built_in\">head<\/span>()<\/span><\/pre>\n<p id=\"c437\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">The result will show that our desired transformations have been successful:<\/p>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*zNX6JFDKtpqatKPCPaN4Iw.png\" alt=\"\" width=\"700\" height=\"100\"><\/figure><div class=\"mf mg qe\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*zNX6JFDKtpqatKPCPaN4Iw.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, 
(min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*zNX6JFDKtpqatKPCPaN4Iw.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*zNX6JFDKtpqatKPCPaN4Iw.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*zNX6JFDKtpqatKPCPaN4Iw.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*zNX6JFDKtpqatKPCPaN4Iw.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*zNX6JFDKtpqatKPCPaN4Iw.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*zNX6JFDKtpqatKPCPaN4Iw.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*zNX6JFDKtpqatKPCPaN4Iw.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"ff87\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">We are now a few steps away from storing our dataset in an Artifact in order to keep track of different versions.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fi fj fk fl fm\">\n<div class=\"ab ca\">\n<div class=\"ch bg eu ev ew ex\">\n<blockquote class=\"qf\"><p id=\"d44e\" class=\"qg qh fp be qi qj qk ql qm qn qo nu dw\" data-selectable-paragraph=\"\">Isolating difficult data samples? Comet can do that. 
Learn more with our <a class=\"af mz\" href=\"https:\/\/www.comet.com\/site\/blog\/debugging-your-machine-learning-models-with-comet-artifacts\/?utm_source=heartbeat&amp;utm_medium=referral&amp;utm_campaign=AMS_US_EN_AWA_heartbeat_CTA\" target=\"_blank\" rel=\"noopener ugc nofollow\">PetCam scenario<\/a> and discover Comet Artifacts.<\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fi fj fk fl fm\">\n<div class=\"ab ca\">\n<div class=\"ch bg eu ev ew ex\">\n<h2 id=\"5e9a\" class=\"od oe fp be of og oh oi oj ok ol om on ni oo op oq nm or os ot nq ou ov ow ox bj\" data-selectable-paragraph=\"\">Training a Model and Making an Artifact<\/h2>\n<p id=\"d149\" class=\"pw-post-body-paragraph na nb fp be b gn oy nd ne gq oz ng nh ni pa nk nl nm pb no np nq pc ns nt nu fi bj\" data-selectable-paragraph=\"\">Now that we have our training data and validation data preprocessed, we can use them for training the model and checking for accuracy. We will keep track of the model\u2019s accuracy through a Comet ML experiment that will be directly linked to the Artifact that we will create. 
All of this can be done in a few steps.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"9f79\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-keyword\">from<\/span> sklearn.ensemble <span class=\"hljs-keyword\">import<\/span> RandomForestClassifier\n<span class=\"hljs-keyword\">from<\/span> comet_ml <span class=\"hljs-keyword\">import<\/span> Artifact, Experiment\n<span class=\"hljs-keyword\">from<\/span> sklearn.metrics <span class=\"hljs-keyword\">import<\/span> accuracy_score\n\n<span class=\"hljs-comment\">#initializing Experiment to track model accuracy<\/span>\nexperiment = Experiment(api_key = <span class=\"hljs-string\">\"Personal API key\"<\/span>,\n                        project_name = <span class=\"hljs-string\">\"Artifact_Learning\"<\/span>)\n<span class=\"hljs-comment\">#model to fit and predict<\/span>\nmodel = RandomForestClassifier()\nmodel.fit(X_train_transformed, y_train)\ny_pred = model.predict(X_val_transformed)\nacc = accuracy_score(y_val, y_pred)\n\n<span class=\"hljs-comment\">#Logging model accuracy <\/span>\nexperiment.log_metric(<span class=\"hljs-string\">\"accuracy\"<\/span>, acc)<\/span><\/pre>\n<p id=\"0710\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">The above code initializes a project called \u201cartifact-learning\u201d on the \u201cProject\u201d page. The project contains an experiment with your accuracy score, which we logged with the \u201clog_metric()\u201d method. 
You could log additional details such as hyperparameters if your model has any to keep the work as meticulous as possible.<\/p>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*aoq6BaVAYm8yQNGXAIUmfg.png\" alt=\"\" width=\"700\" height=\"299\"><\/figure><div class=\"mf mg qp\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*aoq6BaVAYm8yQNGXAIUmfg.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*aoq6BaVAYm8yQNGXAIUmfg.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*aoq6BaVAYm8yQNGXAIUmfg.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*aoq6BaVAYm8yQNGXAIUmfg.png 750w, 
https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*aoq6BaVAYm8yQNGXAIUmfg.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*aoq6BaVAYm8yQNGXAIUmfg.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*aoq6BaVAYm8yQNGXAIUmfg.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*aoq6BaVAYm8yQNGXAIUmfg.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"3e37\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Our project named \u201cartifact-learning\u201d will appear as seen above. 
On opening it, we will see the page below:<\/p>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*b-dYK0BwJJh2ilDuK6-Lig.png\" alt=\"\" width=\"700\" height=\"299\"><\/figure><div class=\"mf mg qp\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*b-dYK0BwJJh2ilDuK6-Lig.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*b-dYK0BwJJh2ilDuK6-Lig.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*b-dYK0BwJJh2ilDuK6-Lig.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*b-dYK0BwJJh2ilDuK6-Lig.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*b-dYK0BwJJh2ilDuK6-Lig.png 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*b-dYK0BwJJh2ilDuK6-Lig.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*b-dYK0BwJJh2ilDuK6-Lig.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*b-dYK0BwJJh2ilDuK6-Lig.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"aa08\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">After clicking \u201cphilosophical_title_9572\u201d (the randomized name of our experiment), we will see the following page:<\/p>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*z1BhWkzl0eOBwunEYrCx2Q.png\" alt=\"\" width=\"700\" height=\"345\"><\/figure><div class=\"mf mg qq\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 
786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*z1BhWkzl0eOBwunEYrCx2Q.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*z1BhWkzl0eOBwunEYrCx2Q.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*z1BhWkzl0eOBwunEYrCx2Q.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*z1BhWkzl0eOBwunEYrCx2Q.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*z1BhWkzl0eOBwunEYrCx2Q.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*z1BhWkzl0eOBwunEYrCx2Q.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*z1BhWkzl0eOBwunEYrCx2Q.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*z1BhWkzl0eOBwunEYrCx2Q.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z 
dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"e5ed\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Here we can see everything important that we need, such as hyperparameters, metrics, and even system metrics. Additionally, we can see that the accuracy we logged has been recorded successfully.<\/p>\n<p id=\"36f3\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">We then combine our \u201cX_train_transformed\u201d data with the corresponding \u201cy_train\u201d data so that we have a complete dataset with all the information we will need in the future.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"9956\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-comment\">#recombining the dataset<\/span>\nX_train_transformed[<span class=\"hljs-string\">\"Survived\"<\/span>] = y_train\n<span class=\"hljs-built_in\">print<\/span>(X_train_transformed.head())<\/span><\/pre>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*oi_iuvWERztsvjm1gtbDLw.png\" alt=\"\" width=\"700\" height=\"533\"><\/figure><div class=\"mf mg qr\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*oi_iuvWERztsvjm1gtbDLw.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*oi_iuvWERztsvjm1gtbDLw.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*oi_iuvWERztsvjm1gtbDLw.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*oi_iuvWERztsvjm1gtbDLw.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*oi_iuvWERztsvjm1gtbDLw.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*oi_iuvWERztsvjm1gtbDLw.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*oi_iuvWERztsvjm1gtbDLw.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*oi_iuvWERztsvjm1gtbDLw.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" 
data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"1f20\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">The screenshot shows the newly added \u201cSurvived\u201d column at the bottom.<\/p>\n<p id=\"02d8\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">We then save this as a CSV file, which will be uploaded to the \u201cArtifact\u201d page as the first version.<\/p>\n<pre class=\"mi mj mk ml mm po pp pq bo pr ba bj\"><span id=\"9312\" class=\"ps oe fp pp b bf pt pu l pv pw\" data-selectable-paragraph=\"\"><span class=\"hljs-comment\">#Making csv file<\/span>\nX_train_transformed.to_csv(<span class=\"hljs-string\">\"Desired path in local computer\"<\/span>)\n\n<span class=\"hljs-comment\">#Naming Artifact and adding csv file<\/span>\nartifact = Artifact(name=<span class=\"hljs-string\">\"Titanic_data\"<\/span>,artifact_type=<span class=\"hljs-string\">\"dataset\"<\/span>)\nartifact.add(<span class=\"hljs-string\">\"Location where csv was stored\"<\/span>)\n\nexperiment.log_artifact(artifact)\n\n<span class=\"hljs-comment\">#End experiment <\/span>\nexperiment.end()<\/span><\/pre>\n<p id=\"1706\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Once you receive a message indicating success, you can go to your Comet homepage, where the Artifact is found.<\/p>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*937LqIF1M7aaaKHMpGpcMQ.png\" alt=\"\" width=\"700\" height=\"64\"><\/figure><div class=\"mf mg 
qs\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*937LqIF1M7aaaKHMpGpcMQ.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*937LqIF1M7aaaKHMpGpcMQ.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*937LqIF1M7aaaKHMpGpcMQ.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*937LqIF1M7aaaKHMpGpcMQ.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*937LqIF1M7aaaKHMpGpcMQ.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*937LqIF1M7aaaKHMpGpcMQ.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*937LqIF1M7aaaKHMpGpcMQ.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*937LqIF1M7aaaKHMpGpcMQ.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*1P6mr_kTBfbypGTy-Ysz8Q.png\" alt=\"\" width=\"700\" height=\"198\"><\/figure><div class=\"mf mg qt\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 
700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*1P6mr_kTBfbypGTy-Ysz8Q.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"5bc6\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">We can inspect that dataset to see if it has the data we seek and if every operation has been successful.<\/p>\n<figure class=\"mi mj mk ml mm mn mf mg paragraph-image\">\n<div class=\"mo mp ec mq bg mr\" tabindex=\"0\" role=\"button\">\n<figure><img loading=\"lazy\" decoding=\"async\" class=\"bg ms mt c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/1*BoaBq9HlF6K-zOscw6jUnA.png\" alt=\"\" width=\"700\" height=\"310\"><\/figure><div 
class=\"mf mg qu\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/format:webp\/1*BoaBq9HlF6K-zOscw6jUnA.png 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/1*BoaBq9HlF6K-zOscw6jUnA.png 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/1*BoaBq9HlF6K-zOscw6jUnA.png 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/1*BoaBq9HlF6K-zOscw6jUnA.png 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/1*BoaBq9HlF6K-zOscw6jUnA.png 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/1*BoaBq9HlF6K-zOscw6jUnA.png 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/1*BoaBq9HlF6K-zOscw6jUnA.png 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*BoaBq9HlF6K-zOscw6jUnA.png 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 
67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\"><\/picture><\/div>\n<\/div>\n<figcaption class=\"mu mv mw mf mg mx my be b bf z dw\" data-selectable-paragraph=\"\">Screenshot by author<\/figcaption>\n<\/figure>\n<p id=\"6c0f\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">A quick check shows that all the columns and data are present, as expected.<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"fi fj fk fl fm\">\n<div class=\"ab ca\">\n<div class=\"ch bg eu ev ew ex\">\n<h2 id=\"790d\" class=\"od oe fp be of og oh oi oj ok ol om on ni oo op oq nm or os ot nq ou ov ow ox bj\" data-selectable-paragraph=\"\">Wrap Up<\/h2>\n<p id=\"a55e\" class=\"pw-post-body-paragraph na nb fp be b gn oy nd ne gq oz ng nh ni pa nk nl nm pb no np nq pc ns nt nu fi bj\" data-selectable-paragraph=\"\">In this article, we have demonstrated that developing a dataset for a model can be made more systematic and orderly. We tracked the model\u2019s performance on a given dataset version, and we tracked that dataset version as well. 
Comet offers practitioners the opportunity to keep track of datasets automatically and programmatically, which reduces manual work and improves efficiency.<\/p>\n<p id=\"f54a\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Iterating with the same methods used here keeps a data practitioner up to date with the exact steps they took to arrive at a given conclusion or level of performance, because every important step is documented and stored in Comet ML\u2019s workspace.<\/p>\n<p id=\"476a\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">In a business setting, it\u2019s crucial to keep a meticulous record of the datasets one has. Versioning these datasets according to the changes made along the way ensures that incremental improvements in predictive models are easy to trace. Comet Artifacts make dataset versioning easy, and the visual interface available through Comet\u2019s website is easy to use and navigate.<\/p>\n<p id=\"b01d\" class=\"pw-post-body-paragraph na nb fp be b gn nc nd ne gq nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu fi bj\" data-selectable-paragraph=\"\">Good luck with your next project with Comet ML!<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Photo by UX Indonesia on Unsplash Projects are often extensive and have intricacies that need to be more intuitive for a single individual to track. It is the same in machine learning and data science projects. It is necessary to keep track of many aspects of a given project. 
A dataset is one of the [&hellip;]<\/p>\n","protected":false},"author":79,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[9,7],"tags":[],"coauthors":[176],"class_list":["post-7880","post","type-post","status-publish","format-standard","hentry","category-product","category-tutorials"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Dataset Tracking with Comet ML Artifacts - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Dataset Tracking with Comet ML Artifacts\" \/>\n<meta property=\"og:description\" content=\"Photo by UX Indonesia on Unsplash Projects are often extensive and have intricacies that need to be more intuitive for a single individual to track. It is the same in machine learning and data science projects. It is necessary to keep track of many aspects of a given project. 
A dataset is one of the [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-06T23:42:50+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:05:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX\" \/>\n<meta name=\"author\" content=\"Mwanikii Njagi\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Mwanikii Njagi\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Dataset Tracking with Comet ML Artifacts - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/","og_locale":"en_US","og_type":"article","og_title":"Dataset Tracking with Comet ML Artifacts","og_description":"Photo by UX Indonesia on Unsplash Projects are often extensive and have intricacies that need to be more intuitive for a single individual to track. It is the same in machine learning and data science projects. It is necessary to keep track of many aspects of a given project. 
A dataset is one of the [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-10-06T23:42:50+00:00","article_modified_time":"2025-04-24T17:05:36+00:00","og_image":[{"url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX","type":"","width":"","height":""}],"author":"Mwanikii Njagi","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Mwanikii Njagi","Est. reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/"},"author":{"name":"Mwanikii Njagi","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/c7043b3e6b992af7b3220aa1f27d2162"},"headline":"Dataset Tracking with Comet ML Artifacts","datePublished":"2023-10-06T23:42:50+00:00","dateModified":"2025-04-24T17:05:36+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/"},"wordCount":1152,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX","articleSection":["Product","Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/","url":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/","name":"Dataset Tracking with Comet ML Artifacts - 
Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#primaryimage"},"thumbnailUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX","datePublished":"2023-10-06T23:42:50+00:00","dateModified":"2025-04-24T17:05:36+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#primaryimage","url":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX","contentUrl":"https:\/\/miro.medium.com\/v2\/resize:fit:700\/0*AZOn4YG5IazO-2AX"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/dataset-tracking-with-comet-ml-artifacts\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Dataset Tracking with Comet ML Artifacts"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, 
Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/c7043b3e6b992af7b3220aa1f27d2162","name":"Mwanikii Njagi","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/1a3c516cf04aca9418dfb2213081f4df","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/cropped-1_2jy9gyk0G_yaniWm8gJFVA-1-96x96.webp","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2023\/08\/cropped-1_2jy9gyk0G_yaniWm8gJFVA-1-96x96.webp","caption":"Mwanikii 
Njagi"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/freddynjagigmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7880","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/79"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=7880"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7880\/revisions"}],"predecessor-version":[{"id":15501,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/7880\/revisions\/15501"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=7880"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=7880"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=7880"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=7880"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}