
Introduction to Artifacts In Comet

When you run machine learning (ML) experiments, whether in a hackathon or while building ML solutions for an organization, you will probably want to keep track of the pieces involved in each experiment: the dataset used, the model type, and so on. You might also want to track granular details such as hyperparameters or the random state value that produced a particular metric.

Instead of manually tracking this information in Excel, you can use Comet, which automates and simplifies the process. This way, you can focus on improving your model’s performance rather than on bookkeeping.

In this post, we will discuss how to keep track of one of the most valuable assets of an ML experiment: the dataset (which Comet classifies as an Artifact). Without further ado, let’s get started!

Tools to be Used

  • Jupyter Notebook or Colab for experimentation
  • The dataset, which can be downloaded through this link
  • The Comet library. If you don’t have it installed, you can install it with the following command:
pip install comet-ml
  • A Comet account. If you don’t have one, click here to sign up for free.

What are Artifacts?

In Comet, an Artifact is any data file or dataset you use in your ML experiments. If you’re building a model that requires multiple datasets, or different versions of the same dataset, it’s important to keep track of them so you know which ones were used to train which models. Comet offers functionality to help you do this.

Logging Artifacts to the Comet Platform

Now that we understand what Artifacts are, the next step is to learn how to log them to the Comet platform. You will learn how to log Artifacts either to a new experiment or to an experiment that already exists on the platform.

Logging Artifacts to a New Experiment

The Artifact we want to log to a new experiment is a dataset on the local machine called bankerchurners.csv. To log it, we will use the following code:
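A minimal sketch of these steps, assuming the installed comet-ml SDK provides comet_ml.init(), Experiment, Artifact, artifact.add(), and experiment.log_artifact(). The function name log_new_artifact and its arguments are hypothetical, chosen for illustration only, and the experiment is created here from a project name and a workspace.

```python
def log_new_artifact(file_path, artifact_name, project_name, workspace):
    """Create a new Comet experiment and log one local file as a dataset Artifact."""
    # Imported inside the function so that defining it has no side effects
    # and does not contact Comet.
    import comet_ml
    from comet_ml import Artifact, Experiment

    comet_ml.init()  # sets up the API key for the session
    experiment = Experiment(project_name=project_name, workspace=workspace)

    # Describe the Artifact, then attach the local file to it.
    artifact = Artifact(name=artifact_name, artifact_type="dataset")
    artifact.add(file_path)

    experiment.log_artifact(artifact)  # send the Artifact to Comet
    experiment.end()                   # close the experiment
    return artifact
```

In a notebook cell this could be called as log_new_artifact("./bankerchurners.csv", "bank-churners", "customer_churn", "your-workspace"), where the Artifact name and workspace are placeholders to replace with your own.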

 

Let’s go over the above code:

  • First, we imported the necessary libraries. Then we initialized the Comet library, which makes it easier to pass the API key in interactive notebooks like Jupyter.
  • Next, we created an Experiment object and specified the name of the experiment and the workspace we want it to belong to. After that, we created an Artifact instance and specified its name and type.
  • We also specified the path of the dataset on our local machine and added it to the Artifact instance with artifact.add(). Finally, we logged the Artifact to the Comet platform and ended the experiment.

Logging Artifacts to an Existing Experiment

Say we have an experiment that was created previously on the Comet platform and we want to log Artifacts to it. We can use Comet’s ExistingExperiment object to do that. Check out the code below for details.
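A hedged sketch of how this might look. It reuses the ExistingExperiment call quoted in this section (project_name and experiment_key); some SDK versions may name the key argument differently, so check the version you have installed. The function name log_to_existing_experiment and its parameters are hypothetical.

```python
def log_to_existing_experiment(experiment_key, file_path, artifact_name):
    """Attach a dataset Artifact to an experiment that already exists on Comet."""
    # Imported inside the function so that defining it does not contact Comet.
    import comet_ml
    from comet_ml import Artifact, ExistingExperiment

    comet_ml.init()  # sets up the API key for the session

    # Reconnect to the previously created experiment instead of starting a new one.
    experiment = ExistingExperiment(
        project_name="customer_churn",
        experiment_key=experiment_key,
    )

    artifact = Artifact(name=artifact_name, artifact_type="dataset")
    artifact.add(file_path)            # attach the local file
    experiment.log_artifact(artifact)  # log it to the existing experiment
    experiment.end()
    return artifact
```

Everything after the experiment object is created matches the flow for a new experiment; only the way the experiment is obtained changes.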

 

Let’s go over what we have in the above code:

This code is much the same as the previous one, except for the line that creates the experiment object:

# Initialize the ExistingExperiment object, passing the name of the project the experiment belongs to along with its key
experiment = ExistingExperiment(project_name="customer_churn", experiment_key="85d50008eb9042788a0ea9037737df79")

Here we use Comet’s ExistingExperiment object and pass in the name of the existing project, which in our case is "customer_churn". A project can contain many experiments, each identified by its own key, so you need to specify which experiment you want to log the Artifact to. First, click on the experiment you want to log to, as shown below:

Image by Author
After you’ve selected the experiment, copy the experiment key from the browser URL, as shown below:

 

Image by Author

Once that’s done, you can log the Artifact to the existing experiment.
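If you would rather not copy the key by hand, a small helper can pull it out of the URL. This is only a sketch: it assumes the experiment URL has the shape https://www.comet.com/<workspace>/<project>/<experiment_key>, optionally followed by a query string, and the helper name experiment_key_from_url is hypothetical.

```python
from urllib.parse import urlparse


def experiment_key_from_url(url):
    """Return the experiment key, assumed to be the last path segment of the URL."""
    # urlparse separates the path from any query string or fragment.
    segments = [part for part in urlparse(url).path.split("/") if part]
    if len(segments) < 3:
        raise ValueError("expected a URL of the form .../<workspace>/<project>/<experiment_key>")
    return segments[-1]
```

The returned string can then be passed as the experiment_key argument of ExistingExperiment.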

Downloading Artifacts from Comet to a Local Machine

There may be cases when you need to download Artifacts from the Comet platform to your local machine. For example, you might be collaborating on the Comet platform and a colleague has pushed a new version of a dataset that you don’t yet have on your machine, so you will want to download it.

To download Artifacts from the Comet platform to your machine, use the following code:
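A minimal sketch, assuming the SDK’s experiment.get_artifact() accepts an Artifact name plus an optional workspace keyword, and that the object it returns exposes .download(path). The function name download_artifact and its parameters are hypothetical.

```python
def download_artifact(artifact_name, output_dir, project_name, workspace=None):
    """Fetch a logged Artifact from Comet and download its files into output_dir."""
    # Imported inside the function so that defining it does not contact Comet.
    import comet_ml
    from comet_ml import Experiment

    comet_ml.init()  # sets up the API key for the session
    experiment = Experiment(project_name=project_name)
    try:
        # Look the Artifact up by name, then download its files locally.
        logged_artifact = experiment.get_artifact(artifact_name, workspace=workspace)
        return logged_artifact.download(output_dir)
    finally:
        experiment.end()  # close the experiment even if the download fails
```

For example, download_artifact("bank-churners", "./data", "customer_churn", workspace="your-workspace") would place the files under ./data, with the Artifact name and workspace replaced by your own.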

 

Let’s go over the above code:

We initialized the experiment object, then used the experiment.get_artifact() method to look up the Artifact by name in our workspace and assign it to a variable. After that, we called the .download() method on it, specifying the path where we want the Artifact to be downloaded.

Conclusion

In this tutorial, you’ve learned how to log an Artifact to the Comet platform, both to a new experiment and to an existing one. You also learned how to download an Artifact that exists on the Comet platform. Comet supports several other operations on Artifacts as well.

With tracking and logging automated through Artifacts, you can spend more time improving your experiments. You can access the GitHub link and also learn more about Artifacts.

References

The above code is a modified version of the code available on the Comet docs.


Ibrahim Ogunbiyi
