For training runs to be fully reproducible, you need to track not only the code and hyperparameters used to train a model but also the data.
Comet Artifacts lets you keep track of any type of data asset, with the added benefit of knowing which training runs used a particular version of the data. The Artifacts UI lets you view all the artifacts you have saved, their versions, and the full lineage of how each artifact was created and used.
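As a rough illustration of how a data version gets tied to a training run, the sketch below logs a local file as an Artifact from one experiment and fetches it from another using the Comet Python SDK. The artifact name `training-data`, the type `dataset`, the download directory, and both helper function names are placeholders chosen for this example. It also assumes that `comet_ml` is installed and that an API key is already configured, for example through the `COMET_API_KEY` environment variable.

```python
def log_dataset_artifact(data_path, name="training-data"):
    """Log a local file as a new version of a dataset Artifact.

    `name` and the "dataset" type are illustrative placeholders.
    """
    # Imported lazily so this module still loads where the SDK is absent.
    from comet_ml import Artifact, Experiment

    # Assumes the API key (and workspace/project) come from Comet config.
    experiment = Experiment()
    artifact = Artifact(name=name, artifact_type="dataset")
    artifact.add(data_path)            # attach the local file
    experiment.log_artifact(artifact)  # upload it as a new Artifact version
    experiment.end()


def use_dataset_artifact(name="training-data", download_dir="./data"):
    """Fetch a logged Artifact inside a training run and download its files.

    Fetching through the experiment is what lets Comet record that this
    run consumed the Artifact.
    """
    from comet_ml import Experiment

    experiment = Experiment()
    logged_artifact = experiment.get_artifact(name)
    logged_artifact.download(download_dir)
    experiment.end()
```

Calling `log_dataset_artifact("./train.csv")` once and `use_dataset_artifact()` in a later run should make both experiments show up against the Artifact in the UI. The exact workspace and version resolution depends on your Comet configuration.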
Artifact pages are accessible in the UI from Workspaces.
Clicking Artifacts shows the list of all Artifacts for this Workspace. From here, you can search by tag name (select from list), visibility (public or private), Artifact type (select from list), artifact name, and project name. You can also sort by Artifact and ID.
Click a particular Artifact to see all of the versions for it. Here you can edit its symbol, visibility, type, and description. You can also delete an Artifact from here.
You can see all of the individual files that are stored in an Artifact. Clicking a file shows a preview, if one is available.
Finally, Artifact Lineage provides an interactive way for you and your team to explore and compare datasets so you can better understand and visualize the datasets that you are using for machine learning.
- Overview of Artifacts: Using Artifacts to enable data lineage
- Remote Artifacts: Using Remote Artifacts to track data stored outside of Comet
- Comet Artifact Lineage: An interactive way to better understand and visualize the datasets that your team is using for machine learning
- Step through this end-to-end example to see Artifacts basics in action.