
Experiment objects

The Comet Python SDK provides several objects for performing operations on Experiments. This page describes the difference between them and presents the typical use cases where they are best used.

Experiment

Experiment is the core class of Comet. An Experiment represents a unit of measurable research: a single execution of code with some associated data, for example, training a model on a single set of hyperparameters. Use Experiment to log new data to the Comet UI.

An Experiment automatically logs script output (stdout/stderr), code, and command-line arguments for any script and, for the supported libraries, also logs hyperparameters, metrics, and model configuration.

Add the following snippet to the top of your training script to quickly add Experiment functionality:

from comet_ml import Experiment

# Create the Experiment before the rest of the training code runs
experiment = Experiment(api_key="YOUR-PERSONAL-API-KEY")

# Your code

Note

If Comet detects a loss of connectivity during a training run, it saves a local backup of the data logged to the experiment as an OfflineExperiment. You can also keep a local backup of your data for all experiments by setting the keep_offline_zip configuration value.

Automatic logging

By adding Experiment to your script, you automatically turn on logging for the following items:

  • Script code and file name, or Jupyter Notebook history
  • Git metadata and patch
  • Model graph representation (see below)
  • Model weights and biases (see below)
  • Model hyperparameters (see below)
  • Training metrics (see below)
  • Command-line arguments to script
  • Console and Jupyter Notebook standard output and error
  • Environment details: GPU, CPU, host name, and more

Framework-dependent automatic logging

Framework | Logged items
fast.ai | All PyTorch items, plus epochs and metrics. See examples.
Keras | Graph description, steps, metrics, hyperparameters, weights and biases as histograms, optimizer config, and number of trainable parameters. See examples.
MLflow | Hyperparameters, assets, models, plus lower-level framework items (for example, TensorFlow's metrics, TensorBoard summaries).
Prophet | Hyperparameters, model, and figures.
PyTorch Lightning | Loss and accuracy, with refinements in progress. See examples.
PyTorch | Graph description, steps, and loss. See examples.
PyTorch + apex | Loss. See examples.
Ray Train | Distributed system metrics. See examples.
Scikit-learn | Hyperparameters. See examples.
TensorBoard | Summary scalars (as metrics) and summary histograms.
TensorFlow model analysis | Time series, plots, and slicing metrics.
TensorFlow | Steps and metrics. See examples.
XGBoost | Metrics and hyperparameters. See examples.

ExistingExperiment

Use ExistingExperiment when you want to append data to an Experiment that has already been created.

The following are the typical use cases for this object:

  • You want to resume training a model after some sort of interruption (for example, an internet outage).
  • Your experimentation is split across multiple processes and you would like to consolidate the data to a single experiment.
  • You want to append additional data to a previously completed Experiment. This is useful when your training and testing experiments are in different scripts. For example, add test set predictions as an asset or test set metrics to the Experiment.
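The last use case above, adding test metrics from a separate script, might look like the following minimal sketch. It assumes the training script saved the experiment key (for example, the value returned by experiment.get_key()) and that the API key is already configured, for instance through the COMET_API_KEY environment variable. The helper names log_test_metrics and append_to_existing and the "test_" prefix are hypothetical choices for illustration.

```python
def log_test_metrics(experiment, metrics):
    """Log each metric under a 'test_' prefix on any object exposing log_metric()."""
    for name, value in metrics.items():
        experiment.log_metric(f"test_{name}", value)
    return len(metrics)


def append_to_existing(experiment_key, metrics):
    """Reopen a previously created Experiment and append test metrics to it."""
    # Imported lazily so log_test_metrics can be reused without comet_ml installed.
    from comet_ml import ExistingExperiment

    # previous_experiment is the key of the Experiment created by the training script.
    experiment = ExistingExperiment(previous_experiment=experiment_key)
    log_test_metrics(experiment, metrics)
    experiment.end()  # flush pending data before the script exits
```

Keeping the logging helper separate from the constructor call means the same function works with Experiment, ExistingExperiment, or a stand-in object in a unit test.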

Caveats

Certain automatic logging features are disabled by default in ExistingExperiment. To enable them, set the corresponding arguments to True when you create the object. The following features are disabled:

log_code | log_graph | parse_args
log_env_details | log_git_metadata | log_git_patch
log_env_gpu | log_env_cpu | log_env_host
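As a hedged illustration of re-enabling some of these features, the sketch below builds the keyword arguments and passes them to the ExistingExperiment constructor. The helper names enable_flags and resume_with_logging are hypothetical, and the choice of flags to turn on is only an example.

```python
# Flags this page lists as disabled by default in ExistingExperiment.
DISABLED_BY_DEFAULT = (
    "log_code", "log_graph", "parse_args",
    "log_env_details", "log_git_metadata", "log_git_patch",
    "log_env_gpu", "log_env_cpu", "log_env_host",
)


def enable_flags(*names):
    """Return keyword arguments that set each requested flag to True."""
    unknown = [name for name in names if name not in DISABLED_BY_DEFAULT]
    if unknown:
        raise ValueError(f"not a listed flag: {unknown}")
    return {name: True for name in names}


def resume_with_logging(experiment_key):
    """Reopen an Experiment with code, Git metadata, and environment logging on."""
    # Imported lazily so enable_flags can be used without comet_ml installed.
    from comet_ml import ExistingExperiment

    return ExistingExperiment(
        previous_experiment=experiment_key,
        **enable_flags("log_code", "log_git_metadata", "log_env_details"),
    )
```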

OfflineExperiment

OfflineExperiment has all the same methods and properties as the Experiment object.

The following are the typical use cases for this object:

  • You might have to run your experiments on machines that do not have a public internet connection.
  • Your experimentation generates a high volume of data in a short period of time. With the Experiment object, there is a risk of throttling. Saving your data to an OfflineExperiment object lets you bulk upload the data once the run is complete and avoid Comet’s throttling.
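A minimal sketch of an offline run follows. The project name, the directory name "comet-offline", and the helper names make_offline_dir and train_offline are hypothetical; the loss values stand in for whatever your training loop produces.

```python
import os
import tempfile


def make_offline_dir(base=None):
    """Create (if needed) and return a directory for offline archives."""
    path = os.path.join(base or tempfile.gettempdir(), "comet-offline")
    os.makedirs(path, exist_ok=True)
    return path


def train_offline(offline_directory, losses):
    """Log a sequence of loss values to a local archive instead of the Comet server."""
    # Imported lazily so make_offline_dir can be used without comet_ml installed.
    from comet_ml import OfflineExperiment

    experiment = OfflineExperiment(
        project_name="offline-demo",  # hypothetical project name
        offline_directory=offline_directory,
    )
    for step, loss in enumerate(losses):
        experiment.log_metric("loss", loss, step=step)
    experiment.end()  # finalizes the archive in offline_directory
```

Once the machine has connectivity, the resulting archive can be uploaded, for example with the comet upload command-line tool pointed at the generated .zip file.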

ExistingOfflineExperiment

ExistingOfflineExperiment has all the same methods and properties as the ExistingExperiment object.

The typical use case for this object:

  • You have an Experiment that exists in Comet but you have an unreliable or unavailable internet connection. You can use ExistingOfflineExperiment to log your experimentation data locally and upload it to the existing Comet Experiment once you have access to a reliable internet connection.

API

The API object is a wrapper around Comet’s REST API. It contains a set of convenience methods for fetching data logged to Comet. This object is useful when you want to fetch or analyze your data outside of the Comet UI.

APIExperiment

APIExperiment is a wrapper object for the data returned from the Comet Python API object. Each APIExperiment object contains data for a single experiment that has been logged to Comet. This object has a set of convenience methods that make it easy to access and download the logged data present inside each experiment.

The typical use cases for this object:

  • You would like to fetch CPU or GPU metrics across all projects in your workspace and calculate summary metrics to assess how much compute the ML team is using. Currently, there is no way to do this through the UI. You would use the Python API object to pull this data from your workspace and compute these metrics in a custom script.
  • You would like to fetch all the model binaries logged to a particular project to create an ensemble model.
  • You want to calculate the average model accuracy across multiple projects working on the same task.
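The last use case, averaging a metric across projects, could be sketched as follows. This assumes that get_metrics_summary returns, for a named metric, a dictionary whose "valueCurrent" field holds the latest value (possibly as a string), and an empty result when the metric was never logged; verify that shape against your SDK version. The helper names and the metric name "accuracy" are hypothetical.

```python
def mean_current_value(summaries):
    """Average the 'valueCurrent' field over metric summaries, skipping empty ones."""
    values = [
        float(summary["valueCurrent"])
        for summary in summaries
        if summary and "valueCurrent" in summary
    ]
    if not values:
        raise ValueError("no metric values found")
    return sum(values) / len(values)


def average_metric(workspace, project_names, metric="accuracy"):
    """Fetch one metric summary per experiment across projects and average it."""
    # Imported lazily so mean_current_value can be used without comet_ml installed.
    from comet_ml import API

    api = API()  # assumes the API key is already configured
    summaries = []
    for project_name in project_names:
        for experiment in api.get_experiments(workspace, project_name=project_name):
            summaries.append(experiment.get_metrics_summary(metric))
    return mean_current_value(summaries)
```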

Model

The Model object can be retrieved from the API class and provides easy-to-use, performant access to the models in the Model Registry. To get a Model object, call api.get_model(workspace=<your_workspace>, model_name=<your_model>). The Model class is the recommended way to programmatically interact with your registered models, using methods such as model.find_versions() or model.download().
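As a rough sketch of that workflow, the code below picks the highest registered version and downloads it. It assumes version strings are purely numeric and dot-separated (for example "1.2.0"); pre-release suffixes would need extra handling. The helper names latest_version and download_latest and the default output folder are hypothetical.

```python
def latest_version(versions):
    """Return the highest dot-separated numeric version string from a list."""
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


def download_latest(workspace, model_name, output_folder="./model"):
    """Download the highest registered version of a model from the Model Registry."""
    # Imported lazily so latest_version can be used without comet_ml installed.
    from comet_ml import API

    api = API()  # assumes the API key is already configured
    model = api.get_model(workspace=workspace, model_name=model_name)
    version = latest_version(model.find_versions())
    model.download(version, output_folder)
    return version
```

Comparing versions as integer tuples avoids the string-ordering pitfall where "1.9.3" would sort above "1.10.0".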

Mar. 27, 2024