Integrate with MLflow¶
Comet has extensive support for users of MLflow.
Comet supports the use of MLflow in two different ways:
- Built-in, core Comet support for MLflow
- The Comet for MLflow extension
The following sections provide details of both methods.
Built-in, core Comet support for MLflow¶
If you're already using MLflow, then Comet will work with MLflow with no further configuration.
Run any MLflow script from the console, as follows:
comet python mlflow_script.py
Alternatively, you can add this one line of code to the top of your MLflow training script and run your script as you normally would:
import comet_ml
How it works¶
Comet's built-in, core support for MLflow attempts to create a live, online Experiment if a Comet API Key is configured. If a Comet API Key cannot be found, you will see the following log message:
No Comet API Key was found, creating an OfflineExperiment. Set up your API Key to get the full Comet experience: https://www.comet.com/docs/api-and-sdk/python-sdk/advanced/configuration/.
If no API key is found, the Comet SDK still creates an OfflineExperiment, so you still get all of Comet's additional tracking data. Just remember to upload the offline experiment archive later. At the end of the run, the script prints the exact command to run, similar to the following:
comet upload /path/to/archive.zip
Any future Experiment runs that are created with this script automatically include Comet's extended Experiment tracking to MLflow.
When you run MLflow by importing comet_ml or by using the command-line comet python script.py, you automatically log all of the following items to Comet's single experiment page:
- Metrics: Logged to the metrics tab
- Hyperparameters: Logged to the hyperparameters tab
- Models: Logged to the assets tab
- Assets: Logged to the assets tab
- Source code: Logged to the code tab
- Git repo and patch info: Available by clicking the reproduce button
- System metrics:
  - CPU and GPU usage: Logged to the system metrics tab
- Python packages: Logged to the installed packages tab
- Command-line arguments
- Standard Output: Logged to the output tab
- Installed OS Packages: Available through the
For more information about using environment parameters in Comet, see Configure Comet.
For more information on using Comet in the console, see Comet Command-Line Utilities.
Now, explore the other support method for MLflow users.
Comet for MLflow extension¶
If you would like to see your previously run MLflow Experiments in Comet, try the comet_for_mlflow extension. To do this, first install the open-source Python extension and command-line interface (CLI) command:
pip install comet-for-mlflow
The Comet for MLflow extension finds any existing MLflow runs in your current folder and makes them available for analysis in Comet. For more options, use comet_for_mlflow --help and see the following section.
The Comet for MLflow Extension is an open-source project and can be found at: github.com/comet-ml/comet-for-mlflow/
We welcome any questions, bug fixes, and comments in that Git repo.
Advanced CLI usage for Comet for MLflow Extension¶
The comet_for_mlflow command offers several options to help you get the most out of previous MLflow runs with Comet:
- --upload: Automatically upload the prepared Experiments to Comet.
- --no-upload: Do not upload the prepared Experiments to Comet.
- --api-key API_KEY: Set the Comet API key.
- --mlflow-store-uri MLFLOW_STORE_URI: Set the MLflow store URI.
- --output-dir OUTPUT_DIR: Set the directory in which to store prepared runs.
- --force-reupload: Force re-upload of the prepared Experiments.
- --yes: Automatically answer all yes/no questions with "yes".
- --no: Automatically answer all yes/no questions with "no".
- --email EMAIL: Set the email address, if needed, for creating an account.
For more information, use
comet_for_mlflow --help or see github.com/comet-ml/comet-for-mlflow.
Configure Comet for MLflow¶
Calling mlflow.start_run() in your code creates an Experiment object. The auto-logging features of this Experiment object can be configured through an Experiment parameter, an environment variable, or a Comet configuration setting:
| Item | Experiment Parameter | Environment Setting | Configuration Setting |
| --- | --- | --- | --- |
| Metric logging rate | auto_metric_step_rate | COMET_AUTO_LOG_METRIC_STEP_RATE | comet.auto_log.metric_step_rate |
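For example, the metric logging rate can be controlled through the environment before the Experiment is created (a sketch; the value 10 is illustrative):

```python
import os

# Record auto-logged metrics only every 10th step.
# Equivalent to passing auto_metric_step_rate=10 to the Experiment,
# or setting comet.auto_log.metric_step_rate in the Comet configuration file.
os.environ["COMET_AUTO_LOG_METRIC_STEP_RATE"] = "10"
```

Environment variables must be set before the Experiment object is created for the setting to take effect.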
As mentioned, Comet supports MLflow users through two different approaches:
- Built-in, core Comet support for MLflow
- Comet for MLflow Extension
The first is useful for running new Experiments and requires you to use import comet_ml or comet python script.py. The second is useful for previously run MLflow Experiments and requires the comet_for_mlflow command.
There are some differences in the way these two methods operate. Specifically:
| Item logged? | Comet built-in, core support | Comet for MLflow extension |
| --- | --- | --- |
| Git repo and patch info | Yes | No |
| CPU and GPU usage | Yes | No |
| Installed OS packages | Yes | No |
Limitations in Comet support for MLflow¶
When running the built-in, core Comet support, there are two limitations:
- It does not support MLflow nested runs.
- It does not support continuing a previous MLflow run; a new Comet Experiment is created in this case.
The following end-to-end Keras example shows the built-in support in action:
import comet_ml
import keras

# The following import and function call are the only additions to code required
# to automatically log metrics and parameters to MLflow.
import mlflow
import mlflow.keras
import numpy as np
from keras.datasets import reuters
from keras.layers import Activation, Dense, Dropout
from keras.models import Sequential
from keras.preprocessing.text import Tokenizer

# The sqlite store is needed for the model registry
mlflow.set_tracking_uri("sqlite:///db.sqlite")

# We need to create a run before calling keras or MLflow will end the run by itself
mlflow.start_run()

mlflow.keras.autolog()

max_words = 1000
batch_size = 32
epochs = 5

print("Loading data...")
(x_train, y_train), (x_test, y_test) = reuters.load_data(
    num_words=max_words, test_split=0.2
)
print(len(x_train), "train sequences")
print(len(x_test), "test sequences")

num_classes = np.max(y_train) + 1
print(num_classes, "classes")

print("Vectorizing sequence data...")
tokenizer = Tokenizer(num_words=max_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode="binary")
x_test = tokenizer.sequences_to_matrix(x_test, mode="binary")
print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)

print(
    "Convert class vector to binary class matrix "
    "(for use with categorical_crossentropy)"
)
y_train = keras.utils.np_utils.to_categorical(y_train, num_classes)
y_test = keras.utils.np_utils.to_categorical(y_test, num_classes)
print("y_train shape:", y_train.shape)
print("y_test shape:", y_test.shape)

print("Building model...")
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation("softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

history = model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_split=0.1,
)
score = model.evaluate(x_test, y_test, batch_size=batch_size, verbose=1)
print("Test score:", score[0])
print("Test accuracy:", score[1])

mlflow.keras.log_model(model, "model", registered_model_name="Test Model")
mlflow.end_run()
Try it out¶
Here is an example Colab Notebook for using Comet with MLflow.