Integrate with scikit-learn

Comet integrates with scikit-learn.

Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

Log automatically

Comet can automatically log the following items from scikit-learn, without any manual instrumentation of your code:

  • Hyperparameters
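
For comparison, here is a minimal sketch of what logging hyperparameters by hand might look like when automatic logging is turned off. It assumes an experiment object with Comet's `log_parameters` method and a scikit-learn estimator exposing `get_params`. The helper name `log_estimator_params` is hypothetical and not part of either library, and the set of values Comet logs automatically may differ from what this helper sends.

```python
def log_estimator_params(experiment, estimator):
    """Log an estimator's hyperparameters to an experiment by hand.

    Hypothetical helper for illustration; not part of comet_ml or scikit-learn.
    """
    # get_params() is the standard scikit-learn accessor for an estimator's
    # constructor arguments; it returns a dict of name -> value.
    params = estimator.get_params()
    # log_parameters() takes a dict and records each entry on the experiment.
    experiment.log_parameters(params)
    return params
```

With the objects from the end-to-end example further down, this would be called as `log_estimator_params(experiment, clf)`.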

Configure Comet for scikit-learn

You can control what is automatically logged by Comet through an experiment parameter, environment variable, or configuration setting:

Item            | Experiment Parameter | Environment Setting       | Configuration Setting
hyperparameters | auto_param_logging   | COMET_AUTO_LOG_PARAMETERS | comet.auto_log.parameters

For more information about using environment variables and configuration settings in Comet, see Configure Comet.
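
As a rough illustration of two of the options listed above, the sketch below disables automatic hyperparameter logging once through the environment variable and once through the experiment parameter. The value `"false"` for the environment variable is an assumption about the accepted spelling, and the function name `create_experiment_without_param_logging` is hypothetical.

```python
import os

# Option 1: environment variable, set before the experiment is created.
# Assumption: the string "false" is an accepted way to switch the setting off.
os.environ["COMET_AUTO_LOG_PARAMETERS"] = "false"


def create_experiment_without_param_logging(project_name):
    """Option 2: pass the experiment parameter directly (hypothetical helper)."""
    # Imported inside the function so this sketch loads even where comet_ml
    # is not installed; in a real script, import comet_ml at the top of the
    # file, before scikit-learn.
    import comet_ml

    return comet_ml.Experiment(
        project_name=project_name,
        auto_param_logging=False,
    )
```

Either option on its own should be enough; they are shown together here only to put the two forms side by side.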

End-to-end example

Here is a scikit-learn example.

For more examples using scikit-learn, see our examples GitHub repository.

# Import comet_ml before scikit-learn so that automatic logging can be set up
import comet_ml

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, precision_score, recall_score

# Fixed seed so that the train/test split is reproducible
random_state = 42

def evaluate(y_test, y_pred):
  return {
      'f1': f1_score(y_test, y_pred),
      'precision': precision_score(y_test, y_pred),
      'recall': recall_score(y_test, y_pred)
  }

# Create an experiment with your API key
experiment = comet_ml.Experiment(
    api_key="<Your API Key>",
    project_name="<Your Project Name>"
)

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data,
    cancer.target,
    stratify=cancer.target,
    random_state=random_state)

clf = RandomForestClassifier(random_state=random_state)
clf.fit(X_train, y_train)

# Log Training Metrics
y_train_pred = clf.predict(X_train)
with experiment.train():
  metrics = evaluate(y_train, y_train_pred)
  experiment.log_metrics(metrics)

# Log Test Metrics
y_test_pred = clf.predict(X_test)

with experiment.test():
  metrics = evaluate(y_test, y_test_pred)
  experiment.log_metrics(metrics)

Note

There are alternatives to setting the API key programmatically. See more here.
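
One such alternative is to keep the key in the `COMET_API_KEY` environment variable, which Comet reads when no `api_key` argument is passed. The sketch below is an optional pre-flight check, not something Comet requires; the helper name `resolve_api_key` is hypothetical.

```python
import os


def resolve_api_key(explicit_key=None):
    """Return the explicit key if given, else fall back to COMET_API_KEY.

    Hypothetical helper for illustration; not part of comet_ml.
    """
    key = explicit_key or os.environ.get("COMET_API_KEY")
    if not key:
        # Fail early with a clear message instead of at experiment creation.
        raise RuntimeError(
            "No Comet API key found: pass one explicitly or set COMET_API_KEY."
        )
    return key
```

A script could call `resolve_api_key()` once at startup and pass the result as `api_key`, or simply rely on the environment variable and omit the argument.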

This example runs a grid search over parameter combinations and logs each combination as its own Comet experiment:

from comet_ml import Experiment
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svr = svm.SVC()
clf = GridSearchCV(svr, parameters)
clf.fit(iris.data, iris.target)

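# Log each parameter combination as a separate experiment
for i in range(len(clf.cv_results_['params'])):
    exp = Experiment(workspace="your workspace",
                     project_name="grid_search_example")
    for k, v in clf.cv_results_.items():
        if k == "params":
            exp.log_parameters(v[i])
        else:
            exp.log_metric(k, v[i])
    # End this experiment before starting the next one
    exp.end()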
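
The loop above sends every column of `cv_results_` other than `params` to `log_metric`, including the `param_*` columns that repeat the parameter values. One possible refinement, sketched here with a hypothetical helper `split_cv_row`, is to separate each row into a parameter dict and a dict of numeric metrics first. The helper works on any dict shaped like `cv_results_` and does not depend on Comet.

```python
from numbers import Number


def split_cv_row(cv_results, i):
    """Split row i of a cv_results_-style dict into (params, metrics).

    Hypothetical helper for illustration; not part of scikit-learn or comet_ml.
    """
    params = dict(cv_results["params"][i])
    metrics = {}
    for key, values in cv_results.items():
        # "params" is handled above; "param_*" columns duplicate its contents.
        if key == "params" or key.startswith("param_"):
            continue
        value = values[i]
        # Keep only numeric entries such as scores, ranks, and timings.
        if isinstance(value, Number):
            metrics[key] = value
    return params, metrics
```

Inside the loop, the two results could then be passed to `exp.log_parameters(params)` and `exp.log_metrics(metrics)`.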

Try it out!

Try our example for using Comet with scikit-learn.

Open In Colab

Mar. 21, 2023