Integrate with scikit-learn¶
Comet integrates with scikit-learn.
Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
Log automatically¶
Comet can automatically log the following items from scikit-learn without requiring you to manually instrument your code:
- Hyperparameters
Configure Comet for scikit-learn¶
You can control what is automatically logged by Comet through an experiment parameter, environment variable, or configuration setting:
Item | Experiment Parameter | Environment Variable | Configuration Setting |
---|---|---|---|
hyperparameters | auto_param_logging | COMET_AUTO_LOG_PARAMETERS | comet.auto_log.parameters |
For more information about using environment variables and configuration settings in Comet, see Configure Comet.
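As a minimal sketch of the first two columns in the table, the snippet below turns off automatic hyperparameter logging once through the environment variable and once through the experiment parameter. The helper name `make_experiment` is hypothetical and used only for illustration, and the string `"false"` is an assumption about the accepted boolean spelling; check Configure Comet for the formats your SDK version supports.

```python
import os

# Option 1: the environment variable from the table. Set it before the
# experiment is created. The value "false" is an assumed boolean spelling.
os.environ["COMET_AUTO_LOG_PARAMETERS"] = "false"


def make_experiment(project_name):
    """Hypothetical helper: create an experiment with parameter auto-logging off.

    comet_ml is imported lazily here only so this sketch loads without the SDK.
    In a real script, import comet_ml at the top, before scikit-learn.
    """
    import comet_ml

    # Option 2: the experiment parameter from the table.
    return comet_ml.Experiment(
        project_name=project_name,
        auto_param_logging=False,
    )
```

Either option on its own should be enough; calling `make_experiment("sklearn-demos")` additionally requires a configured API key.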
End-to-end example¶
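The following example trains a random forest classifier on the scikit-learn breast cancer dataset and logs training and test metrics to Comet.
For more examples using scikit-learn, see our examples GitHub repository.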
# Import comet_ml before scikit-learn so that automatic logging can be set up
import comet_ml

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix

random_state = 42


def evaluate(y_test, y_pred):
    """Return F1, precision, and recall for the given predictions."""
    return {
        'f1': f1_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred)
    }


# Create an experiment with your API key
experiment = comet_ml.Experiment(
    api_key="<Your API Key>",
    project_name="<Your Project Name>",
    # Uncomment to turn off automatic hyperparameter logging:
    # auto_param_logging=False,
)

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data,
    cancer.target,
    stratify=cancer.target,
    random_state=random_state)

clf = RandomForestClassifier()
clf.fit(X_train, y_train)

# Log Training Metrics
y_train_pred = clf.predict(X_train)
with experiment.train():
    metrics = evaluate(y_train, y_train_pred)
    experiment.log_metrics(metrics)

# Log Test Metrics
y_test_pred = clf.predict(X_test)
with experiment.test():
    metrics = evaluate(y_test, y_test_pred)
    experiment.log_metrics(metrics)
Note
You do not have to set the API key programmatically. It can also be supplied through an environment variable or a configuration file; see Configure Comet.
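To make the logged values concrete, here is a plain-Python sketch of what the three numbers returned by `evaluate` measure for binary labels. It is an illustration of the standard definitions, not scikit-learn's implementation, and the function name `binary_metrics` and the label lists are made up for this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"f1": f1, "precision": precision, "recall": recall}


# Illustrative labels only: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A dictionary of this shape is what `experiment.log_metrics` receives in the example above, once for the training split and once for the test split.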
Use scikit-learn grid search¶
This example runs a grid search across parameter combinations and logs each combination to Comet as a separate experiment:
from comet_ml import Experiment
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svr = svm.SVC()
clf = GridSearchCV(svr, parameters)
clf.fit(iris.data, iris.target)

# Create one experiment per parameter combination
for i in range(len(clf.cv_results_['params'])):
    exp = Experiment(workspace="your workspace",
                     project_name="grid_search_example")
    for k, v in clf.cv_results_.items():
        if k == "params":
            # Log the parameter combination for this candidate
            exp.log_parameters(v[i])
        else:
            # Log every other cv_results_ entry as a metric
            exp.log_metric(k, v[i])
    # End this experiment before starting the next one
    exp.end()
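The loop above relies on `cv_results_` being a dictionary of columns, where position `i` in every column belongs to the same parameter combination. The sketch below shows that reshaping on its own, without Comet or scikit-learn. The function name `split_cv_results` and the small dictionary are hypothetical stand-ins, and its scores are placeholders rather than real results.

```python
def split_cv_results(cv_results):
    """Turn a cv_results_-style dict of columns into one record per candidate."""
    n_candidates = len(cv_results["params"])
    records = []
    for i in range(n_candidates):
        # The "params" column holds one dict of settings per candidate.
        params = dict(cv_results["params"][i])
        # Every other column contributes one value for this candidate.
        metrics = {
            key: column[i]
            for key, column in cv_results.items()
            if key != "params"
        }
        records.append((params, metrics))
    return records


# Hypothetical stand-in for clf.cv_results_ with placeholder scores.
example = {
    "params": [{"C": 1, "kernel": "linear"}, {"C": 1, "kernel": "rbf"}],
    "mean_test_score": [0.5, 0.25],
    "rank_test_score": [1, 2],
}

for params, metrics in split_cv_results(example):
    print(params, metrics)
```

Each record corresponds to one experiment in the grid search loop: the first element is what goes to `log_parameters`, and the entries of the second are what go to `log_metric`.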
Try it out!¶
Try our example for using Comet with scikit-learn.