
Finding the best hyperparameters for your ML experiments

A common operation in many machine learning tasks is to find the best set of hyperparameters for training a model. The best set is usually determined by trying to either maximize a metric like "accuracy" or minimize a metric like "loss". The search process goes by various names, including:

  • hyperparameter optimization
  • hyperparameter tuning
  • running sweeps
  • grid search

There are a few tools for helping you find the optimal set of hyperparameters, such as:

  • scikit-learn - has many variations; can log to Comet
  • Optuna - flexible; version 4.0.0 now with built-in support for Comet
  • Comet's Optimizer - tight integration with experiment management; client-server architecture

You can use Comet's experiment management with any of the above, or really any hyperparameter search tools. For additional details on using other optimizers, please see: Third-party Optimizers

Comet comes with a powerful optimizer for sophisticated sweeps and grid searches. Comet's Optimizer uses a client-server architecture that allows you to perform sweeps across threads, processes, computers, and clusters. It streamlines the search for the optimal set of hyperparameters by offering a unified, framework-agnostic interface that integrates with Comet Experiment Management.

Figure: Example Project view of hyperparameter selection and tuning performance with Comet Optimizer

The Comet Optimizer has many features, including:

  1. If a training run fails to complete (e.g., due to a crash), Comet's Optimizer will retry the parameter combination a specified number of times.
  2. With Comet's Optimizer, you can automatically run multiple runs of the same parameter set.
  3. Comet's Optimizer supports distributed computing, so a sweep can be spread across threads, processes, computers, and clusters.
  4. You can control the sampling resolution of each parameter independently. For example, one variable may be sampled coarsely while another is sampled finely.
  5. Comet's Optimizer has many different parameter types and ranges.
  6. The Optimizer has three different types of sweeps: grid, random, and bayes.
  7. Comet's Optimizer has the ability to resume or extend a previous sweep. This is a unique functionality that is only possible when you combine an Optimizer with an Experiment Tracking tool.
  8. The Optimizer's Bayes algorithm uses Sequential Model-Based Optimization (SMBO) with the Tree-structured Parzen Estimator (TPE).
  9. Comet provides fully integrated reports and charts for analyzing Optimizer results.
  10. All of the above is controlled through an easy-to-use configuration dictionary.
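
The first two features in the list above can be pictured with a small plain-Python sketch. It is not Comet's implementation: `run_with_retries`, `flaky_train`, and the `retry_limit` and `trials` names are hypothetical and chosen only for illustration. Each parameter set is evaluated `trials` times, and a run that raises an error is retried up to `retry_limit` times.

```python
import random
from statistics import mean


def run_with_retries(train_fn, params, retry_limit, rng):
    """Call train_fn(params, rng); retry on failure up to retry_limit times."""
    for _ in range(retry_limit + 1):
        try:
            return train_fn(params, rng)
        except RuntimeError:
            # A crashed run: try the same parameter combination again.
            continue
    return None  # every attempt failed


def flaky_train(params, rng):
    """Hypothetical training run that crashes about 30% of the time."""
    if rng.random() < 0.3:
        raise RuntimeError("simulated crash")
    # Toy metric: best when learning_rate is close to 0.01.
    return 1.0 - abs(params["learning_rate"] - 0.01)


rng = random.Random(0)
params = {"learning_rate": 0.01}
trials = 3  # number of repeated runs of the same parameter set

scores = [
    run_with_retries(flaky_train, params, retry_limit=5, rng=rng)
    for _ in range(trials)
]
completed = [s for s in scores if s is not None]
print(f"{len(completed)} of {trials} runs completed")
if completed:
    print(f"mean metric: {mean(completed):.3f}")
```

Averaging over repeated runs is one simple way to reduce the effect of run-to-run noise when comparing parameter sets.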

As mentioned, grid search, random search, and Bayes optimization are all supported by Comet Optimizer. You can learn more about each approach in our Hyperparameter Optimization With Comet blog post.
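
To make the difference between the first two approaches concrete, here is a minimal plain-Python sketch that does not use Comet. The `toy_accuracy` function and the value lists are made up for illustration. Grid search evaluates every combination, while random search evaluates a fixed budget of randomly drawn combinations. Bayesian optimization, which uses earlier results to choose the next candidate, is left out for brevity.

```python
import itertools
import random

learning_rates = [0.0001, 0.001, 0.01, 0.1]
batch_sizes = [32, 64, 128, 256]


def toy_accuracy(learning_rate, batch_size):
    """Made-up metric that peaks at learning_rate=0.01, batch_size=64."""
    return 1.0 - abs(learning_rate - 0.01) - abs(batch_size - 64) / 1000


# Grid search: try every combination (4 x 4 = 16 evaluations).
grid_candidates = list(itertools.product(learning_rates, batch_sizes))
grid_best = max(grid_candidates, key=lambda c: toy_accuracy(*c))

# Random search: try a fixed budget of randomly drawn combinations.
rng = random.Random(42)
budget = 6
random_candidates = [
    (rng.choice(learning_rates), rng.choice(batch_sizes)) for _ in range(budget)
]
random_best = max(random_candidates, key=lambda c: toy_accuracy(*c))

print("grid search evaluated", len(grid_candidates), "combinations, best:", grid_best)
print("random search evaluated", len(random_candidates), "combinations, best:", random_best)
```

Random search is cheaper here but is not guaranteed to hit the best combination; grid search is exhaustive but its cost grows with every added parameter.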

Get started: Hyperparameter Tuning

Hyperparameter tuning with Comet Optimizer is a simple two-step process: configure an Optimizer object, then run it.

Comet Optimizer is accessible from both the Python SDK and the CLI. The step-by-step instructions below use the Python SDK, but you can refer to the end-to-end examples to explore both options.

Prerequisites: Choose your tuning strategy

Before you begin optimization, you need to choose:

  • The hyperparameters to tune.
  • The search space for each hyperparameter to tune.
  • The search algorithm (one of grid search, random search, and Bayesian optimization).
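
As a rough illustration of what a search space can look like, the sketch below describes one continuous range sampled on a log scale and one discrete set, then draws a candidate from each. It is plain Python, not Comet code; `sample_log_uniform` and the layout of `search_space` are hypothetical.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform in [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


search_space = {
    # Continuous range spanning two orders of magnitude: sampling on a log
    # scale gives each order of magnitude the same chance of being tried.
    "learning_rate": {"low": 1e-5, "high": 1e-3},
    # Discrete set: pick one of the listed values.
    "batch_size": [32, 64, 128, 256],
}

rng = random.Random(7)
candidate = {
    "learning_rate": sample_log_uniform(
        search_space["learning_rate"]["low"],
        search_space["learning_rate"]["high"],
        rng,
    ),
    "batch_size": rng.choice(search_space["batch_size"]),
}
print(candidate)
```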

Additionally, make sure to refactor your existing model training code so that hyperparameters are defined as parametrized variables.
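
For instance, a training routine with hard-coded values can be turned into a function whose hyperparameters are arguments. The sketch below uses a tiny gradient-descent loop on a one-dimensional quadratic as a stand-in for real training; `train` and its arguments are illustrative names, not part of any Comet API.

```python
def train(learning_rate, num_steps):
    """Minimize f(w) = (w - 3)^2 with gradient descent and return the final loss.

    Both hyperparameters are arguments rather than constants buried in the
    body, so a tuning loop can call this function with different values.
    """
    w = 0.0
    for _ in range(num_steps):
        gradient = 2 * (w - 3)
        w -= learning_rate * gradient
    return (w - 3) ** 2


# The same function can now be evaluated for several hyperparameter choices.
for lr in (0.01, 0.1, 0.5):
    print(f"learning_rate={lr}: final loss = {train(lr, num_steps=20):.6f}")
```

With the hyperparameters exposed this way, a tuning loop only needs to supply the values it wants to try.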

1. Define the Optimizer configuration

Optimizer accesses your hyperparameter search options from a config dictionary that you define.

Comet uses the config dictionary to dynamically find the best set of hyperparameter values that will minimize or maximize a particular metric of your choice.

You can define the config dictionary in code or in a JSON config file. For example, you could define your configuration dictionary in code as:

config = {
    # Pick the algorithm:
    "algorithm": "bayes",

    # Declare your hyperparameters:
    "parameters": {
        "learning_rate": {"type": "float", "scaling_type": "log_uniform", "min": 0.00001, "max": 0.001},
        "batch_size": {"type": "discrete", "values": [32, 64, 128, 256]},
    },

    # Declare what to optimize, and how:
    "spec": {
      "maxCombo": 20,
      "metric": "accuracy",
      "objective": "maximize",
    },
}

On top of these three mandatory configuration keys, the Optimizer configuration dictionary also supports the optional keys name and trials. You can find a full description of each configuration key in the Configure the Optimizer page.
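
If you prefer to keep the configuration in a JSON file, the sketch below writes a dictionary like the one above to disk and reads it back with Python's standard `json` module. The file name `optimizer_config.json` and the `check_required_keys` helper are hypothetical; the helper only checks for the three mandatory top-level keys.

```python
import json
import os
import tempfile

config = {
    "algorithm": "bayes",
    "parameters": {
        "learning_rate": {
            "type": "float",
            "scaling_type": "log_uniform",
            "min": 0.00001,
            "max": 0.001,
        },
        "batch_size": {"type": "discrete", "values": [32, 64, 128, 256]},
    },
    "spec": {"maxCombo": 20, "metric": "accuracy", "objective": "maximize"},
}


def check_required_keys(cfg):
    """Return the mandatory top-level keys that are missing from cfg."""
    return [key for key in ("algorithm", "parameters", "spec") if key not in cfg]


# Write the config to a JSON file (hypothetical file name) and load it again.
path = os.path.join(tempfile.mkdtemp(), "optimizer_config.json")
with open(path, "w") as f:
    json.dump(config, f, indent=2)

with open(path) as f:
    loaded = json.load(f)

print("missing keys:", check_required_keys(loaded))
# The loaded dictionary can then be passed on, e.g. comet_ml.Optimizer(config=loaded)
```

Keeping the configuration in a file makes it easy to version alongside the training script and to reuse across runs.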

2. Run optimization

The Optimizer configuration defined in the previous step is used to initialize the Optimizer object, e.g. by running:

import comet_ml

opt = comet_ml.Optimizer(config=config)

Under the hood, the Optimizer object creates one Experiment object per tuning run, i.e. for each automated hyperparameter selection.

You can then perform your tuning trials by iterating through the tuning runs with opt.get_experiments(). Within the scope of each tuning experiment, you train and evaluate your models as you normally would, passing in the parameters selected by the Optimizer through the experiment.get_parameter() method. For example, the pseudocode below showcases how to run tuning experiments against the learning_rate and batch_size parameters:

for exp in opt.get_experiments():
    model = my_model(
        learning_rate=exp.get_parameter("learning_rate"),
        batch_size=exp.get_parameter("batch_size"),
    )

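    # Train and evaluate the model here (train_and_evaluate is a placeholder)
    loss, accuracy = train_and_evaluate(model, ...)

    # Log the metric named in the config's "spec" so the Optimizer can use it
    exp.log_metric("accuracy", accuracy)

    # Don't forget to log other parameters and metrics not tracked by the optimizer,
    # and also relevant assets as you would for any experiment
    exp.log_metric("loss", loss)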
    exp.log_model(f"model_{opt.get_id()}", model)

    # End the current experiment
    exp.end()

Comet Optimizer logs all of the optimizer-related values in the Other tab of the Single Experiment page with the name prefix "optimizer_".

In addition, the Comet Optimizer automatically logs all of the hyperparameters so that they are also accessible in the Hyperparameters tab of the Single Experiment page.

You can then navigate the results of a Comet Optimizer run as you would with any Comet ML Project containing multiple experiments. Learn more in the Analyze hyperparameter tuning results page.

Tip

Make your optimization runs more efficient through parallel execution!

Discover how from the Run hyperparameter tuning in parallel page.

End-to-end examples

Below, we provide an end-to-end example on how to run optimization with the Comet SDK, and references on how to run optimization from the Comet CLI.

import comet_ml
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Initialize the Comet SDK
comet_ml.login(project_name="example-optimizer")

# Create a dataset
X, y = make_classification(n_samples=5000, n_informative=3, random_state=25)

# Split dataset into train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, shuffle=True, test_size=0.25, random_state=25
)

# Set the parameters for tuning
model_params = {
    "n_estimators": {
        "type": "integer",
        "scaling_type": "uniform",
        "min": 100,
        "max": 300
    },
    "criterion": {
        "type": "categorical",
        "values": ["gini", "entropy"]
    },
    "min_samples_leaf": {
        "type": "discrete",
        "values": [1, 3, 5, 7, 9]
    },
}

# Set the spec for the Bayes Optimization algorithm
spec = {
    "maxCombo": 20,
    "objective": "maximize",
    "metric": "accuracy",
    "minSampleSize": 500,
    "retryLimit": 20,
    "retryAssignLimit": 0,
}

# Define the configuration dictionary
config_dict = {
    "algorithm": "bayes",
    "spec": spec,
    "parameters": model_params,
    "name": "Bayes Optimization",
    "trials": 10,
}

# Initialize the Comet Optimizer
opt = comet_ml.Optimizer(config=config_dict)

# Run optimization
for experiment in opt.get_experiments():
    # Initialize the algorithm, and set the parameters to be optimized with get_parameter
    random_forest = RandomForestClassifier(
        n_estimators=experiment.get_parameter("n_estimators"),
        criterion=experiment.get_parameter("criterion"),
        min_samples_leaf=experiment.get_parameter("min_samples_leaf"),
        random_state=25,
    )

    # Train the model and make predictions
    random_forest.fit(X_train, y_train)
    y_hat = random_forest.predict(X_test)

    # Log the random state and accuracy of each model
    experiment.log_parameter("random_state", 25)
    experiment.log_metric("accuracy", accuracy_score(y_test, y_hat))
    experiment.log_confusion_matrix(y_test, y_hat)

    # End the current experiment
    experiment.end()

The comet command-line utility, which is installed with the Comet SDK, provides an optimize command for running hyperparameter optimization from the command line:

$ comet optimize [options] [PYTHON_SCRIPT] OPTIMIZER

where OPTIMIZER is defined from a dictionary, a JSON file, or an Optimizer ID.

For more information, please refer to comet optimize and the end-to-end example for parallel execution.

That's it! As you launch the optimization process, Comet Optimizer iteratively explores the hyperparameter space, and automatically logs the hyperparameter selection and optimization metric performance for the tuning experiment.

For this example, additional internal sklearn parameters are logged on the Hyperparameters tab of the Single Experiment page thanks to Comet's built-in scikit-learn integration.

Try it now!

We have created two beginner demo projects for you to explore as you get started with Comet Optimizer.

Click on the Colab Notebook below to review the code for an example hyperparameter tuning run of a Keras model, managed with Comet Optimizer.

Open In Colab

Additionally, you can explore the example Project created from running the end-to-end example in the Comet UI by clicking on the button below!

Sep. 17, 2024