The Comet Optimizer is a powerful, intuitive tool in your automated hyperparameter tuning toolbox.

Use Optimizer to dynamically find the best set of hyperparameter values that will minimize or maximize a particular metric. It can make suggestions for what hyperparameter values to try next, either in serial or in parallel (or a combination).

In its simplest form, you can use the hyperparameter search this way:

# file:

from comet_ml import Optimizer

# You need to specify the algorithm and hyperparameters to use:
config = {
    # Pick the Bayes algorithm:
    "algorithm": "bayes",

    # Declare your hyperparameters:
    "parameters": {
        "x": {"type": "integer", "min": 1, "max": 5},
    },

    # Declare what to optimize, and how:
    "spec": {
        "metric": "loss",
        "objective": "minimize",
    },
}

# Next, create an optimizer, passing in the configuration:
opt = Optimizer(config)

# define fit function here!

# Finally, get experiments, and train your models:
for experiment in opt.get_experiments():
    # Test the model
    loss = fit(experiment.get_parameter("x"))
    experiment.log_metric("loss", loss)

That's it! Comet will provide you with an Experiment object already set up with the suggested parameters to try. You merely need to train the model and log the metric to optimize ("loss" in this case).
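The loop above leaves fit undefined. As a minimal sketch, assuming a toy objective in place of real training, the following shows what the pieces could look like. Both fit and run_search are hypothetical helpers written for illustration and are not part of comet_ml; run_search only relies on the get_experiments, get_parameter, and log_metric calls shown in the example above.

```python
def fit(x):
    """Hypothetical stand-in for model training: returns a toy loss.

    The loss is smallest at x == 3. A real fit() would train a model
    with hyperparameter x and return its validation loss.
    """
    return (x - 3) ** 2


def run_search(opt):
    """Run the search loop from the example above on an optimizer object.

    `opt` is expected to behave like the Optimizer created earlier:
    get_experiments() yields objects that provide get_parameter()
    and log_metric().
    """
    losses = []
    for experiment in opt.get_experiments():
        # Train with the suggested value, then report the metric to optimize.
        loss = fit(experiment.get_parameter("x"))
        experiment.log_metric("loss", loss)
        losses.append(loss)
    return losses
```

With a real Optimizer, run_search(opt) would log one "loss" value per suggested experiment and return the list of losses it saw.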

See the Optimizer class for more details on creating an optimizer.

Optimizer configuration

You configure the Optimizer through a dictionary, specified either in code or in a config file. The dictionary is a JSON structure similar to the following:

{"algorithm": "bayes",
 "spec": {
    "maxCombo": 0,
    "objective": "minimize",
    "metric": "loss",
    "minSampleSize": 100,
    "retryLimit": 20,
    "retryAssignLimit": 0
 },
 "parameters": {
     "hidden-layer-size": {"type": "integer", "min": 5, "max": 100},
     "hidden2-layer-size": {"type": "discrete", "values": [16, 32, 64]}
 },
 "name": "My Bayesian Search",
 "trials": 1
}

As shown, the Optimizer configuration dictionary has five sections:

  • algorithm: String, indicating the search algorithm to use.
  • spec: Dictionary, defining the algorithm-specific specifications.
  • parameters: Dictionary, defining the parameter distribution space.
  • name: (Optional) String, specifying a personalizable name to associate with this search instance.
  • trials: (Optional) Integer, specifying the number of trials to run per experiment. Defaults to 1.
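As a rough sketch of how the five sections fit together, the helper below checks that the three mandatory sections are present and fills in the documented default for trials. It uses only the standard library; check_config and load_config are hypothetical names for illustration, not comet_ml functions, and Comet performs its own validation.

```python
import json

# The three sections the text above describes as mandatory.
REQUIRED_SECTIONS = ("algorithm", "spec", "parameters")


def check_config(config):
    """Return a copy of config with defaults applied, or raise ValueError."""
    missing = [key for key in REQUIRED_SECTIONS if key not in config]
    if missing:
        raise ValueError("missing mandatory section(s): " + ", ".join(missing))
    checked = dict(config)          # shallow copy, so the input is not modified
    checked.setdefault("trials", 1)  # documented default
    return checked


def load_config(path):
    """Read a JSON config file and run the same checks on it."""
    with open(path) as f:
        return check_config(json.load(f))
```

Running a check like this before creating the optimizer catches a missing section early, whether the dictionary comes from code or from a file.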

Details of the mandatory sections (algorithm, spec, and parameters) follow.




The algorithm section takes one of the following values:

  • random: For the Random sampling algorithm.

  • grid: For the Grid search algorithm. Grid is a sweep algorithm that picks parameter values from discrete, possibly sampled, regions.

  • bayes: For the Bayesian search algorithm, which is based on distributions and balances exploitation and exploration.
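To picture the difference between random sampling and a grid sweep over "discrete, possibly sampled, regions", here is a small standard-library sketch for a single numeric parameter. It illustrates the general idea only and makes no claim about how Comet implements either algorithm; all function names are hypothetical.

```python
import random


def grid_bins(low, high, grid_size):
    """Split [low, high] into grid_size equal-width (start, end) bins."""
    width = (high - low) / grid_size
    return [(low + i * width, low + (i + 1) * width) for i in range(grid_size)]


def grid_candidates(low, high, grid_size, rng):
    """Sweep the bins in order, sampling one value uniformly inside each bin."""
    return [rng.uniform(start, end) for start, end in grid_bins(low, high, grid_size)]


def random_candidates(low, high, count, rng):
    """Draw count values uniformly from the whole range, with no bin structure."""
    return [rng.uniform(low, high) for _ in range(count)]


rng = random.Random(0)  # seeded so repeated runs give the same values
print(grid_candidates(0.0, 1.0, 5, rng))    # one value from each fifth of the range
print(random_candidates(0.0, 1.0, 5, rng))  # values may cluster anywhere
```

The grid version guarantees coverage of every region, while the random version can leave regions unvisited when the number of draws is small.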


The spec section holds algorithm-specific options. The following table describes each option and the algorithms it applies to.

| Option | Description | Relevant algorithm |
| --- | --- | --- |
| maxCombo | Integer. The limit of parameter combinations to try (default is 0, which means 10 times the number of hyperparameters). | random, grid, bayes |
| metric | String. The name of the metric that you are logging and want to minimize or maximize (default is loss). | random, grid, bayes |
| gridSize | Integer. When creating a grid, the number of bins per parameter (default is 10). | random, grid |
| minSampleSize | Integer. The number of samples to help find appropriate grid ranges (default is 100). | random, grid, bayes |
| retryLimit | Integer. The limit to try creating a unique parameter set before giving up (default is 20). | random, grid, bayes |
| retryAssignLimit | Integer. The limit to re-assign non-completed experiments (default is 0). | random, grid, bayes |
| objective | String. Specify minimize or maximize for the objective metric (default is minimize). | bayes |
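The defaults in the table can be summarized in code. The sketch below merges a partial spec with those defaults and works out the combination limit implied by maxCombo = 0. It is only an illustration of the documented defaults; resolve_spec and effective_max_combo are hypothetical helpers, and Comet applies its own defaults when it receives the configuration.

```python
# Default values as listed in the table above.
SPEC_DEFAULTS = {
    "maxCombo": 0,
    "metric": "loss",
    "gridSize": 10,
    "minSampleSize": 100,
    "retryLimit": 20,
    "retryAssignLimit": 0,
    "objective": "minimize",
}


def resolve_spec(spec):
    """Return a new dict: the table defaults, overridden by the user's spec."""
    return {**SPEC_DEFAULTS, **spec}


def effective_max_combo(spec, parameters):
    """maxCombo of 0 means 10 times the number of hyperparameters."""
    limit = resolve_spec(spec)["maxCombo"]
    return limit if limit > 0 else 10 * len(parameters)


spec = resolve_spec({"metric": "accuracy", "objective": "maximize"})
parameters = {"hidden-layer-size": {}, "batch_size": {}}
print(spec["retryLimit"], effective_max_combo(spec, parameters))
```

For example, a search over two hyperparameters with maxCombo left at 0 is limited to 20 combinations under this reading of the table.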


The parameters section of the Optimizer configuration is a dictionary containing all the hyperparameters to be optimized.

The format of each parameter was inspired by Google's Vizier and is exemplified by its open-source version, Advisor.

The following example shows the configuration of the hyperparameters hidden-layer-size, momentum, and batch_size:

"parameters": {
     "hidden-layer-size": {"type": "integer", "scaling_type": "uniform", "min": 5, "max": 100},
     "momentum": {"type": "float", "scaling_type": "normal", "mu": 10, "sigma": 5},
     "batch_size": {"type": "discrete", "values": [16, 32, 64]}
}

For each hyperparameter, set the following keys:

  • type (mandatory). Specify one of the following options:
    • integer
    • float
    • double
    • discrete: The values are given as an explicit list, such as [16, 32, 64].
    • categorical: The values must be strings.

Unless type is categorical or discrete, you can also specify how values are distributed:

  • scaling_type (optional and not available when type is categorical or discrete). Specify one of the following options:
    • linear (default)
    • uniform
    • normal
    • loguniform
    • lognormal

Depending on the type and scaling_type used, you must also specify the applicable keys from the following list:

  • values: Only when the type is categorical or discrete.
  • min: Only when the scaling is one of [linear, uniform, loguniform, lognormal]
  • max: Only when the scaling is one of [linear, uniform, loguniform, lognormal]
  • mu: Only when the scaling is one of [normal, lognormal]
  • sigma: Only when the scaling is one of [normal, lognormal]
  • grid_size: Only when algorithm is grid.

Algorithms that sample randomly, such as bayes and random, treat each parameter as a distribution. Other algorithms, such as grid, need a resolution that tells them how to divide the parameter space into discrete bins. For those, an additional entry named gridSize can be set for each parameter.
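To make the relationship between type, scaling_type, and the min/max/mu/sigma keys concrete, the sketch below draws one value from a parameter definition using Python's random module. It shows what each distribution means, not how Comet samples; sample_parameter is a hypothetical helper, and the min/max bounds listed for lognormal are not modeled here.

```python
import math
import random


def sample_parameter(spec, rng):
    """Draw one value for a parameter definition like those shown above."""
    ptype = spec["type"]
    if ptype in ("categorical", "discrete"):
        # These types carry an explicit list of values and no scaling_type.
        return rng.choice(spec["values"])

    scaling = spec.get("scaling_type", "linear")  # linear is the documented default
    if scaling in ("linear", "uniform"):
        if ptype == "integer":
            return rng.randint(spec["min"], spec["max"])  # inclusive bounds
        return rng.uniform(spec["min"], spec["max"])
    if scaling == "loguniform":
        # Uniform in log space, so each order of magnitude is equally likely.
        value = math.exp(rng.uniform(math.log(spec["min"]), math.log(spec["max"])))
    elif scaling == "normal":
        value = rng.gauss(spec["mu"], spec["sigma"])
    elif scaling == "lognormal":
        value = rng.lognormvariate(spec["mu"], spec["sigma"])
    else:
        raise ValueError("unknown scaling_type: " + repr(scaling))
    return round(value) if ptype == "integer" else value


rng = random.Random(0)
print(sample_parameter({"type": "discrete", "values": [16, 32, 64]}, rng))
print(sample_parameter({"type": "float", "scaling_type": "normal", "mu": 10, "sigma": 5}, rng))
```

The branches mirror the list above: values for categorical and discrete, min and max for the bounded scalings, and mu and sigma for the normal family.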


With the bayes algorithm, the "integer" type with "linear" scaling_type indicates an independent distribution. This is useful for integer values that have no relationship with one another, such as seed values. If the ordering of your values is meaningful (for example, 2 is closer to 1 than it is to 6), use the "uniform" scaling_type instead.
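As a short illustration of that distinction, the fragment below declares two integer hyperparameters side by side. The parameter names seed and num_layers are hypothetical examples chosen for this sketch; only the type and scaling_type values come from the text above.

```python
parameters = {
    # Seeds are labels, not quantities: seed 2 is no "closer" to 1 than to 6,
    # so the values are treated as independent of one another.
    "seed": {"type": "integer", "scaling_type": "linear", "min": 1, "max": 6},
    # Layer counts are ordered quantities: 2 layers is closer to 1 than to 6,
    # so the search can make use of that ordering.
    "num_layers": {"type": "integer", "scaling_type": "uniform", "min": 1, "max": 6},
}
```

Both entries cover the same range of integers; only the assumed relationship between neighboring values differs.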

End-to-end example

This Colab Notebook is an end-to-end program using Keras with the Comet Optimizer.


Comet optimize

comet is a command-line utility that is installed with comet_ml, and optimize is one of its commands. The format is:

$ comet optimize [options] [PYTHON_SCRIPT] OPTIMIZER

For more information on comet optimize, see Comet Command-Line Utilities.


May. 24, 2022