Tune hyperparameters with Comet¶
In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm.
First, what is the difference between parameters and hyperparameters?
Parameters: A learning algorithm learns or estimates model parameters for the given data set, then continues to update these values as it continues to learn. Once learning is complete, these parameters become part of the model. For example, each weight and bias in a neural network is a parameter.
Hyperparameters: These, on the other hand, are parameters whose values control the learning process. They must be set before training begins, so they cannot be estimated from the data. Hyperparameters govern how the model parameters are estimated - for example, the number of hidden layers in a deep neural network. Different hyperparameter values produce different model parameter values for a given data set.
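The distinction can be sketched in a few lines of plain Python. This is a toy illustration, not Comet code: the dictionary names and the tiny gradient-descent loop are hypothetical stand-ins for a real model.

```python
# Hyperparameters are fixed before training starts.
hyperparameters = {"hidden_layers": 2, "learning_rate": 0.01}

# Parameters (weights and biases) start arbitrary and are
# updated by the learning algorithm as it sees the data.
parameters = {"w": 0.0, "b": 0.0}
for x, y in [(1.0, 2.0), (2.0, 4.0)]:  # toy data following y = 2x
    pred = parameters["w"] * x + parameters["b"]
    grad = 2 * (pred - y)
    parameters["w"] -= hyperparameters["learning_rate"] * grad * x
    parameters["b"] -= hyperparameters["learning_rate"] * grad
```

After the loop, `parameters` has changed (it was learned from the data) while `hyperparameters` is exactly as you set it.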
Why optimize hyperparameters?¶
In many cases, the performance of an algorithm on a given learning task depends on its hyperparameter settings. Hyperparameter optimization is the task of finding the set of hyperparameters that yields an optimal learning algorithm, as measured by its performance on out-of-sample data.
That combination of hyperparameters maximizes the model's performance, minimizing a predefined loss function to produce better results with fewer errors. The learning algorithm optimizes the loss based on the input data and tries to find an optimal solution within the given setting. Comet's Optimizer is designed to dynamically find the set of hyperparameter values that minimizes or maximizes a particular objective metric.
Tuning works by running multiple trials in a single training process. Each trial is a complete execution of your training application (in Comet, an execution of an Experiment) with values for the selected hyperparameters set within the limits you specify. When complete, this process produces the set of hyperparameter values best suited for the model to deliver optimal results.
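The trial process described above can be sketched in plain Python. `train_and_evaluate` is a hypothetical stand-in for one complete training run; the candidate values and the loss formula are made up for illustration.

```python
def train_and_evaluate(hidden_layers):
    # Hypothetical stand-in for one complete training run (a "trial");
    # returns the validation loss for the given hyperparameter value.
    return abs(hidden_layers - 3) * 0.1 + 0.2

# Run one trial per candidate value, within the limits you specify.
candidates = [1, 2, 3, 4, 5]
results = {h: train_and_evaluate(h) for h in candidates}

# The process ends with the value that delivered the best result.
best = min(results, key=results.get)
print(best)  # -> 3, the candidate with the lowest validation loss
```

A real tuner replaces the fixed candidate list with a search strategy (random, grid, or Bayes), which is what the rest of this page covers.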
Methods of tuning hyperparameters¶
You can tune hyperparameters manually or automatically.
When tuning hyperparameters manually, you typically start with the default recommended values or rules of thumb, then search through a range of values using trial and error. But manual tuning is a tedious approach: there can be many trials, and keeping track of them can prove costly and time-consuming. It also isn't practical when there are many hyperparameters with wide ranges. Learn more in How to Manually Optimize Machine Learning Model Hyperparameters.
Manual tuning of the hyperparameters can be time-consuming and, in most cases, might require domain knowledge to find the best hyperparameters to pass to the model. And, in the end, it might not lead to a better-performing model.
The rest of this page examines automated hyperparameter tuning methods that use an algorithm to search for the optimal values. In particular, it shows you how to use the Comet Optimizer to dynamically find the best set of hyperparameter values that will minimize or maximize a particular objective metric.
Comet's Optimizer has many benefits over traditional hyperparameter optimizer search services because of its integration with Comet's Experiments. In addition, Comet's hyperparameter search has a powerful architecture for customizing your search or sweep. You can easily switch search algorithms, or perform phased searches.
The Optimizer can run in serial, in parallel, or in a combination of the two.
Want to use your own, or any third-party optimizer with Comet? No problem. Just make sure that all of the hyperparameters are logged. For more information, see use your own optimizers.
Automate tuning with Comet's Optimizer¶
Comet's Optimizer supports the three most common automated methods for tuning hyperparameters:
Random search: A random search method finds the best set of hyperparameters out of a large space of candidates. It samples random combinations within a bounded domain of hyperparameters, records the model performance for each, and keeps the best set. Usually, random search is performed first, and then the hyperparameters are narrowed down further using grid search. Random search is best suited to a large search domain with several hyperparameters. The upside is that it typically requires less time than grid search to return a comparable result, and it ensures you don't end up with a model biased toward value sets arbitrarily chosen by users. The downside is that the result may not be the best possible hyperparameter combination.
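A minimal random-search sketch in plain Python, independent of Comet: `validation_loss` is a hypothetical objective standing in for a real training run, and the parameter names and bounds are made up for illustration.

```python
import random

def validation_loss(lr, dropout):
    # Hypothetical objective: pretend the model performs best
    # near lr=0.01 and dropout=0.3. Lower is better.
    return (lr - 0.01) ** 2 + (dropout - 0.3) ** 2

random.seed(0)  # only to make this sketch reproducible

# Sample hyperparameter sets at random from a bounded domain.
trials = [
    {"lr": random.uniform(1e-4, 0.1), "dropout": random.uniform(0.0, 0.5)}
    for _ in range(50)
]

# Keep the sampled combination with the best recorded performance.
best = min(trials, key=lambda t: validation_loss(**t))
```

Note that each trial is drawn independently: no trial uses the results of earlier trials, which is why random search parallelizes so easily.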
Comet's Random algorithm is slightly more flexible than the Grid algorithm, in that it continues to sample from the set of possible parameter values until you stop the search or the "max combinations" limit is reached. The Random algorithm, like the Grid algorithm, does not use past experiment metrics to inform future experiments.
Grid search: Defines a search space as a grid of hyperparameter values and evaluates every position in the grid. Grid is useful when you have already narrowed down your hyperparameters to a set of a few values, and you now want to find the best set. Grid is mostly a time-consuming method. Comet's Grid algorithm is slightly more flexible than many, as each time you run it, you will sample from the set of possible grids defined by the parameter space distribution. The Grid algorithm does not use past experiments to inform future experiments; it merely collects the objective metric for you to explore.
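Grid search can also be sketched in a few lines of plain Python. As before, `validation_loss` is a hypothetical objective and the grid values are placeholders.

```python
from itertools import product

def validation_loss(lr, batch_size):
    # Hypothetical objective standing in for a training run; lower is better.
    return (lr - 0.01) ** 2 + (batch_size - 64) ** 2 / 1e6

# Define the search space as a grid of hyperparameter values ...
grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}

# ... and evaluate every position in the grid (3 x 3 = 9 trials).
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
best = min(combos, key=lambda c: validation_loss(**c))
```

The cost grows multiplicatively with each added hyperparameter (here 3 × 3 = 9 trials; a third axis with 3 values would make 27), which is why grid search is best saved for a space you have already narrowed down.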
Use the Comet Optimizer with Grid search.
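As a sketch, a grid search configuration for Comet's Optimizer is a dictionary with `algorithm`, `parameters`, and `spec` keys. The hyperparameter names and values below are placeholders, not part of any required schema.

```python
# Sketch of a grid-search config in the shape Comet's Optimizer accepts.
# "hidden_size" and "lr" are placeholder hyperparameter names.
opt_config = {
    "algorithm": "grid",
    "parameters": {
        "hidden_size": {"type": "discrete", "values": [64, 128, 256]},
        "lr": {"type": "discrete", "values": [0.001, 0.01, 0.1]},
    },
    "spec": {
        "metric": "loss",        # objective metric logged by each Experiment
        "objective": "minimize",
    },
}
# With comet_ml installed and configured, this dict would be passed to
# comet_ml.Optimizer(opt_config) to drive the sweep.
```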
How do Grid and Random compare? Grid search is better for spot-checking combinations that are known to perform well in general. Random search is great for discovery: it can surface hyperparameter combinations you may not have guessed intuitively. Both can take a long time to run, because they waste most of it evaluating regions of the search space that add no value.
Bayesian Optimization and Evolutionary Optimization: Another important hyperparameter tuning method is Bayesian optimization. Unlike grid search or random search, this advanced, automated technique uses probabilities to find the best value for a hyperparameter. Several libraries implement Bayesian searches, such as Optuna, Hyperopt, and Scikit-Optimize (skopt).
The Bayes algorithm may be the best choice for most of your Optimizer uses. It provides a well-tested algorithm that balances exploring unknown space, with exploiting the best known so far.
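A Bayes-algorithm search is configured the same way, but with a continuous parameter distribution to sample from. This is a sketch: `"x"` is a placeholder hyperparameter name, and the loop in the comment shows the usual driver pattern under the assumption that `comet_ml` is installed and configured with an API key.

```python
# Sketch of a Bayes-algorithm config in the shape Comet's Optimizer accepts.
opt_config = {
    "algorithm": "bayes",
    "parameters": {
        # Bayes samples continuously from this float distribution.
        "x": {"type": "float", "min": 0.0, "max": 10.0},
    },
    "spec": {
        "metric": "loss",        # the objective metric to optimize
        "objective": "minimize",
        "maxCombo": 20,          # stop after 20 experiments
    },
}

# With comet_ml installed and configured, the config drives the sweep:
#
#   from comet_ml import Optimizer
#   opt = Optimizer(opt_config)
#   for experiment in opt.get_experiments():
#       x = experiment.get_parameter("x")
#       loss = (x - 3.0) ** 2            # your training/evaluation here
#       experiment.log_metric("loss", loss)
```

Unlike Grid and Random, the Bayes algorithm uses the metrics logged by earlier experiments to choose which parameter values to try next.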
The Comet Optimizer never assigns the same set of parameters twice, unless you run multiple trials of an experiment or reassign a non-completed experiment (see below). Depending on the algorithm you choose, the number of possible experiments may be finite or infinite. For example, the "bayes" algorithm always samples from continuous parameter distributions, while the "grid" algorithm always breaks distributions into grids. However, for some parameter types (including "categorical" and "discrete"), there are only a finite number of options. To see the computed value of the maximum number of possible experiments for an algorithm, see
You can also use your own optimizer within Comet. See how here.
Try it out!¶
Try out the Comet Optimizer in this Colab Notebook.