In my previous post, I talked about the exploration/exploitation dilemma in optimization, and then defined an algorithm that seeks to resolve it. The algorithm described in that post was a variation of neighborhood search and hill climbing methods. The amount of time it spends looking for improvements in the “neighborhood” of a given solution is a direct function of the magnitude of improvement that a current best solution demonstrates over the previous best.
Large improvements cause the algorithm to remain in the current best solution’s neighborhood for significant lengths of time. While this is a good thing if the neighborhood is rich in potential improvements, it presents a problem if the neighborhood turns out to be sparse in improving solutions.
To counter this, I outline another, far more exploratory, numerical optimization algorithm. This algorithm symetrically interleaves its exploration and exploitation steps, dividing its time equally betweem searching the neighborhood of the current global best, and searching the global solution space in a purely random pattern.
After initially evaluating a solution at random and setting its score as the global best, the algorithm evaluates another solution in the global best’s neighborhood. If this solution’s score is better than that of the global best, it becomes the new global best, otherwise the optimizer tries a protential solution chosen at random from the entire search space. If this solution is better than global best, it becomes the global best and the search resumes from its neighborhood, otherwise the search returns to the current global best’s neighborhood to try another solution at random.
In this way, on every other step, the algorithm bounces between searching the whole space randomly and a much more focussed search in the global best’s neighborhood.
The algorithm was designed to optimize machine learning model parameters, but it can be applied to any numerical optimization problem.
Below is an outline of the algorithm, and below that is a Python implementation of the algorithm tuning the parameters of a LightGBM prediction model.
Interleaved Neighborhood Algorithm
evaluate random solution rs set var global_best to rs score set var best_parameters to rs parameters set var horizon to "local" while true do if horizon = "local" neighborhood solution ns = gaussian distr(mean = best_parameters, deviation = random(0.1...1.0)) if ns better than global_best set var global_best to ns score set var best_parameters to ns parameters continue else set var horizon to "global" coninue else evaluate global random solution gs if gs score better than global_best set var global_best to gs score set var best_parameters to gs parameters set var horizon to "local" continue else: set var horizon to "local" continue
Optimizing A LightGBM Machine Learning Model
The Python code below uses the Interleaved Neighborhood Algorithm to search through the space of parameters of a LightGBM machine learning model. The model is trained on, and used to make predictions about, the Kaggle Iowa housing dataset.
LightGBM is a gradient boosting framework with a large number of parameters, each of which can take a huge range of values. This makes parameter optimization a nontrivial problem. For simplicity, the demonstration below uses only a few of the dozens of model parameters available for tuning.
2020-06-30 19:37 +0000