Each year at the ICML conference, a Test of Time award is presented to the authors of a paper that, over the past 10 years, has had a significant impact on the ICML community. Previous winners include Random Features for Large-Scale Kernel Machines (2007) and Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions (2003) (the award was referred to as the Classic Paper Prize in 2013).
This year, the article Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design will be awarded the ICML 2020 Test of Time award. We at SigOpt would like to congratulate Niranjan Srinivas, Andreas Krause, Sham Kakade, and Matthias Seeger for the recognition of their outstanding work.
Prior to this publication, there was limited theoretical analysis of Bayesian optimization; no precise convergence rates were known to the optimization community. In 2010, even the term Bayesian optimization was still uncommon, with terms such as sequential kriging optimization or Gaussian process optimization more prevalent. Through a clever reformulation of Bayesian optimization as a multi-armed bandit problem, the authors were able to derive convergence rates for the popular upper confidence bound (UCB) acquisition function. These rates were important both for their novelty and for their breadth: the authors provided convergence rates for different classes of objective functions, and further subdivided those rates by kernel class.
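To make the UCB rule the paper analyzes concrete, here is a minimal GP-UCB sketch in Python. This is not the authors' code: the one-dimensional grid domain, the squared exponential kernel with a hand-picked lengthscale, and the toy objective are all illustrative assumptions; only the β_t schedule follows the paper's finite-domain Theorem 1.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    # Squared exponential (RBF) kernel between two 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Standard GP posterior mean and standard deviation at test points Xs,
    # computed via a Cholesky factorization of the training kernel matrix.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(Xs, Xs)) - np.sum(v ** 2, axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def f(x):
    # Toy objective (hypothetical), maximized over [0, 1].
    return np.sin(3 * x) + 0.5 * np.cos(7 * x)

grid = np.linspace(0.0, 1.0, 200)  # finite decision set D
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 2)       # two random initial evaluations
y = f(X)
delta = 0.1
for t in range(1, 21):
    # beta_t = 2 log(|D| t^2 pi^2 / (6 delta)) for a finite domain (Theorem 1).
    beta = 2 * np.log(len(grid) * t**2 * np.pi**2 / (6 * delta))
    mu, sigma = gp_posterior(X, y, grid)
    # UCB rule: pick the point maximizing mean plus scaled uncertainty.
    x_next = grid[np.argmax(mu + np.sqrt(beta) * sigma)]
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

best = X[np.argmax(y)]
```

The large β_t values from the theorem make this schedule exploration-heavy in practice; practitioners often use a smaller constant, trading the theoretical guarantee for faster exploitation.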
Like all great theoretical results, the proof's elegant yet straightforward presentation provided greater insight into the problem itself. Srinivas et al. characterized convergence rates exclusively in terms of a single parameter gamma, which measures the maximum information that a sequence of observations can reveal about the objective. They then showed that for different classes of kernels, gamma itself could be bounded via the spectral decay of the kernel—itself a function of kernel smoothness—for many existing kernels (the paper considered the linear, squared exponential, and Matern kernels). This modularity in gamma lent the proof a great degree of flexibility, allowing others to extend it simply by plugging in a new gamma for a new kernel. Furthermore, their proof techniques drew from a wide range of mathematics, from bandit problems to experimental design, which sparked new Bayesian optimization research directions.
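For readers who want the shape of the result, the main bound and the kernel-specific bounds on gamma can be summarized as follows (stated informally; see the paper for the precise constants and conditions):

```latex
% GP-UCB selection rule: trade off posterior mean against uncertainty
x_t = \arg\max_{x \in D} \; \mu_{t-1}(x) + \sqrt{\beta_t}\,\sigma_{t-1}(x)

% Cumulative regret bound, holding with high probability, where
% \gamma_T is the maximum information gain after T rounds:
R_T = \mathcal{O}^{*}\!\left(\sqrt{T\,\beta_T\,\gamma_T}\right)

% Kernel-specific bounds on \gamma_T in dimension d:
\gamma_T = \mathcal{O}(d \log T)
  \quad \text{(linear kernel)}
\gamma_T = \mathcal{O}\big((\log T)^{d+1}\big)
  \quad \text{(squared exponential kernel)}
\gamma_T = \mathcal{O}\big(T^{\frac{d(d+1)}{2\nu + d(d+1)}} \log T\big)
  \quad \text{(Matern kernel, } \nu > 1\text{)}
```

Sublinear growth of \gamma_T is exactly what makes the average regret R_T/T vanish, which is the "no regret" property in the paper's title.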
Fun fact: A preliminary version of the paper was presented as a poster at the 2009 NeurIPS workshop on Active Learning and Experimental Design, the first ML workshop to have Bayesian optimization as a keyword. I (Ruben) was one of the organizers, and we considered promoting the presentation to a contributed talk. In the end, the schedule was too tight. Furthermore, we had already invited two of the authors (Andreas Krause and Matthias Seeger) to give talks, even before the workshop was accepted.
UCB is still used in Bayesian optimization today, and its continued popularity is largely due to this outstanding work, which has over 1,000 citations at the time of writing. For more information, we refer you to the award and the paper itself. Congratulations again to the authors for this recognition of their outstanding contribution to the community.