Efficient BERT at Ray Summit

Nick Payton

Transformers have unlocked significant potential for natural language processing (NLP). No architecture has demonstrated that potential more than BERT, which set new state-of-the-art results on benchmarks for question answering, general language understanding, and commonsense inference. But BERT is also very large. In practice, using BERT in an applied setting requires the capacity to tune at scale, track and visualize runs to understand model behavior, and weigh tradeoffs between model size and accuracy.

In this talk at Ray Summit, SigOpt Machine Learning Engineer Meghana Ravikumar applies Experiment Management, Metric Management, and Multimetric Bayesian Hyperparameter Optimization with Ray to weigh practical tradeoffs for BERT, with implications for applied machine learning settings. Through these experiments, Meghana develops Efficient BERT: configurations of BERT that are much smaller than a relevant benchmark model while aiming to retain similar accuracy.
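To make the size-versus-accuracy tradeoff concrete, here is a minimal sketch of how a set of tuning runs could be reduced to a Pareto frontier: the runs for which no other run is both smaller and more accurate. This is plain Python, not SigOpt's or Ray's API, and it is not the method used in the talk. The `Run` type, the run names, and every number below are hypothetical and exist only for illustration.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    """One tuning run (hypothetical structure for this sketch)."""
    name: str
    num_parameters: float  # model size; lower is better
    accuracy: float        # validation accuracy; higher is better


def dominates(a: Run, b: Run) -> bool:
    """True if `a` is no worse than `b` on both metrics and strictly better on one."""
    no_worse = a.num_parameters <= b.num_parameters and a.accuracy >= b.accuracy
    strictly_better = a.num_parameters < b.num_parameters or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_frontier(runs: List[Run]) -> List[Run]:
    """Keep only the runs that no other run dominates, sorted by model size."""
    frontier = [r for r in runs if not any(dominates(other, r) for other in runs)]
    return sorted(frontier, key=lambda r: r.num_parameters)


# Hypothetical numbers for illustration only; these are NOT results from the talk.
runs = [
    Run("baseline", 66e6, 0.67),
    Run("small_a", 40e6, 0.66),
    Run("small_b", 45e6, 0.64),  # larger and less accurate than small_a
    Run("tiny", 20e6, 0.55),
]

frontier = pareto_frontier(runs)
for run in frontier:
    print(f"{run.name}: {run.num_parameters / 1e6:.0f}M parameters, accuracy {run.accuracy:.2f}")
```

In this toy data, `small_b` drops out because `small_a` is both smaller and more accurate. The remaining runs are the candidates a practitioner would actually choose between, depending on how much accuracy they are willing to trade for a smaller model.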

Watch the recording of Meghana’s talk, Efficient BERT, at Ray Summit

Are you interested in trying SigOpt? Take advantage of a limited-time offer to test our product for free, or check out our docs to learn how our API and dashboard work.

Are you interested in reproducing Meghana’s work? Here are a few resources to get you started:

Are you interested in learning more? Here are a variety of resources on Efficient BERT:

Nick Payton, Head of Marketing & Partnerships
