Model Optimization at the Center of MLOps

Michael McCourt and Barrett Williams
All Model Types, Augmented ML Workflow, Hyperparameter Optimization, Model Type, Modeling Best Practices, Training & Tuning

When it comes to predictive analytics, there are many factors that influence whether your model is performant for the real-world business problem you are trying to address. In order to hit or exceed your business goals, it’s important to define the right metrics, choose the right parameters to test and tune, and then train models that converge and perform according to your business’s needs. At first glance, it can be overwhelming to determine how to navigate all of these options when developing a model. In this post, we summarize a chapter from the Practical MLOps ebook in which Michael McCourt, our head of research, drew on his experience working with AI leaders across a variety of industries to share his expertise.

  1. Be thoughtful about how you define your metrics

Metrics should reflect your business needs and the outcomes you value in your production system, not simply whatever gets your model to converge fastest. First, define metrics that robustly connect your models to tangible business needs. Many businesses have multiple competing priorities, so it’s important to track several metrics, and even pick two of them to evaluate tradeoffs: for example, inference time versus accuracy. You’ll want to consciously apply these metrics in different ways depending on how they relate to your modeling problem; some you’ll simply store for possible later use, while others are best defined as thresholds. You can learn more about the broader concept of metric strategy and metric management here.
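As a concrete illustration, here is a minimal sketch (plain scikit-learn rather than SigOpt’s tooling, with a made-up latency requirement) of tracking two competing metrics, accuracy and per-example inference time, and treating the latter as a threshold rather than an objective:

```python
# Minimal sketch (not SigOpt's API): track two competing metrics --
# validation accuracy and inference latency -- and treat latency as a
# threshold rather than an objective to optimize.
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

start = time.perf_counter()
predictions = model.predict(X_val)
latency_ms = 1000 * (time.perf_counter() - start) / len(X_val)  # per-example latency

metrics = {
    "accuracy": accuracy_score(y_val, predictions),  # metric to maximize
    "latency_ms": latency_ms,                        # metric to keep under a threshold
}

LATENCY_THRESHOLD_MS = 1.0  # hypothetical business requirement
print(metrics, "meets latency threshold:", metrics["latency_ms"] <= LATENCY_THRESHOLD_MS)
```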

  2. Train parameters, tune hyperparameters

Training a machine learning model means iteratively adjusting thousands (or even millions) of weights in a model. These weights are often known as parameters, much like the coefficients that modulate a polynomial. They are the core values that define your model and help it fit your data. (You’ll fit your model on a training dataset, leaving a validation set for selecting hyperparameters and a fully independent test set to predict how your model will behave in production.) But by what process do you arrive at that best-fitting model? The process is controlled by hyperparameters (also, less frequently, known as meta-parameters) such as learning rate, batch size, number of filters, or even number of layers. These hyperparameters modulate the training process itself, or even define the architecture of your model, through categorical or numerical values.

It’s important to distinguish between a model’s weights and its hyperparameters: the weights are fit by iteratively minimizing a loss function, usually with access to gradient information, while hyperparameter tuning operates at a more global scope, is most often gradient-free, and is evaluated on a validation dataset. Keep this validation set distinct from both your training set and your holdout test set; the test set is what lets you evaluate the model in as close to a “real-world” production setting as possible. With just a few hyperparameters, it might be fine to experiment with a grid search, but as your model grows in complexity, you’ll find yourself looking for more efficient strategies, such as Bayesian optimization. You can see how you might execute this workflow in this previous blog post.
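To make this distinction concrete, here is a minimal sketch (using scikit-learn and a tiny hand-rolled grid purely for illustration, not SigOpt’s API): the weights are fit on the training set, hyperparameters are compared on the validation set, and the untouched test set gives the final estimate of production performance.

```python
# Sketch of the split: weights are fit on the training set, hyperparameters
# are selected on the validation set, and the untouched test set estimates
# production performance. The grid of two hyperparameters is illustrative only.
from itertools import product

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

best_score, best_config = -1.0, None
for learning_rate, hidden_units in product([1e-3, 1e-2], [32, 64]):  # hyperparameters
    model = MLPClassifier(hidden_layer_sizes=(hidden_units,),
                          learning_rate_init=learning_rate,
                          max_iter=300, random_state=0)
    model.fit(X_train, y_train)        # gradient-based fit of the weights (parameters)
    score = model.score(X_val, y_val)  # gradient-free comparison of hyperparameters
    if score > best_score:
        best_score, best_config = score, (learning_rate, hidden_units)

print("best hyperparameters:", best_config, "validation accuracy:", best_score)

# Only after hyperparameter selection do we look at the test set.
final = MLPClassifier(hidden_layer_sizes=(best_config[1],),
                      learning_rate_init=best_config[0],
                      max_iter=300, random_state=0).fit(X_train, y_train)
print("held-out test accuracy:", final.score(X_test, y_test))
```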

  3. Be sample-efficient in tuning

Tuning can meaningfully improve the quality of business-critical models, yet many teams and projects never get around to tuning their models. In some scenarios, data scientists assume that the parameters found in the reference research they’re using will suffice to get their model to converge, or they don’t realize they might be leaving performance on the table. There’s also both the fear and the reality that tuning can prove expensive in terms of both time and compute, since each tuning run traditionally involves retraining the model from scratch with an entirely new set of hyperparameters. SigOpt provides several tools that reduce this burden: first, parallelism reduces the wall-clock time it takes to tune your model; second, multitask optimization lets you execute partial training runs that still help you assess the improvement in your model; and lastly, Bayesian optimization proves more efficient than grid search or random search for a large proportion of real-world modeling problems. In short, tuning is both possible and beneficial under budgetary and time constraints.
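For illustration, here is a sketch of sample-efficient tuning using the open-source scikit-optimize package (the hyperparameter ranges and the 20-evaluation budget are arbitrary choices); SigOpt offers Bayesian optimization, along with parallelism and multitask optimization, as a hosted service:

```python
# Sketch of sample-efficient tuning with Bayesian optimization, using
# scikit-optimize for illustration: 20 evaluations instead of an exhaustive
# grid, with each evaluation retraining the model from scratch.
from skopt import gp_minimize
from skopt.space import Integer, Real
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)

def objective(params):
    learning_rate, n_estimators = params
    model = GradientBoostingClassifier(learning_rate=learning_rate,
                                       n_estimators=n_estimators,
                                       random_state=0)
    # Minimize the negative cross-validated accuracy.
    return -cross_val_score(model, X, y, cv=3).mean()

search_space = [Real(1e-3, 1.0, prior="log-uniform"),  # learning_rate
                Integer(50, 300)]                      # n_estimators

result = gp_minimize(objective, search_space, n_calls=20, random_state=0)
print("best hyperparameters:", result.x, "cross-validated accuracy:", -result.fun)
```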

On Tuesday, March 16 at 9AM PT, SigOpt is co-hosting a webinar to discuss how metrics, optimization, and tracking can serve as the cornerstone of a complete MLOps pipeline for your business. If you’d like to join SigOpt’s Research Engineer Dr. Michael McCourt, Kevin Stumpf, CTO of Tecton, and Eero Laaksonen, CEO of Valohai, please sign up for the webinar here. If you’re interested in downloading the MLOps ebook that contains a full chapter on this content, plus much more, you can download it here. If you’d like to sign up for free access to SigOpt, you can do so here, and if you’re an academic researcher or professor, you can find our academic program sign-up page here.

Michael McCourt Research Engineer
Barrett Williams Product Marketing Lead