Overview
In this blog post we present a novel approach to fine-tuning natural language processing (NLP) models that ultimately leads to chatbots that respond up to 50% faster!
We also introduce the SigOpt Intelligent Experimentation (IE) Platform and walk through some of the examples we used to familiarize ourselves with it. By sharing these learnings, we hope to make it easier for you to get started with the SigOpt IE Platform, and to give you a framework for building better, faster contact center bots through intelligent experimentation.
With the improvements from SigOpt, the deployed models saw up to a 54.9% reduction in inference time: customers spend less time waiting for a response, and each interaction takes less time to resolve, so more customers can be helped in the same period. SigOpt significantly increased the value of our deployed models and made better use of our developers’ time when building them.
Contact Center Importance
Contact Center Transformation
The global contact center transformation market was valued at USD 23.83 billion in 2019 and is expected to reach USD 88.52 billion by 2027, registering a CAGR of 17.91% during 2020-2027.
In the post-COVID era, customers expect more flexibility: choosing their own communication channel and time of day, greater personalization, and faster issue resolution. A successful contact center therefore needs to put customer experience first. That’s why, in this post, we will show how SigOpt helped us build contact center bots that are up to 50% faster at the task of call summarization.
Who is MindTree?
Why MindTree Used SigOpt
A key component of developing artificial intelligence (AI) applications at scale is making sure that the learnings from one project transfer to the next. The SigOpt Intelligent Experimentation (IE) Platform gives us a framework for doing exactly that. On top of this, we identified three key capabilities that drive additional value for our projects:
- Sample Efficiency – A key part of building machine learning (ML) models is optimizing for performance. With grid search and other less sample-efficient optimization methods, the challenge often becomes narrowing the search space enough that it does not demand near-unlimited compute. The SigOpt IE Platform instead provides Bayesian and other global optimization algorithms that are specifically designed to be as sample efficient as possible, reducing the computational resources normally required to optimize a model.
- Advanced Experimentation Features – The SigOpt IE Platform offers a wide variety of advanced experimentation features that help modelers better align their modeling objectives with their business objectives. One of these features is Multimetric Optimization, which allows the modeler to optimize multiple competing metrics at the same time.
- Ease of Use – Finally, the SigOpt IE Platform offers an easy-to-use client library that lets modelers bolt SigOpt onto their existing workflows, plus an intuitive web dashboard for storing artifacts, visualizing results, and collaborating with other members of the modeling team as well as other key stakeholders.
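As a rough sketch of what that bolt-on looks like, the loop below tunes a toy one-dimensional objective in the style of SigOpt’s Core API (suggest, evaluate, report an observation). The token, bounds, budget, and experiment name are placeholders for illustration, not values from our project.

```python
import math

def objective(x):
    # Toy 1-D function standing in for a real training-and-evaluation run.
    return math.sin(x) + math.sin(10.0 * x / 3.0)

def tune_with_sigopt(api_token):
    # Requires a SigOpt account and API token; sketched, not executed here.
    from sigopt import Connection
    conn = Connection(client_token=api_token)
    experiment = conn.experiments().create(
        name="Toy objective",
        parameters=[dict(name="x", type="double",
                         bounds=dict(min=-3.0, max=3.0))],
        metrics=[dict(name="value", objective="maximize")],
        observation_budget=20,
    )
    for _ in range(experiment.observation_budget):
        # Ask SigOpt for the next point to try...
        suggestion = conn.experiments(experiment.id).suggestions().create()
        value = objective(suggestion.assignments["x"])
        # ...and report the observed metric back.
        conn.experiments(experiment.id).observations().create(
            suggestion=suggestion.id,
            values=[dict(name="value", value=value)],
        )
    return conn.experiments(experiment.id).best_assignments().fetch()
```

In a real project, `objective` is replaced by the full train-and-evaluate cycle for the model being tuned.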
We hope that by highlighting these areas we can bring forward some of our learnings to the AI community and help to make the process of developing AI models more scalable.
Results
Comparing SigOpt to other optimization methods
To familiarize ourselves with the platform, we started by comparing SigOpt’s optimization algorithms against other standard approaches to hyperparameter optimization: grid search, random search, and the SciPy optimization library, all evaluated on the eggholder function, a classic benchmark in the optimization literature. The comparison gave a quick glance at how SigOpt found a better value in fewer iterations than the other approaches.
| Method | Number of evaluations | Max value achieved (eggholder function) |
| --- | --- | --- |
| Bayesian optimization by SigOpt tool | 40 | 158.55 |
| Grid search | 80 | 138.22 |
| Random search | 70 | 145.95 |
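For reference, here is a self-contained sketch of the eggholder function and a random-search baseline of the kind used in the comparison. The domain, seed, and evaluation budget are our own illustrative choices, not the exact settings behind the table above.

```python
import math
import random

def eggholder(x, y):
    # Classic 2-D benchmark; its global minimum is about -959.64
    # at (512, 404.2319) on the usual [-512, 512]^2 domain.
    return (-(y + 47.0) * math.sin(math.sqrt(abs(x / 2.0 + y + 47.0)))
            - x * math.sin(math.sqrt(abs(x - (y + 47.0)))))

def random_search(n_evals, seed=0):
    # Sample uniformly from the domain and keep the best
    # (here: maximum) value seen, as in the comparison above.
    rng = random.Random(seed)
    best = -float("inf")
    for _ in range(n_evals):
        x = rng.uniform(-512.0, 512.0)
        y = rng.uniform(-512.0, 512.0)
        best = max(best, eggholder(x, y))
    return best
```

Because the eggholder surface is highly multimodal, naive samplers need many evaluations to land near a good value, which is what makes it a useful stress test for sample-efficient optimizers.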
Using SigOpt for image classification
Another example we used to familiarize ourselves with the platform was image classification on the MNIST dataset, using a multilayer perceptron (MLP) model.
A screenshot of the summary page from the web dashboard is shown below.
Figure 1: summary page on the SigOpt web dashboard for MNIST image classification.
Here we saw that after just 5 iterations we reached an accuracy of 96.5%, which we believe could be improved with more iterations.
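To make the setup concrete, here is a minimal sketch of the kind of train-and-score step that each optimizer iteration evaluates. We use scikit-learn’s small digits dataset as a lightweight stand-in for MNIST, and the hyperparameter names and values are illustrative placeholders for whatever the optimizer suggests, not our actual search space.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical assignments an optimizer such as SigOpt might suggest.
assignments = {"hidden_units": 64, "learning_rate_init": 1e-3}

# Small 8x8-digit dataset as a quick stand-in for MNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(assignments["hidden_units"],),
    learning_rate_init=assignments["learning_rate_init"],
    max_iter=300,
    random_state=0,
)
clf.fit(X_train, y_train)

# This held-out accuracy is the metric reported back to the optimizer.
accuracy = clf.score(X_test, y_test)
```

Each optimizer iteration runs a step like this end to end, which is why sample efficiency directly translates into saved training time.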
Overall, running these two experiments gave us the confidence to continue with our work.
If you prefer not to produce your own examples, the SigOpt dashboard provides a large set of easy-to-access examples.
Choosing the Right Framework
First, we tried optimizing a Pegasus model by minimizing evaluation loss, tuning just two hyperparameters, batch size and learning rate, while fine-tuning an already pre-trained model; the goal at this stage was mainly to understand how SigOpt works with transformer models. The best seen trace over just 8 iterations is shown below. It could be optimized further with more parameters and more metrics.
Figure 2: summary page on the SigOpt web dashboard for Pegasus model.
Seeing this trace gave us the confidence to rule out the Pegasus model as a candidate, freeing up time to focus solely on fine-tuning BART.
Reducing Inference Time
Defining, selecting, and optimizing with the right set of metrics is critical to every modeling process, but these steps are often hard to execute well. Building a useful model requires that the modeler select the right set of metrics, then maximize or minimize them during the model training and tuning process.
In many applications, however, it may be necessary to optimize two competing metrics whose best results occur at different parameter settings. Such a situation is referred to as a multimetric (or multicriteria/multiobjective) experiment. We ran a SigOpt Multimetric experiment to explore the best achievable values for both of our metrics, evaluation loss and ROUGE score.
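A multimetric experiment definition in the style of SigOpt’s Core API might look like the sketch below. The parameter names, bounds, and budget are hypothetical illustrations, not our exact configuration; the key point is that two metrics with opposing objectives are declared together, and the optimizer explores the trade-off between them.

```python
# Hypothetical multimetric experiment definition (illustrative values).
multimetric_experiment = dict(
    name="BART summarization - multimetric",
    parameters=[
        dict(name="learning_rate", type="double",
             bounds=dict(min=1e-5, max=5e-4), transformation="log"),
        dict(name="batch_size", type="int", bounds=dict(min=2, max=16)),
    ],
    metrics=[
        # Two competing objectives optimized simultaneously.
        dict(name="rouge", objective="maximize"),
        dict(name="eval_loss", objective="minimize"),
    ],
    observation_budget=60,
)
```

Rather than a single best point, a multimetric experiment yields a frontier of trade-offs, from which the modeler picks the configuration that best matches the business objective.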
The final inference times, compared against not using SigOpt, are shown below:
Figure 3: Inference Improvements using SigOpt
In total, the deployed models saw up to a 54.9% improvement in inference time with SigOpt. Customers wait less for a response, each interaction is resolved faster, and more customers are helped overall. SigOpt significantly increased the value of our deployed models and improved how our developers spend their time when creating them.
Take Action
To try SigOpt today, you can access it for free at sigopt.com/signup.