SigOpt hosted our first user conference, the SigOpt AI & HPC Summit, on Tuesday, November 16, 2021. It was virtual and free to attend, and you can access content from the event at sigopt.com/summit. Above all, we were excited to host this Summit to showcase the great work of some of SigOpt’s customers. Today, we share the most prominent lessons we learned from these customers during the event.
1. The juice is worth the squeeze.
Finding the right combination of hyperparameters can take extensive experimentation and a long time, but the results have been shown to be worth the investment. During the Summit, we heard how Venkatesh from PayPal significantly reduced training time and improved AUC by using extensive hyperparameter optimization (HPO). This resulted in newly optimized models that beat their existing champion models. We also heard how Pablo at Anastasia AI achieved high-quality, efficient forecasting by leveraging extensive HPO. Pablo’s results have the potential to transform the way a variety of industries handle time series forecasting. Finally, we heard from Subutai at Numenta about how he applied Bayesian optimization to optimize novel sparse networks and define a new set of best practices. As a result, his team was able to run sparse networks 100x faster than their dense counterparts. These speakers all reminded us that even though HPO does involve extensive experimentation, the juice is worth the squeeze.
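To make "extensive experimentation" concrete, here is a minimal sketch of a hyperparameter search loop. It uses plain random search, not Bayesian optimization and not SigOpt's algorithm, and the `validation_score` function is a made-up stand-in for training and evaluating a real model. The parameter names and ranges are assumptions chosen for illustration only.

```python
import random


def validation_score(learning_rate, depth):
    """Hypothetical stand-in for a validation metric (higher is better).

    A real workflow would train a model with these hyperparameters and
    return something like AUC. This toy function peaks at 0.0 when
    learning_rate == 0.1 and depth == 6.
    """
    return -100 * (learning_rate - 0.1) ** 2 - 0.01 * (depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Try n_trials random configurations and return the best one found."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            # Sample the learning rate on a log scale between 0.001 and 1.
            "learning_rate": 10 ** rng.uniform(-3, 0),
            # Sample an integer depth between 2 and 12 inclusive.
            "depth": rng.randint(2, 12),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    for n in (10, 200):
        params, score = random_search(n)
        print(n, params, round(score, 4))
```

Because both runs use the same seed, the 200-trial search sees every configuration the 10-trial search saw, so its best score can only match or improve on it. Methods such as Bayesian optimization aim to reach a good configuration in fewer trials by using earlier results to choose the next configuration.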
2. Physical simulations benefit from Bayesian optimization.
It’s not just tech and banking that are reaping the rewards of HPO. Physical processes are also benefiting from AI/ML and Bayesian optimization. Physical process simulations are typically expensive. Unlike PayPal, the teams that run them don’t have 1 trillion transactions per minute captured in a streaming database. Instead, procuring each data point can require running a new time-intensive field experiment. These teams can still benefit from machine learning and, in particular, from Bayesian optimization. During the event, we heard how Paul at the University of Pittsburgh accelerated their glass design process and created an entirely novel glass design structure that balanced a variety of competing but equally valuable properties. The result is more efficient solar panels. We also heard how Rafa from MIT applied deep learning to discover new energy materials, work that has the potential to transform the way we discover chemicals for a variety of applications. Finally, we heard from Vish at Novelis about how he applied Bayesian optimization to discover new designs for recycling processes that were more efficient and cheaper to implement. These cases are all a reminder that physical simulations and processes are some of the most interesting potential applications for machine learning and Bayesian optimization.
3. You don’t need to train GPT-3 to apply AI to your problems.
The most publicized recent achievements in AI, such as DeepMind’s AlphaFold or OpenAI’s GPT-3, have required an immense amount of computing power to train. GPT-3 has billions of parameters and, by some estimates, would take a multi-million-dollar budget and hundreds of GPU-years to train. However, during the event, Pablo from Anastasia showed us how their work has helped AI reach any industry. Additionally, Jian Zhang from Intel showed us how parallelized data processing reduces the need to scale up on expensive GPU clusters. As a result, his team was able to reduce training time significantly without the need for cost-prohibitive hardware. Finally, Scott Clark at SigOpt shared that SigOpt now offers its proprietary hyperparameter optimization algorithm for anyone to use free of charge. This will empower modellers to supercharge their AI solutions without the need for extensive compute resources. These talks showed us that AI is being democratized, which makes it possible for anyone to leverage it. You don’t need thousands of clusters to leverage the power of AI.
4. There’s a variety of ways to accelerate training.
Training machine learning algorithms can take an extremely long time. It’s no surprise that modellers are always excited when they’re presented with the opportunity to decrease their training time. During the event, Ke Ding from Intel talked about how they leveraged novel parallelizable solutions to scale up their training clusters. By successfully increasing their number of sockets, they were able to reduce training time from hours to a few minutes. We also heard from Venkatesh at PayPal, who spoke about how they overcame the large memory and compute needs of graph neural networks. By leveraging specialized data storage solutions that optimized the graph format of the data, they were able to reduce the training time for large real-world problems. These cases are all a reminder that when we’re scaling up solutions to reduce training time and cost, we need to think about the entire ML technology stack. There isn’t just one way to accelerate training.
5. We’re just scratching the surface.
With the advent of massive deep learning systems, a world of applications lies ahead. However, we’re still discovering how to best configure these systems for tasks that have never been investigated before. The researchers who presented at our conference have been working on hard problems for years, but they’re still just scratching the surface. They are continuing to push the boundaries of what’s possible. For example, we heard how Alexander at Stanford University has conducted many years of deep learning research within the health and protein space. Building on his past research, he was able to reach state-of-the-art results, which yielded multiple peer-reviewed papers. Plus, he’s expecting to be published in Nature Biology in the next few months. We also heard from Rafael from MIT, who has been exploring the space of hypothetical materials and has published a series of papers that continue to build on each other. In the wake of their Nature paper, he and his team anticipate pushing even further down this path. Additionally, we heard how Vishwanath at Novelis is working on research and development that showcases how you can apply intelligent experimentation to fundamentally transform materials design. However, the big opportunity is to push these techniques across their entire organization and replace outdated methods for simulation. Finally, we heard how Venkatesh at PayPal is exploring how the adversarial nature of fraud and the trillions of data points they get each day combine to make this an evolving challenge that they will continuously work to solve over time. This body of work has resulted in new discoveries in the spaces of healthcare, materials science, recycling, and finance, but there is still an immense landscape of research and development to explore.
If you want a better sense of how SigOpt could impact your workflow than you can get from simply reading about use cases, sign up for free at sigopt.com/signup. To see the recordings from the Summit, go to sigopt.com/summit. We look forward to you joining us for future Summits!