How B.Next Applies SigOpt to COVID-19 Research

Nick Payton and JJ Ben-Joseph
Experiment Management, Healthcare, Hyperparameter Optimization, Natural Language, Time Series

Many researchers are working to apply deep learning to benefit the global response to the COVID-19 pandemic. For example, SigOpt worked with the University of Waterloo and Darwin AI to optimize the hyperparameters for COVID-Net, an image classification model that detects COVID-19 from lung x-rays. And the HPC Consortium was established to connect these research projects with access to free, scalable computing resources. 

This includes the work of In-Q-Tel (IQT), the strategic US national security investment arm, and its biodefense innovation and applied research lab, B.Next. B.Next has compiled a variety of tools and resources to facilitate COVID-19 research, including datasets that can be used to train machine learning models. JJ Ben-Joseph, a research fellow and member of the technical staff at B.Next, is using these resources to develop a natural language processing (NLP) model to assess the toxicity of antiviral drugs and validity of molecular compounds.

JJ used SigOpt throughout this process to track his training runs, explore the modeling problem space, evaluate model behavior, and optimize the model hyperparameters. In this post, we interview JJ to learn more about his work.

How would you summarize your research and for whom is it most useful?

The goal of this research is to accelerate the discovery of a small molecule antiviral for COVID-19. If I am able to accomplish this goal, the entire world stands to benefit from this research. 

To date, I have focused on training NLP models on datasets that use the simplified molecular-input line-entry system (SMILES), a format that works much like written language (hence the application of NLP). I have trained traditional machine learning models (like random forests), but have run into what seems to be a performance ceiling below the state of the art I want to reach.

How did you decide on the model architecture you chose for this research?

I fed SMILES definitions for antiviral candidates into a custom parser, then analyzed the output with a technique called information gain to build a large chemical vocabulary on which I can apply conventional natural language processing techniques. In this pipeline, three consecutive models feed into a final prediction, which generates many hyperparameters that need to be optimized.

This works quite well because the industry-standard SMILES format behaves much like written language (except that its definitions are explicit rather than ambiguous!), so a combination of parsers and a random forest works well to assess the validity of molecular candidates.
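To make the idea concrete, here is a minimal sketch of scoring SMILES character n-grams by information gain, the technique JJ describes for building a chemical vocabulary. The toy molecules, labels, and function names are illustrative assumptions, not JJ's actual pipeline or data:

```python
import math
from collections import Counter

def char_ngrams(smiles, n):
    """All character n-grams of a SMILES string."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(dataset, gram, n):
    """Entropy reduction from splitting on 'molecule contains this n-gram'."""
    with_gram = [y for s, y in dataset if gram in char_ngrams(s, n)]
    without = [y for s, y in dataset if gram not in char_ngrams(s, n)]
    labels = [y for _, y in dataset]
    weighted = sum(len(part) / len(labels) * entropy(part)
                   for part in (with_gram, without) if part)
    return entropy(labels) - weighted

# Toy labeled set of (SMILES, label) pairs -- illustrative only
data = [("CCO", 1), ("CC(=O)O", 1), ("c1ccccc1", 0), ("C1CC1", 0)]
vocab = set().union(*(char_ngrams(s, 2) for s, _ in data))
ranked = sorted(vocab, key=lambda g: information_gain(data, g, 2),
                reverse=True)
```

The highest-ranked n-grams form the vocabulary on which conventional NLP techniques (and downstream models like a random forest) can operate.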

Alternatively, I could have explored the use of graph convolutions. But because this NLP technique is novel, I made the choice to go in this direction to see if I could create a new, more effective approach to addressing this modeling problem. The efficacy of this path versus others is still unknown. With further research, I expect to be able to draw firmer conclusions on this architecture decision. 

How did you employ SigOpt? 

I used SigOpt for a variety of purposes. I used Runs to track training, Experiments to automate hyperparameter optimization, and the Dashboard to visualize these training and tuning jobs. 

SigOpt was particularly useful for hyperparameter optimization. Compared to other optimizers, SigOpt allows for optimization of mixed parameter spaces (including categoricals), and its sample efficiency makes it possible to optimize a larger number of parameters than other optimization methods allow. Because this is a multi-stage machine learning pipeline, there is a huge number of hyperparameters, so this latter capability is particularly important. Examples include max-ngram-size, window-size, epochs, max_vocab, estimator_trees, and estimator_criterion.
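To illustrate what a mixed parameter space over these hyperparameters looks like, here is a sketch of a tuning loop over the parameters named above. The bounds, the scoring function, and the random sampler are all assumptions for illustration; in JJ's setup, SigOpt's Bayesian optimizer proposes the candidate points sample-efficiently instead of sampling at random:

```python
import random

# Hypothetical bounds for the mixed parameter space described in the post.
SPACE = {
    "max_ngram_size":      ("int", 2, 8),
    "window_size":         ("int", 1, 10),
    "epochs":              ("int", 5, 50),
    "max_vocab":           ("int", 1000, 100000),
    "estimator_trees":     ("int", 50, 500),
    "estimator_criterion": ("categorical", ["gini", "entropy"]),
}

def sample(space, rng):
    """Draw one candidate point from the mixed int/categorical space."""
    point = {}
    for name, spec in space.items():
        if spec[0] == "int":
            point[name] = rng.randint(spec[1], spec[2])
        else:
            point[name] = rng.choice(spec[1])
    return point

def evaluate(point):
    """Stand-in for training the pipeline and returning a validation score."""
    return -abs(point["max_ngram_size"] - 4) + point["estimator_trees"] / 500

rng = random.Random(0)
best = max((sample(SPACE, rng) for _ in range(100)), key=evaluate)
```

With a real optimizer, each evaluated (point, score) pair is reported back so the next suggestion improves; that feedback loop is what random sampling lacks.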

Why did you choose SigOpt?

SigOpt is far easier to use, has a nice UI, is more intelligent than grid search, and its server-based centralization integrates nicely with SLURM, which is commonly used on the high-performance computing (HPC) systems I used in this research.

Most importantly, SigOpt gave me highly useful situational awareness about which parts of my pipeline actually matter, which will influence the next iteration of the project. This insight might never have come to me otherwise.

What’s next for this research?

In the next phase of this research, I am going to test deep learning methods that have proven especially high-performing for NLP. I hope to match, and potentially exceed, the state of the art on similar tasks, boosting the performance of this model for this important problem.

If you’re interested in applying SigOpt to optimize models useful in the fight against COVID-19, you can sign up for free, unrestricted access to our platform here. If you’re performing any kind of academic research, you can sign up for our academic program here.

Nick Payton, Head of Marketing & Partnerships
JJ Ben-Joseph, Guest Author