The Document Image and Voice Analysis Group (DIVA) at the University of Fribourg faced a challenge familiar to any scientific practitioner: reproducibility. They found the problem particularly acute in machine learning and deep learning. So Michele Alberti, Vinaychandran Pondenkandath, Marcel Würsch, Rolf Ingold, and Marcus Liwicki set out to combine best-in-class infrastructure and frameworks into a solution that makes machine learning and deep learning models more reproducible. They call this solution DeepDIVA.
One of the key problems to solve was reproducing the hyperparameter configuration of any given model. To automate the hyperparameter optimization process, the team chose SigOpt as the best-in-class tuning solution for their DeepDIVA framework.
Here at SigOpt, we love to highlight the work done by academics who take advantage of a free version of our product for their research (learn more here). To this end, below is an interview with the DeepDIVA team from the University of Fribourg to learn more about their framework and what it means for the broader community.
What is DeepDIVA?
DeepDIVA is a Python framework designed to enable quick and easy setup of reproducible experiments, supported by a wide range of useful analysis functionality.
Why did you develop it?
At the start of our PhDs we were confronted with several problems that were common across projects: automating, evaluating, and, most importantly, reproducing experiments. Reproducing published scientific results is a frustrating experience in and of itself. But, in certain circumstances, keeping track of and reproducing your own results can be even more challenging.
Most research code is written with a short life expectancy in mind, unlike in industry, where software engineering best practices for long-term usability are usually considered when designing and writing code. This leads to particular challenges when attempting to reproduce an experiment, such as hard-coded parameter values and multiple untracked versions of the code.
These challenges make it hard to reproduce your own research after a few months, because you aren't sure which version of the code generated the results or which parameters configured the model. Even when research groups agree on a set of tools for versioning, parameter tracking, and other tasks, the difficulty of applying them easily, efficiently, and consistently across team members severely limits their impact on reproducibility over the longer term.
With DeepDIVA, our team produced a single, centralized open-source framework designed to automate reproducibility at each stage of the model development process. The framework provides researchers and deep learning engineers with tools that automate the key inexpert tasks in the process, so they can focus their time on the tasks that do benefit from their expertise. The goal of this project is to make model development more productive and reproducible.
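To illustrate the kind of bookkeeping such a framework automates, here is a minimal sketch (hypothetical, not DeepDIVA's actual code) of pinning the random seed and persisting a run's full configuration, so the run can be re-created later:

```python
import hashlib
import json
import random


def start_reproducible_run(config, output_path):
    """Pin the RNG seed and persist the full run configuration.

    `config` is a plain dict of hyperparameters; a default seed is
    added if the caller did not supply one. This is an illustrative
    sketch: a real framework would also seed numpy/torch and record
    the git commit hash of the code.
    """
    config = dict(config)
    config.setdefault("seed", 42)
    random.seed(config["seed"])

    # A digest of the configuration makes it easy to spot two runs
    # that silently used different settings.
    blob = json.dumps(config, sort_keys=True)
    config["config_hash"] = hashlib.sha256(blob.encode()).hexdigest()[:12]

    # Persist everything needed to re-create the run.
    with open(output_path, "w") as f:
        json.dump(config, f, indent=2, sort_keys=True)
    return config
```

Because the configuration file sits next to the results, re-running an experiment months later is a matter of reloading the stored dict rather than reconstructing it from memory.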
How does it work?
Our framework integrates the most critical deep learning model development software libraries and tools into a single best-in-class bundle that includes:
- A high-end deep learning framework, PyTorch,
- Visualization and analysis from TensorBoard (TensorFlow),
- Versioning from GitHub, and
- Hyperparameter optimization from SigOpt.
Who can benefit from using it?
Any researcher working on machine learning problems will benefit from DeepDIVA. But it has been specifically designed for deep learning, and researchers focused on particularly complex deep learning models stand to benefit most from it.
Which areas of the process were most important to automate?
It is important to automate as many tasks as possible, but particularly important to automate the tasks that do not benefit from domain expertise. These inexpert tasks are particularly unproductive for research teams.
Through DeepDIVA, we have successfully automated many tasks such as:
- Aggregating and visualizing results in real time,
- Ensuring an experiment can be reproduced, and
- Optimizing hyperparameters.
Automating hyperparameter optimization was a particularly high priority, because it is the most time-consuming task that benefits least from expertise. Automating it with SigOpt let our team run this task overnight or over weekends, significantly accelerating model development while saving researcher time.
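The loop that a tuning service automates has roughly the following shape. This is a hedged stand-in: it uses simple random search in place of SigOpt's model-based optimization, and the objective is a toy function, but the suggest/evaluate/report cycle is the same one a researcher would otherwise drive by hand:

```python
import random


def tune(objective, bounds, budget=50, seed=0):
    """Suggest/evaluate/report loop, driven here by random search.

    `bounds` maps each hyperparameter name to a (low, high) range.
    A real tuning service such as SigOpt would replace the random
    suggestion with a model-based one; the surrounding loop is the same.
    """
    rng = random.Random(seed)
    best_params, best_value = None, float("-inf")
    for _ in range(budget):
        # "Suggestion": draw a candidate configuration.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        # "Observation": evaluate the model and report the metric.
        value = objective(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value
```

Because the loop needs no human in it, it can run unattended overnight, which is exactly the time saving described above.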
Why did you select SigOpt as a part of the framework?
We originally learned about SigOpt at NIPS in 2017. SigOpt was founded by academics like Scott Clark, and the company believes in giving back to the academic community. One way they do so is through a free version of their solution for academia. We signed up for this program and had experiments up and running in a few short hours.
The first benefit of SigOpt is that it automates the tuning process, which is value enough for most teams that waste time on this problem. But SigOpt is also highly effective at tuning. After testing SigOpt on a variety of models, ranging from simple to complex and spanning traditional machine learning and deep learning, we found that it is highly effective at tuning hyperparameters to improve model performance across this wide cross-section of models. Finally, SigOpt offers analytics like parameter importance that are particularly valuable for reproducibility. Parameter importance shows the degree to which specific hyperparameters contribute to a model's performance.
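One simple way to approximate the idea behind a parameter-importance analysis (SigOpt's actual method is not described here, so this is only an illustration) is to correlate each hyperparameter with the metric across completed runs:

```python
def parameter_importance(runs):
    """Rank hyperparameters by |Pearson correlation| with the metric.

    `runs` is a list of (params_dict, metric) pairs from completed
    experiments. This is a crude proxy for importance: it captures
    only linear, univariate effects, unlike a model-based analysis.
    """
    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (vx * vy) if vx and vy else 0.0

    names = runs[0][0].keys()
    metrics = [m for _, m in runs]
    scores = {n: abs(pearson([p[n] for p, _ in runs], metrics)) for n in names}
    # Highest-scoring parameters matter most under this simple proxy.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For reproducibility, such a ranking tells a reader which parameters must be reported precisely and which ones the result is insensitive to.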
In general, these mathematical approaches to hyperparameter tuning, rather than intuition, produce better, less biased results with less expert time. Standardizing on SigOpt was an obvious choice for our team.
What are your hopes for this solution in the coming months?
Our goal is to facilitate a more efficient scientific process and, as a result, promote ongoing scientific progress. To do so, we hope to promote reproducible research while fostering a friendly environment in which both academia and industry collaborate on problems, share best practices, and contribute to mutual success. More tactically, we will encourage teams to publish research articles that include the dataset, source code, and experimental setup.
We believe DeepDIVA is a valuable tool for making this entire process easier for teams. DeepDIVA is our contribution to that effort, designed specifically to facilitate reproducible research. In the best-case scenario, all published research in the future will come with links to DeepDIVA experiments, or to similar frameworks. Doing so will benefit the broader community.