MIT’s Rafael Gomez-Bombarelli works on cutting-edge research to develop new materials. Rafael leverages insights from generative architectures, such as GANs and VAEs, to design novel molecules. He and his team are creating new materials for healthcare, sustainability, carbon storage, and more.
How did GANs inspire you to build novel materials?
Rafael: My ultimate goal is for the computer to help us design for a target property. So given my desired property, I want to know which material fulfills it. So it’s an inverse design problem, and it might not be bijective. There are big challenges to this, but we’re trying to discover the correct arrangement of matter that will give us the desired property. This 2018 GAN paper keeps getting older and older: Progressive Growing of GANs for Improved Quality, Stability, and Variation (Karras, Aila, Laine, Lehtinen). And it still looks amazing.
Karras, Aila, Laine, Lehtinen 2018
This paper inspired me to leverage generative models in machine learning to create new molecules. What we’re trying to do is sample members of a complex distribution. Only this time the distribution is not the distribution of celebrity faces, like in the paper. Now we want to sample from the distribution of molecules, to find the molecule with the deepest pure blue color to create a blue pixel in your television. That’s the nature of the task, and the generative model architecture is great at this task.
How do variational autoencoders generate novel materials?
Rafael: So back in 2015 and 2016, we started playing around with variational autoencoders, and that was one of the first examples where one could do this. We would take a molecule and encode it as a vector based on its molecular graph. And now that it’s a continuous vector, we can numerically modify it. We can do gradient descent over a property prediction. We can sample from the distribution at random and learn to decode all of these latent vectors back into molecules.
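The "gradient descent over property prediction" idea can be sketched in a few lines: once molecules live in a continuous latent space, you can follow the gradient of a property predictor to find better latent points. The snippet below is a minimal NumPy sketch under loud assumptions: the quadratic `predicted_property` is a toy stand-in for a trained neural-network predictor, and the `target` optimum is purely illustrative.

```python
import numpy as np

# Hypothetical stand-in for a trained property predictor over the latent
# space: a smooth toy function whose maximum sits at z == target. In the
# real pipeline this would be a neural network mapping latent vectors to
# the property of interest (e.g. how deep a blue the molecule absorbs).
target = np.array([1.0, -2.0, 0.5])

def predicted_property(z):
    return -np.sum((z - target) ** 2)   # peaks at z == target

def property_gradient(z):
    return -2.0 * (z - target)          # analytic gradient of the toy predictor

# Gradient ascent in the continuous latent space: start from the latent
# embedding of some seed molecule (here just the origin) and walk toward
# higher predicted property.
z = np.zeros(3)
lr = 0.1
for _ in range(200):
    z = z + lr * property_gradient(z)

print(np.round(z, 3))   # the optimized latent point, ready to be decoded
```

Decoding the optimized `z` back through the trained decoder is what turns this latent-space search into a concrete candidate molecule.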
Encoder-Decoder Architecture to Generate Novel Materials
So we set up our variational autoencoder architecture, and it learns to reconstruct each input as its own output. In the meantime, it learns a continuous embedding where every latent point corresponds to a molecule. It learns to generalize and produce new molecules that are not necessarily in the training data set. They come from the same distribution as the training data, and they’re related to it. They have roughly the same distribution of properties, but they are new. They haven’t been made yet.
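The two uses of the latent space described above, reconstructing known molecules and generating new ones, can be sketched as follows. This is a toy NumPy illustration, not the team's implementation: the linear `decode` and the hand-picked `(mu, sigma)` stand in for a trained decoder network and encoder outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical decoder weights; the real decoder is a neural network
# trained to map latent vectors back to molecular representations.
W = rng.normal(size=(4, 3))

def decode(z):
    """Toy stand-in for the trained decoder."""
    return W @ z

# 1. Reconstruction: the encoder maps a molecule to (mu, sigma); sample a
#    latent point with the reparameterization trick and decode it back.
mu = np.array([0.2, -1.0, 0.5])      # illustrative encoder output
sigma = np.array([0.1, 0.1, 0.1])
z = mu + sigma * rng.normal(size=3)  # z ~ N(mu, sigma^2)
reconstruction = decode(z)

# 2. Generation: sample directly from the prior N(0, I). These latent
#    points need not correspond to any training molecule, which is how
#    the model proposes new molecules from the same distribution.
novel = [decode(rng.normal(size=3)) for _ in range(5)]

print(reconstruction.shape, len(novel))
```

The key design point is that both paths share the same decoder: reconstruction keeps the latent space anchored to real molecules, while sampling from the prior explores the neighborhoods between them.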
Creating a latent space from the initial molecular distribution
Watch the Talk
To learn more about how MIT leverages generative models to create novel molecules and novel materials, I encourage you to watch the talk from Professor Rafael Gomez-Bombarelli.