Enzymes are the tiny, unheralded heroes of the bioengineering world. From ensuring that drug compounds reach their intended place in the human body to accelerating specific chemical reactions for a variety of industrial uses, enzymes play a role in countless medical and industrial processes.
For decades, bioengineering research advanced almost exclusively through experimental work, but scientists on the leading edge want to more fully understand enzymes’ biochemical interactions with one another and their environments at a fundamental level. To do this, researchers have increasingly turned to high-performance computing (HPC), which allows them to offer narrow predictions that can be tested experimentally as well as verify experimental results quickly and efficiently.
To that end, a group of researchers led by Dr. Holger Gohlke is using HPC resources at the Jülich Supercomputing Centre (JSC), one of the three centres comprising the Gauss Centre for Supercomputing (GCS), to learn how to imbue enzymes with increased tolerance to detergents and solvents encountered in a variety of environments, an important characteristic for drug design and other biotechnological applications. Gohlke is Professor of Pharmaceutical and Medicinal Chemistry at Heinrich Heine Universität Düsseldorf and head of the John von Neumann Institute for Computing (NIC) research group Computational Biophysical Chemistry at JSC and the Institute of Biological Information Processing (IBI-7: Structural Biochemistry) at Forschungszentrum Jülich (FZJ).
“I would like to understand, predict, and modulate biomolecular properties and interactions of biomolecules with their environments,” Gohlke said. “In the case of our current research, environments are different types of solvents, mixtures, or solvents containing detergents. In our latest study, we looked at an enzyme. Enzymes are biotechnologically interesting molecules, where they often have to function under conditions that do not comply with their natural environments. For that, such molecules need to be optimized.”
To its knowledge, the team has just completed the largest computational analysis to date of a complete library of variants of the enzyme Bacillus subtilis lipase A, that is, the set of all possible single mutations of the enzyme. The study examined two enzyme properties, thermostability and detergent tolerance, both of which have significant applications in the biotechnology sector. The team’s research was published in the Journal of Chemical Information and Modeling.
Creating a virtual test tube
Biomolecules are a broad class of “soft matter,” meaning that they are easily influenced and deformed by changes in their environments. Even under normal temperature settings, the weak bonds holding biomolecules together can break and change quickly. This also means biomolecules can appear in different molecular shapes as they shift and change.
In the vast majority of contexts, biomolecules must stay within a limited range of shapes to function, whether that means helping drug molecules reach their target destination and play their designed role in the human body or accelerating a certain (bio-)chemical reaction. However, biomolecules can also adopt other shapes, or even lose their shape entirely, and thereby lose their function. Gohlke and his collaborators use HPC to create large “ensembles” of biomolecular shapes that are post-processed to learn about their behavior in different environments.
Researchers see a two-fold benefit from using molecular simulations in bioengineering. In certain contexts, experimentalists send computational scientists large amounts of testing data in the hope that they can uncover the origin of the findings at the atomic level or, more broadly, help make predictions based on a given dataset. Conversely, computational scientists can generate hypotheses about certain systems, which experimentalists can then test.
In its current research, the team focused on lipase A from Bacillus subtilis, a well-studied enzyme from a well-studied bacterium, with applications in bioengineering. The team was able to do a complete “site-saturation mutagenesis” on the enzyme, meaning that every amino acid in the protein is mutated to each of the other 19 natural amino acids, leading to about 3,440 variants of the enzyme.
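The size of such a library follows from simple counting: each position can be replaced by any of the 19 other natural amino acids. The short Python sketch below illustrates this under stated assumptions. The chain length of 181 residues is an assumption that is not given in this article, and the function names and the three-letter toy sequence are purely illustrative; none of this is the team’s actual code.

```python
# Illustrative sketch only; not the team's workflow.
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (position, original, replacement) for every single-point variant."""
    for position, original in enumerate(sequence, start=1):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                yield position, original, replacement


def count_single_substitutions(n_residues):
    """Library size: every position times the 19 alternative amino acids."""
    return n_residues * (len(AMINO_ACIDS) - 1)


# ASSUMPTION: a chain length of 181 residues, which is not stated in the article.
ASSUMED_LENGTH = 181
print(count_single_substitutions(ASSUMED_LENGTH))

# Hypothetical toy sequence, used only to show the enumeration itself.
toy_variants = list(single_substitutions("MKF"))
print(len(toy_variants), toy_variants[:3])
```

Under that assumed length, the count comes to 3,439, which is consistent with the “about 3,440” variants mentioned above.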
Gohlke pointed out that learning how to catalogue these mutations computationally helps experimentalists drastically reduce the amount of time needed to chart biomolecule mutations experimentally in the future. “What makes this system notable is that the mutation data available in public databases are usually limited to only a few positions of proteins, or limited to substitutions of simple amino acids,” he said. “It is not possible to use this publicly available data to exhaustively validate algorithms or completely understand what factors are influencing protein stability, because you can’t see all the combinations that are possible.”
Further, Christina Nutschel, a PhD student on this project, pointed out that simulations such as this help develop better data-driven approaches to protein engineering, particularly with respect to computational models. That means researchers do not need perfect predictions from their simulations; the insights still point experimentalists toward a narrower set of targets to verify.
Finally, the team has been developing a computational approach to studying enzymes that it calls Constraint Network Analysis. This approach analyzes a protein’s static properties as if it were a bridge: it looks for hot spots, specific points in the enzyme that are particularly useful for improving the structural stability of the biomolecule. Compared with scattershot predictions made at random, the team’s approach achieved a nine-fold gain in accuracy in identifying these points. To put that in perspective, an experimentalist can barely analyze an enzyme with six mutation points exhaustively: with 20 possible amino acids at each position, the researcher faces 64 million (20 to the power of 6) different possibilities.
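The 64 million figure is a plain combinatorial count, and a few lines of Python make the growth visible. This is a minimal sketch of the arithmetic only; the function name is illustrative and has nothing to do with Constraint Network Analysis itself.

```python
# Illustrative arithmetic only; not part of the team's software.
N_AMINO_ACIDS = 20  # natural amino acids available at each position


def combined_variants(n_positions, n_choices=N_AMINO_ACIDS):
    """Number of sequences when n_positions can each hold any of n_choices."""
    return n_choices ** n_positions


# The search space grows by a factor of 20 with every added position.
for n_positions in range(1, 7):
    print(n_positions, f"{combined_variants(n_positions):,}")
```

For six positions this gives 20 × 20 × 20 × 20 × 20 × 20 = 64,000,000 combinations, which is why narrowing down the promising positions in advance matters so much to experimentalists.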
Bridging the gap in bioengineering
Moving forward, the team is working with several pharmaceutical and biotechnological companies to apply its approach to the companies’ collections of enzymes. It is also ramping up the work, aiming not only to model predictions for various substitutions and mutations but also to predict enzyme structure at a fundamental level.
The team has been working very closely with JSC’s Dr. Olav Zimmermann, who also appeared as an author on the team’s lipase A paper. Zimmermann helped verify the team’s predictions using the ProFASi tool developed at JSC. “There is a clear, tight interaction and collaboration with JSC,” Gohlke said.
Further, the team looks forward to the upcoming booster module to be installed on JSC’s current flagship supercomputer, JUWELS. The team’s simulation code is optimized for GPU-based architectures, and with the booster module installed later this year, JUWELS’s peak performance will jump from 12 to over 70 petaflops.
Ultimately, though, hardware is only one part of the equation. Gohlke emphasized that his collaborations with Zimmermann and co-author Dr. Karl-Erich Jäger, Professor at Heinrich Heine Universität Düsseldorf and director at the Institute of Bio- and Geosciences IBG-1 at FZJ, would only be possible in a collaborative, multidisciplinary environment like FZJ. “We can all come together at JSC and do this kind of work, and in my view, for this field to continue to progress, we are going to have to continue these cycles of simulation and experiment,” he said.
Related publication: Gohlke, et al. “Systematically Scrutinizing the Impact of Substitution Sites on Thermostability and Detergent Tolerance for Bacillus subtilis Lipase A.” J. Chem. Inf. Model. 2020, 60, 3, 1568-1584. DOI: 10.1021/acs.jcim.9b00954