-
DeepLNE++ leveraging knowledge distillation for accelerated multi-state path-like collective variables
Authors:
Thorben Fröhlking,
Valerio Rizzi,
Simone Aureli,
Francesco Luigi Gervasio
Abstract:
Path-like collective variables can be very effective for accurately modeling complex biomolecular processes in molecular dynamics simulations. Recently, we introduced DeepLNE, a machine learning-based path-like CV that provides a progression variable s along the path as a non-linear combination of several descriptors, effectively approximating the reaction coordinate. However, DeepLNE is computationally expensive for realistic systems needing many descriptors and limited in its ability to handle multi-state reactions. Here we present DeepLNE++, which uses a knowledge distillation approach to significantly accelerate the evaluation of DeepLNE, making it feasible to compute free energy landscapes for large and complex biomolecular systems. In addition, DeepLNE++ encodes system-specific knowledge within a supervised multitasking framework, enhancing its versatility and effectiveness.
Submitted 5 July, 2024;
originally announced July 2024.
-
Deep learning path-like collective variable for enhanced sampling molecular dynamics
Authors:
Thorben Fröhlking,
Luigi Bonati,
Valerio Rizzi,
Francesco Luigi Gervasio
Abstract:
Several enhanced sampling techniques rely on the definition of collective variables to effectively explore free energy landscapes. Existing variables that describe the progression along a reactive pathway offer an elegant solution but face a number of limitations. In this paper, we address these challenges by introducing a new path-like collective variable called the 'Deep-locally-non-linear-embedding' (DeepLNE), which is inspired by principles of the locally linear embedding technique and is trained on a reactive trajectory. The variable mimics the ideal reaction coordinate by automatically generating a non-linear combination of features through a differentiable generalized autoencoder that combines a neural network with a continuous k-nearest-neighbor selection. Among the key advantages of this method is its capability to automatically choose the metric for searching neighbors and to learn the path from state A to state B without the need to handpick landmarks a priori. We demonstrate the effectiveness of DeepLNE by showing that the progression along the path variable closely approximates the ideal reaction coordinate in toy models such as the Müller-Brown potential and alanine dipeptide. We then use it in molecular dynamics simulations of an RNA tetraloop, where we highlight its capability to accelerate transitions and converge the free energy of folding.
Submitted 2 February, 2024;
originally announced February 2024.
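For readers unfamiliar with the Müller-Brown potential named in the abstract above, the following is a minimal sketch of the two-dimensional toy surface. The parameter values are the ones commonly quoted in the general literature and are an assumption here; the listing itself does not give them, and the function name `muller_brown` is purely illustrative.

```python
import numpy as np

# Commonly quoted Müller-Brown parameters (assumed, not taken from this listing).
A = np.array([-200.0, -100.0, -170.0, 15.0])
a = np.array([-1.0, -1.0, -6.5, 0.7])
b = np.array([0.0, 0.0, 11.0, 0.6])
c = np.array([-10.0, -10.0, -6.5, 0.7])
x0 = np.array([1.0, 0.0, -0.5, -1.0])
y0 = np.array([0.0, 0.5, 1.5, 1.0])


def muller_brown(x, y):
    """Sum of four anisotropic Gaussian-like terms evaluated at a single point (x, y)."""
    dx = x - x0
    dy = y - y0
    return float(np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)))


# Evaluate the surface near its two main basins (approximate locations).
v_deep = muller_brown(-0.558, 1.442)
v_shallow = muller_brown(0.623, 0.028)
```

A path-like collective variable on this surface would assign a progression value to points along the route connecting the two basins; the sketch only defines the energy function and does not implement DeepLNE.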
-
Boosting ensemble refinement with transferable force field corrections: synergistic optimization for molecular simulations
Authors:
Ivan Gilardoni,
Thorben Fröhlking,
Giovanni Bussi
Abstract:
A novel method combining ensemble refinement by the maximum entropy principle with the force field fitting approach is presented. Its formulation allows continuous interpolation between these two methods, which can thus be interpreted as two limiting cases. A cross-validation procedure makes it possible to correctly assess the relative weight of each, distinguishing scenarios where the combined approach is meaningful from those in which either ensemble refinement or force field fitting alone prevails. The efficacy of their combination is examined for a realistic case study of RNA oligomers. Within the new scheme, molecular dynamics simulations are integrated with experimental data provided by nuclear-magnetic-resonance measurements. We show that force field corrections are in general superior when applied to the appropriate force field terms, but are automatically discarded by the method when applied to inappropriate force field terms.
Submitted 22 November, 2023;
originally announced November 2023.
-
Refinement of molecular dynamics ensembles using experimental data and flexible forward models
Authors:
Thorben Fröhlking,
Mattia Bernetti,
Giovanni Bussi
Abstract:
A novel method combining the maximum entropy principle, the Bayesian inference of ensembles approach, and the optimization of empirical forward models is presented. Here we focus on the Karplus parameters for RNA systems, which relate the dihedral angles of $γ$, $β$, and the dihedrals in the sugar ring to the corresponding $^3J$-coupling signal between coupling protons. Extensive molecular simulations are performed on a set of RNA tetramers and hexamers and combined with available nuclear-magnetic-resonance data. Within the new framework, the sampled structural dynamics can be reweighted to match experimental data while the error arising from inaccuracies in the forward models can be corrected simultaneously and consequently does not leak into the reweighted ensemble. A carefully crafted cross-validation procedure and regularization terms make it possible to obtain transferable Karplus parameters. Our approach identifies the optimal regularization strength and new sets of Karplus parameters balancing good agreement between simulations and experiments with minimal changes to the original ensemble.
Submitted 19 March, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
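As an illustration of the kind of forward model the abstract above refers to, here is a minimal sketch of the generic Karplus relation, $^3J(θ) = A\cos^2θ + B\cosθ + C$, together with a weighted ensemble average over sampled dihedrals. The function names and the coefficient values used in the example call are hypothetical placeholders; they are not the parameters fitted in the paper.

```python
import numpy as np


def karplus(theta, A, B, C):
    """Generic Karplus relation: 3J coupling as a function of a dihedral angle in radians."""
    return A * np.cos(theta) ** 2 + B * np.cos(theta) + C


def ensemble_average_j(thetas, weights, A, B, C):
    """Weighted average of the back-calculated coupling over sampled dihedral angles."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one
    return float(np.sum(w * karplus(np.asarray(thetas, dtype=float), A, B, C)))


# Hypothetical coefficients (in Hz) and a toy three-frame "ensemble".
A_demo, B_demo, C_demo = 9.0, -1.0, 0.5
thetas_demo = np.array([0.0, np.pi / 2, np.pi])
j_avg = ensemble_average_j(thetas_demo, [1.0, 1.0, 1.0], A_demo, B_demo, C_demo)
```

In a reweighting scheme, both the frame weights and the coefficients `A`, `B`, `C` could in principle be adjusted; the sketch shows only the forward calculation.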
-
Molecular simulations matching denaturation experiments for N6-Methyladenosine
Authors:
Valerio Piomponi,
Thorben Fröhlking,
Mattia Bernetti,
Giovanni Bussi
Abstract:
Post-transcriptional modifications are crucial for RNA function and can affect its structure and dynamics. Force-field based classical molecular dynamics simulations are a fundamental tool to characterize biomolecular dynamics and their application to RNA is flourishing. Here we show that the set of force-field parameters for N$^6$-methyladenosine (m$^6$A) developed for the commonly used AMBER force field does not reproduce duplex denaturation experiments and, specifically, cannot be used to describe both paired and unpaired states. Then we use reweighting techniques to derive new parameters matching available experimental data. The resulting force field can be used to properly describe paired and unpaired m$^6$A in both syn and anti conformation, and thus opens the way to the use of molecular simulations to investigate the effects of N6 methylations on RNA structural dynamics.
Submitted 2 May, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.
-
Automatic learning of hydrogen-bond fixes in an AMBER RNA force field
Authors:
Thorben Fröhlking,
Vojtěch Mlýnský,
Michal Janeček,
Petra Kührová,
Miroslav Krepl,
Pavel Banáš,
Jiří Šponer,
Giovanni Bussi
Abstract:
The capability of current force fields to reproduce RNA structural dynamics is limited. Several methods have been developed to take advantage of experimental data in order to enforce agreement with experiments. We herein extend an existing framework, which allows arbitrarily chosen force-field correction terms to be fitted by quantification of the discrepancy between observables back-calculated from simulation and corresponding experiments. We apply a robust regularization protocol to avoid overfitting, and additionally introduce and compare a number of different regularization strategies, namely L1-, L2-, Kish Size-, Relative Kish Size- and Relative Entropy-penalties. The training set includes a GACC tetramer as well as more challenging systems, namely gcGAGAgc and gcUUCGgc RNA tetraloops. Specific intramolecular hydrogen bonds in the AMBER RNA force field are corrected with automatically determined parameters that we call gHBfix$_{opt}$. A validation involving a separate simulation of a system present in the training set (gcUUCGgc) and new systems not seen during training (CAAU and UUUU tetramers) displays improvements regarding native population of the tetraloop as well as good agreement with NMR-experiments for tetramers when using the new parameters. Then we simulate folded RNAs (a kink-turn and L1 stalk rRNA) including hydrogen bond types not sufficiently present in the training set. This allows a final modification of the parameter set which is named gHBfix21 and is suggested to be applicable to a wider range of RNA systems.
Submitted 11 January, 2022;
originally announced January 2022.
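The abstract above lists a Kish size penalty among its regularization strategies. As a rough illustration, the following sketch computes the standard Kish effective sample size of a set of frame weights, $(\sum_i w_i)^2 / \sum_i w_i^2$. This is the textbook definition and an assumption about what the penalty builds on; the paper's exact penalty terms, including its relative Kish size variant, are not reproduced here, and `kish_size` is an illustrative name.

```python
import numpy as np


def kish_size(weights):
    """Textbook Kish effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / np.sum(w ** 2))


# Uniform weights keep every frame effective; a single dominant frame collapses the size to 1.
n_uniform = kish_size(np.ones(10))
n_collapsed = kish_size([1.0, 0.0, 0.0, 0.0])
```

A reweighting procedure that drives the Kish size far below the number of frames relies on only a few structures, which is why such a quantity is a natural candidate for regularization.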
-
Towards empirical force fields that match experimental observables
Authors:
Thorben Fröhlking,
Mattia Bernetti,
Nicola Calonaci,
Giovanni Bussi
Abstract:
Biomolecular force fields have been traditionally derived based on a mixture of reference quantum chemistry data and experimental information obtained on small fragments. However, the possibility to run extensive molecular dynamics simulations on larger systems achieving ergodic sampling is paving the way to directly using such simulations along with solution experiments obtained on macromolecular systems. Recently, a number of methods have been introduced to automatize this approach. Here we review these methods, highlight their relationship with machine learning methods, and discuss the open challenges in the field.
Submitted 29 May, 2020; v1 submitted 3 April, 2020;
originally announced April 2020.