\UseRawInputEncoding

Accurate nuclear quantum statistics on machine-learned classical effective potentials

Iryna Zaporozhets Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany Department of Chemistry, Rice University, Houston, Texas 77005, United States Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States    Félix Musil Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany    Venkat Kapil v.kapil@ucl.ac.uk Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK Department of Physics and Astronomy, University College London WC1E 6BT, UK Thomas Young Centre and London Centre for Nanotechnology, London WC1E 6BT, UK    Cecilia Clementi cecilia.clementi@fu-berlin.de Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States Department of Chemistry, Rice University, Houston, Texas 77005, United States
Abstract

The contribution of nuclear quantum effects (NQEs) to the properties of various hydrogen-bound systems, including biomolecules, is increasingly recognized. Despite the development of many acceleration techniques, the computational overhead of incorporating NQEs in complex systems is sizable, particularly at low temperatures. In this work, we leverage deep learning and multiscale coarse-graining techniques to mitigate the computational burden of path integral molecular dynamics (PIMD). Specifically, we employ a machine-learned potential to accurately represent corrections to classical potentials, thereby significantly reducing the computational cost of simulating NQEs. We validate our approach using four distinct systems: Morse potential, Zundel cation, single water molecule, and bulk water. Our framework allows us to accurately compute position-dependent static properties, as demonstrated by the excellent agreement obtained between the machine-learned potential and computationally intensive PIMD calculations, even in the presence of strong NQEs. This approach opens the way to the development of transferable machine-learned potentials capable of accurately reproducing NQEs in a wide range of molecular systems.

I Introduction

Molecular dynamics (MD) simulations provide insight into phenomena at atomic resolution and are an invaluable tool in chemistry, biophysics, and material science. Recent advances in force field [1, 2, 3], algorithms [4] and hardware development [5] allow for increasingly higher accuracy and a broader scope of application of molecular dynamics simulations.

Traditional MD simulations rely on the classical evolution of nuclear positions, neglecting an explicit treatment of so-called quantum nuclear effects (NQEs) such as zero-point fluctuations, tunneling phenomena, and isotope effects. This simplification has a historical justification as the parameters of classical empirical force field simulations are fit to reproduce (quantum) experimental data. Hence, NQEs are effectively included in the classical potential energy surface, at least partially, albeit with limited transferability. On the other hand, bottom-up ab initio MD methods or machine learning interatomic potentials (MLIPs), based on the Born-Oppenheimer approximation [6], only account for the quantization of electrons. Hence, these simulations require an explicit treatment of quantum nuclear motion on the Born-Oppenheimer potential energy surface (PES) in theory. Particularly when accurate electronic structure methods are used to estimate the Born-Oppenheimer PES, the neglect of quantum nuclear motion may be the largest source of error for obtaining quantitative agreement with experiments [7, 8, 9, 10]. Unfortunately, NQEs are typically neglected due to their significant computational overhead or because they are assumed to have small effects. However, even at ambient temperatures, NQEs significantly affect properties of hydrogen-bounded systems, e.g. static and dynamic properties of water [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], free energies of hydrogen bonding and proton transfer [24, 25, 26, 27, 28, 29], stability of molecular crystals [30, 31], strength of intermolecular interactions [32, 9], and enzyme activity [33, 19].

The standard condensed-phase simulation technique for incorporating NQEs in equilibrium properties is Feynman’s imaginary time path integral (PI) formalism [34]. In this framework, a system’s quantum statistical mechanics is mapped onto the classical statistical mechanics of a ring polymer, which consists of multiple replicas of the system connected by harmonic interactions that depend on temperature and mass. The effective classical PI partition function of the ring polymer system can be sampled using MD or Monte Carlo techniques, yielding the PIMD [35] and PI Monte Carlo [36] methods. In theory, PI methods incorporate quantum statistics exactly with classical sampling, in the limit of the number of replicas going to infinity [37]. However, typically, the required number of replicas to converge structural properties, and thus the increased computational cost compared to classical MD, rises with increasing physical frequencies or lowering temperatures [38]. As a result, most molecular systems require 1632163216-3216 - 32 replicas at room temperature, which increases proportionally with lowering temperature, such as over 104superscript10410^{4}10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT replicas to obtain converged energies for molecular systems at cryogenic temperatures [39].

Over the last decades, several systematically improvable reduced-cost PI approaches have been proposed. These include high-order factorization of the Boltzmann operator [40, 41, 42, 43, 44, 45], , perturbative expansion of the Boltzmann operator [46, 47] , multiple time stepping [48], ring polymer contraction [49], and coloured-noise thermostats [50, 51, 52]. Despite their notable success towards mainstreaming the use of PI methods [38], these approaches do not yet present a “fix-all" solution. Issues still exist, including, but not limited to, lack of universal applicability of multiple time stepping and ring polymer contraction, tedious implementation of high-order path integral techniques, and poor ergodicity [53] or zero-point energy leakage [50] due to the coupling between physical and fictitious PI collective modes. Finally, these approaches do not eliminate the diverging cost of path integral simulations at low temperatures.

An appealing direction that avoids the computational overhead of PI methods is to avoid modeling the entire imaginary-time path. For instance, the Wigner-Kirkwood effective potential [54, 55] expands the partition functions in powers of the reduced Plank’s constant and retains computationally tractable terms. Similarly, an effective quantum potential was suggested by Feynman and Hibbs [34, 56] by integrating out all degrees of freedom except the centroid of the ring polymer, i.e., by coarse-graining the ring polymer to the centroid. Ever since, the idea of ring polymer coarse-graining has been explored for over 20 years [57, 58, 59] and has recently gained increasingly more attention thanks to the advent of MLIPs [60, 61, 62].

A number of coarse-grained effective potentials can be developed by identifying (a) a suitable mapping from the fine-grained ring polymer to the coarse-grained classical system and (b) the functional form of the interactions in the coarse-grained system. For instance, the so-called “CG-PI theory", developed by Voth et al., [63, 64, 65] proposes to map the imaginary time path to a two-replica coarse-grained system. This mapping gives access to both diagonal and nondiagonal elements of the thermal density matrix. However, the corresponding formalism is challenging to implement for realistic molecular systems in the regime of strong NQEs [65]. Another common mapping involves coarse-graining of the ring polymer to the centroid [61, 62, 60]. Amongst others, we have recently shown that [60], the resulting effective potential can be used in the same manner as a classical force field, providing an approximation to the dynamical properties of the system within the centroid molecular dynamics approach [66]. However, centroid-based ring-polymer coarse-graining does not give rigorous access to exact static properties [67]. For instance, for a harmonic oscillator (which is a good approximation for many chemical systems with stiff bonds), the centroid approximation is equivalent to a classical description. On the other hand, a prominent feature of our work is the path-integral coarse-grained simulations (PIGS) method that uses many-body universal approximators [68] to fit the coarse-grained potential of mean force. The PIGS approach utilizes machine-learned potentials, capable of approximating arbitrary complex functions, to learn the full many-body coarse-grained potentials as demonstrated recently on biomolecular simulations [69, 70, 71, 72].

Here, instead of dynamical properties, we focus on characterizing static position-dependent properties, including NQEs, and we develop an effective machine-learned potential to compute them accurately and at the cost of classical MD. Our approach builds upon the PIGS approach that uses multiscale coarse-graining [68] and our work on the use of deep learning architectures for classical coarse-grained force-fields [73] but maps the ring polymer onto a single replica instead of the centroid. This coarse-grained mapping is distinct from earlier choices and allows quantum thermodynamic expectation values of position-dependent observables to be estimated rigorously as simple time averages akin to classical MD. We demonstrate that this approach yields accurate quantum nuclear statistics of bulk and isolated systems. The remaining sections of this paper are organized as follows: in Section II, we briefly summarize the imaginary-time PI theory and the application of thermodynamic coarse-graining to the problem. Section III discusses the numeric implementation of the approach for four model systems: one-dimensional Morse potential, single H2O molecule, Zundel cation, and bulk water. The results are presented and discussed in Section IV. Section V provides concluding remarks.

II Theory and Methods

II.1 The path-integral approach

We consider a Hamiltonian H𝐻Hitalic_H of N𝑁Nitalic_N distinguishable nuclei, living in three-dimensional Cartesian space, with masses {m1,,mN}{mi}isubscript𝑚1subscript𝑚𝑁subscriptsubscript𝑚𝑖𝑖\{m_{1},\dots,m_{N}\}\equiv\{m_{i}\}_{i}{ italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_m start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT } ≡ { italic_m start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, and position vector 𝐪{𝐪i}i𝐪subscriptsubscript𝐪𝑖𝑖\mathbf{q}\equiv\{\mathbf{q}_{i}\}_{i}bold_q ≡ { bold_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, interacting with the potential U(𝐪)𝑈𝐪U(\mathbf{q})italic_U ( bold_q ). The equivalent PI partition function comprising P𝑃Pitalic_P replicas can be expressed as,

Z=tr[eβH^]=limPZP,𝑍trdelimited-[]superscript𝑒𝛽^𝐻subscript𝑃subscript𝑍𝑃Z=\text{tr}\left[e^{-\beta\hat{H}}\right]=\lim_{P\to\infty}Z_{P},italic_Z = tr [ italic_e start_POSTSUPERSCRIPT - italic_β over^ start_ARG italic_H end_ARG end_POSTSUPERSCRIPT ] = roman_lim start_POSTSUBSCRIPT italic_P → ∞ end_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT , (1)

where

ZPsubscript𝑍𝑃\displaystyle Z_{P}italic_Z start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT d𝐐eβUPI(𝐐),proportional-toabsentd𝐐superscript𝑒𝛽superscript𝑈PI𝐐\displaystyle\propto\int\textrm{d}{}\mathbf{Q}~{}e^{-\beta U^{\text{PI}}(% \mathbf{Q})},∝ ∫ d bold_Q italic_e start_POSTSUPERSCRIPT - italic_β italic_U start_POSTSUPERSCRIPT PI end_POSTSUPERSCRIPT ( bold_Q ) end_POSTSUPERSCRIPT ,
=j=1d𝐪(j)eβP[j=1P[i12miP2β22(𝐪i(j+1)𝐪i(j))2+U(𝐪(j))]].absentsubscriptproduct𝑗1dsuperscript𝐪𝑗superscript𝑒𝛽𝑃delimited-[]superscriptsubscript𝑗1𝑃delimited-[]subscript𝑖12subscript𝑚𝑖superscript𝑃2superscript𝛽2superscriptPlanck-constant-over-2-pi2superscriptsuperscriptsubscript𝐪𝑖𝑗1superscriptsubscript𝐪𝑖𝑗2𝑈superscript𝐪𝑗\displaystyle=\int\prod_{j=1}\textrm{d}{}\mathbf{q}^{(j)}~{}e^{-\frac{\beta}{P% }\left[\sum_{j=1}^{P}\left[\sum_{i}\frac{1}{2}\frac{m_{i}P^{2}}{\beta^{2}\hbar% ^{2}}\left(\mathbf{q}_{i}^{(j+1)}-\mathbf{q}_{i}^{(j)}\right)^{2}+U\left(% \mathbf{q}^{(j)}\right)\right]\right]}.= ∫ ∏ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT d bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - divide start_ARG italic_β end_ARG start_ARG italic_P end_ARG [ ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT [ ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG 2 end_ARG divide start_ARG italic_m start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_P start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_β start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT roman_ℏ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ( bold_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_j + 1 ) end_POSTSUPERSCRIPT - bold_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_U ( bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ) ] ] end_POSTSUPERSCRIPT . (2)

Here, 𝐪(j)superscript𝐪𝑗\mathbf{q}^{(j)}bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT represents the position vector of the j𝑗jitalic_j-th replica of the system in the ring polymer with cyclic boundary conditions 𝐪(j)𝐪(j+P)superscript𝐪𝑗superscript𝐪𝑗𝑃\mathbf{q}^{(j)}\equiv\mathbf{q}^{(j+P)}bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ≡ bold_q start_POSTSUPERSCRIPT ( italic_j + italic_P ) end_POSTSUPERSCRIPT. In addition, 𝐐{𝐪(j)}j𝐐subscriptsuperscript𝐪𝑗𝑗\mathbf{Q}\equiv\{\mathbf{q}^{(j)}\}_{j}bold_Q ≡ { bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is a shorthand for the set of positions of the P𝑃Pitalic_P replicas of the ring polymer, and UPI(𝐐)superscript𝑈PI𝐐U^{\text{PI}}(\mathbf{Q})italic_U start_POSTSUPERSCRIPT PI end_POSTSUPERSCRIPT ( bold_Q ) is a shorthand notation for the total potential experienced by the ring polymer. In the limit P𝑃P\rightarrow\inftyitalic_P → ∞, the ring polymer statistics is an unbiased estimator of quantum statistics of the target system.

The quantum thermodynamic average of a position-dependent operator O(𝐪^)𝑂^𝐪O(\hat{\mathbf{q}})italic_O ( over^ start_ARG bold_q end_ARG ) is typically estimated as the ensemble average of the position-dependent function averaged over all the replicas,

𝒪=1Pj=1PO(𝐪(j))PI,𝒪subscriptdelimited-⟨⟩1𝑃superscriptsubscript𝑗1𝑃𝑂superscript𝐪𝑗PI\displaystyle\mathcal{O}=\left\langle\frac{1}{P}\sum_{j=1}^{P}O(\mathbf{q}^{(j% )})\right\rangle_{\text{PI}},caligraphic_O = ⟨ divide start_ARG 1 end_ARG start_ARG italic_P end_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT italic_O ( bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ) ⟩ start_POSTSUBSCRIPT PI end_POSTSUBSCRIPT , (3)

where PIsubscriptdelimited-⟨⟩PI\langle\square\rangle_{\text{PI}}⟨ □ ⟩ start_POSTSUBSCRIPT PI end_POSTSUBSCRIPT corresponds to the ensemble average defined in Eq. 2. Estimating Eq. 3 is computationally demanding due to the need for simulating P𝑃Pitalic_P replicas of the system. Typically, P𝑃Pitalic_P should be larger than βωmax𝛽Planck-constant-over-2-pisubscript𝜔max\beta\hbar\omega_{\text{max}}italic_β roman_ℏ italic_ω start_POSTSUBSCRIPT max end_POSTSUBSCRIPT with ωmaxsubscript𝜔max\omega_{\text{max}}italic_ω start_POSTSUBSCRIPT max end_POSTSUBSCRIPT being the largest physical frequency of the system [74]. Notably, for a given system (with fixed ωmaxsubscript𝜔max\omega_{\text{max}}italic_ω start_POSTSUBSCRIPT max end_POSTSUBSCRIPT), the number of replicas scales inversely with T𝑇Titalic_T, implying a sleepy rising computational cost as the temperature goes to zero.

II.2 Path-integral coarse-graining for quantum statistics

Our goal is to reduce the computational cost of simulating ring polymer statistics by integrating additional degrees of freedom, such as (linear combinations of) the replicas of the system, in a way that retains the accuracy of Eq. 2. In the most extreme and computationally beneficial scenario, we can integrate the P1𝑃1P-1italic_P - 1 replicas of the physical system to an effective potential felt by a single replica, reducing the dimensionality of the ring polymer to the physical system. In doing so, we reduce the cost of estimating thermodynamic properties to that of classical MD.

The problem can be reformulated as follows, using the bottom-up coarse-graining technique widely used in biomolecular simulations. We wish to coarse-grain the ring polymer system in the high dimensional configurational space 𝐐3NP𝐐superscript3𝑁𝑃\mathbf{Q}\in\mathds{R}^{3NP}bold_Q ∈ blackboard_R start_POSTSUPERSCRIPT 3 italic_N italic_P end_POSTSUPERSCRIPT described by a potential UPI(𝐐U^{\text{PI}}(\mathbf{Q}italic_U start_POSTSUPERSCRIPT PI end_POSTSUPERSCRIPT ( bold_Q) into an effective thermodynamically consistent potential Ueff(𝐪)superscript𝑈eff𝐪U^{\text{eff}}(\mathbf{q})italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q ) with 𝐪3N𝐪superscript3𝑁\mathbf{q}\in\mathds{R}^{3N}bold_q ∈ blackboard_R start_POSTSUPERSCRIPT 3 italic_N end_POSTSUPERSCRIPT,

Ueff(𝐪)=β1ln[d𝐐eβUPI(𝐐)δ(𝐪ξ(𝐐))d𝐐eβUPI(𝐐)],superscript𝑈eff𝐪superscript𝛽1d𝐐superscript𝑒𝛽superscript𝑈PI𝐐𝛿𝐪𝜉𝐐d𝐐superscript𝑒𝛽superscript𝑈PI𝐐U^{\text{eff}}(\mathbf{q})=-\beta^{-1}\ln\left[\frac{\int\textrm{d}{}\mathbf{Q% }~{}e^{-\beta U^{\text{PI}}(\mathbf{Q})}~{}\delta(\mathbf{q}-\xi(\mathbf{Q}))}% {\int\textrm{d}{}\mathbf{Q}~{}e^{-\beta U^{\text{PI}}(\mathbf{Q})}}\right],italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q ) = - italic_β start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_ln [ divide start_ARG ∫ d bold_Q italic_e start_POSTSUPERSCRIPT - italic_β italic_U start_POSTSUPERSCRIPT PI end_POSTSUPERSCRIPT ( bold_Q ) end_POSTSUPERSCRIPT italic_δ ( bold_q - italic_ξ ( bold_Q ) ) end_ARG start_ARG ∫ d bold_Q italic_e start_POSTSUPERSCRIPT - italic_β italic_U start_POSTSUPERSCRIPT PI end_POSTSUPERSCRIPT ( bold_Q ) end_POSTSUPERSCRIPT end_ARG ] , (4)

where δ𝛿\deltaitalic_δ is a Dirac delta function, and ξ𝜉\xiitalic_ξ is a linear operator that maps coordinates from 3NP3Nsuperscript3𝑁𝑃superscript3𝑁\mathds{R}^{3NP}\to\mathds{R}^{3N}blackboard_R start_POSTSUPERSCRIPT 3 italic_N italic_P end_POSTSUPERSCRIPT → blackboard_R start_POSTSUPERSCRIPT 3 italic_N end_POSTSUPERSCRIPT. The effective potential of mean force in Eq. 4 can be obtained using the force-matching variational approach [68]. We first represent the effective potential as a function, U~eff(𝐪,θ)superscript~𝑈eff𝐪𝜃\tilde{U}^{\text{eff}}(\mathbf{q},\theta)over~ start_ARG italic_U end_ARG start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q , italic_θ ), parametrized by 𝜽𝜽\bm{\theta}bold_italic_θ, and then minimize the force-matching loss [69] with respect to 𝜽𝜽\bm{\theta}bold_italic_θ:

=ξf(𝐅)+𝐪U~eff(𝐪;𝜽).normsubscript𝜉𝑓𝐅subscript𝐪superscript~𝑈eff𝐪𝜽\mathcal{L}=||\xi_{f}(\mathbf{F})+\nabla_{\mathbf{q}}\tilde{U}^{\text{eff}}(% \mathbf{q};\bm{\theta})||.caligraphic_L = | | italic_ξ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ( bold_F ) + ∇ start_POSTSUBSCRIPT bold_q end_POSTSUBSCRIPT over~ start_ARG italic_U end_ARG start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q ; bold_italic_θ ) | | . (5)

Here, 𝐅𝐐UPI(𝐐)𝐅subscript𝐐superscript𝑈PI𝐐\mathbf{F}\equiv-\nabla_{\mathbf{Q}}U^{\text{PI}}(\mathbf{Q})bold_F ≡ - ∇ start_POSTSUBSCRIPT bold_Q end_POSTSUBSCRIPT italic_U start_POSTSUPERSCRIPT PI end_POSTSUPERSCRIPT ( bold_Q ) is the force associated with the (high-dimensional) ring polymer systems, ξfsubscript𝜉𝑓\xi_{f}italic_ξ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT is a linear operator that maps the forces from 3NP3Nsuperscript3𝑁𝑃superscript3𝑁\mathds{R}^{3NP}\to\mathds{R}^{3N}blackboard_R start_POSTSUPERSCRIPT 3 italic_N italic_P end_POSTSUPERSCRIPT → blackboard_R start_POSTSUPERSCRIPT 3 italic_N end_POSTSUPERSCRIPT, i.e., the high-dimensional ring polymer space into the low-dimensional space associated with the physical system.

II.3 Single replica coarse-graining

While several choices of the mapping operators ξ𝜉\xiitalic_ξ and ξfsubscript𝜉𝑓\xi_{f}italic_ξ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT are possible [73], not all choices are ideal for estimating generic position-dependent operators with the full accuracy of Eq. 3. To make a physically meaningful and computationally favorable choice, we note that all replicas of the ring polymer are equivalent thanks to the invariance of the trace in Eq. 2 to cyclic shifts. Hence, a thermodynamic expectation value can be estimated by averaging the position-dependent function estimated on any replica jsuperscript𝑗j^{\prime}italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT

𝒪=O(𝐪(j))β.𝒪subscriptdelimited-⟨⟩𝑂superscript𝐪superscript𝑗𝛽\displaystyle\mathcal{O}=\left\langle O(\mathbf{q}^{(j^{\prime})})\right% \rangle_{\beta}.caligraphic_O = ⟨ italic_O ( bold_q start_POSTSUPERSCRIPT ( italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT ) ⟩ start_POSTSUBSCRIPT italic_β end_POSTSUBSCRIPT . (6)

Eq. 6 suggests that the statistics of a single replica are sufficient to estimate equilibrium averages of position-dependent observables. From the perspective of developing an effective potential, it is sufficient to integrate all but one (arbitrary) replica. Hence, in this work, we coarse-grain the ring polymer to a single replica using as a configurational map the following transformation:

ξ(𝐐)=𝐪(j).𝜉𝐐superscript𝐪superscript𝑗\xi(\mathbf{Q})=\mathbf{q}^{(j^{\prime})}.italic_ξ ( bold_Q ) = bold_q start_POSTSUPERSCRIPT ( italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT . (7)

Given this configurational map, a thermodynamically consistent force map ξfsubscript𝜉𝑓\xi_{f}italic_ξ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT can be constructed with the matrix elements given by following relationship [73]:

ξfji={1,for atom i identified with replica j0,for atom i identified with replica kjcji,for all other atoms,superscriptsubscript𝜉𝑓𝑗𝑖cases1for atom 𝑖 identified with replica 𝑗0for atom 𝑖 identified with replica 𝑘𝑗subscript𝑐𝑗𝑖for all other atoms\xi_{f}^{ji}=\begin{cases}1,&\text{for atom }i\text{ identified with replica }% j\\ 0,&\text{for atom }i\text{ identified with replica }\;k\neq j\\ c_{ji}\in\mathbb{R},&\text{for all other atoms},\end{cases}italic_ξ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_j italic_i end_POSTSUPERSCRIPT = { start_ROW start_CELL 1 , end_CELL start_CELL for atom italic_i identified with replica italic_j end_CELL end_ROW start_ROW start_CELL 0 , end_CELL start_CELL for atom italic_i identified with replica italic_k ≠ italic_j end_CELL end_ROW start_ROW start_CELL italic_c start_POSTSUBSCRIPT italic_j italic_i end_POSTSUBSCRIPT ∈ blackboard_R , end_CELL start_CELL for all other atoms , end_CELL end_ROW (8)

where cjisubscript𝑐𝑗𝑖c_{ji}italic_c start_POSTSUBSCRIPT italic_j italic_i end_POSTSUBSCRIPT are arbitrary. For a ring polymer, setting cji=0subscript𝑐𝑗𝑖0c_{ji}=0italic_c start_POSTSUBSCRIPT italic_j italic_i end_POSTSUBSCRIPT = 0 leads to the following primitive estimator of force projected onto coarse-grained space,

ξf(𝐅)=U(𝐪(j))PmPβ22(2𝐪(j)𝐪(j+1)𝐪(j1)).subscript𝜉𝑓𝐅𝑈superscript𝐪superscript𝑗𝑃𝑚𝑃superscript𝛽2superscriptPlanck-constant-over-2-pi22superscript𝐪superscript𝑗superscript𝐪superscript𝑗1superscript𝐪superscript𝑗1\xi_{f}(\mathbf{F})=-\frac{\nabla U(\mathbf{q}^{(j^{\prime})})}{P}-\frac{mP}{% \beta^{2}\hbar^{2}}(2\mathbf{q}^{(j^{\prime})}-\mathbf{q}^{(j^{\prime}+1)}-% \mathbf{q}^{(j^{\prime}-1)}).italic_ξ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ( bold_F ) = - divide start_ARG ∇ italic_U ( bold_q start_POSTSUPERSCRIPT ( italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT ) end_ARG start_ARG italic_P end_ARG - divide start_ARG italic_m italic_P end_ARG start_ARG italic_β start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT roman_ℏ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ( 2 bold_q start_POSTSUPERSCRIPT ( italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT - bold_q start_POSTSUPERSCRIPT ( italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + 1 ) end_POSTSUPERSCRIPT - bold_q start_POSTSUPERSCRIPT ( italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_POSTSUPERSCRIPT ) . (9)

However, the numerical efficiency of the force-matching minimization can be further improved by optimizing cjisubscript𝑐𝑗𝑖c_{ji}italic_c start_POSTSUBSCRIPT italic_j italic_i end_POSTSUBSCRIPT to minimize the variance of the mapped force [73]. With the configurational mapping operator given by Eq. 7, expectation values of a position-dependent operator can be estimated as a simple ensemble average

𝒪=O(𝐪)β,𝒪subscriptdelimited-⟨⟩𝑂𝐪𝛽\mathcal{O}=\left\langle O(\mathbf{q})\right\rangle_{\beta},caligraphic_O = ⟨ italic_O ( bold_q ) ⟩ start_POSTSUBSCRIPT italic_β end_POSTSUBSCRIPT , (10)

akin to standard MD, albeit in the classical ensemble of the effective potential Ueff(𝐪)superscript𝑈eff𝐪{U}^{\text{eff}}(\mathbf{q})italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q ).

The choice of the mapping operator in Eq. 7 is distinct from earlier choices made in the context of coarse-graining of imaginary-time PIs. The most popular coarse-graining approach maps the ring polymer positions to its centroid. While this approach is useful for simulating dynamical properties, it doesn’t give correct equilibrium averages of generic position-dependent operators when estimated as an ensemble average similar to Eq. 10. Similarly, the PICG theory introduced in Ref. 63, 64, 65 coarse-grains the ring polymer to two replicas, which are related to the positions at which the off-diagonal elements of the Boltzmann operator are estimated. This approach gives access to generic position- and momentum-dependent operators; however, it requires derivations of bespoke (complicated) operators. On the other hand, our mapping reduces the ring polymer to the position at which the trace of the Boltzmann operator is estimated, with Ueff(𝐪)=β1log𝐪|eβH^𝐪superscript𝑈eff𝐪superscript𝛽1conditional𝐪superscript𝑒𝛽^𝐻𝐪{U}^{\text{eff}}(\mathbf{q})=-\beta^{-1}\log{\left<\mathbf{q}\right|e^{-\beta% \hat{H}}\left|\mathbf{q}\right>}italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q ) = - italic_β start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_log ⟨ bold_q | italic_e start_POSTSUPERSCRIPT - italic_β over^ start_ARG italic_H end_ARG end_POSTSUPERSCRIPT | bold_q ⟩, modulo an additive constant, giving access to position-dependent operators as simple ensemble averages.

II.4 Workflow using the PIGS method

Refer to caption
Figure 1: Procedure to train an effective NQE potential

To develop the effective potentials that incorporate NQEs, we utilize the PIGS approach, outlined in Fig. 1. First, we generate training data via PIMD simulation. The resulting ring polymer trajectories are then projected onto a coarse-grained space using a configurational and force map discussed above. Notice that the remaining replica jsuperscript𝑗j^{\prime}italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT in Eq. 7 is selected arbitrarily, and for a ring polymer with P replicas Eq. 7 defines P equivalent projections. Thus, each ring polymer configuration yields P data points for training.

The effective potential is represented in a form

Ueff(𝐪,θ)=αU(𝐪)+UML(𝐪,𝜽),superscript𝑈eff𝐪𝜃𝛼𝑈𝐪superscript𝑈ML𝐪𝜽U^{\text{eff}}(\mathbf{q},\theta)=\alpha U(\mathbf{q})+U^{\text{ML}}(\mathbf{q% },\bm{\theta}),italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( bold_q , italic_θ ) = italic_α italic_U ( bold_q ) + italic_U start_POSTSUPERSCRIPT ML end_POSTSUPERSCRIPT ( bold_q , bold_italic_θ ) , (11)

where UML(𝐪,𝜽)superscript𝑈ML𝐪𝜽U^{\text{ML}}(\mathbf{q},\bm{\theta})italic_U start_POSTSUPERSCRIPT ML end_POSTSUPERSCRIPT ( bold_q , bold_italic_θ ) is a learnable correction. The hyperparameter α𝛼\alphaitalic_α determines the contribution of the physical potential to the total energy. A common approach is to set α=1𝛼1\alpha=1italic_α = 1 so the ML network learns an effective NQEs correction to the classical potential [62, 60]. This choice is effective at high temperatures when NQEs are small. However, as temperature decreases, prior classical distribution quickly becomes more localized, and the difference between the classical and target quantum potential increases, making the correction harder to learn. However, in the harmonic limit, it is possible to make classical distribution more "quantum-like" by changing the simulation temperature. For this reason, to simplify the learning, enhance the data efficiency of the training, and the model stability, we consider α𝛼\alphaitalic_α to be temperature dependent: α=T/T0𝛼𝑇subscript𝑇0\alpha=T/T_{0}italic_α = italic_T / italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, where T<T0𝑇subscript𝑇0T<T_{0}italic_T < italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT denotes the target temperature and T0subscript𝑇0T_{0}italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is a hyperparameter. If not specified otherwise, we set T0=600subscript𝑇0600T_{0}=600italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 600 K. The learnable model parameters are then optimized with standard ML approaches. The resulting effective potential can then be used for calculations as a standard MD force field.

II.5 Applicability of the models to different thermodynamic conditions

The thermodynamically consistent CG model approximates the potential of mean force, which depends on the thermodynamic state, e.g., system temperature. Consequently, the model is strictly applicable only at the same conditions it was trained, and targeting the coarse-grained model to different conditions requires a different set of training data. However, in the case of ring polymer coarse-graining, we notice two cases where the applicability of a trained model can be extended, or training data generated for one condition can be reused for a different one. In the first scenario, the molecule predominantly populates the ground vibrational state, and the population of the excited states is negligible. In this case, the general equation for the quantum PMF (see Eq. 18) simplifies to:

Ueff=1βln|Ψ0|2,superscript𝑈eff1𝛽superscriptsubscriptΨ02U^{\text{eff}}=-\frac{1}{\beta}\ln|\Psi_{0}|^{2},italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT = - divide start_ARG 1 end_ARG start_ARG italic_β end_ARG roman_ln | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (12)

where Ψ0subscriptΨ0\Psi_{0}roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT denotes the ground-state nuclear wave function.

Consequently, the effective CG potentials at different temperatures in the ground vibrational state are related to each other as:

Ueff(β2)=β1β2Ueff(β1).superscript𝑈effsubscript𝛽2subscript𝛽1subscript𝛽2superscript𝑈effsubscript𝛽1U^{\text{eff}}(\beta_{2})=\frac{\beta_{1}}{\beta_{2}}U^{\text{eff}}(\beta_{1}).italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = divide start_ARG italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) . (13)

This relationship implies that if a model is trained at a temperature that is low enough to predominately populate the ground state, the same model can be used to run simulations and compute the quantum statistics at any lower temperature by rescaling Ueffsuperscript𝑈effU^{\text{eff}}italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT and the corresponding forces according to Eq. 13.

This simple relation breaks down when the population of the excited vibrational states is non-negligible. However, a dataset generated with expensive PIMD simulation at one temperature can be used to generate synthetic datasets corresponding to different temperatures and isotope compositions. This data recycling process involves two steps. First, we reweight the configuration to the target temperature and isotope composition, efficiently using the coordinate rescaling approaches from Refs. [75, 76]. Within the coordinate scaling approach, given the initial coordinates 𝐪𝐪\mathbf{q}bold_q, inverse temperature β1subscript𝛽1\beta_{1}italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and mass m1subscript𝑚1m_{1}italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, to model a system at inverse temperature β2subscript𝛽2\beta_{2}italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and mass m2subscript𝑚2m_{2}italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, we rescale the coordinates as

𝐪(j)=𝐪¯m2β1m1β2(𝐪¯𝐪(j)),superscriptsuperscript𝐪𝑗¯𝐪subscript𝑚2subscript𝛽1subscript𝑚1subscript𝛽2¯𝐪superscript𝐪𝑗\mathbf{q^{\prime}}^{(j)}=\bar{\mathbf{q}}-\sqrt{\frac{m_{2}\beta_{1}}{m_{1}% \beta_{2}}}(\bar{\mathbf{q}}-\mathbf{q}^{(j)}),bold_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT = over¯ start_ARG bold_q end_ARG - square-root start_ARG divide start_ARG italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG start_ARG italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG end_ARG ( over¯ start_ARG bold_q end_ARG - bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ) , (14)

where 𝐪¯=1Pj=1P𝐪(j)¯𝐪1𝑃superscriptsubscript𝑗1𝑃superscript𝐪𝑗\bar{\mathbf{q}}=\frac{1}{P}\sum_{j=1}^{P}\mathbf{q}^{(j)}over¯ start_ARG bold_q end_ARG = divide start_ARG 1 end_ARG start_ARG italic_P end_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT represents the centroid position. Then, the probability ρ(𝐪)𝜌𝐪\rho(\mathbf{q})italic_ρ ( bold_q ) of a configuration in the ensemble characterized by m2subscript𝑚2m_{2}italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and β2subscript𝛽2\beta_{2}italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is given by (see SI for the details):

ρ(𝐪;β2;m2)=ρ(𝐪;β1;m1)exp(j=1P(β1𝐪(j)β2𝐪(j)).)\rho(\mathbf{q^{\prime}};\beta_{2};m_{2})=\rho(\mathbf{q};\beta_{1};m_{1})\exp% \left(\sum_{j=1}^{P}(\beta_{1}\mathbf{q}^{(j)}-\beta_{2}\mathbf{q^{\prime}}^{(% j)}).\right)italic_ρ ( bold_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ; italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ; italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = italic_ρ ( bold_q ; italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) roman_exp ( ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT bold_q start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT - italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT bold_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ) . ) (15)

We used Eq. 15 to reweight the original dataset.

Second, we recompute the forces originating from the harmonic couplings between adjacent replicas, 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, for the target temperature and isotope compositions as:

𝐅(𝐪;m2;β2)=β12m2β22m1𝐅(𝐪;m1;β1).superscript𝐅𝐪subscript𝑚2subscript𝛽2subscriptsuperscript𝛽21subscript𝑚2subscriptsuperscript𝛽22subscript𝑚1superscript𝐅𝐪subscript𝑚1subscript𝛽1\mathbf{F}^{\prime}(\mathbf{q};m_{2};\beta_{2})=\frac{\beta^{2}_{1}m_{2}}{% \beta^{2}_{2}m_{1}}\mathbf{F}^{\prime}(\mathbf{q};m_{1};\beta_{1}).bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( bold_q ; italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ; italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = divide start_ARG italic_β start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG start_ARG italic_β start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( bold_q ; italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) . (16)

The resulting dataset is then used for model training, as described above.

III Computational details

III.1 1-D Morse potential

III.1.1 Model and training procedure

We first demonstrate the approach on a systems for which the exact effective potential can be estimated numerically. We study a particle with mass μ𝜇\muitalic_μ experiencing a one-dimensional Morse potential,

UMorse(q)=De(1ea(qq0))2,superscript𝑈Morse𝑞subscript𝐷𝑒superscript1superscript𝑒𝑎𝑞subscript𝑞02U^{\text{Morse}}(q)=D_{e}(1-e^{-a(q-q_{0})})^{2},italic_U start_POSTSUPERSCRIPT Morse end_POSTSUPERSCRIPT ( italic_q ) = italic_D start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ( 1 - italic_e start_POSTSUPERSCRIPT - italic_a ( italic_q - italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (17)

where q𝑞qitalic_q is the particle position. The parameters, q0subscript𝑞0q_{0}italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, a𝑎aitalic_a, and Desubscript𝐷𝑒D_{e}italic_D start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT, that determine the equilibrium position, shape, and depth of the potential, are chosen to reproduce the potential of the O–H bond, and μ𝜇\muitalic_μ is set to the reduced mass of the O–H bond (see SI for the numeric values).

The effective potential corresponding to the mapping in Eq. 7 can be estimated numerically by solving the Schrödinger equation for the bound states of a Morse potential [77] as

Ueff(q)=β1lnieβEi|Ψi(q)|2ieβEi,superscript𝑈eff𝑞superscript𝛽1subscript𝑖superscript𝑒𝛽subscript𝐸𝑖superscriptsubscriptΨ𝑖𝑞2subscript𝑖superscript𝑒𝛽subscript𝐸𝑖U^{\text{eff}}(q)=-\beta^{-1}\ln{}\frac{\sum_{i}e^{-\beta E_{i}}|\Psi_{i}(q)|^% {2}}{\sum_{i}e^{-\beta E_{i}}},italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( italic_q ) = - italic_β start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_ln divide start_ARG ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - italic_β italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | roman_Ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_q ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - italic_β italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG , (18)

where Eisubscript𝐸𝑖E_{i}italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and Ψi(q)subscriptΨ𝑖𝑞\Psi_{i}(q)roman_Ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_q ) are the the energy eigenvalues and wave functions associated with the i𝑖iitalic_i-th eigenstate, respectively. To estimate the effective potential using the PIGS approach, we optimize a coarse-grained potential in the following Eq. 11 with α=P1𝛼superscript𝑃1\alpha=P^{-1}italic_α = italic_P start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT:

Ueff(q,𝜽)=UMorse(q)P+UML(q,𝜽),superscript𝑈eff𝑞𝜽superscript𝑈Morse𝑞𝑃superscript𝑈ML𝑞𝜽U^{\text{eff}}(q,\bm{\theta})=\frac{U^{\text{Morse}}(q)}{P}+U^{\text{ML}}(q,% \bm{\theta}),italic_U start_POSTSUPERSCRIPT eff end_POSTSUPERSCRIPT ( italic_q , bold_italic_θ ) = divide start_ARG italic_U start_POSTSUPERSCRIPT Morse end_POSTSUPERSCRIPT ( italic_q ) end_ARG start_ARG italic_P end_ARG + italic_U start_POSTSUPERSCRIPT ML end_POSTSUPERSCRIPT ( italic_q , bold_italic_θ ) , (19)

where UML=i=1Mθiϕi(q)superscript𝑈MLsuperscriptsubscript𝑖1𝑀subscript𝜃𝑖subscriptitalic-ϕ𝑖𝑞U^{\text{ML}}=\sum_{i=1}^{M}\theta_{i}~{}\phi_{i}(q)italic_U start_POSTSUPERSCRIPT ML end_POSTSUPERSCRIPT = ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_M end_POSTSUPERSCRIPT italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_q ) is a linear combination of radial basis functions ϕi(q)subscriptitalic-ϕ𝑖𝑞\phi_{i}(q)italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_q ) (see SI for details) [78], and M𝑀Mitalic_M is the total number of basis functions (we set M=10𝑀10M=10italic_M = 10). Hence, instead of learning the full effective potential, we fit a correction to the classical potential that captures the quantum effects learned from the full-ring polymer simulations. The adjustable parameters 𝜽𝜽\bm{\theta}bold_italic_θ are obtained by minimizing Eq. 5 using ridge regression, with the L2subscript𝐿2L_{2}italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT regularization parameter set to 0.1 and 5-fold cross-validation.

III.1.2 Training data generation

The training data was generated from PIMD simulations using the i-PI code [79] and a custom Python-based force provider implementing the one-dimensional Morse potential. Six simulations were performed in an NVT ensemble for 100 - 600 K in steps of 100 K, using 64 replicas and a PILE-L thermostat with a time constant of 100100100100 fs [80] The integration was performed with a timestep of 0.50.50.50.5 fs and positions and forces acting on each bead were sampled every 20202020 steps. The first 2.52.52.52.5 ps of the simulation were discarded, and the positions and forces sampled in the subsequent 500500500500 ps were used to train the model.

III.2 Molecular systems: water molecule, Zundel cation and liquid water

III.2.1 Model and training procedure

We use Eq. 11 to model the effective potential of the single water molecule, the Zundel cation, and bulk water. We represent UMLsuperscript𝑈MLU^{\text{ML}}italic_U start_POSTSUPERSCRIPT ML end_POSTSUPERSCRIPT with the MACE graph neural network architecture [81]. One of the hallmarks of the MACE architecture is the use of many-body features to represent atomic local environments, improving the model accuracy and data efficiency.

The models were trained with the AdamW optimizer, a learning rate of 0.0010.0010.0010.001, and a weight decay of 0.0010.0010.0010.001. The MACE neural network cutoff was set to 6.06.06.06.0 Å, with 15151515 radial basis functions, 6666 polynomial, and 3333 radial channels. The correlation order for Zundel cation and bulk water was set to 3333, which corresponds to 4444-body features. For a single water molecule, 3333-body features were used. In all cases, 80808080 % of the data available were used for training, and the remaining 20202020 % were used for validation. The training was performed for 60606060 epochs for the H2O molecule and the Zundel cation and for 15151515 epochs for liquid H2O. For each system, ten different models were trained, each using different parameter initialization and train-validation splitting.

III.2.2 Training data generation

The training data was generated from PIMD simulations using the i-PI code using the Partridge and Schwenke potential for a single water molecule [82], the HBB potential [83] for the Zundel cation and the MB-pol for bulk water [84]. The simulation parameters are summarized in Table S1. We use a force map optimized to minimize the average magnitude of the mapped forces, subject to the constraints of the valid force map. This force optimization was performed according to the procedure introduced in  [73], with an L2subscript𝐿2L_{2}italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT regularization parameter set to 0.050.050.050.05.

III.2.3 Model evaluation

For each of the trained models, we performed molecular dynamics simulations, using as a force field a combination of classical potential and a trained correction as defined in Eq. 11. The simulations were performed in the NVT ensemble, with a Langevin integrator and a friction thermostat with a time constant set to 100100100100 ps-1. The time step and periodic boundary conditions were the same as in the training simulations. All the frames were used for further analysis. For each of the systems, we performed additional simulations with only the classical potential as a control to compare the classical and quantum statistics.

IV Results and Discussion

IV.1 1-D Morse potential

Refer to caption
Figure 2: Exact and machine-learned PMF for the Morse potential. Panel A: Classical Morse potential and quantum PMFs at different temperatures. Even at 600600600600K, the system exhibits significant quantum effects. Panels B-D: comparison of exact quantum PMF and machine-learned PMF at 100100100100K (B), 300300300300K (C), and 600600600600K (D). Panel E: Results for models trained at 300 K for OH and OT from reweighted simulations performed at 600600600600K. The average PMF over 5 models is reported as a dashed line, and ±plus-or-minus\pm± 3 standard deviations are reported as shaded areas

First, we study a particle in a one-dimensional Morse potential, a model for a single O–H bond. The Schrödinger equation for this system can be solved analytically [77], thus making Morse potential an ideal test system. It also exhibits significant nuclear quantum effects, as shown in Fig. 2A where the classical Morse potential (Eq. 17) is compared with the corresponding quantum PMFs (Eq. 18) at different temperatures. The classical approximation leads to the over-localization of the particle, and, as expected, the quantum effects are more prominent at low temperatures. However, for this system, even at a temperature of 600600600600K, there are significant deviations between the classical and quantum pictures.

The developed approach allows us to get a quantitative agreement with the reference, especially in the high-probability regions that are well-sampled in the training dataset (Fig. 2B-D). Further away from the minimum energy, the model’s prediction slightly deviates from the ground truth. This is to be expected, as these regions of the phase space are not as well represented in the training data as the high-probability regions.

Refer to caption
Figure 3: A: Comparison of the probability distributions of the O--H bond length (left) and the H--O--H angle (right) for classical MD (P=1), PIMD (P=64), and ML models at 100 K. Inserts show the same distributions at a different scale. B: Jensen-Shannon (JS) divergence between collective variable distributions obtained with PIMD and either classical potential (shades of blue) or with our machine-learned model (shades of brown). For each case, JS divergence for the O--H bond length distribution and the H--O--H angle at three different temperatures are shown.

We test the reweighting scheme proposed in Section II.5 by generating datasets corresponding to different temperatures (300300300300 K) and mass of the particle that is either Protium (corresponding to an O-H bond) or increased to Tritium (corresponding to an O-T bond) from the 600600600600K O-H dataset previously used. The resulting reweighted datasets were used to train new models. As shown in Fig. 2E, the models trained with the reweighted dataset are in excellent agreement with the analytical results for the corresponding temperature and isotope composition. Such data recycling allows for a further decrease in the computational cost of generating training data, though it becomes less effective for more complex systems. Nevertheless, we expect that this approach may be particularly useful for generating large datasets for training transferable models.

IV.2 Water molecule

The next test case we considered is a single water molecule in a vacuum. The quantum statistics of this 3-body system can be exhaustively described in terms of the length of two O--H bonds and the H--O--H bond angle. The comparison of the probability distributions for the O--H distances and H--O--H bond angles is shown in Fig. 3, with additional details provided in Fig. S1. As was the case for the Morse potential, the classical distribution is significantly more localized than the quantum counterpart. The ML model consistently samples the correct quantum distribution: the results obtained from 10 independently trained models produce similar results with low variance.

Refer to caption
Figure 4: The probability distribution for the Zundel cation, sampled with PIMD simulation, MD simulation, and the proposed ML model. The top row (panels A-D) represents the model trained and simulated at 100100100100K, and the bottom row (panels E-H) shows the results trained at 300300300300K. Under the plots, a visual representation of the corresponding collective variable is shown.

For all trained models, we run the MD simulation for one ns, which is four times longer than the PIMD simulation used to generate the training data. This allows us to obtain slowly converging properties by performing short PIMD simulations and using those to train a model that can be run for a much longer time.

Even at 600600600600K, the water molecule predominantly populates the ground vibrational state, with a negligible population of the excited states. Thus, as discussed in Section II.5, the model trained at 600600600600K can be used to simulate the system at any lower temperatures. To demonstrate this point further, we report in Fig. S2 the results obtained with a model trained on PIMD data generated at 600600600600K and then simulated at lower temperatures, with forces rescaled according to Eq. 13.

IV.3 Zundel cation

A more challenging case is the Zundel cation, which consists of two water molecules sharing a hydrogen-bonded proton H. This system has been extensively studied both theoretically [85, 86] and experimentally [87, 88, 89, 90, 91]. The Zundel cation exhibits large conformational fluctuations, including anharmonic and floppy motion associated with the transfer of the hydrogen-bounded proton. Following Ref. [86], we characterize these fluctuations by means of the O--O distance d(OO)𝑑OOd(\text{O}-\text{O})italic_d ( O - O ), the proton transfer coordinate δ=d(O1H)d(O2H)𝛿𝑑subscriptO1superscriptH𝑑subscriptO2superscriptH\delta=d(\text{O}_{1}\text{H}^{*})-d(\text{O}_{2}\text{H}^{*})italic_δ = italic_d ( O start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT H start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) - italic_d ( O start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT H start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ), the dihedral angle ϕitalic-ϕ\phiitalic_ϕ that measures the rotation of the H3O+ fragments around the O--H{}^{*}-start_FLOATSUPERSCRIPT ∗ end_FLOATSUPERSCRIPT -O axis, and the d(Oplane)𝑑Oplaned(\text{O}-\text{plane})italic_d ( O - plane ) that measures the nonplanarity of the H3O+ fragment. The dihedral angle ϕitalic-ϕ\phiitalic_ϕ is defined by points X1{}_{1}-start_FLOATSUBSCRIPT 1 end_FLOATSUBSCRIPT -O1{}_{1}-start_FLOATSUBSCRIPT 1 end_FLOATSUBSCRIPT -O2{}_{2}-start_FLOATSUBSCRIPT 2 end_FLOATSUBSCRIPT -X2, where X1 and X2 represent a midpoint between the hydrogen atoms that are not involved in the hydrogen bond formation and bound to O1 and O2, correspondingly. The non-planarity of H3O+ is characterized by a distance between an O atom and the plane, defined by the three closest H atoms (see the lower panel of Fig. 4).

Refer to caption
Figure 5: Radial distribution functions g(r)OO𝑔superscript𝑟𝑂𝑂g(r)^{OO}italic_g ( italic_r ) start_POSTSUPERSCRIPT italic_O italic_O end_POSTSUPERSCRIPT, g(r)OH𝑔superscript𝑟𝑂𝐻g(r)^{OH}italic_g ( italic_r ) start_POSTSUPERSCRIPT italic_O italic_H end_POSTSUPERSCRIPT, and g(r)HH𝑔superscript𝑟𝐻𝐻g(r)^{HH}italic_g ( italic_r ) start_POSTSUPERSCRIPT italic_H italic_H end_POSTSUPERSCRIPT for bulk H2O at 300300300300 K. The teal line represents the RDF obtained with a classical MD simulation (P=1), the orange line represents the PIMD result, and the dashed brown line represents the results obtained with the ML model. The result reported for the ML model is the mean over ten independent models with different train-test splits and model initialization.

At 300300300300K, the proton transfer coordinate and the rotational motion are almost classical, and only a small correction is needed to account for NQEs. The O--O distance and the O--plane distance require larger corrections, yet the NQEs are relatively small. At 100100100100K, however, significant quantum effects appear in all the collective variables, including changes in the equilibrium geometry. Nevertheless, our methods recover the accurate quantum distribution both at 300300300300K and at 100100100100K. Despite the diversity of conformational fluctuations, the model not only reproduces the target probability distributions but leads to remarkably stable simulations: all of the models were used to perform 1111 ns simulation and did not exhibit any instabilities, often plaguing ML force fields [92, 93].

IV.4 Liquid water

Lastly, we applied our approach to model liquid water to reproduce the NQEs manifesting in both bonded and nonbonded interactions. Since NQEs are highly local, the former are more affected than the latter. The quality of the machine-learned model is evaluated in Fig. 5 by computing radial distribution functions (RDFs). At 300300300300K, the first peak in the O--H (Fig. 5B) and H--H  (Fig. 5C) RDFs predicted by classical MD (P=1) are too narrow and significantly deviate from the correct quantum PIMD results. The deviations between the classical and quantum RDFs rapidly decrease with the distance.

Our neural-network-based potential is able to accurately predict the correct quantum RDFs, including the contributions that arise from both bonded and nonbonded interactions (Fig. 5), in contrast to another similar study [62]. There is also a remarkable agreement between independently trained models. The high accuracy and low uncertainty of our ML model prediction show that it reliably captures the subtle differences between classical and quantum RDFs.

Compared to the simulations for the ML models of a single water molecule and of the Zundel cation, the simulations for bulk water exhibit a decreased stability. We note that some simulations of our ML models, trained on 35353535 ps of PIMD simulations, exhibit numerical instabilities. However, the ML models allow, on average, at least 38383838 ps of simulation time before the onset of any instability. Simulations can be reinitialized and restarted to improve statistics. The numerical instability of the model can be potentially cured by increasing the size of the training dataset. Alternatively, one can modify the learning problem by simplifying the ML correction, for example, by introducing physics-informed energy terms, as it is done in the learning of classical ML coarse-grained force-fields [69, 70, 71, 72]. Given that the stability of the simulation strongly depends on the parameter α𝛼\alphaitalic_α (see Eq. 11), this approach may be especially promising.

V Conclusion

We developed an approach to coarse-grain the ring-polymer dynamics describing NQEs and we obtain results thermodynamically consistent with PIMD simulations. This approach allows us to accurately compute position-dependent NQEs at the cost of classical MD. We have demonstrated this method on four distinct systems where NQEs are prevalent — the Morse potential, a single water molecule, the Zundel cation, and bulk water - and reported results in excellent agreement with (much more expensive) reference PIMD simulations.

The proposed method builds an effective NQEs correction potential by combining the rigorous framework of multiscale coarse-graining [68] with the MACE architecture [81], an expressive state-of-the-art graph neural network potential. We have shown that these effective potentials accurately reproduce the quantum statistics of nuclei, both near the classical limit and in the regime of strong NQE. Moreover, our method eliminates the need to simulate multiple replicas. Once parametrized, the model seamlessly integrates into existing classical force fields and doesn’t require specialized algorithms other than ones already used for classical MD simulations. However, this approach is limited to reproducing quantum thermodynamics, so dynamical and momentum-dependent properties should be approximated using a previously proposed method [60].

For the results presented here, different sets of training data were used to train models for different molecular species at different temperatures. However, if the temperature associated with the training data is low enough to significantly populate only the quantum ground state, the trained model can be used to also simulate any lower temperature, as we have demonstrated in the case of the H2O molecule. The proposed method is designed such that the forces originating from the spring coupling between the adjacent replicas have a non-zero contribution to the force projected onto the coarse-grained space. As the dependence of these forces on the atomic mass and temperature is known (Eq. 9), the training dataset obtained for one isotope composition and temperature can be, in principle, reweighted to construct a dataset for different isotope compositions and temperatures. Thus, a single PIMD simulation can provide data for training several models, as we have shown in the case of the Morse potential.

The rapid proliferation of machine-learned force fields parametrized using accurate quantum chemical calculations to account for the electronic effects means the missing NQEs are now becoming a dominant source of error. At the same time, NQEs are crucial to describe biomolecular processes accurately, for instance, in the case of proton transfer reactions. While the results presented here are model-specific, building upon the recent development of machine-learned force-fields (either coarse-grained [69, 70, 71, 72], or atomistic [94, 78, 95, 96, 97, 81]), we believe this study represents a significant step forward towards the design of transferable potentials that explicitly incorporate also NQEs but retain the computational cost of classical MD.

Supplementary material

Section I of the supplementary material details the Morse potential (I.A) and ML model used for a particle in Morse potential (I.B). Section II contains the simulation parameters used to generate the training data. Section III provides the derivation of coordinate rescaling. Section IV presents extended results for a water molecule.

Acknowledgements

We thank members of the Clementi’s group at FU for insightful discussions and comments on the manuscript. We gratefully acknowledge funding from the Deutsche Forschungsgemeinschaft DFG (SFB/TRR 186, Project A12; SFB 1114, Projects B03, B08, and A04; SFB 1078, Project C7), the National Science Foundation (PHY-2019745), and the Einstein Foundation Berlin (project 0420815101), and the computing time provided on the supercomputer Lise at NHR@ZIB as part of the NHR infrastructure (project beb00040), the HPC Service of FUB-IT, Freie Universität Berlin, the HPC Service of the Physics department, Freie Universität Berlin, and the Swiss National Supercomputing Centre (under projects s1209 and s1288). V.K. acknowledges support from the Ernest Oppenheimer Early Career Fellowship and the Sydney Harvey Junior Research Fellowship, Churchill College, University of Cambridge.

Author declarations

Conflict of Interest

The authors have no conflicts to disclose.

Data availability

All the training data, code, and scripts required to reproduce the results are made available in the online repository: cg_nuclear_quantum_statistics.

References