\externaldocument

supplementary

\equalcont

These authors contributed equally to this work.

\equalcont

These authors contributed equally to this work.

[1]\fnmClark \surTempleton \equalcontThese authors contributed equally to this work.

[2,3,4,5]\fnmKlaus-Robert \surMüller

[1,6,7]\fnmCecilia \surClementi

1]\orgdivDepartment of Physics, \orgnameFreie Universität Berlin, \orgaddress\streetArnimallee 12, \postcode14195, \cityBerlin, \countryGermany

2]\orgdivMachine Learning Group, \orgnameTechnische Universität Berlin, \orgaddress\streetMarchstr. 23, \postcode10587, \cityBerlin, \countryGermany

3]\orgdivBIFOLD - Berlin Institute for the Foundations of Learning and Data, \orgaddress\countryGermany

4]\orgdivDepartment of Artificial Intelligence, Korea University, \orgnameKorea University, \orgaddress\citySeoul, \street136-713, \countrySouth-Korea

5]\orgnameMax Planck Institute for Informatics, \orgaddress\postcode66123, \citySaarbrücken, \countryGermany

6]\orgdivCenter for Theoretical Biological Physics, \orgnameRice University, \orgaddress\cityHouston, \postcode77005, \stateTX, \countryUSA

7]\orgdivDepartment of Chemistry, \orgnameRice University, \orgaddress\cityHouston, \postcode77005, \stateTX, \countryUSA

Peering inside the black box: Learning the relevance of many-body functions in Neural Network potentials

\fnmKlara \surBonneau \fnmJonas \surLederer clarktemple03@gmail.com \fnmDavid \surRosenberger klaus-robert.mueller@tu-berlin.de cecilia.clementi@fu-berlin.de [ [ [ [ [ [ [

Abstract

Machine learned potentials are becoming a popular tool to define an effective energy model for complex systems, either incorporating electronic structure effects at the atomistic resolution, or effectively renormalizing part of the atomistic degrees of freedom at a coarse-grained resolution. One of the main criticisms to machine learned potentials is that the energy inferred by the network is not as interpretable as in more traditional approaches where a simpler functional form is used. Here we address this problem by extending tools recently proposed in the nascent field of Explainable Artificial Intelligence (XAI) to coarse-grained potentials based on graph neural networks (GNN). We demonstrate the approach on three different coarse-grained systems including two fluids (methane and water) and the protein NTL9. On these examples, we show that the neural network potentials can be in practice decomposed in relevance contributions to different orders, that can be directly interpreted and provide physical insights on the systems of interest.

keywords:

Neural network potentials, molecular dynamics, coarse-graining, explainable AI

1 Introduction

Molecular simulations have emerged in the last 75 years as a valuable tool to recover or even predict interesting physical phenomena at the microscopic scale and provide a detailed mechanism for grasping the underlying molecular processes [1]. In principle, the most accurate description of a molecular system is given by the solution of the associated Schrödinger’s equation. However, it is common practice to invoke the separation of scales between electrons and nuclei (Born-Oppenheimer approximation) and define an effective energy function for the nuclei that should take into account the electronic effects [2]. Historically, this has been done empirically in the definition of classical atomistic force-fields, which have been designed, refined, and used for the study of molecular systems [3, 4]. Classical force fields assume that the energy of a molecular system can be described as a function of “bonded” terms (e.g. bonds, angles, dihedrals) and “non-bonded” pairwise potentials (e.g., Van der Waals, Coulomb) [2, 1]. All the potential energy terms are defined by fixed functional forms, with parameters tuned to reproduce experimental data and/or first principle calculations on small test systems [3, 1].

Recent advances in machine learning have triggered a step-change in the development of data-driven force fields. In particular, Artificial Neural Networks (ANNs) have been proposed to more accurately capture the electronic effects in the potential energy functions for the nuclei [5, 6]. While classical, non-bonded potential terms are generally limited to 2-body interactions, the use of ANNs, and more specifically Graph Neural Networks (GNNs) [7] defining connections between neighboring atoms, significantly increases the expressivity of the energy function and allows a flexible parameterization of many-body interactions [8, 9, 10, 11, 12, 13].

While leaps in the development of GNN-based models have shown great promise in studying complex macromolecular systems [14] and predicting material properties [15], the results and the models themselves are often seen as black boxes. In a machine learned force-field, a molecular conformation is given as input to the neural network and only the total energy and its derivatives are obtained as output. The increased model accuracy comes at the cost of insight into the nature and strength of molecular interactions: In a classical force field each term in the energy function can be dissected, but deciphering which terms in the potential energy are important for stabilizing certain physical states or interpreting a prediction is significantly more difficult in a GNN-based model.

This black box problem is not unique to molecular systems, but rather ubiquitous in the application of machine learning. As a response, the new area of “Explainable Artificial Intelligence (XAI)” has emerged to start providing tools to tackle the interpretation of deep neural networks [16]. Different approaches have been proposed to interpret the results obtained with ANNs, ranging from self-explainable architectures [17, 11, 18, 19] to post-hoc explanations [20, 21, 22, 23]. Some of those approaches are starting to find use also in physical and chemical applications, e.g., for explaining predictions regarding toxicity or mutagenicity [24, 25, 26], predictions of electronic-structure properties [27, 28], guiding strategies in drug discovery [29, 30], analyzing protein-ligand binding [31, 32, 33], or uncertainty attribution [34, 35]. XAI approaches have also been utilized to provide a better understanding of the error introduced in coarse-grained molecular models [36].

In principle, an “interpretable model” should allow a researcher to extract scientific knowledge in a successful application and to identify the sources of deficiencies/anomalies when the model fails. In this work, we use ideas from XAI and implement them in the context of machine learned molecular models for molecular dynamics simulations. In particular, we focus our attention on the understanding of coarse-grained (CG) models. In parallel to the development of atomistic force-fields, GNNs have been successfully employed in the definition of models at reduced resolutions [37, 38, 39, 40, 41], where some of the atomistic degrees of freedom are renormalized into a reduced number of effective “beads” to speed up simulation time. The difficulty in the definition of CG models lies in the fact that many-body terms play an important role, as a reduction in the number of degrees of freedom is associated with increased complexity in the effective CG energy function. It has been shown that to reproduce either experimentally measured free energy differences [42] or the thermodynamics of a finer-grained model [43, 44], many-body terms need to be included. The need for many-body terms and the difficulty in capturing them make CG models an ideal test ground for GNN interpretation.

In the present study, we train a GNN energy function at CG resolution from atomistic simulation data of several systems of different complexity and then interpret the model to provide a deeper understanding, rather than a mere energy prediction. By using explainable AI in the context of CG models, we provide evidence that machine learned force-fields are indeed learning physically significant interactions.

To interpret the energy prediction of a CG molecular model, we use the method of “Layerwise-relevance propagation” (LRP) for GNNs, which has been recently proposed to explain a model prediction by decomposing the output (in our case, the energy of the system) in terms of the contribution (or “relevance”) of groups of nodes (i.e. CG beads) [28]. In a sense, we can see the GNN-LRP as a multi-body expansion of the total energy of the system provided by the GNN.

As a first example, we compare different classes of GNN architectures to obtain CG models of bulk fluids and show that an interpretation of accurate CG models provide a meaningful physical information. Interestingly, two GNN architectures, even if different from each other, convey the same physical interpretation: at least in terms of 2-body and 3-body terms, the two networks offer different functional representations of the same underlying energy landscape.

As a second example, we examine a machine learned CG model for the protein NTL9 and show that its interpretation allows us to pinpoint the stabilizing and destabilizing interactions in the various metastable states, and even interpret the effects of mutations.

Results

1.1 Methane and Water

We start by analyzing and comparing CG models for bulk methane (CH₄) and water (H₂O). Methane has been previously studied with various coarse-graining methods since its non-polar and weak Van-der-Waals interactions make it a simple test system [37]. On the other hand, water is capable of forming complex hydrogen bonding structures and much research has been devoted to its modelling, both at the atomistic scale [45] and on the CG level [46, 47, 48, 49, 38].

For both systems, we define CG models by integrating out the hydrogen atoms and positioning an effective CG bead in place of the central carbon or oxygen atom. For each system, we train two CG models with different choices of GNN architectures, PaiNN [50] and SO3Net [51], for the definition of the CG effective energy. Both architectures are trained using the force-matching variational principle for coarse-graining, to create a thermodynamically consistent CG model from the atomistic data [52, 53, 47, 54, 39]. A brief discussion of the specific features of these GNNs is provided in the Methods section.

To interpret the network predictions, we use the method of “Layerwise-relevance propagation” (LRP) for GNN, called GNN-LRP [28]. The idea of GNN-LRP is schematically described in Fig. 1 and

Refer to caption — Figure 1: Concept of GNN-LRP illustrated for a system of four particles (i.e. CG beads, in the present context). a) In GNNs, the input graph is defined by a cutoff radius that determines the direct neighbors for each input node. By stacking several message aggregations in multiple layers, information can be exchanged between more distant nodes (outside of the cutoff region). The model output is obtained by passing the learned feature representations through a multilayer perceptron and final pooling. Obtaining the relevance involves propagating the output back through the network, by considering the connections between each node in one layer and the nodes in the previous layer. This procedure defines “walks” across the network layers. b) The walks involving the same subset of $n$ nodes are aggregated to obtain a decomposition of the output into $n$ -body contributions.

a more detailed description is provided in the Methods section as well as in the Supplementary Section S5. Essentially, GNN-LRP provides us with a “relevance score” for each subset of $n$ CG beads (with $n=1,\dots,N_{l}+1$ , where $N_{l}$ is the number of layers in the GNN) in each configuration of the system, indicating the contribution of their interaction to the total energy.

The ability of the two CG models to reproduce structural features of the two systems is shown in Figs. 2-3. In particular, the radial distribution function (RDF) as obtained in the CG models is shown in Fig. 2 against the atomistic reference model for methane (right column) and water (left column). Note that the SO3Net model exhibits irreducible representations up to an angular momentum of $l_{\mathrm{max}}=2$ , while PaiNN utilizes a maximum angular momentum of $l_{\mathrm{max}}=1$ . As a consequence, in comparison to SO3Net, PaiNN requires a larger cutoff to accurately reproduce the RDF of water. For a comparison between PaiNN models with different cutoff radii, please refer to Supplementary Fig. S2.

For methane, the smooth oscillatory behavior of the RDF is similar to a Lennard-Jones (LJ) fluid [55], suggesting that many-body interactions may not be very relevant in a CG model of this molecule. In contrast, for water, the height of the first solvation shell is more sharply peaked and decays more rapidly than in the case of methane. For both systems, both architectures reproduce the corresponding RDF.

In Fig. 2, the average relevance score for the 2-body contributions is plotted (red curves) as a function of their distance, alongside the RDF for each model. Since the relevance score corresponds essentially to a decomposition of the output energy, a positive (negative) relevance score implies an increase (decrease) in the energy, thus a destabilizing (stabilizing) effect of the associated interaction.

For both methane and water, both PaiNN and SO3Net show a 2-body relevance score that diverges as the distance between two beads goes to zero. This observation matches our intuition that at distances below a certain “effective radius”, the network should learn a repulsive excluded-volume interaction to avoid the overlapping of the CG beads. For all models, the relevance score decays to zero as the distance between two atoms approaches the cutoff value considering two atoms connected in the corresponding GNN. This corresponds to the intuition that the interactions between two atoms become weaker as the atoms move further apart, and is enforced by the cosine shape of the cutoff function.

Differences between the two systems emerge at intermediate distances. Fig. 2b) and d) shows that, for methane, the relevance score is mostly positive. The PaiNN model displays a slight stabilizing well around the first peak of the RDF, that is washed out in the SO3Net model due to the proximity between the network cutoff and the location of the first solvation shell. On the other hand, the relevance score for both water models (Fig. 2 a) and c)) dips significantly below 0 at $\sim$3\text{\,}\mathrm{\SIUnitSymbolAngstrom}$$ , indicating a stabilizing interaction between molecules at a distance corresponding to the first solvation shell.

Fig. 3 examines the angle distributions between triplets of interacting atoms within the first solvation shell, and reports the average 3-body relevance score as a function of this angle (as red curves). As illustrated in Fig. 1, the number of steps in the “walks” of connections between the nodes (CG beads) from one layer to the previous one in the GNN is determined by the number of message passing interaction blocks of the ML model (see Methods section for details). Here, we use models with three interaction blocks in both PaiNN and SO3Net models, thus we obtain walks of four steps in the input graph (with corresponding relevance attribution $R_{ijkl}$ ). As the architecture of both models include skip connections in their interaction blocks, a given graph node in a network layer may be connected to the same node in the previous layer. That is, a node may appear more than once in a single network walk. We indicate the case where a node is connected to itself in two subsequent layers as a “self loop” (see Fig. 1). Since features involved in self loops are expected to dominate the respective walk relevance, here, we distinguish between 3-body walks containing a self-loop (eg. connecting nodes [i, i, j, k] in Fig. 1) and walks containing no self-loops (eg. connecting nodes [i, j, k, i] in Fig. 1).

For methane, both PaiNN and SO3Net can reproduce the local atomistic angle distribution and they both have an associated relevance score very close to zero that does not particularly favor any angular configuration, indicating that the 3-body terms are not very important for the CG methane. For water, both models produce a similar behavior for the averaged 3-body relevance (shown in red). The walks without self-loop are stabilizing and favor angles around $50$ - $60$ degrees, corresponding to the population of water molecules sitting interstitially inside the tetrahedral arrangement [56, 57]. The walks containing self-loops are mostly destabilizing and likely correct for an overstructuring of the 2-body interactions. Indeed, in Supplementary Fig. S5, one can see that the relevance from the walks with self-loops varies much more as a function of the distance than the angle, whereas the relevance of walks without self-loops depends strongly on the angles and is not merely dominated by pairwise distances. Having 3-body interactions correct for 2-body interactions is a known effect when parametrizing explicit $n$ -body functions for constructing CG models [48, 49, 58]. In both models presented here, 3-body terms are crucial to recover structural properties of CG water.

To further support this point, a 2-body-only model is shown in the Supplementary Section S3, where the Inverse Monte Carlo (IMC) method is used for parametrizing a pair potential on the system’s RDF. Supplementary Fig. S1 shows that, for methane, the IMC model is capable of reproducing the correct distributions of the relevant features, whereas for water, this 2-body-only model fails at reproducing the angular distribution. Additionally, we also observe that a non-equivariant GNN such as SchNet performs well on methane, but fails at fully reproducing the atomistic distributions for water (see Supplementary Section S3).

It’s worth noting that while the average relevance is shown in Fig. 2 and Fig. 3 as a function of distances and angles, the individual relevance scores cover a broad range. Supplementary Figs. S3 and S4 show the entire distribution of 2-body relevance values over all the configurations of the water models. High (destabilizing) relevance values correspond to short distances between beads and low (stabilizing) relevance values to distances corresponding to the first peak of the RDF. A relevance score of almost zero corresponds to distances approaching the network cutoff.

1.2 NTL9

Finally, we show the GNN-LRP interpretation in a coarse-grained protein model, specifically that of 39 residue NTL9 (PDB ID: 2HBA, residues 1-39) which has been a system of interest in a recent study by Majewski et. al. [40]. We use the PaiNN architecture introduced in the previous section to learn a CG model of NTL9 from the same atomistic reference data as used in the previous study [40], following the procedure introduced by Husic et. al. [39]. More details on the training procedure can be found in the Supplementary Section S2. In this study, only $C_{\alpha}$ atoms are kept in the CG resolution and each amino-acid type is represented by a unique bead embedding. We selected this protein due to its well-characterized folded/unfolded state and non-trivial folding pathways. Comparison of the Free Energy surface (FES) projected onto the first two TICA components (collective variables capturing the slow motions of the system) [59] is shown in Fig 4.

The folded state of NTL9 contains 3 $\beta$ sheets that are formed by the residues along the C- and N-terminal regions as well as a central $\alpha$ helix (shown on the bottom left in Fig. 4). The stability of this short fragment of the N-terminal domain of the Ribosomal Protein L9 is likely due to the strong hydrophobic core between the $\beta$ -sheets and the $\alpha$ -helix [60, 61]. To test whether the model is learning these interactions, we compute the relevance score for 2- and 3-body interactions in the trained model, for structures taken from different metastable states. The contact map in Fig. 5a) shows the 2-body relevance scores for each pair of amino-acids in the folded (upper right) and unfolded (lower left) states. Interestingly, the 2-body interactions stabilizing the folded state correspond to contacts associated with the main secondary structure elements, while in the unfolded state both stabilizing and destabilizing interactions are found also outside of the secondary structure. The strongest 2- and 3-body interactions inside the folded state are shown in Fig. 6. The strongest 2-body contributions are found in the $\beta_{13}$ sheet, most of which are stabilizing interactions. Notably, the VAL3-GLU38 interaction is destabilizing, which indicates that the CG model learns the side-chain specific interaction between the charged Glutamate and the hydrophobic Valine. The strongest 3-body interactions in the folded state, Fig. 6 panel (b), stabilize the helix and the tertiary structure of the protein.

NTL9 can fold by two different pathways, which appear as two distinct ”branches” in the free energy landscapes in Fig. 4. We examine the differences in relevance patterns between the two pathways connecting the folded to unfolded states. In panel b) of Fig. 5 we show the difference in relevance for both intermediate states relative to the folded state. Here, a positive difference means that the interaction has a lower relevance in the folded state than in the intermediate state and thus that this interaction is less stable in the intermediate state. For the state indicated as P1 in Fig. 4, many destabilizing interactions are located inside the $\beta_{12}$ sheet, whereas the difference to the folded state is essentially null in the $\beta_{13}$ sheet and in the $\alpha$ -helix. This indicates that P1 corresponds to an intermediate state where the $\beta_{13}$ sheet and the $\alpha$ -helix are native-like but the $\beta_{12}$ is less stable than in the native state. Indeed, analysis of the structures in P1 reveal that $\beta_{12}$ is register-shifted. The P2 state shows the opposite behavior with destabilizing interactions in the $\beta_{13}$ sheet and in the $\alpha$ -helix, indicating that in the P2 state, only the $\beta_{12}$ sheet is correctly formed. These two folding pathways with the same characteristics are also found in previous computational studies of NTL9 [63, 64, 65]. Note that if the relevance attribution in the folded state shows that the network captures the interaction decay with the distance between residues, the relevance attribution in a given state provides more information than contained merely in the contact maps for these states, as can be seen by comparing the relevance attribution of both intermediate states to their distance contact maps shown in Supplementary Fig. S6.

The interpretation of the learned interactions can be pushed a step further by considering the effects of mutations on the relevance analysis. We consider mutations of residues deemed stabilizing in the folded state. In the machine learned C_α CG model of the protein employed here, one can straightforwardly perform a mutation by changing the aminoacid identity, that is by changing the embedding of the corresponding C_α bead. We select two mutations to illustrate the ability of the CG model to learn specific interactions such as hydrophobic/hydrophilic interactions between side-chains and side-chain specific packing. In particular, the mutation ILE4ASN is chosen to disrupt the hydrophobic interaction of the $\beta$ -sheets, and the mutation LEU30PHE is chosen to disrupt the central $\alpha$ -helix as well as the tight packing between the $\alpha$ -helix and the $\beta$ -sheets. These residues are flagged as important in the analysis above and have been shown in mutation experiments to play a role in the stabilization of the folded state [61, 66].

In Fig. 7a) we show the 2-body relevance difference between the mutated states (LEU30PHE on the upper right and ILE4ASN on the lower left half) and the wild-type folded state. Replacing the identities of hydrophobic Isoleucine by polar Asparagine of about the same size at position 4 introduces a strong destabilization of all contacts with hydrophobic residues in the neighboring sheets, as it is also visualized in panel c). In the crystal structure of NTL9, LEU30 is tightly packed between the $\alpha$ -helix and the $\beta$ -sheets and mutation studies suggest that even a small change in side-chain size has a destabilizing effect [61]. Indeed, replacing the identity of Leucine 30 by the bigger Phenylalanine in our CG model induces a destabilization of the entire $\alpha$ -helix (see Fig. 7, panels a) and b)). Interestingly, the disruption also has an effect on contacts inside the $\beta_{13}$ sheet that do not directly involve the mutated residue, indicating that the model has indeed learned non-trivial many-body interactions. These findings further corroborate the ability of the CG model to learn amino-acid specific interactions in a $C_{\alpha}$ -only representation.

Discussion

In this work, we propose an extension of GNN-LRP to interpret the effective energy of machine-learned CG protein models in terms of a multi-body decomposition. We have shown on the application to CG fluids that the learned interactions are physically meaningful and consistent even if different ML architectures are used. The explanations provided by GNN-LRP indicate when multi-body interactions are required to recover the thermodynamics of the fine-grained system, showing the suitability of this higher-order explanation method to ML CG force-fields. Moreover, the multi-body relevance contributions show that the different ML models have learned similar physically relevant interatomic interactions, indicating that these models effectively learn the same underlying potential energy surface. The application of this idea on ML CG protein models allows to disentangle the strength of the different interactions between residues. It can also be used to evaluate the effect of mutations in the model and shows that bottom-up $C_{\alpha}$ -based ML CG models capture amino-acid specific interactions without explicit representation of the side-chains.

These results provide reassurance that bottom-up machine learned CG force-fields indeed learn physically relevant terms by approximating the many-body potential of mean-force associated with the integration of part the degrees of freedom [47]. We note that the methods introduced are model agnostic and can be used in general to interpret machine learned potentials of different systems at different resolutions.

Future work is still needed to refine these concepts to provide greater insight into ML models and allow researchers to be more systematic in their choice of architecture and functionalization. It is our hope that this work helps lay the groundwork for researchers to better understand the outputs of their models as well as give the coarse-graining community a way to probe these learned many-body effects more explicitly.

Methods

Machine learned CG Potentials

Graph Neural Networks (GNNs) have been proposed as a promising method to learn interatomic potentials [10], and many different model architectures have been developed in recent years [67]. GNNs represent the underlying atomic details using structured graph data where nodes represent atoms and use the idea that locality dominates the energy landscape to draw edges between nodes if two nodes are within a pre-defined cutoff distance. In GNNs there are generally three steps that go into producing the network output based on the input positions and node identities: (i) a message-passing step, where neighboring nodes (connected by edges) exchange information about their respective feature values, (ii) an update step, where the node features are modified based on the received messages, and (iii) a final readout step, where the features of each node are used to predict the target property [10]. Once the node features are fed through the readout layer the network can make use of backpropagation to extract the force by taking derivatives of the energy based on molecule positions that can then be used to train during force matching or to propagate the dynamics.

In this manuscript, we examine two different GNN architectures for the effective CG energy: PaiNN [50] and SO3Net [51], two equivariant message-passing architectures that mainly differ on the order of their SO3-equivariant features ( $l_{\mathrm{max}}=1$ for PaiNN and $l_{\mathrm{max}}=2$ for SO3Net). Both preserve the basic euclidean symmetries of the system, notably those of translation, rotation and reflection. Both architectures are parametrized to reproduce the CG potential of mean force using the force-matching approach [52, 53, 47, 54, 39]. As a comparison to more classical methods, Inverse Monte Carlo (IMC) [68, 69] is also performed using the votca library [70] with results shown in the Supplementary Section S3.

Layerwise-relevance propagation

Layerwise-relevance propagation (LRP) has emerged as a method to explain model predictions in a post-hoc and model agnostic manner [71, 21, 72, 73]. Originally, LRP has been used to obtain first-order explanations in the form of relevance attributions (also referred to as relevance scores) in the input domain. E.g., for image classification tasks, the relevance attributions would indicate to what extent a respective pixel is responsible for the network decision [21, 74]. The relevance attributions can be visualized in the input domain in form of a heat map, where large relevance attributions highlight features of the classified object that predominantly lead to the respective decision of the neural network [21]. Recently, efforts have been made to adapt LRP and other explanation methods to regression tasks [27, 75, 76], such as, e.g., the prediction of atomization energies [27, 28].

Pixel-wise relevance attributions have contributed enormously to a better understanding of the inner workings of ANNs. However, in some cases, restricting explanations to first-order (input features) may result in oversimplified explanations. Especially for problems where the interaction between several input nodes is considerably strong, the relevance information of higher-order features becomes increasingly important. This is the case for the coarse-grained systems considered in this study, where multi-body interactions are essential [77, 43, 44, 42]. To this end, a variety of higher-order explanation frameworks have been introduced [78, 79, 28, 80, 81, 82, 83, 84, 85]. One of those methods is GNN-LRP, which extends LRP to higher-order explanations for GNNs [28, 80]. In the following, we will give a brief summary of the methodology of GNN-LRP, first describing first-order relevance propagation and then extending it to higher-order. For an in-depth introduction, please refer to [71, 28].

In general, relevance attributions are obtained by propagating relevance from the model output back to the input features, with the relevance being a conserved quantity much the same as the mass flow through a pipe or electrical current through a junction. This means that the sum relevance attribution of a layer in the network remains the same as it is propagated backwards through the network. The relevance propagation from the neurons $\{j\}$ to a lower-layer neuron $i$ reads

R_{i}^{l}=\sum_{j}\frac{q_{ij}}{\sum_{i}q_{ij}}\cdot R_{j}^{l+1}~{},

(1)

where $q_{ij}$ quantifies the contribution of neuron $i$ to the activation of neuron $j$ . This propagation approach can be applied subsequently from the output neuron until all relevance is attributed to the input neurons. There are multiple ways to define $q_{ij}$ associated with different propagation rules (see also [71]). According to the deep Taylor decomposition [72], LRP is equivalent to a decomposition of the neural network into several first-order Taylor expansions. In this picture, the different LRP rules implicitly help find suitable root points for the respective Taylor expansions [72].

In contrast to the first-order node-wise relevance attributions, GNN-LRP explains the network prediction based on so-called relevance “walks”. Each walk is a collection of connected edges and nodes of the input graph. The length of the walks is dictated by the number of interaction layers (message passes) in the model. Hence, e.g., a model with two interaction layers would allow for up to 3-body walks. The concept of GNN-LRP is illustrated in Fig. 1a) for a model with two interaction layers applied to an exemplary system composed of four particles (e.g., CG beads). The top part of the figure shows the forward pass of a common GNN starting from the feature embedding on the input graph until the model output, while the bottom figure illustrates how the walks with their associated relevance attributions are constructed. Note that the connections for message passing in the GNN are defined by the graph structure. Depending on the number of interaction layers and the size of the node features, each graph node will have multiple corresponding neurons in the GNN. We distinguish between graph nodes ( $i,j,k,...$ ) and neurons ( $a,b,c,...$ ) by using subscript and superscript, respectively. Similar to Eq. 1, the relevance of a walk $\mathcal{W}=[j,k,l]$ is obtained by propagating back the relevance from upper-layer neurons

R_{jkl}^{b}=\sum_{c}\frac{\lambda_{jk}Q_{jk}^{bc}}{\sum_{b,j}\lambda_{jk}Q_{jk% }^{bc}}R_{kl}^{c}

(2)

where $\lambda_{ij}$ denotes the edge feature between graph nodes $i$ and $j$ , and $Q_{jk}^{bc}$ is the contribution of node $j$ with its respective neuron $b$ to neuron $c$ with the associated node $k$ . Depending on the propagation rule, again, $Q_{jk}^{bc}$ takes different forms. For more information on how $Q_{jk}^{bc}$ is obtained considering the propagation rules used in this work, please refer to the Supplementary Section S5. Note that Eq. 2 yields the walk relevance for a specific neuron $b$ . The total relevance of the respective walk is given by summing over all corresponding neurons:

R_{jkl}=\sum_{b}R_{jkl}^{b}~{}.

(3)

The relevance of each walk can be seen as a contribution to the model output resulting from interactions between the subset of particles in the graph corresponding to the respective walk. More precisely, the relevance of a walk is associated with a sequence of interactions between pairs of particles or a single bead with itself. However, such a sequence of interactions is rather abstract and cannot be associated with a physical quantity in a meaningful way. In order to obtain a quantity that we can interpret as a multi-body decomposition of the output, we aggregate a collection of walks to so-called $n$ -body contributions, as depicted in Fig. 1b). As suggested by Schnake et. al. [28], this can be achieved by summing up the relevance of all walks $\left\{\mathcal{W}\right\}$ that are part of a certain graph substructure $\mathcal{S}$ . Hence, the relevance of a substructure $\mathcal{S}$ is given as

\displaystyle R_{\mathcal{S}}=\sum_{\mathcal{W}\in\mathcal{S}}R_{\mathcal{W}}~% {}.

(4)

In this way, for the example of Fig. 1, we obtain 2-body and 3-body relevance attributions by summing over the relevance attributions associated to the walks which are part of the respective 2-body or 3-body substructure. In the case of energy predictions, this yields the energy contribution of the respective $n$ -body substructures.

Acknowledgements

We gratefully acknowledge funding from the Deutsche Forschungsgemeinschaft DFG (SFB/TRR 186, Project A12; SFB 1114, Projects B03, B08, and A04; SFB 1078, Project C7), the National Science Foundation (PHY-2019745), and the Einstein Foundation Berlin (Project 0420815101), and the computing time provided on the supercomputer Lise at NHR@ZIB as part of the NHR infrastructure. K.R.M. was in part supported by the German Ministry for Education and Research (BMBF) under Grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A, 031L0207D, and 01IS18037A, and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korean government (MSIT) No. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University and No. 2022-0-00984, Development of Artificial Intelligence Technology for Personalized Plug-and-Play Explanation and Verification of Explanation.

Data Availability

Simulation data and the code to reproduce the analysis and the plots shown in the manuscript are accessible at https://box.fu-berlin.de/apps/files/?dir=/peering_inside_the_black_box_supplementary_data_and_code&fileid=555049372

References

[1] Best, R. B. Atomistic force fields for proteins. In Bonomi, M. & Camilloni, C. (eds.) Biomolecular Simulations: Methods and Protocols, 3–19 (Springer, 2019).
[2] Monticelli, L. & Tieleman, D. P. Force fields for classical molecular dynamics. In Monticelli, L. & Salonen, E. (eds.) Biomolecular Simulations: Methods and Protocols, 197–213 (Humana Press, 2013).
[3] Lindorff-Larsen, K. et al. Systematic validation of protein force fields against experimental data. PloS one 7, e32131 (2012).
[4] Robustelli, P., Piana, S. & Shaw, D. E. Developing a molecular dynamics force field for both folded and disordered protein states. Proc. Natl. Acad. Sci. USA 115, E4758–E4766 (2018).
[5] Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
[6] Noé, F., Tkatchenko, A., Müller, K.-R. & Clementi, C. Machine learning for molecular simulation. Annu. Rev. Phys. Chem. 71, 361–390 (2020).
[7] Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M. & Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. Learn. Syst. 20, 61–80 (2009).
[8] Duvenaud, D. K. et al. Convolutional networks on graphs for learning molecular fingerprints. In Adv. Neural Inf. Process. Syst., vol. 28 (Curran Associates, Inc., 2015).
[9] Kearnes, S., McCloskey, K., Berndl, M., Pande, V. & Riley, P. Molecular graph convolutions: moving beyond fingerprints. J. Comput. Aided Mol. Des. 30, 595–608 (2016).
[10] Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Int. Conf. Mach. Learn., 1263–1272 (PMLR, 2017).
[11] Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R. & Tkatchenko, A. Quantum-chemical insights from deep tensor neural networks. Nat. Commun. 8, 13890 (2017).
[12] Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet - a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
[13] Unke, O. T. & Meuwly, M. PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
[14] Unke, O. T. et al. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci. Adv. 10, eadn4397 (2024).
[15] Ceriotti, M. Beyond potentials: Integrated machine learning models for materials. Mrs Bulletin 47, 1045–1053 (2022).
[16] Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J. & Müller, K.-R. Explaining deep neural networks and beyond: A review of methods and applications. Proc. IEEE 109, 247–278 (2021).
[17] Schütt, K. T., Gastegger, M., Tkatchenko, A. & Müller, K.-R. Quantum-Chemical Insights from Interpretable Atomistic Neural Networks, 311–330 (Springer International Publishing, Cham, 2019).
[18] Gallegos, M., Vassilev-Galindo, V., Poltavsky, I., Martín Pendás, Á. & Tkatchenko, A. Explainable chemical artificial intelligence from accurate machine learning of real-space chemical descriptors. Nat. Commun. 15, 4345 (2024).
[19] Zhang, Z., Liu, Q., Wang, H., Lu, C. & Lee, C. ProtGNN: Towards self-explaining graph neural networks. In Proc. AAAI Conf. Artif. Intell., vol. 36, 9127–9135 (2022).
[20] Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Guyon, I. et al. (eds.) Adv. Neural Inf. Process. Syst., vol. 30 (Curran Associates, Inc., 2017).
[21] Bach, S. et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one 10, e0130140 (2015).
[22] Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Int. Conf. Mach. Learn., 3319–3328 (PMLR, 2017).
[23] Ribeiro, M. T., Singh, S. & Guestrin, C. ”Why should I trust you?” explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 1135–1144 (2016).
[24] Debnath, A. K., Lopez de Compadre, R. L., Debnath, G., Shusterman, A. J. & Hansch, C. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity. J. Med. Chem. 34, 786–797 (1991).
[25] Kazius, J., McGuire, R. & Bursi, R. Derivation and validation of toxicophores for mutagenicity prediction. J. Med. Chem. 48, 312–320 (2005).
[26] Baehrens, D. et al. How to explain individual classification decisions. J. Mach. Learn. Res. 11, 1803–1831 (2010).
[27] Letzgus, S. et al. Toward explainable artificial intelligence for regression models: A methodological perspective. IEEE Signal Processing Magazine 39, 40–58 (2022).
[28] Schnake, T. et al. Higher-order explanations of graph neural networks via relevant walks. IEEE Trans Pattern Anal Mach Intell 44, 7581–7596 (2022).
[29] Jiménez-Luna, J., Grisoni, F. & Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2, 573–584 (2020).
[30] Wellawatte, G. P., Gandhi, H. A., Seshadri, A. & White, A. D. A perspective on explanations of molecular prediction models. J. Chem. Theory Comput. 19, 2149–2160 (2023).
[31] McCloskey, K., Taly, A., Monti, F., Brenner, M. P. & Colwell, L. J. Using attribution to decode binding mechanism in neural network models for chemistry. Proc. Natl. Acad. Sci. USA 116, 11624–11629 (2019).
[32] Mahmoud, A. H., Masters, M. R., Yang, Y. & Lill, M. A. Elucidating the multiple roles of hydration for accurate protein-ligand binding prediction via deep learning. Commun. Chem. 3, 1–13 (2020).
[33] Hochuli, J., Helbling, A., Skaist, T., Ragoza, M. & Koes, D. R. Visualizing convolutional neural network protein-ligand scoring. J. Mol. Graph. Model. 84, 96–108 (2018).
[34] Yang, C.-I. & Li, Y.-P. Explainable uncertainty quantifications for deep learning-based molecular property prediction. J. Cheminform. 15, 13 (2023).
[35] Roy, S., Dürholt, J. P., Asche, T. S., Zipoli, F. & Gómez-Bombarelli, R. Learning a reactive potential for silica-water through uncertainty attribution (2023). Preprint at https://arxiv.org/abs/2307.01705.
[36] Durumeric, A. E. & Voth, G. A. Using classifiers to understand coarse-grained models and their fidelity with the underlying all-atom systems. J. Chem. Phys. 158 (2023).
[37] Jiang, C., Ouyang, J., Wang, L., Liu, Q. & Li, W. Coarse graining of the fully atomic methane models to monatomic isotropic models using relative entropy minimization. J. Mol. Liq. 242, 1138–1147 (2017).
[38] Zhang, L., Han, J., Wang, H., Car, R. & E, W. DeePCG: Constructing coarse-grained models via deep neural networks. J. Chem. Phys. 149, 034101 (2018).
[39] Husic, B. E. et al. Coarse graining molecular dynamics with graph neural networks. J. Chem. Phys. 153 (2020).
[40] Majewski, M. et al. Machine learning coarse-grained potentials of protein thermodynamics. Nat. Commun. 14, 5739 (2023).
[41] Charron, N. E. et al. Navigating protein landscapes with a machine-learned transferable coarse-grained model (2023). Preprint at https://arxiv.org/abs/2310.18278.
[42] Zaporozhets, I. & Clementi, C. Multibody terms in protein coarse-grained models: A top-down perspective. J. Phys. Chem. B 127, 6920–6927 (2023).
[43] Larini, L. & Shea, J.-E. Coarse-grained modeling of simple molecules at different resolutions in the absence of good sampling. J. Phys. Chem. B 116, 8337–8349 (2012).
[44] Wang, J. et al. Multi-body effects in a coarse-grained protein force field. J. Chem. Phys. 154 (2021).
[45] Ouyang, J. F. & Bettens, R. P. A. Modelling water: A lifetime enigma. CHIMIA 69, 104–104 (2015).
[46] Molinero, V. & Moore, E. B. Water modeled as an intermediate element between carbon and silicon. J. Phys. Chem. B 113, 4008–4016 (2009).
[47] Noid, W. G. et al. The multiscale coarse-graining method. I. A rigorous bridge between atomistic and coarse-grained models. J. Chem. Phys. 128, 244114 (2008).
[48] Larini, L., Lu, L. & Voth, G. A. The multiscale coarse-graining method. VI. implementation of three-body coarse-grained potentials. J. Chem. Phys. 132 (2010).
[49] Das, A. & Andersen, H. C. The multiscale coarse-graining method. IX. a general method for construction of three body coarse-grained force fields. J. Chem. Phys. 136, 194114 (2012).
[50] Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In Int. Conf. Mach. Learn., 9377–9388 (PMLR, 2021).
[51] Schütt, K. T., Hessmann, S. S., Gebauer, N. W., Lederer, J. & Gastegger, M. Schnetpack 2.0: A neural network toolbox for atomistic machine learning. J. Chem. Phys. 158 (2023).
[52] Ercolessi, F. & Adams, J. B. Interatomic potentials from first-principles calculations: The force-matching method. EPL 26, 583 (1994).
[53] Izvekov, S. & Voth, G. A. A multiscale coarse-graining method for biomolecular systems. J. Phys. Chem. B 109, 2469–2473 (2005).
[54] Wang, J. et al. Machine learning of coarse-grained molecular dynamics force fields. ACS Cent. Sci. 5, 755–767 (2019).
[55] McQuarrie, D. A. Statistical Mechanics. Harper’s chemistry series (Harper Collins, New York, 1976).
[56] Stock, P. et al. Unraveling hydrophobic interactions at the molecular scale using force spectroscopy and molecular dynamics simulations. ACS Nano 11, 2586–2597 (2017).
[57] Monroe, J. I. & Shell, M. S. Decoding signatures of structure, bulk thermodynamics, and solvation in three-body angle distributions of rigid water models. J. Chem. Phys. 151, 094501 (2019).
[58] Scherer, C. & Andrienko, D. Understanding three-body contributions to coarse-grained force fields. Phys. Chem. Chem. Phys. 20, 22387–22394 (2018).
[59] Pérez-Hernández, G., Paul, F., Giorgino, T., De Fabritiis, G. & Noé, F. Identification of slow molecular order parameters for markov model construction. J. Chem. Phys. 139, 015102 (2013).
[60] Horng, J.-C., Moroz, V., Rigotti, D. J., Fairman, R. & Raleigh, D. P. Characterization of large peptide fragments derived from the N-terminal domain of the ribosomal protein L9: Definition of the minimum folding motif and characterization of local electrostatic interactions. Biochemistry 41, 13360–13369 (2002).
[61] Anil, B., Sato, S., Cho, J.-H. & Raleigh, D. P. Fine structure analysis of a protein folding transition state; distinguishing between hydrophobic stabilization and specific packing. J. Mol. Biol. 354, 693–705 (2005).
[62] Schrödinger, L. & DeLano, W. PyMOL v2.5.0. URL http://www.pymol.org/pymol.
[63] Snow, C. D., Rhee, Y. M. & Pande, V. S. Kinetic definition of protein folding transition state ensembles and reaction coordinates. Biophys. J. 91, 14–24 (2006).
[64] Voelz, V. A., Bowman, G. R., Beauchamp, K. & Pande, V. S. Molecular simulation of ab initio protein folding for a millisecond folder NTL9 (1-39). J. Am. Chem. Soc. 132, 1526–1528 (2010).
[65] Lindorff-Larsen, K., Piana, S., Dror, R. O. & Shaw, D. E. How fast-folding proteins fold. Science 334, 517–520 (2011).
[66] Sato, S., Cho, J.-H., Peran, I., Soydaner-Azeloglu, R. G. & Raleigh, D. P. The N-terminal domain of ribosomal protein L9 folds via a diffuse and delocalized transition state. Biophys. J. 112, 1797–1806 (2017).
[67] Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
[68] Lyubartsev, A. P. & Laaksonen, A. Calculation of effective interaction potentials from radial distribution functions: A reverse monte carlo approach. Phys. Rev. E 52, 3730–3737 (1995).
[69] Lyubartsev, A., Mirzoev, A., Chen, L. & Laaksonen, A. Systematic coarse-graining of molecular models by the newton inversion method. Faraday Discuss. 144, 43–56 (2009).
[70] Ruhle, V., Junghans, C., Lukyanov, A., Kremer, K. & Andrienko, D. Versatile object-oriented toolkit for coarse-graining applications. J. Chem. Theory Comput. 5, 3211–3223 (2009).
[71] Montavon, G., Binder, A., Lapuschkin, S., Samek, W. & Müller, K.-R. Layer-wise relevance propagation: An overview. In Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K. & Müller, K.-R. (eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, 193–209 (Springer International Publishing, 2019).
[72] Montavon, G., Lapuschkin, S., Binder, A., Samek, W. & Müller, K.-R. Explaining nonlinear classification decisions with deep taylor decomposition. Pattern recognition 65, 211–222 (2017).
[73] Montavon, G., Samek, W. & Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 73, 1–15 (2018).
[74] Lapuschkin, S. et al. Unmasking clever hans predictors and assessing what machines really learn. Nat. Commun. 10, 1096 (2019).
[75] Letzgus, S., Müller, K.-R. & Montavon, G. XpertAI: uncovering model strategies for sub-manifolds (2024). Preprint at https://arxiv.org/abs/2403.07486.
[76] Spooner, T. et al. Counterfactual explanations for arbitrary regression models (2021). Preprint at https://arxiv.org/abs/2106.15212.
[77] Wang, H., Junghans, C. & Kremer, K. Comparative atomistic and coarse-grained study of water: What do we lose by coarse-graining? Eur. Phys. J. E 28, 221–229 (2009).
[78] Eberle, O. et al. Building and interpreting deep similarity models. IEEE Trans Pattern Anal Mach Intell 44, 1149–1161 (2020).
[79] Ying, Z., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. GNNExplainer: Generating explanations for graph neural networks. In Adv. Neural Inf. Process. Syst.s, vol. 32 (Curran Associates, Inc., 2019).
[80] Xiong, P., Schnake, T., Montavon, G., Müller, K.-R. & Nakajima, S. Efficient computation of higher-order subgraph attribution via message passing. In Int. Conf. Mach. Learn., 24478–24495 (PMLR, 2022).
[81] Blücher, S., Vielhaben, J. & Strodthoff, N. PredDiff: Explanations and interactions from conditional expectations. Artif. Intell. 312, 103774 (2022).
[82] Faber, L., K. Moghaddam, A. & Wattenhofer, R. When comparing to ground truth is wrong: On evaluating gnn explanation methods. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, 332–341 (2021).
[83] Janizek, J. D., Sturmfels, P. & Lee, S.-I. Explaining explanations: Axiomatic feature interactions for deep networks. J. Mach. Learn. Res. 22, 1–54 (2021).
[84] Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).
[85] Sundararajan, M., Dhamdhere, K. & Agarwal, A. The shapley taylor interaction index. In Proceedings of the 37th International Conference on Machine Learning, 9259–9268 (PMLR, 2020).