Taking materials dynamics to new extremes using machine learning interatomic potentials

Understanding materials dynamics under extreme conditions of pressure, temperature, and strain rate is a scientific quest that spans nearly a century. Atomic simulations have had a considerable impact on this endeavor because of their ability to uncover materials’ microstructure evolution and properties at the scale of the relevant physical phenomena. However, this is still a challenge for most materials as it requires modeling large atomic systems (up to millions of particles) with improved accuracy. In many cases, the availability of sufficiently accurate but efficient interatomic potentials has become a serious bottleneck for performing these simulations as traditional potentials fail to represent the multitude of bonding. A new class of potentials has emerged recently, based on a different paradigm from the traditional approach. The new potentials are constructed by machinelearning with a high degree of fidelity from quantum-mechanical calculations. In this review, a brief introduction to the central ideas underlying machine learning interatomic potentials is given. In particular, the coupling of machine learning models with domain knowledge to improve accuracy, computational efficiency, and interpretability is highlighted. Subsequently, we demonstrate the effectiveness of the domain knowledge-based approach in certain select problems related to the kinetic response of warm dense materials. It is hoped that this review will inspire further advances in the understanding of matter under extreme conditions.


INTRODUCTION
Materials exposed to extreme environments, e.g., small scales, high temperature, high pressure, or high strain rate, can undergo significant changes in their bonding, structures, and properties that induce new physical phenomena that do not occur under ordinary conditions [1][2][3][4] . Those extreme phenomena are central to many of the fascinating grand challenges of science, including behavior far from equilibrium, planetary dynamics, the evolution of stars, and the origin of life on Earth [5,6] . On the other hand, the need to develop materials that can perform well in severe operating environments also drives us to understand how the micro-and nano-structures evolve, such as crystallite size, dislocations, voids, and grain boundaries [7][8][9][10][11] . Researchers must not only observe and understand the behavior of materials under such extreme environments but also tailor and control a material's response to enhance a device's performance, extend its lifetime, or enable new technologies [12][13][14][15][16] .
It is widely accepted that the material morphology and timescales of physical phenomena have a profound impact on the dynamics of the materials under extreme conditions. While the development of in-situ experimental investigations was unimaginable just a decade ago [17] , the widespread applicability of these experimental methods is limited by the associated expenses and equipment needs. These methods also lack the temporal and spatial resolution needed to provide a real-time description of the physical quantities (stresses, density, temperature, etc.) and deformation phenomena (e.g., phase transformation, spall, etc.). Recent advancements in simulation capabilities greatly extend our tools for studying materials behavior over a wide range of spatial and temporal scales [18][19][20][21] . Atomistic computer simulations, which match the scale of the relevant physical phenomena -nanometers to micrometers and picoseconds to nanoseconds, can be especially valuable in gaining insights into the materials' underlying structure and dynamics, driven by physical laws and chemical bonding [18,19] . For example, increasing molecular dynamics (MD) simulations on specific materials are helping to shed light on the atomic-scale understanding of how materials respond to extreme compression [22][23][24][25][26][27] .
However, a challenge for atomistic simulations of most materials under extreme conditions continues to be achieving the optimal trade-off between computational accuracy and efficiency. So far, most computational simulations of high-accuracy or first-principles calculations have mainly focused on the thermodynamics and the crystal and electronic structures of systems involved [28,29] . For example, the phase stability and equation of state, which are determined by the corresponding Gibbs free energy, can be easily evaluated by first-principles calculations. Nevertheless, the microscopic kinetic response of materials and associated mechanisms (e.g., the transition pathway, transition barrier, etc.) of phase changes are often far from accessible due to a high computational cost via first-principles calculations, e.g., ab initio molecular dynamics is very expensive. More importantly, studying these processes remains a challenging task since their microstructural development involves the use of a large number of atoms [22,25] i.e., a high degree of configurational entropy. In addition, new phase or defect nucleation is a rare event that occurs over much longer timescales than can be achieved by ab initio molecular dynamics. Thus, it is necessary to resort to large-scale MD simulations using appropriate interatomic potentials. Although several orders of magnitude faster than first principle calculations, the classic semi-empirical potentials, i.e., which assume certain physical forms for the atomic interactions, are unable to describe the multitude of bonding that often exists and thus fail to reproduce materials dynamics faithfully at extremes.
Over the past decade, a class of machine learning (ML) based atomic simulation methods has emerged as a promising means to solve the aforementioned dilemma [30] , combining the high accuracy of density functional theory (DFT) calculations and the computational high efficiency of MD simulations. The basic idea of ML potentials is to map local structural motifs onto the potential energy of the system by numerical interpolation between known reference quantum-mechanical calculations using a machine learning algorithm. This mapping is built in higher dimensional space, which is easier to cover the complex structural and bond evolution, beyond the limitation of given expressions. This scheme is quite different from the traditional potentials aimed at capturing the basic physics of interatomic bonding in materials [31][32][33][34][35] . Whereas several such prescriptions have been proposed for solids in the past, a complete description of warm dense materials places a higher demand on machine learning interatomic potentials (MLIPs). In the present work, we will review recent developments in MLIPs [36][37][38][39][40][41][42] . Attention will be paid to MLIPs that facilitate large-scale atomic simulations of materials dynamics under different external fields. We will show selected applications of ML potentials to problems in different types of materials, including transforming metals and alloys, ionic solids and liquids, molecular crystals and liquids and covalent bond materials. Finally, we include a summary that presents a perspective of the future of this field.

Machine learning potentials
The aim of an MLIP is to map the configuration of a system in real space into its potential energy surface (PES). Based on a set of discrete points on PES (generated by DFT calculations), ML regression algorithms are applied to learn a smooth PES. We can then predict the potential energy E of a given configuration. The total force F i acting on individual atom i in this system can be expressed as F i = -, where r i is the position of atom i. MD simulations can be performed similarly to those with classical interatomic potentials. The general ingredients of MLIPs include the representation for local atomic environments, learning databases, regression models, and evaluation algorithms [ Figure 1]. We will review these key aspects in this section.

Learning databases
Learning database involves the information of different structural configurations and the corresponding energies, forces, and stresses. The target energies (forces and/or stresses) are obtained by DFT calculations [28,29] . Therefore, a high-quality learning database should use limited configurations to capture the PES accurately. However, it is difficult to deal with the trade-off between accuracy and efficiency.
The most frequently used method to obtain a good database is to collect reference configurations that are most related to the specific applications. For example, a database including crystal defects such as vacancies, interstitials, dislocations, twin boundaries, and free surfaces is desirable for developing an MLIP to describe the mechanical behavior of metallic materials [43][44][45][46] . However, the ML potential learned from such a dataset may have poor predictions for other properties. Noticeably, the learning dataset collection will become extremely demanding when developing MLIP to study complex phase transformations in materials, which include structures under different conditions [40,47,48] e.g., temperature, pressure, etc. A highly efficient way to build the learning database is to collect reference data by active learning [49][50][51][52] . A feedback flow was integrated into the new reference selection [ Figure 1]. In this scenario, a raw MLIP was first constructed by a small dataset, then MD simulations were run using this potential to generate more configurations. The configurations with high uncertainty are selected and then re-labeled by DFT calculations and updated the training dataset [ Figure 2A]. Although it often takes several iterations to achieve the acceptable learning dataset, active learning can effectively reduce the size of the training dataset without loss of training accuracy. The most important process in active learning is re-sampling. However, most ML regression algorithms cannot provide the uncertainty directly. To solve this problem, random sampling methods (e.g., Bootstrapping) have been used to evaluate the uncertainty [ Figure 2B]. Even so, not all important metastable structures can be sampled. Therefore, we often enrich the learning dataset with the help of some global optimization method [52] , which can be more efficient in sampling metastable structures along the phase transition pathways [ Figure 2C].

Representations for local atomic environments
Representation of atomic structures entails quantifying local structural information in certain mathematical expressions, named descriptors or fingerprints [53] . In order to simulate large systems, the total energy is expressed as a linear combination of the sum of local energy contributions from all the atoms. Similarly, the fingerprints can also be simplified as a linear summation of local atomic environments, i.e., D = ∑D i , where D is the fingerprints of the system, and D i is the fingerprint of atom i. This assumption greatly improves the efficiency of MLIPs [54,55] .
The aim of D i is to reveal the local structural characteristics of atom i in feature space. In general, the descriptor D i should fulfill the symmetric requirement, namely, the invariance with symmetry operation such as translation and rotation. Nevertheless, there are still some tricky situations where different structures generate the same descriptors [56] , which should be avoided as much as possible.

Gaussian function based descriptors
The core part of this type of descriptors is a Gaussian function smoothened by a cutoff function [66] . The cutoff function usually has the following form: (1) where R ij is the interatomic distance, R c is the cutoff distance. The Gaussian function form can be two-body (focused on the bond length) or three-body (focused on the bond angle information). A typical two-body Gaussian function based descriptors can be: (2) where η and R s are parameters to control the position and the width of the Gaussian function [ Figure 3A]. The summation goes through all neighbor j of atom i within the R c . The two-body Gaussian descriptors can be easily extended to three-body ones by introducing the angle θ ijk centered at atom i. A typical three-body Gaussian type descriptor can have the form of: (3) with the parameter γ (= +1, -1), η , and δ . The Gaussian function based structural descriptors is easy enough to ensure a high efficiency upon calculations. There are different functional forms [30,58,[66][67][68] that have equal efficiency.

Spherical harmonic function based descriptors
Expanding the Gaussian representation of neighbor information by spherical harmonic functions is the central idea for this type of descriptors [ Figure 3B]. Typical formats include smooth overlap of atomic positions descriptors [57,69] , spectral neighbor analysis potential (SNAP) descriptors [70] and atomic cluster expansion descriptors [59] . More detailed theoretical analysis of these descriptors can be found in the references.
A particularly simple descriptor was adopted by deep potential molecular dynamics (DPMD) [71] , as shown in Figure 3C. DPMD has shown good applicability to many kinds of materials [72][73][74][75] because the DeepPot-SE descriptors [76] adopted by DPMD are no longer limited to fixed expression but are automatically learned through neural network thanks to the well-designed symmetry invariant function: (4) where D i is the descriptor matrix of atom i; is the generalized local environment matrix constructed by the function of relative position between the central atom i and each local environment atom. and are embedding matrix learned by the neural network according to the weighting function smoothly going to zero at the boundary of the local region.  [69] . (C) Descriptors used in deep potential molecular dynamics [71] .
Despite the great success of the abovementioned descriptors in MLIPs, the construction of efficient fingerprints for systems with complex structures, which often emerge under extreme conditions, remains difficult. An alternative strategy to design structural descriptors is to combine the functions with proper domain knowledge, which is an efficient way to describe the materials with complex structures. Using domain knowledge screened descriptors, sensitive to the structural changes can help to solve the tradeoff between complexity and efficiency. For example, Zong et al. [25,67] use lattice-invariant shear based descriptors for the phase transformations in Zr and predict a new phase mediating the shock-induced phase transition. In addition, they use physical model based descriptors that can describe up to 14 different structures in K [77,78] . Also, domain knowledge can be coupled with angle dependent structural descriptors, successfully capturing the change of chemical bond in solid H 2 [79] .
It is not easy to evaluate the quality of descriptors because descriptors are not the only factor that can affect the performance of MLIPs. However, the sensitivity of descriptors to the perturbation of specific structures is a good choice to validate its application range. Onat et al. [53] used four different databases to evaluate the sensitivity of eight different descriptors, which can be accessed in reference.

Regression models
A regression model is needed to map the high-dimensional fingerprints onto the corresponding PES. The prevailing regression algorithms include linear regression, kernel function based regression, and artificial neural network (NN) regression.

Linear regression
The linear regression algorithm is the simplest method to obtain fitting parameters, which are determined by using weighted least-squares linear regression against the full training database, as shown, , where ω is the learned weight factor. Among them, the most widely used MLIPs include spectral neighbor analysis potential [70] and moment tensor potentials [65,80] . However, the performance of a linear regression model often lacks transferability. To resolve this issue, high-quality descriptors are desired to capture the nonlinear relationship between local structures and energy (force and/or stress).

Neural network regression
Artificial NN regression has excellent generalization ability and has flexible requirements [36,46,66,68,71,81 -107] . A typical frame is illustrated in Figure 4 [89] . The input and output of a NN model are descriptors D i and energy E, respectively. The complex network structures will find the implicit nonlinear relationship between D i and E. Since the relationship between atomic positions R and D i is constructed artificially, the total energy E can be connected to R. Despite a direct prediction of energy E, Pun et al. [68] proposed a new method of physically informed artificial neural networks (PINN) to generate interatomic potentials. Instead of energy E, parameters in a given physical model become the target in this NN. The PINN model can drastically improve the transferability of ML potentials by informing them of the physical nature of interatomic bonding [68] . Parameterizing a NN regression model needs a relatively large training dataset, which is too time-consuming for DFT calculations.

Kernel based regression
The basic idea underlying kernel based regressions is to train the regression model in the kernel space, which is a higher dimensional feature space [108,109] . A kernel function K(D) is used to transform the descriptors from the feature space into the kernel space [110] . The common kernel based regression contains Gaussian process regression [111] and kernel ridge regression (KRR) [67] . Taking KRR as an example, a kernel function is introduced into the ridge regression method. The applied kernel function K(D input , D ref ) measures the distance between the input descriptor D input and the reference descriptor D ref . Based on this, the final expression of potential energy is: , Where ω t are the learned weight factor.

Performance evaluation
MLIPs should be comprehensively evaluated before we achieve a reliable potential that can be used to perform further atomic simulations. The evaluation should be divided into two parts: the general performance of the ML model and the prediction of interested physical properties.
The precision of the ML model can be estimated by the root mean square error (RMSE) or the mean absolute error. Generally speaking, an RMSE of the potential energy is usually smaller than 5 meV/atom, while the RMSE of force is about 0.1 eV/Å [67] . In addition, n-fold cross validation is often used to avoid the possible overfitting problem. This is particularly important as the potential has to capture more than one interested property.
The evaluation of the performance of the learned potential in MD simulations depends on the interested properties. For example, stable structure prediction at given temperature and pressures is important for phase-diagram prediction [25,77,78] , while dislocation core structure, generalized stacking fault energy prediction is important for mechanical properties [46,102] ; accurate phonon spectra are indispensable for the prediction of thermal conductivity [112] .

Transforming metals and alloys
Metallic materials show complex material response as a function of chemical composition, temperature, and pressure, etc. [ Figure 5] [113][114][115] . Atomistic simulations provide powerful tools to study their atomlevel processes with standard interatomic potentials, e.g., embedded atom method (EAM) and modified embedded atom methods, both of which have been successful in describing the mechanical behavior of metals and their alloys. However, these physical models based classical interatomic potentials still fail to represent the multitude of bonding in metallic materials that suffer from phase changes or plastic deformation in response to extremely loading or heating.
ML interatomic potentials shed some light on these issues as they can accurately predict the atomlevel properties of metallic materials over a wide range of temperatures and pressures. The reported ML Figure 4. Structure of a feed-forward neural network [89] . potentials show success in W [116] , Cu [117] , Ti [70] , Al [118] , Fe [119,120] , Zr [67] , Mg [102] , K [77,78] , CuZr [107] , MgCa [121] , and AlCu [46] , etc. For example, a NN potential for Fe developed by Maillet et al. [120] predict both elasticplastic behavior and phase transitions over the entire range of pressures. A full temperature-pressure phase diagram for Zr is produced by Zong et al. [67] via ML potential based MD simulations. No classical potentials i.e., EAM or MEAM, can depict the pressure induced hcp-ω phase transition. Based on the ML potential, the authors investigated the shock-induced phase transformations in Zr, which helps to clarify the longstanding controversy over the phase transformation pathways. Potassium (K) possesses a more complex phase transformation behavior with fourteen different phases in the temperature-pressure phase diagram. Combining ML models with DFT calculations, the constructed ML potential reproduces the complex phase diagram, and predicts the commensurate-incommensurate phase transformation [115] , semi-melting phase transformations [77] , and metal-electride phase transition [78] in potassium, which are difficult to describe by classical interatomic potentials.
Atomic simulation of complex multi-component alloys is another challenge for metallic materials. Luckily, ML potential also performs well in capturing these alloy systems of binary or ternary components [46,96,107,121,122] . Also, many intermetallic compounds have been studied with the help of MLIPs, such as TiO 2 [95] , HfO 2 [73,123] , ZnO [124] , etc. Interestingly, Sivaraman et al. [125] developed a new approach to develop ML potentials, which are directly machine learned from experimental measurements, and successfully describe the multi-phase state of refractory oxides. Multi-component concentrated solid solutions are successful examples for ML potentials related to alloying effects [122] . Kobayashi et al. [97] and Jain et al. [126] developed a ternary ML potential for Al-Mg-Si alloys, which exactly reproduces the heat of solution, solute-solute, and solute-vacancy interaction energies, and formation energies of small solute clusters and precipitates. Our previous work also reported the ability of MLIP in describing complex dislocation core structures in high-strain-rate deformed or shock compressed bcc high entropy alloys [127] . Figure 5. Complex material dynamics in metals and alloys under extreme conditions of high temperature, high pressure, and various components. At high temperature, metallic materials can show rich structural phase transformations e.g., martensite phase transformation [67] , glass transition [113] , melting [47,114] , incommensurate transition [115] , etc. While at high pressure, new states of matter via structural phase transformation, metal-electride transition [78] , and liquid-liquid phase transition may be achieved. The increase of components leads to the formation of complex solid-solutions, clusters, segregations, intermetallic compounds, and short-rangeordered high entropy alloys.

Ionic crystals and liquids
Different from metallic materials, the interaction between atoms in an ionic crystal is controlled by Coulomb bonding, which is a long-range force field. As a function of the ratio between the anionic-cationic radius, ionic materials can show crystal structures from simple e.g., NaCl or CsCl to complex ones e.g., perovskite and rutile. Several physical models based semi-empirical potentials have been used to describe the dynamic properties of these ionic crystal materials. However, these semi-empirical formulas are out of reach in studying disordered ionic liquids or molten salts, which are important for developing new types of clean energy materials, such as molten salt reactors and concentrated solar power technology.
Li et al. [128] developed a robust neural network interatomic potential model for molten salts such as NaCl, LiF, and FLiBe [129] . In their machine learning model, both long-rang and short-range interactions are captured for both ionic crystals and liquids. More importantly, these NN interatomic potentials have high accuracy in calculating the structural, thermophysical, and transport properties but are 1000 times faster than first principle methods. Figure 6A shows the NN interatomic potential based MD simulation of complex structural phase transformation behavior in LiF and Flipe (LiF-BeF2 mixtures). The NN interatomic potential accurately reproduced the ab initio interactions of dimers, crystalline solids under deformation, crystalline LiF near the melting point, and liquid LiF at high temperatures. These applications suggest a paradigm shift from empirical/semi-empirical/ab initio approaches to an efficient and accurate machine learning scheme in molten salt modeling.
Perovskites are another typical ionic crystal ceramics with extensive applications in memory, solar cells, and transistors, etc. However, the development of corresponding interatomic potentials is still full of thorns due to the presence of rich and complicated polymorphism. Taking CsPbBr 3 as an example in Figure 6B, there exit complex structural phase transformations, during which the symmetry breaking does not only occur at the lattice level but also at the sub-lattice level, e.g., tetrahedral rotations. Thomas and collaborators developed an ML potential for CsPbBr 3 [48] , which shows similar accuracy but higher efficiency than DFT calculations. Such ML potentials enable MD simulations to take into account the contribution of inharmonic Hamiltonians during the phase transformation. More strikingly, the integration of ML potential in the processes of predicting or searching stable chemical compositions can greatly accelerate the discovery of new perovskite materials with novel functional properties.  [129] and (B) CsPbBr 3 [48] .

Molecular crystals and liquids
Molecular crystal materials possess complex bonding, including inter-molecular and intra-molecular bond types. In general, inter-molecular bonds are weak van der Waals interactions while the intra-molecular bonds are strongly covalent, and the dynamics of materials in molecular crystals are closely related to the bond breaking and forming process. Although reactive force field (ReaxFF) [130] potentials show good performance in the atomic simulation of molecular crystal systems, it is still difficult to fulfill the demand for describing the rich conformational changes in molecular systems under extreme conditions. Figure 7A shows the complex structural changes in hydrogen as a function of pressure [131] . The rich structures of H 2 O at different temperatures and pressures are shown in Figure 7B [75] . In organic systems e.g., Azapentacene 5A [132] , the polymorphs are also very complex [ Figure 7C]. Still lacking is the microscopic measurement of phase transformation dynamics for molecular crystal materials, even though the possible structure of stable phases has been largely solved by DFT calculations. and Azapentacene [132] have been investigated recently. Taking hydrogen as an example, we review the power of MLIP in investigating phase transformation in high-density molecular systems. Zong et al. [79] and Cheng et al. [131] developed MLIPs for hydrogen separately almost at the same time. Cheng et al. [131] reproduce both the reentrant melting behavior and the polymorphism of the solid phase. They prove a continuous molecularto-atomic transition in the liquid, with no first-order transition observed above the melting line, which suggests a smooth transition between insulating and metallic layers in giant gas planets, and reconciles existing discrepancies between experiments as a manifestation of supercritical behavior. Zong et al. [79] get an accurate potential to describe the complex Phase I, II, III and liquid phase of hydrogen. Phase I shows a typical hcp lattice of the molecular centers while the orientation of the H 2 molecular is disordered and remains rotating freely. The inter-molecular interactions are mainly from the quadrupole-quadrupole interactions and steric repulsion. With increasing pressure (> 100 GPa), steric repulsion becomes the most important driving force for molecular reorientation, inducing a denser liquid phase than solid. H 2 molecules in Phase II have an ordered molecular axis orientation, and the lattice is more likely to be P2 1 /c-24. The details of Phase III need to be further investigated because the present potential cannot fully describe the bond breaking events under high pressure. These results deepen our understanding of the solid polymorphism and anomalous melting behavior in hydrogen and show the potential of MLIPs in providing an understanding of molecular systems in extremes.
GeTe [92] , Ge 2 Sb 2 Te 5 [144] and 2D ferroic GeSe [145] , etc. With the help of MLIPs, there is a much clearer picture of structural changes in covalent-bonded liquids and solids under high pressure. For example, Deringer et al. [139] performed MD simulations to study the complex phase transformations in disordered silicon, including the liquid-amorphous phase transformations and amorphous-amorphous phase transformations under compression. They first created an MLIP that can accurately describe the liquidamorphous transition upon quenching at a cooling rate of 10 11 K/s [ Figure 8A-D]. After applying external pressure, the amorphous silicon is gradually crystallized in a three-step transformation pathway. At 11 GPa, a coexisting state of low-density amorphous and high-density amorphous is observed [ Figure 8E-F]. Further increasing the pressure to 13 GPa gives rise to a very high-density amorphous phase [ Figure 8G], which is followed by full crystallization [ Figure 8H-I]. This work demonstrates the feasibility of data-driven atomic simulation approaches in uncovering the complex material behavior of covalent-bonded systems.

Limitation of present MLIPs and possible solutions
Machine learning interatomic potentials have emerged as a powerful new tool for atomic modeling materials in extremes. A strategy combining domain knowledge with a machine learning model can greatly accelerate the development of interatomic potentials and improve their accuracy. Zuo et al. [63] compared the test error of main MLIPs, including Gaussian approximation potential, moment tensor potential, neural network potential, SNAP, and quadratic spectral neighbor analysis potential, etc. as functions of the computational cost of bcc Mo, as shown in Figure 9. Obviously, potentials generated from different features have different performances. In general, MLIPs based MD simulations are ~100 times slower than classical ones, although the accuracy is much higher. However, the increase of the complexity, e.g., complex phase transitions and complex compositions etc., leads to more complex descriptors, which will burden the computational efficiency. Hajibabaei et al. [146,147] explored sparse Gaussian process regression to balance this trade-off. The key idea is to get a low-rank approximation of the covariance matrices upon calculating the kernel functions between the test configurations and the reference configurations. In this way, phase transitions of large and complex systems can be accessible. Despite the high computational cost of complex structural descriptors, the designing of descriptors usually requires rich experiences for developers. In other words, the designing strategies of descriptors for different materials are usually not transferable, although almost all MLIPs are based on the same assumption: the potential energy of an atom only relates to its local structural environment (usually less than 1 nm), which is also the basis for the designing of structural descriptors. This assumption is acceptable in metals; however, the cutoff distance is too short for some long-range interactions, e.g., Coulomb force and van der Waals interactions, in ionic and molecular crystals. Although there have been some attempts [104,[148][149][150] considering long-range interactions, the key issue has not been solved. In a word, the lack of transferability and the lack of description of long-range interaction remain the main challenge of MLIPs.
From our perspective, graph neural network (GNN) [151] based MLIPs show promise in addressing these issues about transferability and long-range interaction. In GNN, the topological structure of materials can be expressed by "nodes" and "edges" in a graph. One can easily describe the atoms using "nodes" and chemical bonds using "edges". Therefore, the structural characters are not expressed by some special and semi-empirical formulas. With this end-to-end model, GNN based MLIPs can be transferred to any system easily. It should be noted that not only the explicit interactions between atoms (e.g., the chemical bonding and the interaction within a short-range cutoff), but also the implicit connection (long-range interactions) are described in a graph by the message passing between sub-graphs in GNN.

CONCLUSIONS
In summary, we reviewed the essential aspects of how to develop a machine learning interatomic potential and illustrated how such potentials have been applied in describing the dynamic/kinetics of different materials during structural transformations under extreme conditions. Machine learning interatomic potentials bridges accurate density functional theory calculations to quantitative multiscale simulations covering various time-and length-scales. This provides a good example for the realization of cross-scale computational materials science. We also discussed the limitations of the present MLIPs and gave our outlooks. We hope our review work could inspire and attract more attention in this research direction.

Authors' contributions
Designed this project: Ding X, Zong H Contributed to the review paper writing: Yang Y, Zhao L, Han CX, Ding SD, Lookman T, Sun J, Zong HX Figure 9. Evolution of test error as functions of computational cost for Gaussian approximation potential (GAP), moment tensor potential (MTP), neural network potential (NNP), spectral neighbor analysis potential (SNAP), and quadratic spectral neighbor analysis potential (qSNAP) for bcc Mo [63] .