Journal of Materials Informatics

Open Access Research Article

^{1}School of Materials Science and Engineering and Institute of Materials Genome and Big Data, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.

^{2}College of Materials and Fujian Provincial Key Laboratory of Materials Genome, Xiamen University, Xiamen 361005, Fujian, China.

^{3}State Key Laboratory of Advanced Welding and Joining, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.

^{#}Authors contributed equally.

Correspondence to: Prof. Xingjun Liu, State Key Laboratory of Advanced Welding and Joining, Harbin Institute of Technology, Taoyuan street, Shenzhen 518055, Guangdong, China. E-mail: xjliu@hit.edu.cn ; Prof. Cuiping Wang, College of Materials and Fujian Provincial Key Laboratory of Materials Genome, Xiamen University, Siming South Road 422, Xiamen 361005, Fujian, China. E-mail: wangcp@xmu.edu.cn ; Prof. Rongpei Shi, School of Materials Science and Engineering and Institute of Materials Genome and Big Data, Harbin Institute of Technology, Taoyuan Street, Shenzhen 518055, Guangdong, China. E-mail: shirongpei@hit.edu.cn .

This article belongs to the Special Issue Accelerating the Fusion Between Machine Learning and Additive Manufacturing


© The Author(s) 2022. **Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

As promising next-generation candidates for applications in aero-engines, L1_{2}-strengthened cobalt (Co)-based superalloys have attracted extensive attention. However, the L1_{2} strengthening phase in first-generation Co-Al-W-based superalloys is metastable, and both its solvus temperature and mechanical properties still need improvement. Therefore, it is necessary to discover new L1_{2}-strengthened Co-based superalloy systems with a stable L1_{2} phase by exploring the effect of alloying elements on their stability. Traditional first-principles calculations are capable of providing the crystal structure and mechanical properties of the L1_{2} phase doped by transition metals but suffer from low efficiency and relatively high computational costs. The present study combines machine learning (ML) with first-principles calculations to accelerate crystal structure and mechanical property predictions, with the latter providing both the training and validation datasets. Three ML models are established and trained to predict the occupancy of alloying elements in the supercell and the stability and mechanical properties of the L1_{2} phase. The ML predictions are evaluated using first-principles calculations and the accompanying data are used to further refine the ML models. Our ML-accelerated first-principles calculation approach offers more efficient predictions of the crystal structure and mechanical properties for Co-V-Ta- and Co-Al-V-based systems than the traditional counterpart. This approach is applicable to expediting crystal structure and mechanical property calculations and thus the design and discovery of other advanced materials beyond Co-based superalloys.

Co-based superalloys, first-principles calculations, site occupancy, phase stability, mechanical properties, machine learning

Ni-based superalloys have been widely used in the aviation, aerospace and petrochemical industries due to their superior combination of highly desirable properties, such as microstructural stability, mechanical properties and oxidation and thermal corrosion resistance at elevated temperatures^{[1,2]}. The signature coherent γ/γ' two-phase precipitate microstructure can maintain the strength of the superalloys under high-temperature conditions^{[3]}. However, due to the limitation of the melting temperature of elemental Ni [...] L1_{2} phase) that exists in the Ni-based superalloys^{[4,5]}. In 2006, Sato *et al.* first discovered the coherent γ/γ' microstructure in the Co-Al-W-based alloy, thereby opening up new avenues for alloy development^{[6]}. However, the L1_{2} phase is metastable and only exists within a narrow composition region^{[5,7,8]}, necessitating the further development of this material. The alloying of transition metals (TMs) has been found to be effective in promoting the precipitation of the stable L1_{2} phase^{[9-11]} and, to varying degrees, in increasing the solvus temperature and mechanical properties of the L1_{2} phase^{[12,13]}. Using this approach, Co-Al-Mo-X^{[14-17]}, Co-Ga-W-X^{[18]}, Co-Ge-W-X^{[19]}, Co-Ti-Cr-X^{[20,21]} and several other alloy systems have already been designed, all of which have a stable γ/γ' two-phase microstructure.

Nevertheless, exploring the high-dimensional composition and temperature space through the alloying strategy with traditional trial-and-error experiments is labor-intensive and time-consuming. In order to guide the design and discovery of new L1_{2}-strengthened Co-based superalloys with enhanced mechanical properties, basic information on the L1_{2} phase, such as its crystal structure and atomic occupancy (defined as the site occupied by a doped TM), is highly desirable. Through structural optimization and static calculations based on first-principles calculations, the ground-state static energy of the L1_{2} phase at 0 K can be accurately calculated and the stable formation enthalpy and reaction energy of the L1_{2} phase can then be derived^{[22,23]}. First-principles calculations can also be combined with Hooke's law to predict the elastic constants of the supercell of Co-based superalloys, which allows for the prediction of mechanical properties, such as the bulk, shear and elastic moduli^{[24,25]}. However, the procedures of traditional first-principles calculations are tedious and require significant computational resources. For a system with more than four elements, the number of nonequivalent sites for each element in the supercell increases dramatically with the number of element types, resulting in a significant increase in computational cost and a reduction in computational efficiency. Therefore, an alternative approach is required to improve the computational efficiency and speed up alloy discovery^{[26]}.

To date, there has been a push towards big data and artificial intelligence in materials research^{[27,28]}. Machine learning (ML) refers to a class of algorithms that acquire knowledge automatically from data: they mine the existing data, extract key information, establish a predictive model that describes the relationship between influencing factors and a target property and use the model to make predictions for new, unexplored material systems^{[26]}. ML-based methods have been widely used for assisting the design and discovery of a wide class of materials, including alloys, ceramics and composites, polymers, two-dimensional materials and organic-inorganic hybrids^{[29,30]}. Using ML algorithms, new materials with excellent performance have been developed successfully and efficiently. However, most of the data used to train the models are collected from experimental studies^{[31-36]}. Only a few studies have relied on data from first-principles calculations to train ML algorithms. For example, Guo *et al.* established and trained ML models using the formation energies and lattice constants obtained from first-principles calculations of the _{3}(Al, X)^{[37]}. The structures of a new class of _{3}(Al, WX_{3})_{2}-strengthened Co-based superalloys.

To overcome the inherently low efficiency of conventional first-principles calculations in predicting the crystal structure and mechanical properties of the L1_{2} phase, an ML-accelerated first-principles approach is proposed in the present work. First, ML algorithms are established and trained using the data provided by conventional density functional theory (DFT) calculations. A small number of predictions made by these ML models are then validated by first-principles calculations and the resulting dataset is used to improve the ML models if necessary. Finally, the models are employed to predict the crystal structure and mechanical properties of the L1_{2} phase. These predictions may provide a theoretical basis for the design and discovery of new L1_{2}-strengthened Co-based superalloys. In particular, this ML-assisted method is found to be more than twice as efficient as conventional first-principles calculations alone.

In order to obtain the crystal structure and mechanical properties of the new L1_{2}-strengthened Co-based superalloys more efficiently, ML algorithms are combined with first-principles calculations to predict the properties of the superalloys mentioned above in three steps.

Before attempting to use ML algorithms, it is necessary to analyze the first-principles workflow in detail to decide how the ML models should be set up, as shown in Figure 1. First, the types of TM dopants contained in the supercells are assumed and the relaxed structures of the L1_{2} phase and its competing D0_{19} phase are obtained through structural relaxation. Second, the occupation tendency of the TM dopants in the L1_{2} and D0_{19} phases is evaluated to determine the occupancy, defined as the site in a supercell occupied by a TM dopant in these two phases. Third, the stabilities of the L1_{2} and D0_{19} phases are compared in terms of the stable formation enthalpy, followed by the calculation of the mechanical properties for the L1_{2} phase if it is more stable than the D0_{19} phase.

Figure 1. Schematic workflow of ML-assisted first-principles calculations for designing L1_{2}-strengthened Co-based superalloys.

In this study, we propose a new approach, based on ML algorithms, for predicting the crystal structure and mechanical properties of the L1_{2} phase in new Co-based superalloys in three steps, namely, occupied-site prediction, stability prediction and mechanical property prediction, mirroring the procedure of the first-principles calculations described above. Since the reaction energies and formation enthalpies of different superalloy systems are not numerically comparable, classification algorithms are selected to make a qualitative judgment rather than a quantitative prediction when predicting the occupancy of the doped TM atoms and the stability of the doped L1_{2} and D0_{19} phases.

First-principles calculations are employed to generate data for training the ML model and verifying the ML model predictions, so as to improve the ML model iteratively. The details of the first-principles calculations are briefly summarized below. Generally, first-principles calculations can only deal with a completely ordered phase. If a completely ordered structure can be found and the correlation function of the structure is close to that of a disordered alloy, it is considered that the structure can reflect the configuration of the disordered alloy and the structure is used as the cell model of the disordered alloy in the calculation. The essence of the special quasi-random structure (SQS) method is to find a completely ordered structure to represent the disordered structure by matching the correlation function^{[38,39]}. Therefore, we use the SQS method to construct 2 × 2 × 2 supercells of the Co-based superalloys and consider two types of structures for the Co-Al-W-, Co-V-Ti-, Co-V-Ir-, Co-V-Ta- and Co-Al-V-based systems, namely, the AuCu_{3} and Ni_{3}Sn prototype structures corresponding to the L1_{2} and D0_{19} phases, respectively^{[39,40]} (see Figure 2 for the L1_{2} and D0_{19} structures). In addition, the Alloy Theoretic Automated Toolkit (ATAT) is used to identify the nonequivalent positions in the supercells^{[41]}.

Figure 2. Crystal structures of (A) Co_{3}(Al, W); (B) Co_{3}(V, Ti); (C) Co_{3}(V, Ir); (D) Co_{3}(V, Ta) and (E) Co_{3}(Al, V) of L1_{2}-ordered γ'-Co_{3}(X, Y); and (F) Co_{3}(Al, W); (G) Co_{3}(V, Ti); (H) Co_{3}(V, Ir); (I) Co_{3}(V, Ta) and (J) Co_{3}(Al, V) of D0_{19}-ordered γ'-Co_{3}(X, Y). Sites #1, #2 and #3 represent Co and the X and Y dopants, respectively.

The Vienna Ab initio Simulation Package (VASP) is used to perform all the first-principles calculations with the projector augmented wave (PAW) method^{[42-46]} and the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional within the generalized gradient approximation (GGA)^{[23]}. During the structural relaxation, the convergence criteria for the energy and maximum force are set to 10^{-5} eV/atom and 10^{-3} eV/Å, respectively. The kinetic energy cutoff is set to 450 eV. Spin polarization is considered during the calculations because of the presence of ferromagnetic Co. The Brillouin zones are sampled using [...] L1_{2} and D0_{19} structures, respectively, which balance the computational accuracy, efficiency and cost.

Determining the occupancy of the TM dopants in the L1_{2} phase is a vital prerequisite for obtaining an accurate atomic configuration. The occupancy of an alloying element can be evaluated using the binding^{[23]} and formation energies of the impurity^{[47]}. Each system calculated contains three main elements, each of which is designated according to the name of the alloy system. For instance, Co, Al and W are the main elements #1, #2 and #3 in the Co-Al-W system, respectively. In order to discover the role played by each TM element, the reaction energy of the 3d, 4d or 5d TM element occupying sites #1, #2 and #3 in the supercells of each system is calculated as follows^{[48,49]}:

where represents the energy of Co_{3}(X, Y), denotes the energy of TM-doped Co_{3}(X, Y) and *µ _{i}* and

The stability of the L1_{2} phase is then evaluated by comparing the stable formation enthalpy *ΔH _{S}*of the TM-doped L1

where *µ _{j}* is the chemical potential of element

Elastic properties, such as the bulk (** B**), shear (

The data for the L1_{2} phase in the new Co-based superalloys with TM alloying elements are first generated by first-principles calculations. A total of 61 data points from the Co-Al-W-, Co-V-Ti- and Co-V-Ir-based systems are collected to construct the training set, all of which are included in Supplementary Table 1^{[49,57]}. The characteristics of the data are briefly described as follows:

(1) The microscopic characteristics of the elements are used in place of the names of the main and doping elements, including the melting point, boiling point, density, atomic weight, atomic radius, covalent radius, electronegativity and first ionization energy;

(2) For the occupancy prediction model, the microscopic characteristics of the main and doping elements are set as *X* and the occupancy of the doping elements is set as *Y*;

(3) For the L1_{2} phase stability prediction model, the microscopic characteristics of the main and doping elements and the occupancy of the doping elements are set as *X* and the L1_{2} phase stability is set as *Y*;

(4) For the mechanical property prediction model of the L1_{2} phase, the microscopic characteristics of the main and doping elements, the occupancy of the doping elements and the L1_{2} phase stability are set as *X* and the mechanical properties are set as *Y*.
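The feature construction in item (1) can be illustrated with a minimal sketch. It assumes a small lookup table of approximate handbook values and shows only three of the eight descriptors. The names `ELEMENT_PROPS` and `featurize` are hypothetical and do not come from the original work.

```python
# Approximate handbook values; only three of the eight descriptors are shown.
# ELEMENT_PROPS and featurize are illustrative names, not the authors' code.
ELEMENT_PROPS = {
    "Co": {"melting_point_K": 1768.0, "electronegativity": 1.88, "atomic_weight": 58.933},
    "Al": {"melting_point_K": 933.5, "electronegativity": 1.61, "atomic_weight": 26.982},
    "W": {"melting_point_K": 3695.0, "electronegativity": 2.36, "atomic_weight": 183.84},
    "Ta": {"melting_point_K": 3290.0, "electronegativity": 1.50, "atomic_weight": 180.948},
}


def featurize(main_elements, dopant, props=ELEMENT_PROPS):
    """Replace element names by their descriptors: main elements #1-#3, then the dopant."""
    names, values = [], []
    roles = [f"main{i + 1}" for i in range(len(main_elements))] + ["dopant"]
    for role, element in zip(roles, list(main_elements) + [dopant]):
        for prop_name, value in props[element].items():
            names.append(f"{role}_{prop_name}")
            values.append(value)
    return names, values


# One row of X for a Ta dopant in the Co-Al-W system.
names, x_row = featurize(("Co", "Al", "W"), "Ta")
```

For the stability and mechanical property models, the occupancy label (and then the stability label) would be appended to each row as additional columns of *X*, following items (3) and (4).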

There are two research routes of choice:

Route **I**: Predict *C _{11}*,

Route **II**: Predict elastic properties, including ** B**,

According to the “no free lunch” theorem^{[58]}, no single algorithm performs best in all situations, i.e., if one algorithm (algorithm A) outperforms another (algorithm B) on a specific data set, algorithm A will be inferior to algorithm B on some other data set. As a result, a variety of ML algorithms are first employed to predict the crystal structure and mechanical properties of the L1_{2} phase, followed by a model performance evaluation and comparison. The algorithm with the best performance is selected for making predictions.

Random forest classification, gradient boosting classification (GBC), AdaBoost classification, a support vector machine, an artificial neural network (ANN), K-nearest neighbor classification and Gaussian process classification are selected to establish the classification models. In contrast, regression models are established using random forest regression, gradient boosting regression, AdaBoost regression, support vector regression, an ANN, K-nearest neighbor regression and Gaussian process regression.

All the ML algorithms are run through Python 3.0 and the sklearn package is used to carry out the calculations. All calculations are performed using a PC (Microsoft Windows 10, Intel Core (TM) i7-10875H, CPU 2.30 GHz, 16 GB of RAM).

The performance of the various ML algorithms mentioned above is compared using the *K*-fold cross-validation method. Because each model is always tested on a subset that was held out from its training, the resulting scores are less likely to be inflated by overfitting. The original data set is randomly divided into *K* equal subsets. One of the subsets is used as the test set, while the remaining ones constitute a new training set. Each subset is used as the test set in turn, i.e., the above process is repeated *K* times. In this study, *K* is set to ten^{[59,60]}.
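The comparison procedure can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data generated with `make_classification`, not the 61-point first-principles dataset. The hyperparameters and the use of `StandardScaler` for the scale-sensitive models are assumptions of this sketch.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the training set: 61 samples, 25 features, 3 classes.
X, y = make_classification(n_samples=61, n_features=25, n_informative=5,
                           n_redundant=2, n_classes=3, n_clusters_per_class=1,
                           random_state=0)

# Random split into K = 10 subsets; each subset serves once as the test set.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

candidates = {
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "ann": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=2000, random_state=0)),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "gaussian_process": make_pipeline(StandardScaler(),
                                      GaussianProcessClassifier(random_state=0)),
}

# Mean accuracy over the ten folds for every candidate algorithm.
scores = {name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
```

The same loop applies to the regression models by swapping in the regressor classes and a regression scoring metric.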

The performance of a classification model is quantified by the so-called “*accuracy*”, which is the ratio of the number of correct predictions to the total number of samples, defined as:

where *n _{t}*and

In this study, a principal component analysis (PCA) algorithm is also employed to reduce the dimensionality of the data. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of linearly uncorrelated variables referred to as principal components. A new feature vector is defined by the following linear transformation:

where *W ^{T}*is a matrix with orthonormal columns and has fewer rows than . The first three principal components are used to represent most of the information contained in more than 25 features
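A minimal sketch of this projection is given below, using scikit-learn on synthetic data with 25 correlated features. It assumes that the "interpretation degree" reported later corresponds to the cumulative explained-variance ratio of the retained components. That reading, and the standardization step, are assumptions of the sketch.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data: 61 samples whose 25 features are driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(61, 3))
mixing = rng.normal(size=(3, 25))
X = latent @ mixing + 0.1 * rng.normal(size=(61, 25))

# Standardize so that features with large units do not dominate the components.
X_scaled = StandardScaler().fit_transform(X)

# Keep the first three principal components.
pca = PCA(n_components=3)
Z = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by the three retained components.
explained = pca.explained_variance_ratio_.sum()
```

The rows of `pca.components_` are orthonormal and play the role of the projection matrix in the linear transformation, so `Z` has three columns regardless of the original number of features.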

Several accuracy metrics, such as the coefficient of determination *R*, *R ^{2}*, mean absolute error (

where *Y* and denote the true and predicted values of the targeted properties, respectively, *n* is the number size of the data, *R *value falls between (-1,1) and thus *R ^{2}*value falls within (0,1). The closer the value of

We evaluated the importance of the features with the relative importance (*I _{r}*) to measure the impact of these features on the occupancy of each doping element and the stability and mechanical properties of the L1

where *I _{T}* is the importance of the feature calculated by the model and

The performance of the selected ML algorithms is then iteratively improved through the interaction with the first-principles calculations. First, the selected algorithm is used to predict the target properties for a small amount of randomly chosen input data. Second, the predictions are verified using first-principles calculations. Third, if the accuracy of the models does not meet the requirements, the new data will be used as an additional dataset for re-training the ML model. The procedures above are repeated until the predefined precision is met. The improved models are then employed to predict all the remaining data (the workflow is schematically shown in Supplementary Figure 1).
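The loop described above can be sketched as follows. The function `oracle` is a hypothetical stand-in for a first-principles calculation, the data are synthetic, and `refine`, its arguments and the round limit are illustrative choices rather than the authors' implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def refine(model, X_train, y_train, X_pool, oracle,
           batch=3, required=1.0, max_rounds=10, seed=0):
    """Verify small random batches and retrain until the required accuracy is met."""
    rng = np.random.default_rng(seed)
    remaining = np.arange(len(X_pool))
    model.fit(X_train, y_train)
    history = []
    for _ in range(max_rounds):
        if len(remaining) < batch:
            break
        # Step 1: predict a few randomly chosen, not yet verified inputs.
        picked = rng.choice(remaining, size=batch, replace=False)
        predicted = model.predict(X_pool[picked])
        # Step 2: verify them with the expensive reference calculation.
        truth = oracle(X_pool[picked])
        accuracy = float(np.mean(predicted == truth))
        history.append(accuracy)
        remaining = np.setdiff1d(remaining, picked)
        if accuracy >= required:
            break
        # Step 3: add the verified points to the training set and retrain.
        X_train = np.vstack([X_train, X_pool[picked]])
        y_train = np.concatenate([y_train, truth])
        model.fit(X_train, y_train)
    return model, remaining, history


def oracle(X):
    """Hypothetical ground truth standing in for a DFT result."""
    return (X[:, 0] > 0).astype(int)


rng = np.random.default_rng(1)
X_train = rng.normal(size=(61, 5))
X_pool = rng.normal(size=(40, 5))
model, remaining, history = refine(GradientBoostingClassifier(random_state=0),
                                   X_train, oracle(X_train), X_pool, oracle)

# The refined model predicts everything that was never verified.
predictions = model.predict(X_pool[remaining])
```

Only the verified batches cost a reference calculation; all other entries in the pool are filled in by the model.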

The occupancy of a TM dopant may significantly influence both the stability and mechanical properties of the L1_{2} phase in Co-based superalloys^{[62]}. In new Co-based superalloys, the D0_{19} phase usually competes against the L1_{2} phase^{[49]}. The performance of various ML algorithms for predicting the dopant occupancy and stability of the L1_{2} structures is evaluated using 10-fold cross-validation and the results are shown in Figure 3. The gradient boosting algorithm is found to have the highest accuracy (reaching 88.52% and 93.44% for occupancy and stability predictions, respectively) and is thus selected for predicting these two properties. The PCA classification results for the TM dopant occupancy and the stability of the L1_{2} phase are shown in Figure 3 and their interpretation degrees are 92.05% and 93.44%, respectively. All the parameters of the ML algorithms are shown in Supplementary Table 2.

Figure 3. Ranking of prediction accuracies of (A) dopant occupancy and (B) L1_{2 }phase stability by different models. The GBC model has the highest accuracy (up to 88.52% and 93.44%, respectively). Prediction results of (C) occupied sites and (D) L1_{2} phase stability from the model based on the GBC algorithm on the training set. Three features (main features #1, #2 and #3) are selected out of 25 using PCA for visualization (accuracy is 88.52%).

The mechanical properties of the L1_{2} phase in the new Co-based superalloys are the most important indicators of alloy properties. There are two routes for predicting them, as shown in Supplementary Figure 2. Route **I** sets *C _{11}*,

Route I: We start by presenting the results using route **I**. The performances of each regression algorithm are shown in Supplementary Figure 1. AdaBoost is found to outperform the others in predicting *C _{11}* and

Route II: Next, we present the results of the mechanical property predictions using route **II**. The performances of each ML algorithm are shown in Figure 4A-F. Compared with the rest of the ML models, the AdaBoost regression model has the best performance for ** B**,

Figure 4. Model performance of each regression model in terms of *R*, *R ^{2}*,

Selection between two routes: Figure 5 compares the performance of the two routes. It can be found that the precision of *C _{12}* is relatively low and its highest

Figure 5. Comparison of model performance of two routes based on Adaboost regression models. The warm color system (including vermeil, red and orange bars) represents the model performance of route **I**, while the cool color system (including blue, turquoise and cyan bars) represents the model performance of route **II**. (A) *R* and *R ^{2}* of Adaboost regression models. (B)

The relative importance of different features on the dopant occupancy, stability of the L1_{2} structures and the mechanical properties of the L1_{2} phase are extracted from the gradient boosting classification and AdaBoost regression models, as shown in Figure 6. The names of the features are too long to be directly reflected in the figure and we therefore use codes to represent the full feature names, which are provided in Supplementary Table 3.
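How such a ranking can be extracted from a fitted gradient boosting model is sketched below. The data are synthetic and the feature codes `F1` to `F8` are hypothetical placeholders for the codes in Supplementary Table 3. The sketch uses scikit-learn's impurity-based `feature_importances_`, which are already normalized to sum to one; whether the authors applied a different normalization is not assumed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data with eight features.
X, y = make_classification(n_samples=61, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
feature_codes = [f"F{i + 1}" for i in range(X.shape[1])]  # hypothetical codes

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Impurity-based importances, one value per feature.
importance = model.feature_importances_

# Rank features from most to least important.
ranking = sorted(zip(feature_codes, importance),
                 key=lambda item: item[1], reverse=True)
```

An `AdaBoostRegressor` exposes the same `feature_importances_` attribute, so the ranking step is identical for the mechanical property models.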

Figure 6. Calculated relative importance of different features on (A) dopant occupancy prediction based on gradient boosting classification model; (B) the stability of L1_{2} structure prediction based on gradient boosting classification model; (C) bulk modulus prediction based on Adaboost regression model; (D) shear modulus prediction based on Adaboost regression model and (E) elastic modulus prediction based on Adaboost regression model. The ranking of the features is in accord with the related references.

The first ionization energy and electronegativity quantify the attraction between atoms and affect the distortion of the supercell, and are thus capable of evaluating the occupancy of a dopant in the supercell^{[63]}. The covalent radius of a dopant affects the stability of the supercell^{[62]}. The melting and boiling points of a dopant and the mechanical properties (such as bulk, shear and elastic moduli^{[64]}) are correlated. It can be seen from Figure 6A that the values of relative importance for the electronegativity and the first ionization energy of the dopant are the highest, indicating that these two features predominantly determine the occupancy of the doped atom. Similarly, Figure 6B shows that the covalent radius and the first ionization energy of the dopant determine the stability of the L1_{2} phase. Figure 6C indicates that the melting and boiling points of the dopant have the greatest influence on the mechanical properties, including the bulk, shear and elastic moduli.

The L1_{2} phase exists at high temperatures in the Co-Al-W-, Co-V-Ti- and Co-V-Ir-based systems^{[1,6,65]}. Building a new alloy system based on the properties of the major alloying elements is highly desirable. Ta can increase the L1_{2} solvus temperature, while V can improve the strength of the alloy^{[66-68]}. Herein, the trained ML models are employed to predict the crystal structure and mechanical properties of the L1_{2} phase in new alloy systems containing V and Ta, such as the Co-V-Ta- and Co-Al-V-based systems. Because the models were trained without any information on the Co-V-Ta- and Co-Al-V-based systems, their prediction precision for these systems is usually low, so it is necessary to modify the models. The precision requirements for the model modification are listed in Table 1.

Table 1

Precision standard of ML model modification

Prediction model | Indicator | Precision requirement of three pieces of data |

Dopant occupancy models | Accuracy | 100% |

L1_{2} phase stability prediction models | Accuracy | 100% |

Mechanical property prediction models | R | > 0.9 |

 | MAE | < 5 |

 | RMSE | < 5 |
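The regression criteria in Table 1 can be checked with a few lines of Python. In this sketch, *R* is taken to be the Pearson correlation coefficient between true and predicted values, which is an assumption consistent with its stated range. The three value pairs are made-up numbers for illustration only, and the function names are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R (Pearson correlation, assumed), MAE and RMSE."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(var_t * var_p)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return {"R": r, "MAE": mae, "RMSE": rmse}


def meets_table1(metrics):
    """Apply the thresholds listed in Table 1 for the mechanical property models."""
    return metrics["R"] > 0.9 and metrics["MAE"] < 5 and metrics["RMSE"] < 5


# Three verified data points per round; the numbers are illustrative only.
y_true = [200.0, 210.0, 220.0]
y_pred = [202.0, 209.0, 223.0]
metrics = regression_metrics(y_true, y_pred)
passed = meets_table1(metrics)
```

If `passed` is false, the three verified points would be added to the training set and the model retrained before the next round.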

A rule is established whereby each round of random verification checks three data points to evaluate the model performance. In order to verify the prediction capability of the models for an unknown system, the calculated results of the Co-V-Ta-based system are added to the training set of the previously trained models and the optimized models are used to predict the new Co-Al-V-based system. After one round of iteration, the accuracy of the ML model for predicting dopant occupancy in the Co-V-Ta-based system improves from 66.67% to 100%. The accuracy of the prediction in the Co-Al-V-based system reaches 100%, i.e., the model does not need to be modified. In addition, in order to verify the generalization ability of the ML model, we use first-principles calculations to compute the rest of the data that have not yet been verified and compare the results with those predicted using the improved ML model. The prediction accuracy improves from 80.00% to 95.00% for the Co-V-Ta-based system after only one round of model optimization. The accuracy for the Co-Al-V-based system is 95.24%. The PCA visualization of the classification results is shown in Figure 7. The interpretation degrees of the Co-V-Ta- and Co-Al-V-based systems are 88.37% and 88.51%, respectively.

Figure 7. PCA classification result of occupied site prediction model based on GBC algorithm after one round of modification: (A) original Co-V-Ta-based system (accuracy reaches 80.00%); (B) modified Co-V-Ta-based system (accuracy reaches 95.00%); (C) original Co-Al-V-based system (accuracy reaches 95.24%).

The accuracy of the ML model for predicting the L1_{2} phase stability in the Co-V-Ta-based system improves from 66.67% to 100% after one round of iteration. The accuracy of the prediction in the Co-Al-V-based system reaches 100%, i.e., the model does not need to be modified. As before, we use first-principles calculations to compute the rest of the data that have not yet been verified. The verification shows that the prediction accuracy for the Co-V-Ta-based system improves from 70.00% to 95.00% after one round of iteration, while the model predictions for the Co-Al-V-based system are all correct. The PCA visualization of the classification results is shown in Figure 8. The interpretation degrees of the Co-V-Ta- and Co-Al-V-based systems are 88.37% and 89.12%, respectively. The modified gradient boosting algorithm is therefore capable of making accurate predictions of both the occupancy of TM dopants and the stability of the L1_{2} phase for the Co-V-Ta- and Co-Al-V-based systems.

Figure 8. PCA classification result of L1_{2} phase stability prediction model based on GBC algorithm after one round of modification: (A) original Co-V-Ta-based system (accuracy reaches 70.00%); (B) modified Co-V-Ta-based system (accuracy reaches 95.00%); (C) original Co-Al-V-based system (accuracy reaches 100%).

The iterative processes for improving the accuracy of the ML models for predicting the mechanical properties of the L1_{2} phase are shown in Supplementary Figure 4. The accuracy of the model predictions is significantly improved.

The optimization processes of the ML models for predicting the mechanical properties of the L1_{2} phase in the Co-V-Ta- and Co-Al-V-based systems are shown in Supplementary Figures 5 and 6, respectively. For a small amount of predicted data, it can be seen that the performance of the ** B**,

Figure 9 shows the overall prediction results of the modified mechanical performance models of the Co-V-Ta- and Co-Al-V-based systems and their model performances are shown in Figure 10. For the Co-V-Ta-based system, the *R* values of the ** B**,

Figure 9. Overall prediction results of modified mechanical performance models: (A) ** B**; (B)

It takes about two days for traditional first-principles calculations to compute a data point, while establishing an ML model requires five days. However, it takes less than a minute for the trained ML models to make their predictions. By comparing the computational workload and time of the modified ML models with those of the traditional first-principles calculations, we find that the ML-based prediction method more than doubles the calculation efficiency, as shown in Table 2.

Table 2

Comparison of time costs for first-principles calculations alone and ML-accelerated first principles calculations

Method | Task | Time |

Traditional DFT method | First-principles calculations | 92 days |

ML-accelerated method | First-principles calculations | 22 days |

 | Establish ML models | 5 days |

 | ML prediction | 1 minute |

 | Total | 27 days |
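The time budget in Table 2 can be checked with a short calculation. The sketch below only re-adds the entries of the table and takes their ratio; the variable names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60

# Entries taken from Table 2, converted to days.
dft_only_days = 92.0
ml_dft_days = 22.0             # first-principles calculations in the ML route
ml_training_days = 5.0         # establishing the ML models
ml_prediction_days = 1.0 / MINUTES_PER_DAY  # one minute of ML prediction

ml_route_days = ml_dft_days + ml_training_days + ml_prediction_days
speedup = dft_only_days / ml_route_days
days_saved = dft_only_days - ml_route_days
```

The ML-accelerated route totals about 27 days, which gives a ratio of roughly 3.4 and is consistent with the statement that the efficiency is more than doubled.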

Comparison of the predicted ** B**,

This work aims to address the challenges encountered by the traditional experimental approaches and first-principles calculation methods for the discovery of new Co-based superalloys (strengthened by L1_{2} ordered precipitates), both of which are inefficient, time-consuming and labor-intensive when used alone.

A new approach is proposed that combines machine learning (ML) and first-principles calculations to speed up the prediction of crystal structure, phase stability and mechanical properties for systems, such as Co-V-Ta- and Co-Al-V-based alloys. This information is critical for developing new Co-based superalloys with superior properties at elevated temperatures. ML models are established and trained for predicting the site occupancy, phase stability and mechanical properties. Through iterative interactions between model predictions and validations using first-principles calculations, the ML models are further improved. Finally, the refined models are used to make accurate predictions for the crystal structure and mechanical properties for Co-V-Ta- and Co-Al-V-based systems.

The combination of ML and first-principles calculations may shed light on the rapid prediction of crystal structure and mechanical properties of other advanced materials beyond Co-based alloys.

Project conception: Liu X, Wang C

Calculation task: Xi S, Yu J

Analysis: Xi S, Yu J, Bao L

Investigation: Xi S, Yu J, Bao L, Chen L, Li Z, Shi R

Draft preparation: Xi S, Yu J, Shi R

Supervision: Liu X

Availability of data and materials: Not applicable.

Conflicts of interest: All authors declare that there are no conflicts of interest.

Financial support and sponsorship: This work was supported by the National Key R&D Program of China (No. 2020YFB0704503), the National Natural Science Foundation of China (Grant No. 52001098 and Grant No. 51831007), the Key-Area Research and Development Program of Guangdong Province (Grant No. 2019B010943001), and the open research fund of Songshan Lake Materials Laboratory (2021SLABFK06).

Ethical approval and consent to participate: Not applicable.

Consent for publication: Not applicable.

Copyright: © The Author(s) 2022.


Xi S, Yu J, Bao L, Chen L, Li Z, Shi R, Wang C, Liu X. Machine learning-accelerated first-principles predictions of the stability and mechanical properties of L1_{2}-strengthened cobalt-based superalloys. *J Mater Inf* 2022;2:15. http://dx.doi.org/10.20517/jmi.2022.22


© 2016-2022 OAE Publishing Inc., except certain content provided by third parties
