The evaluation and prediction of fatigue properties are crucial for metallic materials. Although the determination of S-N curves is one of the most important methods for evaluating such properties, the required fatigue testing is costly and time-consuming. Furthermore, fatigue testing involves different test conditions, which complicates the evaluation of fatigue properties. This study develops a transfer convolutional neural network (TRCNN) framework, in which the prediction of the reversed torsion S-N curves of steels is transferred from rotating bending S-N curves. In the TRCNN framework, the source CNN models for rotating-bending curve prediction are first trained based on the composition and process conditions. Subsequently, based on the source models, the reversed torsion S-N curves are estimated by training the TRCNN models on only a small dataset. After proving the rationality of the framework, its universality with respect to different amounts of data is further investigated. The reversed torsion curves under small-sample conditions (22 samples) are predicted accurately by the TRCNN. Additionally, the TRCNN models remain accurate under varying amounts of data (22-112 samples), showing excellent generality for different amounts of fatigue data. The predictive capability of the TRCNN models is improved by introducing tensile properties into the source models. The proposed TRCNN framework can significantly reduce the cost of evaluating fatigue properties, and the prediction of S-N curves can be optimized by combining the transfer framework and low-cost properties related to fatigue.
The evaluation of the fatigue properties of metallic structural materials is of significant importance. The relationship between fatigue life and applied stress is the basis for fatigue analysis and anti-fatigue design, i.e., the stress-life approach. The resulting S-N curve is one of the most important tools for describing fatigue behavior. However, obtaining S-N curves via experimental testing remains costly and time-consuming. Furthermore, fatigue testing involves various test conditions, such as R-value and frequency, which further complicate the evaluation of fatigue properties. Therefore, modeling the prediction of S-N curves remains of significant interest.
Traditionally, various models have been developed for the prediction of fatigue properties. The Coffin–Manson and Basquin equations are two widely used classical models^{[1]}. However, these models mainly rely on fitting of the testing data. Based on cumulative damage calculations^{[2-4]}, several models have been developed, such as the Palmgren–Miner linear damage rule, continuum damage mechanics, and energy-based models. Several models have also been developed based on the prediction of crack nucleation and growth^{[4-8]}. Although they exhibit a high predictive ability, these models can only predict the lives of a limited number of alloys. The model parameters are usually determined experimentally or obtained from previous studies. Furthermore, most of the above models are again essentially based on fitting. Recently, attention has been given to fatigue prediction using the unified mechanics theory (UMT) proposed by Basaran^{[9]}, which is a purely physics-based approach that does not require the fitting of an empirical evolution function^{[10,11]}. The UMT has been validated for the fatigue prediction of metals in studies by Noushad
Some efforts have been made to directly build a relationship between material characteristics/loading conditions and fatigue properties via data-based machine learning (ML) to overcome the limitations of conventional models, since this approach does not require a clear knowledge of the mechanism^{[14-17]}. For example, fatigue research based on ML has made significant progress in both fatigue strength prediction^{[18-24]} and fatigue crack-driving force prediction^{[25]}. Several researchers have recently developed ML-based models for fatigue life prediction^{[26-32]}. Zhang
Recently, deep learning methods, such as convolutional neural networks (CNNs), have been applied to address the aforementioned problems. Kim
In this work, a transfer CNN (TRCNN) predictive framework is proposed, in which the reversed torsion S-N curve prediction of low-alloy steels is transferred from the corresponding rotating bending S-N curves. In the framework, based on the source CNN models, which predict the rotating bending S-N curves with the steel composition and processing parameters as inputs (source task), the reversed torsion S-N curves are predicted by the TRCNN models trained using only dozens of S-N curve examples (target task). The CNN approach in this framework is proven to be applicable for S-N curve prediction. The transfer method in the TRCNN further helps to reduce the data requirements of the model and therefore reduces the cost of fatigue data accumulation, since a target S-N curve can be predicted using existing S-N curves and particularly low-cost data, like high-frequency fatigue.
The MatNavi fatigue dataset built by the National Institute for Materials Science (NIMS)^{[41]} was used in the present work. In this dataset, the compositions, heat treatment conditions, and high-cycle S-N data of steels were recorded. Two datasets of rotating bending and reversed torsion fatigue, containing 411 and 141 samples, respectively, were collected. Each sample had 20 input features (composition and processing details) and a set of S-N data containing 13-20 data points. The complete datasets collected consist of data for carbon, low-alloy, spring, and stainless steel. The MatNavi dataset is of high quality because all fatigue tests were performed at a single institution, with only small scatter within the data.
Input and output ranges of various features in the database
Type  Feature  Min  Max  Mean  Standard deviation
Input  Carbon (wt.%)  0.09  0.63  0.396  0.098
Input  Silicon (wt.%)  0.16  2.05  0.306  0.254
Input  Manganese (wt.%)  0.32  1.6  0.825  0.289
Input  Phosphorus (wt.%)  0.004  0.031  0.017  0.005
Input  Sulphur (wt.%)  0.002  0.03  0.014  0.006
Input  Nickel (wt.%)  0.01  2.78  0.493  0.853
Input  Chromium (wt.%)  0.01  12.7  1.154  2.61
Input  Copper (wt.%)  0  0.26  0.061  0.049
Input  Molybdenum (wt.%)  0  0.24  0.061  0.085
Input  Normalizing temperature (℃)  30  900  820.47  188.76
Input  Through hardening temperature (℃)  30  975  833  136.5
Input  Through hardening time (min)  0  30  29.2  4.84
Input  Cooling rate for through hardening (℃/s)  0  24  11.76  7.15
Input  Tempering temperature (℃)  30  750  589.76  109.29
Input  Tempering time (min)  0  60  58.39  9.68
Input  Cooling rate for tempering (℃/s)  0  24  23.36  3.87
Input  Reduction ratio (ingot to bar)  289  5530  964.1  576.77
Input  Area proportion of inclusions deformed by plastic work  0  0.13  0.047  0.032
Input  Area proportion of inclusions occurring in discontinuous array  0  0.05  0.004  0.009
Input  Area proportion of isolated inclusions  0  0.06  0.009  0.012
Output  Rotating bending curves  411 samples
Output  Reversed torsion curves  141 samples
For each set of S-N data, the data points were fitted to a curve based on the following equation from the JSMS-SD-6-04^{[42]}:
where
Preparation of S-N curves for two alloys in the dataset. Points and solid lines represent the S-N data points and fitted S-N curves, respectively.
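For data preprocessing, the standard statistical z-score was used to eliminate the dimensional differences between the features^{[43]}. The inputs and outputs were normalized according to:
$$x' = \frac{x - \mu}{\sigma}$$
where $x$ is the original value of a feature, $\mu$ and $\sigma$ are the mean and standard deviation of that feature over the dataset, and $x'$ is the normalized value.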
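As a minimal sketch of the z-score normalization described above, the following Python snippet standardizes each feature column using only the standard library. The function name `zscore_columns` and the sample values are hypothetical, and the use of the population standard deviation is an assumption (it matches the behavior of scikit-learn's `StandardScaler`, but the original preprocessing code is not given).

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a row-major table to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: population standard deviation. A constant column would give 0,
    # so fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds


# Hypothetical mini-table: carbon content (wt.%) and tempering temperature (deg C)
features = [[0.25, 550.0], [0.40, 600.0], [0.55, 650.0]]
scaled, means, stds = zscore_columns(features)
```

The returned means and standard deviations would be kept so that predicted outputs can be mapped back to physical units by inverting the same transform.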
For the rotating-bending curves, a ratio of 4:1 was used to generate the training and testing sets from the dataset of 411 samples. Cross-validation was adopted to avoid the “lucky split” and the partitioning process was carried out randomly five times.
For the reversed torsion curves, from the initial dataset of 141 samples, 29 samples were selected as the validation set. Subsequently, 22 samples were randomly selected from the remaining 112 samples for TRCNN model training and testing. Similarly, the above partitioning processes were carried out ten times randomly. In addition, to investigate the impact of data volumes on the predictive capability of the model, various subsets containing 33-112 samples were randomly selected for modeling. In the above cases, the same 4:1 ratio was used to generate the training and test sets.
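The partitioning of the reversed torsion data can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the 29-sample validation set is assumed to stay fixed across the ten repetitions, and rounding the 4:1 ratio to 18 training and 4 test samples for a 22-sample subset is also an assumption.

```python
import random

N_TOTAL, N_VALIDATION = 141, 29


def hold_out_validation(seed=0):
    """Set aside the validation samples; the remaining 112 form the modeling pool."""
    rng = random.Random(seed)
    indices = list(range(N_TOTAL))
    rng.shuffle(indices)
    return indices[:N_VALIDATION], indices[N_VALIDATION:]


def draw_partition(pool, n_samples=22, train_fraction=0.8, seed=0):
    """Randomly pick n_samples from the pool and split them 4:1 into train/test."""
    rng = random.Random(seed)
    subset = rng.sample(pool, n_samples)
    n_train = round(train_fraction * n_samples)  # assumption: 18 of 22 for training
    return subset[:n_train], subset[n_train:]


validation, pool = hold_out_validation(seed=0)
# Ten random partitions of a 22-sample subset, one per seed
partitions = [draw_partition(pool, seed=k) for k in range(10)]
```

Calling `draw_partition` with `n_samples` between 33 and 112 gives the larger subsets used to study the effect of data volume.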
In the present work, a TRCNN framework was constructed, which treats the rotating bending S-N curves (source domain) as intermediate steps towards obtaining an estimate of the reversed torsion S-N curves (target domain). This process is schematically illustrated in
Transfer prediction framework for reversed torsion S-N curves.
First, a source model for rotating-bending S-N curve prediction was trained via the CNN method using a relatively large dataset. In this model, twenty-dimensional features (compositions and processing parameters) reshaped into a 5 × 5 matrix were used as inputs of the CNN. For the reshaping process, the twenty-dimensional input values were sequentially filled into a 5 × 5 matrix and the values of the last five elements in the matrix were set to zero. The rotating bending S-N curve containing 50 data points formed the output.
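The reshaping step can be illustrated with a short Python sketch. The function name `to_5x5_matrix` is hypothetical, and a row-by-row fill order is assumed as the reading of "sequentially filled".

```python
def to_5x5_matrix(features, size=5):
    """Fill 20 feature values row by row into a size x size matrix, zero-padding the rest."""
    if len(features) != 20:
        raise ValueError("expected 20 input features")
    # The last five cells (the whole fifth row) are set to zero
    padded = list(features) + [0.0] * (size * size - len(features))
    return [padded[r * size:(r + 1) * size] for r in range(size)]


# Hypothetical sample: the values 1..20 stand in for the normalized features
sample = [float(i) for i in range(1, 21)]
matrix = to_5x5_matrix(sample)
```

With this layout the first four rows hold the 20 features in their original order and the fifth row is all zeros.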
Unlike some commonly used CNN model architectures, the complexity of the source CNN model was reduced by simplifying the architecture to adapt to small-sample characteristics. Furthermore, the pooling layer was removed to reduce the loss of information during training. The sequence details of the layers in the source model are shown in
Source CNN model architecture details
Layer (type)  Output shape
conv2d_1 (Conv2D)  (5, 5, 32) 
conv2d_2 (Conv2D)  (5, 5, 64) 
flatten_1 (Flatten)  1600 
dense_1 (Dense)  128 
dense_2 (Dense)  64 
dropout_1 (Dropout)  64 
dense_3 (Dense)  50 
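As a quick consistency check on the table above, the following sketch traces the tensor shapes through the layers and counts the parameters of the fully connected layers. It assumes a single input channel and stride-1 "same" padding (implied by the unchanged 5 × 5 spatial size, but not stated in the text). The convolution kernel size is not given, so convolutional parameter counts are deliberately left out. All function names are hypothetical.

```python
def conv2d_same_shape(input_shape, n_filters):
    """Output shape of a stride-1 convolution with 'same' padding (assumed)."""
    height, width, _channels = input_shape
    return (height, width, n_filters)


def flatten_size(shape):
    """Number of elements after flattening a feature map."""
    size = 1
    for dim in shape:
        size *= dim
    return size


def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


shape = (5, 5, 1)                     # 5 x 5 input matrix, one channel (assumption)
shape = conv2d_same_shape(shape, 32)  # conv2d_1 -> (5, 5, 32)
shape = conv2d_same_shape(shape, 64)  # conv2d_2 -> (5, 5, 64)
flat = flatten_size(shape)            # flatten_1 -> 5 * 5 * 64

dense_sizes = [flat, 128, 64, 50]     # flatten_1, dense_1, dense_2, dense_3
params = [dense_params(a, b) for a, b in zip(dense_sizes, dense_sizes[1:])]
```

The flattened size of 1600 agrees with the `flatten_1` row, and the first fully connected layer holds by far the most dense-layer parameters.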
A brief review of the calculation process of the CNN is provided. When the CNN is used in this work, the output volume of a convolutional layer is referred to as a feature map, since the purpose of these layers is to extract features from the input volume. The convolution operation can be written as:
where
The feature map generated by the last convolutional layer is flattened into a vector. The vector is then fed to the first fully connected layer. The operation in the fully connected layer can be written as:
where the weight and bias of the
where
The target TRCNN predictive model for reversed torsion curves was then constructed. The model was trained as follows: (1) The convolutional layers and the first fully connected layer of the source CNN model were copied to the corresponding layers of the target TRCNN model. These “transferred feature layers” remained frozen and did not participate in further training. (2) The remaining layers of the target TRCNN model were initialized and trained for fatigue strength prediction. For comparison, the corresponding Non-TRCNN (non-transfer) model was also trained, in which the composition and processing parameters were directly coupled to the reversed torsion S-N curves without the rotating bending S-N curves as intermediate steps.
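The copy-and-freeze procedure can be sketched without any deep learning framework. The `Layer` class, the `build_target_model` function, and the toy weight lists below are hypothetical stand-ins, and the uniform re-initialization range is an arbitrary assumption. Only the layer names come from the architecture table.

```python
import copy
import random
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    weights: list
    trainable: bool = True


# Layers copied from the source model and kept frozen ("transferred feature layers")
TRANSFERRED = ("conv2d_1", "conv2d_2", "dense_1")


def build_target_model(source_layers, transferred=TRANSFERRED, seed=0):
    """Copy and freeze the transferred layers; re-initialize the remaining ones."""
    rng = random.Random(seed)
    target = []
    for layer in source_layers:
        if layer.name in transferred:
            # Step (1): copy the source weights and exclude them from training
            target.append(Layer(layer.name, copy.deepcopy(layer.weights), trainable=False))
        else:
            # Step (2): fresh weights, to be trained on the small target dataset
            fresh = [rng.uniform(-0.1, 0.1) for _ in layer.weights]
            target.append(Layer(layer.name, fresh, trainable=True))
    return target


names = ("conv2d_1", "conv2d_2", "dense_1", "dense_2", "dense_3")
source = [Layer(name, [1.0, 2.0, 3.0]) for name in names]  # toy trained weights
target = build_target_model(source)
```

In Keras, the same idea would roughly correspond to copying weights with `get_weights()` / `set_weights()` and setting `layer.trainable = False` on the transferred layers before compiling the target model.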
For the above CNN modeling, data preprocessing and model training were implemented using Keras and scikit-learn. Each model was trained for 1000 iterations using the mean squared error as the loss function, the Adam optimizer, and a learning rate of 0.001.
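The metrics used to evaluate the model performance are the mean absolute error (MAE) and squared correlation coefficient (R^{2}), given by^{[16,44,45]}:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)\right]^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}\,\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^{2}}$$
where $n$ is the number of samples, $y_i$ and $\hat{y}_i$ are the measured and predicted values, and $\bar{y}$ and $\bar{\hat{y}}$ are their respective means.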
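The two metrics can be sketched in plain Python as follows. Here R^{2} is implemented as the squared Pearson correlation between measured and predicted values, which is an assumption based on the term "squared correlation coefficient"; the function names and the sample stress values are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def squared_correlation(y_true, y_pred):
    """Squared Pearson correlation coefficient (assumed definition of R^2)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov * cov / (var_true * var_pred)


# Hypothetical fatigue strengths in MPa
measured = [400.0, 450.0, 500.0, 550.0]
predicted = [410.0, 440.0, 510.0, 540.0]
mae = mean_absolute_error(measured, predicted)
r2 = squared_correlation(measured, predicted)
```

Note that this form of R^{2} measures linear association only, so a prediction with a constant offset from the measurements would still score 1 while having a nonzero MAE; reporting both metrics together covers that case.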
Five CNN models were first built to predict the rotating-bending S-N curves based on the steel composition and processing parameters according to the transfer framework.
Performance of different source models. (A) MAE distribution of training set and testing set. (B) Training loss curves of source model.
Prediction results of source model for rotating bending curves. Eight predicted curves in the (A) training and (B) testing sets. Fatigue strength for the (C) training and (D) testing sets.
The TRCNN models for the reversed torsion curves were further trained based on the source model. In addition, CNN models without transfer (Non-TRCNN) were also built to show how the TR framework can succeed in curve prediction using only a small dataset. As described in the Methods, the above process was implemented ten times according to different partitions of the dataset. The results of the model trained from one of the partitions (Part2) are presented first.
Training loss curves for TRCNN and Non-TRCNN models.
The S-N curves in the validation set predicted by the TRCNN and Non-TRCNN models are compared in
Comparison of prediction results for validation set by TRCNN and Non-TRCNN models: (A) S-N curves; (B) fatigue strength.
Comparison of MAE results by TR and Non-TR models: (A) validation set of Part2; (B) overfitting of all partitions.
The above results demonstrate the advantages of the TRCNN framework. Using the TRCNN model, the fatigue data used to train a reliable model for reversed torsion curve prediction can be replaced with the already accumulated rotating bending data. Hence, the time and funding required for the data accumulation can be significantly reduced.
The predictive capability of the TRCNN model was remarkable when trained using only 22 samples. The dependence of the model predictions on the amount of fatigue data used for training is further investigated in this section. A series of fatigue datasets containing 33 to 112 samples was used to train the TR and Non-TR models.
MAE results of validation set for Non-TR and TR models trained with different fatigue data amounts. (A) MAE distribution of samples in one partition. (B) Mean MAE distribution with ten partitions.
In this section, efforts are made to help solve the problem of curve prediction under small-sample conditions from other perspectives. For the Non-TR model, hyperparameter modification and architecture improvement were considered to improve its predictive performance. On this basis, for the model hyperparameters, the dropout value, the number of neurons in the fully connected layer and the number of convolution kernels were adjusted; for the model architecture, the numbers of convolutional and fully connected layers were adjusted. However, the Non-TR model performance under different parameters is always unsatisfactory (their average MAE values are reasonably close, both at ~40 MPa, as shown in
In addition, we further optimized the TRCNN model architecture and demonstrated its good scalability. It is well known that quasi-static mechanical properties, such as tensile properties and hardness, are highly correlated with fatigue properties, such as fatigue strength, and various traditional empirical models have been established. For the dataset in the present work, a strong correlation exists between the tensile properties and two fatigue properties (fatigue strength of rotating bending and reversed torsion), as shown in
Prediction results after introducing tensile properties. (A) Distribution of UTS, TEL and rotating bending fatigue strength of dataset. (B) Distribution of UTS, TEL and reversed torsion fatigue strength of dataset. (C) An optimized TRCNN architecture incorporating a source model for tensile properties. (D) MAE results of basic TR models and TR models after introducing tensile properties.
In the present work, a deep learning-based transfer framework (TRCNN) is proposed to provide an efficient method for reversed torsion S-N curve prediction under small-sample conditions. The proposed framework utilizes the correlations between rotating bending and reversed torsion fatigue. The main conclusions are as follows:
(1) The TRCNN framework accurately predicts the reversed torsion curves under the condition of a small number of samples (22), and performs significantly better than the Non-TR model. Therefore, the demand for fatigue data is significantly reduced, which in turn lowers the cost of fatigue data accumulation. The transfer framework provides a basis for building an accurate S-N curve prediction.
(2) The TR model remained accurate under varying amounts of data (from 22 to 112), maintaining considerable advantages compared to the Non-TR model, showing excellent generality given various fatigue data amounts.
(3) The predictive capability of the TR models was improved by introducing tensile properties into the source model. This is presented as an effective method to optimize the prediction of S-N curves by combining the transfer framework and low-cost properties related to fatigue. The proposed transfer framework combines certain physical interpretability and powerful data analysis capabilities and can be extended to small-sample prediction problems of other mechanical properties.
Made substantial contributions to the conception, supervision, and design of the study and performed manuscript editing and review: Xu W, Wang C
Performed machine learning modeling and data analysis and interpretation, as well as draft writing: Wei X
Provided professional guidance: Jia Z
This study was financially supported by the National Key R&D Program (No. 2021YFB3702404). The financial support provided by the National Natural Science Foundation of China (No. U1808208, 52171109) is gratefully acknowledged. The authors gratefully acknowledge the financial support provided by the Basic Scientific Research Funds of the Northeastern University (N2007011).
All authors declared that there are no conflicts of interest.
© The Author(s) 2022.