REFERENCES

1. Hadadian M, Smått J, Correa-baena J. The role of carbon-based materials in enhancing the stability of perovskite solar cells. Energy Environ Sci 2020;13:1377-407.

2. Liu Y, Li Y, Wu Y, et al. High-efficiency silicon heterojunction solar cells: materials, devices and applications. Mater Sci Eng: R: Rep 2020;142:100579.

3. Kim M, Ham S, Cheng D, Wynn TA, Jung HS, Meng YS. Advanced characterization techniques for overcoming challenges of perovskite solar cell materials. Adv Energy Mater 2021;11:2001753.

4. Li H, Li F, Shen Z, et al. Photoferroelectric perovskite solar cells: principles, advances and insights. Nano Today 2021;37:101062.

5. L. R. Devereux, J. M. Cole. in Data science applied to sustainability analysis, edited by Jennifer Dunn and Prasanna Balaprakash (Elsevier, 2021), pp. 129.

6. Kojima A, Teshima K, Shirai Y, Miyasaka T. Organometal halide perovskites as visible-light sensitizers for photovoltaic cells. J Am Chem Soc 2009;131:6050-1.

7. Zhang F, Lu H, Tong J, Berry JJ, Beard MC, Zhu K. Advances in two-dimensional organic-inorganic hybrid perovskites. Energy Environ Sci 2020;13:1154-86.

8. Kim G, Min H, Lee KS, Lee DY, Yoon SM, Seok SI. Impact of strain relaxation on performance of α-formamidinium lead iodide perovskite solar cells. Science 2020;370:108-12.

9. Green MA, Dunlop ED, Hohl-ebinger J, Yoshita M, Kopidakis N, Hao X. Solar cell efficiency tables (Version 58). Prog Photovolt Res Appl 2021;29:657-67.

10. NREL, Best research-cell efficiency chart. Available from: https://www.nrel.gov/pv/cell-efficiency.html [Last accessed on 8 Jun 2022].

11. Luo Q, Wu R, Ma L, et al. Recent advances in carbon nanotube utilizations in perovskite solar cells. Adv Funct Mater 2021;31:2004765.

12. Luo D, Su R, Zhang W, Gong Q, Zhu R. Minimizing non-radiative recombination losses in perovskite solar cells. Nat Rev Mater 2020;5:44-60.

13. Wu T, Liu X, Luo X, et al. Lead-free tin perovskite solar cells. Joule 2021;5:863-86.

14. O'regan B, Grätzel M. A low-cost, high-efficiency solar cell based on dye-sensitized colloidal TiO2 films. Nature 1991;353:737-40.

15. Zeng K, Tong Z, Ma L, Zhu W, Wu W, Xie Y. Molecular engineering strategies for fabricating efficient porphyrin-based dye-sensitized solar cells. Energy Environ Sci 2020;13:1617-57.

16. Kakiage K, Aoyama Y, Yano T, Oya K, Fujisawa J, Hanaya M. Highly-efficient dye-sensitized solar cells with collaborative sensitization by silyl-anchor and carboxy-anchor dyes. Chem Commun (Camb) 2015;51:15894-7.

17. Tang CW. Two-layer organic photovoltaic cell. Appl Phys Lett 1986;48:183-5.

18. Armin A, Li W, Sandberg OJ, et al. A history and perspective of non-fullerene electron acceptors for organic solar cells. Adv Energy Mater 2021;11:2003570.

19. Luo Z, Liu T, Yan H, Zou Y, Yang C. Isomerization strategy of nonfullerene small-molecule acceptors for organic solar cells. Adv Funct Mater 2020;30:2004477.

20. Zheng Z, Yao H, Ye L, Xu Y, Zhang S, Hou J. PBDB-T and its derivatives: a family of polymer donors enables over 17% efficiency in organic photovoltaics. Mater Today 2020;35:115-30.

21. Mishra A. Material perceptions and advances in molecular heteroacenes for organic solar cells. Energy Environ Sci 2020;13:4738-93.

22. Kini GP, Jeon SJ, Moon DK. Latest progress on photoabsorbent materials for multifunctional semitransparent organic solar cells. Adv Funct Mater 2021;31:2007931.

23. Zhao C, Wang J, Zhao X, Du Z, Yang R, Tang J. Recent advances, challenges and prospects in ternary organic solar cells. Nanoscale 2021;13:2181-208.

24. Schmidt J, Marques MRG, Botti S, Marques MAL. Recent advances and applications of machine learning in solid-state materials science. npj Comput Mater 2019:5.

25. Rajan K. Materials informatics. Mater Today 2005;8:38-45.

26. Kohn W, Sham LJ. Self-consistent equations including exchange and correlation effects. Phys Rev 1965;140:A1133-8.

27. Hohenberg P, Kohn W. Inhomogeneous electron gas. Phys Rev 1964;136:B864-71.

28. Luo S, Zeng Z, Wang H, et al. Recent progress in conjugated microporous polymers for clean energy: synthesis, modification, computer simulations, and applications. Progress in Polymer Science 2021;115:101374.

29. Chen C, Zuo Y, Ye W, Li X, Deng Z, Ong SP. A critical review of machine learning of energy materials. Adv Energy Mater 2020;10:1903242.

30. Haghighatlari M, Vishwakarma G, Altarawy D, et al. ChemML: a machine learning and informatics program package for the analysis, mining, and modeling of chemical and materials data. WIREs Comput Mol Sci 2020:10.

31. Moosavi SM, Jablonka KM, Smit B. The role of machine learning in the understanding and design of materials. J Am Chem Soc 2020:20273-87.

32. Chen L, Pilania G, Batra R, et al. Polymer informatics: current status and critical next steps. Mater Sci Eng: R: Rep 2021;144:100595.

33. Masood H, Toe CY, Teoh WY, Sethu V, Amal R. Machine learning for accelerated discovery of solar photocatalysts. ACS Catal 2019;9:11774-87.

34. Jia Y, Hou X, Wang Z, Hu X. Machine learning boosts the design and discovery of nanomaterials. ACS Sustainable Chem Eng 2021;9:6130-47.

35. Brown KA, Brittman S, Maccaferri N, Jariwala D, Celano U. Machine learning in nanoscience: big data at small scales. Nano Lett 2020;20:2-10.

36. Domingos P. A few useful things to know about machine learning. Commun ACM 2012;55:78-87.

37. Halevy A, Norvig P, Pereira F. The unreasonable effectiveness of data. IEEE Intell Syst 2009;24:8-12.

38. Jablonka KM, Ongari D, Moosavi SM, Smit B. Big-data science in porous materials: materials genomics and machine learning. Chem Rev 2020;120:8066-129.

39. Xiong J, Shi S, Zhang T. Machine learning of phases and mechanical properties in complex concentrated alloys. J Mater Sci Technol 2021;87:133-42.

40. BONEAU CA. The effects of violations of assumptions underlying the test. Psychol Bull 1960;57:49-64.

41. Edgell SE, Noon SM. Effect of violation of normality on the t test of the correlation coefficient. Psychological Bulletin 1984;95:576-83.

42. R. G. Lomax, An introduction to statistical concepts. (Mahwah, N.J. : Lawrence Erlbaum Associates Publishers, 2007), p. 10.

43. Breunig MM, Kriegel H, Ng RT, Sander J. LOF: identifying density-based local outliers. SIGMOD Rec 2000;29:93-104.

44. Liu FT, Ting KM, Zhou Z. Isolation-based anomaly detection. ACM Trans Knowl Discov Data 2012;6:1-39.

45. Rousseeuw PJ. Least median of squares regression. J Am Stat Assoc 1984;79:871-80.

46. Rousseeuw PJ, Driessen KV. A fast algorithm for the minimum covariance determinant estimator. Technometrics 1999;41:212-23.

47. Schölkopf B, Platt JC, Shawe-Taylor J, Smola AJ, Williamson RC. Estimating the support of a high-dimensional distribution. Neural Comput 2001;13:1443-71.

48. Chang C, Lin C. LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2011;2:1-27.

49. Zhao Y, Hryniewicki MK. Improving supervised outlier detection with unsupervised representation learning. Available from: https://arxiv.org/abs/1912.00290 [Last accessed on 10 Jun 2022].

50. Chen T, Guestrin C. XGBoost: a scalable tree boosting system (2016), https://xgboost.readthedocs.io/en/latest/install.html.

51. Dorogush AV, Ershove V, Guilin A. CatBoost: gradient boosting with categorical features support (2018). Available from: https://catboost.ai/docs [Last accessed on 8 Jun 2022].

52. Jain A, Ong SP, Hautier G, et al. Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL Materials 2013;1:011002.

53. Zagorac D, Müller H, Ruehl S, Zagorac J, Rehme S. Recent developments in the inorganic crystal structure database: theoretical crystal structure data and related features. J Appl Crystallogr 2019;52:918-25.

54. Saal JE, Kirklin S, Aykol M, Meredig B, Wolverton C. Materials design and discovery with high-throughput density functional theory: The open quantum materials database (OQMD). JOM 2013;65:1501-9.

55. Kirklin S, Saal JE, Meredig B, et al. The open quantum materials database (OQMD): assessing the accuracy of DFT formation energies. npj Comput Mater 2015:1.

56. P. Villars. Materials platform for data science (2019). Available from: https://mpds.io/ [Last accessed on 8 Jun 2022].

57. Su Y. Materials genome engineering databases (University of Science and Technology Beijing, 2018). Available from: https://www.mgedata.cn/ [Last accessed on 8 Jun 2022].

58. Qian Q, Wang Y, Zhao S. Materials data specification: methods and use cases. Comput Mater Sci 2019;169:109086.

59. Tao Q, Xu P, Li M, Lu W. Machine learning for perovskite materials design and discovery. npj Comput Mater 2021:7.

60. Ramakrishna S, Zhang T, Lu W, et al. Materials informatics. J Intell Manuf 2019;30:2307-26.

61. Groom CR, Bruno IJ, Lightfoot MP, Ward SC. The cambridge structural database. Acta Cryst 2016;72:171-9.

62. Grazulis S, Chateigner D, Downs RT, Yokochi AFT, Quiros M, Lutterotti L, Manakova E, Butkus J, Moeck P, Bail AL. Crystallography open database - an open-access collection of crystal structures. J Appl Crystallogr 2009;42:726-9.

63. Gómez-Bombarelli R, Wei JN, Duvenaud D, et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 2018;4:268-76.

64. G. Landrum. RDKit: Open-source cheminformatics (2012). Available from: http://www.rdkit.org/ [Last accessed on 8 Jun 2022].

65. Ramakrishnan R, Dral PO, Rupp M, von Lilienfeld OA. Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 2014;1:140022.

66. Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG. ZINC: a free tool to discover chemistry for biology. J Chem Inf Model 2012;52:1757-68.

67. IBM. World Community Grid. Available from: http://www.worldcommunitygrid.org/ [Last accessed on 8 Jun 2022].

68. Lopez SA, Sanchez-lengeling B, de Goes Soares J, Aspuru-guzik A. Design principles and top non-fullerene acceptor candidates for organic photovoltaics. Joule 2017;1:857-70.

69. Lopez SA, Pyzer-Knapp EO, Simm GN, et al. The Harvard organic photovoltaic dataset. Sci Data 2016;3:160086.

70. Venkatraman V, Raju R, Oikonomopoulos SP, Alsberg BK. The dye-sensitized solar cell database. J Cheminform 2018;10:18.

71. Odabaşı Ç, Yıldırım R. Performance analysis of perovskite solar cells in 2013-2018 using machine-learning tools. Nano Energy 2019;56:770-91.

72. Odabaşı Ç, Yıldırım R. Machine learning analysis on stability of perovskite solar cells. Sol Energy Mater Sol Cells 2020;205:110284.

73. Odabaşı Ç, Yıldırım R. Assessment of reproducibility, hysteresis, and stability relations in perovskite solar cells using machine learning. Energy Technol 2020;8:1901449.

74. Yılmaz B, Yıldırım R. Critical review of machine learning applications in perovskite solar research. Nano Energy 2021;80:105546.

75. D. Systèmes, BIOVIA MATERIALS STUDIO (Dassault Systèmes, 2002-2021). Available from: https://www.3ds.com/products-services/biovia/products/molecular-modeling-simulation/biovia-materials-studio/ [Last access on 8 Jun 2022].

76. Kresse G, Furthmüller J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput Mater Sci 1996;6:15-50.

77. Kresse G, Furthmüller J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys Rev B Condens Matter 1996;54:11169-86.

78. Kresse G, Hafner J. Ab initio molecular dynamics for liquid metals. Phys Rev B Condens Matter 1993;47:558-61.

79. Kresse G, Joubert D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys Rev B 1999;59:1758-75.

80. Frisch MJ, Trucks GW, Schlegel HB et al. Gaussian 16 Rev. C. 01. Available from: https://gaussian.com/citation_b01/ [Last accessed on 10Jun 2022].

81. Hartono NTP, Thapa J, Tiihonen A, et al. How machine learning can help select capping layers to suppress perovskite degradation. Nat Commun 2020;11:4172.

82. Lundberg SM, Erion G, Chen H, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2020;2:56-67.

83. Saidi WA, Shadid W, Castelli IE. Machine-learning structural and electronic properties of metal halide perovskites using a hierarchical convolutional neural network. npj Comput Mater 2020:6.

84. Zhao Y, Zhang J, Xu Z, et al. Discovery of temperature-induced stability reversal in perovskites using high-throughput robotic learning. Nat Commun 2021;12:2191.

85. Mahmood A, Wang J. Machine learning for high performance organic solar cells: current scenario and future prospects. Energy Environ Sci 2021;14:90-105.

86. Kode-Chemoinformatics. Dragon 7 (2021). Available from: https://gaussian.com/citation_b01/ [Last accessed on 8 Jun 2022].

87. Available from: https://match.pmf.kg.ac.rs/electronic_versions/Match56/n2/match56n2_237-248.pdf [Last accessed on 10 Jun 2022].

88. Krenn M, Häse F, Nigam A, Friederich P, Aspuru-guzik A. Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach Learn : Sci Technol 2020;1:045024.

89. Kar S, Roy JK, Leszczynski J. In silico designing of power conversion efficient organic lead dyes for solar cells using todays innovative approaches to assure renewable energy for future. npj Comput Mater 2017:3.

90. Krishna JG, Ojha PK, Kar S, Roy K, Leszczynski J. Chemometric modeling of power conversion efficiency of organic dyes in dye sensitized solar cells for the future renewable energy. Nano Energy 2020;70:104537.

91. Yap CW. PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J Comput Chem 2011;32:1466-74.

92. Ju L, Li M, Tian L, Xu P, Lu W. Accelerated discovery of high-efficient N-annulated perylene organic sensitizers for solar cells via machine learning and quantum chemistry. Mater Today Commun 2020;25:101604.

93. Lu T, Li M, Yao Z, Lu W. Accelerated discovery of boron-dipyrromethene sensitizer for solar cells by integrating data mining and first principle. J Mater 2021;7:790-801.

94. Shemetulskis NE, Weininger D, Blankley CJ, Yang JJ, Humblet C. Stigmata: an algorithm to determine structural commonalities in diverse datasets. J Chem Inf Comput Sci 1996;36:862-71.

95. Cereto-Massagué A, Ojeda MJ, Valls C, Mulero M, Garcia-Vallvé S, Pujadas G. Molecular fingerprint similarity search in virtual screening. Methods 2015;71:58-63.

96. Muegge I, Mukherjee P. An overview of molecular fingerprint similarity search in virtual screening. Expert Opin Drug Discov 2016;11:137-48.

97. Pattanaik L, Coley CW. Molecular representation: going long on fingerprints. Chem 2020;6:1204-7.

98. Sun W, Zheng Y, Yang K, et al. Machine learning-assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials. Sci Adv 2019;5:eaay4275.

99. Kranthiraja K, Saeki A. Experiment-oriented machine learning of polymer: non-fullerene organic solar cells. Adv Funct Mater 2021;31:2011168.

100. Lo. Mentel. mendeleev - a python resource for properties of chemical elements, ions and isotopes (2014). Available from: https://github.com/lmmentel/mendeleev [Last accessed on 8 Jun 2022].

101. Li C, Hao H, Xu B, et al. A progressive learning method for predicting the band gap of ABO$$_3$$ perovskites using an instrumental variable. J Mater Chem C 2020;8:3127-36.

102. Ong SP, Richards WD, Jain A, et al. Python materials genomics (pymatgen): a robust, open-source python library for materials analysis. Comput Mater Sci 2013;68:314-9.

103. Pilania G, Balachandran PV, Kim C, Lookman T. Finding new perovskite halides via machine learning. Front Mater 2016:3.

104. N. Chen, Bond parameter function and application (In Chinese), 1st ed. (CHINA SCIENCE PUBLISHING & MEDIA LTD, Beijing, China, 1976.

105. Slater JC. A simplification of the hartree-fock method. Phys Rev 1951;81:385-90.

106. Available from: https://jsc.niic.nsc.ru/ [Last accessed on 10 Jun 2022].

107. Pauling L. The nature of the chemical bond. application of results obtained from the quantum mechanics and from a theory of paramagnetic susceptibility to the structure of molecules. J Am Chem Soc 1931;53:1367-400.

108. Quill LL. The chemistry and metallurgy of miscellaneous materials. J Chem Educ 1950;27:583.

109. Zachariasen WH. A set of empirical crystal radii for ions with inert gas configuration. Zeitschrift für Kristallographie - Crystalline Materials 1931;80:137-53.

110. Sanderson RT. Principles of electronegativity Part I. general nature. J Chem Educ 1988;65:112.

111. Beskow G. V. M. Goldschmidt: geochemische verteilungsgesetze der elemente. Geologiska Föreningen i Stockholm Förhandlingar 2010;46:738-43.

112. Lu W, Lv W, Zhang Q, Lu K, Ji X. Material data mining in Nianyi Chen's scientific family: material data mining in Nianyi Chen's scientific family. J Chemom 2018;32:e3022.

113. Murray JS, Lane P, Brinck T, Paulsen K, Grice ME, Politzer P. Relationships of critical constants and boiling points to computed molecular surface properties. J Phys Chem 1993;97:9369-73.

114. Byrd EF, Rice BM. Improved prediction of heats of formation of energetic materials using quantum mechanical calculations. J Phys Chem A 2006;110:1005-13.

115. Rice BM, Byrd EF. Evaluation of electrostatic descriptors for predicting crystalline density. J Comput Chem 2013;34:2146-51.

116. T. Lu. fast machine learning (2021). Available from: https://pypi.org/project/fast-machine-learning/ [Last accessed on 8 Jun 2022].

117. Sun W, Li M, Li Y, et al. The use of deep learning to fast evaluate organic photovoltaic materials. Adv Theory Simul 2019;2:1800116.

118. Jang J, Gu GH, Noh J, Kim J, Jung Y. Structure-based synthesizability prediction of crystals using partially supervised learning. J Am Chem Soc 2020;142:18836-43.

119. Chen C, Ye W, Zuo Y, Zheng C, Ong SP. Graph networks as a universal machine learning framework for molecules and crystals. Chem Mater 2019;31:3564-72.

120. Xie T, Grossman JC. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys Rev Lett 2018;120:145301.

121. T. Stephens. gplearn: Genetic Programming in Python (2016). Available from: https://gplearn.readthedocs.io/ [Last accessed on 8 Jun 2022].

122. Fortin FA, Rainville FMD, Gardner MA, Parizeau M, Gagné C. DEAP: Evolutionary Algorithms Made Easy J Mach Learn Res 13, 2171 (2012). Available from: https://www.jmlr.org/papers/v13/fortin12a.html[last accessed on 10 Jun 2022].

123. Ouyang R, Curtarolo S, Ahmetcik E, Scheffler M, Ghiringhelli LM. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys Rev Materials 2018:2.

124. Bartel CJ, Sutton C, Goldsmith BR, et al. New tolerance factor to predict the stability of perovskite oxides and halides. Sci Adv 2019;5:eaav0693.

125. Varoquaux G, Gramfort A, Pedregosa F, Michel V, Thirion B. Multi-subject dictionary learning to segment an atlas of brain spontaneous activity. J Mach Learn Res 2011;12:2825.

126. Golbraikh A, Shen M, Xiao Z, Xiao Y, Lee K, Tropsha A. Rational selection of training and test sets for the development of validated QSAR models. J Comput Aided Mol Des 2003;17:241-53.

127. Golbraikh A, Tropsha A. Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. Mol Divers 2000;5:231-43.

128. Chandrashekar G, Sahin F. A survey on feature selection methods. Comput Electr Eng 2014;40:16-28.

129. Guyon I, Nikravesh M, Gunn S, Zadeh LA. Feature Extraction. Fuzziness Soft Comput 2006;207:778.

130. Ding C, Peng H. Minimum redundancy feature selection from microarray gene expression data. J Bioinform Comput Biol 2005;3:185-205.

131. Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 2005;27:1226-38.

132. Ramírez-gallego S, Lastra I, Martínez-rego D, et al. Fast-mRMR: fast minimum redundancy maximum relevance algorithm for high-dimensional big data: fast-mRMR algorithm for big data. Int J Intell Syst 2017;32:134-52.

133. Guyon I, Weston J, Barnhill S, Vapnik V. Gene selection for cancer classification using support vector machines. Machine Learning 2022;46:389-422.

134. Genetic programming in python, with a scikit-learn inspired API (2016), https://gplearn.readthedocs.io/ [Last accessed on 8 Jun 2022].

135. Collette Y, Hansen N, Pujol G, Salazar Aponte D, Le Riche R. In multidisciplinary design optimization in computational mechanics (2013), pp. 499.

136. S. Mirjalili. in Evolutionary algorithms and neural networks: theory and applications, edited by Seyedali Mirjalili. Springer International Publishing: Cham; 2019. pp. 43.

137. Whitley D. A genetic algorithm tutorial. Stat Comput 1994:4.

138. Ferri F, Pudil P, Hatef M, Kittler J. Comparative study of techniques for large-scale feature selection. Comparative Studies and Hybrid Systems. Elsevier; 1994. pp. 403-13.

139. Baraniuk R. Compressive sensing[Lecture Notes]. IEEE Signal Process Mag 2007;24:118-21.

140. L. Breiman, J. H. Friedman, and R. A. Olshen, Classification and regression trees. (Wadsworth International Group, Belmont, CA, 1984).

141. Wen Y, Fu L, Li G, Ma J, Ma H. Accelerated discovery of potential organic dyes for dye-sensitized solar cells by interpretable machine learning models and virtual screening. Sol RRL 2020;4:2000110.

142. Shapley LS. A value for n-person games. (Contrib. Theor. Games, 1953).

143. Zhang S, Lu T, Xu P, Tao Q, Li M, Lu W. Predicting the formability of hybrid organic-inorganic perovskites via an interpretable machine learning strategy. J Phys Chem Lett 2021;12:7423-30.

144. Guolin K, Qi M, Thomas F, Taifeng W, et al. In advances in neural information processing systems 30 (NIPS 2017) (Long Beach, CA, USA, 2017).

145. Paszke A, Gross S, Massa F, et al. PyTorch: an imperative style, high-performance deep learning library. (Curran Associates, Inc, 2019), pp. 8024. Available from: https://proceedings.neurips.cc/paper/2019/file/bdbca288fee7f92f2bfa9f7012727740-Paper.pdf [Last accessed on 13 Jun 2022].

146. M. Abadi, A. Agarwal, P. Barham, et al. TensorFlow: large-scale machine learning on heterogeneous systems (2015). Available from: https://arxiv.org/abs/1603.04467 [Last accessed on 13 Jun 2022].

147. Zárate Hernández LA, Camacho-Mendoza RL, González-Montiel S, Cruz-Borbolla J. The chemical reactivity and QSPR of organic compounds applied to dye-sensitized solar cells using DFT. J Mol Graph Model 2021;104:107852.

148. Wu Y, Guo J, Sun R, Min J. Machine learning for accelerating the discovery of high-performance donor/acceptor pairs in non-fullerene organic solar cells. npj Comput Mater 2020:6.

149. David TW, Anizelli H, Tyagi P, Gray C, Teahan W, Kettle J. Using large datasets of organic photovoltaic performance data to elucidate trends in reliability between 2009 and 2019. IEEE J Photovoltaics 2019;9:1768-73.

150. Kar S, Roy J, Leszczynska D, Leszczynski J. Power conversion efficiency of arylamine organic dyes for dye-sensitized solar cells (DSSCs) explicit to cobalt electrolyte: understanding the structural attributes using a direct QSPR approach. Computation 2017;5:2.

151. Roy JK, Kar S, Leszczynski J. Insight into the optoelectronic properties of designed solar cells efficient tetrahydroquinoline dye-sensitizers on TiO2(101) surface: first principles approach. Sci Rep 2018;8:10997.

152. Roy JK, Kar S, Leszczynski J. Electronic structure and optical properties of designed photo-efficient indoline-based dye-sensitizers with D-A-$$\pi$$-A framework. J Phys Chem C 2019;123:3309-20.

153. Roy K, Ambure P, Kar S, Ojha PK. Is it possible to improve the quality of predictions from an "intelligent" use of multiple QSAR/QSPR/QSTR models?: quality of predictions from an "intelligent" use of multiple models. J Chemom 2018;32:e2992.

154. Cramer J. The origins of logistic regression. SSRN J ; doi: 10.2139/ssrn.360300.

155. Tolles J, Meurer WJ. Logistic regression: relating patient characteristics to outcomes. JAMA 2016;316:533-4.

156. Walker SH, Duncan DB. Estimation of the probability of an event as a function of several independent variables. Biometrika 1967;54:167.

157. Yu Y, Tan X, Ning S, Wu Y. Machine learning for understanding compatibility of organic-inorganic hybrid perovskites with post-treatment amines. ACS Energy Lett 2019;4:397-404.

158. Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw 2010:33.

159. Santosa F, Symes WW. Linear inversion of band-limited reflection seismograms. SIAM J Sci and Stat Comput 1986;7:1307-30.

160. Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 1970;12:55-67.

161. Li X, Dan Y, Dong R, et al. Computational screening of new perovskite materials using transfer learning and deep learning. Appl Sci 2019;9:5510.

162. Stoddard RJ, Dunlap-shohl WA, Qiao H, Meng Y, Kau WF, Hillhouse HW. Forecasting the decay of hybrid perovskite performance using optical transmittance or reflected dark-field imaging. ACS Energy Lett 2020;5:946-54.

163. Padula D, Simpson JD, Troisi A. Combining electronic and structural features in machine learning models to predict organic solar cells properties. Mater Horiz 2019;6:343-9.

164. Wu X, Kumar V, Ross Quinlan J, et al. Top 10 algorithms in data mining. Knowl Inf Syst 2008;14:1-37.

165. Raccuglia P, Elbert KC, Adler PD, et al. Machine-learning-assisted materials discovery using failed experiments. Nature 2016;533:73-6.

166. Jiménez-luna J, Grisoni F, Schneider G. Drug discovery with explainable artificial intelligence. Nat Mach Intell 2020;2:573-84.

167. Breiman L. Pasting small votes for classification in large databases and on-line. Machine Learning 1999;36:85-103.

168. Breiman L. Bagging predictors. Mach Learn 1996;24:123-40.

169. Tin Kam Ho. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Machine Intell ;20:832-44.

170. Louppe G, Geurts P. Ensembles on random patches. (Springer Berlin Heidelberg, Berlin, Heidelberg, 2012), pp. 346. Available from: https://link.springer.com/chapter/10.1007/978-3-642-33460-3_28 [Last accessed on 13 Jun 2022].

171. Takahashi K, Takahashi L, Miyazato I, Tanaka Y. Searching for hidden perovskite materials for photovoltaic systems by combining data science and first principle calculations. ACS Photonics 2018;5:771-5.

172. Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning and an application to boosting. J Comput and Sys Sci 1997;55:119-39.

173. J. H. Friedman, Greedy function approximation: a gradient boosting machine. Ann. Stat 2001; 29, 1189 (2001),.

174. G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T. -Y. Liu, in Proceedings of the 31st International Conference on Neural Information Processing Systems (Curran Associates Inc., Long Beach, California, USA, 2017), pp. 3149.

175. Prokhorenkova L, . Gusev G, A. Vorobev, A. V. Dorogush, A. Gulin. CatBoost: unbiased boosting with categorical features. Available from: https://arxiv.org/abs/1706.09516 [Last accessed on 13 Jun 2022].

176. Sahu H, Ma H. Unraveling correlations between molecular properties and device parameters of organic solar cells using machine learning. J Phys Chem Lett 2019;10:7277-84.

177. Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995;20:273-97.

178. Smola AJ, Schölkopf B. A tutorial on support vector regression. Stat Comput 2004;14:199-222.

179. Wu T, Wang J. Global discovery of stable and non-toxic hybrid organic-inorganic perovskites for photovoltaic systems by combining machine learning method with first principle calculations. Nano Energy 2019;66:104070.

180. Ambikasaran S, Foreman-Mackey D, Greengard L, Hogg DW, O'Neil M. Fast direct methods for gaussian processes. IEEE Trans Pattern Anal Mach Intell 2016;38:252-65.

181. Rasmussen CE, Williams CKI. Gaussian processes for machine learning The MIT Press; 2006.

182. Pyzer-knapp EO, Simm GN, Aspuru Guzik A. A Bayesian approach to calibrating high-throughput virtual screening results and application to organic photovoltaic materials. Mater Horiz 2016;3:226-33.

183. Li J, Pradhan B, Gaur S, Thomas J. Predictions and strategies learned from machine learning to develop high-performing perovskite solar cells. Adv Energy Mater 2019;9:1901891.

184. Sanchez-Lengeling B, Aspuru-Guzik A. Inverse molecular design using machine learning: generative models for matter engineering. Science 2018;361:360-5.

185. Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA. Generative adversarial networks: an overview. IEEE Signal Process Mag 2018;35:53-65.

186. Goodfellow I, Pouget-abadie J, Mirza M, et al. Generative adversarial networks. Commun ACM 2020;63: 139-44. Available from: https://proceedings.neurips.cc/paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.html [Last accessed on 13 Jun 2022].

187. Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative Adversarial Networks. 2014. Available from: https://arxiv.org/abs/1406.2661 [Last accessed on 13 Jun 2022].

188. Kingma D P, Welling M. Auto-encoding variational bayes. Available from: https://arxiv.org/abs/1312.6114 [Last accessed on 13 Jun 2022].

189. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 1987;20:53-65.

190. Choudhary K, Bercx M, Jiang J, Pachter R, Lamoen D, Tavazza F. Accelerated discovery of efficient solar-cell materials using quantum and machine-learning methods. Chem Mater 2019;31:5900-8.

191. Komer B, Socastro MT, Kim W. Hyperopt: distributed hyperparameter optimization (2012-2021). Available from: https://github.com/hyperopt/hyperopt [Last accessed on 9 Jun 2022].

192. Akiba T, Sano S, Yanase T, Ohta T, Koyama M. A next-generation hyperparameter optimization framework. Preferred Networks, Inc., 2017-2021.

193. Liaw R, Liang E, Nishihara R, Moritz P, Gonzalez JE, Stoica I. tune: scalable hyperparameter tuning (The Ray Team, 2018). Available from: https://docs.ray.io/en/latest/tune/index.html [Last accessed on 10 Jun 2022].

194. Lu S, Zhou Q, Ma L, Guo Y, Wang J. Rapid discovery of ferroelectric photovoltaic perovskites and material descriptors via machine learning. Small Methods 2019;3:1900360.

195. Kim C, Pilania G, Ramprasad R. Machine learning assisted predictions of intrinsic dielectric breakdown strength of ABX3 perovskites. J Phys Chem C 2016;120:14575-80.

196. Körbel S, Marques MAL, Botti S. Stability and electronic properties of new inorganic perovskites from high-throughput ab initio calculations. J Mater Chem C 2016;4:3157-67.

197. Gómez-Bombarelli R, Aguilera-Iparraguirre J, Hirzel TD, et al. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat Mater 2016;15:1120-7.

198. Tao Q, Lu T, Sheng Y, Li L, Lu W, Li M. Machine learning aided design of perovskite oxide materials for photocatalytic water splitting. J Energy Chem 2021;60:351-9.

199. Xu P, Lu T, Ju L, Tian L, Li M, Lu W. Machine learning aided design of polymer with targeted band gap based on DFT computation. J Phys Chem B 2021;125:601-11.

200. Jin H, Zhang H, Li J, et al. Discovery of novel two-dimensional photovoltaic materials accelerated by machine learning. J Phys Chem Lett 2020;11:3075-81.

201. Rajagopal A, Yao K, Jen AK. Toward perovskite solar cell commercialization: a perspective and research roadmap based on interfacial engineering. Adv Mater 2018;30:e1800455.

202. Li Z, Klein TR, Kim DH, et al. Scalable fabrication of perovskite solar cells. Nat Rev Mater 2018:3.

203. Zhou G, Chu W, Prezhdo OV. Structural deformation controls charge losses in MAPbI$$_3$$: unsupervised machine learning of nonadiabatic molecular dynamics. ACS Energy Lett 2020;5:1930-8.

204. Kennard RW, Stone LA. Computer Aided design of experiments. Technometrics 1969;11:137-48.

205. Park H, Jun C. A simple and fast algorithm for K-medoids clustering. Expert Syst Appl 2009;36:3336-41.

206. Ertl P, Schuffenhauer A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminform 2009;1:8.

207. Venkatraman V, Yemene AE, de Mello J. Prediction of absorption spectrum shifts in dyes adsorbed on titania. Sci Rep 2019;9:16983.

208. ISIDA Fragmentor. Available from: http://infochim.u-strasbg.fr/downloads/manuals/Fragmentor2017/Fragmentor2017_Manual_nov2017.pdf [Last accessed on 13 Jun 2022].

209. Cooper CB, Beard EJ, Vázquez-Mayagoitia Á, et al. Design-to-device approach affords panchromatic co-sensitized solar cells. Adv Energy Mater 2019;9:1802820.

210. Swain MC, Cole JM. ChemDataExtractor: a toolkit for automated extraction of chemical information from the scientific literature. J Chem Inf Model 2016;56:1894-904.

211. Lee M. Insights from machine learning techniques for predicting the efficiency of fullerene derivatives-based ternary organic solar cells at ternary blend design. Adv Energy Mater 2019; doi: 10.1002/aenm.201900891.

212. Zhao Z, del Cueto M, Geng Y, Troisi A. Effect of increasing the descriptor set on machine learning prediction of small molecule-based organic solar cells. Chem Mater 2020;32:7777-87.

213. Meftahi N, Klymenko M, Christofferson AJ, Bach U, Winkler DA, Russo SP. Machine learning property prediction for organic photovoltaic devices. npj Comput Mater 2020:6.

214. Carbonell P, Carlsson L, Faulon JL. Stereo signature molecular descriptor. J Chem Inf Model 2013;53:887-97.

215. Scharber M, Mühlbacher D, Koppe M, et al. Design rules for donors in bulk-heterojunction solar cells-towards 10% energy-conversion efficiency. Adv Mater 2006;18:789-94.

216. Winkler DA, Burden FR. Robust QSAR models from novel descriptors and Bayesian regularised neural networks. Mol Simul 2006;24:243-58.

217. Lucic B, Amic D, Trinajstic N. Nonlinear multivariate regression outperforms several concisely designed neural networks on three QSPR data sets. J Chem Inf Comput Sci 2000;40:403-13.

218. David TW, Anizelli H, Jacobsson TJ, Gray C, Teahan W, Kettle J. Enhancing the stability of organic photovoltaics through machine learning. Nano Energy 2020;78:105342.

219. Reese MO, Gevorgyan SA, Jørgensen M, et al. Consensus stability testing protocols for organic photovoltaic materials and devices. Sol Energy Mater Sol Cells 2011;95:1253-67.

220. Flake GW, Lawrence S. Efficient SVM regression training with SMO. Mach Learn 2002;46:271-90.

221. Du X, Lüer L, Heumueller T, et al. Elucidating the full potential of OPV materials utilizing a high-throughput robot-based platform and machine learning. Joule 2021;5:495-506.

222. Quinlan JR. Induction of decision trees. Mach Learn 1986;1:81-106.

223. Quinlan JR. C4.5: programs for machine learning. Mach Learn 1994;16:235-40.

224. Lu T, Li H, Li M, Wang S, Lu W. Predicting experimental formability of hybrid organic-inorganic perovskites via imbalanced learning. J Phys Chem Lett 2022;13:3032-8.

225. Im J, Lee S, Ko T, Kim HW, Hyon Y, Chang H. Identifying Pb-free perovskites for solar cells by machine learning. npj Comput Mater 2019:5.

226. Sun S, Hartono NT, Ren ZD, et al. Accelerated development of perovskite-inspired materials via high-throughput synthesis and machine-learning diagnosis. Joule 2019;3:1437-51.

227. Lu S, Zhou Q, Ouyang Y, Guo Y, Li Q, Wang J. Accelerated discovery of stable lead-free hybrid organic-inorganic perovskites via machine learning. Nat Commun 2018;9:3405.

228. Li Z, Xu Q, Sun Q, Hou Z, Yin W. Thermodynamic stability landscape of halide double perovskites via high-throughput computing and machine learning. Adv Funct Mater 2019;29:1807280.

229. Schmidt J, Shi J, Borlido P, Chen L, Botti S, Marques MAL. Predicting the thermodynamic stability of solids combining density functional theory and machine learning. Chem Mater 2017;29:5090-103.

230. Venkatraman V, Foscato M, Jensen VR, Alsberg BK. Evolutionary de novo design of phenothiazine derivatives for dye-sensitized solar cells. J Mater Chem A 2015;3:9851-60.

231. Majeed N, Saladina M, Krompiec M, Greedy S, Deibel C, Mackenzie RCI. Using deep machine learning to understand the physical performance bottlenecks in novel thin-film solar cells. Adv Funct Mater 2020;30:1907259.

232. Pokuri BSS, Ghosal S, Kokate A, Sarkar S, Ganapathysubramanian B. Interpretable deep learning for guided microstructure-property explorations in photovoltaics. npj Comput Mater 2019:5.

233. Sahu H, Yang F, Ye X, Ma J, Fang W, Ma H. Designing promising molecules for organic solar cells via machine learning assisted virtual screening. J Mater Chem A 2019;7:17480-8.

234. Sahu H, Rao W, Troisi A, Ma H. Toward predicting efficiency of organic solar cells via machine learning and improved descriptors. Adv Energy Mater 2018;8:1801032.

235. Padula D, Troisi A. Concurrent optimization of organic donor-acceptor pairs through machine learning. Adv Energy Mater 2019;9:1902463.

236. Nagasawa S, Al-Naamani E, Saeki A. Computer-aided screening of conjugated polymers for organic solar cell: classification by random forest. J Phys Chem Lett 2018;9:2639-46.

Journal of Materials Informatics
ISSN 2770-372X (Online)

Portico

All published articles are preserved here permanently:

https://www.portico.org/publishers/oae/
