Research Objective
To predict the highest occupied molecular orbital (HOMO) energies of donor compounds for organic photovoltaic applications using extremely randomized tree (ERT) models.
Research Findings
The proposed ERT models outperform other state-of-the-art architectures in predicting HOMO values for donor compounds in organic photovoltaic applications. The models are generalizable and less prone to overfitting, making them suitable for small datasets. A web application was developed to facilitate the prediction process.
Limitations
The models were trained on a small dataset, which may limit their generalizability. The web application should be used with caution for compounds that are dissimilar to those in the HOPV dataset.
1: Experimental Design and Method Selection:
Extremely randomized trees (ERTs) were used to predict HOMO values. The method generated molecular fingerprints (MACCS and Atom Pair) from SMILES strings and then applied feature reduction.
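The fingerprint step described above can be sketched with RDKit as follows. This is a minimal illustration, not the authors' pipeline: the helper names `maccs_bits` and `atom_pair_bits`, the 2048-bit hashed Atom Pair length, the three example molecules, and the constant-bit filter used as the "feature reduction" step are all assumptions, since the summary does not state which settings or reduction method were used.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import MACCSkeys, rdMolDescriptors


def _parse(smiles):
    """Parse a SMILES string, failing loudly on invalid input."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return mol


def maccs_bits(smiles):
    """MACCS keys as a 0/1 vector (RDKit returns 167 bits; bit 0 is unused)."""
    fp = MACCSkeys.GenMACCSKeys(_parse(smiles))
    return np.array(list(fp), dtype=np.uint8)


def atom_pair_bits(smiles, n_bits=2048):
    """Hashed Atom Pair fingerprint as a 0/1 vector (n_bits is an assumed size)."""
    fp = rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(
        _parse(smiles), nBits=n_bits
    )
    return np.array(list(fp), dtype=np.uint8)


# Illustrative molecules only (thiophene, benzene, 2,2'-bithiophene);
# these are not taken from the HOPV or CEP datasets.
example_smiles = ["c1ccsc1", "c1ccccc1", "c1csc(c1)-c1cccs1"]

X_maccs = np.array([maccs_bits(s) for s in example_smiles])
X_ap = np.array([atom_pair_bits(s) for s in example_smiles])

# Assumed reduction step: drop bits that never vary across the dataset,
# since a constant column cannot help a tree choose a split.
keep = X_maccs.std(axis=0) > 0
X_reduced = X_maccs[:, keep]
```

Dropping constant bits is only one simple form of feature reduction; a variance threshold or an importance-based filter would slot into the same place.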
2: Sample Selection and Data Sources:
The Harvard Organic Photovoltaic (HOPV) dataset and a subset of the Clean Energy Project (CEP) dataset were used.
3: List of Experimental Equipment and Materials:
The RDKit Python library was used to generate the fingerprints, and the scikit-learn Python library was used to implement the ERTs and the other machine learning models.
4: Experimental Procedures and Operational Workflow:
The datasets were divided into training and test subsets. A grid search over the hyperparameters was used to find the model configuration with the lowest mean absolute error under 5-fold cross-validation.
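The split and grid search described above can be sketched with scikit-learn as follows. This is a hedged example on synthetic data: the random binary "fingerprints", the synthetic HOMO-like target, the 80/20 split, and the parameter grid are all assumptions for illustration, as the summary does not list the values that were actually searched.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): 120 molecules, 64 binary fingerprint bits,
# and a HOMO-like target in eV that depends on the first 8 bits plus noise.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(120, 64))
y = -5.0 + 0.1 * X[:, :8].sum(axis=1) + rng.normal(0.0, 0.05, size=120)

# Hold out a test subset; the 80/20 ratio is an assumed choice.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical grid; the values searched in the study are not given here.
param_grid = {
    "n_estimators": [50, 100],
    "max_features": [0.3, 1.0],
    "min_samples_leaf": [1, 3],
}

# scikit-learn maximizes scores, so MAE is exposed as its negative.
search = GridSearchCV(
    ExtraTreesRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=5,
)
search.fit(X_train, y_train)

best_cv_mae = -search.best_score_  # mean MAE over the 5 folds
test_mae = float(np.mean(np.abs(search.predict(X_test) - y_test)))
```

Because `GridSearchCV` refits the best configuration on the full training subset by default, `search.predict` can be applied directly to the held-out test subset.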
5: Data Analysis Methods:
Model performance was evaluated using the percentage mean absolute error (MAE), the root mean square error (RMSE), and the goodness of prediction (Q²).
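The three metrics named above can be written out as a short NumPy sketch. The definitions below are common conventions rather than the ones confirmed by the source: `percent_mae` assumes the percentage is taken relative to the magnitude of each true value, and `q2` assumes the reference mean is the mean of the evaluated true values (some authors use the training-set mean instead). The sample values are made up for illustration.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (e.g. eV)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def percent_mae(y_true, y_pred):
    """Assumed definition: mean absolute error relative to |y_true|, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def q2(y_true, y_pred):
    """Goodness of prediction: 1 - SS_res / SS_tot (assumed reference mean: y_true)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up HOMO values in eV, for illustration only.
y_true = [-5.0, -5.2, -5.4]
y_pred = [-5.1, -5.2, -5.3]

scores = {
    "MAE": mae(y_true, y_pred),
    "%MAE": percent_mae(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "Q2": q2(y_true, y_pred),
}
```

A Q² close to 1 means the predictions explain most of the variance in the evaluated values, while a value at or below 0 means the model does no better than predicting the mean.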