| Summary: | The dye-sensitized solar cell (DSSC) is a promising alternative to conventional silicon-based photovoltaic technologies. Its performance advantages have led to a surge in research activity, reflected in the number of publications over the years. To deliver data-driven analysis of DSSC performance, machine learning models were applied. As a first step, a literature-based database was developed. After data preprocessing, Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), XGBoost (XGB), and Artificial Neural Network (ANN) algorithms were applied with stratified train-test splits. Model performance was assessed using evaluation metrics, and model interpretability relied on SHAP analysis. Based on the employed metrics and the confusion matrix, DT, RF, and KNN were the most accurate models for predicting DSSC efficiency on the developed dataset. Furthermore, synthesis temperature and thin-film thickness were identified as the dominant drivers, followed by precursor and dye. Mid-tier contributors were morphological structure, electrolyte concentrations, and the absorption maximum. The results suggest that, in optimizing the manufacturing process, targeted tuning of the synthesis temperature, thin-film thickness, precursor, and dye is likely to improve device performance. Experimental effort should therefore concentrate on these factors.
|