Benchmarking DFT and Supervised Machine Learning: An Organic Semiconducting Polymer Investigation

Using a training set consisting of twenty-two well-known semiconducting organic polymers, we studied the ability of a simple linear regression supervised machine learning algorithm to accurately predict the bandgap (BG) and ionization potential (IP) of new polymers. We show that using the PBE or PW91 exchange–correlation functionals and this simple linear regression, calculated BGs and IPs can be obtained with average percent errors of less than 3 and 4%, respectively. We then apply this method to predict the BG and IP of a group of new polymers composed of monomers used in the training set and their derivatives in AABB and ABAB orientations.


Band Gap Study
In Fig. S1, we present the calculated band gaps (BGs) plotted against the experimental values reported in the literature for the polymers presented in the manuscript. We list the slope and intercept of the line of best fit and the R² value of the fit. The BGs are then calibrated as shown in the manuscript. Table S1 presents the corrected calculations and experimentally measured values for the polymers. Table S2 presents the COHSEX and PPA G₀W₀ calculations and experimentally measured values for the polymers.

Table S1: Calibrated band gaps in eV for each polymer. The average percent error (A.P.E.) and the average difference (A.D.) are also listed for each functional.
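As an illustration of the fit-and-calibrate step, the following is a minimal Python sketch, not the exact procedure of the manuscript. It assumes the line of best fit has the form calculated = slope × experimental + intercept (the orientation of Fig. S1) and that the correction inverts this line. It also assumes that A.P.E. and A.D. are means of absolute (percent) deviations. The function names and all numerical values are hypothetical placeholders.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept, plus the R^2 of the fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean()) ** 2)
    return float(slope), float(intercept), float(r_squared)


def calibrate(calculated, slope, intercept):
    """Assumed correction: invert calculated = slope * experimental + intercept."""
    return (np.asarray(calculated, dtype=float) - intercept) / slope


def average_percent_error(corrected, experimental):
    """Assumed A.P.E.: mean of |corrected - experimental| / experimental, in %."""
    corrected = np.asarray(corrected, dtype=float)
    experimental = np.asarray(experimental, dtype=float)
    return float(100.0 * np.mean(np.abs(corrected - experimental) / experimental))


def average_difference(corrected, experimental):
    """Assumed A.D.: mean of |corrected - experimental|, in eV."""
    corrected = np.asarray(corrected, dtype=float)
    experimental = np.asarray(experimental, dtype=float)
    return float(np.mean(np.abs(corrected - experimental)))


# Hypothetical placeholder values in eV; NOT data from this work.
experimental_bg = [1.5, 2.0, 2.5, 3.0]
calculated_bg = [0.8, 1.2, 1.5, 1.9]

slope, intercept, r2 = fit_line(experimental_bg, calculated_bg)
corrected_bg = calibrate(calculated_bg, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
print(f"A.P.E.={average_percent_error(corrected_bg, experimental_bg):.2f}%")
print(f"A.D.={average_difference(corrected_bg, experimental_bg):.3f} eV")
```

In this sketch, one fit would be performed per functional, so that each functional carries its own slope, intercept, and error metrics.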

IP Calibration
Figure S2 presents our calculated ionization potentials (IPs) against the experimental IPs. We list the slope and intercept of the line of best fit and the R² value of the fit. The IPs are then calibrated as shown in the manuscript. Table S3 presents the corrected calculations and experimentally measured values for the polymers.
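For reference, one common way to write such a linear calibration is sketched below. This is an assumed form, not a restatement of the manuscript: the symbols m (slope), b (intercept), N (number of polymers), and the "corr" superscript are notation introduced here for illustration, and the manuscript's definitions take precedence.

```latex
\begin{align}
  \mathrm{IP}^{\mathrm{calc}}_i &\approx m\,\mathrm{IP}^{\mathrm{exp}}_i + b, \\
  \mathrm{IP}^{\mathrm{corr}}_i &= \frac{\mathrm{IP}^{\mathrm{calc}}_i - b}{m}, \\
  R^2 &= 1 - \frac{\sum_{i=1}^{N} \left(\mathrm{IP}^{\mathrm{calc}}_i - m\,\mathrm{IP}^{\mathrm{exp}}_i - b\right)^2}
                  {\sum_{i=1}^{N} \left(\mathrm{IP}^{\mathrm{calc}}_i - \overline{\mathrm{IP}^{\mathrm{calc}}}\right)^2}.
\end{align}
```

Under this convention, a polymer whose calculated IP lies exactly on the fitted line is mapped back onto the corresponding experimental value.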

Stress Study
As alluded to in the manuscript, the starting angle between the planes of the monomers can affect the Z-component of the stress tensor of the relaxed polymer. To test this idea, we combined PFU and tPA in the ABAB orientation and ran multiple relaxations, following the process outlined in the "Computational Methods" section, with different starting angles between the planes of the two monomers. It is worth noting that this was the only new SOP investigated in the ABAB orientation that did not initially relax with a Z-component of the stress tensor below one kbar in magnitude. In the original study, all alternating monomers were rotated by 180 degrees, so in the ABAB orientation, all B monomers were rotated by 180 degrees. In this quick study, we found that by not rotating the tPA by 180 degrees, we could relax the system to a point where the Z-component of the stress tensor was near zero, as shown in Table S4. This relaxed polymer also did not crumple during this process, as shown in Fig. S3A.
The results in Table S4 suggest that if we continued decreasing the Z-component of the UC, the Z-component of the stress tensor would eventually fall below the one kbar threshold. This would, however, lead to a crumpling of the SOP, as shown in Fig. S3B, which is not experimentally observed. In most cases, we could see the start of the crumpling process in the SOP by observing how the Z-component of the stress tensor changed with respect to the Z-component of the UC. As the UC value decreased, the stress tensor value also decreased. Then, at a specific UC value, the stress tensor value would increase in magnitude, followed by a continuous decrease. For the system that started with a zero-degree rotation, this crumpling effect took place as the UC was decreased from 5.721 Å to 5.521 Å, as shown in the first row of Table S4 and in Fig. S3.
During the initial investigation, we repeated the rotation study for the PFU-PPY AABB-oriented SOP that failed to relax below the one kbar threshold, to show that this starting-angle dependency also carried through to the AABB orientation. In this case, the rotation angle was between the planes of both A monomers and both B monomers, where the rotation between the first A and second A monomers was 180 degrees, and similarly for the B monomers. For this SOP, we found multiple starting angles that led to a Z-component of the stress tensor below the one kbar threshold, and their calculated values are presented in Table S5. Except for the 10-degree starting angle, all calculated values were very similar. We also looked at the difference in energy between each system and the zero-degree starting angle and found very little difference, especially when compared to the total energy of the system, which is six orders of magnitude larger than these differences. This suggests that
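The crumpling signature described in this section (the stress magnitude decreasing as the UC shrinks, then rising at a specific UC value) can be checked automatically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the stress values are made-up placeholders rather than entries of Table S4, and only the two UC values 5.721 Å and 5.521 Å are taken from the text.

```python
def below_threshold(stress_kbar, threshold_kbar=1.0):
    """True if the stress magnitude is below the one kbar threshold."""
    return abs(stress_kbar) < threshold_kbar


def find_crumpling_onset(uc_lengths, stresses):
    """Return the (previous_uc, current_uc) step at which the stress magnitude
    first increases while the UC is being decreased, or None if it never does.
    """
    # Order the scan from the largest UC value to the smallest.
    scan = sorted(zip(uc_lengths, stresses), key=lambda pair: pair[0], reverse=True)
    for (uc_prev, s_prev), (uc_cur, s_cur) in zip(scan, scan[1:]):
        if abs(s_cur) > abs(s_prev):
            return uc_prev, uc_cur
    return None


# UC values in Angstrom; stress values in kbar are hypothetical placeholders.
uc_scan = [5.921, 5.821, 5.721, 5.521, 5.321]
stress_scan = [9.0, 6.0, 4.0, 5.5, 3.0]

print("crumpling onset between:", find_crumpling_onset(uc_scan, stress_scan))
print("converged points:", [uc for uc, s in zip(uc_scan, stress_scan) if below_threshold(s)])
```

A scan flagged in this way would still need visual inspection of the relaxed geometry, since an increase in stress magnitude is only the indicator we observed in most cases, not a guarantee of crumpling.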

Results for the SOP Mixtures
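Tables S6 and S7 present the calculated BG and IP values of these mixtures and provide side-by-side comparisons of the two orientations. Some calculations resulted in unrealistic stresses that did not go below 1 kbar; we indicate those with NA. [...] two SOPs and a sizable change in one value with little change in the other. A prime example is the mixture of tPA and PPY in the ABAB and AABB orientations, where the BGs were [...] eV and 1.92 eV, and the IPs were 4.42 eV and 4.51 eV, respectively. Looking at these values from the perspective of pure PPY, this mixture maintained the approximate IP while decreasing the BG by ∼0.7 eV. Characterizing the BG and IP of a PTH and tPA mixture would, according to the results presented in Tables S6 and S7, be a good check of the validity of our calibrations. The two orientations of the monomers have nearly the same BG and IP values (0.06 eV for the BG and 0.14 eV for the IP).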
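A side-by-side comparison of the two orientations can be summarized with a single number per property. The sketch below assumes that the summary is a mean absolute difference over the mixtures for which both orientations relaxed; whether the values quoted above were obtained this way is not stated here. The function name and all numbers are hypothetical placeholders, with None standing in for NA entries.

```python
def mean_abs_difference(abab_values, aabb_values):
    """Mean |ABAB - AABB| over mixtures where both values exist (None = NA)."""
    diffs = [
        abs(abab - aabb)
        for abab, aabb in zip(abab_values, aabb_values)
        if abab is not None and aabb is not None
    ]
    if not diffs:
        raise ValueError("no mixture has values for both orientations")
    return sum(diffs) / len(diffs)


# Hypothetical corrected BG values in eV; NOT entries of Table S6.
bg_abab = [2.00, None, 3.00, 1.70]
bg_aabb = [2.10, 2.50, 2.80, None]

print(f"mean |ABAB - AABB| = {mean_abs_difference(bg_abab, bg_aabb):.2f} eV")
```

Skipping the NA entries pairwise keeps a failed relaxation in one orientation from biasing the comparison.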

Figure S1: The calculated BG, obtained from the difference between the lowest unoccupied crystal orbital (LUCO) and the highest occupied crystal orbital (HOCO), plotted as a function of the experimental BG.

Figure S2: The calculated IP plotted as a function of the experimental IP.

Figure S3: The PFU-tPA ABAB-oriented polymer relaxed with a UC of A) 5.721 Å and B) 5.521 Å.

Table S3: Calibrated Ionization Potential (eV).

Table S4: Z-component of the stress tensor as a function of UC and rotation.

Table S5: Calculated values for the PFU-PPY AABB SOP with starting angles that lead to stress tensor values less than one kbar in magnitude.

Table S6: Corrected BG (eV) for a mixture of SCPs in the ABAB orientation (red) and AABB orientation (blue). NA denotes a calculation with stress above 1 kbar.

Table S7: Corrected IP (eV) for a mixture of SCPs in the ABAB orientation (red) and AABB orientation (blue).