pubs.acs.org/JACS Article

Fusion Deep Learning for Predicting Conductivity in Electron-Doped Organic Polymers
Ziyu Zhang, Xinzheng Yang, Liang Yan, Sungwoo Jung, Wei You, Ting Cao, and Xiaosong Li*
Cite This: https://doi.org/10.1021/jacs.5c09172


ABSTRACT: The development of efficient and air-stable n-type organic semiconductors suitable for molecular n-doping is critical
for advancing high-performance, durable organic electronic devices, including transistors, thermoelectrics, and photovoltaics.
Machine learning offers a powerful approach to uncovering hidden relationships between molecular structures and their electronic
properties, thereby accelerating the discovery and design of promising materials. To support this effort, we constructed a
curated database comprising 84 n-type conductive polymers, each characterized by experimental measurements under n-doping
conditions with 4-(1,3-dimethyl-2,3-dihydro-1H-benzoimidazol-2-yl)phenyl dimethylamine (N-DMBI-H), and augmented with
density functional theory calculations to provide complementary molecular descriptors. After constructing the database, we
developed a fusion deep learning model that integrates convolutional neural networks with fully connected artificial neural
networks to capture both structural and property-based features of the polymers. The model was trained on the data set and
evaluated using leave-one-out cross-validation. The model was further applied to a test set of n-type polymers bearing
oligoethylene glycol (OEG) side chains, enabling the identification of key physical factors that influence their conductivity
when doped with N-DMBI-H. Finally, a double-blind experiment was conducted to validate the model’s practical utility by
predicting the conductivity of four BDPPV-type polymers and one N2200-type polymer doped with N-DMBI-H. For the N2200-type
polymer and two of the BDPPV-type polymers, the predicted conductivities agreed with experimental values within the same order
of magnitude. These results demonstrate the fusion model’s reliability and establish a strong foundation for data-driven
property prediction and the design of high-conductivity n-type polymers.

Received: May 30, 2025
Revised: September 17, 2025
Accepted: September 18, 2025

© XXXX American Chemical Society. J. Am. Chem. Soc. XXXX, XXX, XXX−XXX. https://doi.org/10.1021/jacs.5c09172

1. INTRODUCTION
Molecular doping of conjugated polymers is one of the most effective strategies to dramatically enhance their electrical
conductivity by generating a high density of mobile charge carriers (e.g., polarons) through redox reactions between the
polymer and the dopant.1−7 Depending on the relative redox potentials (or energy levels) of the conjugated polymer and the
molecular dopant, either p-type doping (i.e., oxidation of the polymer via electron loss) or n-type doping (i.e., reduction of the
polymer via electron gain) can occur.

While significant progress has been made in p-type doping over the past few decades, with reported electrical conductivities
(σ) reaching as high as 2 × 10⁵ S/cm,8 n-type doping remains considerably more challenging. Typical σ values for n-doped
conjugated polymers are much lower, often less than 10 S/cm,9−15 with only one notable example where σ surpassed 2000
S/cm.16

The traditional approach to addressing materials challenges is best characterized as a feedback loop that proceeds from
conceptual formulation, to material synthesis, to testing and characterization. The resulting structure−property relationships
are then used to refine or reformulate the original concept, initiating a new iteration of the process. While this method has
proven effective, it is inherently slow, even when supported by substantial resource investment. In recent years, the rapid
advancement of machine learning (ML) has offered a promising path to accelerate the discovery of novel materials.17−22

In the field of organic electronics, ML has shown remarkable success in predicting material properties and related device
performance. For example, Pyzer-Knapp et al. trained an artificial neural network (ANN) to predict frontier molecular
orbital energies using data from the Harvard Clean Energy Project (CEP), a library of molecular structures generated from
26 basic building blocks.23 Similarly, Nagasawa et al. used a convolutional neural network (CNN) and CEP data to predict

power conversion efficiency of organic solar cells with over 91% accuracy.24

While machine learning has been applied to doped conjugated polymers, its use has been largely limited to p-type systems. For
example, Yoon et al. developed an ML model using absorbance spectra to classify p-type polymers with 100% accuracy and
predict high conductivities with an R² of 0.984.25 Jeong et al. used Bayesian-optimized tree models to tune processing
conditions for p-doped thermoelectric films, maximizing power factor with minimal trial-and-error experimentation
across a broad parameter space.26 Sahu et al. trained SVM and GPR models on 398 p-doped polymer−dopant pairs to predict
conductivity and screened over 800,000 candidates, identifying 500 for experimental validation.27

To the best of our knowledge, there have been no prior reports applying machine learning to predict the properties of
n-type doped conjugated polymers; we thus set out to leverage ML to accelerate the identification of conjugated polymers with
appreciable electrical conductivity (σ) upon n-type doping. Our goal is to overcome the longstanding challenges associated with
n-type doping by developing predictive tools that can guide molecular design before synthesis.

In this work, we present several key advances toward that goal:
• We constructed a curated database of conjugated polymers n-doped with N-DMBI-H, a widely used molecular dopant.2,11
• We developed a fusion deep learning model that integrates a convolutional neural network (CNN) for capturing structural
patterns and an artificial neural network (ANN) for leveraging computed quantum descriptors.
• We performed a rigorous double-blind validation to assess model generalizability.
• The resulting model demonstrates high predictive accuracy, enabling the reliable identification of high-conductivity
n-type polymers before experimental testing.
These results establish a new data-driven framework for accelerating the discovery of high-conductivity n-type organic
materials.

2. METHODS
2.1. First-Principles Calculations. In this study, density functional theory (DFT) calculations with periodic boundary
conditions were employed to characterize the fundamental electronic properties of polymers, including the highest occupied
molecular orbital (HOMO), the lowest unoccupied molecular orbital (LUMO), the HOMO−LUMO gap, and the electron affinity (EA).
To expedite the calculations, several simplification rules were implemented. For polymers with side chains containing a carbon
chain longer than three carbon atoms, the long carbon chain portion was replaced with a methyl group. For polymers whose side
chains have a repeating segment, a single repeating unit was used in the calculations. In all remaining cases, the complete
structure was used for the computation. All geometric optimizations and property calculations were performed using Gaussian
16 with the HSE06 functional and the 6-31+G(d) basis set under periodic boundary conditions.36 The theoretical properties were
directly extracted from the Gaussian output files.

2.2. Structural Coulomb Matrix. Among various molecular representations, the Coulomb matrix offers a simple yet effective
global descriptor that captures the three-dimensional structure of a molecule by mimicking the electrostatic interactions
between its nuclei. The elements of the Coulomb matrix are computed using the following equation

\[
V_{ij} =
\begin{cases}
0.5\, Z_i^{2.4}, & \text{if } i = j \\[4pt]
\dfrac{Z_i Z_j}{|R_i - R_j|}, & \text{if } i \neq j
\end{cases}
\tag{1}
\]

where Z_i is the atomic number of the ith atom, R_i is its position vector, and |R_i − R_j| is the Euclidean distance between
atoms i and j. The diagonal elements reflect the interaction of an atom with itself, expressed as a polynomial fit of the atomic
energy to the nuclear charge Z_i. In contrast, the off-diagonal elements represent the Coulomb repulsion between nuclei i and j.

To make the size of the Coulomb matrix compatible with the CNN input, all Coulomb matrices are zero-padded to the size of
150 × 150. Although such padding could potentially influence the performance of the CNN, it is a necessary trade-off to
maintain the consistency of the inputs.

The Coulomb matrix is symmetric and invariant to translation and rotation. However, permutation invariance cannot be
achieved with the Coulomb matrix, as its values change when the order of two atoms is switched. To address this issue, a norm-2
sort was applied. The norm-2 sort is defined by the following equation

\[
\|x\|_2 = \sqrt{\sum_{i=1}^{n} |x_i|^2}
\tag{2}
\]

where ∥x∥₂ denotes the Euclidean norm of a vector x ∈ ℝⁿ, and x_i represents the ith component of x.

In the sorted Coulomb matrix, atoms with higher atomic numbers are assigned higher positions. For atoms sharing the same
atomic number, the atom experiencing the most substantial Coulomb repulsion from neighboring atoms is given a higher position.
This sorting approach achieves permutation invariance for the input. These matrices were generated using the Python packages
DeepChem and RDKit.

2.3. Machine Learning. Artificial neural networks (ANNs) and convolutional neural networks (CNNs) are computational models
inspired by the structure and functioning of biological neural networks. ANNs consist of interconnected layers of artificial
neurons, where each neuron processes inputs and transmits the results to subsequent layers. They are widely employed for tasks
such as classification, regression, and clustering.

In contrast, CNNs are specialized deep learning architectures designed to process data with a grid-like topology, such as
images. These networks have achieved significant success in tasks including image classification, object detection, and
segmentation. In this study, CNNs were utilized to uncover hidden relationships between polymer structures and their
conductivity.

Building on the strengths of both ANN and CNN, we developed a fusion deep learning model that integrates these two
architectures. As shown in Figure 5, numerical data were input into the ANN component, while structural data were processed by
the CNN component. A fully connected layer combined the outputs from both components, enabling the model to predict polymer
conductivity.

For hyperparameter optimization, we employed random search, an efficient technique to explore high-dimensional
hyperparameter spaces. Compared to grid search, random search increases the likelihood of identifying optimal hyperparameters,
thereby enhancing the model performance.

Our fusion deep learning model comprises three hidden layers in both the ANN and CNN components. Given the relatively small
size of the data set (84 cases), adding more hidden layers was avoided to prevent overfitting. Detailed descriptions of each
layer and their configurations are provided in Table 1.

2.4. Electron-Doped Polymer Database. In the realm of n-type doped organic semiconductors (OSCs), no existing conducting
polymer database is available for predicting experimental conductivity. To address this gap, we developed a specialized
database that integrates experimental conductivity data collected from the literature, density functional theory
(DFT)-computed electronic descriptors, and DFT-


optimized polymer structures. This database facilitates more accurate and efficient predictions of OSC properties.

The database comprises 84 experimentally tested polymers that can be doped with N-DMBI-H to achieve good conductivity. These
polymers were carefully selected from the literature, focusing on four seed polymers:
poly(2,2′-bithiophene-3,3′-dicarboxyimide) (PBTI), benzodifurandione-based poly(phenylene vinylene) (BDPPV),
poly[N,N′-bis(2-octyldodecyl)-naphthalene-1,4,5,8-bis(dicarboxyimide)-2,6-diyl]-alt-5,5′-(2,2′-bithiophene) (N2200), and
poly[(diketo-pyrrolopyrrole)-alt-(pyrazine-2-carbonitrile)] P(DPP-CNPz), as shown in Figure 1.

To ensure the model’s general applicability, the database also includes polymers with backbones different from the four seed
polymers. When a specific polymer was reported multiple times in the literature, the instance with the highest conductivity was
selected for inclusion, ensuring that the database represents the best experimental performance for each polymer.

Table 1. Hyperparameters of the Fusion Deep Learning Model

component   hyperparameter                       value
CNN         Conv Filters (Layer 1)               128
            Conv Kernel Size (Layer 1)           3
            Conv Filters (Layer 2)               32
            Conv Kernel Size (Layer 2)           3
            Conv Filters (Layer 3)               128
            Conv Kernel Size (Layer 3)           3
            Dropout (Conv Layer 3)               0.2
            Dense Units (CNN Branch)             128
ANN         Dense Units (Features Branch 1)      128
            Dropout (Features Branch 1)          0.2
            Dense Units (Features Branch 2)      96
            Dropout (Features Branch 2)          0.3
            Dense Units (Features Branch 3)      192
            Dropout (Features Branch 3)          0.1
combined    Dense Units (Combined Branch)        192
            Dropout (Combined Branch)            0.4
general     Learning Rate                        0.00033

Figure 1. The four conjugated backbones and their corresponding structures.

3. RESULTS AND DISCUSSION
3.1. Training Database. The success of the data-driven method relies heavily on the quality of the training data set.
Among n-dopants, 1,3-dimethyl-2-phenyl-2,3-dihydro-1H-benzimidazole (DMBI-H) derivatives, particularly
(4-(1,3-dimethyl-2,3-dihydro-1H-benzoimidazol-2-yl)phenyl)dimethylamine (N-DMBI-H), are considered the most popular and
successful air-stable n-dopants due to their excellent stability and doping activity.13 N-DMBI-H has been applied to dope a wide
range of semiconductors, including both inorganic and organic materials.37−39 Table 2 highlights recent applications of
N-DMBI-H in the organic semiconductor field.

Table 2. Thermoelectric Properties of Polymers Doped with N-DMBI-H, Including LUMO Energy Levels, Electrical
Conductivity (σ), and Maximum Seebeck Coefficient (Smax)

polymers             LUMO (eV)   σ (S/cm)      Smax (μV·K⁻¹)   refs
PBTI                 −3.48       2.0 × 10⁻³    −7.8 × 10²      13
BDPPV                −4.01       2.6 × 10⁻¹    −3.0 × 10²      28
FBDPPV               −4.17       1.4 × 10¹     −1.4 × 10²      28
LPPV-1               −4.49       1.1 × 10⁰     −1.7 × 10²      29
PDPF                 −4.11       1.3 × 10⁰     −2.4 × 10²      30
PFClTVT              −4.03       3.8 × 10¹     −2.6 × 10²      31
PNDI2OD-2T (N2200)   −3.75       1.2 × 10⁻³    −               32
f-BTI2TEG-FT         −3.82       1.0 × 10²     8.2 × 10¹       33
PNDTI-BBT-DP         −4.40       5.0 × 10⁰     −1.7 × 10²      34
P(FBDOPV-2T-C12)     −4.00       4.2 × 10⁻²    −2.7 × 10²      35

To support our study, we curated a database of 84 experimentally tested n-type polymers that can be doped with
N-DMBI-H, compiled from the literature (see the Supporting Information). The experimental database comprises conductivity
data for these polymers. The dopant concentration and other process conditions could strongly influence the observed
conductivities. To ensure data integrity, we applied strict criteria for conductivity data collection, including consistency
in dopant use, dopant molar ratio, and measurement conditions (see the Supporting Information for more details). We excluded
conductivity data that used catalysts, such as gold nanoparticles,40 which are known to accelerate electron transfer and
artificially inflate conductivity. Because experimental measurements on electron-doped n-type polymers are relatively scarce,
the database size is inevitably limited. However, this careful curation ensures consistency and reliability, making it a solid
foundation for model development. In the Supporting Information, Figure S1 shows the distribution of log(conductivity) for the
training set, and Table S2 summarizes the statistics for each polymer family.

To visualize the database, the monomers of each polymer were clustered by combining Morgan fingerprints41 and the
t-distributed stochastic neighbor embedding (t-SNE) method,42 as shown in Figure 2. The t-SNE algorithm reduces
dimensionality while preserving local structures. It identifies clusters of similar molecules and maps these to neighboring
points in two-dimensional space.

Here, all 84 monomers are divided into five categories: BDPPV-type (blue circles), DPP-type (yellow downward
triangles), N2200-type (bluish green diamonds), PBTI-type (sky blue triangles), and others (orange squares). The regions
are largely well separated from each other, demonstrating both the power of the t-SNE algorithm and the computer’s

ability to distinguish between structures of different monomer types. This t-SNE visualization is used solely to illustrate
structural diversity and is not involved in the subsequent machine learning workflow.

Figure 2. Visualization of the n-type doping database generated using a t-SNE clustering algorithm, where each data point
represents an individual polymer. In the plot, polymers are distinguished by both color and marker shape: BDPPV-type polymers
are depicted as blue circles, DPP-type polymers as yellow downward triangles, N2200-type polymers as bluish green diamonds,
PBTI-type polymers as sky blue triangles, and other polymers as orange squares.

To represent the polymer structures, the Coulomb matrix (CM) was used as a global descriptor.43 This simple yet effective
method approximates the electrostatic interactions between nuclei and captures the three-dimensional structure of the
monomers. Unfortunately, the CM does not exhibit permutation invariance for the input; any change in the atom order
produces a distinct matrix. To address this issue, a norm-2 sorted Coulomb matrix (SCM) was employed (see Section 2 for
details).44 An example of the SCM is presented in Figure S2 in the Supporting Information. The SCM achieves invariance with
respect to translation, rotation, and permutation, making it an ideal input for chemical structural machine learning algorithms.
Moreover, the norm-2 SCMs contain more chemical information than the regular CM. Specifically, atoms with higher atomic
numbers are assigned higher positions, and the atom experiencing the strongest Coulomb repulsion from its
neighbors is similarly positioned at a higher level.

For n-type doping, the highest occupied molecular orbital (HOMO) of the dopant must be higher than or close to the
OSC’s lowest unoccupied molecular orbital (LUMO), enabling electrons to transfer from the dopant to the polymer. In contrast,
p-type doping involves the transfer of electrons from the OSC to the dopant, which requires the HOMO of the OSC to be higher
than or close to the LUMO of the dopant. This process effectively alters the charge carrier concentration within the
polymer, leading to improved electrical conductivity. Electron

affinity (EA) is another critical factor in determining the
efficiency of chemical doping.45 Polymers with higher EA values
exhibit a greater tendency to accept electrons, enhancing the
stability of the doped state and improving overall conductivity.
To enable effective ML training and predictive modeling, the
electron-doped n-type polymer data set was augmented with
first-principles data as key descriptors. These include optimized
polymer geometries for constructing structural Coulomb
matrices, as well as crystalline HOMO and LUMO energies,
the HOMO−LUMO energy gap, and monomer electron affinity
(see Section 2 for details). Notably, the energy gap is defined
as the difference between crystalline LUMO and HOMO. While
this may seem redundant given the availability of HOMO and
LUMO values, it helps reduce noise and provides the model with
an explicit interaction it would otherwise need to infer.
Furthermore, the energy gap is a chemically meaningful
descriptor, often correlating with electrical conductivity. To
complement the information from the crystalline LUMO, we
also include the monomer EA, calculated as the energy
difference between a fully optimized monomer and its anion.
These parameters represent essential electronic structure
features that govern charge transfer in n-doped polymers.
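
As a concrete illustration of the structural descriptor mentioned above, the following minimal Python sketch builds a Coulomb matrix from atomic numbers and coordinates, reorders it by descending Euclidean row norm, and zero-pads it to a fixed size. It is a sketch under stated assumptions, not the authors' implementation: the function names, the toy three-atom geometry, and the pad size of 5 are hypothetical, and the units are arbitrary. The study itself reports generating its matrices with DeepChem and RDKit and padding them to 150 × 150.

```python
import math


def coulomb_matrix(charges, coords):
    """Diagonal: 0.5 * Z_i**2.4; off-diagonal: Z_i * Z_j / |R_i - R_j|."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


def sort_by_row_norm(matrix):
    """Permute rows and columns together so row norms are in descending order."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in matrix]
    order = sorted(range(len(matrix)), key=lambda i: norms[i], reverse=True)
    return [[matrix[i][j] for j in order] for i in order]


def zero_pad(matrix, size):
    """Embed the matrix in the top-left corner of a size x size zero matrix."""
    n = len(matrix)
    if n > size:
        raise ValueError("matrix is larger than the requested padded size")
    padded = [row + [0.0] * (size - n) for row in matrix]
    padded += [[0.0] * size for _ in range(size - n)]
    return padded


# Hypothetical toy input: a linear H-C-N arrangement with illustrative coordinates.
charges = [1, 6, 7]
coords = [(0.0, 0.0, 0.0), (1.06, 0.0, 0.0), (2.22, 0.0, 0.0)]
scm = zero_pad(sort_by_row_norm(coulomb_matrix(charges, coords)), size=5)

# The same atoms listed in a different order give the same sorted matrix.
shuffled = zero_pad(
    sort_by_row_norm(coulomb_matrix([7, 1, 6], [coords[2], coords[0], coords[1]])),
    size=5,
)
```

Sorting rows and columns together keeps the matrix symmetric. Atoms whose row norms are exactly tied (for example, symmetry-equivalent atoms) would need a tie-breaking rule that this sketch does not define.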
3.2. Statistical Descriptors and Correlation. Key
statistical descriptors were employed to characterize the
distribution of conductivity values. Simultaneously, correlations
between electronic properties and conductivities were quanti-
fied using a correlation heatmap and scatter plots for each
descriptor against conductivity.
Figure 3A illustrates the distribution of conductivity values, revealing a skewed data set with 71 data points concentrated in
the 0 to 10 S/cm range. Notably, the conductivity values span from a minimum of 0.0000020 S/cm to a maximum of 143.00 S/cm,
with a mean of 9.47 S/cm, a median of 0.19 S/cm, and a standard deviation of 25.00 S/cm. ML models trained on such an
imbalanced distribution are likely to predict the higher values inaccurately and to skew toward the more abundant lower values.
To address this issue, a logarithmic transformation of the conductivity data was employed, as shown in Figure 3B. The
transformed data range from a minimum of −5.82 to a maximum of 2.16, with a mean of −1.10, a median of −0.72, and a standard
deviation of 1.97. This logarithmic representation offers a more balanced distribution, providing a robust foundation for
subsequent analysis and model training.

Figure 3. (A) Distribution of conductivity data with a bin width of 5, revealing a pronounced skew. The x-axis represents the
conductivity of polymers doped with N-DMBI-H at a 20% molar ratio, while the y-axis shows the number of samples (frequency)
in each bin. (B) Distribution of logarithmically transformed conductivity data, which is more balanced compared to the
original conductivity distribution.
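
The effect of the logarithmic transformation can be illustrated with a short Python sketch. The six conductivity values below are hypothetical and chosen only to mimic a skewed distribution; they are not the 84-polymer data set. Base 10 is assumed because the reported maximum of 143.00 S/cm maps to 2.16, and the choice of sample (rather than population) standard deviation is also an assumption.

```python
import math
import statistics


def summarize(values):
    """Return the kind of descriptive statistics quoted in the text."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (assumption)
    }


# Hypothetical conductivities in S/cm, spanning several orders of magnitude.
conductivities = [2.0e-6, 1.2e-3, 0.19, 1.1, 14.0, 143.0]
log_conductivities = [math.log10(sigma) for sigma in conductivities]

raw_stats = summarize(conductivities)
log_stats = summarize(log_conductivities)
```

In this toy sample the raw mean lies far above the raw median because of the single large value, whereas the mean and median of the log-transformed values are much closer together.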
Figure 4 illustrates the fundamental correlations between the electronic descriptors and conductivities. Although the
correlations between logarithmic conductivity and crystalline LUMO or monomer EA are small, they are still notably stronger
than those between logarithmic conductivity and either the HOMO or the gap. These results are consistent with the n-type
doping mechanism, in which electrons are introduced into the polymer’s LUMO,46 affirming the validity and appropriateness
of the selected descriptors. The correlation between crystalline LUMO and monomer EA is −0.76, suggesting that the property
information provided by these two descriptors is not fully redundant and that each contributes partially distinct
information to the model. Scatter plots for all the electronic descriptors against logarithmic conductivity are provided in the
Supporting Information.

Figure 4. Pearson correlation heatmap for electronic descriptors versus conductivity. A color scale bar on the right maps the
Pearson correlation coefficient, with red indicating a positive correlation and blue indicating a negative correlation.
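
For readers who want to reproduce this kind of analysis on their own data, the Pearson correlation coefficient behind such a heatmap can be sketched in a few lines of Python. The descriptor values and log-conductivities below are hypothetical placeholders, not values from the database, and the function name `pearson` is introduced here only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical descriptor columns (eV) and log10 conductivities for five polymers.
descriptors = {
    "LUMO": [-3.5, -3.8, -4.0, -4.2, -4.4],
    "HOMO": [-5.6, -5.9, -5.7, -6.1, -5.8],
    "EA": [2.9, 3.3, 3.2, 3.6, 3.9],
}
log_sigma = [-3.0, -2.0, -0.5, 0.5, 1.0]

# One row of a correlation heatmap: each descriptor against log conductivity.
correlations = {name: pearson(values, log_sigma) for name, values in descriptors.items()}
```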
3.3. An ANN-CNN Fusion Deep Learning Model. To use a data-driven approach to predict polymer conductivity, various
ML models, including ANN, CNN, and a CNN-ANN fusion model, were implemented and tested in this work. Due to the
limited size of the database, all three models were trained on the complete database of 84 polymers, with hyperparameters
initialized using random search. To validate the capability and effectiveness of all the models, cross-validation techniques,

particularly leave-one-out cross-validation (LOOCV), were applied. The performance of each model was evaluated using
mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE) metrics (see Section 2 for
details and the Supporting Information for parity plots of those three models).

LOOCV is a cross-validation technique in which each data point is used as a test set once while the remaining data points
form the training set, with this process repeated for every data point. For a data set with 84 entries, LOOCV generates 84 test
results, each contributing to the statistical distribution of the performance for each ML model. This approach is particularly
useful here since the database is relatively small, allowing for a thorough evaluation of model performance by utilizing all
available data for both training and testing. As expected, the performance of the ML models varied, as shown in Table 3,
reflecting their different architectures and underlying mechanisms.

Table 3. LOOCV Performance Metrics

model          MSE    RMSE   MAE
ANN            4.96   2.23   1.72
CNN            4.77   2.18   1.76
Fusion Model   4.67   2.16   1.76

The ANN model yielded an MSE of 4.96 log²(S/cm) and an MAE of 1.72 log(S/cm), indicating moderate predictive
accuracy. In comparison, the CNN model, specifically designed to process grid-like data, such as the SCM, achieved an MSE of
4.77 log²(S/cm) and an MAE of 1.76 log(S/cm). The CNN’s lower MSE and RMSE compared to the ANN indicate that the
CNN model exhibits fewer large deviations. The lower MSE and RMSE observed in the CNN model suggest that its predictions
capture the broad conductivity range arising from variations in polymer structure. The low MAE in the ANN model indicates
that the calculated electronic descriptors are particularly critical for accurately predicting the conductivity of certain
polymers. This observation motivates the use of structural data to establish a baseline prediction, with the ANN model
subsequently employed to calibrate and refine the results.

We thus developed a fusion deep learning model that leverages the strengths of both CNN and ANN, as illustrated
in Figure 5. This model was trained to capture the intricate relationship between polymer structures and their properties.
During training, the CNN extracted structural patterns from the sorted Coulomb matrix, while the ANN analyzed computed
material properties. These networks were integrated via a fully connected layer, enabling the model to synthesize both
structural and property-based information.

Figure 5. Fusion deep learning architecture combines ANN and CNN.

Compared to the standalone ANN and CNN models, this fusion model demonstrated strong predictive performance,
achieving an MSE of 4.67 log²(S/cm). This result reveals that conductivity is not solely determined by a single descriptor but is
influenced by both the properties and structures of the polymers. With a better performance than either standalone model, the
fusion deep learning model enhances pattern recognition by leveraging correlations between structural and property data,
leading to more robust and generalizable predictions.

It is important to note that, due to the limited size of the database, the fusion deep learning model’s predictive power
varies across different polymer types, as evidenced by the LOOCV results in Table 3 (see the Supporting Information for
details of the LOOCV results). Fourteen LOOCV tests yielded conductivity values within the same order of magnitude as the
experimental data. Among these, six correspond to the N2200-type, five to the BDPPV-type, two to the PBTI-type, and one to
the DPP-type. These results suggest that the fusion model performs considerably better for the N2200-type and BDPPV-type
than for other structures. Furthermore, an analysis of the logarithmic MSE values revealed that 25 models achieved an
MSE below 1, while 57 models recorded an MSE lower than the overall average. This suggests that over half of the tests perform
above average.

The fusion deep learning model was further validated using a 5-fold cross-validation within the ±15% hyperparameter
neighborhood and a 90/10 training/validation split, with the results provided in the Supporting Information.

With its capability and effectiveness validated, the fusion deep learning model was then used for double-blind experimental
validation and for predicting the conductivity of new polymers.

3.4. Double-Blind Experimental Validation. We conducted a double-blind experiment to evaluate the practical
application of the fusion deep learning model. The experimental group synthesized five n-type polymers, four of the BDPPV-type
and one of the N2200-type, which were doped with N-DMBI-H, followed by conductivity measurements. Structures of these
polymers are presented in Figure 6. Subsequently, DFT calculations and geometric optimizations for the simplified
structures (see Section 2 for details) were performed to extract the relevant electronic descriptors for prediction. The
electronic descriptors of these five polymers are shown in Table 4.

With the structural data and descriptors in hand, we applied the fusion deep learning model to predict the conductivity of
these five polymers when doped with 20% N-DMBI-H. The experimental and ML-predicted conductivities are summarized
in Table 5. More detailed experimental results are provided in the Supporting Information.

When comparing the experimental and predicted results, some predictions closely match the experimental values, while
others show notable discrepancies. In particular, the predictions for three polymers, N2200-OEG, BDOPV-TVT, and
FBDOPV-TVTCN, aligned very well with the experimental data. For these cases, the predicted conductivity not only matched the
order of magnitude of the experimental conductivity but also was quantitatively close, with the predicted logarithmic
conductivity within 15% of the experimental value.

As for the other two polymers, BDOPV-TVTCN and FBDOPV-TVT, they exhibited noticeable prediction errors,

explains the relatively higher prediction errors observed for low-conductivity
polymers (see the parity plot in Figure S9 in the
Supporting Information).
Compared to the mean log(conductivity) values of the
training set (see the Supporting Information), the predicted
value of −1.70 for the N2200 family and the mean value of
−2.33 for the BDPPV family are both far from the median and
mean of their respective families. This suggests that the fusion
model does not simply predict central values for each family.
The agreement between experimental and predicted con-
ductivities highlights the reliability of the fusion deep learning
model in capturing broad trends. However, because two
polymers in the test set are markedly mis-predicted, the model
is best viewed as a rapid screening tool rather than a definitive
predictor. Nevertheless, the strong overall correlation between
predictions and experiments confirms the model’s capacity to
accelerate materials design.
3.5. Test-Set Neighbor Analysis. To elucidate the
prediction mechanism of the fusion deep learning model, we
analyzed the neighboring data points of the five new polymers
from the double-blind experiment. As shown in Figure 7, the
stars represent the new polymers, illustrating their similarity to
monomers in the training set. We computed the Tanimoto
distance between each new polymer and all training data points
to identify the five nearest neighbors for each case.
The nearest neighbors of N2200-OEG are TEG-N2200,
PNDI2TEG-2T, PNDI2C8TEG-2Tz, N2200, and PNDI2-
TEG-T2DEG, all belonging to the N2200-type polymer family,
with a Tanimoto-distance-weighted average logarithmic con-
ductivity of −2.33. Similarly, BDOPV-TVT and BDOPV-
TVTCN share the same neighbors�BDOPV-TCCT,
BDOPV-2T, BDOPV-T, P3a, and BDPPV�all BDPPV-type
polymers, yielding a weighted average logarithmic conductivity
of −2.79. Likewise, FBDOPV-TVT and FBDOPV-TVTCN
share the same neighbors: P(FBDOPV-2T-C12), PFClTVT,
PDTz, FBDPPV, and UFBDPPV, which are all BDPPV-type
polymers, with a weighted average logarithmic conductivity of
−0.85.
As shown in Table 6, the fusion deep learning model reduces
the MAE by 39% and the RMSE by 29% relative to the neighbor-
average conductivity baseline for predicting the conductivity of
the five new polymers. These gains indicate that the model
captures intrinsic structure−property relationships, rather than
merely averaging the values of neighboring data points. This
Figure 6. Experimental structures of five new polymers that served as finding highlights the fusion model’s ability to uncover the
the test set. underlying patterns linking molecular structure, electronic
properties, and conductivity.
Table 4. Electronic Properties of Five New Polymers in eV To further demonstrate the fusion deep learning model’s
polymer HOMO LUMO gap EA ability to capture chemical variation, we applied several baseline
N2200-OEG −5.51 −4.17 1.33 2.53 models, including a simple average model, a random forest, a
BDOPV-TVT −5.25 −4.10 1.15 2.78 kernel ridge regression model, and a Bayesian-optimized ANN
BDOPV-TVTCN −5.51 −4.26 1.25 2.87 model, to the five new polymers. The fusion deep learning model
FBDOPV-TVT −5.35 −4.25 1.09 2.94 achieved a substantially lower error than all baselines, with an
FBDOPV-TVTCN −5.61 −4.40 1.21 3.02 RMSE of 0.84 log(S/cm) (see Supporting Information for
performance details).
which may be attributed to limitations in the training data set. In Furthermore, we applied the Deep Ensemble method to
the region of log(conductivity) < −3, the training set contains assess prediction uncertainty. A set of neural networks with the
only 20 data points, representing about 24% of the total set. same architecture and hyperparameters as the fusion model was
These points are distributed across five polymer families trained independently, each initialized with a different random
(N2200: 7; PBTI: 6; BDPPV: 4; DPP: 2; Other: 1), leaving seed. The ensemble was expanded to 40 models, with
only a few examples per family. This data scarcity, combined performance details provided in the Supporting Information.
with structural diversity, introduces a systematic bias that largely For all five new polymers, the predictive means and standard

Table 5. Experimental and Machine Learning Predicted Conductivity of Five New Polymers
polymer          experimental conductivity (S/cm)    experimental log(conductivity)    ML predicted log(conductivity)    ML predicted conductivity (S/cm)
N2200-OEG        2.00 × 10^−2                        −1.70                             −1.51                             3.09 × 10^−2
BDOPV-TVT        1.14 × 10^−2                        −1.94                             −1.94                             1.15 × 10^−2
BDOPV-TVTCN      2.00 × 10^−3                        −2.70                             −1.86                             1.38 × 10^−2
FBDOPV-TVT       6.47 × 10^−4                        −3.19                             −1.53                             2.95 × 10^−2
FBDOPV-TVTCN     2.99 × 10^−2                        −1.52                             −1.41                             3.89 × 10^−2
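As a consistency check, the error metrics reported in Table 6 can be recomputed from the log-scale columns of Table 5 together with the Tanimoto-distance-weighted neighbor averages quoted in Section 3.5 (−2.33 for N2200-OEG, −2.79 for the two BDOPV polymers, and −0.85 for the two FBDOPV polymers). The sketch below is illustrative and is not the authors' code: the function and variable names are assumptions, and only the numerical values come from the paper.

```python
import math

# log10 conductivity in S/cm, in the order:
# N2200-OEG, BDOPV-TVT, BDOPV-TVTCN, FBDOPV-TVT, FBDOPV-TVTCN.
experimental = [-1.70, -1.94, -2.70, -3.19, -1.52]  # Table 5
fusion_pred = [-1.51, -1.94, -1.86, -1.53, -1.41]   # Table 5
neighbor_avg = [-2.33, -2.79, -2.79, -0.85, -0.85]  # Section 3.5 baseline


def mae(y_true, y_pred):
    """Mean absolute error in log(S/cm)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error in log(S/cm)."""
    squared = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


fusion_mae = mae(experimental, fusion_pred)
fusion_rmse = rmse(experimental, fusion_pred)
baseline_mae = mae(experimental, neighbor_avg)
baseline_rmse = rmse(experimental, neighbor_avg)

# Fractional error reduction of the fusion model relative to the baseline.
mae_reduction = 1 - fusion_mae / baseline_mae
rmse_reduction = 1 - fusion_rmse / baseline_rmse

print(f"fusion   MAE = {fusion_mae:.2f}  RMSE = {fusion_rmse:.2f}")
print(f"baseline MAE = {baseline_mae:.2f}  RMSE = {baseline_rmse:.2f}")
print(f"reduction: MAE {mae_reduction:.0%}, RMSE {rmse_reduction:.0%}")
```

Rounded to two decimals, these values agree with Table 6, and the reductions round to the 39% (MAE) and 29% (RMSE) quoted in Section 3.5. Most of the fusion model's error comes from the two low-conductivity polymers, BDOPV-TVTCN and FBDOPV-TVT.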

deviations remained nearly unchanged, confirming the high robustness of the fusion model.

Figure 7. Visualization of the database using a t-SNE algorithm, where each data point represents an individual polymer. New polymers are indicated by red stars.

Table 6. Performance of the Fusion Deep Learning Model Versus a Tanimoto-Distance-Weighted Neighbor-Average Baseline on the Five New Polymers
metric               fusion    average baseline
MAE (log(S/cm))      0.56      0.92
RMSE (log(S/cm))     0.84      1.19

3.6. Predicting Conductivities of Polymers. To evaluate the fusion model’s capability in predicting the conductivity of unknown species, we further constructed 20 virtual polymers as inputs for the model. All the virtual polymers share the exact backbones present in the training set. Therefore, the OEG substitution is the only new structural variable. According to DFT calculations, OEG side chains can increase EAs of polymers by 0.1−0.2 eV, enhancing their tendency to gain electrons. Additionally, OEG side chains lower the LUMO energy levels compared to the original structures. Furthermore, the incorporation of OEG side chains alters the structural characteristics of the polymers, introducing additional variables that may influence their properties. For example, Kim et al. verified that the OEG side chain can serve as a solubilizing group, significantly increasing the electron mobility of n-channel polymer semiconductors and, consequently, their conductivity.47 Similarly, Chen et al. reported that for poly((2,5-diyl-2,3,5,6-tetrahydro-3,6-dioxo-pyrrolo(3,4-c)pyrrole-1,4-diyl)-alt-(2,2′:5′,2″-terthiophene-5,5″-diyl)) (PDPP3T) conjugated polymer backbones, the presence of OEG side chains resulted in a reduced π−π stacking distance, higher hole mobility, a smaller optical band gap, an increased dielectric constant, and a larger surface energy.48

Our fusion model’s predictions show that OEG side-chain substitution modulates polymer conductivity. Among the test set, seven candidates are predicted to exhibit N-DMBI-H-doped conductivities between 1.07 and 2.04 S/cm, positioning them in the 69−76% percentile range of the training distribution, about an order of magnitude above the median value. These exemplars outline the structural design space most conducive to high conductivity and thus serve as practical guidelines for synthesizing next-generation materials. Detailed outputs from the CNN, ANN, and fusion models are provided in the Supporting Information.

Based on the prediction results, we identified three design rules:
• The first design rule arises from the properties of OEG-substituted PNDTI-BBT-DP, whose predicted conductivity is 2.62 S/cm when doped with N-DMBI-H (Figure 8A). This polymer exhibits the lowest LUMO value of −4.93 eV and demonstrates high conductivity, suggesting that polymers with lower LUMO values are more likely to achieve enhanced conductivity when doped with N-DMBI-H.
• The second design rule is derived from the properties of OEG-substituted PCNI-BTI and PCNDTI-BTI, which feature relatively small monomer units and have the predicted conductivity of 3.70 S/cm and 1.59 S/cm when doped with N-DMBI-H, respectively (Figure 8B). These polymers exhibit potential for high conductivity, indicating that the increased electron density of states introduced by the OEG side chains may enhance charge transport.
• The third design rule is derived from the high predicted conductivities when doped with N-DMBI-H of OEG-substituted P(TDPP-CT2) (1.78 S/cm), TBDPPV (1.63 S/cm), TBDOPV-T (1.66 S/cm), and TBDOPV-2T (1.46 S/cm), all of which exhibit high conductivity (Figure 8C). The predictions for these polymers suggest that an extended conjugated backbone is a crucial factor in achieving higher conductivity when doped with N-DMBI-H.

4. CONCLUSIONS AND PERSPECTIVES
The goal of this study is to predict conductivity in electron-doped organic polymers through a data-driven approach. We established a database of 84 experimentally tested n-type polymers, dopable by N-DMBI-H, collected from the literature. In parallel, a fusion deep learning model combining CNN and ANN was developed and trained to analyze polymer structures and DFT-computed descriptors. Using this model, seven polymers with OEG side chains were identified as having high conductivity. Moreover, the model demonstrated high predictive accuracy in a double-blind experiment comparing experimental measurements and ML predictions.

Figure 8. Structures of the seven OEG-substituted polymers with predicted N-DMBI-H doped conductivities using the fusion deep learning model.

In the double-blind experiment, the experimental group synthesized five unknown polymers, four BDPPV-type and one N2200-type, and measured their conductivity. The predicted conductivity for the N2200-type polymer, as well as for two of the BDPPV-type polymers, matched the same order of magnitude as the experimental values. This result demonstrates the model’s accuracy and reliability for N2200- and BDPPV-type polymers, indicating its utility for rapidly identifying promising conductivity candidates. Nevertheless, experimental validation remains essential before making the materials-design decisions.

Our analysis of the predictions for polymers with OEG side chains revealed three main factors that influence conductivity when doped with N-DMBI-H, providing a foundational framework for designing high-conductivity n-type polymers. Specifically, lower LUMO values are associated with higher conductivity, and polymers derived from small monomers with short conjugated backbones benefit significantly from side chains that enhance electron density. In contrast, for polymers with long conjugated backbones, the side chain’s impact on conductivity appears to be minimal, with emphasis placed on preserving an extended conjugated backbone with a low LUMO.

It is important to acknowledge that the database used in this work is relatively small, suggesting that the full potential of the fusion deep learning model has not yet been realized. In particular, the data set in the high-conductivity regime is limited, and cross-validation indicates that the model’s predictions in this range are not fully reliable. As additional high-conductivity polymers are identified through ML-guided experimental design, we plan to incorporate reinforcement learning to enhance the model’s performance in this regime.

The structural descriptor in this work was derived from the minimum-energy geometry and captures only intrinsic features of the polymer molecular structure. Additional factors, such as the relative orientation between the molecular dopant and the n-type polymer,7 polymer packing, and the structural flexibility of the polymer, may also play important roles in conductivity. These aspects will be investigated in future studies to provide deeper mechanistic insights into the structure−property relationships governing polymer conductivity.

This work focuses on polymer conductivity n-doped by N-DMBI-H, rather than on general dopant−polymer complex formation. As more high-quality experimental data become available for different types of n-dopants, an additional dimension incorporating these dopants will be integrated into the fusion deep learning model, enabling enhanced capabilities to search for and design highly efficient n-type dopant−polymer complexes.

■ ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jacs.5c09172.

Details of the database, parity plots, an example of sorted Coulomb matrix, experimental conductivity measurements in the double-blind test, and properties of polymers with OEG side chains (PDF)

■ AUTHOR INFORMATION
Corresponding Author
Xiaosong Li − Department of Chemistry, University of Washington, Seattle, Washington 98195, United States; orcid.org/0000-0001-7341-6240; Email: [email protected]

Authors
Ziyu Zhang − Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
Xinzheng Yang − Department of Chemistry, University of Washington, Seattle, Washington 98195, United States; orcid.org/0000-0002-2036-1220
Liang Yan − Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States; orcid.org/0000-0003-4122-7466
Sungwoo Jung − Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States
Wei You − Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States; orcid.org/0000-0003-0354-1948
Ting Cao − Department of Materials Science and Engineering, University of Washington, Seattle, Washington 98195, United States; orcid.org/0000-0003-1300-6084

Complete contact information is available at: https://pubs.acs.org/10.1021/jacs.5c09172

Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
This work was supported by the Office of Naval Research (MURI Award 5 N00014-23-1-2001). Computations were facilitated through the use of advanced computational, storage, and networking infrastructure provided by the shared facility supported by the University of Washington Molecular Engineering Materials Center (DMR-2308979) via the Hyak supercomputer system.

■ REFERENCES
(1) Lüssem, B.; Riede, M.; Leo, K. Doping of Organic Semiconductors. Phys. Status Solidi (A) 2013, 210, 9−43.
(2) Scaccabarozzi, A. D.; Basu, A.; Aniés, F.; Liu, J.; Zapata-Arteaga, O.; Warren, R.; Firdaus, Y.; Nugraha, M. I.; Lin, Y.; Campoy-Quiles, M.; et al. Doping Approaches for Organic Semiconductors. Chem. Rev. 2022, 122, 4420−4492.
(3) Salzmann, I.; Heimel, G.; Oehzelt, M.; Winkler, S.; Koch, N. Molecular Electrical Doping of Organic Semiconductors: Fundamental Mechanisms and Emerging Dopant Design Rules. Acc. Chem. Res. 2016, 49, 370−378.
(4) Jacobs, I. E.; Moulé, A. J. Controlling Molecular Doping in Organic Semiconductors. Adv. Mater. 2017, 29, No. 1703063.
(5) Zhao, W.; Ding, J.; Zou, Y.; Di, C.-a.; Zhu, D. Chemical Doping of Organic Semiconductors for Thermoelectric Applications. Chem. Soc. Rev. 2020, 49, 7210−7228.
(6) Yan, L.; Yang, X.; Yang, M.; Neu, J.; Kashani, S.; Giridharagopal, R.; Olanrewaju, Y.; So, F.; Ginger, D.; Ade, H.; et al. Air-stable n-Type Dopant for Organic Semiconductors via a Single-Photon Catalytic Process. Sci. Adv. 2025, 11, No. eadu8215.
(7) Yang, M.; Yang, X.; Lambros, E.; Upadhyay, S.; Yan, L.; You, W.; Li, X. Unraveling Ground-State Electron Transfer in Photoredox n-Doping of Conjugated Polymers through Real-Time Quantum Dynamics. J. Am. Chem. Soc. 2025, 147, 24095−24102.
(8) Vijayakumar, V.; Zhong, Y.; Untilova, V.; Bahri, M.; Herrmann, L.; Biniek, L.; Leclerc, N.; Brinkmann, M. Bringing Conducting Polymers to High Order: Toward Conductivities Beyond 10^5 S cm^−1 and Thermoelectric Power Factors of 2 mW m^−1 K^−2. Adv. Energy Mater. 2019, 9, No. 1900266.
(9) Wang, S.; Zuo, G.; Kim, J.; Sirringhaus, H. Progress of Conjugated Polymers as Emerging Thermoelectric Materials. Prog. Polym. Sci. 2022, 129, No. 101548.
(10) Lu, Y.; Yu, Z.-D.; Liu, Y.; Ding, Y.-F.; Yang, C.-Y.; Yao, Z.-F.; Wang, Z.-Y.; You, H.-Y.; Cheng, X.-F.; Tang, B.; et al. The Critical Role of Dopant Cations in Electrical Conductivity and Thermoelectric Performance of n-Doped Polymers. J. Am. Chem. Soc. 2020, 142, 15340−15348.
(11) Wei, P.; Oh, J. H.; Dong, G.; Bao, Z. Use of a 1H-Benzoimidazole Derivative as an n-Type Dopant and To Enable Air-Stable Solution-Processed n-Channel Organic Thin-Film Transistors. J. Am. Chem. Soc. 2010, 132, 8852−8853.
(12) Yang, C.-Y.; Ding, Y.-F.; Huang, D.; Wang, J.; Yao, Z.-F.; Huang, C.-X.; Lu, Y.; Un, H.-I.; Zhuang, F.-D.; Dou, J.-H.; et al. A Thermally Activated and Highly Miscible Dopant for n-Type Organic Thermoelectrics. Nat. Commun. 2020, 11, No. 3292.
(13) Feng, K.; Guo, H.; Wang, J.; Shi, Y.; Wu, Z.; Su, M.; Zhang, X.; Son, J. H.; Woo, H. Y.; Guo, X. Cyano-Functionalized Bithiophene Imide-Based n-Type Polymer Semiconductors: Synthesis, Structure-Property Correlations, and Thermoelectric Performance. J. Am. Chem. Soc. 2021, 143, 1539−1552.
(14) Xiong, M.; Yan, X.; Li, J.-T.; Zhang, S.; Cao, Z.; Prine, N.; Lu, Y.; Wang, J.-Y.; Gu, X.; Lei, T. Efficient n-Doping of Polymeric Semiconductors through Controlling the Dynamics of Solution-State Polymer Aggregates. Angew. Chem., Int. Ed. 2021, 60, 8189−8197.
(15) Lu, Y.; Yu, Z.-D.; Un, H.-I.; Yao, Z.-F.; You, H.-Y.; Jin, W.; Li, L.; Wang, Z.-Y.; Dong, B.-W.; Barlow, S.; et al. Persistent Conjugated

Backbone and Disordered Lamellar Packing Impart Polymers with Efficient n-Doping and High Conductivities. Adv. Mater. 2021, 33, No. 2005946.
(16) Tang, H.; Liang, Y.; Liu, C.; Hu, Z.; Deng, Y.; Guo, H.; Yu, Z.; Song, A.; Zhao, H.; Zhao, D.; et al. A Solution-Processed n-Type Conducting Polymer with Ultrahigh Conductivity. Nature 2022, 611, 271−277.
(17) Butler, K. T.; Davies, D. W.; Cartwright, H.; Isayev, O.; Walsh, A. Machine Learning for Molecular and Materials Science. Nature 2018, 559, 547−555.
(18) Keith, J. A.; Vassilev-Galindo, V.; Cheng, B.; Chmiela, S.; Gastegger, M.; Müller, K.-R.; Tkatchenko, A. Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Chem. Rev. 2021, 121, 9816−9872.
(19) Meuwly, M. Machine Learning for Chemical Reactions. Chem. Rev. 2021, 121, 10218−10239.
(20) Bender, A.; Schneider, N.; Segler, M.; Walters, W. P.; Engkvist, O.; Rodrigues, T. Evaluation Guidelines for Machine Learning Tools in The Chemical Sciences. Nat. Rev. Chem. 2022, 6, 428−442.
(21) Pyzer-Knapp, E. O.; Manica, M.; Staar, P.; Morin, L.; Ruch, P.; Laino, T.; Smith, J. R.; Curioni, A. Foundation Models for Materials Discovery - Current State and Future Directions. npj Comput. Mater. 2025, 11, No. 61.
(22) Merchant, A.; Batzner, S.; Schoenholz, S. S.; Aykol, M.; Cheon, G.; Cubuk, E. D. Scaling Deep Learning for Materials Discovery. Nature 2023, 624, 80−85.
(23) Pyzer-Knapp, E. O.; Li, K.; Aspuru-Guzik, A. Learning from the Harvard Clean Energy Project: The Use of Neural Networks to Accelerate Materials Discovery. Adv. Funct. Mater. 2015, 25, 6495−6502.
(24) Nagasawa, S.; Al-Naamani, E.; Saeki, A. Computer-Aided Screening of Conjugated Polymers for Organic Solar Cell: Classification by Random Forest. J. Phys. Chem. Lett. 2018, 9, 2639−2646.
(25) Yoon, J. W.; Kumar, A.; Kumar, P.; Hippalgaonkar, K.; Senthilnath, J.; Chellappan, V. Explainable Machine Learning to Enable High-Throughput Electrical Conductivity Optimization and Discovery of Doped Conjugated Polymers. Knowl.-Based Syst. 2024, 295, No. 111812.
(26) Jeong, J.; Park, S.; Park, J.; Song, J.; Kwak, J. Machine-Learning-Assisted Process Optimization for High-Performance Organic Thermoelectrics. Adv. Energy Mater. 2025, 15, No. 2403431.
(27) Sahu, H.; Li, H.; Chen, L.; Rajan, A. C.; Kim, C.; Stingelin, N.; Ramprasad, R. An Informatics Approach for Designing Conducting Polymers. ACS Appl. Mater. Interfaces 2021, 13, 53314−53322.
(28) Shi, K.; Zhang, F.; Di, C.-A.; Yan, T.-W.; Zou, Y.; Zhou, X.; Zhu, D.; Wang, J.-Y.; Pei, J. Toward High Performance n-Type Thermoelectric Materials by Rational Modification of BDPPV Backbones. J. Am. Chem. Soc. 2015, 137, 6979−6982.
(29) Lu, Y.; Yu, Z.; Zhang, R.; Yao, Z.; You, H.; Jiang, L.; Un, H.; Dong, B.; Xiong, M.; Wang, J.; Pei, J. Rigid Coplanar Polymers for Stable n-Type Polymer Thermoelectrics. Angew. Chem., Int. Ed. 2019, 58, 11390−11394.
(30) Yang, C.; Jin, W.; Wang, J.; Ding, Y.; Nong, S.; Shi, K.; Lu, Y.; Dai, Y.; Zhuang, F.; Lei, T.; et al. Enhancing the n-Type Conductivity and Thermoelectric Performance of Donor-Acceptor Copolymers through Donor Engineering. Adv. Mater. 2018, 30, No. 1802850.
(31) Han, J.; Fan, H.; Zhang, Q.; Hu, Q.; Russell, T. P.; Katz, H. E. Dichlorinated Dithienylethene-Based Copolymers for Air-Stable n-Type Conductivity and Thermoelectricity. Adv. Funct. Mater. 2021, 31, No. 2005901.
(32) Nava, D.; Shin, Y.; Massetti, M.; Jiao, X.; Biskup, T.; Jagadeesh, M. S.; Calloni, A.; Duò, L.; Lanzani, G.; McNeill, C. R.; et al. Drastic Improvement of Air Stability in an n-Type Doped Naphthalene-Diimide Polymer by Thionation. ACS Appl. Energy Mater. 2018, 1, 4626−4634.
(33) Guo, H.; Yang, C.-Y.; Zhang, X.; Motta, A.; Feng, K.; Xia, Y.; Shi, Y.; Wu, Z.; Yang, K.; Chen, J.; et al. Transition Metal-Catalysed Molecular n-Doping of Organic Semiconductors. Nature 2021, 599, 67−73.
(34) Wang, Y.; Nakano, M.; Michinobu, T.; Kiyota, Y.; Mori, T.; Takimiya, K. Naphthodithiophenediimide-Benzobisthiadiazole-Based Polymers: Versatile n-Type Materials for Field-Effect Transistors and Thermoelectric Devices. Macromolecules 2017, 50, 857−864.
(35) Bardagot, O.; Kubik, P.; Marszalek, T.; Veyre, P.; Medjahed, A. A.; Sandroni, M.; Grévin, B.; Pouget, S.; Domschke, T. N.; Carella, A.; et al. Impact of Morphology on Charge Carrier Transport and Thermoelectric Properties of n-Type FBDOPV-Based Polymers. Adv. Funct. Mater. 2020, 30, No. 2000449.
(36) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Petersson, G. A.; Nakatsuji, H.; et al. Gaussian 16, Revision A.03; Gaussian Inc.: Wallingford CT, 2016.
(37) Schießl, S. P.; Faber, H.; Lin, Y.-H.; Rossbauer, S.; Wang, Q.; Zhao, K.; Amassian, A.; Zaumseil, J.; Anthopoulos, T. D. Hybrid Modulation-Doping of Solution-Processed Ultrathin Layers of ZnO Using Molecular Dopants. Adv. Mater. 2016, 28, 3952−3959.
(38) Haque, M. A.; Villalva, D. R.; Hernandez, L. H.; Tounesi, R.; Jang, S.; Baran, D. Role of Dopants in Organic and Halide Perovskite Energy Conversion Devices. Chem. Mater. 2021, 33, 8147−8172.
(39) Lu, Y.; Wang, J.-Y.; Pei, J. Achieving Efficient n-Doping of Conjugated Polymers by Molecular Dopants. Acc. Chem. Res. 2021, 54, 2871−2883.
(40) Stoeckel, M.-A.; Feng, K.; Yang, C.-Y.; Liu, X.; Li, Q.; Liu, T.; Jeong, S. Y.; Woo, H. Y.; Yao, Y.; Fahlman, M.; et al. On-Demand Catalysed n-Doping of Organic Semiconductors. Angew. Chem., Int. Ed. 2024, 63, No. e202407273.
(41) Rogers, D.; Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742−754.
(42) van der Maaten, L.; Hinton, G. Visualizing Data Using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579−2605.
(43) Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. 2012, 108, No. 058301.
(44) Hansen, K.; Montavon, G.; Biegler, F.; Fazli, S.; Rupp, M.; Scheffler, M.; von Lilienfeld, O. A.; Tkatchenko, A.; Müller, K.-R. Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies. J. Chem. Theory Comput. 2013, 9, 3404−3419.
(45) Bredas, J.-L. Mind the gap! Mater. Horiz. 2014, 1, 17−19.
(46) Nielsen, C. B.; Turbiez, M.; McCulloch, I. Recent Advances in the Development of Semiconducting DPP-Containing Polymers for Transistor Applications. Adv. Mater. 2013, 25, 1859−1880.
(47) Kim, R.; Kang, B.; Sin, D. H.; Choi, H. H.; Kwon, S.-K.; Kim, Y.-H.; Cho, K. Oligo(ethylene glycol)-incorporated Hybrid Linear Alkyl Side Chains for n-channel Polymer Semiconductors and Their Effect on The Thin-film Crystalline Structure. Chem. Commun. 2015, 51, 1524−1527.
(48) Chen, X.; Zhang, Z.; Ding, Z.; Liu, J.; Wang, L. Diketopyrrolopyrrole-based Conjugated Polymers Bearing Branched Oligo(Ethylene Glycol) Side Chains for Photovoltaic Devices. Angew. Chem., Int. Ed. 2016, 55, 10376−10380.
