Crystal GCN
Tian Xie
Massachusetts Institute of Technology
All content following this page was uploaded by Tian Xie on 08 June 2018.
The use of machine learning methods for accelerating the design of crystalline materials usually requires
manually constructed feature vectors or complex transformation of atom coordinates to input the crystal
structure, which either constrains the model to certain crystal types or makes it difficult to provide chemical
insights. Here, we develop a crystal graph convolutional neural networks framework to directly learn material
properties from the connection of atoms in the crystal, providing a universal and interpretable representation
of crystalline materials. Our method provides a highly accurate prediction of density functional theory
calculated properties for eight different properties of crystals with various structure types and compositions
after being trained with 10^4 data points. Further, our framework is interpretable because one can extract the
contributions from local chemical environments to global properties. Using an example of perovskites, we
show how this information can be utilized to discover empirical rules for materials design.
DOI: 10.1103/PhysRevLett.120.145301
Machine learning (ML) methods are becoming increasingly popular in accelerating the design of new
materials by predicting material properties with accuracy close to ab initio calculations, but with
computational speeds orders of magnitude faster [1–3]. The arbitrary size of crystal systems poses a
challenge as they need to be represented as a fixed length vector in order to be compatible with most
ML algorithms. This problem is usually resolved by manually constructing fixed length feature vectors
using simple material properties [1,3–6] or designing symmetry-invariant transformations of atom
coordinates [7–9]. However, the former requires a case-by-case design for predicting different
properties, and the latter makes it hard to interpret the models as a result of the complex
transformations.

In this Letter, we present a generalized crystal graph convolutional neural networks (CGCNN) framework
for representing periodic crystal systems that provides both material property prediction with density
functional theory (DFT) accuracy and atomic level chemical insights. Recent advances in “deep learning”
have enabled learning from a very raw representation of data, e.g., pixels of an image, making it
possible to build general models that outperform traditionally expert designed representations [10]. By
looking into the simplest form of crystal representation, i.e., the connection of atoms in the crystal,
we directly build convolutional neural networks on top of crystal graphs generated from crystal
structures. The CGCNN achieves similar accuracy with respect to DFT calculations as DFT compared with
experimental data for eight different properties after being trained with data from the Materials
Project [11], indicating the generality of this method. We also demonstrate the interpretability of the
CGCNN by extracting the energy of each site in the perovskite structure from the total energy, an
example of learning the contribution of local chemical environments to the global property. The
empirical rules generalized from the results are consistent with the common knowledge for discovering
more stable perovskites and can significantly reduce the search space for high throughput screening.

The main idea in our approach is to represent the crystal structure by a crystal graph that encodes
both atomic information and bonding interactions between atoms, and then build a convolutional neural
network on top of the graph to automatically extract representations that are optimum for predicting
target properties by training with DFT calculated data. As illustrated in Fig. 1(a), a crystal graph G
is an undirected multigraph which is defined by nodes representing atoms and edges representing
connections between atoms in a crystal (the method for determining atom connectivity is explained in
the Supplemental Material [12]). The crystal graph is unlike normal graphs since it allows multiple
edges between the same pair of end nodes, a characteristic for crystal graphs due to their periodicity,
in contrast to molecular graphs. Each node i is represented by a feature vector $v_i$, encoding the
property of the atom corresponding to node i. Similarly, each edge $(i,j)_k$ is represented by a
feature vector $u_{(i,j)_k}$ corresponding to the kth bond connecting atom i and atom j.

The convolutional neural networks built on top of the crystal graph consist of two major components:
convolutional layers and pooling layers. Similar architectures have been used for computer vision [22],
natural language
PHYSICAL REVIEW LETTERS 120, 145301 (2018)
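The crystal graph introduced above, with atoms as nodes and possibly several edges between the same pair of atoms because of periodicity, can be illustrated with a small sketch. This is not the connectivity method of the Supplemental Material [12]. It assumes a cubic cell, a plain distance cutoff, and a hypothetical helper named `build_crystal_graph`; the element labels stand in for the node feature vectors and the bond lengths for the edge feature vectors.

```python
import itertools
import math
from collections import defaultdict


def build_crystal_graph(lattice_a, frac_coords, cutoff):
    """Build an undirected multigraph for a cubic cell of edge length lattice_a.

    Returns {(i, j): [d_0, d_1, ...]} with i <= j, where entry k is the length
    of the kth bond between atoms i and j. Assumes cutoff <= lattice_a and
    fractional coordinates in [0, 1), so images one cell away are sufficient.
    """
    edges = defaultdict(list)
    n_atoms = len(frac_coords)
    for i, j in itertools.combinations_with_replacement(range(n_atoms), 2):
        for shift in itertools.product((-1, 0, 1), repeat=3):
            # For i == j, skip the atom itself and keep one of each +/- image pair.
            if i == j and shift <= (0, 0, 0):
                continue
            delta = [
                (frac_coords[j][d] + shift[d] - frac_coords[i][d]) * lattice_a
                for d in range(3)
            ]
            dist = math.sqrt(sum(x * x for x in delta))
            if dist < cutoff:
                edges[(i, j)].append(dist)
    return dict(edges)


# Hypothetical CsCl-type cell: one atom at the corner, one at the body center.
node_features = {0: {"element": "Cs"}, 1: {"element": "Cl"}}  # stand-in for v_i
graph = build_crystal_graph(4.0, [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)], cutoff=3.6)
for (i, j), bond_lengths in graph.items():
    print(f"atoms {i}-{j}: {len(bond_lengths)} edges")
```

In this toy cell the corner atom sees eight periodic images of the body-center atom at the same distance, so the single node pair (0, 1) carries eight edges. A molecular graph would have at most one edge per pair.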
FIG. 2. Performance of CGCNN on the Materials Project database [11]. (a) Histogram representing the
distribution of the number of elements in each crystal. (b) Mean absolute error as a function of
training crystals for predicting formation energy per atom using different convolution functions. The
shaded area denotes the MAEs of DFT calculations compared with experiments [28]. (c) 2D histogram
representing the predicted formation energy per atom against DFT calculated value. (d) Receiver
operating characteristic curve visualizing the result of metal-semiconductor classification. It plots
the proportion of correctly identified metals (true positive rate) against the proportion of wrongly
identified semiconductors (false positive rate) under different thresholds.

where ⊕ denotes concatenation of atom and bond feature vectors, $W_c^{(t)}$, $W_s^{(t)}$, and
$b^{(t)}$ are the convolution weight matrix, self-weight matrix, and bias of the tth layer,
respectively, and g is the activation function for introducing nonlinear coupling between layers. By
optimizing hyperparameters in Table S1, the lowest mean absolute error (MAE) for the validation set is
0.108 eV/atom. One limitation of Eq. (4) is that it uses a shared convolution weight matrix
$W_c^{(t)}$ for all neighbors of i, which neglects the differences of interaction strength between
neighbors. To overcome this problem, we design a new convolution function that first concatenates
neighbor vectors $z_{(i,j)_k}^{(t)} = v_i^{(t)} \oplus v_j^{(t)} \oplus u_{(i,j)_k}$, then performs
convolution by

$$v_i^{(t+1)} = v_i^{(t)} + \sum_{j,k} \sigma\left(z_{(i,j)_k}^{(t)} W_f^{(t)} + b_f^{(t)}\right) \odot g\left(z_{(i,j)_k}^{(t)} W_s^{(t)} + b_s^{(t)}\right), \tag{5}$$

where ⊙ denotes element-wise multiplication and σ denotes a sigmoid function. In Eq. (5), the σ(·)
functions as a learned weight matrix to differentiate interactions between neighbors and adding
$v_i^{(t)}$ makes learning deeper networks easier [29]. We achieve MAE on the validation set of
0.039 eV/atom using the modified convolution function, a significant improvement compared to Eq. (4).
In Fig. S3, we compare the effects of several other hyperparameters on the MAE which are much smaller
than the effect of the convolution function.

Figures 2(b) and 2(c) show the performance of the two models on 9350 test crystals for predicting the
formation energy per atom. We find a systematic decrease of the MAE of the predicted values compared
with DFT calculated values for both convolution functions as the number of training data is increased.
The best MAEs we achieved with Eqs. (4) and (5) are 0.136 and 0.039 eV/atom, respectively, and 90% of
the crystals are predicted within 0.3 and 0.08 eV/atom errors. In comparison, Kirklin et al. report
that the MAE of the DFT calculation with respect to experimental measurements in the Open Quantum
Materials Database is 0.081–0.136 eV/atom depending on whether the energies of the elemental reference
states are fitted, although they also find a large MAE of 0.082 eV/atom between different sources of
experimental data. Given the comparison, our CGCNN approach provides a reliable estimation of DFT
calculations and can potentially be applied to predict properties calculated by more accurate methods
like GW [30] and quantum Monte Carlo calculations [31].

After establishing the generality of the CGCNN with respect to the diversity of crystals, we next
explore its prediction performance for different material properties. We apply the same framework to
predict the absolute energy, band gap, Fermi energy, bulk moduli, shear moduli, and Poisson ratio of
crystals using DFT calculated data from the Materials Project [11]. The prediction performance of
Eq. (5) is improved compared to Eq. (4) for all six properties (Table S4). We summarize the performance
in Table I and the corresponding 2D histograms in Fig. S4. As we can see, the MAEs of our model are
close to or lower than the MAEs of DFT relative to experiments for most properties when ∼10^4 training
data are used. For elastic properties, the errors are higher since less data are available, and the
accuracy of DFT relative to experiments can be expected if ∼10^4 training data are available (Fig. S5).

TABLE I. Summary of the prediction performance of seven different properties on test sets.

Property          # of train data   Unit       MAE_model   MAE_DFT
Formation energy  28 046            eV/atom    0.039       0.081–0.136 [28]
Absolute energy   28 046            eV/atom    0.072
Band gap          16 458            eV         0.388       0.6 [32]
Fermi energy      28 046            eV         0.363
Bulk moduli       2041              log(GPa)   0.054       0.050 [13]
Shear moduli      2041              log(GPa)   0.087       0.069 [13]
Poisson ratio     2041                         0.030
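A single application of the gated convolution in Eq. (5) can be sketched in plain Python as follows. This is an illustrative sketch, not the released implementation. The function names, the neighbor-list layout, and the choice of softplus for the activation g are assumptions made here, since the text only states that g is a nonlinear activation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softplus(x):
    return math.log1p(math.exp(x))


def affine(z, W, b):
    """Return the row-vector product z W + b as a list."""
    return [sum(z[r] * W[r][c] for r in range(len(z))) + b[c] for c in range(len(b))]


def gated_conv(v, neighbors, W_f, b_f, W_s, b_s, g=softplus):
    """Apply Eq. (5) once.

    v         : list of atom feature vectors v_i (each a list of floats)
    neighbors : list of (i, j, u) entries, one per bond (i, j)_k seen from atom i;
                an undirected bond between distinct atoms appears once for each end
    W_f, W_s  : weight matrices of shape (2 * len(v_i) + len(u)) x len(v_i)
    b_f, b_s  : bias vectors of length len(v_i)
    """
    new_v = [list(v_i) for v_i in v]  # the residual term v_i^(t)
    for i, j, u in neighbors:
        z = list(v[i]) + list(v[j]) + list(u)  # concatenation v_i (+) v_j (+) u
        gate = [sigmoid(x) for x in affine(z, W_f, b_f)]
        core = [g(x) for x in affine(z, W_s, b_s)]
        for c in range(len(new_v[i])):
            new_v[i][c] += gate[c] * core[c]  # element-wise product, summed over j, k
    return new_v


# Toy input: two atoms joined by two bonds (a multigraph), zero-initialized weights.
v = [[1.0, 2.0], [3.0, 4.0]]
neighbors = [(0, 1, [0.5]), (0, 1, [0.7]), (1, 0, [0.5]), (1, 0, [0.7])]
zeros_W = [[0.0, 0.0] for _ in range(5)]
zeros_b = [0.0, 0.0]
updated = gated_conv(v, neighbors, zeros_W, zeros_b, zeros_W, zeros_b)
print(updated)
```

Because the sigmoid gate is computed per neighbor from the concatenated vector, two bonds to the same atom with different bond features can contribute with different weights, which is the point of replacing the shared weight matrix of Eq. (4). The output keeps the same length as the input atom vectors, so the layer can be stacked.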
elements. Inspired by this result, we applied a combinatorial search for stable perovskites using
elements from groups 13–15 as the A site and groups 4–6 as the B site. Because of the theoretical
inaccuracies of DFT calculations and the possibility of metastable phases that can be stabilized by
temperature, defects, and substrates, many synthesizable inorganic crystals have positive calculated
energies above hull at 0 K. Some metastable nitrides can even have energies up to 0.2 eV/atom above
hull as a result of the strong bonding interactions [35]. In this work, since some of the perovskites
are also nitrides, we choose to set the cutoff energy for potential synthesizability at 0.2 eV/atom. We
discovered 33 perovskites that fall within this threshold out of 378 in the entire data set, among
which 8 are within the cutoff out of 58 in the test set (Table S5). Many of these compounds like PbTiO3
[36], PbZrO3 [36], SnTaO3 [37], and PbMoO3 [38] have been experimentally synthesized. Note that PbMoO3
has a calculated energy of 0.18 eV/atom above hull, indicating that our choice of cutoff energy is
reasonable. In general, chemical insights gained from the CGCNN can significantly reduce the search
space for high throughput screening. In comparison, there are only 228 potentially synthesizable
perovskites out of 18 928 in our database: the chemical insight increased the search efficiency by a
factor of 7 (33/378 ≈ 8.7% versus 228/18 928 ≈ 1.2%).

In summary, the crystal graph convolutional neural networks present a flexible machine learning
framework for material property prediction and design knowledge extraction. The framework provides a
reliable estimation of DFT calculations using around 10^4 training data for eight properties of
inorganic crystals with diverse structure types and compositions. As an example of knowledge
extraction, we apply this approach to the design of new perovskite materials and show that information
extracted from the model is consistent with common chemical insights and significantly reduces the
search space for high throughput screening.

The code for the CGCNN is available from Ref. [39].

This work was supported by Toyota Research Institute. Computational support was provided through the
National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported
by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and the
Extreme Science and Engineering Discovery Environment, supported by National Science Foundation Grant
No. ACI-1053575.

[1] A. Seko, A. Togo, H. Hayashi, K. Tsuda, L. Chaput, and I. Tanaka, Phys. Rev. Lett. 115, 205901 (2015).
[2] F. A. Faber, A. Lindmaa, O. A. von Lilienfeld, and R. Armiento, Phys. Rev. Lett. 117, 135502 (2016).
[3] D. Xue, P. V. Balachandran, J. Hogden, J. Theiler, D. Xue, and T. Lookman, Nat. Commun. 7, 11241 (2016).
[4] O. Isayev, C. Oses, C. Toher, E. Gossett, S. Curtarolo, and A. Tropsha, Nat. Commun. 8, 15679 (2017).
[5] L. M. Ghiringhelli, J. Vybiral, S. V. Levchenko, C. Draxl, and M. Scheffler, Phys. Rev. Lett. 114, 105503 (2015).
[6] O. Isayev, D. Fourches, E. N. Muratov, C. Oses, K. Rasch, A. Tropsha, and S. Curtarolo, Chem. Mater. 27, 735 (2015).
[7] K. T. Schütt, H. Glawe, F. Brockherde, A. Sanna, K. R. Müller, and E. K. U. Gross, Phys. Rev. B 89, 205118 (2014).
[8] F. Faber, A. Lindmaa, O. A. von Lilienfeld, and R. Armiento, Int. J. Quantum Chem. 115, 1094 (2015).
[9] A. Seko, H. Hayashi, K. Nakayama, A. Takahashi, and I. Tanaka, Phys. Rev. B 95, 144110 (2017).
[10] Y. LeCun, Y. Bengio, and G. Hinton, Nature (London) 521, 436 (2015).
[11] A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder et al., APL Mater. 1, 011002 (2013).
[12] See Supplemental Material at http://link.aps.org/supplemental/10.1103/PhysRevLett.120.145301 for further details, which includes Refs. [4,13–21].
[13] M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, M. Sluiter, C. K. Ande, S. Van Der Zwaag, J. J. Plata et al., Sci. Data 2, 150009 (2015).
[14] R. Sanderson, Science 114, 670 (1951).
[15] R. Sanderson, J. Am. Chem. Soc. 74, 4792 (1952).
[16] B. Cordero, V. Gómez, A. E. Platero-Prats, M. Revés, J. Echeverría, E. Cremades, F. Barragán, and S. Alvarez, Dalton Trans. 21, 2832 (2008).
[17] A. Kramida, Y. Ralchenko, J. Reader et al., Atomic Spectra Database (National Institute of Standards and Technology, Gaithersburg, MD, 2013).
[18] W. M. Haynes, CRC Handbook of Chemistry and Physics (CRC Press, Boca Raton, FL, 2014).
[19] D. Kingma and J. Ba, arXiv:1412.6980.
[20] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, J. Mach. Learn. Res. 15, 1929 (2014).
[21] V. A. Blatov, Crystallography Reviews 10, 249 (2004).
[22] A. Krizhevsky, I. Sutskever, and G. E. Hinton, in Advances in Neural Information Processing Systems (MIT Press, Cambridge, MA, 2012), pp. 1097–1105.
[23] R. Collobert and J. Weston, in Proceedings of the 25th International Conference on Machine Learning (ACM, New York, 2008), pp. 160–167.
[24] D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams, in Advances in Neural Information Processing Systems (MIT Press, Cambridge, MA, 2015), pp. 2224–2232.
[25] M. Henaff, J. Bruna, and Y. LeCun, arXiv:1506.05163.
[26] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, Proceedings of the 34th International Conference on Machine Learning, 2017, http://proceedings.mlr.press/v70/gilmer17a.html.
[27] M. Hellenbrandt, Crystallography Reviews 10, 17 (2004).
[28] S. Kirklin, J. E. Saal, B. Meredig, A. Thompson, J. W. Doak, M. Aykol, S. Rühl, and C. Wolverton, npj Comput. Mater. 1, 15010 (2015).
[29] K. He, X. Zhang, S. Ren, and J. Sun, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, New York, 2016), pp. 770–778.
[30] M. S. Hybertsen and S. G. Louie, Phys. Rev. B 34, 5390 (1986).
[31] W. Foulkes, L. Mitas, R. Needs, and G. Rajagopal, Rev. Mod. Phys. 73, 33 (2001).
[32] A. Jain, G. Hautier, C. J. Moore, S. P. Ong, C. C. Fischer, T. Mueller, K. A. Persson, and G. Ceder, Comput. Mater. Sci. 50, 2295 (2011).
[33] M. De Jong, W. Chen, R. Notestine, K. Persson, G. Ceder, A. Jain, M. Asta, and A. Gamst, Sci. Rep. 6, 34256 (2016).
[34] I. E. Castelli, T. Olsen, S. Datta, D. D. Landis, S. Dahl, K. S. Thygesen, and K. W. Jacobsen, Energy Environ. Sci. 5, 5814 (2012).
[35] W. Sun, S. T. Dacek, S. P. Ong, G. Hautier, A. Jain, W. D. Richards, A. C. Gamst, K. A. Persson, and G. Ceder, Sci. Adv. 2, e1600225 (2016).
[36] G. Shirane, K. Suzuki, and A. Takeda, J. Phys. Soc. Jpn. 7, 12 (1952).
[37] J. Lang, C. Li, X. Wang et al., Mater. Today: Proc. 3, 424 (2016).
[38] H. Takatsu, O. Hernandez, W. Yoshimune, C. Prestipino, T. Yamamoto, C. Tassel, Y. Kobayashi, D. Batuk, Y. Shibata, A. M. Abakumov et al., Phys. Rev. B 95, 155105 (2017).
[39] CGCNN website, https://github.com/txie-93/cgcnn.