Machine Learning of Percolation Models Using Graph Convolutional Neural Networks
[Fig. 2 graphic: (a1) Square lattice; (a2) Graph representation; (a3) Embedding; (b1) Convolution layer.]
FIG. 2. The main idea of this work. (a) The three snapshots, the graph representation containing sites {v_i} and edges {e_ij}, and the embedding layer; (b) the convolution and pooling layers; (c) the fully connected layer and the output layer. Supervised learning results in an ‘X’-shaped curve, and unsupervised learning corresponds to a ‘W’-shaped neural-network performance curve.
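To make the graph representation of panel (a) concrete, here is a minimal sketch that turns a site-percolation snapshot into node features {h_i^0} and a directed edge list {e_ij}. This is one common construction rather than the paper's exact pipeline: we assume free boundaries, keep every site as a node with its occupation status as the initial feature, and connect all nearest-neighbor pairs; the function name snapshot_to_graph is ours.

```python
import numpy as np

def snapshot_to_graph(occ):
    """Turn an L x L site-percolation snapshot into a graph.

    occ : L x L array of 0/1 site occupations.
    Returns the initial node features h0 (one row per site v_i,
    holding its occupation status) and the directed edge list,
    with both e_ij and e_ji for every nearest-neighbor pair, as
    in Fig. 2(a2). Free boundaries are assumed here.
    """
    L = occ.shape[0]
    idx = lambda r, c: r * L + c           # site (r, c) -> node index
    h0 = occ.reshape(-1, 1).astype(float)  # embedding input h_i^0
    edges = []
    for r in range(L):
        for c in range(L):
            if c + 1 < L:  # horizontal nearest neighbor
                edges += [(idx(r, c), idx(r, c + 1)), (idx(r, c + 1), idx(r, c))]
            if r + 1 < L:  # vertical nearest neighbor
                edges += [(idx(r, c), idx(r + 1, c)), (idx(r + 1, c), idx(r, c))]
    return h0, edges

# a 3 x 3 snapshot like the one sketched in Fig. 2(a1)
occ = np.array([[1, 0, 1],
                [1, 1, 0],
                [0, 1, 1]])
h0, edges = snapshot_to_graph(occ)  # 9 nodes, 24 directed edges
```

An edge-attribute vector attr_{v_i,v_j} (used in the convolution below) can then be attached to each entry of `edges`.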
FIG. 3. After two graph convolutions, site 13 obtains information from its next-nearest-neighbor sites. (a) The l = 1 iteration; (b) the l = 2 iteration. The square lattice with L = 5 is used. After l iterations, site 13 gets the information of its l-th-order neighbors.

and itself iteratively. In Fig. 2 (b2), the pooling is performed [32]. From a 3 × 3 graph, one obtains a super node representing the feature of the whole graph.

(c) Output. In Fig. 2 (c1), the final classification result is obtained by training the fully connected neural network. In Fig. 2 (c2), the output neurons produce an "X"-shaped curve for supervised learning. For the unsupervised method, one determines the true p_c from the position of the peak of the "W"-shaped performance [23]. The algorithm for a batch of training is shown in Appendix I.

Convolution and pooling in more detail.— In each convolution layer, the features on the neighboring nodes and edges are concatenated, i.e., z^l_{ij} = h^l_{v_i} ⊕ h^l_{v_j} ⊕ attr_{v_i,v_j}, where ⊕ represents the concatenation operation. Then, the convolution is performed by [33]

    m^l_{v_i} = Σ_{v_j ∈ N(v_i)} σ(z^l_{ij} W^l_f + b^l_f) ⊙ g(z^l_{ij} W^l_s + b^l_s),    (1)

where {v_j | v_j ∈ N(v_i)} is the collection of nodes adjacent to node v_i, W is the weight matrix, b is the bias, l ∈ {1, 2, 3, ...} is the convolution-layer index, σ(·) is the sigmoid function σ(x) = 1/(1 + e^{-x}), and g(·) is the softplus function g(x) = log(1 + e^x). Finally, the new feature h^{l+1}_{v_i} is updated by h^{l+1}_{v_i} = h^l_{v_i} + m^l_{v_i}.

In the pooling layer, we use a differentiable graph pooling module [32] based on hierarchical graph pooling. In Fig. 2 (b2), in a specified graph-collapse way, a 3 × 3 graph becomes a new graph with 2 × 2 nodes and finally collapses to a super node. The graph collapse can be realized by a cluster assignment matrix S ∈ R^{n_1 × n_2}, where n_1 is the number of nodes in the graph G_l and n_2 is the number of nodes of the new graph. S_{i,j} = 1 indicates that node v_i belongs to the j-th cluster or super node. Another example, in which the matrix S is defined manually (hard assignment), is shown in Appendix II. In the real simulation, for a graph with n_k nodes, a soft learnable assignment matrix is obtained from a graph convolution with softmax normalization [34],

    S^(k) = softmax(D̂^{-1/2} Â D̂^{-1/2} X^(k) Θ),    (3)

where Â = A + I denotes the adjacency matrix with inserted self-loops for the super nodes, D̂_ii = Σ_j Â_ij is the diagonal degree matrix, and Θ is the filter parameter matrix representing the probability that a node is assigned to each cluster (super node) of the next hierarchical pooling layer. Then, one calculates the feature X^(k+1) and the adjacency matrix A^(k+1) of the next layer by the following equations,

    X^(k+1) = S^(k)T X^(k) ∈ R^{n_{k+1} × d},    (4)

    A^(k+1) = S^(k)T A^(k) S^(k) ∈ R^{n_{k+1} × n_{k+1}}.    (5)

In general, the iteration obeys the flow A^(k), X^(k) → S^(k) → A^(k+1), X^(k+1), where the first step is carried out by the GCN.

Message passing.— The previous operations of convolution and pooling are implemented with the message-passing mechanism [35], by which one obtains information from "adjacent" nodes, realizing the "convolution" operation on the graph and aggregating the information of the surrounding nodes. Message passing consists of the delivery and readout steps. During the delivery step, the feature h^l_{v_i} of each node v_i is updated according to Eqs. (6) and (7),

    m^l_{v_i} = Σ_{v_j ∈ N(v_i)} M_l(h^{l-1}_{v_i}, h^{l-1}_{v_j}, e_{v_i,v_j}),    (6)

    h^l_{v_i} = U_l(h^{l-1}_{v_i}, m^l_{v_i}),    (7)

where M_l(·) is the aggregate function and U_l(·) is the update function. In Figs. 3 (a) and (b), when l = 1, site 13 absorbs messages from its four neighbors 12, 14, 8, and 18. At the same time, these neighbors also absorb information from their respective neighborhoods, i.e., h^0_i is replaced with h^1_i. For the second iteration, l = 2, site 13 indirectly grabs information from its next-nearest neighbors, connected by the dashed blue lines. In the readout stage, the function R(·) calculates the feature vector of the whole graph,

    y = R({h^l_v | v ∈ V(G)}),    (8)

where R(·) sums the features of all nodes; it can also be some learnable differentiable function, for example, the pooling function.
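The coarsening in Eqs. (4) and (5) amounts to two matrix products. A minimal NumPy sketch (the function name diffpool_step and the 4-node toy path are ours, not from the paper):

```python
import numpy as np

def diffpool_step(A, X, S):
    """One hierarchical coarsening step as in Eqs. (4)-(5):
    X' = S^T X collects cluster features, and A' = S^T A S counts
    the (weighted) connections between the new super nodes.
    S may be a soft assignment (rows summing to 1) or one-hot."""
    return S.T @ A @ S, S.T @ X

# toy hard assignment: a 4-node path 1-2-3-4 collapsed into two super nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.arange(8.0).reshape(4, 2)                             # two features per node
S = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # clusters {1,2}, {3,4}

A1, X1 = diffpool_step(A, X, S)
# A1 = [[2, 1], [1, 2]]: each cluster has one internal edge (counted twice),
# and the single edge 2-3 links the two super nodes with weight 1
```

The same two products appear again, with the hard matrix S of Appendix II, in the 9-node example there.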
in the largest differences between the two classes, and therefore leads to the high classification accuracy obtained by our GCN. The idea of the confusion method [23] here is to treat different types of lattices simultaneously, i.e., we train the GCN using input graphs with a fixed number of nodes but different numbers of edges simultaneously, which is impossible for a NN. To test and validate the idea, we first study site percolation models on the triangular and square lattices using supervised machine learning methods.

FIG. 4. (a) Supervised learning outputs double-"X"-shaped curves with system sizes L = 8, 12, 16, 20. "SL" and "TL" denote the square lattices and the triangular lattices, respectively. The inset is a finite-size scaling for p_c. (b) Unsupervised learning produces "W"-shaped performance curves for system sizes L = 8 and 16, respectively.

The results are shown in Fig. 4 (a). We see the two "X"-shaped outputs for the triangular lattices (TL) and the square lattices (SL). The lattice sizes are L = 8, 12, 16, 20, respectively. In the training dataset, 100 configurations are generated for each lattice type (SL, TL) at each p_i.
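The dataset generation just described can be sketched in a few lines. This is a sketch under our own conventions: the function name make_dataset and the uniform p grid of spacing 0.1 are assumptions, and only the square lattice with independent site occupations is shown here.

```python
import numpy as np

def make_dataset(L, p_values, n_per_p, rng):
    """Draw site-percolation snapshots: for every occupation
    probability p, generate n_per_p independent L x L configurations
    in which each site is occupied with probability p."""
    data = []
    for p in p_values:
        for _ in range(n_per_p):
            occ = (rng.random((L, L)) < p).astype(np.int8)
            data.append((occ, p))  # keep p alongside the snapshot
    return data

rng = np.random.default_rng(0)
# e.g. 100 configurations at each p_i on a grid of spacing 0.1
p_values = np.arange(0.0, 1.01, 0.1)
data = make_dataset(L=16, p_values=p_values, n_per_p=100, rng=rng)  # 1100 snapshots
```

Each snapshot would then be converted to a graph and fed into the embedding layer of Fig. 2.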
The datasets of different lattice types but the same configuration size are fed into the same neural network for training. The same network, however, can predict the thresholds on different types of lattices simultaneously, which is impossible to achieve with non-graph networks. From the inset of Fig. 4 (a), the obtained threshold is 0.488(2) for the TL and 0.58(1) for the SL. The accuracy can be further improved by better training of the neural network. We also test bond percolation, and the predictions are likewise acceptable, but not shown.

The second idea is to predict the percolation threshold using unsupervised learning. In Fig. 4 (b), using the confusion method [23, 29, 30], we obtain the performance curves with the "W" shape. In these datasets, there are 100-1000 configurations per p_i, with p_i ∈ [0, 1] at intervals of 0.1. The lines are the real data, averaged over 20 bins; the colored bands mark the error bars. The location of the peak of the "W"-shaped curve is the percolation threshold, which is around p_c = 0.593. In the simulation, to get a better result for the confusion method, in addition to putting p_i in the embedding, we also put the degree centrality in the feature of each node in the embedding, which is defined as

    d_c = degree(v_i) / (n_total − 1),    (9)

where degree(v_i) is the degree of the node v_i and n_total is the total number of nodes. Features such as the properties of edges, the occupation probability, and the degree centrality make the network perform better.

Conclusion and Outlook.— In conclusion, using the advantages of graph neural networks, we have solved a problem that had remained difficult since 2017 [23, 25–27, 29, 30]. A NN cannot distinguish the different structures of lattices if only the site statuses are fed in. Fortunately, the GCN captures the information about the connections between the sites, as well as global features such as lattice percolation. We show that the GCN can act as a general framework that can simultaneously detect the percolation threshold on different lattices in a supervised manner. The intersection in the "X"-shaped output curves
is the critical point. For unsupervised machine learning, the position of the peak of the "W"-shaped performance also reflects the predicted percolation threshold well. At the technical level, we build a graph neural network suitable for lattice statistical-physics models such as the percolation model. Our work is helpful for the extension of the GCN to many percolation-related topics. Further, we build a more general neural network to train physical systems with fixed nodes but different topologies.

Acknowledgments.— We thank Tzu-Chieh Wei for valuable discussions about the confusion method, and also thank Junyin Zhang, Bo Zhang, and Longxiang Liu for useful discussions. W.Z. was supported by the Hefei National Research Center for Physical Sciences at the Microscale (KF2021002) and by the Peng Huanwu Center's Visiting Scientist Program for 2022 of the Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing. L.Z. and H.T. were supported by the NSFC under Grant No. 51901152. Y.D. was supported by the National Natural Science Foundation of China under Grant No. 11625522, the Science and Technology Committee of Shanghai under Grant No. 20DZ2210100, and the National Key R&D Program of China under Grant No. 2018YFA0306501.

∗ Corresponding author: [email protected]
† Corresponding author: [email protected]

[1] S. Broadbent and J. M. Hammersley, Percolation processes. I. Crystals and mazes (1957).
[2] R. G. Larson, L. E. Scriven, and H. T. Davis, Percolation theory of residual phases in porous media, Nature 268, 409 (1977).
[3] K. P. Krishnaraj and P. R. Nott, Coherent force chains in disordered granular materials emerge from a percolation of quasilinear clusters, Phys. Rev. Lett. 124, 198002 (2020).
[4] O. Riordan and L. Warnke, Explosive percolation is continuous, Science 333, 322 (2011).
[5] M. Li, R.-R. Liu, L. Lü, M.-B. Hu, S. Xu, and Y.-C. Zhang, Percolation on complex networks: Theory and application, Phys. Rep. 907, 1 (2021).
[6] M. E. J. Newman, Spread of epidemic disease on networks, Phys. Rev. E 66, 016128 (2002).
[7] J. Fan, J. Meng, Y. Ashkenazy, S. Havlin, and H. J. Schellnhuber, Climate network percolation reveals the expansion and weakening of the tropical component under global warming, Proc. Natl. Acad. Sci. U.S.A. 115, E12128 (2018).
[8] M. Pant, D. Towsley, D. Englund, and S. Guha, Percolation thresholds for photonic quantum computing, Nat. Commun. 10, 1070 (2019).
[9] X. Feng, Y. Deng, and H. W. J. Blöte, Percolation transitions in two dimensions, Phys. Rev. E 78, 031136 (2008).
[10] W. Huang, P. Hou, J. Wang, R. M. Ziff, and Y. Deng, Critical percolation clusters in seven dimensions and on a complete graph, Phys. Rev. E 97, 022107 (2018).
[11] C. M. Fortuin and P. W. Kasteleyn, On the random-cluster model: I. Introduction and relation to other models, Physica 57, 536 (1972).
[12] C. Fortuin, On the random-cluster model: II. The percolation model, Physica 58, 393 (1972).
[13] M. E. Fisher, Magnetic critical point exponents—their interrelations and meaning, J. Appl. Phys. 38, 981 (1967).
[14] A. Coniglio and W. Klein, Clusters and Ising critical droplets: a renormalisation group approach, J. Phys. A: Math. Gen. 13, 2775 (1980).
[15] C.-K. Hu, Percolation, clusters, and phase transitions in spin models, Phys. Rev. B 29, 5103 (1984).
[16] J. Carrasquilla and R. G. Melko, Machine learning phases of matter, Nat. Phys. 13, 431 (2017).
[17] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017), arXiv:1606.02318.
[18] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine learning and the physical sciences, Rev. Mod. Phys. 91, 045002 (2019).
[19] L. Wang, Discovering phase transitions with unsupervised learning, Phys. Rev. B 94, 195105 (2016).
[20] L. van der Maaten and G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res. 9, 2579 (2008).
[21] J. F. Rodriguez-Nieva and M. S. Scheurer, Identifying topological order through unsupervised machine learning, Nat. Phys. 15, 790 (2019).
[22] M. S. Scheurer and R.-J. Slager, Unsupervised machine learning and band topology, Phys. Rev. Lett. 124, 226401 (2020).
[23] E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, Learning phase transitions by confusion, Nat. Phys. 13, 435 (2017).
[24] J. Shen, W. Li, S. Deng, and T. Zhang, Supervised and unsupervised learning of directed percolation, Phys. Rev. E 103, 052140 (2021).
[25] W. Zhang, J. Liu, and T.-C. Wei, Machine learning of phase transitions in the percolation and XY models, Phys. Rev. E 99, 032142 (2019).
[26] D. Bayo, A. Honecker, and R. A. Römer, Machine learning the 2d percolation model, J. Phys.: Conf. Ser. 2207, 012057 (2022).
[27] S. Cheng, F. He, H. Zhang, K.-D. Zhu, and Y. Shi, Machine learning percolation model, arXiv:2101.08928 (2021).
[28] W. Zhang, J. Liu, and T.-C. Wei, Machine learning of phase transitions in the percolation and XY models, Phys. Rev. E 99, 032142 (2019).
[29] R. Xu, W. Fu, and H. Zhao, A new strategy in applying the learning machine to study phase transitions, arXiv:1901.00774 (2019).
[30] J. Zhang, B. Zhang, J. Xu, W. Zhang, and Y. Deng, Machine learning for percolation utilizing auxiliary Ising variables, Phys. Rev. E 105, 024144 (2022).
[31] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, Graph neural networks: A review of methods and applications, AI Open 1, 57 (2020).
[32] R. Ying, J. You, C. Morris, X. Ren, W. L. Hamilton, and J. Leskovec, Hierarchical graph representation learning with differentiable pooling, arXiv:1806.08804 (2019).
[33] T. Xie and J. C. Grossman, Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties, Phys. Rev. Lett. 120, 145301 (2018).
[34] T. N. Kipf and M. Welling, Semi-supervised classification with graph convolutional networks, arXiv:1609.02907 (2017).
[35] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, Neural message passing for quantum chemistry, arXiv:1704.01212 (2017).

I. ALGORITHMS

Algorithm 1 A batch of training
Initialize conv_num = L; pool_num = k; len_G = |V|
Input: percolation snapshots
Output: category probabilities p1, p2
1: data input
   Generate the percolation graph representation G
   ∀i, j, initialize {v_i | v_i ∈ V(G)}, {e_ij | e_ij ∈ E(G)}
   ∀i, use the FCN to get the new node embedding h^0_{v_i}
2: data processing
   for num = 0 to conv_num do:
       ∀i, j, z^l_{v_i} = concat(h^l_{v_i}, h^l_{v_j}, e_{v_i v_j})
       h^l_{v_i} = h^{l-1}_{v_i} + Σ σ(MLP(z^l_{v_i})) ⊙ g(MLP(z^l_{v_i}))
   for len_G to 1 do:
       G_k = DiffPool(G_{k-1})
3: data output
   p1, p2 = Softmax(MLP(G_k))
end

II. AN EXAMPLE OF HARD CLUSTER NUMBER ASSIGNMENT

[Fig. 5 graphic: a 9-node graph, nodes 1-9, with four shaded clusters.]
FIG. 5. Graph collapse example. A 9-node graph is collapsed into a 4-super-node graph. The three nodes (1, 2, 5) in blue become a super node. The nodes in the pink shadow, green shadow, and yellow shadow are grouped similarly.

The matrices A and S are as follows,

    A^(0) =
        [0 1 1 1 0 0 1 0 0]
        [1 0 1 0 1 0 0 1 0]
        [1 1 0 0 0 1 0 0 1]
        [1 0 0 0 1 1 1 0 0]
        [0 1 0 1 0 1 0 1 0]
        [0 0 1 1 1 0 0 0 1]
        [1 0 0 1 0 0 0 1 1]
        [0 1 0 0 1 0 1 0 1]
        [0 0 1 0 0 1 1 1 0]

    S^(0) =
        [1 0 0 0]
        [1 0 0 0]
        [0 1 0 0]
        [0 0 1 0]
        [1 0 0 0]
        [0 1 0 0]
        [0 0 1 0]
        [0 0 1 0]
        [0 0 0 1]    (10)

where S_ij = 1 if and only if node i belongs to cluster j, i.e., super node j. Using the transformation S^T A S, one gets a new adjacency matrix A^(1) defined as

    A^(1) = S^(0)T A^(0) S^(0) =
        [4 3 5 0]
        [3 2 1 2]
        [5 1 4 2]
        [0 2 2 0]    (11)

where the entries 4, 3, 5 in the first row mean that super node 1 is connected to super nodes 1, 2, and 3, which is consistent with the description in Fig. 5. A^(1) is a new adjacency matrix with weights.
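This hard-assignment example can be verified numerically. The following NumPy sketch (our own check, with the clusters {1, 2, 5}, {3, 6}, {4, 7, 8}, and {9} implied by Fig. 5 and Eq. (11)) reproduces A^(1) = S^(0)T A^(0) S^(0):

```python
import numpy as np

# A^(0): adjacency matrix of the 9-node graph in Fig. 5 (Eq. 10)
A0 = np.array([
    [0, 1, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1, 1, 0]], dtype=int)

# S^(0): hard assignment, S_ij = 1 iff node i belongs to super node j
clusters = [(1, 2, 5), (3, 6), (4, 7, 8), (9,)]
S0 = np.zeros((9, 4), dtype=int)
for j, members in enumerate(clusters):
    for i in members:
        S0[i - 1, j] = 1

# Eq. (11): weighted adjacency matrix of the four super nodes
A1 = S0.T @ A0 @ S0
print(A1)  # [[4 3 5 0], [3 2 1 2], [5 1 4 2], [0 2 2 0]]
```

The diagonal entries count the internal edges of each cluster twice (e.g. edges 1-2 and 2-5 give the entry 4), and the off-diagonal entries count the edges between clusters.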