
Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges

Savannah Thais*1, Paolo Calafiura2, Grigorios Chachamis3, Gage DeZoort1, Javier Duarte4, Sanmay Ganguly5, Michael Kagan6, Daniel Murnane2, Mark S. Neubauer7, and Kazuhiro Terao6

1 Princeton University
2 Lawrence Berkeley National Lab
3 Laboratório de Instrumentação e Física Experimental de Partículas (LIP)
4 University of California San Diego
5 ICEPP, University of Tokyo
6 Stanford Linear Accelerator Laboratory
7 University of Illinois at Urbana-Champaign

arXiv:2203.12852v2 [hep-ex] 25 Mar 2022

Submitted to the Proceedings of the US Community Study on the Future of Particle Physics (Snowmass 2021)

ABSTRACT

Many physical systems can be best understood as sets of discrete data with associated relationships. Where previously these sets of data have been formulated as series or image data to match the available machine learning architectures, with the advent of graph neural networks (GNNs), these systems can now be learned natively as graphs. This allows a wide variety of high- and low-level physical features to be attached to measurements and, by the same token, a wide variety of HEP tasks to be accomplished by the same GNN architectures. GNNs have found powerful use-cases in reconstruction, tagging, generation, and end-to-end analysis. With the widespread adoption of GNNs in industry, the HEP community is well-placed to benefit from rapid improvements in GNN latency and memory usage. However, industry use-cases are not perfectly aligned with HEP, and much work needs to be done to best match unique GNN capabilities to unique HEP obstacles. We present here a range of these capabilities, noting which are currently well-adopted in HEP communities and which are still immature. We hope to capture the landscape of graph techniques in machine learning as well as point out the most significant gaps that are inhibiting potentially large leaps in research.

* Contact Editor, [email protected]
1 Introduction

Machine learning (ML) has had a profound impact on physics research; particularly in
particle physics, ML has been successfully applied to a broad range of critical tasks includ-
ing data collection, physics object reconstruction and identification, Standard Model (SM)
measurements and new physics searches, experiment design and operation, and more [1–3].
Initially, ML applications in particle physics focused on traditional classification and regres-
sion methods (boosted decision trees, support vector machines, shallow neural networks
(NNs), etc) using physics-motivated high-level features. However, more recent work has
employed a variety of more complex deep learning architectures including deep NNs, convo-
lutional neural networks (CNNs), and recurrent neural networks (RNNs). These methods
allow the use of low-level information, often energy deposits in detectors, rather than de-
rived variables and have inspired an assortment of different data representations such as
images and sequences. Additionally, the adoption and integration of cutting-edge ML meth-
ods has enabled closer collaboration between the particle physics and ML communities and
created opportunities for physics researchers to directly contribute to the development of
state-of-the-art ML architectures.
Over the past several years, Geometric Deep Learning (GDL) has emerged as a highly
impactful sub-field of ML focused on learning from non-Euclidean data structures including
sets, groups, graphs, and manifolds [4]. These methods have been successfully applied to
a broad range of scientific and societal domains like knowledge dependency representation
[5], physical system modeling [6], chemical and drug discovery [7], community connection
mapping [8], and much more. In particular, graph neural networks (GNNs) are an impactful
class of GDL algorithms that operate on graphs. Excellent reviews and summaries of GNNs
detailing the spectrum of current implementations are available in [4, 9–12] and we provide
an introductory overview of GNNs in the following section.
Graphs are a data representation that describes objects (represented as graph nodes)
and their pairwise relationships (represented as graph edges). Graph structures are able
to effectively capture complex relationships and dependencies between objects, which is
essential for accurately representing physical data. For particle physics data, graph-based
representations provide several advantages over alternative data representations: unlike
vector- or grid-like structures, graphs allow for variable size data (i.e. one does not have to
lose information or zero-pad structures); additionally, graphs are better suited for dealing
with sparse and heterogeneous detector data that can be difficult to project into image-
based representations and do not require the application of an artificial ordering scheme as
required by sequence-based representations. Graphs are able to represent a broad range of
particle physics data including energy deposits in a detector, individual physics objects like
tracks or missing energy, individual particles or groups of particles, or even heterogeneous
information; an example of different graph representations of particle physics data is shown
in Figure 1.
Given the advantages of graph-based representations, it is unsurprising that GNNs have
been successfully applied to many problems in particle physics. Many of these applications,
particularly GNNs applied to data reconstruction tasks, are summarized in [13, 14] and

Figure 1: Examples of graph representations of particle physics data: (a) clustering tracking
detector hits into tracks, (b) segmenting calorimeter cells, (c) classifying events with multiple
types of physics objects, (d) jet classification based on the particles associated to the jet.
Image taken from [13].

a range of current GNN applications are described in Section 2. However, despite the
success of these methods, there are still substantial challenges that must be overcome in
order to maximize the potential of these techniques and integrate them into particle physics
experiments. In addition to providing an overview of current implementations, this paper
focuses on describing these outstanding challenges and considerations specific to physical
applications of GNNs in Section 3 and makes suggestions for directions researchers should
focus on to address these challenges in Section 4.

1.1 Graph Neural Networks

Following the success of convolutional neural networks for grid-structured data (i.e. pixels
in an image), graph neural networks aim to extend many of the same powerful techniques
to irregular, graph-like data structures. Initial work was done in parallel on generalising
recurrence [15–19] and convolution [20–22] operations to graphs, so-called RecGNNs and
ConvGNNs, respectively. Historically, RecGNNs were motivated by time-series data and
therefore handled graphs that were dynamic across time but had little sophistication in
how relational data was communicated. Conversely ConvGNNs were motivated by spectral
graph theory [23,24], using the graph Laplacian to capture higher-order features, in analogy
with the filtering of features performed by CNNs. In recent years however, research has
converged on some common, high-performing conventions. Convolutions have been shown
to perform very well in the “spatial” domain [25–28] - that is, not necessarily in physical
space, but as opposed to the spectral domain. A spatial convolution is generally defined
by two steps: an aggregation step and an update step. There is some arbitrariness to this
distinction, but for most GNN tasks that involve learning some hidden representation of
graph nodes, it is useful.
We begin by defining a graph G = (u, V, E) as a collection of node features {v_i} = V, edge features {e_ij} = E, and graph features {u_g} = u. For the (l+1)-th convolution iteration, a node i's hidden representation v_i^{l+1} can be computed by

    Aggregate: v_i^{l′} = ρ(e_ij^{l+1}), where e_ij^{l+1} = φ^e(v_i^l, v_j^l, e_ij^l) and j ∈ N_i

    Update: v_i^{l+1} = φ^v(v_i^{l′}, v_i^l, u^l)
In words: features are aggregated around a node i's neighborhood N_i by first computing each e_ij, called the "message" on the edge connecting node i and node j. The particular message function φ^e is dependent on the choice of architecture, and may be isotropic (a node treats all of its neighbors equivalently) or anisotropic (a node has some mechanism of attention, or edges have their own feature space). The message function may also include an MLP. Messages are aggregated around a node, where ρ stands for any permutation-invariant aggregation. The new node features can be combined with previous node features, or some higher features u belonging to the graph (or even node-level, non-local features), and passed through a node-wise MLP. This node update is represented by φ^v. We can represent the whole process by the diagram in Figure 2, which also allows for graph features u.
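As a minimal sketch of the aggregate and update steps, the toy Python below implements one such convolution, with simple element-wise sums standing in for the learned functions φ^e and φ^v (all names and values are illustrative, not a real library API):

```python
# Toy spatial GNN convolution: Aggregate (messages phi_e, reduction rho)
# then Update (phi_v). Element-wise sums stand in for learned MLPs.

def phi_e(v_i, v_j, e_ij):
    # Message function on edge (i, j); a real phi_e would be a learned MLP.
    return [a + b + c for a, b, c in zip(v_i, v_j, e_ij)]

def rho(messages):
    # Permutation-invariant aggregation: element-wise sum of the messages.
    return [sum(ms) for ms in zip(*messages)]

def phi_v(agg, v_i, u):
    # Node update combining aggregated messages, old node features, and
    # graph-level features u; again a stand-in for a learned MLP.
    return [a + b + c for a, b, c in zip(agg, v_i, u)]

def gnn_convolution(V, E, u):
    """One message-passing step over V = {i: features},
    E = {(i, j): edge features}, u = graph-level features."""
    new_V = {}
    for i, v_i in V.items():
        messages = [phi_e(v_i, V[j], e_ij)
                    for (a, j), e_ij in E.items() if a == i]
        # Nodes with no incident edges keep their previous representation.
        new_V[i] = phi_v(rho(messages), v_i, u) if messages else list(v_i)
    return new_V

V = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
E = {(0, 1): [0.5, 0.5], (0, 2): [1.0, 2.0], (1, 0): [0.5, 0.5]}
u = [0.0, 0.0]
print(gnn_convolution(V, E, u))  # → {0: [5.5, 4.5], 1: [1.5, 2.5], 2: [1.0, 1.0]}
```

Stacking several such steps (with or without shared weights) yields the multiple message-passing iterations discussed below.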
Just as GNN convolutions have settled on a general language and set of best-practices,
so has the notion of recurrence. Because GNNs now typically have multiple convolutions (or

Figure 2: A GNN convolution as defined by [29]. φe defines a message function, ρe→v is an
aggregation around nodes, and φv is a node update. Graph-level features u can optionally
be produced.

“message passing steps”), there are two dimensions for feature update: one across message
passing steps (i.e. aggregation and update MLPs share weights), and one across spatio-
temporal steps (i.e. each MLP shares weights with a previous time-step). The former is
common, and generally good practice for reducing the size of a GNN and improving training
stability. The latter is an area of active research in so-called Spatio-Temporal GNNs [30,31],
where edge connections and even the existence of nodes may change over time. In the case
of edge connections changing between message passing steps, we will call these Dynamic
GNN architectures. The landscape of typical GNNs is given in Figure 3a.

2 Current Uses of GNNs in HEP

2.1 Reconstruction and Identification

To date, the majority of graph-based learning algorithms in HEP focus on reconstruction (clustering) and identification (classification) tasks. Despite having fundamentally different objectives, their corresponding GNN-based workflows have structural similarities. Most
begin with an unordered set of data points, for example tracker hit spatial positions or
particle-level kinematic features. Graph construction routines embed these points as graph
nodes, extending edges between them to represent relational information. This process
may occur explicitly before the GNN, or otherwise be repeated dynamically as part of
the learning algorithm. The choice of edges has important downstream effects; message-
passing GNNs learn by aggregating information across each node’s local neighborhood,
which is defined by its incident edges. GNNs produce node-level, edge-level, or graph-level
predictions; these predictions may be leveraged by a subsequent post-processing algorithm.
In this way, graph-based learning algorithms can be characterized by their implementations
of three key steps: 1) graph construction, 2) GNN inference, 3) post-processing.

Figure 3: GNN taxonomy: (a) the landscape of GNNs; (b) reduction properties of a GNN.

Reconstruction applications typically involve an underlying clustering task, which may occur in real space or in a learned latent space. Some of the first GNN-based reconstruction
algorithms followed the edge classification paradigm, in which GNNs are trained to predict
the strength of each node-node relationship (an edge weight) encoded by an edge [32]. This
approach is prevalent in GNN-based tracking pipelines, where edges represent hypothesized
particle trajectories; in this scheme, edge weights produced by a GNN may be used to reject
edges outright to form a set of disjoint subgraph clusters, or leveraged by a downstream track
clustering algorithm. Many variants of this pipeline have been proposed, including a variety
of graph construction algorithms and post-processing track finding modules [33–35]. These

edge-classifying GNN architectures are frequently based on the Interaction Network [6],
iteratively applying both edge blocks, MLPs designed to re-embed each set of edge features,
and node blocks, message passing modules that leverage the re-embedded edge features, to
a latent representation of the input graph.
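The post-processing step of this pipeline can be sketched in plain Python: edges whose GNN scores fall below a threshold are dropped, and the surviving edges define track candidates as connected components. This union-find toy uses invented scores and is not code from the cited pipelines:

```python
# Post-processing sketch for an edge-classifying tracking GNN: threshold
# the per-edge scores, then group hits into disjoint track candidates via
# union-find connected components.

def find(parent, x):
    # Find the root of x, with path halving for efficiency.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def track_candidates(n_hits, scored_edges, threshold=0.5):
    parent = list(range(n_hits))
    for (i, j), score in scored_edges.items():
        if score >= threshold:                  # keep only confident edges
            parent[find(parent, i)] = find(parent, j)
    clusters = {}
    for hit in range(n_hits):
        clusters.setdefault(find(parent, hit), []).append(hit)
    return sorted(clusters.values())

scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.2, (3, 4): 0.95}
print(track_candidates(5, scores))  # → [[0, 1, 2], [3, 4]]
```

In practice the threshold is tuned against tracking efficiency and fake rate, and more elaborate track-finding modules can replace the simple connected-components step.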
Though the edge classification paradigm has been applied to calorimeter segmentation
in HGCal-like data [36], most GNN-based calorimeter reconstruction studies focus on node
classification, such as predicting the fractional assignments of hits to different showers, and
graph classification tasks like particle identification and energy regression. Many such ap-
plications leverage the novel GravNet and GarNet GNN layers [37], proposed as lightweight
alternatives to EdgeConv-style [38] dynamic graph construction and subsequent message
passing. Notably, GravNet has been implemented on an FPGA via HLS4ML [39]. GravNet
was recently used to facilitate object condensation [40], the process of clustering nodes be-
longing to a common object and extracting that object’s properties in one-shot, on HGCal-
like calorimeter data [41]. EdgeConv-based networks have also been used for overlapping
calorimeter shower disentanglement [42]. Additionally, Dynamic Reduction Networks exist
at the intersection of reconstruction and classification, learning an optimal graph pooling
strategy (e.g. calorimeter hit clustering) to boost subsequent classification or regression
performance [43].
Event-level reconstruction tasks have also been explored by graph-based methods. For
example, pileup rejection has been addressed as a node classification in which pileup scores
are predicted for particles embedded as graph nodes [44, 45]. A GNN-based PF algorithm
was developed to operate on graphs with heterogeneous nodes corresponding to tracks
and calorimeter clusters [46, 47]. Set2Graph functions, a class of learnable approximators
mapping sets to graphs, were applied to vertex reconstruction by predicting edges between
an input set of tracks [48].
Identification tasks usually focus on graph classification or segmentation. Jet iden-
tification, the process of tagging a particle initiating a jet, has been addressed by a wide
range of graph-based learning algorithms. Many focus on jets represented as particle clouds,
unordered sets of particles embedded as graph nodes with kinematic features, subsequently
applying set-based learning functions (e.g. Energy Flow Network) [49–51], message passing
with adjacency learning [52], or EdgeConv blocks [53]. Others still represent jets as graphs
explicitly, for example applying interaction networks [54] or graph attentional pooling lay-
ers [45] to produce classification scores. Jet substructure has been explicitly leveraged to
identify jets, for example through secondary vertex finding [55] or graph-based model representations of two-point correlations between subjets [56]. Set-based classifiers have also
been applied to jets represented as sets of tracks [57].
GNNs have also been applied to signal identification, typically at the level of an entire
event. For example, MPNNs have been applied to event graphs containing heterogeneous
particle nodes (i.e. the node features contain explicit particle labels) with distance-weighted
edges to classify stop pair production [58], CP odd/even Higgs decays to bb̄ with associated
semileptonic tt̄ decays [59], and di-Higgs events with bb̄W W ∗ final states [60]. Permutation
invariant set-based architectures (with no explicit graph structure) have been developed to
address the combinatorics of jet matching in fully-hadronic tt̄ events with novel attention

mechanisms [61, 62]. Graphs have also been used to represent decay chains in semi-leptonic
tt̄ decays, wherein particles are embedded as heterogeneous nodes and parent-child decay
relationships are represented as edges [63].

2.2 Data Generation



Generative models have the potential for a big impact in producing simulated data to be used for high energy physics analysis [64]. Realistic detector simulation is an extremely time-consuming and CPU-intensive task. In the future, for high-pileup run conditions like the HL-LHC, the complexity of a full simulation is going to scale up extensively. Hence a generative-model-based fast simulation, which can reliably replicate the detector smearing and shower generation, is essential.
The most general representation of a collider event is a point cloud or graph, where each node of the point cloud refers to an energetic cell in the detector or a hit in the tracker. Hence a generative model for producing graph data is a natural choice for many high energy physics simulation tasks.
There is another potential application of GNNs on even more theoretical grounds. Our
current understanding of perturbation theory applied within the SM is mostly based on a
diagrammatic approach. Although there have been studies that focus on the complexity
and adjacency matrix representations of Feynman diagrams (mainly for integrable systems
in Quantum Chromodynamics and N = 4 SYM theories), GNNs have not yet been employed. As one introduces higher order corrections, which generally correspond to
diagrams with more loops and more external legs, the number of diagrams to be computed
increases very rapidly. However, we know that at each order in the perturbative expansion,
there are unifying principles shared by the Feynman diagrams at that order that could be
exploited and potentially lead to a much faster computation of the matrix elements needed
to calculate partonic cross-sections. These are needed for almost every phenomenological
study at collider experiments.
A message-passing-based point cloud generation approach has been suggested in [65], where the authors propose a new architecture called MPGAN to solve the task. It is shown that the MPGAN model outperforms other generative models on every considered metric. Another auto-encoder-based GNN has been proposed in [66] to reconstruct LHC events. A great amount of progress in fast simulation has been achieved using generative models, as shown recently [67]. The use of message-passing-based networks will further enhance performance in the future. Another crucial task will be multi-segmented point cloud generation for collider events at future colliders.

2.3 Algorithm Acceleration

While large experimental workflows are typically run on CPUs, it can be advantageous to
run ML inference on heterogeneous coprocessors, such as graphics processing units (GPUs),
field-programmable gate arrays (FPGAs), or even specialized AI processors like Graphcore
IPUs [68,69], Intel Habana Goya [70] and Gaudi [71] cards, Google TPUs [72], and Cerebras
wafer-scale engines [73]. Using heterogeneous computing resources as a service for ML
inference has been demonstrated for a variety of use cases in experimental workflows [74–76].
In particular, large GNN models like ParticleNet [53] and machine-learned particle-flow
(MLPF) [46,47,77] have been accelerated using GPUs as a service. Work has also been done
to accelerate the inference of neural networks with FPGAs [78–86], including GNNs [87–96].
Acceleration of GNN training and inference on coprocessors is a promising direction for future R&D.

3 Challenges

A significant motivation for developing ML models for particle physics applications is the
substantial computing and storage resource needs of experiments. Particularly, the planned
high luminosity upgrade of the LHC (HL-LHC) will significantly increase both data density
and processing complexity; as shown in Figure 4 (see footnote 1), the foreseen computing requirements cannot
be met without significant R&D efforts in algorithmic innovation and optimization, espe-
cially for data reconstruction and simulation pipelines. Graph-based representations and
GNN architectures have shown substantial promise for a variety of particle physics tasks, in-
cluding reconstruction and simulation, and in some cases have demonstrated better scaling
properties, reduced resource utilization, more efficient data representations, and increased
opportunity for parallelization and acceleration compared to traditional methods. However,
several challenges remain to integrate these methods directly into experimental pipelines
and reap the benefits of these innovations in practice. Furthermore, there are difficulties specific to formulating particle physics problems in a graph-based framework and identifying appropriate architectures and evaluation methods. We outline the core challenges in applying
GNNs in particle physics below.

Formulating HEP problems as graph problems

It is not always straightforward to describe a particle physics problem in the formalism of graphs and GNNs. As described in Section 2, the same task can be formulated in multiple
ways; for instance, the problem of charged particle tracking can be conceptualized as a graph
edge classification task, an object condensation task, or an instance segmentation/object
identification task. Each of these task types requires different graph construction, GNN
architecture design, and training considerations.
Graph construction determines the flow of information across the graph and allows for
1 source: https://twiki.cern.ch/twiki/bin/view/CMSPublic/CMSOfflineComputingResults

[Figure 4 comprises two CMS Public plots (2021 estimates): total CPU in kHS06-years per year across Runs 3-5 under "No R&D improvements", "R&D most probable outcome", and a 10 to 20% annual resource increase; and the HL-LHC (2029, no R&D improvements) CPU fractions: RECO 42%, RECOSim 23%, SIM 14%, GEN 8%, DIGI 8%, Analysis 4%, Other 2%.]

Figure 4: Approximate breakdown of CPU time, disk and tape requirements into primary processing and analysis activities for CMS during a typical HL-LHC year.

the representation and re-embedding of edge features. Due to the large size of detector
datasets it is often not possible to form fully connected graphs and researchers must choose
a graph construction method that respects the underlying physics of the data and the
learning task. Several different graph construction methods have been explored for particle
physics data including detector-geometry-based construction [33], (dynamic) k-nearest-neighbors [53], learned representations [37], and set representations that avoid the need for
edge construction altogether [50]. However, the impact of these graph construction methods
on the downstream performance of a GNN model is not well characterized; we propose
that further research on this subject and the exploration of additional graph construction
algorithms will further enhance the expressivity of graph representations of particle physics
data.
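For concreteness, a brute-force version of one of these schemes, k-nearest-neighbor construction over hit coordinates, can be sketched as follows (coordinates and k are illustrative; production pipelines use optimized spatial indexes, or build neighbors dynamically in a learned latent space):

```python
# Brute-force k-nearest-neighbor graph construction over hit positions.
# O(n^2) and purely illustrative; the coordinates below are made up.
import math

def knn_edges(points, k):
    edges = []
    for i, p in enumerate(points):
        # Sort the other points by Euclidean distance (ties broken by index)
        # and connect i to its k closest neighbors.
        nearest = sorted((math.dist(p, q), j)
                         for j, q in enumerate(points) if j != i)
        edges.extend((i, j) for _, j in nearest[:k])
    return edges

hits = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
print(knn_edges(hits, 1))  # → [(0, 1), (1, 0), (2, 0), (3, 1)]
```

The resulting directed edge list determines which neighborhoods each message-passing step can aggregate over, which is exactly why the choice of construction has downstream effects on performance.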
Similarly, the specific GNN architecture used should be informed by the learning objec-
tive and the characteristics of the particle physics data. Although there is not a complete

model of the impact of GNN architecture design on overall model performance, there are
several key considerations that can be informed by knowledge of the underlying physics
problem. For example, the decision of where to place the aggregation step of edge or node
update blocks can be informed by the expected relationship between neighboring nodes;
in [97] the authors assume that jet-tagging performance is affected by the ∆R between
neighboring jets and so their architecture re-embeds graph edges then aggregates the rep-
resentations for node updates while in [50] the authors re-embed each node representation
independently and aggregate at the graph level. The choice of how many GNN modules to
‘stack’ together in an architecture can be informed by the expected distribution of infor-
mation across the graph; in addition to allowing higher-level representations of the data,
multiple GNN modules also allow the exchange of information from progressively distant
neighbors. Including attention mechanisms in a GNN architecture allows the model to learn
to emphasize or deemphasize certain information during the aggregation step, for instance,
allowing a jet tagging model to learn that secondary decay products are the most impor-
tant signal for classification. The popular particle physics GNN architecture GarNet [37] uses a multi-headed self-attention mechanism [98] to allow several different node representations that are weighted and combined with a pre-defined, data-informed function. Finally, the
choice of loss function is subject to several considerations such as whether or not the loss
function preserves the symmetries of the data and the GNN architecture (for example, the
traditional MSE loss function is not invariant with respect to permutations of the output
and targets). Additionally, in multitask architectures like [99] the balancing of task-specific
loss terms becomes an important consideration.
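The MSE point can be checked numerically; the Chamfer-style set distance below is one common permutation-invariant alternative, shown purely for illustration rather than as the choice of any cited work:

```python
# Numerical check that plain MSE is not invariant under permutations of
# the outputs, while a Chamfer-style set distance is.

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def chamfer(pred, target):
    # Symmetric nearest-neighbor distance between two 1-D point sets;
    # independent of the order in which either set is listed.
    return (sum(min((p - t) ** 2 for t in target) for p in pred)
            + sum(min((t - p) ** 2 for p in pred) for t in target))

target = [1.0, 2.0, 3.0]
permuted = [3.0, 2.0, 1.0]
print(mse(permuted, target), mse(target, target))          # values differ
print(chamfer(permuted, target), chamfer(target, target))  # values agree
```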
In addition to implementation related concerns, there are important theoretical con-
straints and considerations. Unlike traditional feed-forward networks, message passing
GNNs are not universal approximators. While precisely understanding the expressive power
of GNNs is still an open question, recent work [100–102] has begun to characterize their
expressivity and potential for useful performance on different task types. In particular, on
the task of graph isomorphism identification, [101] demonstrated that the expressivity of
message passing GNNs is at most equivalent to the Weisfeiler-Lehman heuristic. In the
future, it is important that physicists are aware of these potential limitations to GNN expressivity and engage with graph theory and the broader ML research ecosystem to explore
alternative GNN formalisms.
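This limit can be demonstrated with a small colour-refinement toy: a 6-cycle and two disjoint triangles are not isomorphic, yet 1-WL (and hence a plain message-passing GNN) cannot separate them. The sketch below is illustrative, not taken from the cited works:

```python
# 1-Weisfeiler-Lehman colour refinement: iteratively relabel each node by
# its colour plus the multiset of neighbour colours. A 6-cycle and two
# disjoint triangles are non-isomorphic but, being 2-regular, end up with
# identical colour histograms. Graphs are adjacency lists.

def wl_histogram(adj, rounds=3):
    colors = {v: 0 for v in adj}            # start with a uniform colour
    for _ in range(rounds):
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        relabel = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: relabel[sigs[v]] for v in adj}
    return sorted(colors.values())          # per-graph colour histogram

cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_histogram(cycle6) == wl_histogram(two_triangles))  # → True
```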

Limitations of current tools

Compared to the well-established CV and NLP communities, tooling for graph-structured data is relatively immature. The most popular ML frameworks, PyTorch [103] and TensorFlow/Keras [104], have many built-in tools for CV and NLP tasks (e.g. TorchText and
TorchVision) but do not natively support GNN approaches. Community extensions are
making good progress, but there can be a steep learning curve, since the underlying frame-
works are not tightly coupled with the extension (e.g. development and documentation may
not be aligned). Aside from the direct task of GNN training and inference, there are many
tangential tasks we may wish to perform on graph data. Those tasks which have historical
significance in graph theory are generally supported, for example traversal and community

finding [105]. But these are not yet well-integrated into DL libraries, and are not always
available on accelerators. Further, libraries that are optimized for sparse data on GPU may
not be suitable for scientific data. For example, point cloud tasks often overlap with GNN-
pipeline tasks, but libraries are often limited (or optimized towards) 3 dimensions [106].
A scientific pipeline may need 4 dimensions (e.g. for spacetime vectors) [107], or even N
dimensions (e.g. for operations in some learned latent space) [108].
Again, compared to CNN, RNN and dense graph (i.e. transformer) structures, ML
optimization libraries for GNNs are immature. Many libraries that apply network pruning
or quantization assume a model can be compiled to Onnx [109]. However, Onnx has not
previously supported many of the operations required for message passing, whether the
“scatter reduce” operation in Pytorch, or “unsorted sum” operation in TensorFlow. This
has very recently been implemented, but these operations will need to be tested and carefully
optimized for. Interpretability and explainability are notoriously slippery concepts, and
offering libraries to capture them is already difficult in the scientific space, compared with
mainstream ML. For example, there are many techniques for understanding the performance
of CNN predictions on images, using Captum [110]. However, this doesn’t extend naively
even to images in science - e.g. jet calorimetry image pixels are overlays of many particle
charges. This is even more difficult in graphs, for which concepts of edge and shape feature filtering do not translate easily. Much work will need to be done in understanding even how
to frame the question of interpreting and explaining GNN performance, let alone building
general libraries that can apply to multiple scientific domains.

Speed of algorithms and memory footprint

The discrimination power of GNNs often stems from their ability to capture complex,
sparsely-represented relational features that live on the edges - called “anisotropic message passing” above. This characteristic, which distinguishes GNNs from most other ML ap-
proaches, is also the most costly component of training and inference. In training, gra-
dients must be back-propagated through edge features, meaning memory usage typically scales with the number of edges and the number of message passing steps [111]. While gradi-
ent checkpointing [112] can reduce the scaling with MP steps to O(1), the scaling with edges
still remains a bottleneck for large graphs - as are typical in particle physics events. Some
early solutions to this are discussed in the following section. Memory usage in inference is
not such a hurdle where timing is not a priority. However, in the case of low-latency require-
ments, GNNs have been shown to work on FPGA systems with the HLS4ML library [89,113].
Graph size on this hardware can be tuned between O(10) and O(1000) edges, with latency scaling as the inverse of graph size. While this allows very low latency performance, in some applications graphs may be several orders of magnitude larger, and indiscriminately sampling and distributing a graph across devices can affect a GNN’s predictive capacity [114]. Ultimately, many
of the issues of memory and timing stem from the immaturity of sparse graph operations
on accelerators. Many fused and memory-efficient operations that have been painstakingly
designed for dense image data are still being redesigned for the sparse case.
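A back-of-the-envelope estimate makes the edge scaling concrete; every number below is an illustrative assumption, not a measurement:

```python
# Rough activation-memory estimate for training a message-passing GNN:
# stored edge messages scale linearly in the number of edges and in the
# number of message-passing steps, while gradient checkpointing keeps
# only one step's activations in memory at a time.

def edge_activation_bytes(n_edges, hidden_dim, mp_steps,
                          bytes_per_float=4, checkpointing=False):
    steps_held = 1 if checkpointing else mp_steps
    return n_edges * hidden_dim * steps_held * bytes_per_float

# A tracking-sized graph: ~1e6 edges, 64-dim edge features, 8 MP steps.
full = edge_activation_bytes(1_000_000, 64, 8)
ckpt = edge_activation_bytes(1_000_000, 64, 8, checkpointing=True)
print(f"{full / 2**30:.2f} GiB without vs {ckpt / 2**30:.2f} GiB with checkpointing")
```

Even this crude model shows why edge count, not node count, dominates the training budget for densely connected event graphs.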

Integration of GNNs with full experiment pipelines

Many experiments are now making efforts to incorporate at least some concept of graph
techniques or full GNN architectures in the data gathering and analysis pipeline [89, 115,
116]. Unlike some previous ML solutions for HEP phenomenology, graph-based ML is shown
to benefit from incorporation of both low-level and high-level features [117], presumably due
to its ability to hierarchically represent short and long-distance local information, as well
as graph-level information. While this holds great promise for high-accuracy processing, it
requires careful thought on how to integrate GNN-based pipelines across multiple stages
of an experiment’s dataflow. For example, traditional ATLAS track reconstruction involves many hand-engineered stages that alternate between very low-level pixel charge information, reconstructed spacepoints, and high-level calorimeter towers. Which of these is used depends on the hardware constraints and the goal of the particular stage. A
GNN-based pipeline would benefit from the inclusion of multiple levels of feature, which
must be then propagated across heterogeneous computing devices. Significant time and
expertise are required to perform the same GNN operations across CPU, GPU and FPGA
(as three example modalities for difference latency and budget requirements). And for
the operations which are easily transferable, they are usually not optimized for the target
hardware (e.g. graph construction on CPU vs. GPU). There are promising libraries that
aim to provide modular operations for GNN and graph manipulation, and these will be
discussed in the following section. A utopian situation would be a common HEP pipeline
for graph-based analysis, but this is challenged by the very different data structures across
experiment types. Even within an experiment, a sample may have heterogeneous types for
the points of data (i.e. the nodes) and their relationships (i.e. the edges).
Within an experiment’s GNN integration, there are sizeable differences in training and
inference environments compared to industry and cloud environments. While historical
motivation for GNNs in HEP came from computer vision (CV), where treating physics
data as images has had great success, the implementation of GNNs in HEP fundamentally
shares more similarity with the natural language processing (NLP) world. There, large
language models (LLMs) are built with transformers (conceptually equivalent to GNNs) and
trained across many HPC nodes for many weeks. While training is prohibitively costly for
many, models are then highly optimized for low memory/latency inference and implemented
cheaply. The HEP community may be able to take inspiration from this behavior, where
similar experiments could pool their resources to train large GNNs for complex physics tasks
in the style of LLMs (i.e. on a centralized cloud with data representations as abstract as
possible). For each particular experiment, inference can then be optimized to the particular
hardware, physics and budget constraints, and the general models transfer-trained (i.e.
fine-tuned) towards an experiment’s particular geometry.

Collaboration across domains

In addition to the technical considerations outlined above, we also highlight that interdis-
ciplinary collaboration is critical to the continued success of this research direction. In
particular, in order for physicists to stay apprised of state-of-the-art innovations in graph

representation methods and GNN architectures and theoretical developments in geometric
machine learning, it is essential to build and maintain close collaborations with the broader
ML and CS community. This community building is mutually beneficial as it can advance
physics-informed and physics-inspired ML development and increase the broader research
community’s interest in and knowledge of particle physics computing problems and unique
data considerations. While there are efforts in these areas (for example the ML and the
Physical Sciences Workshop at NeurIPS2 , the Physics Meets ML lecture series3 , the Learn-
ing to Discover workshops and conference4 , and others) there are more structural shifts
required to maximize this information exchange and allow for more collaborative develop-
ment. We point readers to Snowmass papers Broadening the scope of Education, Career
and Open Science in HEP and Data Science and Machine Learning in Education for more
in-depth discussion of cross-disciplinary community-building considerations.
GNNs have been widely utilized in many other domains of physical sciences as well. For
example, GNNs have been successfully used for molecular [118, 119] and chemical property
prediction [120], protein structure and function prediction [121–123], and drug discovery
[124–126], among others. Many of these tasks are computationally similar to particle physics
tasks, and engaging with these research communities and others in the applied geometric
machine learning community will likely yield widely beneficial advances.

4 Future Research Directions

Considering the challenges outlined above, we propose that the community focus research
and development efforts on methods incorporating existing physics knowledge and tech-
niques into GNN architectures, expanding the space of tasks that GNNs are applied to, and
developing new techniques to increase the expressive power and ease of development of these
models. We outline specific areas of focus, and their potential benefits, below. By defining
a strong and complementary research trajectory we hope that the community will be able
to maximize the potential benefits of applying graph-based data representations and GNNs
to particle physics tasks.

Integrating physics knowledge into GNNs

As with all ML approaches, careful attention is being paid to how robust GNNs are to out-of-
distribution data. Especially in particle physics, where anomalies (whether genuinely
from new physics or from some mis-calibration) are expected, any GNN-based pipeline must
be shown to generalize well. A promising approach to this is including knowledge about
the possible distribution in the GNN itself. This can be done in the structure of the data
(e.g. including high-level invariant features), the training procedure (e.g. augmenting with
possible transformations or anomalies), or the architecture itself. The last of these is referred to
2 https://ml4physicalsciences.github.io
3 http://www.physicsmeetsml.org/
4 https://indico.ijclab.in2p3.fr/event/5999/

as an “equivariant,” or symmetry-constrained, GNN. The authors of [127] showed that a GNN constrained
to only perform convolutions that maintain Lorentz symmetry at each intermediate step is
able to perform equally well when exposed to new, highly boosted jet data. There are early
and very positive results in this burgeoning field, and we encourage the reader to review the
Snowmass white paper Symmetry Group Equivariant Architectures for Physics. Many of
the equivariant architectures bearing fruit are graph-based, and as such that work is highly
relevant to the ideas discussed here.
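Including high-level invariant features is concrete and cheap in practice: for example, pairwise Minkowski inner products of particle four-momenta are unchanged under Lorentz boosts, so features built from them generalize across boosts by construction. An illustrative numpy sketch (not the architecture of [127]):

```python
import numpy as np

# Minkowski metric in the (+, -, -, -) convention
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def minkowski_dot(p, q):
    """Lorentz-invariant inner product of four-vectors (E, px, py, pz)."""
    return p @ ETA @ q

def boost_z(p, beta):
    """Boost a four-vector along z with velocity beta (in units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.array([[gamma, 0, 0, -gamma * beta],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [-gamma * beta, 0, 0, gamma]])
    return L @ p

p1 = np.array([5.0, 1.0, 2.0, 3.0])
p2 = np.array([4.0, 0.5, 1.0, 2.0])
m2 = minkowski_dot(p1 + p2, p1 + p2)   # invariant mass squared of the pair
m2_boosted = minkowski_dot(boost_z(p1, 0.6) + boost_z(p2, 0.6),
                           boost_z(p1, 0.6) + boost_z(p2, 0.6))
print(np.isclose(m2, m2_boosted))  # True: the feature is boost-invariant
```

Attaching such invariants as edge features gives the network access to quantities that are stable under the very transformations that generate the out-of-distribution data of concern.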

Data augmentation

Increasing the diversity of data used in training machine learning models without actually
collecting additional data has proven to be a powerful technique to improve model per-
formance and generalization. Compared with other deep learning applications like image
classification [128], systematic studies of data augmentation for GNNs are rather limited.
Data augmentation in GNN training has some unique challenges due to graph irregular-
ity, although some of these can be mitigated with techniques such as utilizing neural edge
predictors [129].
In the context of GNN-based charged particle tracking for an LHC-like tracking detector,
one approach which has shown promising results is to make a copy of each graph in the
training set that has been reflected across the φ-axis [35], where φ is the azimuthal angle
of the track relative to the beam direction. The φ reflection creates the charge conjugate
graph and helps to balance any asymmetry between positively and negatively charged particles
within the training set. Similar augmentation studies to improve model performance could
be undertaken for other GNN applications in HEP.
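Under illustrative assumptions about the input format (one row of cylindrical coordinates r, φ, z per hit), the φ-reflection augmentation amounts to copying each graph and negating the φ column:

```python
import numpy as np

def phi_reflect(node_features, phi_col=1):
    """Return the phi-reflected copy of a graph's node features.

    Assumes one row per hit with the azimuthal angle phi in column
    `phi_col`; negating phi yields the charge-conjugate graph, since
    oppositely charged tracks curve the opposite way in phi."""
    reflected = node_features.copy()
    reflected[:, phi_col] = -reflected[:, phi_col]
    return reflected

hits = np.array([[32.0,  0.70, -10.0],   # (r, phi, z) for three hits
                 [72.0,  0.91,  -8.5],
                 [116.0, 1.20,  -6.0]])
augmented_set = [hits, phi_reflect(hits)]  # original graph + reflected copy
print(len(augmented_set))  # 2
```

The edge list of the reflected graph is unchanged, so the augmentation doubles the training set at essentially zero cost.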

Uncertainty quantification

The predictions of particle physics ML models are often not only used directly (e.g. for
particle identification) but also feed into some eventual downstream calculation like a pre-
cision mass measurement of a SM particle or a distribution fit in a new particle search.
Uncertainty in an ML model can directly limit the statistical significance of this final mea-
surement. Thus, it is critical to consider not just the accuracy of a model’s prediction, but
also its uncertainty and robustness to overconfidence and out-of-distribution samples.
Uncertainty quantification (UQ) is still an open area of research in ML, particularly
for GNNs which have been shown to be sensitive to small perturbations in topology and
node/edge features [130, 131]. Generally, one must consider two types of uncertainty when
characterizing the performance of an ML model: model uncertainty describes how well a
model’s learned parameters fit the actual data distribution, while data uncertainty describes
uncertainty in the data distribution itself arising from noise in the data collection, data
drift, adversarial attacks, statistical artifacts, or other sources. Additionally, the choice of
loss function used to train a model implicitly defines a prior on the distribution of the
residuals and can be considered another potential source of uncertainty or bias in a model

that can limit the model’s ability to robustly estimate the underlying function and create
high-fidelity predictions for unseen data.
Much of the work on UQ for GNNs has focused on model uncertainty. Perhaps the
simplest approach to UQ is model ensembling, where the same model is retrained multiple
times with different random seeds and the variance across the model iterations is a proxy
for the model uncertainty [132]. However, the bulk of recent work has centered on Bayesian
methods that seek to characterize the posterior distribution of a model’s decisions in order
to obtain an estimate of the model’s uncertainty [133]. A common approach to Bayesian
UQ is the so-called ‘Monte Carlo dropout’ method [134] that approximates the posterior
distribution by collecting samples from dropout regularized forward passes of the model;
however, there are also graph-specific methods that perform Bayesian posterior updates
independently for separate nodes or subgraphs [135].
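A minimal numpy sketch of the Monte Carlo dropout procedure (the two-layer network and weights here are illustrative; in a real framework one simply keeps the dropout layers active at inference time):

```python
import numpy as np

rng = np.random.default_rng(42)

def mlp_forward(x, W1, W2, p_drop, rng):
    """Forward pass with dropout left ON, as in MC dropout inference."""
    h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop      # random dropout mask
    h = h * mask / (1.0 - p_drop)             # inverted-dropout scaling
    return h @ W2

x = rng.normal(size=(1, 8))
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 1))

# Collect stochastic forward passes; their spread is the uncertainty proxy.
samples = np.array([mlp_forward(x, W1, W2, 0.5, rng) for _ in range(200)])
mean, std = samples.mean(), samples.std()
print(f"prediction {mean:.3f} +/- {std:.3f}")
```

Model ensembling works the same way at the bookkeeping level, except the samples come from independently trained models rather than dropout masks.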
Unfortunately, Bayesian UQ methods are often not scalable to real-world scientific problems,
and so several other UQ approaches have been developed in parallel, particularly by
the scientific computing community. For example, orthogonal to Bayesian networks are Ev-
idential Deep Learning (EDL) methods that formulate learning as an evidence acquisition
process by modeling class probability predictions as a Dirichlet distribution which directly
provides an estimate of the decision uncertainty [136]; EDL has been applied with great
success in molecular applications including property prediction and discovery [137]. The
authors of [138] propose the use of surrogate models that exploit the benefits of GNNs for
structured physical data while maintaining precise measures of model confidence; in this
scheme a GNN is used to re-embed the data and form latent representations that are then
used as inputs to a model like a Gaussian process that inherently provides UQ. The authors
of [139] demonstrate ‘learning by calibrating’ by posing the estimation of prediction
intervals as an additional learning task and training with a bi-level optimization formulation.
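For a K-class problem, the standard EDL bookkeeping maps non-negative per-class evidence to Dirichlet parameters, expected class probabilities, and a single uncertainty mass; a short sketch of these formulas (following the formulation of [136]):

```python
import numpy as np

def dirichlet_uncertainty(evidence):
    """Given non-negative per-class evidence e_k, the Dirichlet parameters
    are alpha_k = e_k + 1. Expected class probabilities are alpha_k / S
    with S = sum(alpha), and the uncertainty mass is u = K / S."""
    alpha = np.asarray(evidence, dtype=float) + 1.0
    S = alpha.sum()
    probs = alpha / S
    u = len(alpha) / S
    return probs, u

probs_conf, u_conf = dirichlet_uncertainty([90.0, 2.0, 1.0])  # strong evidence
probs_unk, u_unk = dirichlet_uncertainty([0.0, 0.0, 0.0])     # no evidence
print(u_conf, u_unk)  # low uncertainty vs. maximal uncertainty (u = 1)
```

A single forward pass thus yields both a prediction and its uncertainty, which is why EDL scales better than sampling-based Bayesian methods.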
Despite the importance of UQ for any particle physics application of ML, these methods
have not been well explored for real world physics tasks. The most common method of
UQ for particle physics is ensemble methods, but this has not been compared to other
methods in terms of accuracy and efficiency in UQ computation, and in particular has not
been well studied for GNNs specifically. In one of the few examples of these comparisons,
the authors of [140] compare model ensembling (Naive Ensembling), Monte Carlo dropout,
and EDL on three common neutrino physics reconstruction tasks including multi-particle
classification using GNNs. They find that ensemble methods achieve the highest accuracy
and the best calibration of output probability values. While this is promising, there needs to
be substantial additional work comparing the performance of different UQ techniques across
other physics datasets and tasks types. There may be additional benefits from improved
UQ for GNNs; for example, [141] demonstrates that uncertainty estimates for individual
nodes can be used to prevent information from noisy or low-quality data (such as damaged
or malfunctioning detector components) from degrading predictions, and [138] discusses potential uses of surrogate UQ
models for experiment design. We encourage in particular the exploration of techniques
beyond Bayesian UQ as those methods may be intractable for many physics applications.
We also point readers to other Snowmass papers focused on UQ estimation for ML in
particle physics, AI and Uncertainty Quantification and Solving Simulation Systematics in

and with AI/ML.

Instance segmentation approaches

Standard message passing GNNs can only perform classifications at the node, edge, or
graph level. Although these GNNs have been successfully applied to several key particle
physics reconstruction tasks such as tracking, clustering, and jet building, they are typically
used as an intermediate step rather than an end-to-end solution. For example, in
edge-classification-based tracking pipelines, the GNN weighted edges must be passed to a
clustering algorithm to form track candidates that are then fit to extract final track pa-
rameters. In order to enable one-shot solutions to these reconstruction problems, it can
be useful to re-conceptualize them as instance segmentation tasks. Instance segmentation,
a critical task in computer vision, is the detection of individual object instances and their
per-pixel segmentation masks. Recently, modified GNN architectures
have been developed for 3D instance segmentation on point clouds. There have been two
main approaches: predicting bounding shapes to separate individual object instances after
re-embedding the graph through edge and/or node convolutions [142,143] and (possibly at-
tention based) graph pooling/clustering/condensation networks followed by localized node
classification [144–146].
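The intermediate clustering step described above, thresholding GNN edge scores and grouping the surviving connected components into track candidates, can be sketched with a simple union-find (the edge scores below are made up for illustration):

```python
def track_candidates(n_hits, edges, scores, threshold=0.5):
    """Group hits into track candidates: keep edges whose GNN score passes
    the threshold, then return the connected components (union-find)."""
    parent = list(range(n_hits))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (a, b), s in zip(edges, scores):
        if s >= threshold:
            parent[find(a)] = find(b)

    components = {}
    for hit in range(n_hits):
        components.setdefault(find(hit), []).append(hit)
    return sorted(components.values())

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 4)]
scores = [0.97, 0.92, 0.08, 0.95, 0.11]   # hypothetical GNN edge weights
print(track_candidates(6, edges, scores))  # [[0, 1, 2], [3, 4], [5]]
```

An instance segmentation formulation would fold this grouping into the network itself rather than leaving it as a post-processing step.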
This type of GNN architecture has not been well studied for particle physics applications.
As described in [99], there are challenges in adapting boundary shape prediction
models to particle physics data due to the irregular shape of physics objects (particularly
tracks); however, these approaches are still promising for enabling end-to-end reconstruction
pipelines. Additionally, initial work has shown that object condensation approaches can
successfully reconstruct particle clusters in calorimeters [40] and identify their originating
particle type [41]. We encourage the particle physics research community to consider what
other tasks can be described with this framework and to continue exploring both GNN and
non-GNN based instance segmentation architectures. In particular, reducing the number
of separate algorithms needed for end-to-end GNN-based reconstruction could better allow
these models to be used in computing resource constrained environments like the trigger
system.

Interpretability and explainability

ML models are often referred to as ‘black boxes’, as their large number of parameters
and complexity of learned representations can prevent researchers from describing
what information is relevant to a model’s prediction. Interpretability and explainability
are critical research areas in ML that seek to address this issue. There are several estab-
lished explainability methods than can be easily modified to accommodate graph data and
GNNs; these include sensitivity analyses on edge and node features, layer-wise relevance
propagation within individual message passing networks, and disentangled representation
learning that seeks to separate input information into latent features and encode them as
separate dimensions that can then (ideally) be mapped back to human-interpretable features.
GNN attention mechanisms can also be considered an interpretability method
that allows model developers to identify relevant parts of a graph through the attention
scores.
Additionally, there are several explainability methods that have been developed specifically
for GNNs. So-called ‘black-box approximation’ methods train an inherently interpretable
model, like linear regression or decision trees, to approximate the GNN’s decision
boundary in the neighborhood of a specific target node and then use the interpretable model
to explain the GNN decision [147]. Similar to sensitivity analyses, perturbation-based
explainability approaches seek to identify subgraphs that maximally contribute to the
classification of an individual node or edge [148, 149]. Graph filter methods adapt
interpretability techniques originally developed for convolutional filters in CNNs by
incorporating graph kernels into the message passing process to identify relevant local
graph structures [150].
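As a toy illustration of the perturbation-based approach, one can remove one edge at a time and record the change in a model's output; the edges whose removal moves the prediction most form the explanation (the "model" below is a hypothetical stand-in scoring function, not a trained GNN):

```python
def edge_importance(model, edges, node_feats):
    """Score each edge by how much deleting it changes the model output."""
    baseline = model(edges, node_feats)
    importance = []
    for i in range(len(edges)):
        perturbed = edges[:i] + edges[i + 1:]   # drop edge i
        importance.append(abs(model(perturbed, node_feats) - baseline))
    return importance

# Stand-in "model": sums feature differences across connected node pairs.
def toy_model(edges, node_feats):
    return sum(abs(node_feats[a] - node_feats[b]) for a, b in edges)

edges = [(0, 1), (1, 2), (0, 2)]
node_feats = [0.0, 1.0, 5.0]
print(edge_importance(toy_model, edges, node_feats))  # [1.0, 4.0, 5.0]
```

Real GNN explainers like [148, 149] replace this exhaustive loop with learned soft masks, but the underlying question, which edges matter, is the same.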
Interpretability and explainability techniques are particularly relevant to physics applications.
If an ML model outperforms a hand-tuned, physics-driven model, physicists
typically seek to characterize this difference in performance. Explainable GNN methods
have not been widely adopted in particle physics, though there are a few examples. In [77],
the authors applied layer-wise relevance propagation to characterize the relevant nodes and
features for an MLPF model, while in [151] the authors apply symbolic regression as a form of
disentangled representation learning to extract explicit physical relations including known
force laws and Hamiltonians and a new analytic formula that can predict the concentration
of dark matter from the mass distribution of nearby cosmic structures. There are also a few
examples from the broader physical sciences space; in [152] the authors use an Integrated
Gradient method for sensitivity analysis to characterize the relationship between individual
material grains and overall material property prediction, and in [153] the authors apply a
counterfactual perturbation method to understand which components of molecules make
them active for disease treatment.
We encourage physicists to invest in implementing and building upon these explainabil-
ity methods for several reasons. The ability to precisely characterize the reasons for an ML
model’s downstream prediction greatly enhances the reliability, trustworthiness, and stabil-
ity of the model and will better allow researchers to understand in which cases the model
might be inaccurate so that they can create alternatives and fail-safes to prevent data loss
and the introduction of additional uncertainty into measurements. By better understanding
what information and operations are relevant to a model’s predictions researchers may be
able to build more efficient data representations, ML architectures, and even detectors; this
is particularly relevant for GNNs where the data representation can correspond directly
to the physical structure of the data. Additionally, particle physics is an extremely exciting
area in which to explore interpretable and explainable ML techniques because so much of the
physical theory underlying the data is already mathematically described. Thus, as shown
in [151], by mapping the behavior of ML models back into this mathematical language
we can potentially uncover new physics and advance the state-of-the-art in explainable AI
techniques.

New tools

A landslide of community-built libraries is becoming available for graph-centric ML.


As mentioned in the Challenges section, these are often modular and can take some effort
to make compatible, compared with the CV and NLP tools that are included as first-class
citizens in standard frameworks. PyTorch Geometric [154], DGL [155], Graph Nets [29],
Jraph [156], and Spektral [157] are a sample of the most popular kitchen-sink GNN libraries
that run on PyTorch, JAX, and/or TensorFlow. They vary in their focus, but generally they
all attempt to cater to some of the highest hurdles to entry: a model zoo of implemented
GNN operations, a framework and utilities for message passing, and a method of batching
graphs for training.
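The graph batching these libraries provide is conceptually simple: variable-size graphs are merged into one disjoint union by concatenating node features and offsetting edge indices, so a single message-passing call processes the whole batch. A library-agnostic numpy sketch:

```python
import numpy as np

def batch_graphs(graphs):
    """Merge [(x, edge_index), ...] into one disjoint-union graph.

    Node features are concatenated and each graph's edge indices are
    shifted by the running node count, so edges never cross graph
    boundaries; `batch` records which graph each node came from."""
    xs, edges, batch_ids, offset = [], [], [], 0
    for gid, (x, edge_index) in enumerate(graphs):
        xs.append(x)
        edges.append(edge_index + offset)
        batch_ids.append(np.full(len(x), gid))
        offset += len(x)
    return (np.concatenate(xs),
            np.concatenate(edges, axis=1),
            np.concatenate(batch_ids))

g1 = (np.ones((3, 2)), np.array([[0, 1], [1, 2]]))   # 3 nodes, 2 edges
g2 = (np.zeros((2, 2)), np.array([[0], [1]]))        # 2 nodes, 1 edge
x, edge_index, batch = batch_graphs([g1, g2])
print(edge_index)  # [[0 1 3] [1 2 4]]: g2's edge shifted by 3
```

This block-diagonal trick is what lets irregular graphs share the fixed-shape batching machinery that dense image data takes for granted.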
There is still much work to be done in the surrounding graph manipulation technology,
but some early implementations show that industry and scientific communities are moving
to graph techniques in earnest. For example, RAPIDS AI [158] dedicates a whole sublibrary
to accelerated graph techniques, called cuGraph. N-dimensional point cloud libraries are
also being made available for both CPU and GPU, for example for fast graph construction
with exact nearest neighbors [159] and approximate nearest neighbors [160]. Optimization
of GNN operations is also receiving much attention from industry. As mentioned in the
Challenges section, ONNX supports GNN operations, and these will soon be aligned with the
major frameworks. TensorRT [161], a popular library for optimizing CNN-based networks,
is collaborating with several scientific groups to extend these optimizations to GNNs, and
this has already been done in the form of fused operations in the DGL library. Finally,
the recently released Open Graph Benchmark (OGB) [162, 163] fills a significant gap in an
area that CV and NLP have traditionally prioritized: uniform and translatable model evaluation.
This library is compatible with both PyG and DGL and should significantly speed up
comparisons between GNNs, a task that should be compulsory for any new implementation
on a physics use-case but that is typically laborious and inconsistent.
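Nearest-neighbor graph construction, the operation such point-cloud libraries accelerate, reduces to a pairwise-distance computation; a brute-force numpy sketch (exact but O(N²), so only suitable for small point clouds):

```python
import numpy as np

def knn_graph(points, k):
    """Directed kNN edge list: each point connects to its k nearest others."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(-1)                # all pairwise squared distances
    np.fill_diagonal(dist2, np.inf)            # exclude self-loops
    neighbors = np.argsort(dist2, axis=1)[:, :k]
    src = np.repeat(np.arange(len(points)), k)
    return np.stack([src, neighbors.ravel()])  # shape (2, N*k)

points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
edge_index = knn_graph(points, k=1)
print(edge_index)  # [[0 1 2 3] [1 0 3 2]]
```

Libraries like [159, 160] replace the quadratic distance matrix with spatial indexing or approximate search, which is what makes graph construction feasible at detector scale.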

New task types

The majority of GNN applications in HEP focus on well-studied reconstruction and
identification tasks. Recent paradigms emerging in the HEP community, for example anomaly
detection methods for model-agnostic physics searches [164], present an exciting opportu-
nity to develop novel graph-based algorithms. To date, several studies have demonstrated
the applicability of autoencoders applied to particle graphs for anomaly detection [165,166].
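The reconstruction-error scoring at the heart of such autoencoder-based searches can be illustrated with a linear autoencoder (equivalently, PCA); real applications use trained graph autoencoders, but the anomaly score is the same reconstruction error (all data below is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Background" events live near a 1D subspace of the 3D feature space.
background = rng.normal(size=(500, 1)) @ np.array([[1.0, 2.0, 0.5]])
background += 0.05 * rng.normal(size=background.shape)

# Fit a linear autoencoder (PCA): encode onto the top principal component.
_, _, Vt = np.linalg.svd(background - background.mean(0), full_matrices=False)
basis = Vt[:1]                                  # 1D latent space

def anomaly_score(event):
    """Reconstruction error after projecting through the bottleneck."""
    centered = event - background.mean(0)
    recon = centered @ basis.T @ basis
    return np.linalg.norm(centered - recon, axis=-1)

typical = background[0]
anomaly = np.array([0.0, -3.0, 4.0])            # off the learned manifold
print(anomaly_score(typical) < anomaly_score(anomaly))  # True
```

Events that the model fails to compress and reconstruct, i.e. events unlike the background it was trained on, receive high scores and become anomaly candidates.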

New GNN operations

It is difficult to predict the direction that graph neural network research will take, but we can
be informed both by the trajectory of CNNs and transformers (which serve as inspiration
and motivation for much GNN development), as well as by the typical tasks in scientific
analysis. A clear research trend is towards heterogeneous data types, whether at the node,
edge, or whole-graph level, closely related to “multi-modal” learning [167]. An example use case may

be differing volumes within a detector, so that rather than a single MLP within a GNN
being expected to handle multiple sets of features trivially combined, different layers of the
GNN may learn natively on the various input features. This could be extended to the case
of a GNN applied simultaneously to sensors that produce numerical outputs and elsewhere
sensors that output image-like data, akin to vision transformers. These operations are now
natively supported in PyTorch Geometric, for example.
Hierarchical features have proven to be very important in CV research, for example as
extracted by pooling in CNNs. Edge contraction, node pooling, and hypergraph learning
may be one direction to emulate this high performance [168–170]. However, many GNN
operations are limited by the WL isomorphism test mentioned in the Challenges section.
Research is underway on generalizations of these WL-limited GNN operations. Hierarchical
features may be recoverable by including information about neighborhood structure [171]
or by pooling according to high-importance connected edges.
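One simple form of edge contraction pooling can be sketched as follows: greedily pick high-scoring edges, merge each chosen pair of endpoints into a supernode, and average the merged features (the edge scores here are placeholders for learned ones):

```python
import numpy as np

def edge_contraction_pool(x, edges, scores):
    """Greedily contract the highest-scoring edges, merging each chosen
    pair of nodes into one supernode whose feature is their mean."""
    order = np.argsort(scores)[::-1]           # best-scoring edges first
    merged = set()
    cluster = list(range(len(x)))
    for idx in order:
        a, b = edges[idx]
        if a in merged or b in merged:
            continue                           # each node contracts once
        merged |= {a, b}
        cluster[b] = cluster[a] = min(a, b)
    labels = sorted(set(cluster))
    pooled = np.array([x[[i for i, c in enumerate(cluster) if c == lab]].mean(0)
                       for lab in labels])
    return pooled, cluster

x = np.array([[0.0], [2.0], [10.0], [12.0]])
edges = [(0, 1), (1, 2), (2, 3)]
scores = [0.9, 0.2, 0.8]
pooled, cluster = edge_contraction_pool(x, edges, scores)
print(pooled.ravel())  # [ 1. 11.]: pairs (0,1) and (2,3) contracted
```

Repeated rounds of such contraction coarsen the graph level by level, playing the role that strided pooling plays in CNNs.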
These ideas will ultimately feed into more powerful graph generation techniques. Hier-
archical features have long been a staple in image generation, for example in StyleGAN [172]
and Progressive GANs [173], where high fidelity images are generated by relying on different
layers to account for different levels of granularity. Without having GNN operations that
can capture hierarchical information (and that are not prohibitively expensive to compute),
graph generation may be confined to the same capabilities as non-hierarchical CNNs, which
struggle to generate even MNIST [174].

5 Conclusions

ML has become an integral part of particle physics and continued research and develop-
ment in these areas will likely be necessary to maximize the physics potential of current
experiments and to meet the computing requirements of future experiments. Amongst the
data structures currently utilized in ML, graphs are the most intuitive representation for
variable size, geometrically structured particle physics data. The adoption and development
of GNNs for physics tasks has proven extremely successful over the past several years and
given the expressive power of these models it is likely that this success will continue to grow.
Given their ability to handle sparse data and the irregular geometries of some particle
detectors, we propose that GNNs should be a universal benchmark for key particle physics
tasks. In other words, when new ML or traditional physics motivated approaches are
developed for a reconstruction or simulation task, they should be directly compared to a
graph (or set/point cloud) based approach. However, this requires continued or expanded
support and investment from the research community; we particularly encourage work on
the development of relevant software tools for GNN model building and that experiments
ensure that their software pipelines are amenable to the incorporation of these models.

References

[1] K. Albertsson, P. Altoe, D. Anderson, M. Andrews, J.P.A. Espinosa, A. Aurisano


et al., Machine learning in high energy physics community white paper, in Journal of
Physics: Conference Series, vol. 1085, p. 022008, IOP Publishing, 2018.

[2] M.D. Schwartz, Modern machine learning and particle physics, Harvard Data
Science Review 3 (2021) .

[3] M. Feickert and B. Nachman, A living review of machine learning for particle
physics, arXiv preprint arXiv:2102.02770 (2021) .

[4] M.M. Bronstein, J. Bruna, T. Cohen and P. Velickovic, Geometric deep learning:
Grids, groups, graphs, geodesics, and gauges, CoRR abs/2104.13478 (2021)
[2104.13478].

[5] S. Arora, A survey on graph neural networks for knowledge graph completion, CoRR
abs/2007.12374 (2020) [2007.12374].

[6] P.W. Battaglia, R. Pascanu, M. Lai, D. Rezende and K. Kavukcuoglu, Interaction


networks for learning about objects, relations and physics, 1612.00222.

[7] T. Gaudelet, B. Day, A.R. Jamasb, J. Soman, C. Regep, G. Liu et al., Utilizing
graph machine learning within drug discovery and development, Briefings in
Bioinformatics 22 (2021)
[https://academic.oup.com/bib/article-pdf/22/6/bbab159/41087478/bbab159.pdf].

[8] F. Liu, S. Xue, J. Wu, C. Zhou, W. Hu, C. Paris et al., Deep learning for community
detection: progress, challenges and opportunities, arXiv preprint arXiv:2005.08225
(2020) .

[9] P.W. Battaglia, J.B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi,


M. Malinowski et al., Relational inductive biases, deep learning, and graph networks,
arXiv preprint arXiv:1806.01261 (2018) .

[10] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu et al., Graph neural networks: A
review of methods and applications, AI Open 1 (2020) 57.

[11] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang and S.Y. Philip, A comprehensive
survey on graph neural networks, IEEE transactions on neural networks and
learning systems 32 (2020) 4.

[12] Z. Zhang, P. Cui and W. Zhu, Deep learning on graphs: A survey, IEEE
Transactions on Knowledge and Data Engineering (2020) .

[13] J. Shlomi, P. Battaglia and J.-R. Vlimant, Graph neural networks in particle
physics, Mach. Learn.: Sci. and Technol. 2 (2021) 021001 [2007.13681].

[14] J. Duarte and J.-R. Vlimant, Graph neural networks for particle tracking and
reconstruction, in Artificial Intelligence for High Energy Physics, P. Calafiura,
D. Rousseau and K. Terao, eds., p. 387, World Scientific (2022), DOI [2012.01249].

[15] M. Gori, G. Monfardini and F. Scarselli, A new model for learning in graph
domains, in Proceedings. 2005 IEEE International Joint Conference on Neural
Networks, 2005., vol. 2, pp. 729–734 vol. 2, 2005, DOI.

[16] F. Scarselli, M. Gori, A.C. Tsoi, M. Hagenbuchner and G. Monfardini, The graph
neural network model, IEEE Transactions on Neural Networks 20 (2009) 61.

[17] C. Gallicchio and A. Micheli, Graph echo state networks, in The 2010 International
Joint Conference on Neural Networks (IJCNN), pp. 1–8, 2010, DOI.

[18] Y. Li, D. Tarlow, M. Brockschmidt and R. Zemel, Gated graph sequence neural
networks, arXiv preprint arXiv:1511.05493 (2015) .

[19] H. Dai, Z. Kozareva, B. Dai, A. Smola and L. Song, Learning steady-states of


iterative algorithms over graphs, in Proceedings of the 35th International Conference
on Machine Learning, J. Dy and A. Krause, eds., vol. 80 of Proceedings of Machine
Learning Research, pp. 1106–1114, PMLR, 10–15 Jul, 2018,
https://proceedings.mlr.press/v80/dai18a.html.

[20] J. Bruna, W. Zaremba, A. Szlam and Y. LeCun, Spectral networks and locally
connected networks on graphs, 2014.

[21] M. Henaff, J. Bruna and Y. LeCun, Deep convolutional networks on graph-structured


data, CoRR abs/1506.05163 (2015) [1506.05163].

[22] T.N. Kipf and M. Welling, Semi-supervised classification with graph convolutional
networks, CoRR abs/1609.02907 (2016) [1609.02907].

[23] F.R. Chung and F.C. Graham, Spectral graph theory, no. 92, American
Mathematical Soc. (1997).

[24] D.I. Shuman, S.K. Narang, P. Frossard, A. Ortega and P. Vandergheynst, The
emerging field of signal processing on graphs: Extending high-dimensional data
analysis to networks and other irregular domains, IEEE signal processing magazine
30 (2013) 83.

[25] A. Micheli, Neural network for graphs: A contextual constructive approach, IEEE
Transactions on Neural Networks 20 (2009) 498.

[26] J. Atwood and D. Towsley, Diffusion-convolutional neural networks, Advances in


neural information processing systems 29 (2016) .

[27] M. Niepert, M. Ahmed and K. Kutzkov, Learning convolutional neural networks for
graphs, in International conference on machine learning, pp. 2014–2023, PMLR,
2016.

[28] J. Gilmer, S.S. Schoenholz, P.F. Riley, O. Vinyals and G.E. Dahl, Neural message
passing for quantum chemistry, in International conference on machine learning,
pp. 1263–1272, PMLR, 2017.

[29] P.W. Battaglia, J.B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V.F. Zambaldi,


M. Malinowski et al., Relational inductive biases, deep learning, and graph networks,
CoRR abs/1806.01261 (2018) [1806.01261].

[30] C. Song, Y. Lin, S. Guo and H. Wan, Spatial-temporal synchronous graph


convolutional networks: A new framework for spatial-temporal network data
forecasting, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34,
pp. 914–921, 2020.

[31] E. Rossi, B. Chamberlain, F. Frasca, D. Eynard, F. Monti and M.M. Bronstein,


Temporal graph networks for deep learning on dynamic graphs, CoRR
abs/2006.10637 (2020) [2006.10637].

[32] S. Farrell, P. Calafiura, M. Mudigonda, Prabhat, D. Anderson, J.-R. Vlimant et al.,


Novel deep learning methods for track reconstruction, arXiv:1810.06111 [hep-ex,
physics:physics] (2018) .

[33] G. DeZoort, S. Thais, J. Duarte, V. Razavimaleki, M. Atkinson, I. Ojalvo et al.,


Charged Particle Tracking via Edge-Classifying Interaction Networks, Computing
and Software for Big Science 5 (2021) 26.

[34] C. Biscarat, S. Caillou, C. Rougier, J. Stark and J. Zahreddine, Towards a realistic


track reconstruction algorithm based on graph neural networks for the HL-LHC, EPJ
Web of Conferences 251 (2021) 03047.

[35] X. Ju, D. Murnane, P. Calafiura, N. Choma, S. Conlon, S. Farrell et al.,


Performance of a geometric deep learning pipeline for HL-LHC particle tracking,
The European Physical Journal C 81 (2021) 876.

[36] X. Ju, S. Farrell, P. Calafiura, D. Murnane, Prabhat, L. Gray et al., Graph Neural
Networks for Particle Reconstruction in High Energy Physics detectors,
arXiv:2003.11603 [hep-ex, physics:physics] (2020) .

[37] S.R. Qasim, J. Kieseler, Y. Iiyama and M. Pierini, Learning representations of
irregular particle-detector geometry with distance-weighted graph networks, The
European Physical Journal C 79 (2019) 608.

[38] Y. Wang, Y. Sun, Z. Liu, S.E. Sarma, M.M. Bronstein and J.M. Solomon, Dynamic
graph CNN for learning on point clouds, 1801.07829.

[39] Y. Iiyama, G. Cerminara, A. Gupta, J. Kieseler, M. Pierini, M. Rieger et al.,
Application of distance-weighted graph neural networks to real-life particle detector
output, in Second Workshop on Machine Learning and the Physical Sciences,
(Vancouver, Canada), Dec., 2019.

[40] S.R. Qasim, K. Long, J. Kieseler, M. Pierini and R. Nawaz, Multi-particle
reconstruction in the High Granularity Calorimeter using object condensation and
graph neural networks, 2106.01832.

[41] J. Kieseler, Object condensation: one-stage grid-free multi-object reconstruction in
physics detectors, graph, and image data, The European Physical Journal C 80
(2020).

[42] F.A. Di Bello, S. Ganguly, E. Gross, M. Kado, M. Pitt, L. Santi et al., Towards a
Computer Vision Particle Flow, Eur. Phys. J. C 81 (2021) 107 [2003.08863].

[43] L. Gray, T. Klijnsma and S. Ghosh, A Dynamic Reduction Network for Point
Clouds, 2003.08013.

[44] J.A. Martinez, O. Cerri, M. Pierini, M. Spiropulu and J.-R. Vlimant, Pileup
mitigation at the Large Hadron Collider with Graph Neural Networks, 1810.07988.

[45] V. Mikuni and F. Canelli, ABCNet: an attention-based method for particle tagging,
The European Physical Journal Plus 135 (2020).

[46] J. Pata, J. Duarte, J.-R. Vlimant, M. Pierini and M. Spiropulu, MLPF: Efficient
machine-learned particle-flow reconstruction using graph neural networks, Eur. Phys.
J. C 81 (2021) 381 [2101.08578].

[47] J. Pata, J. Duarte, F. Mokhtar, E. Wulff, J. Yoo, J.-R. Vlimant et al., Machine
Learning for Particle Flow Reconstruction at CMS, in 20th International Workshop
on Advanced Computing and Analysis Techniques in Physics Research, 3, 2022
[2203.00330].

[48] H. Serviansky, N. Segol, J. Shlomi, K. Cranmer, E. Gross, H. Maron et al.,
Set2Graph: Learning Graphs From Sets, 2002.08772.

[49] M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Póczos, R. Salakhutdinov and
A.J. Smola, Deep sets, CoRR abs/1703.06114 (2017) [1703.06114].

[50] P.T. Komiske, E.M. Metodiev and J. Thaler, Energy flow networks: deep sets for
particle jets, Journal of High Energy Physics 2019 (2019).

[51] M.J. Dolan and A. Ore, Equivariant energy flow networks for jet tagging, Physical
Review D 103 (2021).

[52] I. Henrion, J. Brehmer, J. Bruna, K. Cho, K. Cranmer, G. Louppe et al., Neural
message passing for jet physics, in Deep Learning for Physical Sciences Workshop at
the 31st Conference on Neural Information Processing Systems (NIPS), 2017.

[53] H. Qu and L. Gouskos, Jet tagging via particle clouds, Phys. Rev. D 101 (2020)
056019.

[54] E.A. Moreno, O. Cerri, J.M. Duarte, H.B. Newman, T.Q. Nguyen, A. Periwal et al.,
JEDI-net: a jet identification algorithm based on interaction networks, The
European Physical Journal C 80 (2020) 58.

[55] J. Shlomi, S. Ganguly, E. Gross, K. Cranmer, Y. Lipman, H. Serviansky et al.,
Secondary vertex finding in jets with neural networks, Eur. Phys. J. C 81 (2021)
540 [2008.02831].

[56] A. Chakraborty, S.H. Lim, M.M. Nojiri and M. Takeuchi, Neural Network-based Top
Tagger with Two-Point Energy Correlations and Geometry of Soft Emissions,
Journal of High Energy Physics 2020 (2020) 111.

[57] ATLAS Collaboration, Deep sets based neural networks for impact parameter flavour
tagging in ATLAS, May 2020.

[58] M. Abdughani, J. Ren, L. Wu and J.M. Yang, Probing stop pair production at the
LHC with graph neural networks, Journal of High Energy Physics 2019 (2019) 55.

[59] J. Ren, L. Wu and J.M. Yang, Unveiling CP property of top-Higgs coupling with
graph neural networks at the LHC, Physics Letters B 802 (2020) 135198.

[60] M. Abdughani, D. Wang, L. Wu, J.M. Yang and J. Zhao, Probing triple Higgs
coupling with machine learning at the LHC, Physical Review D 104 (2021) 056003.

[61] M.J. Fenton, A. Shmakov, T.-W. Ho, S.-C. Hsu, D. Whiteson and P. Baldi,
Permutationless Many-Jet Event Reconstruction with Symmetry Preserving
Attention Networks, 2010.09206.

[62] J.S.H. Lee, I. Park, I.J. Watson and S. Yang, Zero-Permutation Jet-Parton
Assignment using a Self-Attention Network, 2012.03542.

[63] O. Atkinson, A. Bhardwaj, S. Brown, C. Englert, D.J. Miller and P. Stylianou,
Improved Constraints on Effective Top Quark Interactions using Edge Convolution
Networks, 2111.01838.

[64] J.A. Martínez, T.Q. Nguyen, M. Pierini, M. Spiropulu and J.-R. Vlimant, Particle
generative adversarial networks for full-event simulation at the LHC and their
application to pileup description, Journal of Physics: Conference Series 1525 (2020)
012081.

[65] R. Kansal, J. Duarte, H. Su, B. Orzari, T. Tomei, M. Pierini et al., Particle cloud
generation with message passing generative adversarial networks, in Advances in
Neural Information Processing Systems, vol. 34, Curran Associates, Inc., 12, 2021,
https://papers.nips.cc/paper/2021/hash/c8512d142a2d849725f31a9a7a361ab9-Abstract.html
[2106.11535].

[66] A. Hariri, D. Dyachkova and S. Gleyzer, Graph generative models for fast detector
simulations in high energy physics, 2104.01725.

[67] ATLAS collaboration, AtlFast3: the next generation of fast simulation in ATLAS,
2109.02551.

[68] Z. Jia, B. Tillman, M. Maggioni and D.P. Scarpazza, Dissecting the Graphcore IPU
architecture via microbenchmarking, Tech. Rep. (2019).

[69] S. Maddrell-Mander, L.R.M. Mohan, A. Marshall, D. O'Hanlon, K. Petridis,
J. Rademacker et al., Studying the Potential of Graphcore IPUs for Applications in
Particle Physics, Comput. Softw. Big Sci. 5 (2021) 8 [2008.09210].

[70] Habana Labs, Goya inference platform white paper, Tech. Rep. v2.0 (2020).

[71] Habana Labs, Gaudi training platform white paper, Tech. Rep. v1.2 (2020).

[72] N.P. Jouppi, C. Young, N. Patil, D.A. Patterson, G. Agrawal, R. Bajwa et al.,
In-datacenter performance analysis of a tensor processing unit, SIGARCH Comput.
Archit. News 45 (2017) 1 [1704.04760].

[73] K. Rocki, D.V. Essendelft, I. Sharapov, R. Schreiber, M. Morrison, V. Kibardin
et al., Fast stencil-code computation on a wafer-scale processor, in Supercomputing
2020, 2020,
https://sc20.supercomputing.org/proceedings/tech_paper/tech_paper_pages/pap555.html
[2010.03660].

[74] J. Duarte et al., FPGA-accelerated machine learning inference as a service for
particle physics computing, Comput. Softw. Big Sci. 3 (2019) 13 [1904.08986].

[75] J. Krupa et al., GPU coprocessors as a service for deep learning inference in high
energy physics, Mach. Learn. Sci. Tech. 2 (2021) 035005 [2007.10359].

[76] D.S. Rankin et al., FPGAs-as-a-service toolkit (FaaST), in 2020 IEEE/ACM
International Workshop on Heterogeneous High-performance Reconfigurable
Computing (H2RC), p. 38, 11, 2020, DOI [2010.08556].
[77] F. Mokhtar, R. Kansal, D. Diaz, J. Duarte, J. Pata, M. Pierini et al., Explaining
machine-learned particle-flow reconstruction, 2111.12840.
[78] Y. Umuroglu, N.J. Fraser, G. Gambardella, M. Blott, P. Leong, M. Jahre et al.,
FINN: A framework for fast, scalable binarized neural network inference, in
Proceedings of the 2017 ACM/SIGDA International Symposium on
Field-Programmable Gate Arrays, (New York, NY, USA), p. 65, ACM, 2017, DOI
[1612.07119].
[79] M. Blott, T. Preußer, N. Fraser, G. Gambardella, K. O’Brien and Y. Umuroglu,
FINN-R: An end-to-end deep-learning framework for fast exploration of quantized
neural networks, ACM Trans. Reconfigurable Technol. Syst. 11 (2018) [1809.04570].
[80] A. Shawahna, S.M. Sait and A. El-Maleh, FPGA-based accelerators of deep learning
networks for learning and classification: A review, IEEE Access 7 (2019) 7823
[1901.00121].

27
[81] T. Wang, C. Wang, X. Zhou and H. Chen, An overview of FPGA based deep
learning accelerators: Challenges and opportunities, in 2019 IEEE 21st International
Conference on High Performance Computing and Communications; IEEE 17th
International Conference on Smart City; IEEE 5th International Conference on
Data Science and Systems (HPCC/SmartCity/DSS), p. 1674, 2019 [1901.04988].

[82] J. Duarte et al., Fast inference of deep neural networks in FPGAs for particle
physics, J. Instrum. 13 (2018) P07027 [1804.06913].

[83] S. Summers et al., Fast inference of boosted decision trees in FPGAs for particle
physics, 2002.02534.

[84] G. Di Guglielmo et al., Compressing deep neural networks on FPGAs to binary and
ternary precision with hls4ml, Mach. Learn.: Sci. Technol. 2 (2020) 015001
[2003.06308].

[85] C.N. Coelho, A. Kuusela, S. Li, H. Zhuang, J. Ngadiuba, T.K. Aarrestad et al.,
Automatic heterogeneous quantization of deep neural networks for low-latency
inference on the edge for particle detectors, Nat. Mach. Intell. (2021) [2006.10159].

[86] T. Aarrestad et al., Fast convolutional neural networks on FPGAs with hls4ml,
Mach. Learn.: Sci. Technol. 2 (2021) 045015 [2101.05108].

[87] Y. Iiyama et al., Distance-weighted graph neural networks on FPGAs for real-time
particle reconstruction in high energy physics, Front. Big Data 3 (2021) 44
[2008.03601].

[88] A. Heintz, V. Razavimaleki, J. Duarte, G. DeZoort, I. Ojalvo, S. Thais et al.,
Accelerated charged particle tracking with graph neural networks on FPGAs, in 3rd
Machine Learning and the Physical Sciences Workshop at the 34th Annual
Conference on Neural Information Processing Systems, 12, 2020,
https://ml4physicalsciences.github.io/2020/files/NeurIPS_ML4PS_2020_137.pdf
[2012.01563].

[89] A. Elabd et al., Graph Neural Networks for Charged Particle Tracking on FPGAs,
Front. Big Data (2022) [2112.02048].

[90] E. Nurvitadhi, G. Weisz, Y. Wang, S. Hurkat, M. Nguyen, J.C. Hoe et al.,
GraphGen: An FPGA framework for vertex-centric graph computation, in 2014
IEEE 22nd Annual International Symposium on Field-Programmable Custom
Computing Machines, (New York, NY, USA), p. 25, IEEE, 2014, DOI.

[91] M.M. Ozdal, S. Yesil, T. Kim, A. Ayupov, J. Greth, S. Burns et al., Energy efficient
architecture for graph analytics accelerators, Comput. Archit. News 44 (2016) 166.

[92] H. Zeng and V. Prasanna, GraphACT: Accelerating GCN training on CPU-FPGA
heterogeneous platforms, in 2020 ACM/SIGDA International Symposium on
Field-Programmable Gate Arrays, (New York, NY, USA), p. 255, ACM, 2020, DOI
[2001.02498].

[93] M. Yan, L. Deng, X. Hu, L. Liang, Y. Feng, X. Ye et al., HyGCN: A GCN
accelerator with hybrid architecture, in 2020 IEEE International Symposium on High
Performance Computer Architecture (HPCA), (New York, NY, USA), p. 15, IEEE,
2020, DOI [2001.02514].

[94] A. Auten, M. Tomei and R. Kumar, Hardware acceleration of graph neural networks,
in 2020 57th ACM/IEEE Design Automation Conference (DAC), (New York, NY,
USA), p. 1, IEEE, 2020, DOI.

[95] T. Geng, A. Li, R. Shi, C. Wu, T. Wang, Y. Li et al., AWB-GCN: A graph
convolutional network accelerator with runtime workload rebalancing, in 53rd
IEEE/ACM International Symposium on Microarchitecture, (New York, NY, USA),
2020, DOI [1908.10834].

[96] K. Kiningham, C. Re and P. Levis, “GRIP: A graph neural network accelerator
architecture.” 7, 2020.

[97] F.A. Di Bello, J. Shlomi, C. Badiali, G. Frattari, E. Gross, V. Ippolito et al.,
Efficiency Parameterization with Neural Networks, Comput. Softw. Big Sci. 5 (2021)
14 [2004.02665].

[98] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez et al.,
Attention is all you need, CoRR abs/1706.03762 (2017) [1706.03762].

[99] S. Thais and G. DeZoort, Instance segmentation GNNs for one-shot conformal
tracking at the LHC, 2021.

[100] M. Fereydounian, H. Hassani, J. Dadashkarimi and A. Karbasi, The exact class of
graph functions generated by graph neural networks, ArXiv abs/2202.08833 (2022).

[101] K. Xu, W. Hu, J. Leskovec and S. Jegelka, How powerful are graph neural
networks?, CoRR abs/1810.00826 (2018) [1810.00826].

[102] R. Sato, A survey on the expressive power of graph neural networks, CoRR
abs/2003.04078 (2020) [2003.04078].

[103] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito et al., Automatic
differentiation in PyTorch, in NIPS Autodiff Workshop, 2017.

[104] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro et al., TensorFlow:
Large-scale machine learning on heterogeneous systems, 2015.

[105] A. Hagberg, P. Swart and D. Schult, Exploring network structure, dynamics, and
function using NetworkX, Tech. Rep., Los Alamos National Lab. (LANL), Los
Alamos, NM (United States) (2008).

[106] N. Ravi, J. Reizenstein, D. Novotný, T. Gordon, W. Lo, J. Johnson et al.,
Accelerating 3D deep learning with PyTorch3D, CoRR abs/2007.08501 (2020)
[2007.08501].

[107] H. Wang, L. Yang, X. Rong, J. Feng and Y. Tian, Self-supervised 4d spatio-temporal
feature learning via order prediction of sequential point cloud clips, in Proceedings of
the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV),
pp. 3762–3771, January, 2021.

[108] X. Ju et al., Performance of a geometric deep learning pipeline for HL-LHC particle
tracking, Eur. Phys. J. C 81 (2021) 876 [2103.06995].

[109] J. Bai, F. Lu, K. Zhang et al., “ONNX: Open Neural Network Exchange.”
https://github.com/onnx/onnx, 2019.

[110] N. Kokhlikyan, V. Miglani, M. Martin, E. Wang, B. Alsallakh, J. Reynolds et al.,
Captum: A unified and generic model interpretability library for PyTorch, 2020.

[111] A. Tripathy, K. Yelick and A. Buluç, Reducing communication in graph neural
network training, in SC20: International Conference for High Performance
Computing, Networking, Storage and Analysis, pp. 1–14, 2020, DOI.

[112] N.S. Sohoni, C.R. Aberger, M. Leszczynski, J. Zhang and C. Ré, Low-memory neural
network training: A technical report, CoRR abs/1904.10631 (2019) [1904.10631].

[113] F. Fahim, B. Hawks, C. Herwig, J. Hirschauer, S. Jindariani, N. Tran et al., hls4ml:
An open-source codesign workflow to empower scientific low-power machine learning
devices, in 1st tinyML Research Symposium, 03, 2021 [2103.05579].

[114] W. Hamilton, Z. Ying and J. Leskovec, Inductive representation learning on large
graphs, in Advances in Neural Information Processing Systems, I. Guyon,
U.V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan et al., eds.,
vol. 30, Curran Associates, Inc., 2017,
https://proceedings.neurips.cc/paper/2017/file/5dd9db5e033da9c6fb5ba83c7a7ebea9-Paper.pdf
[1706.02216].

[115] S.R. Qasim, K. Long, J. Kieseler, M. Pierini and R. Nawaz, Multi-particle
reconstruction in the high granularity calorimeter using object condensation and
graph neural networks, EPJ Web of Conferences (2021).

[116] R. Roberts and Atlas Experiment Collaboration, Graph Neural Network to Measure
Four-top Production with the ATLAS Detector, in APS April Meeting Abstracts,
vol. 2021 of APS Meeting Abstracts, p. Q19.007, Jan., 2021.

[117] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang and P.S. Yu, A comprehensive survey on
graph neural networks, IEEE Transactions on Neural Networks and Learning
Systems 32 (2019) 4.

[118] H. Stärk, D. Beaini, G. Corso, P. Tossou, C. Dallago, S. Günnemann et al., 3D
Infomax improves GNNs for molecular property prediction, CoRR abs/2110.04126
(2021) [2110.04126].

[119] J. Gilmer, S.S. Schoenholz, P.F. Riley, O. Vinyals and G.E. Dahl, Neural message
passing for quantum chemistry, CoRR abs/1704.01212 (2017) [1704.01212].

[120] V. Fung, J. Zhang, E. Juarez and B. Sumpter, Benchmarking graph neural networks
for materials chemistry, npj Computational Materials 7 (2021).

[121] T. Xia and W.-S. Ku, Geometric graph representation learning on protein structure
prediction, (New York, NY, USA), Association for Computing Machinery, 2021, DOI.

[122] A. Kabir and A. Shehu, Graph neural networks in predicting protein function and
interactions, in Graph Neural Networks: Foundations, Frontiers, and Applications,
L. Wu, P. Cui, J. Pei and L. Zhao, eds., (Singapore), pp. 541–556, Springer
Singapore (2022).

[123] K. Tunyasuvunakool, J. Adler, Z. Wu, T. Green, M. Zielinski, A. Žídek et al., Highly
accurate protein structure prediction for the human proteome, Nature 596 (2021) 1.

[124] K. Han, B. Lakshminarayanan and J.T. Liu, Reliable graph neural networks for drug
discovery under distributional shift, CoRR abs/2111.12951 (2021) [2111.12951].

[125] R. Mercado, T. Rastemo, E. Lindelöf, G. Klambauer, O. Engkvist, H. Chen et al.,
Graph networks for molecular design, Machine Learning: Science and Technology 2
(2021) 025023.

[126] T. Gaudelet, B. Day, A. Jamasb, J. Soman, C. Regep, G. Liu et al., Utilising graph
machine learning within drug discovery and development, ArXiv abs/2012.05716
(2020).

[127] A. Bogatskiy, B. Anderson, J.T. Offermann, M. Roussi, D.W. Miller and R. Kondor,
Lorentz Group Equivariant Neural Network for Particle Physics, 2006.04780.

[128] L. Perez and J. Wang, The effectiveness of data augmentation in image classification
using deep learning, CoRR abs/1712.04621 (2017) [1712.04621].

[129] T. Zhao, Y. Liu, L. Neves, O.J. Woodford, M. Jiang and N. Shah, Data
augmentation for graph neural networks, CoRR abs/2006.06830 (2020)
[2006.06830].

[130] D. Zügner and S. Günnemann, Adversarial attacks on graph neural networks via
meta learning, CoRR abs/1902.08412 (2019) [1902.08412].

[131] K. Xu, H. Chen, S. Liu, P. Chen, T. Weng, M. Hong et al., Topology attack and
defense for graph neural networks: An optimization perspective, CoRR
abs/1906.04214 (2019) [1906.04214].

[132] B. Lakshminarayanan, A. Pritzel and C. Blundell, Simple and scalable predictive
uncertainty estimation using deep ensembles, in NIPS, 2017.

[133] J. Mena, O. Pujol and J. Vitrià, A survey on uncertainty estimation in deep learning
classification systems from a Bayesian perspective.

[134] Y. Gal and Z. Ghahramani, Dropout as a Bayesian approximation: Representing
model uncertainty in deep learning, in Proceedings of The 33rd International
Conference on Machine Learning, M.F. Balcan and K.Q. Weinberger, eds., vol. 48 of
Proceedings of Machine Learning Research, (New York, New York, USA),
pp. 1050–1059, PMLR, 20–22 Jun, 2016,
https://proceedings.mlr.press/v48/gal16.html.

[135] M. Stadler, B. Charpentier, S. Geisler, D. Zügner and S. Günnemann, Graph
posterior network: Bayesian predictive uncertainty for node classification, ArXiv
abs/2110.14012 (2021).

[136] M. Sensoy, M. Kandemir and L.M. Kaplan, Evidential deep learning to quantify
classification uncertainty, CoRR abs/1806.01768 (2018) [1806.01768].

[137] A.P. Soleimany, A. Amini, S. Goldman, D. Rus, S.N. Bhatia and C.W. Coley,
Evidential deep learning for guided molecular property prediction and discovery, ACS
Central Science 7 (2021) 1356 [https://doi.org/10.1021/acscentsci.1c00546].

[138] J. Allotey, K.T. Butler and J. Thiyagalingam, Entropy-based active learning of graph
neural network surrogate models for materials properties, The Journal of Chemical
Physics 155 (2021) 174116.

[139] J.J. Thiagarajan, R. Anirudh, P.-T. Bremer and B. Venkatesh, Uncertainty
quantification in scientific ML.

[140] D.H. Koh, Evaluating deep learning uncertainty quantification methods for neutrino
physics applications, 2021.

[141] B. Feng, Y. Wang, Z. Wang and Y. Ding, Uncertainty-aware attention graph neural
network for defending adversarial attacks, 2009.10235.

[142] M.A. Ansari, M. Meraz, P. Chakraborty and M. Javed, Angle based feature learning
in GNN for 3d object detection using point cloud, CoRR abs/2108.00780 (2021)
[2108.00780].

[143] W. Shi and R. Rajkumar, Point-GNN: Graph neural network for 3D object detection
in a point cloud, in Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition (CVPR), June, 2020.

[144] T. Li, K. Zhang, S. Shen, B. Liu, Q. Liu and Z. Li, Image co-saliency detection and
instance co-segmentation using attention graph clustering based graph convolutional
network, IEEE Transactions on Multimedia 24 (2022) 492.

[145] C.R. Qi, H. Su, K. Mo and L.J. Guibas, PointNet: Deep learning on point sets for
3D classification and segmentation, CoRR abs/1612.00593 (2016) [1612.00593].

[146] X.-L. Yun, Y.-M. Zhang, F. Yin and C.-L. Liu, Instance GNN: A learning framework
for joint symbol segmentation and recognition in online handwritten diagrams, IEEE
Transactions on Multimedia (2021) 1.

[147] Q. Huang, M. Yamada, Y. Tian, D. Singh, D. Yin and Y. Chang, GraphLIME: Local
Interpretable Model Explanations for Graph Neural Networks, arXiv e-prints (2020)
arXiv:2001.06216 [2001.06216].

[148] R. Ying, D. Bourgeois, J. You, M. Zitnik and J. Leskovec, GNN explainer: A tool
for post-hoc explanation of graph neural networks, CoRR abs/1903.03894 (2019)
[1903.03894].

[149] A. Lucic, M. ter Hoeve, G. Tolomei, M. de Rijke and F. Silvestri, CF-GNNExplainer:
Counterfactual explanations for graph neural networks, CoRR abs/2102.03322
(2021) [2102.03322].

[150] A. Feng, C. You, S. Wang and L. Tassiulas, KerGNNs: Interpretable graph neural
networks with graph kernels, CoRR abs/2201.00491 (2022) [2201.00491].

[151] M.D. Cranmer, A. Sanchez-Gonzalez, P.W. Battaglia, R. Xu, K. Cranmer,
D.N. Spergel et al., Discovering symbolic models from deep learning with inductive
biases, CoRR abs/2006.11287 (2020) [2006.11287].

[152] M. Dai, M. Demirel, Y. Liang and J. Hu, Graph neural networks for an accurate and
interpretable prediction of the properties of polycrystalline materials.

[153] G.P. Wellawatte, A. Seshadri and A.D. White, Model agnostic generation of
counterfactual explanations for molecules, Chem. Sci. (2022).

[154] M. Fey and J.E. Lenssen, Fast graph representation learning with PyTorch
Geometric, in ICLR Workshop on Representation Learning on Graphs and
Manifolds, 2019.

[155] M. Wang, L. Yu, D. Zheng, Q. Gan, Y. Gai, Z. Ye et al., Deep graph library:
Towards efficient and scalable deep learning on graphs, CoRR abs/1909.01315
(2019) [1909.01315].

[156] J. Godwin*, T. Keck*, P. Battaglia, V. Bapst, T. Kipf, Y. Li et al., Jraph: A library
for graph neural networks in JAX, 2020.

[157] D. Grattarola and C. Alippi, Graph neural networks in TensorFlow and Keras with
Spektral, CoRR abs/2006.12138 (2020) [2006.12138].

[158] R.D. Team, RAPIDS: Collection of Libraries for End to End GPU Data Science,
2018.

[159] L. Xue, D. Murnane and V. Sreenivasan, “Fixed radius nearest neighbors.”
https://github.com/murnanedaniel/FRNN, 2022.

[160] J. Johnson, M. Douze and H. Jégou, Billion-scale similarity search with GPUs,
IEEE Transactions on Big Data 7 (2019) 535.

[161] “TensorRT.” https://developer.nvidia.com/tensorrt, 2022.

[162] W. Hu, M. Fey, M. Zitnik, Y. Dong, H. Ren, B. Liu et al., Open graph benchmark:
Datasets for machine learning on graphs, 2005.00687.

[163] W. Hu, M. Fey, H. Ren, M. Nakata, Y. Dong and J. Leskovec, OGB-LSC: A
large-scale challenge for machine learning on graphs, 2103.09430.

[164] G. Kasieczka, B. Nachman, D. Shih, O. Amram, A. Andreassen, K. Benkendorfer
et al., The LHC Olympics 2020: A Community Challenge for Anomaly Detection in
High Energy Physics, Reports on Progress in Physics 84 (2021) 124201.

[165] O. Atkinson, A. Bhardwaj, C. Englert, V.S. Ngairangbam and M. Spannowsky,
Anomaly detection with Convolutional Graph Neural Networks, Journal of High
Energy Physics 2021 (2021) 80.

[166] S. Tsan, R. Kansal, A. Aportela, D. Diaz, J. Duarte, S. Krishna et al., Particle
Graph Autoencoders and Differentiable, Learned Energy Mover’s Distance,
2111.12849.

[167] T. Baltrusaitis, C. Ahuja and L. Morency, Multimodal machine learning: A survey
and taxonomy, CoRR abs/1705.09406 (2017) [1705.09406].

[168] Y. Wang, S. Wang, Q. Yao and D. Dou, Hierarchical heterogeneous graph
representation learning for short text classification, CoRR abs/2111.00180 (2021)
[2111.00180].

[169] F.M. Bianchi, D. Grattarola, L. Livi and C. Alippi, Hierarchical representation
learning in graph neural networks with node decimation pooling, CoRR
abs/1910.11436 (2019) [1910.11436].

[170] R. Ying, J. You, C. Morris, X. Ren, W.L. Hamilton and J. Leskovec, Hierarchical
graph representation learning with differentiable pooling, CoRR abs/1806.08804
(2018) [1806.08804].

[171] G. Bouritsas, F. Frasca, S. Zafeiriou and M.M. Bronstein, Improving graph neural
network expressivity via subgraph isomorphism counting, CoRR abs/2006.09252
(2020) [2006.09252].

[172] T. Karras, S. Laine and T. Aila, A style-based generator architecture for generative
adversarial networks, CoRR abs/1812.04948 (2018) [1812.04948].

[173] T. Karras, T. Aila, S. Laine and J. Lehtinen, Progressive growing of gans for
improved quality, stability, and variation, CoRR abs/1710.10196 (2017)
[1710.10196].

[174] K. Cheng, R. Tahir, L.K. Eric and M. Li, An analysis of generative adversarial
networks and variants for image synthesis on mnist dataset, Multimedia Tools and
Applications 79 (2020) 13725.
