Physics-informed neural network (PINN) modeling of charged particle multiplicity
using the two-component framework in heavy-ion collisions: A comparison with data-driven neural networks

Akash Das1 [email protected]    Satya Ranjan Nayak2 [email protected]    B. K. Singh1,2 [email protected] [email protected] 1Discipline of Natural Sciences, PDPM Indian Institute of Information Technology Design & Manufacturing, Jabalpur 482005, INDIA.
2Department of Physics, Institute of Science,
Banaras Hindu University (BHU), Varanasi 221005, INDIA.
(November 7, 2025)
Abstract

In this study, we employ a conventional deep neural network (NN) framework integrated with physics-based constraints to predict charged hadron multiplicity (NchN_{\text{ch}}) in heavy-ion collisions. The goal is to assess the performance of a purely data-driven deep neural network in comparison to a physics-informed neural network (PINN). To accomplish this, we have taken data generated from the HYDJET++ model for testing and training purposes. We train our neural network frameworks using the data of one million individual Zr4096+4096Zr{}^{96}_{40}\text{Zr}+^{96}_{40}\text{Zr} collision events. Our PINN model successfully extracts the hard-scattering fraction (xx) by learning its underlying relation from the event data. For further testing and comparison with the conventional NN, we take data of Ru4496+4496Ru{}^{96}_{44}\text{Ru}+^{96}_{44}\text{Ru} (isobar of Zr) and Au79197+79197Au{}^{197}_{79}\text{Au}+^{197}_{79}\text{Au} collisions using the same simulation model. We found that the NN model needs more time to train with physics. However, once trained, the PINN model is capable of accurately predicting data that it has not encountered during training, such as Au+Au collision results. Especially in a region of sparse data corresponding to high NchN_{\text{ch}} in our study, PINN has a clear advantage over a simple NN.

preprint: APS/123-QED

I Introduction

Heavy-ion collisions at relativistic energies provide a unique environment to study the properties of the quark-gluon plasma (QGP) [1, 2]. Experiments have been conducted on various colliding systems, such as Au79197+79197Au{}^{197}_{79}\text{Au}+^{197}_{79}\text{Au}, Ru4496+4496Ru{}^{96}_{44}\text{Ru}+^{96}_{44}\text{Ru}, and Zr4096+4096Zr{}^{96}_{40}\text{Zr}+^{96}_{40}\text{Zr} , to explore the collective behavior, nuclear structure properties, and isobaric effects [4, 3]. The charged particle multiplicity (NchN_{\text{ch}}) and its correlations serve as a probe to know the initial geometry and particle production mechanism.

Relativistic Heavy Ion Collider (RHIC) and Large Hadron Collider (LHC) record particle tracks in detectors following high-energy nuclear collisions. These tracks are then correlated with the collision geometry using the Glauber model calculations [5, 6]. The Glauber framework provides the number of nucleons taking part in the collision (NpartN_{\text{part}}) and the number of binary nucleon-nucleon collisions (NcollN_{\text{coll}}) corresponding to the impact parameter (bb). The NchN_{\text{ch}} is empirically related to NpartN_{\text{part}} and NcollN_{\text{coll}} via a two-component formula:

dNchdη=npp[(1x)Npart2+xNcoll]\frac{dN_{\text{ch}}}{d\eta}=n_{pp}\left[(1-x)\cdot\frac{N_{\text{part}}}{2}+x\cdot N_{\text{coll}}\right] (1)

Since NpartN_{\text{part}} counts the total number of participating nucleons from both nuclei, we divide it by 2 to get the effective number of participant pairs. xx represents the weight of the hard scattering contribution, and nppn_{pp} is the average number of charged hadrons produced in individual proton-proton collisions. The contributions from hard processes scale with NcollN_{\text{coll}} and soft processes with NpartN_{\text{part}}.

In recent years, neural networks emerged as a promising tool in heavy-ion collision physics, enabling data-driven modeling of complex nonlinear dependencies [7, 8]. Although effective, the standard neural networks (NN) rely purely on the statistical correlations of the data and lack physical interoperability. To overcome this limitation of conventional NN, a more advanced version of NN was introduced, known as Physics-Informed Neural Networks (PINN) [9]. This embeds physical laws directly into the NN’s loss function and allows simultaneous learning from the data and theoretical constraints. Thus, PINN improves the robustness of the learning model in regions where the data are sparse.

In this work, we employed both a conventional neural network (NN) and a physics-informed neural network (PINN) to predict NchN_{\text{ch}} in isobaric collisions. The models are trained on Zr+Zr data generated from the HYDJET++ model [10, 11], while testing is performed on Ru+Ru and Au+Au data generated by the same process. As an isobar system, Ru+Ru collision data serves as the seen test dataset, since the model was trained on Zr+Zr data. In contrast, the Au+Au collision data represents a completely unseen dataset during the training phase. The comparative analysis of NN and PINN predictions in these testing systems highlights the role of physics constraints in improving generalization across both seen and unseen collision systems. The study provides an initial step toward integrating PINN for analysis and prediction of observables in relativistic heavy-ion collisions.

The letter is organized as follows. In section-II\mathrm{II}, we have discussed broadly about the NN architecture, Pythia, and HYDJET++ models. In section-III\mathrm{III}, we provide the result and discussion part. In section-IV\mathrm{IV}, we have summarized our work.

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Figure 1: Distributions of (a) impact parameter (bb), (b) number of binary nucleon-nucleon collisions (NcollN_{coll}), (c) number of participant nucleons (NpartN_{part}), and (d) number of charged hadrons (NchN_{ch}) in the dataset.

II Model Formalism

II.1 Physics framework

The heavy-ion dataset was generated using the HYDJET++ event generator. HYDJET++ is a combination of two independent algorithms, i.e, PYQUEN for hard interactions and FASTMC for thermal particle production. The simulation starts by calculating the number of participants and the number of binary collisions using the Glauber model. Then the initial parton spectra are generated using PYTHIA6 [12]. PYQUEN incorporates the collisional and radiational energy loss of the partons inside the dense QGP medium. The thermal state is generated on freezeout hypersurfaces with preset freezeout conditions. The model assumes both chemical freezeout and thermal freezeout with Tch>TthT_{ch}>T_{th} [13, 14]. The final hadronization is performed according to the LUND string fragmentation model [15, 16].

The multiplicity in pp collisions (nppn_{pp}) was generated using PYTHIA 8 [17, 18]. PYTHIA is an all-purpose Monte Carlo event generator used to simulate a wide range of collision systems. The hard interaction between the valency quarks is calculated using standard pQCD tools. The softer interactions between the beam remnants are simulated using the Multi-Parton interaction (MPI) framework. A detailed description can be found in the corresponding papers [19]. The parton showers are simulated simultaneously and added back to the final state. A color reconnection [20] is performed between the qq¯q\bar{q} dipoles to reduce the number of long strings and control the multiplicity. Finally, the strings are fragmented into hadrons according to the Lund string fragmentation model [15, 16]. A detailed description of PYTHIA can be found in the PYTHIA manual [12].

II.2 Model Architecture and Data Processing

In this study, we employed the PyTorch deep learning framework [21] to construct and train the PINN for predicting NchN_{\text{ch}} based on event-level observables. Here, event-level observables denote the NpartN_{\text{part}}, NcollN_{\text{coll}} and the impact parameter (bb) for each collision event. PyTorch’s ecosystem offers a flexible and modular platform for implementing the NN framework. It has some core features such as GPU acceleration, nn.Module, automatic differentiation, learning rate scheduling, gradient clipping, etc., which help to achieve stable and efficient optimization.

For training the neural networks, we employ a dataset generated from simulated Zr+Zr collision events at RHIC energies using the HYDJET++ event generator. The dataset consists of four features per event:- bb, NpartN_{\text{part}}, NcollN_{\text{coll}}, and NchN_{\text{ch}}. The input vector X = (b, NpartN_{\text{part}}, NcollN_{\text{coll}}) is fed into a fully connected feedforward network, while the target variable Y = NchN_{\text{ch}} serves as a prediction target.

Before training, all the input features and the target variables are normalized to the range [0,1][0,1] using min-max scaling [22]. The normalized data is converted to PyTorch tensors and split into an 8080%-2020% train-test partition using random_split. The batch size is adjusted according to the size of the training dataset. For a training dataset of one million samples, a batch size of 1000 is used to ensure efficient training using PyTorch’s DataLoader [21].

The neural network architecture consists of an input layer of dimension 3, followed by three hidden layers containing 128, 64, and 32 neurons, respectively. The first hidden layer uses a hyperbolic tangent (tanh) activation function, followed by Rectified Linear Unit (ReLU) activations in subsequent layers [23]. Dropout layers with a rate of 0.1 are applied after the second and third hidden layers to mitigate overfitting. A learning rate of 0.0001 is employed for training, and the mean squared error (MSE) is used as the loss function. The output layer produces a single scalar prediction corresponding to NchN_{\text{ch}}.

A physics module is incorporated in parallel with the data-driven neural network to enforce known physical constraints. This module implements a fixed analytical expression for charged hadron multiplicity inspired by the Glauber two-component model Eq. (1). In this module, nppn_{\text{pp}} is treated as a non-trainable buffer using register_buffer. On the other hand xx is taken as a scaler trainable parameter defined as self.x = nn.Parameter(torch.tensor(7.0, dtype=torch.float32)). The value 7.0 is a random initial guess for the xx value. Since it is wrapped with nn.Parameter, it is registered as a model parameter and included in the gradient-based optimization.

The overall training is guided by a composite loss function:

total=data+λphysics\mathcal{L}_{\text{total}}=\mathcal{L}_{\text{data}}+\lambda\,\mathcal{L}_{\text{physics}} (2)

The data\mathcal{L}_{\text{data}} quantifies how much difference there is between the NN prediction and simulated data, while physics\mathcal{L}_{\text{physics}} penalizes deviation from the already known physics-based model prediction. Both losses use the MSELoss function to ensure robustness against the outliers. The hyperparameter λ\lambda adjusts how much importance we give to the physics rules in matching the data.

The training was carried out on Google Colab using an NVIDIA T4 accelerator with 16 GB of memory. The T4 provides hardware acceleration optimized for deep learning, enabling efficient parallel computations. PyTorch was employed as the deep learning framework, taking advantage of Colab’s GPU/TPU backend for faster neural network training and experimentation. The PINN takes about 1.5 to 2 times longer to train compared to the data-driven neural network.

Refer to caption
Figure 2: Comparison of simulated (actual) and PINN predicted NchN_{\text{ch}} for Zr with λ\lambda=0.410.41.
Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Figure 3: Comparison of NN-predicted NchN_{\text{ch}} and PINN predicted NchN_{\text{ch}} for (a) λ\lambda = 0.1 and (c) λ\lambda = 1.0, and loss evolution curves for (b) λ\lambda = 0.1 and (d) λ\lambda = 1.0 for Zr+Zr datasets.
Metric 𝝀=0.1\bm{\lambda=0.1} 𝝀=𝟏\bm{\lambda=1}
R2R^{2} 0.9880 0.9841
RMSE 23.16 26.07
MAE 14.85 17.16
Table 1: Performance metrics for different values of λ\lambda, trained and tested on Zr+Zr data points for 100 epochs.

III Result and discussions

The input data distributions are shown in Fig. 1. An intrinsic non-uniformity is present in the data. In the Glauber model, events are sampled with a probability proportional to 2πbdb2\pi b\,db, so smaller bb values are less probable than intermediate ones. The distribution peaks around b2b\sim 2 fm before dropping off due to the finite nuclear size. On account of the reduced number of events at small impact parameters (bb), the NpartN_{\text{part}} distribution exhibits an exponential fall-off. The NcollN_{\text{coll}} and NchN_{\text{ch}} are directly determined by the the NpartN_{\text{part}}, therefore their distributions follow the same trend. Due to this, our model has fewer training samples in the high-NchN_{\text{ch}} region. To handle this sparse-data regime, a PINN is a good choice.

A total of 10610^{6} minimum bias Zr+Zr collision events were simulated at sNN=200\sqrt{s_{NN}}=200 GeV using the HYDJET++ model, assuming spherical nuclear configurations obtained from the density functional theory (DFT) calculations [24]. Those events are selected that fall under the pseudorapidity cut |η|<0.5|\eta|<0.5.

The code for purely data-driven and physics-based models is nearly the same, with the latter incorporating an additional physics-informed loss term. We started with predicting the value of xx using PINN. For that, we took xx as a trainable parameter in our code. The value of nppn_{pp} is fixed to 3.602 as obtained with the help of PYTHIA. The model trains on 1 million minimum-bias Zr+Zr collision data points converged within 32 epochs (early stopped), yielding a predicted value of x0.4097x\approx 0.4097, which we round off to 0.41. To validate this result, we fixed xx together with nppn_{pp} and performed multiple trials by varying the value of xx from 0.1 to 1.0. The parity plot confirms that x=0.41x=0.41 provides the best agreement with the data, as shown in Fig. 2.

Refer to caption
Figure 4: Loss trends for 500 training data-sets of Zr+Zr collisions.
Refer to caption
Figure 5: Comparison of simulated (actual) and PINN predicted NchN_{\text{ch}} for the test data of Ru+Ru.
Refer to caption

(a)

Refer to caption

(b)

Refer to caption

(c)

Figure 6: Comparison of simulated (actual) and model-predicted NchN_{\text{ch}} for the Au+Au test data: (a) without physics, (b) λ=0.1\lambda=0.1, and (c) λ=1.0\lambda=1.0.

Taking xx as a fixed parameter, we conducted trials to determine the value of λ\lambda that provides a good balance between data loss and physics loss. The corresponding plots are presented in Fig. 3, obtained using one-tenth of the original dataset (0.1 million events) of Zr+Zr collisions and trained for 100 epochs. The total data set is divided into an 8080%-2020% split, with 2020% reserved for testing. Performance metrics [25] are tabulated in the Table I. Since λ=0.1\lambda=0.1 does not strongly enforce physics, it is more suitable for fitting the data at a few number of epochs. As the value of λ\lambda increases, the neural network output and physics-based model show strong agreement. At high NchN_{\text{ch}}, due to the sparsity of data, the NN gets less guidance from the data. Hence, the prediction follows the physics model strongly.

The comparison between PINN and purely data-driven models is performed using limited Zr+Zr data to evaluate how much data is sufficient for effective training. Corresponding evaluation matrices are tabulated in Table II. For training and testing, only 500 data points are used on an 80–20% train-test split. A batch size of 10 is employed and the value of λ\lambda is taken as 0.10.1. We tested the model with and without physics for 10 epochs, and the results clearly indicate that incorporating physical constraints provides a distinct advantage, especially when the model is trained with limited data. The loss curve for this is plotted in Fig. 4.

10 Epochs
PINN Normal NN
R2R^{2} 0.9789 0.9742
RMSE 31.19 34.68
MAE 20.30 23.75
Table 2: Performance metrics for 500 data points for PINN and normal NN over different epochs.

The model trained on Zr+Zr collisions is evaluated on Ru+Ru and Au+Au data to assess generalizability. We trained it with one million Zr data points with a physics weight of λ=0.1\lambda=0.1 over 100100 epochs. Testing datasets of 1000 events each are taken from Ru+Ru and Au+Au collisions. The PINN model predicted Ru’s NchN_{\text{ch}} with high accuracy as it is an isobar of Zr (as shown in Fig. 5). We further evaluated the model on Au data, which was completely absent during the training process, serving as a fully unseen test set. In Fig. 6, the predicted NchN_{\text{ch}} of Au is shown for three different conditions, namely (a) without physics constraints, (b) with PINN (λ=0.1\lambda=0.1), and (c) with PINN (λ=1.0\lambda=1.0). Without physics-based loss, the model underpredicts around NchN_{\text{ch}} = 400400. That shows that the purely data-driven model is poor at predicting unknown NchN_{\text{ch}}. Meanwhile, with physics imposed, the model predicts better for Au+Au data. The prediction accuracy improves for λ=1.0\lambda=1.0 compared to λ=0.1\lambda=0.1, as the former predicts higher NchN_{\text{ch}} values that are not present in the Zr’s training dataset. So, despite being trained on Zr+Zr data with a fixed hard-scattering fraction (xx), the PINN is successful in reproducing the charged-particle multiplicity trends in Au+Au collisions. This explains that the model has learned the Glauber-type scaling law and can generalize the learned physical relation across collision systems.

IV Summary

In summary, we started training our models with Zr+Zr collisions data generated by the HYDJET++ model. A limited number of data points are present in the low impact parameter region, due to Glauber statistics. So, the use of a physics-informed neural network (PINN) helps to compensate for this scarcity by incorporating physics constraints into the learning process. We have taken xx as a scalar trainable parameter, and the model predicts a value of xx=0.41. Then we have verified this value of xx by taking both xx and nppn_{pp} fixed. In our study, with the increasing value of λ\lambda, the normal neural network output and physics-based output show strong agreement. We also find that physical constraints can make the model learn better with fewer training data points. With PINN, the trained model can predict even on unseen test datasets, as we have shown using isobaric collisions of Ru+Ru data and unseen Au+Au data.

Thus, it is expected that our work will enlighten the role of PINN over conventional NN in heavy-ion collision problems. This approach will help the high-energy physics community to learn complex nonlinear relations between different observables in the future. PINN can specifically be effective at reducing the number of simulated events and the overall computation time in heavy-ion collision phenomenological analysis. PINN can also help predict the outcomes of Beam Energy Scan (BES) studies, provided that data at specific energies and reliable theoretical constraints are available.

V Data availability

Data and codes will be made available on request.

VI Acknowledgment

We are grateful to Dr. Stephen Baek for his valuable lectures on PINN and L. Pon Vijaya Kanthan for his guidance on coding. BKS sincerely acknowledges financial support from the Institute of Eminence (IoE), BHU Grant number 6031. AD acknowledges the financial support through the Institute fellowship from IIITDM Jabalpur. SRN acknowledges the financial support from the UGC Non-NET fellowship and IoE research incentive during the research work.

References

  • [1] J. C. Collins and M. J. Perry, Phys. Rev. Lett. 34 (1975), 1353 doi:10.1103/PhysRevLett.34.1353
  • [2] E. V. Shuryak, Phys. Lett. B 78 (1978), 150 doi:10.1016/0370-2693(78)90370-2
  • [3] M. Abdallah et al. [STAR], Phys. Rev. C 105 (2022) no.1, 014901 doi:10.1103/PhysRevC.105.014901 [arXiv:2109.00131 [nucl-ex]].
  • [4] J. Adam et al. [STAR], Phys. Rev. C 102 (2020) no.5, 054913 doi:10.1103/PhysRevC.102.054913 [arXiv:2006.00582 [nucl-ex]].
  • [5] K. Adcox et al. [PHENIX], Phys. Rev. Lett. 86 (2001), 3500-3505 doi:10.1103/PhysRevLett.86.3500 [arXiv:nucl-ex/0012008 [nucl-ex]].
  • [6] C. Adler et al. [STAR], Phys. Rev. Lett. 87 (2001), 112303 doi:10.1103/PhysRevLett.87.112303 [arXiv:nucl-ex/0106004 [nucl-ex]].
  • [7] D. Guest, K. Cranmer and D. Whiteson, Ann. Rev. Nucl. Part. Sci. 68 (2018), 161-181 doi:10.1146/annurev-nucl-101917-021019 [arXiv:1806.11484 [hep-ex]].
  • [8] J. Duarte and J. R. Vlimant, doi:10.1142/9789811234033_0012 [arXiv:2012.01249 [hep-ph]].
  • [9] M. Raissi, P. Perdikaris and G. E. Karniadakis, J. Comput. Phys. 378, 686–707 (2019). doi:10.1016/j.jcp.2018.10.045
  • [10] I. P. Lokhtin, L. V. Malinina, S. V. Petrushanko, A. M. Snigirev, I. Arsene and K. Tywoniuk, Comput. Phys. Commun. 180 (2009), 779-799 doi:10.1016/j.cpc.2008.11.015 [arXiv:0809.2708 [hep-ph]].
  • [11] I. P. Lokhtin, L. V. Malinina, S. V. Petrushanko and A. M. Snigirev, Phys. Atom. Nucl. 73 (2010), 2139–2147 doi:10.1134/S1063778810120203 [DOI]
  • [12] T. Sjostrand, S. Mrenna and P. Z. Skands, JHEP 05 (2006), 026 doi:10.1088/1126-6708/2006/05/026 [arXiv:hep-ph/0603175 [hep-ph]].
  • [13] N. S. Amelin, R. Lednicky, T. A. Pocheptsov, I. P. Lokhtin, L. V. Malinina, A. M. Snigirev, I. A. Karpenko and Y. M. Sinyukov, Phys. Rev. C 74 (2006), 064901 doi:10.1103/PhysRevC.74.064901 [arXiv:nucl-th/0608057 [nucl-th]].
  • [14] N. S. Amelin, R. Lednicky, I. P. Lokhtin, L. V. Malinina, A. M. Snigirev, I. A. Karpenko, Y. M. Sinyukov, I. Arsene and L. Bravina, Phys. Rev. C 77 (2008), 014903 doi:10.1103/PhysRevC.77.014903 [arXiv:0711.0835 [hep-ph]].
  • [15] B. Andersson, Camb. Monogr. Part. Phys. Nucl. Phys. Cosmol. 7 (1997), 1-471 Cambridge University Press, 1998, ISBN 978-1-009-40129-6, 978-1-009-40125-8, 978-1-009-40128-9, 978-0-521-01734-3, 978-0-521-42094-5, 978-0-511-88149-7 doi:10.1017/9781009401296
  • [16] B. Andersson, S. Mohanty and F. Soderberg, Eur. Phys. J. C 21 (2001), 631-647 doi:10.1007/s100520100757 [arXiv:hep-ph/0106185 [hep-ph]].
  • [17] C. Bierlich, G. Gustafson, L. Lönnblad and H. Shah, JHEP 10 (2018), 134 doi:10.1007/JHEP10(2018)134 [arXiv:1806.10820 [hep-ph]].
  • [18] C. Bierlich, S. Chakraborty, N. Desai, L. Gellersen, I. Helenius, P. Ilten, L. Lönnblad, S. Mrenna, S. Prestel, C. T. Preuss et al., SciPost Phys. Codeb. 2022 (2022), 8 doi:10.21468/SciPostPhysCodeb.8 [DOI]
  • [19] I. Helenius and C. O. Rasmussen, Eur. Phys. J. C 79 (2019) no. 5, 413 doi:10.1140/epjc/s10052-019-6914-1 [DOI]
  • [20] T. Sjostrand and M. van Zijl, Phys. Rev. D 36 (1987), 2019 doi:10.1103/PhysRevD.36.2019
  • [21] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Köpf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai and S. Chintala, arXiv:1912.01703 [cs.LG] [arXiv]
  • [22] S. G. K. Patro and K. K. Sahu, arXiv:1503.06462 [cs.OH] [arXiv]
  • [23] S. R. Dubey, S. K. Singh and B. B. Chaudhuri, [arXiv:2109.14545 [cs.LG]].
  • [24] H. j. Xu, H. Li, X. Wang, C. Shen and F. Wang, Phys. Lett. B 819 (2021), 136453 doi:10.1016/j.physletb.2021.136453 [arXiv:2103.05595 [nucl-th]].
  • [25] A. Botchkarev, doi:10.48550/arXiv.1809.03006