A robust and versatile deep learning model for prediction of the arterial input function in dynamic small animal [18F]FDG PET imaging

Christian Salomonsen*, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Luigi T. Luppino*, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Fredrik Aspheim, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Kristoffer K. Wickstrøm, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Elisabeth Wetzer, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Michael C. Kampffmeyer, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Rodrigo Berzaghi, PET Imaging Center, University Hospital of North Norway, Tromsø, Norway; Department of Clinical Medicine, UiT The Arctic University of Norway, Tromsø, Norway
Rune Sundset, PET Imaging Center, University Hospital of North Norway, Tromsø, Norway; Department of Clinical Medicine, UiT The Arctic University of Norway, Tromsø, Norway
Robert Jenssen, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway
Samuel Kuttner, Department of Physics and Technology, UiT The Arctic University of Norway, Tromsø, Norway; PET Imaging Center, University Hospital of North Norway, Tromsø, Norway
(October 23, 2025)
Abstract

Dynamic positron emission tomography (PET) and kinetic modeling are pivotal in advancing tracer development research in small animal studies. Accurate kinetic modeling requires precise input function estimation, traditionally achieved through arterial blood sampling. However, arterial cannulation in small animals, such as mice, involves intricate, time-consuming, and terminal procedures, precluding longitudinal studies. This work proposes a non-invasive, fully convolutional deep learning-based approach (FC-DLIF) to predict input functions directly from PET imaging data, which may eliminate the need for arterial blood sampling in the context of dynamic small-animal PET imaging.

The proposed FC-DLIF model consists of a spatial feature extractor that acts on the volumetric time frames of the dynamic PET imaging sequence. The extracted spatial features are then processed by a temporal feature extractor that predicts the arterial input function. The proposed approach is trained and evaluated with cross-validation on images and arterial blood curves from [18F]FDG data. The applicability of the model is further evaluated on imaging data and arterial blood curves collected with two additional radiotracers ([18F]FDOPA and [68Ga]PSMA), and on data truncated and shifted in time to simulate shorter and shifted PET scans.

The proposed FC-DLIF model reliably predicts the arterial input function with respect to mean squared error and correlation. Furthermore, the FC-DLIF model is able to predict the arterial input function even from truncated and shifted samples. The model fails to predict the AIF from samples collected using different radiotracers, as these are not represented in the training data.

Our deep learning-based input function offers a non-invasive and reliable alternative to arterial blood sampling, proving robust and flexible to temporal shifts and different scan durations.

* These authors contributed equally to this work.
Corresponding author: christian.salomonsen@uit.no

1 Introduction

Dynamic positron emission tomography (PET) plays a critical role in the imaging of small animals, enabling in vivo visualization of tracer uptake over time, and subsequent quantification of tracer transport or binding through kinetic modeling [1]. This is actively used as a tool for downstream tasks, for example in the development of new tracers, drugs, diagnostic procedures, and disease therapies [2, 3, 4]. However, a prerequisite for kinetic modeling is knowledge of the arterial tracer concentration over time, commonly referred to as the arterial input function (AIF) [5].

Arterial blood sampling is considered the gold standard for AIF acquisition, but in preclinical settings it presents serious limitations. The procedure is technically complex, time-consuming, and terminal in mice due to the invasiveness of carotid cannulation. Furthermore, only a limited amount of blood volume can be withdrawn without altering animal physiology [6, 7]. These limitations make high-throughput and longitudinal studies infeasible and may further raise ethical concerns regarding excessive animal usage [8].

This has driven significant interest in non-invasive AIF estimation strategies. Population-based input functions (PBIFs) average time-activity curves (TACs) from demographically similar subjects [9], but fail to capture individual variability and still require at least one blood sample to scale the population curve. Image-derived input functions (IDIFs) extract TACs from vascular regions such as the left ventricle, but are prone to inaccuracies stemming from partial-volume effects, motion artifacts, and poor signal-to-noise ratios [10, 6, 11, 12, 13]. Simultaneous estimation (SIME) approaches reduce the reliance on blood sampling by fitting both tissue kinetics and the AIF from multiple regions concurrently [14], but they require a predefined functional form for the AIF and still depend on at least one late-time blood sample to anchor the estimated curve to absolute tracer concentrations [15, 16, 17, 18].

To address these limitations, several data-driven alternatives have been proposed. In our previous research, we developed machine learning-based methods for AIF estimation using extracted TACs as input [19, 20]. While these methods eliminated the need for blood sampling, they relied on manually delineated regions of interest (ROIs) and post-imaging TAC extraction, limiting their scalability and integration into automated pipelines.

More recent studies have demonstrated that deep learning-based methods outperform more traditional machine learning-based methods while bypassing the need for handcrafted features and manual ROI delineation, by predicting the AIF directly from the dynamic PET image volume [21, 22, 23, 24]. However, most of these methods adopt hybrid architectures that combine 3D convolutional layers with fully connected or recurrent layers. These architectural choices introduce key limitations. For example, fully connected layers necessitate fixed-length inputs, forcing all dynamic scans to be padded, truncated, or interpolated to a uniform number of time frames prior to inference [21, 22]. This increases computational overhead and may introduce interpolation artifacts, especially when acquisition protocols differ in temporal resolution or duration.

Recurrent-style models, such as LSTM-based networks [21, 24], attempt to model temporal dependencies explicitly, but often learn features tied to specific time indices, reducing robustness to timing shifts or variation in tracer arrival across subjects. Furthermore, several methods include post hoc fitting or model parameter regression [24, 23], adding assumptions and complexity to the pipeline. These constraints can hinder generalization and reduce flexibility in practical settings.

In this study, we propose a novel approach—fully convolutional deep learning-based input function prediction (FC-DLIF)—that overcomes these limitations by using only convolutional operations over both spatial and temporal dimensions. The fully convolutional design allows FC-DLIF to process input sequences of arbitrary length, eliminating the need for dense or recurrent layers, and removing dependence on fixed input shapes. FC-DLIF takes reconstructed 4D dynamic PET data (t, x, y, z) and directly predicts an AIF output, with no need for manual ROI segmentation, TAC extraction, temporal resampling, or post hoc fitting. By treating time as a learnable axis and detecting temporal patterns such as peak onset, plateau, and washout tails, regardless of absolute frame positions, FC-DLIF is inherently robust to timing variability and flexible across different imaging protocols. The architecture provides a streamlined, end-to-end pipeline for non-invasive AIF estimation suitable for diverse preclinical study designs.

We evaluate FC-DLIF on a dynamic PET dataset of mice with paired arterial blood data, and show that our method generalizes across protocols with time-shifted inputs and different scan durations. Through accurate, non-terminal, and non-invasive AIF estimation, FC-DLIF has the potential to reduce animal use, facilitate longitudinal studies, and simplify the workflow for kinetic modeling, aligning with the 3Rs principles of animal research [8].

2 Materials and methods

Figure 1: The proposed architecture consists of two parts: a 3D ResNet acts as a spatial feature extractor (SFE), and a 1D convolutional network acts as a temporal feature extractor (TFE). The first takes as input the data volume of each time step and reduces it to one vector of 32 features. The extracted vectors of all time steps are then stacked along the time dimension, over which the second part of the model performs a series of convolutions until the feature dimension is reduced to 1. The final output of the model is a time series with the same length as the input data.

This section describes the data, the design of the proposed deep learning model, and its training procedure. Further details on data collection are provided in Appendix A.

2.1 Dataset

The training dataset consisted of 70 dynamic [18F]FDG PET scans of mice aged 9 to 24 weeks, with simultaneously measured AIFs, collected at UiT The Arctic University of Norway (UiT). The dataset included three different mouse strains: BALB/cJRj (N = 55), C57BL/6JRj (N = 8) and Balb/cAnNCrl (N = 7). Another 10 samples, collected with the tracers [18F]FDOPA (N = 6) and [68Ga]PSMA (N = 4), were not included during model training but were used to evaluate the model performance on new, unseen tracers. Following PET imaging, dynamic image volumes were reconstructed into 42 time frames (1 × 30 s, 24 × 5 s, 9 × 20 s, 8 × 300 s) of whole-body data with shape 42 × 96 × 48 × 48, and normalized into standardized uptake value (SUV) [25].
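As a small illustration of the framing protocol, the sketch below expands the frame groups into per-frame durations and derives frame start, end and mid-times. It assumes that the durations are in seconds and that the groups are acquired in the order listed; the variable names are illustrative and not taken from the original pipeline.

```python
from itertools import accumulate

# Framing protocol from the text: (number of frames, frame duration in seconds).
# Assumption: the groups are acquired in the order in which they are listed.
FRAMING = [(1, 30), (24, 5), (9, 20), (8, 300)]

# Expand the protocol into one duration per frame.
durations = [length for count, length in FRAMING for _ in range(count)]

# Frame end times are the running sum of the durations.
frame_ends = list(accumulate(durations))
frame_starts = [end - length for end, length in zip(frame_ends, durations)]
frame_mids = [start + length / 2 for start, length in zip(frame_starts, durations)]

n_frames = len(durations)
total_minutes = frame_ends[-1] / 60

print(n_frames, total_minutes)  # 42 45.5
```

Under these assumptions the protocol yields 42 frames covering 45.5 minutes, and the mid-times can serve as the time axis when plotting or integrating input functions.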

2.2 Model architecture

Fig. 1 shows the architecture of the proposed model. It consists of two parts, one spatial, and one temporal feature extractor, that serve different purposes. First, the data volume associated with each time step of the dynamic scans is passed through a 3D ResNet [26], called a spatial feature extractor (SFE), which extracts relevant spatial features while reducing only the spatial dimensions. From the input, each data volume goes through a series of residual blocks interleaved with max-pooling layers, until a convolutional layer with a 4 × 2 × 2 cuboid kernel, followed by an adaptive average-pooling layer, reduces said volume to a one-dimensional vector of 32 features. This acts as a compact representation for each time step. Effectively, each time frame is processed independently by the SFE, exposing the same convolutional filters to each part of the dynamic image. In the second part of the model, all the extracted vectors are stacked along the time dimension, over which the temporal feature extractor (TFE) captures temporal correlations between the time frames. The TFE uses 1D convolutions, which is motivated by recent research that has demonstrated superior performance and efficiency compared to alternative methods based on recurrent neural networks [27]. The output of the TFE is the final output of the model and represents the DLIF prediction of the AIF, given the data volume.
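The data flow described above can be sketched with NumPy to show why the output length follows the input length. This is not the trained network: the SFE is replaced by a hypothetical stand-in (average pooling plus a random projection to 32 features), and the TFE channel widths, kernel size and activations are assumptions; only the 32-feature bottleneck, the four 1D convolutional layers and the single output channel come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_features(frame, weights):
    """Stand-in SFE: average-pool one 3D frame to a 4x2x2 grid, then project to 32 features."""
    x, y, z = frame.shape
    coarse = frame.reshape(4, x // 4, 2, y // 2, 2, z // 2).mean(axis=(1, 3, 5))
    return np.tanh(weights @ coarse.ravel())  # shape (32,)

def conv1d_same(seq, kernel):
    """Zero-padded 1D convolution over time. seq: (T, C_in); kernel: (K, C_in, C_out), K odd."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(seq, ((pad, pad), (0, 0)))
    return np.stack([
        np.einsum("kc,kco->o", padded[t:t + k], kernel)
        for t in range(seq.shape[0])
    ])

def temporal_extractor(features, kernels):
    """Stand-in TFE: stacked 1D convolutions with ReLU in between, ending in one channel."""
    h = features
    for i, kernel in enumerate(kernels):
        h = conv1d_same(h, kernel)
        if i < len(kernels) - 1:
            h = np.maximum(h, 0.0)
    return h[:, 0]  # shape (T,)

def predict(dynamic_image, sfe_weights, tfe_kernels):
    """Apply the same spatial extractor to every frame, then convolve over time."""
    feats = np.stack([spatial_features(f, sfe_weights) for f in dynamic_image])  # (T, 32)
    return temporal_extractor(feats, tfe_kernels)

# Random, untrained weights; channel widths 32 -> 16 -> 8 -> 4 -> 1 are assumed.
sfe_weights = rng.normal(size=(32, 16))
channels = [32, 16, 8, 4, 1]
tfe_kernels = [0.1 * rng.normal(size=(3, c_in, c_out))
               for c_in, c_out in zip(channels[:-1], channels[1:])]

for n_frames in (42, 43, 32):
    image = rng.random((n_frames, 24, 12, 12))  # reduced spatial size to keep the demo light
    print(n_frames, predict(image, sfe_weights, tfe_kernels).shape)
```

Because no layer depends on the number of frames, the same weights accept 42, 43 or 32 frames and return one value per frame.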

2.3 Training procedure

The ADAM optimizer [28, 29] with standard settings was used to minimize the weighted mean squared error (wMSE) between the predicted DLIF and the ground truth. The error on each time step was weighted to account for the imbalance between three parts of the curve: the first 25 (peak), middle 9 (intermediate), and last 8 (tail) frames were associated with weights 0.4, 0.7 and 1, respectively. Training was performed with a learning rate of $1 \times 10^{-4}$ for 1000 epochs. 10-fold cross-validation allowed the evaluation of the model performance over the whole training dataset. For each fold, 10 runs were repeated for statistical rigor. Data augmentation with additive random Poisson noise injection was used during training, with the goal of exposing the model to different signal-to-noise ratios and accustoming it to lower image qualities. The degraded image was generated by sampling a scalar $p \sim \mathrm{Unif}(0,1)$; then, for a dynamic PET image with voxel intensities $I$, the corresponding image with noise, $I_{\mathrm{poisson}}$, is defined as:

\begin{equation}
\begin{aligned}
\Lambda &= I \cdot p, \\
I_{\mathrm{poisson}} &\equiv I + \mathrm{Pois}(\Lambda) - \Lambda.
\end{aligned}
\tag{1}
\end{equation}
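A minimal NumPy sketch of the noise injection in Eq. (1) is given below. The function name and the clipping of negative voxel values are assumptions added for illustration; the text only specifies the scalar $p$, the rate $\Lambda$ and the centred Poisson term.

```python
import numpy as np

def poisson_augment(image, rng):
    """Return image + Pois(lam) - lam, with lam = image * p and a scalar p ~ Unif(0, 1)."""
    p = rng.uniform(0.0, 1.0)
    # Assumption: negative voxels are clipped, since Poisson rates must be non-negative.
    lam = np.clip(image, 0.0, None) * p
    # Subtracting lam centres the noise: zero mean, variance equal to lam.
    noise = rng.poisson(lam) - lam
    return image + noise

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 5.0, size=(42, 8, 4, 4))  # toy dynamic image (t, x, y, z)
noisy = poisson_augment(image, rng)
print(noisy.shape, float(np.mean(noisy - image)))
```

Since the added term has zero mean, the augmentation changes the signal-to-noise ratio without biasing the expected voxel intensity; larger values of $p$ give noisier images.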
Figure 2: Quantitative comparison between the baseline and the proposed model over each sample: (a) boxplot of the MSE from the two models; (b) line plot of the MSE, with mean and 95% confidence interval, for different injected Poisson noise factors.
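The weighted loss described in the training procedure can be sketched as follows. The frame counts and weights are those stated in the text; averaging over frames, rather than dividing by the sum of the weights, is an assumption, as the normalization is not specified.

```python
import numpy as np

# Per-frame weights: 25 peak, 9 intermediate and 8 tail frames (42 in total).
FRAME_WEIGHTS = np.concatenate([
    np.full(25, 0.4),
    np.full(9, 0.7),
    np.full(8, 1.0),
])

def weighted_mse(predicted, measured, weights=FRAME_WEIGHTS):
    """Mean over frames of weight * squared error (normalization is an assumption)."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean(weights * (predicted - measured) ** 2))

# A constant error of 1 SUV in every frame gives the mean of the weights.
print(weighted_mse(np.ones(42), np.zeros(42)))
```

With this weighting, an error in one of the 8 tail frames costs 2.5 times as much as the same error in one of the 25 peak frames, which counteracts the larger number of frames in the early part of the curve.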

2.4 Evaluation metrics

The predicted DLIF curves were compared time frame by time frame with the respective measured AIF using a paired t-test ($\alpha = 0.05$), and orthogonal regression to account for measurement errors in both predicted and measured variables. Normality was assessed using quantile-quantile plots, and the Pearson correlation was reported to describe the spread of the predicted curves. In order to assess the efficacy of the proposed method for downstream tasks, such as kinetic modeling, the graphical Patlak plot [30, 31] was used to produce net influx rates $K_i$ for each voxel, given the reference and predicted input functions. Prior to kinetic modeling, both the AIF and DLIF curves were converted to plasma input functions following [32]. DLIF-based influx estimates were further compared to those obtained from the measured AIF using the same evaluation methods as for the predicted curves.
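For readers unfamiliar with the Patlak plot, the sketch below fits $K_i$ for a single tissue curve as the slope of $C_T(t)/C_p(t)$ against $\int_0^t C_p(\tau)\,d\tau / C_p(t)$ over late frames. The function names, the threshold `t_star`, the time units and the synthetic curves are assumptions made for illustration; the plasma conversion of [32] is not reproduced here.

```python
import numpy as np

def cumulative_trapezoid(times, values):
    """Running trapezoidal integral of values over times, starting at zero."""
    steps = 0.5 * np.diff(times) * (values[1:] + values[:-1])
    return np.concatenate([[0.0], np.cumsum(steps)])

def patlak_fit(times, plasma, tissue, t_star):
    """Return (K_i, intercept) from a linear fit of the Patlak plot for times >= t_star."""
    integral = cumulative_trapezoid(times, plasma)
    late = times >= t_star
    x = integral[late] / plasma[late]   # normalized plasma integral
    y = tissue[late] / plasma[late]     # tissue-to-plasma ratio
    ki, intercept = np.polyfit(x, y, 1)
    return ki, intercept

# Synthetic example: a tissue curve built to follow the Patlak model exactly.
times = np.linspace(0.0, 45.0, 91)                 # minutes (assumed)
plasma = 5.0 * np.exp(-0.2 * times) + 1.0          # toy plasma input function
tissue = 0.03 * cumulative_trapezoid(times, plasma) + 0.4 * plasma

ki, intercept = patlak_fit(times, plasma, tissue, t_star=10.0)
print(round(float(ki), 4), round(float(intercept), 4))
```

In a voxel-wise analysis the same fit would be repeated for every voxel curve, once with the measured and once with the predicted input function, so that the two sets of $K_i$ values can be compared.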

Comparisons were done against the DLIF model proposed by Kuttner et al. [22], which will be referred to as the baseline in the following sections. To ensure fair comparisons, the baseline was trained on the same data as the FC-DLIF, also using a 10-fold cross-validation setting.

Additionally, and unrelated to the above comparisons, the ability of the FC-DLIF network to operate on data with varying temporal dimensions was explored using two tests. First, a time-shifted sample was simulated by prepending the initial time frame, acquired before the radiotracer uptake phase had begun, to the dynamic PET image, in order to reveal any learned time-specific dependencies. For instance, the network may insist that a particular time frame $I_t$ always contains the peak of the input function during inference.

Second, the dynamic image was truncated by removing the first 4 time frames (first 40 s) and the last 6 time frames (last 30 min) of the signal. This simulates a change in the imaging protocol, where radiotracer infusion and imaging are started simultaneously, partially masking the input function onset, and where the scan ends earlier. This experiment was hypothesized to reveal the model's ability to work with limited scan durations while retaining performance on the task.
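Both manipulations are simple array operations on the time axis, as in the following sketch. The function names are illustrative, and the spatial size of the toy image is reduced to keep the example light.

```python
import numpy as np

def time_shift(dynamic_image):
    """Prepend a copy of the first frame, delaying the sequence by one frame."""
    return np.concatenate([dynamic_image[:1], dynamic_image], axis=0)

def truncate(dynamic_image, n_first=4, n_last=6):
    """Drop the first n_first and the last n_last frames."""
    return dynamic_image[n_first:dynamic_image.shape[0] - n_last]

image = np.zeros((42, 8, 4, 4))   # toy dynamic image (t, x, y, z)
shifted = time_shift(image)
truncated = truncate(image)
print(shifted.shape[0], truncated.shape[0])  # 43 32
```

The shifted and truncated sequences have 43 and 32 frames, respectively, so a model with a fixed-length temporal input could not process them without padding or resampling.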

To investigate the latent representations extracted from the SFE, the t-SNE [33] algorithm was used. The algorithm was applied to one model from a single run of a specific fold (fold 2, run 9) to visualize common traits of the condensed spatial representation.

3 Results

Figure 3: Scatterplot summarizing the results over the whole dataset for the arterial input function estimation for each model. The points show the pointwise comparison between true and predicted values for each mouse, represented by 42 time steps for both the baseline and proposed models. The axis labels indicate the SUV for either the predicted (SUV_DLIF) or the measured input function (SUV_AIF). For a perfect fit, all points would lie on the black dashed line $y = x$, with $a = 1$ and $b = 0$. The coefficient of determination $r^2 \leq 1$ quantifies how well the predictor fits the data. Similarly, Pearson's correlation coefficient $r \leq 1$ measures the linear correlation between the AIF and the DLIF.

First, this section summarizes the results of the proposed FC-DLIF model during cross-validation on our dataset when compared against the baseline, both in terms of errors on AIF estimation and, consequently, after tracer kinetic modeling. Then, some examples of AIF estimation show the performance and versatility of the proposed model with respect to the input data. Finally, the features extracted by the model's SFE are visualized and inspected using the t-SNE algorithm [33].

3.1 Quantitative results

Fig. 2 summarizes the distribution of the mean squared error (MSE) across all samples (Fig. 2(a)) and the MSE under different injected noise factors (Fig. 2(b)). FC-DLIF consistently achieved lower MSE values than the baseline, with a noticeably narrower spread and a more favorable distribution across percentiles. In terms of noise, FC-DLIF also demonstrated reduced variability, being less sensitive to noise in the input data.

In particular, close to 75% of the MSE values of the proposed model are lower than the median of the baseline model, and the proposed model generally produces predictions more aligned with the AIF, as seen in the 5th percentile (best 95% of predictions), which is slightly lower than for the baseline (Fig. 2(a)). These results are further consolidated when comparing the same metrics over temporal regions, as shown in Fig. 9, where the proposed method exhibits less bias in most segments.

In Fig. 3, the predicted AIF values from each model are compared against ground truth measurements on a point-wise basis. The proposed method achieves higher $R^2$ and Pearson correlation values, with a fitted regression line closer to the ideal $y = x$ line. While the baseline exhibits systematic deviations, particularly overestimating intermediate and tail values and underestimating peak amplitudes, FC-DLIF maintains a tighter spread around the regression line.
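As an illustration of the point-wise comparison, the sketch below computes an orthogonal regression line and the Pearson correlation from paired values. It assumes equal error variance in both variables (the total least squares form of orthogonal regression); the exact variant used for Fig. 3 is not stated in the text, and the function name is hypothetical.

```python
import math

def orthogonal_regression(x, y):
    """Return (slope, intercept, pearson_r), minimizing perpendicular distances to the line.

    Assumes equal error variance in x and y, and a non-zero covariance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r

# Toy check on exactly linear data, y = 2x + 1.
measured = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept, r = orthogonal_regression(measured, predicted)
print(slope, intercept, r)
```

Unlike ordinary least squares, this fit treats the measured and predicted values symmetrically, which is the motivation given in the methods for using orthogonal regression when both variables carry measurement error.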

Predictions from the baseline model generally overestimate small values (less than 4 g/ml) and underestimate larger values, which are associated with the peak of the input function curve. The same characteristics are not seen for the FC-DLIF method; instead, the spread is generally larger above values of 4 g/ml, which is also shown in Fig. 9 for the segments labeled as peak.

Downstream implications of AIF estimation accuracy are reflected in the kinetic modeling results shown in Fig. 4, where voxel-wise $K_i$ values derived from each model are compared to reference values obtained using the measured AIF and the Patlak model [34, 35]. FC-DLIF again yields higher agreement with the reference, confirming that improvements in input function estimation translate to better physiological parameter estimation.

Results on the unseen tracer data ([18F]FDOPA and [68Ga]PSMA) showed decreased performance for both models, as expected due to tracer-specific uptake characteristics. Additional scatterplots and analysis are provided in Appendix B (Figs. 10 and 11).

Figure 4: Scatterplot summarizing the results over the whole dataset for voxel-wise tracer kinetic modeling. Each point represents a voxel within a mouse, randomly sampled over 50 000 voxels. As in Fig. 3, the fitted lines should ideally overlap the dashed $y = x$ line.

3.2 Qualitative results

This section first evaluates individual predictions of the proposed FC-DLIF model. This is followed by an experiment that shifts the PET images in time, to simulate a delayed start of the tracer injection. The PET images are then truncated, which simulates simultaneous tracer injection and scan start.

Figure 5: Examples of input functions predicted with FC-DLIF and compared against the ground truth AIF. The insets zoom in on the first 3 minutes of the curves, to emphasize the tracer uptake peak. (a) Best sample; (b) median sample; (c) worst sample.

Fig. 5 shows three examples of input functions predicted by the proposed FC-DLIF model. The mean curve and standard deviation of the model’s predictions are included for the best (Fig. 5(a)), median (Fig. 5(b)), and worst (Fig. 5(c)) sample according to the MSE.

Fig. 6 displays the versatility of the proposed model owing to its fully convolutional design. In particular, the example on the left is the output of the model when the input data is time-shifted by prepending the first frame of the dynamic PET image to itself. The example on the right is the result obtained by truncating the input both at the beginning and at the end of the time series, removing several time steps.

Figure 6: Special cases of input functions predicted with FC-DLIF. The insets zoom in on the first 3 minutes of the curves. (a) Time-shifted sample: an additional empty 30 second frame was added at the beginning of the input time series; (b) truncated sample: the first 40 seconds and the last 30 minutes of the input time series were removed, corresponding to the first 4 and last 6 time steps.

Finally, t-SNE is used to visualize the similarities between the high-dimensional vectors extracted by the SFE for each time step of each mouse, on both the training and test datasets. Figs. 7 and 8 illustrate the results of the t-SNE dimensionality reduction grouped by injected volume and injected tracer, respectively. On the left-hand side, the colormap indicates the peak, mid, and tail time steps in shades of red, green, and blue, respectively. On the right-hand side, the colormap indicates a specific attribute: the amount of [18F]FDG injected in Fig. 7, and the tracer used in Fig. 8.

Fig. 7 shows a clear temporal progression in feature space. Early frames with low uptake cluster tightly on the left, followed by a gradual arc through peak and post-distribution frames. The pattern reflects the biological progression of tracer dynamics and suggests that the SFE captures consistent spatial characteristics across time.

Likewise, Fig. 8 shows the t-SNE projection for the different tracer groups that were kept aside during training, comprising mice injected with [18F]FDG, [68Ga]PSMA, and [18F]FDOPA. While the early and peak phases of all tracers appear to follow a similar initial trajectory, clear deviations form in the later frames. In particular, the steady-state representations of the [68Ga]PSMA and [18F]FDOPA groups form separate clusters, distinct from the [18F]FDG tail-phase points. This indicates that the model's spatial encoder captures consistent tracer-specific signatures not seen during training. Additionally, the [18F]FDOPA group also diverges around the late peak phase, forming a distinct cluster that may explain the model's reduced accuracy for this group. These observations align with the systematic under- and overestimation patterns seen in the early and late phases of predicted AIFs for these out-of-distribution tracers.

Figure 7: t-SNE visualization of the SFE-extracted vectors for different injection volumes. On the left-hand side, the colormap distinguishes peak (red), mid (green), and tail (blue) time steps. To the right, the colormap indicates the various mice groups.
Figure 8: t-SNE visualization of the SFE-extracted vectors for different injected tracers. To the left, the colormap distinguishes peak (red), mid (green), and tail (blue) time steps. To the right, the colormap indicates the various mice groups.

4 Discussion

This study introduces FC-DLIF, a robust and flexible deep learning model for the non-invasive estimation of AIF directly from dynamic PET imaging data. Compared to existing methods, FC-DLIF demonstrates superior accuracy, reduced bias, and improved robustness to noise and input variability, while maintaining flexibility with respect to temporal shifts and varying scan lengths. The fully convolutional design enables the model to generalize across different imaging protocols without the need for rigid preprocessing or fixed input dimensions.

4.1 Quantitative analysis

Fully convolutional design leads to consistent improvements

Across all quantitative evaluations, FC-DLIF demonstrated superior accuracy compared to the baseline model. While both methods use 3D convolutional backbones to extract spatial features, the fully convolutional design of FC-DLIF, which comprises separate spatial and temporal feature extractors, appears to contribute to more accurate AIF predictions. In particular, each time frame is processed independently through the same spatial filters in the SFE, potentially reducing time-point-specific biases during feature extraction. The subsequent TFE, built with 1D convolutions (Fig. 1), is then able to learn temporal dependencies over a time series of uniformly extracted spatial representations.

Although the precise source of the performance gain is difficult to isolate, the two-step network inherently enforces a form of implicit, structural regularization. The separation of concerns allows each subnetwork to specialize: spatial encoding in the SFE, and temporal modeling in the TFE. This modular architecture may help the network learn more generalized temporal patterns in tracer dynamics, leading to improved alignment with the true AIF. The benefits are reflected in the lower MSE distribution (Fig. 2(a)), the reduced sensitivity to a noisy input signal (Fig. 2(b)), and the tighter correspondence between predicted and true AIF values (Fig. 3).

Improved downstream kinetic modeling

The improvements in AIF prediction also translate into better physiological parameter estimation, as evidenced by the voxel-wise $K_i$ comparisons (Fig. 4), and region-wise comparisons (Fig. 13(e) and Fig. 14(e)). Accurate AIF estimation is critical for computing reliable kinetic parameters using models such as Patlak [34, 35] and compartment models [36], and FC-DLIF's enhanced performance further supports its potential to replace invasive blood sampling without compromising modeling accuracy.

Generalization challenges with unseen tracers

When applied to a dataset using tracers not seen during training ([18F]FDOPA and [68Ga]PSMA), both models exhibited reduced performance. This result is consistent with the expectation that different tracers produce distinct uptake patterns and temporal kinetics, making generalization difficult without exposure during training. Future work may explore strategies for improving generalization across tracers, such as domain adaptation [37, 38], tracer-aware conditioning, or semi-supervised strategies [39].

4.2 Qualitative analysis

Understanding prediction characteristics

To further gain insight into the model’s behavior, the AIF predictions from FC-DLIF were examined over a range of samples (Fig. 5). Even in the worst-performing example, the model captures the general shape of the AIF, particularly during the post-distribution phase. This aligns with the scatter-based results from Fig. 3, where the largest errors are observed around the peak uptake frames, while the steady-state phase shows more consistent alignment with ground truth. This discrepancy likely stems from the temporal characteristics of the input data itself: the uptake phase is captured in only a few frames, and exhibits high inter-frame variability, making it difficult to learn reliably. In contrast, the tail of the curve, corresponding to tracer equilibrium, spans more frames and features more consistent signal characteristics, yielding more stable predictions (as seen in Fig. 9).

Importantly, for downstream kinetic modeling tasks such as Patlak analysis [34, 35], the steady-state segment of the AIF is the only part used to derive macro-parameters like $K_i$. This suggests that even if the peak predictions are imperfect, FC-DLIF's accurate modeling of the linear phase still enables reliable kinetic quantification when using the Patlak model.

Robustness to shifts and truncation

One of the practical advantages of FC-DLIF is its architectural flexibility, which is demonstrated by the model’s robustness to time shifts and partial input sequences, a trait which has not been seen in other similar works. When the input time series was artificially shifted or truncated, the model continued to produce consistent AIF predictions (Fig. 6). This behavior stems directly from the model’s fully convolutional structure, which applies the same spatial and temporal filters regardless of absolute time index or input length, emphasizing its practicality in real-world settings, where scan durations and start times may vary between protocols or subjects.
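The underlying property can be checked on a toy example with a single zero-padded 1D convolution: shifting or truncating the input shifts or truncates the response, except for the outermost samples that touch the padding. The kernel and curve below are arbitrary illustrative values, not taken from the trained model.

```python
import numpy as np

def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that preserves the input length (odd kernel)."""
    pad = len(kernel) // 2
    padded = np.pad(signal, pad)
    return np.array([padded[t:t + len(kernel)] @ kernel for t in range(len(signal))])

kernel = np.array([0.25, 0.5, 0.25])

# Toy curve: flat baseline, sharp peak, slowly decaying tail (16 samples).
curve = np.concatenate([np.zeros(3), [1.0, 4.0, 2.5], np.linspace(2.0, 1.0, 10)])
shifted = np.concatenate([curve[:1], curve])   # prepend the first sample
truncated = curve[2:-3]                        # drop 2 leading and 3 trailing samples

out = conv1d_same(curve, kernel)
out_shifted = conv1d_same(shifted, kernel)
out_truncated = conv1d_same(truncated, kernel)

# Away from the padded borders, responses agree up to the index offset.
print(np.allclose(out_shifted[2:-1], out[1:-1]))
print(np.allclose(out_truncated[1:-1], out[3:-4]))
```

With several stacked layers the region influenced by the borders grows with the receptive field, so agreement near the start and end of a truncated sequence is not guaranteed by the architecture alone and has to be verified empirically, as done in Fig. 6.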

What the model learns from spatial features

To better understand the internal representations learned by FC-DLIF, the spatial features extracted by the SFE were visualized using t-SNE [33]. Across all mice and time points, the t-SNE projection reveals a coherent temporal progression: from early frames with low uptake, through peak uptake, to the post-distribution tail. This suggests that the SFE successfully encodes temporally meaningful spatial information, even though it processes each frame independently. Notably, these feature clusters were consistent across both training and test sets, with no clear separation, indicating good generalization without apparent overfitting. Furthermore, the internal feature space showed no meaningful correlation with injected volume, suggesting that the SFE learned to abstract away from trivial scaling differences in tracer volume, which is an encouraging sign of biological relevance and robustness (Fig. 7).

Out-of-distribution tracers reveal biological misalignment

Although the results on unseen tracers were reasonable and leave room for improvement, t-SNE projections of mice injected with these tracers ([68Ga]PSMA and [18F]FDOPA) revealed key deviations. While their early and peak-phase representations partially aligned with the [18F]FDG training set, their tail-phase features diverged and formed distinct clusters. This reflects fundamental differences in tracer kinetics and binding properties [40]. Correspondingly, FC-DLIF struggled to predict accurate AIFs for these tracers, typically underestimating peak intensity and overestimating post-distribution activity. This mismatch to some extent mirrors the temporal drift observed in the feature space, and suggests that the model's learned temporal filters in the TFE are not readily transferable across tracer types without retraining.

Spatial dimensions limitations

While the proposed method accepts input data of varying temporal dimensions, a pretrained model requires future studies to preprocess their data so that the spatial dimensions match those of the data it was trained on. Since transformations and interpolations over spatial dimensions are common and easier to visualize than temporal ones, this is not seen as a large hurdle.

Toward generalization through transfer learning

Despite these limitations, the structure of the learned feature space provides an opportunity for targeted adaptation. Since the SFE appears to produce meaningful spatial embeddings even for unseen tracers, it may be possible to freeze this component and retrain only the temporal extractor (TFE) on a small amount of new tracer data, to map the distinct clusters back onto the common feature manifold. This strategy aligns with principles from transfer learning and domain adaptation [37, 38], and could substantially reduce the data requirements for generalizing FC-DLIF to new tracers. Given that most variation across tracers lies in their temporal behavior, this modular approach, reusing spatial filters while tailoring the temporal decoder, offers a promising path forward.

Compact design for efficient deployment

A final yet key advantage of FC-DLIF is its compact size. The model processes full 4D PET volumes end to end, yet comprises only 90 124 trainable parameters (352 KB). This efficiency arises from its internal structure, where each 3D time frame is processed independently using shared spatial filters in the SFE, followed by a lightweight 1D convolutional TFE consisting of 4 layers. For comparison, the input images are approximately 71 MB (64-bit floating point) each, making the model over 200 times smaller than a single input scan. Despite its modest footprint, which is comparable to that of the 92 104-parameter (360 KB) baseline model, FC-DLIF achieves superior predictive performance, making it an attractive candidate for real-time or embedded applications.
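The quoted footprints can be checked with simple arithmetic. The sketch below assumes 32-bit floating-point parameters and binary units (1 KB = 1024 bytes, 1 MB = 1024 KB); neither assumption is stated in the text, and the helper name `model_size_kb` is purely illustrative.

```python
def model_size_kb(n_params: int, bytes_per_param: int = 4) -> float:
    """Storage size in KB (1024 bytes) for n_params parameters.

    Assumption: parameters are stored as 32-bit floats (4 bytes each).
    """
    return n_params * bytes_per_param / 1024


fc_dlif_kb = model_size_kb(90_124)    # proposed FC-DLIF
baseline_kb = model_size_kb(92_104)   # baseline DLIF

scan_mb = 71.0                        # approximate size of one input scan
size_ratio = scan_mb * 1024 / fc_dlif_kb  # scan size relative to model size

print(f"FC-DLIF:  {fc_dlif_kb:.0f} KB")
print(f"Baseline: {baseline_kb:.0f} KB")
print(f"One scan is about {size_ratio:.0f}x larger than the model")
```

Under these assumptions the parameter counts reproduce the 352 KB and 360 KB figures, and the ratio to a 71 MB scan is slightly above 200.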

5 Conclusion

This study introduces FC-DLIF, a non-invasive, deep learning-based input function predictor that is flexible and invariant to time-shifting and truncation of the sequence length. Altogether, FC-DLIF represents a promising tool for preclinical PET studies, with the potential to remove the need for invasive blood sampling and to enable more efficient and scalable experimental designs.

Acknowledgments

This work is funded through Visual Intelligence, a Center for Research-based Innovation funded by the Research Council of Norway, grant nos. 309439 and 303514. It was further supported by Tromsø Research Foundation (19_PET-NUKL) through the 180N Norwegian Nuclear Medicine Consortium and UiT Innovation funds (UiT Talent).

References

  • Gunn et al. [2001] R N Gunn, S R Gunn, and V J Cunningham. Positron emission tomography compartmental models. Journal of cerebral blood flow and metabolism : official journal of the International Society of Cerebral Blood Flow and Metabolism, 21:635–52, 6 2001. ISSN 0271-678X. doi: 10.1097/00004647-200106000-00002.
  • Hicks et al. [2006] Rodney J Hicks, Donna Dorow, and Peter Roselt. Pet tracer development—a tale of mice and men. Cancer Imaging, 6(Spec No A):S102, 2006. doi: 10.1102/1470-7330.2006.9098.
  • Yao et al. [2012] Rutao Yao, Roger Lecomte, and Elpida S Crawford. Small-animal pet: what is it, and why do we need it? Journal of nuclear medicine technology, 40(3):157–165, 2012. doi: 10.2967/jnmt.111.098632.
  • Cunha et al. [2014] Lídia Cunha, Ildiko Horvath, Sara Ferreira, Joana Lemos, Pedro Costa, Domingos Vieira, Dániel S. Veres, Krisztián Szigeti, Teresa Summavielle, Domokos Máthé, and Luís F. Metello. Preclinical imaging: an essential ally in modern biosciences. Molecular diagnosis & therapy, 18(2):153–173, 2014. doi: 10.1007/s40291-013-0062-3.
  • Alf et al. [2013a] Malte F. Alf, Matthias T. Wyss, Alfred Buck, Bruno Weber, Roger Schibli, and Stefanie D. Krämer. Quantification of brain glucose metabolism by 18f-fdg pet with real-time arterial and image-derived input function in mice. Journal of nuclear medicine : official publication, Society of Nuclear Medicine, 54(1):132–138, 1 2013a. doi: 10.2967/jnumed.112.107474.
  • Laforest et al. [2005] Richard Laforest, Terry L. Sharp, John A. Engelbach, Nicole M. Fettig, Pilar Herrero, Joonyoung Kim, Jason S. Lewis, Douglas J. Rowland, Yuan-Chuan Tai, and Michael J. Welch. Measurement of input functions in rodents: challenges and solutions. Nuclear Medicine and Biology, 32:679–685, 10 2005. ISSN 09698051. doi: 10.1016/j.nucmedbio.2005.06.012.
  • Convert et al. [2022] Laurence Convert, Otman Sarrhini, Maxime Paillé, Nicolas Salem, Paul G. Charette, and Roger Lecomte. The ultra high sensitivity blood counter: A compact, mri-compatible, radioactivity counter for pharmacokinetic studies in µL volumes. Biomedical Physics and Engineering Express, 8, 5 2022. ISSN 20571976. doi: 10.1088/2057-1976/AC4C29.
  • Russell et al. [1959] William Moy Stratton Russell, Rex Leonard Burch, Charles Westley Hume, et al. The principles of humane experimental technique, volume 238. Methuen London, 1959.
  • Takikawa et al. [1993] Shugo Takikawa, V Dhawan, P Spetsieris, W Robeson, T Chaly, R Dahl, D Margouleff, and D Eidelberg. Noninvasive quantitative fluorodeoxyglucose pet studies with an estimated input function derived from a population-based arterial blood curve. Radiology, 188(1):131–136, 1993. doi: 10.1148/radiology.188.1.8511286.
  • Zanotti-Fregonara et al. [2011] Paolo Zanotti-Fregonara, Kewei Chen, Jeih-San Liow, Masahiro Fujita, and Robert B Innis. Image-derived input function for brain pet studies: Many challenges and few opportunities. Journal of Cerebral Blood Flow & Metabolism, 31(10):1986–1998, 2011. doi: 10.1038/jcbfm.2011.107.
  • Frouin et al. [2002] Vincent Frouin, Claude Comtat, Anthonin Reilhac, and Marie-Claude Grégoire. Correction of partial-volume effect for pet striatal imaging: fast implementation and study of robustness. Journal of Nuclear Medicine, 43(12):1715–1726, 2002.
  • Kim et al. [2013] Euitae Kim, Miho Shidahara, Charalampos Tsoumpas, Colm J McGinnity, Jun Soo Kwon, Oliver D Howes, and Federico E Turkheimer. Partial volume correction using structural–functional synergistic resolution recovery: comparison with geometric transfer matrix method. Journal of Cerebral Blood Flow & Metabolism, 33(6):914–920, 2013. doi: 10.1038/jcbfm.2013.2.
  • Fang and Muzic [2008] Yu-Hua Dean Fang and Raymond F Muzic. Spillover and partial-volume correction for image-derived input functions for small-animal 18f-fdg pet studies. Journal of Nuclear Medicine, 49(4):606–614, 2008. doi: 10.2967/jnumed.107.047613.
  • Van Der Weijden et al. [2023] Chris WJ Van Der Weijden et al. Non-invasive kinetic modelling approaches for quantitative analysis of brain pet studies. European Journal of Nuclear Medicine and Molecular Imaging, 50(6):1636–1650, 2023. doi: 10.1007/s00259-022-06057-4.
  • Bartlett et al. [2019] Elizabeth A Bartlett, Mala Ananth, Samantha Rossano, Mengru Zhang, Jie Yang, Shu-fei Lin, Nabeel Nabulsi, Yiyun Huang, Francesca Zanderigo, Ramin V Parsey, et al. Quantification of positron emission tomography data using simultaneous estimation of the input function: validation with venous blood and replication of clinical studies. Molecular imaging and biology, 21:926–934, 2019. doi: 10.1007/s11307-018-1300-1.
  • Roccia et al. [2019] Elisa Roccia, Arthur Mikhno, R. Todd Ogden, J. John Mann, Andrew F. Laine, Elsa D. Angelini, and Francesca Zanderigo. Quantifying brain [18f]fdg uptake noninvasively by combining medical health records and dynamic pet imaging data. IEEE Journal of Biomedical and Health Informatics, 23(6):2576–2582, 2019. doi: 10.1109/JBHI.2018.2890459.
  • Feng et al. [1997] Dagan Feng, Koon-Pong Wong, Chi-Ming Wu, and Wan-Chi Siu. A technique for extracting physiological parameters and the required input function simultaneously from pet image measurements: theory and simulation study. IEEE Transactions on Information Technology in Biomedicine, 1(4):243–254, 1997. doi: 10.1109/4233.681168.
  • Wong et al. [2001] Koon-Pong Wong, D. Feng, S.R. Meikle, and M.J. Fulham. Simultaneous estimation of physiological parameters and the input function - in vivo pet data. IEEE Transactions on Information Technology in Biomedicine, 5(1):67–76, 2001. doi: 10.1109/4233.908397.
  • Kuttner et al. [2020] Samuel Kuttner, Kristoffer Knutsen Wickstrøm, Gustav Kalda, S. Esmaeil Dorraji, Montserrat Martin-Armas, Ana Oteiza, Robert Jenssen, Kristin Fenton, Rune Sundset, and Jan Axelsson. Machine learning derived input-function in a dynamic 18 F-FDG PET study of mice. Biomed. Phys. Eng. Express, 6(1):015020, 2020. ISSN 2057-1976. doi: 10.1088/2057-1976/ab6496.
  • Kuttner et al. [2021] Samuel Kuttner, Kristoffer Knutsen Wickstrøm, Mark Lubberink, Joachim Burman, Rune Sundset, Robert Jenssen, and Lieuwe Appel. Cerebral blood flow measurements with 15 o-water pet using a non-invasive machine-learning-derived arterial input function. Journal of Cerebral Blood Flow & Metabolism, 41(9):2229–2241, 2021. doi: 10.1177/0271678X21991393.
  • Wang et al. [2024] Zhenguo Wang, Yaping Wu, Zeheng Xia, Xinyi Chen, Xiaochen Li, Yan Bai, Yun Zhou, Dong Liang, Hairong Zheng, Yongfeng Yang, Shanshan Wang, Meiyun Wang, and Tao Sun. Non-invasive quantification of the brain [18f]fdg-pet using inferred blood input function learned from total-body data with physical constraint. IEEE Transactions on Medical Imaging, 43(7):2563–2573, 2024. doi: 10.1109/TMI.2024.3368431.
  • Kuttner et al. [2024] Samuel Kuttner, Luigi T. Luppino, Laurence Convert, Otman Sarrhini, Roger Lecomte, Michael C. Kampffmeyer, Rune Sundset, and Robert Jenssen. Deep-learning-derived input function in dynamic [18f]fdg pet imaging of mice. Frontiers in Nuclear Medicine, 4, 2024. ISSN 2673-8880. doi: 10.3389/fnume.2024.1372379.
  • Ferrante et al. [2024] Matteo Ferrante, Marianna Inglese, Ludovica Brusaferri, Alexander C. Whitehead, Lucia Maccioni, Federico E. Turkheimer, Maria A. Nettis, Valeria Mondelli, Oliver Howes, Marco L. Loggia, Mattia Veronese, and Nicola Toschi. Physically informed deep neural networks for metabolite-corrected plasma input function estimation in dynamic pet imaging. Computer Methods and Programs in Biomedicine, 256:108375, 2024. ISSN 0169-2607. doi: 10.1016/j.cmpb.2024.108375.
  • Varnyú and Szirmay-Kalos [2021] Dóra Varnyú and László Szirmay-Kalos. Blood input function estimation in positron emission tomography with deep learning. In 2021 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), pages 1–7, 2021. doi: 10.1109/NSS/MIC44867.2021.9875543.
  • Keyes [1995] J W Keyes. SUV: standard uptake or silly useless value? J. Nucl. Med., 36(10):1836–1839, 1995. ISSN 0161-5505.
  • He et al. [2016] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 6 2016. URL doi.org/10.1109/CVPR.2016.90.
  • Wang et al. [2017] Zhiguang Wang, Weizhong Yan, and Tim Oates. Time series classification from scratch with deep neural networks: A strong baseline. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 1578–1585, 2017. URL doi.org/10.1109/IJCNN.2017.7966039.
  • Kingma and Ba [2014] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2014.
  • Reddi et al. [2018] Sashank J. Reddi, Satyen Kale, and Sanjiv Kumar. On the convergence of ADAM and beyond. In Proc. Int. Conf. Learn. Represent. (ICLR), 2018. URL doi.org/10.48550/arXiv.1904.09237.
  • Patlak et al. [1983a] Clifford S. Patlak, Ronald G. Blasberg, and Joseph D. Fenstermacher. Graphical evaluation of blood-to-brain transfer constants from multiple-time uptake data. Journal of Cerebral Blood Flow & Metabolism, 3(1):1–7, 1983a. doi: 10.1038/jcbfm.1983.1.
  • Patlak and Blasberg [1985a] Clifford S. Patlak and Ronald G. Blasberg. Graphical evaluation of blood-to-brain transfer constants from multiple-time uptake data. generalizations. Journal of Cerebral Blood Flow & Metabolism, 5(4):584–590, 1985a. doi: 10.1038/jcbfm.1985.87.
  • Wu et al. [2007] Hsiao-Ming Wu, Guodong Sui, Cheng-Chung Lee, Mayumi L. Prins, Waldemar Ladno, Hong-Dun Lin, Amy S. Yu, Michael E. Phelps, and Sung-Cheng Huang. In vivo quantitation of glucose metabolism in mice using small-animal pet and a microfluidic device. Journal of Nuclear Medicine, 48(5):837–845, 2007. ISSN 0161-5505. doi: 10.2967/jnumed.106.038182.
  • Van der Maaten and Hinton [2008] Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(86):2579–2605, 2008. URL http://jmlr.org/papers/v9/vandermaaten08a.html.
  • Patlak et al. [1983b] Clifford S Patlak, Ronald G Blasberg, and Joseph D Fenstermacher. Graphical evaluation of blood-to-brain transfer constants from multiple-time uptake data. Journal of Cerebral Blood Flow and Metabolism, 3:1–7, 1983b. doi: 10.1038/jcbfm.1983.1.
  • Patlak and Blasberg [1985b] C S Patlak and R G Blasberg. Graphical evaluation of blood-to-brain transfer constants from multiple-time uptake data. generalizations. Journal of cerebral blood flow and metabolism : official journal of the International Society of Cerebral Blood Flow and Metabolism, 5:584–90, 12 1985b. ISSN 0271-678X. doi: 10.1038/jcbfm.1985.87.
  • Sokoloff et al. [1977] L. Sokoloff, M. Reivich, C. Kennedy, M. H. Des Rosiers, C. S. Patlak, K. D. Pettigrew, O. Sakurada, and M. Shinohara. The [14c]deoxyglucose method for the measurement of local cerebral glucose utilization: Theory, procedure, and normal values in the conscious and anesthetized albino rat. Journal of Neurochemistry, 28(5):897–916, 1977. doi: 10.1111/j.1471-4159.1977.tb10649.x.
  • Zhuang et al. [2021] Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, and Qing He. A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1):43–76, 2021. doi: 10.1109/JPROC.2020.3004555.
  • Farahani et al. [2021] Abolfazl Farahani, Sahar Voghoei, Khaled Rasheed, and Hamid R Arabnia. A brief review of domain adaptation. In Advances in Data Science and Information Engineering, pages 877–894. Springer International Publishing, 2021. ISBN 978-3-030-71704-9. doi: 10.1007/978-3-030-71704-9_65.
  • Chapelle et al. [2009] O. Chapelle, B. Scholkopf, and A. Zien, Eds. Semi-supervised learning (chapelle, o. et al., eds.; 2006) [book reviews]. IEEE Transactions on Neural Networks, 20(3):542–542, 2009. doi: 10.1109/TNN.2009.2015974.
  • Duclos et al. [2021] Valentin Duclos, Alex Iep, Léa Gomez, Lucas Goldfarb, and Florent L Besson. Pet molecular imaging: A holistic review of current practice and emerging perspectives for diagnosis, therapeutic evaluation and prognosis in clinical oncology. International Journal of Molecular Sciences, 22(8), 2021. ISSN 1422-0067. doi: 10.3390/ijms22084159.
  • Hubert and Van der Veeken [2008] Mia Hubert and Stephan Van der Veeken. Outlier detection for skewed data. J. Chemom., 22(3-4):235–246, 3 2008. ISSN 08869383. doi: 10.1002/cem.1123.
  • Kreissl et al. [2011] Michael C Kreissl, David B Stout, Koon-Pong Wong, Hsiao-Ming Wu, Evren Caglayan, Waldemar Ladno, Xiaoli Zhang, John O Prior, Christoph Reiners, Sung-Cheng Huang, et al. Influence of dietary state and insulin on myocardial, skeletal muscle and brain [18f]-fluorodeoxyglucose kinetics in mice. EJNMMI research, 1(1):8, 2011. doi: 10.1186/2191-219X-1-8.
  • Wong et al. [2011] Koon-Pong Wong, Wei Sha, Xiaoli Zhang, and Sung-Cheng Huang. Effects of administration route, dietary condition, and blood glucose level on kinetics and uptake of 18f-fdg in mice. Journal of Nuclear Medicine, 52(5):800–807, 2011. doi: 10.2967/jnumed.110.085092.
  • Alf et al. [2013b] Malte F Alf, Marianne I Martić-Kehl, Roger Schibli, and Stefanie D Krämer. Fdg kinetic modeling in small rodent brain pet: optimization of data acquisition and analysis. EJNMMI research, 3(1):61, 2013b. doi: 10.1186/2191-219X-3-61.

Appendix A Data acquisition and processing

A.1 Animal experiments

The imaging data of mice used to train and evaluate the FC-DLIF model were acquired at UiT The Arctic University of Norway (UiT). All animal experiments were approved by the Norwegian Food Safety Authority; FOTS id 29689. In total, 80 healthy female mice from three different strains were included in this study: BALB/cJRj (N = 55), C57BL/6JRj (N = 8) and Balb/cAnNCrl (N = 7), where BALB/cJRj made up the main group of 70 samples. The remaining 10 were set aside for evaluating the effects of the tracers $\mathrm{[^{18}F]FDOPA}$ (N = 6) and $\mathrm{[^{68}Ga]PSMA}$ (N = 4). The mice were 7 to 8 weeks of age upon arrival at the UiT animal facility, and were fed a standard rodent diet ad libitum.

A.2 Arterial and venous cannulation during image acquisition

At the time of PET/CT imaging, the mouse age was 13.0(9) weeks, with a corresponding weight of 22.5(5) g. The study used three different radiotracers: $\mathrm{[^{18}F]FDG}$, $\mathrm{[^{18}F]FDOPA}$, and $\mathrm{[^{68}Ga]PSMA}$. The mice receiving $\mathrm{[^{18}F]FDG}$ (70 animals) were fasted for 3.4(2) hours prior to injection, whereas the remaining animals, given $\mathrm{[^{18}F]FDOPA}$ and $\mathrm{[^{68}Ga]PSMA}$, were not fasted. The mice were anesthetized prior to the scan, weighed, and placed on a heated plate (38 °C) while receiving oxygen through a mask. A venous catheter was inserted into the tail vein of each mouse for radiotracer injection. The blood glucose measured during venous cannulation was 6.4(3) mmol L$^{-1}$. An incision in the neck enabled surgical cannulation of the carotid artery, allowing blood to be routed through a radiation detector at a withdrawal rate of 105.6(28) µL min$^{-1}$. This setup facilitated concurrent measurements of whole blood activity during the PET scan with a temporal resolution of 1 s. To enable continuous arterial line measurements, an arterial–venous shunt was established, forming a closed loop between arterial sampling and venous reinjection. Blood flowed sequentially through the arterial sampling line, the radiation detector, a peristaltic pump, and a Y-connector (which enabled intravenous radiotracer injection) before being reinfused via the venous injector, thereby preventing excessive blood loss.

The PET/CT imaging was performed using a 45.5 min list-mode scan on a Triumph™ LabPET-8™ small animal PET/CT scanner (TriFoil Imaging Inc., Chatsworth, CA, USA) while a sensor monitored the respiration rate. The mice were injected with 16.2 ± 0.7 MBq using an automated injection pump, started 30 s after scanning was initiated. CT imaging was performed after PET scanning to correct for attenuation and scatter. While still under deep anesthesia after scanning, the mice were euthanized using cervical dislocation. Scanner sensitivity was monitored through daily phantom calibrations.

A.3 Calibration and AIF processing

To allow for delay correction of the manual blood sample measurements in the arteriovenous shunt, the time delay between the radiation detector and the Y-connector was measured during the first pass of arterial blood to be 25.1(7) s. The continuous line radiation detector was calibrated with three manual blood samples taken in a late stage or post-scan by measuring blood dripped from the arterial catheter over 30 s. This also enabled measurement of the actual arterial withdrawal rate during each mouse scan.

A calibration factor for the continuous arterial line measurements was derived as the ratio of the average signal from the continuous line measurements and the manual blood samples collected during the same 30 s blood sampling at each time point, corrected for delay. Calibration factors outside three scaled median absolute deviations from the median factor were considered outliers and discarded [41]. An overall calibration factor was determined as the average factor from the included blood sample factors. The arterial input function (AIF) for each mouse was obtained by scaling the continuous line measurement data by this average calibration factor.
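The outlier rule and averaging step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scaling constant 1.4826 (which makes the MAD consistent with the standard deviation for normally distributed data) is an assumption, the function name is hypothetical, and the example factors are made up.

```python
from statistics import mean, median

# Assumption: the "scaled" MAD uses the usual normal-consistency constant.
MAD_SCALE = 1.4826


def overall_calibration_factor(factors, n_mads=3.0):
    """Average the per-sample calibration factors after discarding outliers.

    A factor is treated as an outlier if it lies more than `n_mads` scaled
    median absolute deviations away from the median of all factors.
    Returns the overall factor and the list of factors that were kept.
    """
    med = median(factors)
    scaled_mad = MAD_SCALE * median([abs(f - med) for f in factors])
    kept = [f for f in factors if abs(f - med) <= n_mads * scaled_mad]
    return mean(kept), kept


# Illustrative (invented) factors from three manual blood samples.
factor, kept = overall_calibration_factor([1.02, 0.98, 1.60])
print(kept, factor)
```

In this invented example the third factor is rejected and the overall factor is the mean of the remaining two. How the factor is applied to the continuous line data (multiplication or division) depends on the orientation of the ratio and is not shown here.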

A.4 Image reconstruction and processing

The PET images were reconstructed into 42 time frames (1 × 30 s, 24 × 5 s, 9 × 20 s, 8 × 300 s) using a three-dimensional maximum-likelihood estimator algorithm with 50 iterations. Corrections for detector efficiency, radioactive decay, random coincidences, dead time, attenuation, and scatter were applied. Each time frame had an image matrix size of 128 × 92 × 92 voxels. The voxels were converted from units of counts per second into units of MBq/mL using the average counts inside a 14 mL homogeneous image region of a daily phantom scan. Subsequently, the voxels were normalized into standardized uptake value (SUV) [g/mL] [25].
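As a consistency check on the framing, and to make the SUV normalization concrete, the sketch below expands the frame schedule into frame edges and applies the common body-weight SUV definition (activity concentration divided by injected activity per gram of body weight). The function names are illustrative, the SUV formula is the standard textbook form rather than a confirmed detail of this pipeline, and the example voxel value is invented.

```python
# (number of frames, frame duration in seconds), as listed in the text.
FRAME_SCHEDULE = [(1, 30), (24, 5), (9, 20), (8, 300)]


def frame_edges(schedule):
    """Return frame start/end times in seconds, beginning at scan start (0 s)."""
    edges = [0]
    for count, duration in schedule:
        for _ in range(count):
            edges.append(edges[-1] + duration)
    return edges


def suv(activity_mbq_per_ml, injected_mbq, weight_g):
    """Body-weight SUV in g/mL (assumed standard definition).

    Decay correction is assumed to have been applied during reconstruction.
    """
    return activity_mbq_per_ml * weight_g / injected_mbq


edges = frame_edges(FRAME_SCHEDULE)
n_frames = len(edges) - 1
total_minutes = edges[-1] / 60

print(n_frames, "frames,", total_minutes, "min")
# Invented voxel value, with the mean injected activity and weight from the text.
print("SUV:", suv(0.72, injected_mbq=16.2, weight_g=22.5))
```

The schedule yields 42 frames spanning 45.5 minutes, matching the stated frame count and scan duration.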

Figure 9: Prediction error distribution during different phases of the imaging for the baseline DLIF [22] and the proposed FC-DLIF.

Appendix B Additional results

B.1 Detailed results for individual imaging phases

Figure 9 visualizes the distribution of error between the predicted input function from the baseline model [22] and FC-DLIF compared with the AIF, as a function of time since the start of tracer injection. A predominantly negative distribution in the plot would indicate the model underestimating the AIF.

B.2 Results on unseen tracers dataset

Figure 10 compares the baseline [22] and the proposed FC-DLIF models' SUV predictions with the measured AIF. The lines are computed as the orthogonal regression line fitted to the point distributions of each respective color. A perfect predictor would have SUV points distributed along the black, dashed $y=x$ line.
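For readers unfamiliar with orthogonal regression, the following sketch fits a line by minimizing perpendicular distances, using the closed-form solution for equal error variance in both variables. Whether the figures use exactly this variant is an assumption, and the data points are synthetic.

```python
from math import sqrt


def orthogonal_regression(x, y):
    """Fit y = slope * x + intercept by minimizing perpendicular distances.

    Uses the closed-form total least squares solution, which assumes equal
    error variance in x and y.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the slope is undefined")
    slope = (syy - sxx + sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


# Synthetic example: points that lie exactly on y = 2x + 0.5.
slope, intercept = orthogonal_regression([0, 1, 2, 3], [0.5, 2.5, 4.5, 6.5])
print(slope, intercept)
```

Unlike ordinary least squares, this fit treats the measured and predicted values symmetrically, which is appropriate when both axes carry measurement error.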

Figure 11 uses the predicted input function from either model in a Patlak graphical analysis [34, 35] to derive the kinetic parameter $K_i$.
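A minimal sketch of Patlak graphical analysis is given below: the tissue-to-input ratio is plotted against the normalized running integral of the input function, and $K_i$ is the slope of a straight-line fit after an equilibration time. The function names, the ordinary least squares fit, the choice of `t_star`, and the synthetic curves are all assumptions made for illustration and are not taken from the study.

```python
from math import exp


def cumulative_trapezoid(t, y):
    """Running trapezoidal integral of y over t, starting at 0 for t[0]."""
    total, out = 0.0, [0.0]
    for i in range(1, len(t)):
        total += 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1])
        out.append(total)
    return out


def patlak_fit(t, c_input, c_tissue, t_star):
    """Return (Ki, intercept) from a linear fit of the Patlak plot for t >= t_star."""
    integral = cumulative_trapezoid(t, c_input)
    pts = [(ic / cp, ct / cp)
           for ti, cp, ct, ic in zip(t, c_input, c_tissue, integral)
           if ti >= t_star and cp > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points after t_star")
    n = len(pts)
    mx = sum(px for px, _ in pts) / n
    my = sum(py for _, py in pts) / n
    sxx = sum((px - mx) ** 2 for px, _ in pts)
    sxy = sum((px - mx) * (py - my) for px, py in pts)
    ki = sxy / sxx
    return ki, my - ki * mx


# Synthetic curves (minutes): the tissue curve is built to follow the Patlak model.
t = [float(m) for m in range(46)]
c_input = [10.0 * exp(-0.3 * m) + 1.0 for m in t]
true_ki, true_v0 = 0.05, 0.3
c_tissue = [true_ki * ic + true_v0 * cp
            for ic, cp in zip(cumulative_trapezoid(t, c_input), c_input)]

ki, v0 = patlak_fit(t, c_input, c_tissue, t_star=15.0)
print(ki, v0)
```

Because the synthetic tissue curve is generated from the same model, the fit recovers the chosen slope and intercept; with real data, the result depends on the input function used, which is why errors in the predicted input function propagate to $K_i$.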

Figure 10: Scatterplots summarizing the results over the unseen tracers dataset for the arterial input function estimation. The color represents SUV predictions for each model. Panels: (a) FDOPA; (b) PSMA.
Figure 11: Scatterplots summarizing the results over the unseen tracers dataset for voxel-wise tracer kinetic modeling. Colors indicate which model provided each coefficient for a random subset of 50 000 voxels. Panels: (a) FDOPA; (b) PSMA.

Figure 12 shows examples of input functions predicted by the proposed FC-DLIF model. The mean curve and the standard deviation over the 10 runs are depicted for the best (left), median (middle), and worst (right) sample according to the mean squared error (MSE).

Figure 12: Examples of input function predictions with FC-DLIF compared against the ground truth on unseen tracers. Insets zoom in on the first 3 minutes of the curves, when the input function peak occurs. (a) Best sample ($\mathrm{[^{18}F]FDOPA}$); (b) median sample ($\mathrm{[^{18}F]FDOPA}$); (c) worst sample ($\mathrm{[^{68}Ga]PSMA}$).

B.3 Detailed kinetic modeling results

Further kinetic modeling results using the predicted input functions from both models are summarized in Figures 13 and 14. The kinetic parameters were estimated using an irreversible two-tissue compartment model [36] in two different regions: myocardium and brain. The estimated parameters were compared with those obtained using the measured AIF and DLIF. The quantile–quantile plots show the distribution of the kinetic parameters estimated using orthogonal regression.
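For reference, a common textbook form of the irreversible two-tissue compartment model is written out below. The notation is assumed rather than taken from the study: $C_a(t)$ denotes the input function, $C_1(t)$ and $C_2(t)$ the free and trapped tissue compartments, and the blood volume term follows one widely used convention that may differ from the one applied here.

```latex
\begin{align}
\frac{dC_1(t)}{dt} &= K_1\,C_a(t) - (k_2 + k_3)\,C_1(t), \\
\frac{dC_2(t)}{dt} &= k_3\,C_1(t), \\
C_{\mathrm{PET}}(t) &= (1 - V_b)\,\bigl(C_1(t) + C_2(t)\bigr) + V_b\,C_a(t), \\
K_i &= \frac{K_1\,k_3}{k_2 + k_3}.
\end{align}
```

In this form the model is irreversible because there is no $k_4$ term returning tracer from the trapped compartment, and the net influx rate $K_i$ is the quantity that Patlak analysis estimates from the linear phase alone.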

Compared to the Patlak analysis presented in the main manuscript, the kinetic parameters derived from the two-tissue compartment model are less stable because the full measurement period is used in the fitting, and not only the linear phase. Outliers are therefore removed if they lie more than three standard deviations away from the mean of each parameter distribution. Once an outlier is detected, it is removed from all parameters in the respective region for consistency. This results in 64 and 66 samples remaining for the myocardium and brain regions, respectively.
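The joint removal rule can be sketched as below. This is an illustration under stated assumptions: a single pass over the data, the sample standard deviation, and invented parameter values; the function name is hypothetical.

```python
from statistics import mean, stdev


def remove_outliers(params, n_sd=3.0):
    """Drop samples that are outliers in any parameter of one region.

    `params` maps a parameter name to a list of values, one per sample, with
    the same sample order in every list. A sample is flagged if any of its
    values lies more than `n_sd` standard deviations from that parameter's
    mean, and a flagged sample is removed from all parameters.
    Assumption: one pass only, using the sample standard deviation.
    """
    n_samples = len(next(iter(params.values())))
    flagged = set()
    for values in params.values():
        mu, sd = mean(values), stdev(values)
        flagged.update(i for i, v in enumerate(values) if abs(v - mu) > n_sd * sd)
    keep = [i for i in range(n_samples) if i not in flagged]
    cleaned = {name: [values[i] for i in keep] for name, values in params.items()}
    return cleaned, sorted(flagged)


# Invented values for 20 samples; the last sample is an outlier in k3 only.
region = {
    "K1": [0.20 + 0.01 * i for i in range(20)],
    "k3": [0.1] * 19 + [5.0],
}
cleaned, removed = remove_outliers(region)
print(removed, {name: len(values) for name, values in cleaned.items()})
```

Removing the flagged sample from every parameter keeps the per-region sample sets aligned, so that each remaining point in the comparison plots corresponds to the same set of mice.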

The observed myocardium kinetic parameters (Figures 13 and 13(e)) follow the reference distribution well, with larger spread in $k_2$, $k_3$, and $V_b$, as well as outliers not removed by the three standard deviation rule in $k_2$ and the blood volume fraction. Influx rates (Figure 13(e)) were higher than previously reported [42, 43]. This is likely due to shorter fasting times in our study, where longer fasting times are known to reduce influx rates in the myocardium [42, 43].

The parameter distributions in the brain (Figure 14) had fewer outliers, and similar to the myocardium, $k_2$, $k_3$, and $V_b$ had larger spread than the uptake and influx rate. The ranges of the brain kinetic parameters are in line with previous literature [44].

Figure 13: Scatterplots summarizing kinetic parameters estimated using an irreversible two-tissue compartment model in the myocardium region. Points represent each region in each mouse sample, colored by the model producing the coefficient. Panels (a) to (e), from top left to bottom right: $K_1$, $k_2$, $k_3$, $V_b$, and $K_i$.
Figure 14: Scatterplots summarizing kinetic parameters estimated using an irreversible two-tissue compartment model in the brain region. Points represent each region in each mouse sample, colored by the model producing the coefficient. Panels (a) to (e), from top left to bottom right: $K_1$, $k_2$, $k_3$, $V_b$, and $K_i$.