C2BA-UNet: A Context-Coordination Multi-Atlas Boundary-Aware UNet-like Method for PET-CT Images Based Tumor Segmentation (2023)
Keywords: Tumor segmentation; UNet; Context-coordination; Boundary-awareness; PET/CT imaging

Tumor segmentation is a necessary step in clinical processing that can help doctors diagnose tumors and plan surgical treatments. Since tumors are usually small, their locations and appearances vary substantially across individuals, and the contrast between tumors and adjacent normal tissues is low, tumor segmentation is still a challenging task. Although convolutional neural networks (CNNs) have achieved good results in tumor segmentation, the information about tumor boundaries has rarely been explored. To solve this problem, this paper proposes a new method for automatic tumor segmentation in PET/CT images based on context coordination and boundary awareness, termed C2BA-UNet. We employ a UNet-like backbone network and replace the encoder with EfficientNet-B0 for efficiency. To acquire potential tumor boundaries, we propose a new multi-atlas boundary-aware (MABA) module based on a gradient atlas, an uncertainty atlas, and a level set atlas, which focuses on uncertain regions between tumors and adjacent tissues. Furthermore, we propose a new context coordination module (CCM) that combines multi-scale context information with an attention mechanism to optimize skip connections in high-level layers. To validate the superiority of our method, we conduct experiments on a publicly available soft tissue sarcoma (STS) dataset and a lymphoma dataset, and the results show our method is competitive with other comparison methods.
∗ Corresponding author at: Key Laboratory of Intelligent Computing in Biomedical Image, Ministry of Education, Northeastern University, Shenyang 110819,
China.
E-mail address: [email protected] (H. Jiang).
https://doi.org/10.1016/j.compmedimag.2022.102159
Received 10 May 2022; Received in revised form 11 November 2022; Accepted 5 December 2022
Available online 9 December 2022
0895-6111/© 2022 Elsevier Ltd. All rights reserved.
2. Related works
Fig. 2. The architecture of the proposed C2BA-UNet model. The area branch is formed by the backbone of our model, a modified UNet with EfficientNet-B0 as the encoder. The proposed MABA module learns tumor boundary segmentation in the decoder layer with a resolution of 64×64 as the boundary branch. Meanwhile, CCM reduces the redundancy of high-level features fed to the decoder.
Fig. 4. The CCM for high-level skip connection. Three dilated convolution kernels are used in context feature extraction, represented by dashed boxes in blue, green, and light red, respectively. The orange color in each convolution kernel indicates regions of the input feature map to be convolved, and the gray indicates regions skipped by the dilated convolution.
3.3. Skip connection based on CCM

In this section, we propose CCM to enhance skip connections to refine features and give ROIs of tumors in high-level layers more attention, as shown in Fig. 4. Inspired by ASPP (Chen et al., 2017b), we use three parallel dilated convolution branches and a constant concatenation to extract context features of multiple receptive fields. Considering the low occupancy of tumors in the image and the small spatial size of the layer preceding the bottleneck, the dilated rates are set to 1, 3, 5. Given the input feature map $X \in \mathbb{R}^{W \times H \times 80}$, we obtain three features of different receptive fields ($f_{dc}^{r=1}$, $f_{dc}^{r=3}$, $f_{dc}^{r=5}$) by dilated convolutions, and CCM extracts the multi-scale feature as follows:

$X' = \tau(\phi(X, f_{dc}^{r=1}, f_{dc}^{r=3}, f_{dc}^{r=5}))$    (6)

where $\phi(\cdot)$ represents a concatenation operation and $\tau(\cdot)$ represents a $1 \times 1$ convolution. $X' \in \mathbb{R}^{W \times H \times 80}$ is the integrated and fused feature.
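To make Eq. (6) concrete, the following is a minimal PyTorch sketch of this context-extraction step, assuming the 80-channel feature width stated above; the class and variable names are ours and not taken from any released implementation.

import torch
import torch.nn as nn

class CCMContextExtraction(nn.Module):
    # Eq. (6): three parallel 3x3 dilated convolutions (rates 1, 3, 5),
    # concatenated with the input X (phi), then fused by a 1x1 convolution (tau).
    def __init__(self, channels=80):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in (1, 3, 5)
        ])
        self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=1)  # tau

    def forward(self, x):
        feats = [x] + [branch(x) for branch in self.branches]  # phi: concatenation
        return self.fuse(torch.cat(feats, dim=1))               # X', same shape as X

# Shape check: X in R^{W x H x 80} maps to X' in R^{W x H x 80}.
x = torch.randn(1, 80, 16, 16)
assert CCMContextExtraction()(x).shape == x.shape

Padding each 3×3 branch by its own dilation rate keeps the spatial size unchanged, so the three receptive-field scales can be concatenated channel-wise with the input.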
Then, we improve the long-term dependence of the integrated multi-scale feature by a self-attention mechanism. We perform a position embedding operation (Vaswani et al., 2017) on the feature map $X'$ and flatten it into two one-dimensional sequences, row-by-row ($X'_{row} \in \mathbb{R}^{WH \times 1 \times 80}$) and column-by-column ($X'_{col} \in \mathbb{R}^{1 \times WH \times 80}$), respectively. For calibrating both axes, we employ gated axial attention (Valanarasu et al., 2021). Take $X'_{row}$ (the same holds for $X'_{col}$) as an example:

$y_m = \sum_{n \in \Omega} \mathrm{softmax}_n\left(q_m^T k_n + G_Q\, q_m^T r_n^q + G_K\, k_n^T r_n^k\right)\left(G_{V1}\, v_n + G_{V2}\, r_n^v\right)$    (7)

where $m, n$ are positions in $X'_{row}$, $y_m$ represents the output at point $m$, and $\Omega$ represents the local receptive field region centered on $m$. $q, k, v$ are the query, key, and value, and $r_n^q, r_n^k, r_n^v \in \mathbb{R}^{W \times W}$ denote relative positional encodings for $q, k, v$, respectively. $G_Q, G_K, G_{V1}, G_{V2}$ are four attention gates added to the queries, keys, and values, respectively. All parameters are trainable.
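A simplified, single-head sketch of the gating in Eq. (7) is given below; it operates on one flattened axis at a time, folds the relative positional encodings into per-position parameters, and uses scalar gates, so it illustrates the gating idea rather than reproducing the full implementation of Valanarasu et al. (2021). All names are ours.

import torch
import torch.nn as nn

class GatedAxialAttention(nn.Module):
    # Simplified form of Eq. (7) along one axis (row or column) of X'.
    # r_q, r_k, r_v stand in for the relative positional encodings;
    # g_q, g_k, g_v1, g_v2 are the learnable gates G_Q, G_K, G_V1, G_V2.
    def __init__(self, dim, seq_len):
        super().__init__()
        self.to_qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.r_q = nn.Parameter(torch.randn(seq_len, dim) * 0.02)
        self.r_k = nn.Parameter(torch.randn(seq_len, dim) * 0.02)
        self.r_v = nn.Parameter(torch.randn(seq_len, dim) * 0.02)
        self.g_q = nn.Parameter(torch.ones(1))
        self.g_k = nn.Parameter(torch.ones(1))
        self.g_v1 = nn.Parameter(torch.ones(1))
        self.g_v2 = nn.Parameter(torch.ones(1))

    def forward(self, x):                          # x: (batch, seq_len, dim)
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        logits = (q @ k.transpose(-2, -1)          # q^T k
                  + self.g_q * (q @ self.r_q.T)    # G_Q q^T r^q
                  + self.g_k * (k @ self.r_k.T))   # G_K k^T r^k
        attn = logits.softmax(dim=-1)
        return attn @ (self.g_v1 * v) + attn @ (self.g_v2 * self.r_v)

# e.g. one row sequence of a 16x16, 80-channel feature map:
y = GatedAxialAttention(dim=80, seq_len=16)(torch.randn(2, 16, 80))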
3.4. Loss function

Following Zhou et al. (2022), we adopt the binary cross entropy loss ($L_{BCE}$) as the loss function:

$L_{BCE} = -\sum_{n=1}^{N} \left[\, p_c \cdot \hat{y}_n \cdot \log\left(y_n\right) + \left(1 - \hat{y}_n\right) \cdot \log\left(1 - y_n\right) \,\right]$    (8)

where $y_n \in [0, 1]$ denotes the predicted probability map, and $\hat{y}_n \in \{0, 1\}$ denotes the ground truth corresponding to $y_n$. $N$ is the batch size, and $p_c$ is a hyper-parameter introduced because of the class imbalance problem; we set it to 6 by experience.

BCE loss is used in two branches, as shown in Fig. 2: $L_{AS}$ for the area branch and $L_{BS}$ for the boundary branch. $L_{AS}$ measures the difference between the area prediction and the area ground truth. $L_{BS}$ measures the difference between the boundary-aware result generated by the MABA module and the boundary ground truth. Thus, the total loss $L_{total}$ is calculated by adding both losses:

$L_{total} = \alpha L_{AS} + (1 - \alpha) L_{BS}$    (9)

where $\alpha$ and $1 - \alpha$ are the weights of $L_{AS}$ and $L_{BS}$, respectively.
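A compact sketch of Eqs. (8) and (9) follows, assuming a mean reduction over pixels and batch (the reduction is not specified in this excerpt), with p_c = 6 as stated above and α = 0.5 as reported in Table 1.

import torch

def weighted_bce(y_pred, y_true, p_c=6.0, eps=1e-7):
    # Eq. (8): BCE with the positive (tumor) term weighted by p_c
    # to counter the tumor/background class imbalance.
    y_pred = y_pred.clamp(eps, 1.0 - eps)
    loss = -(p_c * y_true * torch.log(y_pred)
             + (1.0 - y_true) * torch.log(1.0 - y_pred))
    return loss.mean()

def total_loss(area_pred, area_gt, bnd_pred, bnd_gt, alpha=0.5):
    # Eq. (9): L_total = alpha * L_AS + (1 - alpha) * L_BS
    l_as = weighted_bce(area_pred, area_gt)   # area branch
    l_bs = weighted_bce(bnd_pred, bnd_gt)     # boundary branch (MABA output)
    return alpha * l_as + (1.0 - alpha) * l_bs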
4. Experiment

4.1. Experimental settings

4.1.1. Dataset

In the experiments, we use a publicly available STS dataset as well as a lymphoma dataset for training and testing. PET/CT image pairs are included in the initial data format. The size of each PET slice is 128 × 128 pixels, and the size of each CT slice is 512 × 512 pixels.

STS dataset¹: The publicly available STS dataset contains 51 histologically confirmed cases of STS of the extremities. FDG-PET scan images of the STS data were acquired on a PET/CT scanner (Discovery ST, GE Healthcare, Waukesha, WI) at the McGill University Health Centre (MUHC) from November 2004 to November 2011. FDG at a median of 420 MBq (210–620 MBq) was injected intravenously, and whole-body 2D imaging acquisition was performed after 60 min at a median of 180 s (160–300 s) per patient. All patients' tumor areas were manually mapped slice-by-slice by a radiation oncologist based on T2FS scans. The rigid alignment software MIM (MIM Software Inc., Cleveland, OH) was used to propagate the contours into the FDG-PET images. For patients

¹ https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=21266533
Table 1
Experimental settings.

Parameter   Batch size   Epochs   Optimizer   Learning rate   α     Early stopping
Setting     32           100      AdamW       0.0001          0.5   True
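Read as code, the Table 1 configuration corresponds to an AdamW optimizer (Loshchilov and Hutter, 2019) with learning rate 1e-4 and an early-stopping criterion such as the minimal helper below; the patience value is our assumption, since the table only states that early stopping is enabled, and the one-layer model is a stand-in for C2BA-UNet.

import torch.nn as nn
from torch.optim import AdamW

model = nn.Conv2d(2, 1, kernel_size=1)           # stand-in for C2BA-UNet
optimizer = AdamW(model.parameters(), lr=1e-4)   # Table 1: AdamW, lr = 0.0001

class EarlyStopping:
    # Stops training when the validation loss has not improved for
    # `patience` consecutive epochs; patience = 10 is our assumption.
    def __init__(self, patience=10):
        self.patience, self.best, self.bad = patience, float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad = val_loss, 0
        else:
            self.bad += 1
        return self.bad >= self.patience  # True -> stop training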
Table 2
Quantitative results of ablation experiments on the STS and Lymphoma datasets.

Dataset    Method      EfficientNet-B0 encoder   CCM   MABA   DSC_pw   Rec_pw   IOU_pw   DRI_pw
STS        E-UNet      ✓                         –     –      0.745    0.733    0.615    0.698
STS        C2-UNet     ✓                         ✓     –      0.753    0.783    0.623    0.719
STS        BA-UNet     ✓                         –     ✓      0.757    0.806    0.627    0.730
STS        C2BA-UNet   ✓                         ✓     ✓      0.781    0.837    0.649    0.756
Lymphoma   E-UNet      ✓                         –     –      0.815    0.868    0.696    0.793
Lymphoma   C2-UNet     ✓                         ✓     –      0.821    0.866    0.707    0.798
Lymphoma   BA-UNet     ✓                         –     ✓      0.815    0.901    0.695    0.804
Lymphoma   C2BA-UNet   ✓                         ✓     ✓      0.834    0.902    0.722    0.819

* The definitions of DSC_pw, Rec_pw, IOU_pw, DRI_pw are given in Section 4.2.
* The best values are highlighted in bold.
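The metric definitions live in Section 4.2, which is not reproduced in this excerpt; for orientation, standard pixel-wise Dice, recall, and IoU over binary masks can be computed as in the sketch below. DRI_pw follows the paper's own definition and is deliberately not reproduced here.

import numpy as np

def pixelwise_metrics(pred, gt, eps=1e-7):
    # Standard pixel-wise DSC, recall, and IoU for binary masks.
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    dsc = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    rec = tp / (tp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    return dsc, rec, iou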
Table 3
The proposed C2BA-UNet is evaluated against other comparison methods for quantitative metrics on the STS dataset.

Method                                FLOPs (GMac)   DSC_pw   Rec_pw   IOU_pw   DRI_pw
UNet (Ronneberger et al., 2015)       16.37          0.744    0.763    0.620    0.709
UNet++ (Zhou et al., 2018)            34.53          0.762    0.785    0.643    0.730
DiSegNet (Xu et al., 2021)            15.05          0.738    0.784    0.623    0.715
MSAM (Fu et al., 2021)                36.45          0.761    0.743    0.634    0.713
CoFeatureModel (Kumar et al., 2019)   5.93           0.748    0.772    0.632    0.717
C2BA-UNet (Ours)                      2.37           0.781    0.837    0.649    0.756

* The FLOPs metric indicates floating point operations and is given in Section 4.4.1.
* The best values are highlighted in bold.
Fig. 6. Heatmap results on test images from the STS (first row) and lymphoma (second row) datasets. The highlighted regions demonstrate that C2BA-UNet utilizes tumor boundary features and area information better than the other methods.
4.4.1. STS segmentation results

Among the methods for STS segmentation listed in Table 3, DiSegNet (Xu et al., 2021) has the lowest DSC_pw with 0.738, MSAM (Fu et al., 2021) has the lowest Rec_pw with 0.743, and UNet (Ronneberger et al., 2015) has the lowest IOU_pw with 0.620 and the lowest DRI_pw with 0.709. On the contrary, our C2BA-UNet yields optimal performance in all four metrics (DSC_pw with 0.781, Rec_pw with 0.837, IOU_pw with 0.649, and DRI_pw with 0.756) compared to the other models. Due to the use of EfficientNet, which significantly reduces the number of channels in the encoder, our model also has the lowest computational complexity of all compared models; the floating point operations (FLOPs) reported in Table 3 show this efficiency benefit.

For qualitative analysis, Fig. 7 shows three patient slices from the test set and the visualization results corresponding to the six segmentation networks. It can be seen that C2BA-UNet fuses the boundary information well into the final segmentation result and can effectively inhibit non-tumor areas.

4.4.2. Lymphoma segmentation results

Table 4 lists the quantitative results on the lymphoma dataset. Among these metrics, DiSegNet (Xu et al., 2021) has the lowest DSC_pw with 0.806, CoFeatureModel (Kumar et al., 2019) has the lowest Rec_pw with 0.862, and DiSegNet (Xu et al., 2021) has the lowest IOU_pw with 0.689 and the lowest DRI_pw with 0.796. Our proposed network yields optimal performance in all four metrics (DSC_pw with 0.834, Rec_pw with 0.902, IOU_pw with 0.722, and DRI_pw with 0.819) compared to the other models. For qualitative analysis, Fig. 8 shows three patient slices from the test set and the results corresponding to the six segmentation methods. Note that, for the slice shown in the first row, UNet++ (Zhou et al., 2018), DiSegNet (Xu et al., 2021), MSAM (Fu et al., 2021), and CoFeatureModel (Kumar et al., 2019) all have scattered false positive areas, which are mis-segmentations. However, the false positive areas of C2BA-UNet lie entirely around the tumor segmentation result and are mostly distributed in the lower left region of the tumor, which may be part of the uncertainty regions caused by missing labels during manual contouring. This suggests that our method can provide a more reliable signal for tumor segmentation.

4.4.3. Tumor boundary segmentation results

We evaluate the effect of the proposed C2BA-UNet and the comparison methods on tumor boundary segmentation, as shown in Fig. 9. All methods, including ours, exhibit over-segmentation. The comparison methods lack optimization and learning specifically for tumor boundaries and produce over-segmentation when tumors are small or irregularly shaped. However, C2BA-UNet performs multi-strategy and multi-scale awareness of tumor boundaries to eliminate some false positive areas and can achieve more realistic segmentation results related to tumor boundaries.
Fig. 7. Qualitative results of slices from three test patients on the STS dataset. From left to right, the first three columns are the original CT, the original PET, and the Ground Truth, and the other columns are the results of various segmentation methods. The yellow shows correctly predicted tumor segmentation, the red regions indicate false positive pixels, and the green regions indicate false negative pixels.
Fig. 8. Qualitative results of slices from three test patients on the Lymphoma dataset. From left to right, the first three columns are the original CT, the original PET, and the Ground Truth, and the other columns are the results of various segmentation methods. The yellow shows correctly predicted tumor segmentation, the red regions indicate false positive pixels, and the green regions indicate false negative pixels.
Table 4
The proposed C2BA-UNet is evaluated against other comparison methods for quantitative metrics on the Lymphoma dataset.

Method                                FLOPs (GMac)   DSC_pw   Rec_pw   IOU_pw   DRI_pw
UNet (Ronneberger et al., 2015)       16.37          0.821    0.878    0.711    0.803
UNet++ (Zhou et al., 2018)            34.53          0.831    0.885    0.721    0.812
DiSegNet (Xu et al., 2021)            15.05          0.806    0.895    0.689    0.796
MSAM (Fu et al., 2021)                36.45          0.829    0.875    0.721    0.808
CoFeatureModel (Kumar et al., 2019)   5.93           0.827    0.862    0.717    0.802
C2BA-UNet (Ours)                      2.37           0.834    0.902    0.722    0.819

* The FLOPs metric indicates floating point operations and is given in Section 4.4.1.
* The best values are highlighted in bold.
5. Discussion

5.1. Different atlas configurations in MABA module

To explore the impact of different atlas configurations on experimental performance, we discuss the MABA module with six variants, including three single-atlas strategies (gradient atlas only (MABA_G), uncertainty atlas only (MABA_U), level set atlas only (MABA_L)) and three dual-atlas strategies (gradient atlas with uncertainty atlas (MABA_GU), gradient atlas with level set atlas (MABA_GL), uncertainty atlas with level set atlas (MABA_UL)).

Fig. 10 shows the DSC_pw for all modules. Generally, the results show that the MABA module has robust and stable performance across the different variants. Moreover, our MABA module with the triple atlas obtains a better performance, with DSC_pw increased by 1.2%, 2.4%, 1.2%, 2.2%, 3.9%, and 3.3% compared to MABA_G, MABA_U, MABA_L, MABA_GU, MABA_GL, and MABA_UL, respectively. This observation demonstrates that all of the introduced atlases are beneficial for learning and optimizing tumor segmentation predictions. The highest performance is obtained by combining all three approaches, and our module can play a positive role in tumor boundary segmentation.

5.2. Different dilated rates in CCM

To study the effect of different sizes of receptive fields on tumor segmentation, we further evaluate the performance of different multi-scale dilated convolutions, including all combinations of 3×3 convolution kernels with dilated rates from 1 to 5. Fig. 11 illustrates the DSC_pw for all combinations on the STS dataset.
Fig. 9. Qualitative results of tumor boundary segmentations. From left to right, the first three columns are the original CT, the original PET, and the Ground Truth, and the other columns are the results of various segmentation methods. The backgrounds of the results are PET images, and the images are scaled and cropped for better display. The yellow line indicates the ground truth, and the red line indicates the predicted results. The first two rows are from the STS dataset, and the third row is from the Lymphoma dataset.
Fig. 10. The DSC_pw performance for different atlas configurations in the MABA module on the STS dataset.
Fig. 12. Rec_pw–DSC_pw distribution with different p_c values on the STS dataset. The tuple (p_c, Rec_pw, DSC_pw) indicates the optimal Rec_pw and DSC_pw under the corresponding p_c setting. The figure is divided into four regions, and p_c produces a better result when the distribution is located in region (b).
Suppose the dilated rates of three different dilated convolutions are $rate_1$, $rate_2$, $rate_3$, and the sequence $(rate_1, rate_2, rate_3)$ denotes the corresponding multi-scale convolution. Specifically, in our experiment, the dilated rate combination of the dilated convolutions in CCM is (1, 3, 5), which increases DSC_pw by 3.1%, 1.5%, 0.8%, 1.8%, 2.9%, 2.2%, 3.8%, 1.3%, and 1.2% compared to (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), and (3, 4, 5), respectively. As a result, increasing the gap in the dilated rates of multi-scale dilated convolutions can effectively improve the model performance.
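For intuition, the effective extent of a 3×3 convolution with dilation rate r is (3 − 1)·r + 1, so the chosen (1, 3, 5) combination covers 3×3, 7×7, and 11×11 regions in parallel. The short script below (our illustration, not the authors' analysis) enumerates the extents of every rate triple compared above.

from itertools import combinations

def extent(rate, k=3):
    # Effective spatial extent of a k x k convolution with dilation `rate`.
    return (k - 1) * rate + 1

for combo in combinations(range(1, 6), 3):   # all rate triples from 1..5
    extents = [extent(r) for r in combo]
    print(combo, "->", extents, "span:", extents[-1] - extents[0])
# (1, 3, 5) -> [3, 7, 11]: evenly spaced extents with a wide overall span.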
To investigate the impact of different loss functions on our model, we perform experiments on the STS dataset using Dice Loss (Milletari et al., 2016) and different Shannon-based loss functions, including BCE Loss, Focal Loss (Lin et al., 2020), Tversky Loss (Salehi et al., 2017), and our Weighted BCE Loss. The values of the losses in each epoch are shown in Fig. 14. As listed in Table 5, the performances are stable and accurate across all of the compared loss functions. It can be seen that the Weighted BCE Loss provides promising performance in all metrics compared to the other Shannon-based losses. This is owing to the weight term we introduce, which is able to alleviate the class imbalance problem between positive and negative samples effectively and conveniently.
Table 5
Segmentation performance of different losses on the STS dataset.

Losses                               DSC_pw   Rec_pw   IOU_pw   DRI_pw
BCE Loss                             0.743    0.719    0.612    0.691
Dice Loss (Milletari et al., 2016)   0.779    0.807    0.653    0.746
Focal Loss (Lin et al., 2020)        0.762    0.752    0.635    0.716
Tversky Loss (Salehi et al., 2017)   0.787    0.804    0.659    0.750
Weighted BCE Loss (Ours)             0.781    0.837    0.649    0.756
5.5. Impact of data augmentation on model performance

The data augmentation in preprocessing also has a critical impact on the performance of a deep learning model. We explore the impact of data augmentation on the segmentation performance of PET/CT images. We compare the method we apply for preprocessing PET/CT images (see Section 4.1.2) with other methods, including Riemannian random walk (RRW) and manifold sampling (MS) (Chadebec and Allassonnière, 2021).

As seen in Table 6, when RRW and MS are applied only to CT images, the performance can be comparable to that of our method at some level. However, we experimentally observe that when the PET modality is involved in the data augmentation operation, the performance of the corresponding model decreases; especially when we perform RRW or MS on PET/CT simultaneously, the results become very low. This is because of the low resolution of PET images, which makes them highly

6. Conclusion

In this paper, we propose C2BA-UNet, a novel UNet-like model for PET/CT image-based tumor segmentation. For efficient feature extraction, we use EfficientNet-B0 as the encoder. To learn and optimize the challenging tumor boundary segmentation, we design the MABA module using three model-driven approaches with varied focuses on the uncertain regions between tumors and adjacent tissues. In addition, we propose CCM to improve the tumor-related feature response while keeping the network efficient and lightweight. The superiority of our proposed method is verified on a publicly available STS dataset and an internal lymphoma PET/CT dataset. The results show that C2BA-UNet achieves superior performance in all evaluated metrics compared with the other comparison models. In summary, the proposed model can combine features of the tumor boundary and area to achieve better enhancement for tumor segmentation on small-scale datasets.

Considering the limitations of our proposed model in learning 3D imaging information, we will keep exploring tumor segmentation methods that combine both boundary-aware and area-based approaches for 3D images in future work, and we will focus on studying more tumor datasets to improve the generalization of our method. We will also continue to study the automatic selection of the hyper-parameters as well as the weight values in subsequent works.
Fig. 14. The performance curves of different loss functions in each epoch. (a) training set, (b) validation set.
Fig. 15. Illustration of failure cases with PET image as background. Images are scaled and cropped for better display. The yellow line gives the ground truth, and the red line
shows our predicted results.
Table 6
Segmentation performance of the model with different data augmentation methods on the STS dataset.

Network         Input      DSC_pw   Rec_pw   IOU_pw   DRI_pw
C2BA-UNet_RRW   PET/CT†    0.772    0.838    0.636    0.749
C2BA-UNet_RRW   PET†/CT    0.530    0.552    0.414    0.499
C2BA-UNet_RRW   PET†/CT†   0.095    0.122    0.064    0.094
C2BA-UNet_MS    PET/CT†    0.726    0.827    0.586    0.713
C2BA-UNet_MS    PET†/CT    0.549    0.568    0.434    0.517
C2BA-UNet_MS    PET†/CT†   0.099    0.121    0.067    0.096
C2BA-UNet       PET/CT     0.781    0.837    0.649    0.756

* C2BA-UNet_RRW and C2BA-UNet_MS indicate that C2BA-UNet uses RRW and MS as the data augmentation method.
* PET† and CT† indicate the corresponding images with data augmentation methods.
* The best values are highlighted in bold.

CRediT authorship contribution statement

Shijie Luo: Conceptualization, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Visualization. Huiyan Jiang: Methodology, Resources, Writing – review & editing, Supervision, Project administration, Funding acquisition. Meng Wang: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The authors do not have permission to share data.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61872075) and the Natural Science Foundation of Liaoning Province, China (No. 2021-YGJC-07). We gratefully acknowledge the support of Dr. Guoxiu Lu, Dr. Jia Guo, Dr. Zhiguo Wang, and Mr. Youchao Wang from the nuclear medicine department of the General Hospital of Shenyang Military Area Command.

References

Ali, Z., Irtaza, A., Maqsood, M., 2022. An efficient U-Net framework for lung nodule detection using densely connected dilated convolutions. J. Supercomput. 78 (2), 1602–1623. http://dx.doi.org/10.1007/s11227-021-03845-x.
Alom, M.Z., Yakopcic, C., Hasan, M., Taha, T.M., Asari, V.K., 2019. Recurrent residual U-Net for medical image segmentation. J. Med. Imaging 6 (1), 014006. http://dx.doi.org/10.1117/1.JMI.6.1.014006.
Bhattarai, B., Subedi, R., Gaire, R.R., Vazquez, E., Stoyanov, D., 2022. Histogram of oriented gradients meet deep learning: A novel multi-task deep network for medical image semantic segmentation. arXiv preprint arXiv:2204.01712. http://dx.doi.org/10.48550/arXiv.2204.01712.
Boellaard, R., Delgado-Bolton, R., Oyen, W.J., Giammarile, F., Tatsch, K., Eschner, W., Verzijlbergen, F.J., Barrington, S.F., Pike, L.C., Weber, W.A., et al., 2015. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur. J. Nucl. Med. Mol. Imaging 42 (2), 328–354. http://dx.doi.org/10.1007/s00259-014-2961-x.
Chadebec, C., Allassonnière, S., 2021. Data augmentation with variational autoencoders and manifold sampling. In: Deep Generative Models, and Data Augmentation, Labelling, and Imperfections. Springer International Publishing, pp. 184–192. http://dx.doi.org/10.1007/978-3-030-88210-5_17.
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L., 2017a. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40 (4), 834–848. http://dx.doi.org/10.1109/TPAMI.2017.2699184.
Chen, L.-C., Papandreou, G., Schroff, F., Adam, H., 2017b. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587. http://dx.doi.org/10.48550/arXiv.1706.05587.
Chen, X., Qi, D., Shen, J., 2019. Boundary-aware network for fast and high-accuracy portrait segmentation. arXiv preprint arXiv:1901.03814. http://dx.doi.org/10.48550/arXiv.1901.03814.
Dahab, D.A., Ghoniemy, S.S., Selim, G.M., et al., 2012. Automated brain tumor detection and identification using image processing and probabilistic neural network techniques. Int. J. Image Process. Vis. Commun. 1 (2), 1–8.
Diao, Z., Jiang, H., Han, X.-H., Yao, Y.-D., Shi, T., 2021. EFNet: evidence fusion network for tumor segmentation from PET-CT volumes. Phys. Med. Biol. 66 (20), 205005. http://dx.doi.org/10.1088/1361-6560/ac299a.
Dutta, K., Roy, S., Whitehead, T.D., Luo, J., Jha, A.K., Li, S., Quirk, J.D., Shoghi, K.I., 2021. Deep learning segmentation of triple-negative breast cancer (TNBC) patient derived tumor xenograft (PDX) and sensitivity of radiomic pipeline to tumor probability boundary. Cancers 13 (15), 3795. http://dx.doi.org/10.3390/cancers13153795.
Fu, X., Bi, L., Kumar, A., Fulham, M., Kim, J., 2021. Multimodal spatial attention module for targeting multimodal PET-CT lung tumor segmentation. IEEE J. Biomed. Health Inf. 25 (9), 3507–3516. http://dx.doi.org/10.1109/jbhi.2021.3059453.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27. URL: https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf.
Han, Y., Li, X., Wang, B., Wang, L., 2021. Boundary loss-based 2.5D fully convolutional neural networks approach for segmentation: a case study of the liver and tumor on computed tomography. Algorithms 14 (5), 144. http://dx.doi.org/10.3390/a14050144.
Hu, J., Shen, L., Albanie, S., Sun, G., Wu, E., 2020. Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 42 (8), 2011–2023. http://dx.doi.org/10.1109/TPAMI.2019.2913372.
Huang, H., Lin, L., Tong, R., Hu, H., Zhang, Q., Iwamoto, Y., Han, X., Chen, Y.-W., Wu, J., 2020. Unet 3+: A full-scale connected unet for medical image segmentation. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing. ICASSP, IEEE, pp. 1055–1059. http://dx.doi.org/10.1109/ICASSP40776.2020.9053405.
Huang, H., Zheng, H., Lin, L., Cai, M., Hu, H., Zhang, Q., Chen, Q., Iwamoto, Y., Han, X., Chen, Y.-W., et al., 2021. Medical image segmentation with deep atlas prior. IEEE Trans. Med. Imaging 40 (12), 3519–3530. http://dx.doi.org/10.1109/TMI.2021.3089661.
Kubicek, J., Timkovic, J., Penhaker, M., Oczka, D., Krestanova, A., Augustynek, M., Cerný, M., 2019. Retinal blood vessels modeling based on fuzzy sobel edge detection and morphological segmentation. In: Biodevices. pp. 121–126. http://dx.doi.org/10.5220/0007237501210126.
Kumar, A., Fulham, M., Feng, D., Kim, J., 2019. Co-learning feature fusion maps from PET-CT images of lung cancer. IEEE Trans. Med. Imaging 39 (1), 204–217. http://dx.doi.org/10.1109/TMI.2019.2923601.
Le Dinh, T., Lee, S.-H., Kwon, S.-G., Kwon, K.-R., 2022. Cell nuclei segmentation in cryonuseg dataset using nested unet with EfficientNet encoder. In: 2022 International Conference on Electronics, Information, and Communication. ICEIC, IEEE, pp. 1–4. http://dx.doi.org/10.1109/ICEIC54506.2022.9748537.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P., 2020. Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. 42 (2), 318–327. http://dx.doi.org/10.1109/TPAMI.2018.2858826.
Liu, T., Liu, J., Ma, Y., He, J., Han, J., Ding, X., Chen, C.-T., 2021. Spatial feature fusion convolutional network for liver and liver tumor segmentation from CT images. Med. Phys. 48 (1), 264–272. http://dx.doi.org/10.1002/mp.14585.
Loshchilov, I., Hutter, F., 2019. Decoupled weight decay regularization. In: International Conference on Learning Representations. URL: https://openreview.net/forum?id=Bkg6RiCqY7.
Luo, X., Chen, J., Song, T., Wang, G., 2021. Semi-supervised medical image segmentation through dual-task consistency. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35 (10), pp. 8801–8809. http://dx.doi.org/10.48550/arXiv.2009.04448.
Meyer, C., Mallouh, V., Spehner, D., Baudrier, E., Schultz, P., Naegel, B., 2021. Automatic multi class organelle segmentation for cellular fib-sem images. In: 2021 IEEE 18th International Symposium on Biomedical Imaging. ISBI, IEEE, pp. 668–672. http://dx.doi.org/10.1109/ISBI48211.2021.9434075.
Milletari, F., Navab, N., Ahmadi, S.-A., 2016. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision. 3DV, pp. 565–571. http://dx.doi.org/10.1109/3DV.2016.79.
Mok, T.C., Chung, A., 2018. Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks. In: International MICCAI Brainlesion Workshop. Springer, pp. 70–80. http://dx.doi.org/10.1007/978-3-030-11723-8_7.
Nathan, S., Ramamoorthy, S., 2020. Efficient supervision net: Polyp segmentation using EfficientNet and attention unit. In: Working Notes Proceedings of the MediaEval 2020 Workshop, Online, 14-15 December 2020. URL: http://ceur-ws.org/Vol-2882/paper72.pdf.
Pham, V.-T., Tran, T.-T., Wang, P.-C., Chen, P.-Y., Lo, M.-T., 2021. EAR-UNet: A deep learning-based approach for segmentation of tympanic membranes from otoscopic images. Artif. Intell. Med. 115, 102065. http://dx.doi.org/10.1016/j.artmed.2021.102065.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 234–241. http://dx.doi.org/10.1007/978-3-319-24574-4_28.
Salehi, S.S.M., Erdogmus, D., Gholipour, A., 2017. Tversky loss function for image segmentation using 3D fully convolutional deep networks. In: Machine Learning in Medical Imaging. Springer International Publishing, pp. 379–387. http://dx.doi.org/10.1007/978-3-319-67389-9_44.
Sanchez-Vega, F., Mina, M., Armenia, J., Chatila, W.K., Luna, A., La, K.C., Dimitriadoy, S., Liu, D.L., Kantheti, H.S., Saghafinia, S., et al., 2018. Oncogenic signaling pathways in the cancer genome atlas. Cell 173 (2), 321–337. http://dx.doi.org/10.1016/j.cell.2018.03.035.
Sharifrazi, D., Alizadehsani, R., Roshanzamir, M., Joloudari, J.H., Shoeibi, A., Jafari, M., Hussain, S., Sani, Z.A., Hasanzadeh, F., Khozeimeh, F., et al., 2021. Fusion of convolution neural network, support vector machine and sobel filter for accurate detection of COVID-19 patients using X-ray images. Biomed. Signal Process. Control 68, 102622. http://dx.doi.org/10.1016/j.bspc.2021.102622.
Shelhamer, E., Long, J., Darrell, T., 2017. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39 (4), 640–651. http://dx.doi.org/10.1109/TPAMI.2016.2572683.
Sobel, I.E., 1970. Camera Models and Machine Perception. Stanford University.
Soler, L., Delingette, H., Malandain, G., Montagnat, J., Ayache, N., Koehl, C., Dourthe, O., Malassagne, B., Smith, M., Mutter, D., et al., 2001. Fully automatic anatomical, pathological, and functional segmentation from CT scans for hepatic surgery. Comput. Aided Surg. 6 (3), 131–142. http://dx.doi.org/10.3109/10929080109145999.
Tan, M., Le, Q., 2019. Efficientnet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning. PMLR, pp. 6105–6114. URL: https://proceedings.mlr.press/v97/tan19a.html.
Tang, Y., Cai, J., Yan, K., Huang, L., Xie, G., Xiao, J., Lu, J., Lin, G., Lu, L., 2021. Weakly-supervised universal lesion segmentation with regional level set loss. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 515–525. http://dx.doi.org/10.1007/978-3-030-87196-3_48.
Tang, Y., Tang, Y., Zhu, Y., Xiao, J., Summers, R.M., 2020. E2net: An edge enhanced network for accurate liver and tumor segmentation on CT scans. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 512–522. http://dx.doi.org/10.1007/978-3-030-59719-1_50.
Valanarasu, J.M.J., Oza, P., Hacihaliloglu, I., Patel, V.M., 2021. Medical transformer: Gated axial-attention for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 36–46. http://dx.doi.org/10.1007/978-3-030-87193-2_4.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I., 2017. Attention is all you need. Adv. Neural Inf. Process. Syst., pp. 6000–6010. URL: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
Wang, M., Jiang, H., Shi, T., Yao, Y.-D., 2021a. HD-RDS-UNet: Leveraging spatial-temporal correlation between the decoder feature maps for lymphoma segmentation. IEEE J. Biomed. Health Inf. http://dx.doi.org/10.1109/JBHI.2021.3102612.
Wang, J., Zhang, X., Lv, P., Zhou, L., Wang, H., 2021b. EAR-U-Net: EfficientNet and attention-based residual U-Net for automatic liver segmentation in CT. arXiv preprint arXiv:2110.01014. http://dx.doi.org/10.48550/arXiv.2110.01014.
Wong, D., Liu, J., Fengshou, Y., Tian, Q., Xiong, W., Zhou, J., Qi, Y., Han, T., Venkatesh, S., Wang, S.-c., 2008. A semi-automated method for liver tumor segmentation based on 2D region growing with knowledge-based constraints. In: MICCAI Workshop. Vol. 41. No. 43. http://dx.doi.org/10.54294/25etax.
Xu, G., Cao, H., Udupa, J.K., Tong, Y., Torigian, D.A., 2021. DiSegNet: A deep dilated convolutional encoder-decoder architecture for lymph node segmentation on PET/CT images. Comput. Med. Imaging Graph. 88, 101851. http://dx.doi.org/10.1016/j.compmedimag.2020.101851.
Yang, L., Zhang, Y., Chen, J., Zhang, S., Chen, D.Z., 2017. Suggestive annotation: A deep active learning framework for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 399–407. http://dx.doi.org/10.1007/978-3-319-66179-7_46.
Yu, F., Koltun, V., 2016. Multi-scale context aggregation by dilated convolutions. In: International Conference on Learning Representations. http://dx.doi.org/10.48550/arXiv.1511.07122.
Yu, Q., Xie, L., Wang, Y., Zhou, Y., Fishman, E.K., Yuille, A.L., 2018. Recurrent saliency transformation network: Incorporating multi-stage visual cues for small organ segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 8280–8289. http://dx.doi.org/10.1109/CVPR.2018.00864.
Zaimy, M., Saffarzadeh, N., Mohammadi, A., Pourghadamyari, H., Izadi, P., Sarli, A., Moghaddam, L., Paschepari, S., Azizi, H., Torkamandi, S., et al., 2017. New methods in the diagnosis of cancer and gene therapy of cancer based on nanoparticles. Cancer Gene Ther. 24 (6), 233–243. http://dx.doi.org/10.1038/cgt.2017.16.
Zhang, R., Li, G., Li, Z., Cui, S., Qian, D., Yu, Y., 2020. Adaptive context selection for polyp segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 253–262. http://dx.doi.org/10.1007/978-3-030-59725-2_25.
Zhang, Y., Zhang, J., 2021. Dual-task mutual learning for semi-supervised medical image segmentation. In: Chinese Conference on Pattern Recognition and Computer Vision. PRCV, Springer, pp. 548–559. http://dx.doi.org/10.1007/978-3-030-88010-1_46.
Zhao, X., Zhang, P., Song, F., Fan, G., Sun, Y., Wang, Y., Tian, Z., Zhang, L., Zhang, G., 2021. D2A U-Net: Automatic segmentation of COVID-19 CT slices based on dual attention and hybrid dilated convolution. Comput. Biol. Med. 135, 104526. http://dx.doi.org/10.1016/j.compbiomed.2021.104526.
Zhou, Q., Qin, J., Xiang, X., Tan, Y., Ren, Y., 2022. MOLS-Net: Multi-organ and lesion segmentation network based on sequence feature pyramid and attention mechanism for aortic dissection diagnosis. Knowl.-Based Syst. 239, 107853. http://dx.doi.org/10.1016/j.knosys.2021.107853.
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J., 2018. Unet++: A nested u-net architecture for medical image segmentation. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, pp. 3–11. http://dx.doi.org/10.1007/978-3-030-00889-5_1.