
Computerized Medical Imaging and Graphics 103 (2023) 102159

C²BA-UNet: A context-coordination multi-atlas boundary-aware UNet-like method for PET/CT images based tumor segmentation

Shijie Luo a, Huiyan Jiang a,b,∗, Meng Wang a

a Software College, Northeastern University, Shenyang 110819, China
b Key Laboratory of Intelligent Computing in Biomedical Image, Ministry of Education, Northeastern University, Shenyang 110819, China

ARTICLE INFO

Keywords: Tumor segmentation; UNet; Context-coordination; Boundary-awareness; PET/CT imaging

ABSTRACT

Tumor segmentation is a necessary step in clinical processing that can help doctors diagnose tumors and plan surgical treatments. Since tumors are usually small, their locations and appearances vary substantially across individuals, and the contrast between tumors and adjacent normal tissues is low, tumor segmentation is still a challenging task. Although convolutional neural networks (CNNs) have achieved good results in tumor segmentation, information about tumor boundaries has rarely been explored. To solve this problem, this paper proposes a new method for automatic tumor segmentation in PET/CT images based on context-coordination and boundary-awareness, termed C²BA-UNet. We employ a UNet-like backbone network and replace the encoder with EfficientNet-B0 for efficiency. To acquire potential tumor boundaries, we propose a new multi-atlas boundary-aware (MABA) module based on a gradient atlas, an uncertainty atlas, and a level set atlas, which focuses on uncertain regions between tumors and adjacent tissues. Furthermore, we propose a new context coordination module (CCM) that combines multi-scale context information with an attention mechanism to optimize the skip connection in high-level layers. To validate the superiority of our method, we conduct experiments on a publicly available soft tissue sarcoma (STS) dataset and a lymphoma dataset, and the results show our method is competitive with other comparison methods.

1. Introduction

Cancers are considered among the most deadly diseases, and more than 10 million individuals die of cancer each year around the world (Zaimy et al., 2017). Early diagnosis and treatment are key to an effective decline in mortality. 18F-fluorodeoxyglucose positron emission tomography computed tomography (18F-FDG-PET/CT) is currently the modality most commonly used to detect and diagnose areas of suspected tumor lesions (Wang et al., 2021a). Positron emission tomography (PET) is a tomographic technique that allows noninvasive quantitative assessment of biochemical and functional processes. At present, PET uses 18F-fluorodeoxyglucose as the tracer to measure the distribution of positron-emitting labeled radiotracers. Because increased consumption of glucose is characteristic of most cancers and FDG accumulation is proportional to the amount of glucose utilization, PET has been proven to be a sensitive imaging modality for cancer. Computed tomography (CT) generates tomographic images by combining an X-ray transmission source and a detector system rotating around the subject. CT can provide not only attenuation correction but also high-resolution visualization of morphological and anatomical features. CT-derived data improves the location, extent, and characterization of lesions detected by PET (Boellaard et al., 2015). Fig. 1 shows two axial PET/CT image pairs of patients with soft tissue sarcoma (STS) and lymphoma, respectively. In clinical medicine, tumor segmentation primarily relies on manual labeling, which is time-consuming and can also yield inaccurate diagnoses due to inter- and intra-individual experience disparities among oncologists. As a result, using Computer Aided Diagnosis (CAD) technology to automatically detect and segment suspected lesion areas in medical images can assist oncologists in recognizing tumor signals and provide technical support with reliable predictions for clinicians in subsequent diagnosis monitoring and therapy response assessment.

Tumor segmentation using typical machine learning methods such as region growing (Wong et al., 2008), edge tracking (Dahab et al., 2012), and intensity thresholding (Soler et al., 2001) requires manual feature engineering, which makes automatic segmentation inconvenient. Recently, many deep learning based models, such as FCN (Shelhamer et al., 2017; Yang et al., 2017), UNet (Ronneberger et al., 2015), GAN (Goodfellow et al., 2014; Mok and Chung, 2018), and various extensions (Zhou et al., 2018; Huang et al., 2020; Liu et al., 2021), have been proposed for automatic tumor segmentation and have achieved impressive performances.

∗ Corresponding author at: Key Laboratory of Intelligent Computing in Biomedical Image, Ministry of Education, Northeastern University, Shenyang 110819,
China.
E-mail address: [email protected] (H. Jiang).

https://doi.org/10.1016/j.compmedimag.2022.102159
Received 10 May 2022; Received in revised form 11 November 2022; Accepted 5 December 2022
Available online 9 December 2022
0895-6111/© 2022 Elsevier Ltd. All rights reserved.
Fig. 1. Axial PET/CT image pairs with STS and lymphoma, respectively. The tumor boundaries are marked in pink.

However, potential tumor boundary information has rarely been considered in these networks, resulting in segmentation uncertainty at the boundaries between tumors and normal tissues. To fix this problem, Dutta et al. (2021) utilized a radiomics pipeline to quantify the sensitivity of tumor boundary perturbation features in the study of tumor boundary modeling. E²Net (Tang et al., 2020) improved network segmentation accuracy by obtaining edge enhancement supervision information through a distance transformation. Tang et al. (2021) proposed a regional level set loss, and Han et al. (2021) proposed a loss that combines distance, area, and boundary information; both losses were designed to promote segmentation close to the tumor boundary. These methods considered the importance of boundaries for better tumor segmentation, but using only a single strategy is not powerful enough to model the multi-scale features of the potential boundary regions.

To further improve the segmentation of tumors, especially the boundaries, this study proposes a novel method based on context-coordination and boundary-awareness, termed C²BA-UNet. Specifically, we propose a multi-atlas boundary-aware (MABA) module, because the convolution and pooling operations in the encoder ignore some local tumor boundary information, resulting in inadequate segmentation. In the MABA module, three distinct model-driven boundary-aware approaches are utilized to learn the uncertain regions between tumors and tissues and to optimize the tumor boundary segmentation through fusion and awareness. The proposed MABA module can obtain a boundary closer to the real one by mining the boundary features under multiple approaches. In addition, we use a fine-tuned EfficientNet-B0 based encoder and propose a new skip connection module called the context coordination module (CCM), which combines multi-scale context information with an axial attention mechanism. It improves the attention to the region-of-interest (ROI) of tumors and boosts the regions judged as foreground in the feature map.

The main contributions of this paper are as follows:

• We propose a new multi-atlas boundary-aware (MABA) module, where three distinct model-driven computational approaches for key prediction probability maps concentrate on learning the uncertain regions between tumors and tissues, to obtain more realistic tumor boundaries and improve the difficult boundary segmentation problem.
• We design a context coordination module (CCM) to improve the skip connection in high-level layers. The module combines multiple receptive-field contextual features with axial attention to optimize tumor area predictions while reducing network complexity.
• We propose C²BA-UNet, a new tumor segmentation model which combines an EfficientNet-B0 based encoder with the MABA module and CCM to optimize context-dependent information and enhance tumor boundary segmentation.

2. Related works

2.1. EfficientNet-based segmentation

EfficientNet (Tan and Le, 2019) was proposed to simultaneously optimize the efficiency and accuracy of network learning. In EfficientNet, the mobile bottleneck convolution (MBConv) contributes to the improvement through the squeeze-and-excitation (SE) operation. By searching for a uniform balance when scaling depth, width, and resolution, EfficientNet can combine these advantages without introducing additional manual tuning costs or resource budgets.

EfficientNet has been widely used in the automatic segmentation of medical images. Le Dinh et al. (2022) employed EfficientNet with Nested UNet as a backbone network for automatic cell nuclei segmentation. Nathan and Ramamoorthy (2020) proposed a novel multi-supervision net using EfficientNet as the encoder, and the method performed well on polyp segmentation. Meyer et al. (2021) proposed a UNet-like network using the EfficientNet-B4 structure to achieve high-precision automatic segmentation of numerous cell classes. By combining EfficientNet, attention mechanisms, and residual connections, Wang et al. (2021b) and Pham et al. (2021) achieved satisfactory performance in automatic liver segmentation and in tympanic membrane segmentation of otoscopic images, respectively.

2.2. Dilated convolution

Dilated convolution (Yu and Koltun, 2016) widens the receptive field by inserting zeros into conventional convolution kernels, which is useful in semantic segmentation. Without pooling and downsampling, global features associated with semantics can be well preserved, and combining dilated convolutions with different rates can learn contextual information from multiple dilated receptive fields (Ali et al., 2022). Zhao et al. (2021) used multiple dilated convolutions with residual attention to form the decoder and achieved an improved recall score for COVID-19 segmentation. Deeplab v2 proposed Atrous Spatial Pyramid Pooling (ASPP) (Chen et al., 2017a), which uses multiple dilated rates to enrich the resolution of the feature response. Xu et al. (2021) utilized SegNet with ASPP integrated for lymphoma segmentation and improved the performance and the detailed predictions of the boundary areas.

2.3. Gradient-based boundary detection

The image gradient is a classical method for boundary detection. The gradient magnitude G of a pixel p is calculated from the first-order derivatives of sharpness differences with neighboring pixels and is given by:

$$G_p = \sqrt{\Delta p_x^2 + \Delta p_y^2} \tag{1}$$

where G_p denotes the gradient magnitude at point p, and Δp_x, Δp_y denote the gradient magnitudes along the horizontal and vertical axes, respectively.

Medical images such as PET scans have a low signal-to-noise ratio, so their ROIs suffer from blurred edges and poor contrast, which can cause uncertain boundary predictions. A gradient map of the image can therefore guide the detection of potential boundary regions. Bhattarai et al. (2022) successfully used gradient maps as fake image labels for an auxiliary task. Sharifrazi et al. (2021) proposed a model combining a CNN, an SVM, and the Sobel operator, and proved that studying the gradient can improve performance for COVID-19 detection. Kubicek et al. (2019) performed morphological operations using gradients to distinguish retinal vessels from the background and obtained smooth models.
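As a concrete illustration of Eq. (1), the minimal sketch below computes a per-pixel gradient magnitude map for a 2D image with NumPy. The function name and the central-difference approximation via np.gradient are our own illustrative choices, not from the paper.

```python
import numpy as np

def gradient_magnitude(image: np.ndarray) -> np.ndarray:
    # np.gradient approximates the first-order derivatives with central
    # differences: axis 0 is vertical (rows), axis 1 is horizontal (columns).
    dy, dx = np.gradient(image.astype(np.float64))
    # Eq. (1): G_p = sqrt(dpx^2 + dpy^2) at every pixel p.
    return np.sqrt(dx ** 2 + dy ** 2)

# A vertical step edge yields high magnitude along the boundary column.
img = np.zeros((5, 5))
img[:, 3:] = 1.0
print(gradient_magnitude(img))
```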


Fig. 2. The architecture of the proposed C²BA-UNet model. The area branch is formed by the backbone of our model, which is a modified UNet with EfficientNet-B0 as the encoder. The proposed MABA module learns tumor boundary segmentation in the decoder layer with a resolution of 64×64 as the boundary branch. Meanwhile, CCM reduces the redundancy of the high-level features fed to the decoder.

3. Method

3.1. Network architecture

UNet is an encoder–decoder structure for medical image semantic segmentation. In this paper, we propose C²BA-UNet, as shown in Fig. 2, a UNet-like tumor segmentation model based on context coordination and boundary-awareness. The details of C²BA-UNet are as follows:

(1) A UNet-like method with a pre-trained EfficientNet-B0 based encoder is utilized as the backbone network to increase network learning efficiency and accuracy. Furthermore, we use a transfer learning strategy to fine-tune this complex model and mitigate the overfitting of few-shot learning. The backbone forms the area branch (a code sketch follows this list).
(2) We propose the MABA module to fix the problem of potential boundary information loss caused by image resolution shrinking in CNNs. Since potential boundary information exists in the adjacent area between the tumor and its surrounding tissues, combining approaches can perceive multi-scale boundary features and obtain a more realistic boundary result. We construct the MABA module in the penultimate layer of the decoder to form the boundary branch.
(3) We propose the context coordination module (CCM), an optimized skip connection approach for high-level layers. The encoding blocks feed features to the decoder through the CCM-based skip connection. It improves the attention to the ROIs of tumors and allows the network to maintain efficiency and lightness. Note that, due to hardware constraints, we set CCM only in the layer preceding the bottleneck layer.
(4) Tumors are easily mis-segmented because information about potential tumor regions is usually overwhelmed by the background, i.e., the class imbalance problem. To address this, we combine the area branch and the boundary branch to enhance the tumor segmentation ability.
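The paper does not publish its implementation; as a hedged sketch of the backbone described in (1), the segmentation_models_pytorch library (our assumption — the authors do not name it) can assemble a UNet with a pretrained EfficientNet-B0 encoder that takes the concatenated PET/CT pair as a two-channel input:

```python
import torch
import segmentation_models_pytorch as smp

# UNet-like backbone with a pretrained EfficientNet-B0 encoder (area branch).
# in_channels=2 matches the concatenated PET/CT slice pair; classes=1 yields
# a single-channel tumor map. ImageNet weights realize the transfer learning
# strategy; the whole network is then fine-tuned on the PET/CT data.
backbone = smp.Unet(
    encoder_name="efficientnet-b0",
    encoder_weights="imagenet",
    in_channels=2,
    classes=1,
)

x = torch.randn(1, 2, 128, 128)      # (batch, PET+CT channels, H, W)
with torch.no_grad():
    area_logits = backbone(x)        # (1, 1, 128, 128)
print(area_logits.shape)
```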
3.2. Multi-atlas boundary-aware module

In the MABA module, we consider an atlas as a medical navigational map that has multiple information representations and can provide features ignored by traditional methods (Sanchez-Vega et al., 2018). Accordingly, we combine multiple model-driven approaches, based on an image gradient strategy, uncertainty learning, and level set evolution, respectively, to segment potential boundary regions. As shown in Fig. 3, the gradient atlas, uncertainty atlas, and level set atlas are used to fuse multi-scale features, and an SE (Hu et al., 2020) block enhances the features to optimize the tumor boundary.

Gradient atlas: The gradient magnitude can indicate the variation of pixel sharpness. Typically, the gradient magnitudes of pixels between the tumor and the adjacent tissues are well reflected in the tumor boundary awareness during end-to-end training. We use the Sobel operator (Sobel, 1970) to calculate the gradient magnitude; it is more accurate for boundary localization when the gray-scale gradient and noise are high (Chen et al., 2019). Given a point o, its gradient magnitudes in the horizontal direction (G_o^x) and vertical direction (G_o^y) are computed as follows:

$$G_o^x = F_o * \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}, \qquad G_o^y = F_o * \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \tag{2}$$

where F_o denotes the feature patch composed of point o and its eight neighboring points, and ∗ denotes the convolution operation.

Combining G_o^x and G_o^y generates the complete gradient magnitude G_o at point o for the boundary-aware gradient atlas:

$$G_o = \sqrt{\left(G_o^x\right)^2 + \left(G_o^y\right)^2} \tag{3}$$

Uncertainty atlas: Uncertain regions between the tumor and surrounding tissues contain vast and crucial boundary information. Let P_1 denote the boundary output of the penultimate layer followed by a Sigmoid (see P_1 in Fig. 3). For any pixel p_i ∈ P_1, the smaller the deviation between its value v_{p_i} ∈ [0, 1] and the threshold T, the more uncertain p_i is (Zhang et al., 2020). Empirically, the threshold T is set to 0.5 in existing tumor segmentation studies (Yu et al., 2018; Alom et al., 2019; Huang et al., 2021). The boundary-aware uncertainty atlas (UR) can be defined as follows:

$$UR = 1 - \frac{\left|P_1 - 0.5\right|}{0.5} \tag{4}$$

This strategy generates a normal-like distribution that focuses on pixels whose predicted probabilities are close to the threshold T in P_1, i.e., uncertain regions, and suppresses the influence of pixels that can be directly considered foreground or background.

Level set atlas: The level set method can capture geometric active contours and distance information by implicitly modeling moving curves. To obtain the precise curves important for boundary segmentation, the instantaneous running state of the level set needs to be tracked. However, performing level set evolution inside our model is expensive, so we fit the level set method as a regression task using a signed distance field (SDF) (Luo et al., 2021; Zhang and Zhang, 2021), resulting in the boundary-aware level set atlas:

$$LS(a, b) = \begin{cases} -\inf_{b \in \partial S} \lVert a - b \rVert_2, & \text{if } a \in S_{\mathrm{in}} \\ 0, & \text{if } a \in \partial S \\ +\inf_{b \in \partial S} \lVert a - b \rVert_2, & \text{if } a \in S_{\mathrm{out}} \end{cases} \tag{5}$$

where LS denotes the level set atlas, a, b denote two pixels in feature map P_1, ‖a − b‖₂ represents the Euclidean distance between a and b, and S_in, ∂S, S_out denote the interior, boundary, and exterior of the tumor area, respectively.
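To make the three atlases concrete, the following sketch computes each of them from a sigmoid probability map P1, following Eqs. (2)–(5). The helper names are ours, and the signed distance field uses SciPy's Euclidean distance transform as a discrete stand-in for the continuous definition in Eq. (5).

```python
import numpy as np
import torch
import torch.nn.functional as F
from scipy import ndimage

def gradient_atlas(p1: torch.Tensor) -> torch.Tensor:
    """Sobel gradient magnitude of a (B, 1, H, W) map, Eqs. (2)-(3)."""
    kx = torch.tensor([[1., 0., -1.], [2., 0., -2.], [1., 0., -1.]]).view(1, 1, 3, 3)
    ky = torch.tensor([[1., 2., 1.], [0., 0., 0.], [-1., -2., -1.]]).view(1, 1, 3, 3)
    gx = F.conv2d(p1, kx, padding=1)
    gy = F.conv2d(p1, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def uncertainty_atlas(p1: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    """Eq. (4): 1 at predictions equal to the threshold t, 0 at 0 and 1."""
    return 1.0 - torch.abs(p1 - t) / t

def level_set_atlas(mask: np.ndarray) -> np.ndarray:
    """Eq. (5), up to discretization: signed distance field that is negative
    inside the binary tumor mask and positive outside it."""
    inside = ndimage.distance_transform_edt(mask)
    outside = ndimage.distance_transform_edt(1 - mask)
    return outside - inside

p1 = torch.rand(1, 1, 64, 64)          # sigmoid output of the decoder layer
g, u = gradient_atlas(p1), uncertainty_atlas(p1)
ls = level_set_atlas((p1.numpy()[0, 0] > 0.5).astype(np.uint8))
```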


Fig. 3. Architecture of the proposed MABA module.

Fig. 4. The CCM for the high-level skip connection. Three dilated convolution kernels are used in context feature extraction, represented by dashed boxes in blue, green, and light red, respectively. The orange color in each convolution kernel indicates the regions of the input feature map to be convolved, and the gray indicates the regions skipped by the dilated convolution.

3.3. Skip connection based on CCM

In this section, we propose CCM to enhance skip connections, refining features and giving the ROIs of tumors in high-level layers more attention, as shown in Fig. 4. Inspired by ASPP (Chen et al., 2017b), we use three parallel dilated convolution branches and a constant (identity) concatenation to extract context features of multiple receptive fields. Considering the low occupancy of tumors in the image and the small spatial size of the layer preceding the bottleneck layer, the dilated rates are set to 1, 3, 5. Given the input feature map X ∈ R^{W×H×80}, we obtain three features of different receptive fields (f_dc^{r=1}, f_dc^{r=3}, f_dc^{r=5}) by dilated convolutions, and CCM extracts the multi-scale feature as follows:

$$X' = \tau\left(\phi\left(X, f_{dc}^{r=1}, f_{dc}^{r=3}, f_{dc}^{r=5}\right)\right) \tag{6}$$

where φ(·) represents a concatenation operation and τ(·) represents a 1×1 convolution. X′ ∈ R^{W×H×80} is the integrated and fused feature.

Then, we improve the long-term dependence of the integrated multi-scale feature with a self-attention mechanism. We perform a position embedding operation (Vaswani et al., 2017) on the feature map X′ and expand it into two one-dimensional sequences, row-by-row (X′_row ∈ R^{WH×1×80}) and column-by-column (X′_col ∈ R^{1×WH×80}). To calibrate both axes, we employ gated axial attention (Valanarasu et al., 2021). Take X′_row (same for X′_col) as an example:

$$y_m = \sum_{n \in \mathcal{N}} \mathrm{softmax}_n\left(q_m^{T} k_n + G_Q\, q_m^{T} r_{n'}^{q} + G_K\, k_n^{T} r_{n'}^{k}\right)\left(G_{V1}\, v_n + G_{V2}\, r_{n'}^{v}\right) \tag{7}$$

where m, n are positions in X′_row, y_m represents the output at point m, and 𝒩 represents the local receptive field region centered on m. q, k, v are the query, key, and value; r^q_{n′}, r^k_{n′}, r^v_{n′} ∈ R^{W×W} denote relative positional encodings for q, k, v, respectively. G_Q, G_K, G_{V1}, G_{V2} are four attention gates added to the queries, keys, and values, respectively. All parameters are trainable.
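A minimal sketch of the context-extraction half of CCM (Eq. (6)) is shown below; it keeps the three parallel dilated branches plus the identity concatenation and the 1×1 fusion, while the gated axial-attention stage of Eq. (7) is left out for brevity. The class name and keeping 80 channels throughout are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CCMContext(nn.Module):
    """Multi-receptive-field context extraction of Eq. (6)."""

    def __init__(self, channels: int = 80):
        super().__init__()
        # Three parallel 3x3 dilated convolutions with rates 1, 3, 5;
        # padding equal to the rate preserves the spatial size.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in (1, 3, 5)
        )
        # tau(.): a 1x1 convolution fusing the concatenated features phi(.).
        self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x] + [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

ccm = CCMContext(80)
print(ccm(torch.randn(1, 80, 8, 8)).shape)   # torch.Size([1, 80, 8, 8])
```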
3.4. Loss function

Following Zhou et al. (2022), we adopt the binary cross entropy loss (L_BCE) as the loss function:

$$L_{BCE} = -\sum_{n=1}^{N} \left[ p_c \cdot \hat{y}_n \cdot \log\left(y_n\right) + \left(1 - \hat{y}_n\right) \cdot \log\left(1 - y_n\right) \right] \tag{8}$$

where y_n ∈ [0, 1] denotes the predicted probability map and ŷ_n ∈ {0, 1} denotes the ground truth corresponding to y_n. N is the batch size, and p_c is a hyper-parameter introduced because of the class imbalance problem; we set it to 6 by experience.

BCE loss is used in both branches, as shown in Fig. 2: L_AS for the area branch and L_BS for the boundary branch. L_AS measures the difference between the area prediction and the area ground truth; L_BS measures the difference between the boundary-aware result generated by the MABA module and the boundary ground truth. Thus, the total loss L_total is calculated by adding both losses:

$$L_{total} = \alpha L_{AS} + (1 - \alpha) L_{BS} \tag{9}$$

where α and 1 − α are the weights of L_AS and L_BS, respectively.
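The sketch below renders Eqs. (8)–(9) directly in PyTorch. The mean reduction over pixels and the function names are our choices, while p_c = 6 and α = 0.5 follow the paper.

```python
import torch

def weighted_bce(pred: torch.Tensor, target: torch.Tensor, pc: float = 6.0) -> torch.Tensor:
    """Eq. (8): BCE with the positive (tumor) term up-weighted by pc."""
    eps = 1e-7
    pred = pred.clamp(eps, 1.0 - eps)
    loss = -(pc * target * torch.log(pred) + (1.0 - target) * torch.log(1.0 - pred))
    return loss.mean()

def total_loss(area_pred, area_gt, boundary_pred, boundary_gt, alpha: float = 0.5):
    """Eq. (9): weighted sum of the area-branch and boundary-branch losses."""
    l_as = weighted_bce(area_pred, area_gt)
    l_bs = weighted_bce(boundary_pred, boundary_gt)
    return alpha * l_as + (1.0 - alpha) * l_bs
```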

4. Experiment

4.1. Experimental settings

4.1.1. Dataset
In the experiments, we use a publicly available STS dataset as well as a lymphoma dataset for training and testing. PET/CT image pairs are included in the initial data format. The size of each PET slice is 128 × 128 pixels, and the size of each CT slice is 512 × 512 pixels.

STS dataset¹: The publicly available STS dataset contains 51 histologically confirmed cases of STS of the extremities. FDG-PET scan images of the STS data were acquired on a PET/CT scanner (Discovery ST, GE Healthcare, Waukesha, WI) at the McGill University Health Centre (MUHC) from November 2004 to November 2011. FDG at a median of 420 MBq (210–620 MBq) was injected intravenously, and whole-body 2D imaging acquisition was performed after 60 min at a median of 180 s (160–300 s) per patient. All patients' tumor areas were manually mapped slice-by-slice by a radiation oncologist on the basis of T2FS scans. The rigid alignment software MIM (MIM Software Inc., Cleveland, OH) was used to propagate the contours into the FDG-PET images. For patients with visible edema around the tumor, the invisible edema data was used.

¹ https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=21266533


Lymphoma dataset: The lymphoma dataset contains 54 cases collected between September 2016 and June 2019 at the General Hospital of Shenyang Military Area Command. The PET/CT scans were performed with the GE PET/CT Discovery VCT system on the XT platform. Tumor areas were manually mapped by three specialized radiation oncologists and delineated on CT sections by analyzing consecutive axial slices containing lymphomas from whole-body PET/CT scans. To reduce inter-observer variation, the annotations were further revised by an expert with more than ten years of experience.

For the STS dataset, we select 30 cases (1147 slices in total) from the 51 publicly available patient cases as the training set and 6 cases (258 slices in total) as the validation set; the remaining 15 cases (515 slices in total) form the test set. For the lymphoma dataset, we select 37 cases (1131 slices in total) from the 54 cases as the training set and 7 cases (309 slices in total) as the validation set; the remaining 10 cases (393 slices in total) form the test set. All experiments in this paper are based on this same data division, and all experimental results are calculated on the test set.
4.1.2. Preprocessing
We preprocess the PET/CT images using the method described by Diao et al. (2021). Firstly, we calculate the Standard Uptake Value (SUV) of the PET data and truncate it to [0, 5] while retaining the image size of 128×128. In the second step, we calculate the Hounsfield Unit (HU) intensity of the CT data, truncate the values to [−260, 340], resample the CT size to 128×128 to match the PET images, and align the PET/CT image pairs. Finally, the PET/CT images are normalized by min–max scaling.

Due to the limited dataset size, we perform data augmentation on the PET/CT and ground truth data throughout the training phase. The operations performed are primarily horizontal flip, vertical flip, and random rotation, which improve the generalization capability of the model and lessen the impact of overfitting. The preprocessing flow is shown in Fig. 5. PET/CT image pairs that correspond to the same ground truth are fed into the network as bimodal data.

Fig. 5. Pre-processing of PET/CT image pairs with corresponding ground truth data. The CT images are resampled to the size of 128×128×1, and then the aligned PET/CT image pairs are concatenated as the input data.
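A compact sketch of this preprocessing, under the assumption that the SUV and HU slices are already available as arrays and that the CT slice has been resampled to the 128×128 PET grid:

```python
import numpy as np

def preprocess_pair(suv: np.ndarray, hu: np.ndarray) -> np.ndarray:
    """Truncate SUV to [0, 5] and HU to [-260, 340], min-max scale each
    modality, and stack the pair as a 2-channel network input."""
    suv = np.clip(suv, 0.0, 5.0)
    hu = np.clip(hu, -260.0, 340.0)

    def minmax(x: np.ndarray) -> np.ndarray:
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    return np.stack([minmax(suv), minmax(hu)], axis=0)   # (2, 128, 128)
```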
4.1.3. Implementation details
All of the models are implemented on a workstation with an NVIDIA GeForce RTX 3070 GPU using the PyTorch framework. The batch size is set to 32, the number of epochs is set to 100, AdamW (Loshchilov and Hutter, 2019) is selected as the optimizer, the initial learning rate lr is set to 0.0001, α in the loss function is set to 0.5, and early stopping ends the training process when the validation loss no longer declines for 10 consecutive epochs. Each epoch takes around 2 min, and the whole training process takes about 120 min. The parameter information of our experiment is listed in Table 1.

Table 1
Experiment setting.

| Parameter | Batch size | Epoch | Optimizer | Learning rate | α   | Early stopping |
|-----------|-----------|-------|-----------|---------------|-----|----------------|
| Setting   | 32        | 100   | AdamW     | 0.0001        | 0.5 | True           |
4.2. Evaluation metrics

Patient-wise evaluation metrics, i.e., metrics computed at the individual patient level, including the patient-wise Dice similarity coefficient (DSC_pw), patient-wise Recall (Rec_pw), and patient-wise Intersection over Union (IOU_pw), are used in this paper for quantitative analysis. These three metrics share the value range [0, 1], and higher values indicate better segmentation. In addition, the arithmetic average of the three metrics above, also ranging over [0, 1], is introduced as a composite patient-wise metric DRI_pw; a higher DRI_pw value indicates better overall performance of the model. DRI_pw is defined as follows:

$$DRI_{pw} = \frac{DSC_{pw} + Rec_{pw} + IOU_{pw}}{3} \tag{10}$$
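For one patient, the four metrics can be computed from binary masks as in the sketch below (the function name is ours); DRI_pw is simply the arithmetic mean of Eq. (10).

```python
import numpy as np

def patient_metrics(pred: np.ndarray, gt: np.ndarray):
    """DSC, Recall, IoU, and their mean DRI (Eq. (10)) for one patient's
    binary prediction and ground-truth masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    dsc = 2.0 * tp / (pred.sum() + gt.sum() + 1e-8)
    rec = tp / (gt.sum() + 1e-8)
    iou = tp / (np.logical_or(pred, gt).sum() + 1e-8)
    return dsc, rec, iou, (dsc + rec + iou) / 3.0
```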
4.3. Ablation experiment

To validate the contributions of all parts of our model, we perform ablation experiments demonstrating that the proposed modules have positive impacts on the effectiveness of tumor segmentation prediction. First, we build the UNet-like baseline network using EfficientNet-B0 as the encoder to form E-UNet. Then, we add the proposed CCM and MABA module to the baseline to create C²-UNet and BA-UNet, respectively. Finally, both modules are added to create the proposed C²BA-UNet. Note that, before the introduction of the MABA module, all predictions correspond to tumor area segmentations only.

The quantitative results of our segmentation ablation experiments on the STS and lymphoma datasets are listed in Table 2. It can be seen that C²BA-UNet consistently achieves promising results in DSC_pw compared with its variants across both datasets. On the STS dataset, C²BA-UNet outperforms E-UNet by 3.6%, C²-UNet by 2.8%, and BA-UNet by 2.4%. On the lymphoma dataset, C²BA-UNet outperforms E-UNet by 1.9%, C²-UNet by 1.3%, and BA-UNet by 1.9%. In addition, compared to the baseline network E-UNet, the variant C²-UNet improves DSC_pw by 0.8% and 0.6% on the STS and lymphoma datasets, respectively, and BA-UNet improves it by 1.2% on the STS dataset. Fig. 6 shows qualitative results on both datasets. We can see that adding only CCM reduces the noise and redundant features extracted by the encoder, so the decoder can better take advantage of the coordination information to generate tumor area segmentation results. When only the MABA module is added, boundary-aware segmentation is well represented, but interfering area information remains. By adding both CCM and the MABA module to form C²BA-UNet, the model can acquire context-dependent information and optimized boundary features; as a result, tumor area segmentation is learnt well, and the predicted boundaries are more realistic thanks to the multiple boundary-aware approaches.

4.4. Comparison experiment

To validate the superiority of C²BA-UNet on tumor segmentation, we evaluate our model against five comparison methods used for 2D tumor segmentation: UNet (Ronneberger et al., 2015), UNet++ (Zhou et al., 2018), DiSegNet (Xu et al., 2021), MSAM (Fu et al., 2021), and CoFeatureModel (Kumar et al., 2019). All models are trained and tested on the STS dataset and the lymphoma dataset. For fairness, they all use the same training, validation, and test split and the same experiment settings.


Table 2
Quantitative results of ablation experiments on the STS and Lymphoma datasets.

| Dataset  | Method    | EfficientNet-B0 Encoder | CCM | MABA | DSC_pw | Rec_pw | IOU_pw | DRI_pw |
|----------|-----------|-------------------------|-----|------|--------|--------|--------|--------|
| STS      | E-UNet    | ✓                       | –   | –    | 0.745  | 0.733  | 0.615  | 0.698  |
| STS      | C²-UNet   | ✓                       | ✓   | –    | 0.753  | 0.783  | 0.623  | 0.719  |
| STS      | BA-UNet   | ✓                       | –   | ✓    | 0.757  | 0.806  | 0.627  | 0.730  |
| STS      | C²BA-UNet | ✓                       | ✓   | ✓    | 0.781  | 0.837  | 0.649  | 0.756  |
| Lymphoma | E-UNet    | ✓                       | –   | –    | 0.815  | 0.868  | 0.696  | 0.793  |
| Lymphoma | C²-UNet   | ✓                       | ✓   | –    | 0.821  | 0.866  | 0.707  | 0.798  |
| Lymphoma | BA-UNet   | ✓                       | –   | ✓    | 0.815  | 0.901  | 0.695  | 0.804  |
| Lymphoma | C²BA-UNet | ✓                       | ✓   | ✓    | 0.834  | 0.902  | 0.722  | 0.819  |

* The definitions of DSC_pw, Rec_pw, IOU_pw, DRI_pw are given in Section 4.2.
* The best values are highlighted in bold.

Table 3
The proposed C²BA-UNet is evaluated against other comparison methods on quantitative metrics on the STS dataset.

| Method                              | FLOPs (GMac) | DSC_pw | Rec_pw | IOU_pw | DRI_pw |
|-------------------------------------|--------------|--------|--------|--------|--------|
| UNet (Ronneberger et al., 2015)     | 16.37        | 0.744  | 0.763  | 0.620  | 0.709  |
| UNet++ (Zhou et al., 2018)          | 34.53        | 0.762  | 0.785  | 0.643  | 0.730  |
| DiSegNet (Xu et al., 2021)          | 15.05        | 0.738  | 0.784  | 0.623  | 0.715  |
| MSAM (Fu et al., 2021)              | 36.45        | 0.761  | 0.743  | 0.634  | 0.713  |
| CoFeatureModel (Kumar et al., 2019) | 5.93         | 0.748  | 0.772  | 0.632  | 0.717  |
| C²BA-UNet (Ours)                    | 2.37         | 0.781  | 0.837  | 0.649  | 0.756  |

* The FLOPs metric indicates floating point operations and is discussed in Section 4.4.1.
* The best values are highlighted in bold.

Fig. 6. Heatmap results on test images from the STS (first row) and lymphoma (second row) datasets. The highlighted regions demonstrate the ability of C²BA-UNet to better utilize tumor boundary features and area information than the other methods.

4.4.1. STS segmentation results
Among the methods for STS segmentation listed in Table 3, DiSegNet (Xu et al., 2021) has the lowest DSC_pw at 0.738, MSAM (Fu et al., 2021) has the lowest Rec_pw at 0.743, and UNet (Ronneberger et al., 2015) has the lowest IOU_pw at 0.620 and the lowest DRI_pw at 0.709. On the contrary, our C²BA-UNet yields optimal performance in all four metrics (DSC_pw of 0.781, Rec_pw of 0.837, IOU_pw of 0.649, and DRI_pw of 0.756) compared to the other models. Due to the use of EfficientNet, which significantly reduces the number of channels in the encoder, our model also has the lowest computational complexity of the compared models; the floating point operations (FLOPs) in Table 3 show this benefit.
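FLOPs figures like those in Table 3 can be obtained with complexity counters; the ptflops package is one such tool (our assumption — the paper does not state which counter was used):

```python
import torch.nn as nn
from ptflops import get_model_complexity_info

model = nn.Conv2d(2, 1, 3, padding=1)    # stand-in for any compared network
macs, params = get_model_complexity_info(
    model, (2, 128, 128),                # the 2-channel 128x128 PET/CT input
    as_strings=True, print_per_layer_stat=False,
)
print(macs, params)                      # Table 3 reports GMac at this input size
```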
For qualitative analysis, Fig. 7 shows three patient slices from the test set and the visualization results of the six segmentation networks. It can be seen that C²BA-UNet fuses the boundary information well into the final segmentation result and can effectively inhibit non-tumor areas.

4.4.2. Lymphoma segmentation results
Table 4 lists the quantitative results on the lymphoma dataset. Among these metrics, DiSegNet (Xu et al., 2021) has the lowest DSC_pw at 0.806, CoFeatureModel (Kumar et al., 2019) has the lowest Rec_pw at 0.862, and DiSegNet also has the lowest IOU_pw at 0.689 and the lowest DRI_pw at 0.796. Our proposed network yields optimal performance in all four metrics (DSC_pw of 0.834, Rec_pw of 0.902, IOU_pw of 0.722, and DRI_pw of 0.819) compared to the other models. For qualitative analysis, Fig. 8 shows three patient slices from the test set and the results of the six segmentation methods. Note that, for the slice shown in the first row, UNet++ (Zhou et al., 2018), DiSegNet (Xu et al., 2021), MSAM (Fu et al., 2021), and CoFeatureModel (Kumar et al., 2019) all have scattered false positive areas, which are mis-segmentations. In contrast, the false positive areas of C²BA-UNet lie around the tumor segmentation result and are mostly distributed in the lower left region of the tumor, which may be part of the uncertainty regions caused by missing labels during manual contouring. This suggests that our method can provide a more reliable signal for tumor segmentation.

4.4.3. Tumor boundary segmentation results
We evaluate the effect of the proposed C²BA-UNet and the comparison methods on tumor boundary segmentation, as shown in Fig. 9. All methods, including ours, exhibit over-segmentation. The comparison methods lack optimization and learning specifically for tumor boundaries and produce over-segmentation when tumors are small or irregularly shaped. However, C²BA-UNet performs multi-strategy and multi-scale awareness of tumor boundaries to eliminate some false positive areas and can achieve more realistic segmentation results related to tumor boundaries.


Fig. 7. Qualitative results of slices from three test patients on the STS dataset. From left to right, the first three columns are the original CT, the original PET, and the Ground Truth; the other columns are the results of the various segmentation methods. Yellow shows correctly predicted tumor segmentation, red regions indicate false positive pixels, and green regions indicate false negative pixels.

Fig. 8. Qualitative results of slices from three test patients on the Lymphoma dataset. From left to right, the first three columns are the original CT, the original PET, and the Ground Truth; the other columns are the results of the various segmentation methods. Yellow shows correctly predicted tumor segmentation, red regions indicate false positive pixels, and green regions indicate false negative pixels.

Table 4
The proposed C²BA-UNet is evaluated against other comparison methods on quantitative metrics on the Lymphoma dataset.

| Method                              | FLOPs (GMac) | DSC_pw | Rec_pw | IOU_pw | DRI_pw |
|-------------------------------------|--------------|--------|--------|--------|--------|
| UNet (Ronneberger et al., 2015)     | 16.37        | 0.821  | 0.878  | 0.711  | 0.803  |
| UNet++ (Zhou et al., 2018)          | 34.53        | 0.831  | 0.885  | 0.721  | 0.812  |
| DiSegNet (Xu et al., 2021)          | 15.05        | 0.806  | 0.895  | 0.689  | 0.796  |
| MSAM (Fu et al., 2021)              | 36.45        | 0.829  | 0.875  | 0.721  | 0.808  |
| CoFeatureModel (Kumar et al., 2019) | 5.93         | 0.827  | 0.862  | 0.717  | 0.802  |
| C²BA-UNet (Ours)                    | 2.37         | 0.834  | 0.902  | 0.722  | 0.819  |

* The FLOPs metric indicates floating point operations and is discussed in Section 4.4.1.
* The best values are highlighted in bold.

5. Discussion

5.1. Different atlas configurations in MABA module

To explore the impact of different atlas configurations on experimental performance, we evaluate the MABA module with six variants, including three single-atlas strategies (gradient atlas only (MABA_G), uncertainty atlas only (MABA_U), level set atlas only (MABA_L)) and three dual-atlas strategies (gradient atlas with uncertainty atlas (MABA_GU), gradient atlas with level set atlas (MABA_GL), uncertainty atlas with level set atlas (MABA_UL)).

Fig. 10 shows the DSC_pw for all configurations. Generally, the results show that the MABA module has robust and stable performance across the different variants. Moreover, our MABA module with the triple atlas obtains better performance, with DSC_pw increased by 1.2%, 2.4%, 1.2%, 2.2%, 3.9%, and 3.3% compared to MABA_G, MABA_U, MABA_L, MABA_GU, MABA_GL, and MABA_UL, respectively. This observation demonstrates that all of the introduced atlases are beneficial for learning and optimizing tumor segmentation predictions. The highest performance is obtained by combining all three approaches, so our module can play a positive role in tumor boundary segmentation.

5.2. Different dilated rates in CCM

To study the effect of different receptive field sizes on tumor segmentation, we further evaluate the performance of different multi-scale dilated convolutions, covering all combinations of 3×3 convolution kernels with dilated rates from 1 to 5. Fig. 11 illustrates the DSC_pw for all combinations on the STS dataset.


Fig. 9. Qualitative results of tumor boundary segmentations. From left to right, the first three columns are the original CT, the original PET, and the Ground Truth; the other columns are the results of the various segmentation methods. The backgrounds of the results are PET images, and the images are scaled and cropped for better display. The yellow line indicates the ground truth, and the red line indicates the predicted results. The first two rows are from the STS dataset, and the third row is from the Lymphoma dataset.

Fig. 10. The DSC_pw performance for different atlas configurations in the MABA module on the STS dataset.

Fig. 12. Rec_pw–DSC_pw distribution with different p_c values on the STS dataset. The tuple (p_c, Rec_pw, DSC_pw) indicates the optimal Rec_pw and DSC_pw under the corresponding p_c setting. The figure is divided into four regions, and p_c produces a better result when the distribution is located in region (b).

Suppose the dilated rates of the three dilated convolutions are rate1, rate2, rate3, and let the sequence (rate1, rate2, rate3) denote the corresponding multi-scale convolution. Specifically, in our experiment, the dilated rate combination of the dilated convolutions in CCM is (1, 3, 5), which increases DSC_pw by 3.1%, 1.5%, 0.8%, 1.8%, 2.9%, 2.2%, 3.8%, 1.3%, and 1.2% compared to (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), and (3, 4, 5), respectively. As a result, increasing the gap between the dilated rates of multi-scale dilated convolutions can effectively improve the model performance.

Fig. 11. The DSC_pw performance for different multi-scale convolutions in CCM on the STS dataset. The red color indicates the best performance.

5.3. Different values of hyper-parameter p_c for class balance

Because of the low occupancy of tumors in the image, the setting of the hyper-parameter p_c can have a crucial impact on the model training phase. We examine the performance of the model by varying p_c from 1 to 10 and plot the Rec_pw–DSC_pw distribution on the STS dataset, as shown in Fig. 12. It can be seen from the distribution that the highest Rec_pw of 0.865 is obtained with p_c set to 7, but the corresponding DSC_pw is low at 0.753. When p_c is set to 6, the best DSC_pw of 0.781 is obtained while still keeping a high Rec_pw of 0.837. Meanwhile, Fig. 13 shows that the highest DRI_pw also occurs at a p_c setting of 6. This demonstrates that the model with p_c = 6 achieves an optimal class balance and better overall segmentation results than other p_c settings, and it also improves the robustness of our model.

5.4. Comparisons of different loss functions

To investigate the impact of different loss functions on our model, we perform experiments on the STS dataset using Dice Loss (Milletari et al., 2016) and different Shannon-based loss functions, including BCE Loss, Focal Loss (Lin et al., 2020), Tversky Loss (Salehi et al., 2017), and our Weighted BCE Loss. The values of the losses in each epoch are shown in Fig. 14. As listed in Table 5, the performances are stable and accurate across all compared loss functions.


Table 5
Segmentation performance of different losses on the STS dataset.

| Losses                              | DSC_pw | Rec_pw | IOU_pw | DRI_pw |
|-------------------------------------|--------|--------|--------|--------|
| BCE Loss                            | 0.743  | 0.719  | 0.612  | 0.691  |
| Dice Loss (Milletari et al., 2016)  | 0.779  | 0.807  | 0.653  | 0.746  |
| Focal Loss (Lin et al., 2020)       | 0.762  | 0.752  | 0.635  | 0.716  |
| Tversky Loss (Salehi et al., 2017)  | 0.787  | 0.804  | 0.659  | 0.750  |
| Weighted BCE Loss (Ours)            | 0.781  | 0.837  | 0.649  | 0.756  |

* The best values are highlighted in bold.

It can be seen that the Weighted BCE Loss provides promising performance in all metrics compared to the other Shannon-based losses. This is owing to the weight term we introduce, which alleviates the class imbalance between positive and negative samples effectively and conveniently.

Fig. 14. The performance curves of different loss functions in each epoch. (a) training set, (b) validation set.

5.5. Impact of data augmentation on model performance

The data augmentation in preprocessing also has a critical impact on the performance of a deep learning model, so we explore its impact on the segmentation performance of PET/CT images. We compare the method we apply for preprocessing PET/CT images (see Section 4.1.2) with other methods, namely Riemannian random walk (RRW) and manifold sampling (MS) (Chadebec and Allassonnière, 2021).

As seen in Table 6, when RRW and MS are applied only to the CT images, the performance is at some level comparable to that reported in this paper. However, we experimentally observe that when the PET modality is involved in the data augmentation operation, the performance of the corresponding model decreases; especially when we apply RRW or MS to PET/CT simultaneously, the results become very low. This is because of the low resolution of PET images, which makes them highly sensitive to data augmentation methods such as RRW and MS and less representative of lesion regions, leading to a decrease in overall performance.

Fig. 13. Line graph of the comprehensive metric DRI_pw with different p_c settings on the STS dataset.

5.6. Failure cases

As shown in Fig. 15, although the proposed method has an impressive performance, it still generates failure cases. When the similarity between tumors and normal tissues is too high, the model produces wrong predictions (Fig. 15 (a, b)). In addition, the prediction can deviate due to the presence of FDG-avid organs in the PET images (Fig. 15 (c, d)).

6. Conclusion

In this paper, we propose C²BA-UNet, a novel UNet-like model for PET/CT images based tumor segmentation. For efficient feature extraction, we use EfficientNet-B0 as the encoder. To learn and optimize the challenging tumor boundary segmentation, we design the MABA module, which uses three model-driven approaches with varied focuses on the uncertain regions between tumors and adjacent tissues. In addition, we propose CCM to improve the tumor-related feature response while keeping the network efficient and light. The superiority of our proposed method is verified on a publicly available STS dataset and an internal lymphoma PET/CT dataset. The results show that C²BA-UNet achieves superior performance in all evaluated metrics against the other comparison models. In summary, the proposed model can combine tumor boundary and area features to achieve better tumor segmentation on small-scale datasets.

Considering the limitations of our proposed model in learning 3D imaging information, in future work we will keep exploring tumor segmentation methods that combine both boundary-aware and area-based approaches for 3D images, and we will study more tumor datasets to improve the generalization of our method. We will also continue to study the automatic selection of the hyper-parameters and weight values in subsequent works.


Fig. 15. Illustration of failure cases with PET image as background. Images are scaled and cropped for better display. The yellow line gives the ground truth, and the red line
shows our predicted results.

Table 6
Segmentation performance of the model with different data augmentation methods on the STS dataset.

| Network        | Input     | DSC_pw | Rec_pw | IOU_pw | DRI_pw |
|----------------|-----------|--------|--------|--------|--------|
| C²BA-UNet_RRW  | PET/CT†   | 0.772  | 0.838  | 0.636  | 0.749  |
| C²BA-UNet_RRW  | PET†/CT   | 0.530  | 0.552  | 0.414  | 0.499  |
| C²BA-UNet_RRW  | PET†/CT†  | 0.095  | 0.122  | 0.064  | 0.094  |
| C²BA-UNet_MS   | PET/CT†   | 0.726  | 0.827  | 0.586  | 0.713  |
| C²BA-UNet_MS   | PET†/CT   | 0.549  | 0.568  | 0.434  | 0.517  |
| C²BA-UNet_MS   | PET†/CT†  | 0.099  | 0.121  | 0.067  | 0.096  |
| C²BA-UNet      | PET/CT    | 0.781  | 0.837  | 0.649  | 0.756  |

* C²BA-UNet_RRW and C²BA-UNet_MS indicate that C²BA-UNet uses RRW and MS, respectively, as the data augmentation method.
* PET† and CT† indicate the corresponding images with data augmentation applied.
* The best values are highlighted in bold.

CRediT authorship contribution statement

Shijie Luo: Conceptualization, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Visualization. Huiyan Jiang: Methodology, Resources, Writing – review & editing, Supervision, Project administration, Funding acquisition. Meng Wang: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The authors do not have permission to share data.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61872075) and the Natural Science Foundation of Liaoning Province, China (No. 2021-YGJC-07). We gratefully acknowledge the support of Dr. Guoxiu Lu, Dr. Jia Guo, Dr. Zhiguo Wang, and Mr. Youchao Wang from the nuclear medicine department of the General Hospital of Shenyang Military Area Command.

References

Ali, Z., Irtaza, A., Maqsood, M., 2022. An efficient U-Net framework for lung nodule detection using densely connected dilated convolutions. J. Supercomput. 78 (2), 1602–1623. http://dx.doi.org/10.1007/s11227-021-03845-x.
Alom, M.Z., Yakopcic, C., Hasan, M., Taha, T.M., Asari, V.K., 2019. Recurrent residual U-Net for medical image segmentation. J. Med. Imaging 6 (1), 014006. http://dx.doi.org/10.1117/1.JMI.6.1.014006.
Bhattarai, B., Subedi, R., Gaire, R.R., Vazquez, E., Stoyanov, D., 2022. Histogram of oriented gradients meet deep learning: A novel multi-task deep network for medical image semantic segmentation. arXiv preprint arXiv:2204.01712. http://dx.doi.org/10.48550/arXiv.2204.01712.
Boellaard, R., Delgado-Bolton, R., Oyen, W.J., Giammarile, F., Tatsch, K., Eschner, W., Verzijlbergen, F.J., Barrington, S.F., Pike, L.C., Weber, W.A., et al., 2015. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur. J. Nucl. Med. Mol. Imaging 42 (2), 328–354. http://dx.doi.org/10.1007/s00259-014-2961-x.
Chadebec, C., Allassonnière, S., 2021. Data augmentation with variational autoencoders and manifold sampling. In: Deep Generative Models, and Data Augmentation, Labelling, and Imperfections. Springer International Publishing, pp. 184–192. http://dx.doi.org/10.1007/978-3-030-88210-5_17.
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L., 2017a. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40 (4), 834–848. http://dx.doi.org/10.1109/TPAMI.2017.2699184.
Chen, L.-C., Papandreou, G., Schroff, F., Adam, H., 2017b. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587. http://dx.doi.org/10.48550/arXiv.1706.05587.
Chen, X., Qi, D., Shen, J., 2019. Boundary-aware network for fast and high-accuracy portrait segmentation. arXiv preprint arXiv:1901.03814. http://dx.doi.org/10.48550/arXiv.1901.03814.
Dahab, D.A., Ghoniemy, S.S., Selim, G.M., et al., 2012. Automated brain tumor detection and identification using image processing and probabilistic neural network techniques. Int. J. Image Process. Vis. Commun. 1 (2), 1–8.
Diao, Z., Jiang, H., Han, X.-H., Yao, Y.-D., Shi, T., 2021. EFNet: evidence fusion network for tumor segmentation from PET-CT volumes. Phys. Med. Biol. 66 (20), 205005. http://dx.doi.org/10.1088/1361-6560/ac299a.
Dutta, K., Roy, S., Whitehead, T.D., Luo, J., Jha, A.K., Li, S., Quirk, J.D., Shoghi, K.I., 2021. Deep learning segmentation of triple-negative breast cancer (TNBC) patient derived tumor xenograft (PDX) and sensitivity of radiomic pipeline to tumor probability boundary. Cancers 13 (15), 3795. http://dx.doi.org/10.3390/cancers13153795.
Fu, X., Bi, L., Kumar, A., Fulham, M., Kim, J., 2021. Multimodal spatial attention module for targeting multimodal PET-CT lung tumor segmentation. IEEE J. Biomed. Health Inf. 25 (9), 3507–3516. http://dx.doi.org/10.1109/jbhi.2021.3059453.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27. URL: https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf.
Han, Y., Li, X., Wang, B., Wang, L., 2021. Boundary loss-based 2.5D fully convolutional neural networks approach for segmentation: a case study of the liver and tumor on computed tomography. Algorithms 14 (5), 144. http://dx.doi.org/10.3390/a14050144.
Hu, J., Shen, L., Albanie, S., Sun, G., Wu, E., 2020. Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 42 (8), 2011–2023. http://dx.doi.org/10.1109/TPAMI.2019.2913372.
Huang, H., Lin, L., Tong, R., Hu, H., Zhang, Q., Iwamoto, Y., Han, X., Chen, Y.-W., Wu, J., 2020. Unet 3+: A full-scale connected unet for medical image segmentation. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing. ICASSP, IEEE, pp. 1055–1059. http://dx.doi.org/10.1109/ICASSP40776.2020.9053405.
Huang, H., Zheng, H., Lin, L., Cai, M., Hu, H., Zhang, Q., Chen, Q., Iwamoto, Y., Han, X., Chen, Y.-W., et al., 2021. Medical image segmentation with deep atlas prior. IEEE Trans. Med. Imaging 40 (12), 3519–3530. http://dx.doi.org/10.1109/TMI.2021.3089661.
Kubicek, J., Timkovic, J., Penhaker, M., Oczka, D., Krestanova, A., Augustynek, M., Cernỳ, M., 2019. Retinal blood vessels modeling based on fuzzy sobel edge detection and morphological segmentation. In: Biodevices. pp. 121–126. http://dx.doi.org/10.5220/0007237501210126.
Kumar, A., Fulham, M., Feng, D., Kim, J., 2019. Co-learning feature fusion maps from PET-CT images of lung cancer. IEEE Trans. Med. Imaging 39 (1), 204–217. http://dx.doi.org/10.1109/TMI.2019.2923601.
Le Dinh, T., Lee, S.-H., Kwon, S.-G., Kwon, K.-R., 2022. Cell nuclei segmentation in cryonuseg dataset using nested unet with EfficientNet encoder. In: 2022 International Conference on Electronics, Information, and Communication. ICEIC, IEEE, pp. 1–4. http://dx.doi.org/10.1109/ICEIC54506.2022.9748537.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P., 2020. Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. 42 (2), 318–327. http://dx.doi.org/10.1109/TPAMI.2018.2858826.


Liu, T., Liu, J., Ma, Y., He, J., Han, J., Ding, X., Chen, C.-T., 2021. Spatial feature fusion convolutional network for liver and liver tumor segmentation from CT images. Med. Phys. 48 (1), 264–272. http://dx.doi.org/10.1002/mp.14585.
Loshchilov, I., Hutter, F., 2019. Decoupled weight decay regularization. In: International Conference on Learning Representations. URL: https://openreview.net/forum?id=Bkg6RiCqY7.
Luo, X., Chen, J., Song, T., Wang, G., 2021. Semi-supervised medical image segmentation through dual-task consistency. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35 (10), pp. 8801–8809. http://dx.doi.org/10.48550/arXiv.2009.04448.
Meyer, C., Mallouh, V., Spehner, D., Baudrier, E., Schultz, P., Naegel, B., 2021. Automatic multi class organelle segmentation for cellular fib-sem images. In: 2021 IEEE 18th International Symposium on Biomedical Imaging. ISBI, IEEE, pp. 668–672. http://dx.doi.org/10.1109/ISBI48211.2021.9434075.
Milletari, F., Navab, N., Ahmadi, S.-A., 2016. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision. 3DV, pp. 565–571. http://dx.doi.org/10.1109/3DV.2016.79.
Mok, T.C., Chung, A., 2018. Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks. In: International MICCAI Brainlesion Workshop. Springer, pp. 70–80. http://dx.doi.org/10.1007/978-3-030-11723-8_7.
Nathan, S., Ramamoorthy, S., 2020. Efficient supervision net: Polyp segmentation using EfficientNet and attention unit. In: Working Notes Proceedings of the MediaEval 2020 Workshop, Online, 14-15 December 2020. URL: http://ceur-ws.org/Vol-2882/paper72.pdf.
Pham, V.-T., Tran, T.-T., Wang, P.-C., Chen, P.-Y., Lo, M.-T., 2021. EAR-UNet: A deep learning-based approach for segmentation of tympanic membranes from otoscopic images. Artif. Intell. Med. 115, 102065. http://dx.doi.org/10.1016/j.artmed.2021.102065.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 234–241. http://dx.doi.org/10.1007/978-3-319-24574-4_28.
Salehi, S.S.M., Erdogmus, D., Gholipour, A., 2017. Tversky loss function for image segmentation using 3D fully convolutional deep networks. In: Machine Learning in Medical Imaging. Springer International Publishing, pp. 379–387. http://dx.doi.org/10.1007/978-3-319-67389-9_44.
Sanchez-Vega, F., Mina, M., Armenia, J., Chatila, W.K., Luna, A., La, K.C., Dimitriadoy, S., Liu, D.L., Kantheti, H.S., Saghafinia, S., et al., 2018. Oncogenic signaling pathways in the cancer genome atlas. Cell 173 (2), 321–337. http://dx.doi.org/10.1016/j.cell.2018.03.035.
Sharifrazi, D., Alizadehsani, R., Roshanzamir, M., Joloudari, J.H., Shoeibi, A., Jafari, M., Hussain, S., Sani, Z.A., Hasanzadeh, F., Khozeimeh, F., et al., 2021. Fusion of convolution neural network, support vector machine and sobel filter for accurate detection of COVID-19 patients using X-ray images. Biomed. Signal Process. Control 68, 102622. http://dx.doi.org/10.1016/j.bspc.2021.102622.
Shelhamer, E., Long, J., Darrell, T., 2017. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39 (4), 640–651. http://dx.doi.org/10.1109/TPAMI.2016.2572683.
Sobel, I.E., 1970. Camera Models and Machine Perception. Stanford University.
Soler, L., Delingette, H., Malandain, G., Montagnat, J., Ayache, N., Koehl, C., Dourthe, O., Malassagne, B., Smith, M., Mutter, D., et al., 2001. Fully automatic anatomical, pathological, and functional segmentation from CT scans for hepatic surgery. Comput. Aided Surg. 6 (3), 131–142. http://dx.doi.org/10.3109/10929080109145999.
Tan, M., Le, Q., 2019. EfficientNet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning. PMLR, pp. 6105–6114. URL: https://proceedings.mlr.press/v97/tan19a.html.
Tang, Y., Cai, J., Yan, K., Huang, L., Xie, G., Xiao, J., Lu, J., Lin, G., Lu, L., 2021. Weakly-supervised universal lesion segmentation with regional level set loss. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 515–525. http://dx.doi.org/10.1007/978-3-030-87196-3_48.
Tang, Y., Tang, Y., Zhu, Y., Xiao, J., Summers, R.M., 2020. E²Net: An edge enhanced network for accurate liver and tumor segmentation on CT scans. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 512–522. http://dx.doi.org/10.1007/978-3-030-59719-1_50.
Valanarasu, J.M.J., Oza, P., Hacihaliloglu, I., Patel, V.M., 2021. Medical transformer: Gated axial-attention for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 36–46. http://dx.doi.org/10.1007/978-3-030-87193-2_4.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I., 2017. Attention is all you need. Adv. Neural Inf. Process. Syst., 6000–6010. URL: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
Wang, M., Jiang, H., Shi, T., Yao, Y.-D., 2021a. HD-RDS-UNet: Leveraging spatial-temporal correlation between the decoder feature maps for lymphoma segmentation. IEEE J. Biomed. Health Inf. http://dx.doi.org/10.1109/JBHI.2021.3102612.
Wang, J., Zhang, X., Lv, P., Zhou, L., Wang, H., 2021b. EAR-U-Net: EfficientNet and attention-based residual U-Net for automatic liver segmentation in CT. arXiv preprint arXiv:2110.01014. http://dx.doi.org/10.48550/arXiv.2110.01014.
Wong, D., Liu, J., Fengshou, Y., Tian, Q., Xiong, W., Zhou, J., Qi, Y., Han, T., Venkatesh, S., Wang, S.-c., 2008. A semi-automated method for liver tumor segmentation based on 2D region growing with knowledge-based constraints. In: MICCAI Workshop. Vol. 41. No. 43. http://dx.doi.org/10.54294/25etax.
Xu, G., Cao, H., Udupa, J.K., Tong, Y., Torigian, D.A., 2021. DiSegNet: A deep dilated convolutional encoder-decoder architecture for lymph node segmentation on PET/CT images. Comput. Med. Imaging Graph. 88, 101851. http://dx.doi.org/10.1016/j.compmedimag.2020.101851.
Yang, L., Zhang, Y., Chen, J., Zhang, S., Chen, D.Z., 2017. Suggestive annotation: A deep active learning framework for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 399–407. http://dx.doi.org/10.1007/978-3-319-66179-7_46.
Yu, F., Koltun, V., 2016. Multi-scale context aggregation by dilated convolutions. In: International Conference on Learning Representations. http://dx.doi.org/10.48550/arXiv.1511.07122.
Yu, Q., Xie, L., Wang, Y., Zhou, Y., Fishman, E.K., Yuille, A.L., 2018. Recurrent saliency transformation network: Incorporating multi-stage visual cues for small organ segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 8280–8289. http://dx.doi.org/10.1109/CVPR.2018.00864.
Zaimy, M., Saffarzadeh, N., Mohammadi, A., Pourghadamyari, H., Izadi, P., Sarli, A., Moghaddam, L., Paschepari, S., Azizi, H., Torkamandi, S., et al., 2017. New methods in the diagnosis of cancer and gene therapy of cancer based on nanoparticles. Cancer Gene Ther. 24 (6), 233–243. http://dx.doi.org/10.1038/cgt.2017.16.
Zhang, R., Li, G., Li, Z., Cui, S., Qian, D., Yu, Y., 2020. Adaptive context selection for polyp segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 253–262. http://dx.doi.org/10.1007/978-3-030-59725-2_25.
Zhang, Y., Zhang, J., 2021. Dual-task mutual learning for semi-supervised medical image segmentation. In: Chinese Conference on Pattern Recognition and Computer Vision. PRCV, Springer, pp. 548–559. http://dx.doi.org/10.1007/978-3-030-88010-1_46.
Zhao, X., Zhang, P., Song, F., Fan, G., Sun, Y., Wang, Y., Tian, Z., Zhang, L., Zhang, G., 2021. D2A U-Net: Automatic segmentation of COVID-19 CT slices based on dual attention and hybrid dilated convolution. Comput. Biol. Med. 135, 104526. http://dx.doi.org/10.1016/j.compbiomed.2021.104526.
Zhou, Q., Qin, J., Xiang, X., Tan, Y., Ren, Y., 2022. MOLS-Net: Multi-organ and lesion segmentation network based on sequence feature pyramid and attention mechanism for aortic dissection diagnosis. Knowl.-Based Syst. 239, 107853. http://dx.doi.org/10.1016/j.knosys.2021.107853.
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J., 2018. UNet++: A nested U-Net architecture for medical image segmentation. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, pp. 3–11. http://dx.doi.org/10.1007/978-3-030-00889-5_1.
