Expert Systems With Applications 256 (2024) 124903

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

Neural network-based reversible data hiding for medical image✩


Ping Kong a, Yongdong Zhang b, Lin Huang c, Liang Zhou a, Lifan Chen d, Chuan Qin c,∗

a Jiading District Central Hospital Affiliated Shanghai University of Medicine and Health Sciences, Shanghai 201318, China
b School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
c School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
d Collaborative Innovation Center for Biomedicine, Shanghai University of Medicine and Health Sciences, Shanghai, China

ARTICLE INFO

Keywords: Reversible data hiding; Prediction error expansion; Neural network; Medical image

ABSTRACT

Pixel prediction is an important issue in the field of reversible data hiding. Neural networks are increasingly used to improve the accuracy of pixel prediction owing to their excellent performance. However, current neural network-based pixel predictors are designed for natural images and do not consider the characteristics of medical images. Therefore, in this paper, we propose a dual-branch neural network-based reversible data hiding scheme for medical images. Specifically, considering the characteristics of medical images, in which complex and smooth regions are more clearly distinguished, we present a clustering method to classify pixels into three classes according to their complexities, and generate masks to assist pixel prediction. Then, in the prediction stage, a dual-branch neural network-based pixel predictor is designed to extract unique and shared features, and a convolutional block attention module is used to optimize the extracted features. Finally, in the embedding stage, considering the characteristics of region of interest (ROI) and region of non-interest (NROI) in medical images, we design a class-based embedding algorithm, which prioritizes embedding data into the NROI with low complexity and then sequentially into the low texture complexity region and the high texture complexity region of the ROI. Experimental results show that our scheme achieves better pixel prediction and data embedding performance than existing state-of-the-art works.

✩ This research was supported in part by the Shanghai Pujiang Program under Grant 22PJD031, in part by the National Natural Science Foundation of China under Grants 62172280 and U20B2051, and in part by the Natural Science Foundation of Shanghai under Grant 21ZR1444600.
∗ Corresponding author.
E-mail addresses: [email protected] (P. Kong), [email protected] (Y. Zhang), [email protected] (L. Huang), [email protected] (L. Zhou), [email protected] (L. Chen), [email protected] (C. Qin).

https://doi.org/10.1016/j.eswa.2024.124903
Received 3 April 2024; Received in revised form 10 July 2024; Accepted 25 July 2024
Available online 27 July 2024
0957-4174/© 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

The development of multimedia technology and the widespread adoption of communication technologies drive industrial transformation and technological innovation. This trend propels the healthcare industry into the era of electronic data, and the transmission of medical data becomes more efficient and convenient. However, the transmission of medical data faces many security problems, such as the leakage and tampering of medical data. To address these security issues, researchers apply steganography techniques to the healthcare industry (Karakus & Avci, 2020; Mortazavian et al., 2004; Yildirim, 2021). Over the past two decades, researchers have been working on developing effective steganography techniques to enhance imperceptibility, capacity, and security (Ker, 2005; Shang et al., 2023; Wu & Wang, 2014). Generally, steganography can only ensure lossless extraction of embedded data in a lossless channel and cannot recover the cover image losslessly. However, in the sensitive medical field, the cover image is also important, because subtle changes may lead to serious consequences. Fortunately, reversible data hiding (RDH) provides a way that can not only extract secret data correctly, but also recover the cover image losslessly. Therefore, RDH has good application prospects in the medical field.

In general, RDH schemes can be divided into the following categories: (1) lossless compression-based schemes (Celik et al., 2005; Kalker & Willems, 2002; Xuan et al., 2004), (2) difference expansion-based schemes (Alattar, 2004; Lee et al., 2008; Tian, 2003), (3) histogram shifting-based schemes (Li, Li, Yang & Zeng, 2013; Ni et al., 2006; Qi et al., 2019; Wang et al., 2019), and (4) prediction error expansion-based schemes (He & Cai, 2021; He et al., 2018; Li, Li, Li & Yang, 2013; Qi et al., 2023). A lossless compression-based scheme generates redundant space for data embedding by lossless compression. A difference expansion-based scheme embeds data by expanding the difference between neighboring pixels. A histogram shifting-based scheme accomplishes data embedding by shifting some bins of the image histogram. The scheme based on prediction error expansion (PEE) is widely used due to its excellent performance, and it is the focus of this paper. The PEE-based RDH scheme (abbreviated as PEE-RDH) mainly contains two stages: (1) pixel prediction, which focuses on generating sharp prediction error histograms (PEH) based on the correlation between the reference pixels and the to-be-predicted pixels, thus improving the embedding capability of PEE-RDH; (2) data embedding, which focuses on designing specific embedding algorithms to minimize the image distortion with a given embedding capability.
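To make the two stages concrete, the following is a minimal sketch of PEE on a one-dimensional toy signal, not the authors' implementation. It uses the zero-bin expansion rule that the paper gives later as Eqs. (10)–(12). The left-neighbor predictor, the function names `embed` and `extract`, and the omission of overflow handling (the paper uses a location map for that) are all simplifying assumptions made for illustration.

```python
import numpy as np

def _predict(samples):
    # Toy predictor (assumption): each odd-index sample is predicted by its
    # left, even-index neighbor. Even samples are never modified, so the
    # receiver can recompute exactly the same predictions.
    return samples[0::2][: len(samples) // 2]

def embed(cover, bits):
    """Embed bits into the zero bin of the prediction-error histogram."""
    stego = np.asarray(cover, dtype=np.int64).copy()
    pred = _predict(stego)
    err = stego[1::2] - pred
    capacity = int(np.count_nonzero(err == 0))   # height of the zero bin
    if len(bits) > capacity:
        raise ValueError("payload exceeds zero-bin capacity")
    payload = np.zeros(capacity, dtype=np.int64)  # pad unused carriers with 0
    payload[: len(bits)] = bits
    new_err = err.copy()
    new_err[err == 0] += payload   # e = 0  ->  e + m
    new_err[err > 0] += 1          # e > 0  ->  e + 1 (shift to make room)
    stego[1::2] = pred + new_err   # e < 0 is left unchanged
    return stego

def extract(stego, n_bits):
    """Recover the first n_bits payload bits and the original cover signal."""
    stego = np.asarray(stego, dtype=np.int64)
    pred = _predict(stego)
    err = stego[1::2] - pred
    bits = err[(err == 0) | (err == 1)][:n_bits]  # e' = 0 -> 0, e' = 1 -> 1
    restored = err.copy()
    restored[err >= 1] -= 1                       # undo expansion and shift
    cover = stego.copy()
    cover[1::2] = pred + restored
    return bits, cover
```

The number of zero-valued prediction errors bounds the payload here, which is why a sharper PEH (a taller zero bin) translates directly into a higher embedding capacity at the same distortion.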
According to the pixel prediction method, we further classify PEE-RDH into traditional PEE-RDH schemes and deep learning-based PEE-RDH schemes. Among the traditional PEE-RDH schemes, Thodi and Rodríguez (2007) devised a prediction error expansion technique based on the median edge detection predictor (MEDP), which obtained the predicted value based on the linear relationship between pixels in the left, upper left, and upper locations of the to-be-predicted pixel. Sachnev et al. (2009) proposed the rhombus predictor (RP) that utilized more reference pixels for the to-be-predicted pixel. In addition, they combined this predictor with the histogram shifting technique to significantly reduce the size of the location map, thus increasing the embedding capacity. Coltuc (2011) increased the number of reference pixels in the prediction context to seven, which can not only detect the presence of vertical and horizontal edges, but also calculate their strong and weak relationships. In the embedding stage, the data was embedded into the current pixel and its context, resulting in good embedding performance and reduced computational cost. Considering the correlation between prediction errors, Ou et al. (2013) constructed two neighboring prediction errors into a prediction error pair to generate a two-dimensional prediction error histogram, and designed a two-dimensional prediction error histogram-based data embedding algorithm. Jafar et al. (2016) combined multiple predictors to design an RDH scheme that has better prediction accuracy and higher embedding capacity. Although these traditional PEE-RDH schemes have devised many efficient predictors using image local correlation, it is difficult for them to obtain locally and globally optimal solutions.

Deep learning has shown powerful prediction ability, and some deep learning-based PEE-RDH schemes have therefore been proposed. Hu and Xiang (2021a) proposed a convolutional neural network-based predictor for greyscale images, which significantly improved the prediction accuracy by taking advantage of the multiple receptive fields and global optimization capability of convolutional neural networks. To further improve the performance of the pixel predictor and data embedding, Hu and Xiang (2021b) first divided image pixels into four classes for pixel prediction and then designed a complexity-based embedding algorithm to perform data embedding. After that, Yang and Huang (2022) designed an improved image pixel division strategy, which can use more reference pixels in the same 3 × 3 receptive field. In addition, a filtering operation was used to further enhance the local correlation of pixels in the pre-processing stage of pixel prediction.

The RDH schemes mentioned above are designed for natural images and do not consider the regional characteristics of medical images. Therefore, the above RDH schemes cannot be directly applied to medical images. Medical images usually contain the region of non-interest (NROI) with many flat areas and the region of interest (ROI) with abundant clinical information (Khor et al., 2017; Parah et al., 2017). Furthermore, according to investigations, most RDH schemes used in medical images are aimed at enhancing the visual quality of medical images (Huang et al., 2013; Ren et al., 2023; Wu et al., 2015), with less focus on the performance of RDH itself, such as embedding capacity and image distortion.

To improve the prediction accuracy and embedding performance of the PEE-RDH scheme for medical images, in this paper, we propose a dual-branch neural network-based reversible data hiding scheme. Specifically, we first cluster pixels based on their local weighted brightness and local standard deviation to generate masks for the pixel predictor. Then, a dual-branch neural network-based pixel predictor (DBNNP) is designed to extract unique and shared features of the to-be-predicted pixels. In addition, to further improve the performance of our pixel predictor, we employ the convolutional block attention module (CBAM) (Woo et al., 2018) to adaptively optimize the features extracted from the two branches. Finally, to improve embedding performance, a class-based embedding (CE) algorithm is presented, which utilizes the results of pixel clustering to prioritize data embedding in the NROI of medical images. The contributions of this work are summarized as follows:

• We design a pixel predictor based on a dual-branch neural network for medical images, which can effectively utilize the unique and shared features of different classes of pixels to improve the performance of pixel prediction.
• A pixel clustering method based on pixel complexity is presented to provide masks for different branches of DBNNP and to reduce the distortion caused by data embedding.
• A class-based embedding algorithm is used to preferentially embed data into NROI to reduce ROI distortion as much as possible.
• Extensive experimental results show that our scheme has satisfactory performance of pixel prediction and data embedding on medical images.

The rest of this paper is organized as follows. Section 2 describes the proposed RDH scheme in detail. Experimental results and analysis are provided in Section 3. Finally, Section 4 concludes the whole paper.

2. Proposed scheme

Our scheme performs data hiding by dividing the image into four sub-images and processing them in sequence. The data hiding process is the same for each sub-image, and data extraction is the inverse of data hiding, so Fig. 1 shows the hiding process for only one sub-image and omits the remaining sub-images as well as the data extraction process. The data hiding process in our scheme can be divided into three parts: image pre-processing, pixel prediction, and data embedding. During image pre-processing, to ensure the reversibility of the RDH scheme, we first divide the cover image into four sub-images. Then, to improve the pixel prediction accuracy and the embedding performance, we cluster the pixels of each sub-image into three classes. During pixel prediction, we predict the pixels of each sub-image sequentially by the designed DBNNP. During data embedding, we use the designed CE algorithm, which performs data embedding in the order of pixel classes based on the clustering results obtained during image pre-processing; this reduces the distortion caused by data embedding, especially in the ROI.

Fig. 1. The framework of data hiding process in our scheme.

2.1. Image pre-processing

In general, the more reference pixels in the neighborhood of the to-be-predicted pixel, the more abundant feature information it contains, which can improve the accuracy of pixel classification and prediction. At the present stage, there are two main division methods, i.e., dividing the image into two sub-images and four sub-images. It is worth noting
that dividing the image into more than four sub-images does not increase the number of reference pixels in the 3 × 3 neighborhood, but rather increases the cost of time. Therefore, we employ the strategy proposed by Yang and Huang (2022) to divide the cover image $\mathbf{I}$ into four sub-images, i.e., $\mathbf{I}_1$, $\mathbf{I}_2$, $\mathbf{I}_3$ and $\mathbf{I}_4$, see Fig. 2. For each sub-image $\mathbf{I}_i$, the pixels with a gray background indicate the true values, while the other pixels are set to 0, thus $\mathbf{I} = \mathbf{I}_1 + \mathbf{I}_2 + \mathbf{I}_3 + \mathbf{I}_4$. Then, we generate four intermediate sub-images $\mathbf{P}_1$, $\mathbf{P}_2$, $\mathbf{P}_3$ and $\mathbf{P}_4$, where $\mathbf{P}_i = \mathbf{I} - \mathbf{I}_i$, $i \in \{1, 2, 3, 4\}$.

Fig. 2. Sub-images division process.

In addition, considering that each pixel has a different complexity, which affects the performance of pixel prediction, we group pixels with similar complexity into the same class by using the K-means clustering algorithm. Specifically, for each sub-image, we first calculate two features for each pixel in a neighborhood of size 5 × 5 according to the other three sub-images, see Fig. 3(a), where shadows indicate the reference pixels of the to-be-predicted pixel $x$. The first feature is the local standard deviation ($LSD$), calculated as in Eq. (1), which is used to evaluate the texture complexity of each pixel. The second feature is the local weighted brightness ($LW$), calculated as in Eq. (2), which is specifically considered for medical images. $LW$ can better distinguish between ROI and NROI pixels, thus reducing the ROI distortion caused by data embedding. The combined consideration of $LSD$ and $LW$ can achieve a better balance between classification accuracy and prediction accuracy.

Fig. 3. Context of to-be-classified pixel and the filter kernel.
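The sub-image division and the two local features can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes the four sub-images follow a 2 × 2 interleaved pattern (which yields exactly 16 reference pixels from the other three sub-images inside a 5 × 5 window), and the weights in `hypothetical_weights` are a stand-in for the filter kernel of Fig. 3(b), whose values are not reproduced in the text. The LSD line follows Eq. (1) as it appears in the extracted text, which shows no square root.

```python
import numpy as np

# Offsets inside a 5x5 window whose row and column offsets are not both even.
# Under the assumed 2x2 interleaved division, these are the 16 neighbors that
# belong to the other three sub-images.
REF_OFFSETS = [(dr, dc) for dr in range(-2, 3) for dc in range(-2, 3)
               if dr % 2 or dc % 2]

def split_sub_images(image):
    """Return [I1, I2, I3, I4]; each keeps one pixel of every 2x2 block, 0 elsewhere."""
    subs = []
    for r0, c0 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        sub = np.zeros_like(image)
        sub[r0::2, c0::2] = image[r0::2, c0::2]
        subs.append(sub)
    return subs

def hypothetical_weights():
    """Placeholder for the Fig. 3(b) kernel: inverse squared distance to the center."""
    return np.array([1.0 / (dr * dr + dc * dc) for dr, dc in REF_OFFSETS])

def local_features(image, r, c, weights):
    """LSD and LW of the interior pixel (r, c) from its 16 reference pixels."""
    refs = np.array([image[r + dr, c + dc] for dr, dc in REF_OFFSETS],
                    dtype=np.float64)
    lsd = np.mean((refs - refs.mean()) ** 2)        # Eq. (1) as printed
    lw = np.sum(weights * refs) / np.sum(weights)   # Eq. (2)
    return lsd, lw
```

With this layout, the intermediate sub-image for branch i is simply `image - split_sub_images(image)[i - 1]`, and stacking `(lsd, lw)` for every to-be-classified pixel gives the feature vectors that the clustering step consumes.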

\[ LSD = \frac{\sum_{i=1}^{16} \left( x_i - \bar{x} \right)^2}{16}, \tag{1} \]

\[ LW = \frac{\sum_{i=1}^{16} \left( w_i \times x_i \right)}{\sum_{i=1}^{16} w_i}, \tag{2} \]

where $x_i$ is the $i$th reference pixel in the neighborhood of the to-be-classified pixel $x$, which belongs to the other three sub-images, $\bar{x}$ is the average value of the 16 reference pixels in the neighborhood, and $w_i$ is the weight corresponding to each reference pixel. The weights of the filter kernel are shown in Fig. 3(b).

Then, these two features, i.e., $LSD$ and $LW$, are concatenated into a local feature vector ($LF$) and fed into the K-means clustering algorithm to classify the pixels into three classes. We randomly select three local feature vectors as initial clustering centers, i.e., $C_1$, $C_2$, and $C_3$, and use Eq. (3) to calculate the Euclidean distance between the other feature vectors and the initial clustering centers.

\[ d\left( LF_i, C_j \right) = \sqrt{\sum_{k=1}^{n} \left( LF_{i,k} - C_{j,k} \right)^2}, \tag{3} \]

where $n$ is the number of features for each pixel, $j$ is the class index, $LF_i$ represents the local feature vector of the $i$th pixel, $i \in \{1, 2, \ldots, N-1, N\}$, and $N$ is the total number of the to-be-clustered pixels. The class $K_i$ of the corresponding pixel is obtained from Eq. (4), and the local features of pixels of the same class are combined to form a new cluster $\mathbf{S}_j$, $j \in \{1, 2, 3\}$. The average value of each cluster is obtained as the new cluster center $C'_j$ by Eq. (5), where $|\mathbf{S}_j|$ represents the number of pixels in the $j$th class.

\[ K_i = \arg\min_{j} \left[ d\left( LF_i, C_j \right) \right], \tag{4} \]

\[ C'_j = \frac{1}{|\mathbf{S}_j|} \sum_{LF_i \in \mathbf{S}_j} LF_i. \tag{5} \]

These steps are repeated until the cluster centers no longer change significantly, which yields the pixel clustering results. To ensure reversibility, the three final clustering centers are embedded in the image as auxiliary information. We take the class with the lowest average $LW$ among the three classes as the first class, which is usually the NROI, the class with the lower average $LSD$ among the remaining two classes as the second class, and the class with the higher average $LSD$ as the third class. As shown in Fig. 4, taking the pixel classification of the sub-image $\mathbf{I}_1$ as an example, the label map and the masks of the sub-image $\mathbf{I}_1$, i.e., Mask 1, Mask 2, and Mask 3, can be obtained after pixel clustering from the intermediate sub-image $\mathbf{P}_1$. These masks ensure that each sub-branch of the proposed DBNNP is utilized solely to extract the local features of the corresponding class of pixels in the mask, independent of other classes. In addition, these masks provide information for the subsequent embedding stage to ensure that the data is embedded into pixels in ascending order of pixel class.

Fig. 4. The generation process of label map and masks.

2.2. Pixel prediction

Our DBNNP contains two modules: (1) feature extraction module, and (2) pixel prediction module, see Fig. 5. The details are described as follows:

Fig. 5. The structure of our pixel predictor (DBNNP).

Feature extraction module. We believe that pixels with different complexity may interfere with the prediction accuracy of the pixel predictor. Therefore, we divide the pixels into three classes by the clustering algorithm in image pre-processing. Specifically, we design a dual-branch feature extraction module, where the upper branch is called the unique feature extraction branch (UFEB), which mainly extracts the
local features of the same class of pixels, and the lower branch is called the shared feature extraction branch (SFEB), which mainly extracts the common features of all classes of pixels. For the UFEB, we design three sub-branches with the same structure, each containing a ConvBlock and a Mask, see Fig. 5. In addition, to extract features from different receptive fields, we use three convolutional kernels with sizes of 3, 5, and 7, respectively. The SFEB consists of two convolutional layers and an activation function. Finally, the features extracted from these two branches are concatenated. To further improve the prediction accuracy, we employ the CBAM to weight these feature maps in channel and space. The CBAM consists of a channel attention module (CAM) and a spatial attention module (SAM). Firstly, for the given feature map $\mathbf{F}$ with the size of $C \times W \times H$, CAM is used to adaptively calculate the channel attention weight vector $\mathbf{M}_C \in \mathbb{R}^{C \times 1 \times 1}$ as follows:

\[ \mathbf{M}_C = \sigma\left( \mathrm{MLP}\left( \mathrm{AvgPool}\left( \mathbf{F} \right) \right) + \mathrm{MLP}\left( \mathrm{MaxPool}\left( \mathbf{F} \right) \right) \right), \tag{6} \]

where $\sigma$ represents the sigmoid function, MLP is a feed-forward neural network with a multilayer structure, which can be well used to infer the attention vectors through backpropagation. AvgPool and MaxPool are average pooling and maximum pooling, respectively.

Then, the obtained channel attention vector $\mathbf{M}_C$ is multiplied with the original feature map $\mathbf{F}$ to get the optimized feature map $\mathbf{F}'$, and $\mathbf{F}'$ is inputted into the SAM to get the spatial attention weight vector $\mathbf{M}_S \in \mathbb{R}^{1 \times W \times H}$ as follows:

\[ \mathbf{M}_S = \sigma\left( \mathrm{Conv}_{7 \times 7}\left( \mathrm{AvgPool}\left( \mathbf{F}' \right) \oplus \mathrm{MaxPool}\left( \mathbf{F}' \right) \right) \right), \tag{7} \]

where $\mathrm{Conv}_{7 \times 7}$ represents a convolution operation with a convolution kernel size of 7 × 7, and $\oplus$ refers to the concatenation operation. As in the CAM, we multiply the resulting spatial attention weight vector $\mathbf{M}_S$ with the feature map $\mathbf{F}'$ to obtain the final optimized feature map $\mathbf{F}''$.

Pixel prediction module. The pixel prediction module contains three convolutional layers with 3 × 3 convolutional kernels in each convolutional layer, where the first two output channels are 32 and the last output channel is set to 1 to reconstruct the predicted image. We use the LeakyReLU activation function and a skip connection structure to improve the training stability and generalization ability of the model (He et al., 2016). During the pixel predictor training process, the input images are normalized to the range [0, 1], and as a result, we constrain the output images of the pixel predictor to the same range [0, 1].

Loss function. To train a superior pixel predictor, we employ the mean square error (MSE) to measure the difference between the predicted image $\mathbf{I}'$ and the cover image $\mathbf{I}$. The loss function is minimized based on backpropagation (LeCun et al., 1998) and the AdamW optimizer (Loshchilov & Hutter, 2017) with a batch size of 8.

\[ L = \frac{1}{M} \sum_{t=1}^{M} \left( \mathbf{I}_t - \mathbf{I}'_t \right)^2 + \lambda \|\omega\|_2^2, \tag{8} \]

where $M$ is the total number of training images and $\omega$ represents the parameters of the pixel predictor. $\lambda$ is a weight decay parameter that is used to accelerate network convergence and suppress overfitting, and $\lambda$ is set to $10^{-3}$.

2.3. Data embedding

During the data embedding process, we design a class-based embedding algorithm to embed data. According to the clustering result, we conduct data embedding based on pixel classes. That is, we embed the data into the pixels belonging to class 1, class 2, and class 3 in turn. The detailed embedding process is as follows:
(1) PEH generation. For each sub-image $\mathbf{I}_i$ with $n_i$ pixels $\{\mathbf{I}_{i;1}, \mathbf{I}_{i;2}, \ldots, \mathbf{I}_{i;n_i}\}$, a corresponding predicted sub-image $\mathbf{I}'_i$ can be obtained according to our DBNNP. Then, we can calculate the pixel prediction error vector $\mathbf{e}_i = \{\mathbf{I}_{i;j} - \mathbf{I}'_{i;j},\ j = 1, 2, \ldots, n_i\}$ between $\mathbf{I}_i$ and $\mathbf{I}'_i$, and $\mathbf{e}_i$ can be divided into three sub-error vectors $\mathbf{e}^1_i$, $\mathbf{e}^2_i$, $\mathbf{e}^3_i$, according to the pixel classes. Finally, the corresponding $PEH^c$ of each sub-error vector $\mathbf{e}^c_i$ can be generated as follows:

\[ PEH^c = \mathcal{H}\left( \mathbf{e}^c_i \right), \quad c = 1, 2, 3; \ i = 1, 2, 3, 4, \tag{9} \]

where $\mathcal{H}(\cdot)$ denotes the histogram generation function.

(2) Data embedding. We perform the data embedding in the order of the pixel classes, and each class is embedded in the same way. For each sub-error vector $\mathbf{e}^c_i$, the bit $m \in \{0, 1\}$ can be embedded by modifying the prediction error $e_{i;j}$ belonging to the $c$th class as follows.

\[ e'_{i;j} = \begin{cases} e_{i;j}, & \text{if } e_{i;j} < 0, \\ e_{i;j} + m, & \text{if } e_{i;j} = 0, \\ e_{i;j} + 1, & \text{if } e_{i;j} > 0. \end{cases} \tag{10} \]

Finally, the pixels in the sub-image are modified as $\mathbf{I}'_{i;j} + e'_{i;j}$ to obtain the embedded sub-image $\mathbf{I}^{emb}_i$. Similarly, the other three sub-images follow the same data embedding method. In addition, to ensure data extraction and lossless image recovery, the auxiliary information is embedded in the outermost two rows and two columns of the cover image $\mathbf{I}$ using the LSB embedding algorithm to obtain the final embedded image $\mathbf{I}^{emb}$. The auxiliary information includes a location map for preventing pixel overflow, the clustering centers, and the location of the last pixel into which data is embedded.

2.4. Data extraction and image recovery

Data extraction is the inverse process of data embedding, and the extraction process is the same for each embedded sub-image, so we use the embedded sub-image $\mathbf{I}^{emb}_i$ as an example to describe the data extraction process. Specifically, the auxiliary information is extracted first from the outermost two rows and two columns of the embedded image $\mathbf{I}^{emb}$, and the masks and label map can be obtained as in the image pre-processing. Then, the predicted image $\mathbf{I}'_i$ can be obtained based on the pixel prediction process in data hiding, and the error vector $\mathbf{e}'_i = \{\mathbf{I}^{emb}_{i;j} - \mathbf{I}'_{i;j},\ j = 1, 2, \ldots, n_i\}$ between $\mathbf{I}^{emb}_i$ and $\mathbf{I}'_i$ can be obtained as in the data embedding process. Finally, the receiver can get three PEHs of three classes and perform data extraction and image recovery according to Eqs. (11) and (12), respectively. It is worth noting that data extraction and image recovery need to be performed in descending order of the classes.

\[ m = \begin{cases} 0, & \text{if } e'_{i;j} = 0, \\ 1, & \text{if } e'_{i;j} = 1, \end{cases} \tag{11} \]

\[ \mathbf{I}_{i;j} = \begin{cases} \mathbf{I}^{emb}_{i;j}, & \text{if } e'_{i;j} \le 0, \\ \mathbf{I}^{emb}_{i;j} - 1, & \text{if } e'_{i;j} \ge 1. \end{cases} \tag{12} \]

3. Experimental results and analysis

In this section, we first present the experimental setup. Then, the visualization of the clustering is given. Finally, the results of pixel predictor, data embedding, and ablation study are presented.

3.1. Experimental setup

Datasets. Our medical image dataset was obtained from public datasets (Chen et al., 2022; Tianchi, 2020; Tong & Li, 2022; Tsai et al., 2020; Wang et al., 2017). All medical images were converted to grayscale images of 512 × 512 size for training and testing, and our training and testing sets are listed in Table 1.

Table 1
Composition of our dataset.
Images        Training set   Testing set
Lung X-ray    900            50
Lung CT       1,050          65
Colon CT      1,050          60
Spine MR      1,000          65
Pancreas CT   1,000          60
Total         5,000          300

Platform. All experiments were implemented on the PyTorch framework with a 3.60 GHz Intel(R) Core(TM) i9 10900x CPU and an NVIDIA Titan V GPU.

3.2. Visualization results of clustering

We select six representative medical images from the testing set to show the visualization results of pixel clustering. As shown in Fig. 6, the first row represents the six medical images and the second row represents the clustering results corresponding to the six images. For the clustering results, the purple color represents the first class of pixels, the yellow color represents the second class of pixels, and the green color represents the third class of pixels. We can see that the first class of pixels essentially represents the NROI, the second class represents the low texture complexity region of the ROI, and the third class represents the higher texture complexity region of the ROI.

Fig. 6. The visualization results of pixel clustering.

3.3. Results of pixel predictor

To evaluate the effectiveness of our pixel predictor (DBNNP), we compare it with six pixel predictors, including three classical traditional pixel predictors, namely RP (Sachnev et al., 2009), MEDP (Thodi & Rodríguez, 2007), and GAP (Coltuc, 2011), and three state-of-the-art deep learning-based pixel predictors (Hu & Xiang, 2021a, 2021b; Yang & Huang, 2022). We first measure the performance of the pixel predictor using two metrics, i.e., mean squared error (MSE) and variance (Var). MSE represents the mean square error between the cover and predicted images. Var represents the variance of the prediction error. As listed in Table 2, we calculate the average of MSE and Var for the testing set. From the comparison results we can find that the deep learning-based pixel predictors outperform the traditional pixel predictors and that our pixel predictor achieves the lowest MSE and Var.

Table 2
Average MSE and Var for different pixel predictors on the testing set.
Predictor                    MSE     Var
Thodi and Rodríguez (2007)   88.21   88.20
Sachnev et al. (2009)        53.79   53.71
Coltuc (2011)                87.62   87.30
Hu and Xiang (2021a)         23.62   23.61
Hu and Xiang (2021b)         11.65   11.65
Yang and Huang (2022)        10.81   10.80
DBNNP                        9.98    9.98

In addition, for a more intuitive comparison, we plot the average PEH of the seven pixel predictors on the testing set, see Fig. 7. For better display, we only plot the histogram of prediction errors in the range [−15, 15]. Compared to other pixel predictors, our pixel predictor can generate a sharper PEH and this PEH has a higher peak bin, i.e., zero bin. That is, the PEH generated by our pixel predictor has a higher embedding capacity.

3.4. Results of data embedding

In this subsection, we compare the performance of data embedding. For a fair comparison of data embedding performance, the same conventional PEE-based data embedding (C-PEE) algorithm (Hong et al., 2009) is employed for our DBNNP and four compared pixel predictors. Because X-ray images have a lower embedding capability than CT and MR images, we divide the testing set into two groups: one consisting
of 250 CT and MR images in total and the other consisting of 50 X-ray images. Since the embedding capacity of some images in the two groups is 70,000 and 20,000 bits, respectively, we set the maximum test capacity of the two groups to 70,000 and 20,000 bits and perform data embedding, respectively, see Tables 3 and 4. From the results, we can find that our DBNNP can obtain better PSNR with the same embedding algorithm. The reason is that a sharper PEH can be obtained by DBNNP, thus reducing the embedding distortion.

Fig. 7. The average prediction error histogram (PEH) for different predictors on the testing set.

Table 3
Average PSNR (dB) values of CT and MR images inside the testing set at 10,000–70,000 bits.
Bits     Sachnev et al. (2009)   Hu and Xiang (2021a)   Hu and Xiang (2021b)   Yang and Huang (2022)   DBNNP
10,000   64.17                   64.33                  64.43                  64.47                   64.57
20,000   60.82                   60.99                  61.15                  61.23                   61.26
30,000   58.72                   58.91                  59.03                  59.17                   59.25
40,000   57.20                   57.45                  57.65                  57.71                   57.80
50,000   56.07                   56.28                  56.50                  56.56                   56.67
60,000   55.10                   55.23                  55.50                  55.55                   55.72
70,000   54.29                   54.35                  54.68                  54.66                   54.89

Table 4
Average PSNR (dB) values of X-ray images inside the testing set at 2,000–20,000 bits.
Bits     Sachnev et al. (2009)   Hu and Xiang (2021a)   Hu and Xiang (2021b)   Yang and Huang (2022)   DBNNP
2,000    64.71                   64.72                  64.75                  65.30                   65.33
4,000    61.19                   61.13                  61.30                  61.78                   61.84
6,000    59.02                   58.95                  59.09                  59.58                   59.70
8,000    57.45                   57.43                  57.54                  57.99                   58.16
10,000   56.30                   56.33                  56.45                  56.81                   57.05
12,000   55.42                   55.48                  55.62                  55.90                   56.17
14,000   54.73                   54.82                  54.95                  55.20                   55.49
16,000   54.16                   54.27                  54.38                  54.61                   54.94
18,000   53.68                   53.80                  53.91                  54.12                   54.46
20,000   53.28                   53.41                  53.52                  53.71                   54.07

In addition, to more visually compare the changes in PSNR of different schemes under various embedding capacities, we plot the change curves of PSNR in six representative images for different schemes, see Fig. 8. From the results, we can find that our DBNNP with the same embedding algorithm has higher PSNR under the same embedding capacity and has higher embedding capacity under the same PSNR. The reason is that our DBNNP has a better prediction accuracy.

Fig. 8. Performance comparison between our DBNNP and other predictors under the same embedding algorithm.

3.5. Ablation study

In this subsection, we verify the effectiveness of the different components for our pixel predictor and embedding algorithms. For our pixel predictor (DBNNP), three important components, i.e., UFEB, CBAM, and Mask, and a base component SFEB were used for ablation studies. For our class-based embedding algorithm (CE), based on the same pixel predictor (DBNNP), two different embedding algorithms, i.e., C-PEE and CE, were used for ablation studies.

In Table 5, the peak bins of PEH under different ablation studies are given, and a higher peak bin indicates better prediction accuracy, i.e., higher embedding capacity. The results show that our pixel predictor can obtain the highest peak bin when UFEB, CBAM, and Mask are used in combination. The reasons are as follows: (1) UFEB can efficiently extract features from the same class of pixels by eliminating
the influence of other classes through Mask; (2) CBAM can better optimize the features extracted from the two branches and achieve better balance, thus further improving the performance of prediction.

Table 5
Ablation study on our pixel predictor.
SFEB   UFEB   CBAM   Mask   Peak bin
✔      ✗      ✗      ✗      137,380
✔      ✔      ✗      ✗      138,997
✔      ✔      ✔      ✗      139,041
✔      ✔      ✔      ✔      141,682

As mentioned in Section 3.4, we divide the testing set into two groups (one for CT and MR images and the other for X-ray images). Their PSNRs at 70,000 and 20,000 bits embedding capacity, respectively, are listed in Table 6. From the results, we can find that our CE algorithm achieves higher PSNR compared to the C-PEE algorithm. The reason is that our CE algorithm classifies the local features of each pixel and determines the embedding order according to the different classes, instead of indiscriminate embedding.

Table 6
Ablation study on our embedding algorithm.
DBNNP   C-PEE   CE   PSNR (dB), CT and MR   PSNR (dB), X-ray
✔       ✔       ✗    54.89                  54.08
✔       ✗       ✔    55.46                  55.83

4. Conclusion

In this work, we propose a dual-branch neural network-based re-

and finally into the high texture complexity region of ROI. Compared with traditional embedding algorithms, our embedding algorithm can reduce the embedding distortion, especially the distortion of ROI. The experimental results demonstrate that our scheme achieves satisfactory performance in both pixel prediction and data embedding for medical images.

CRediT authorship contribution statement

Ping Kong: Conceptualization, Writing – original draft. Yongdong Zhang: Methodology. Lin Huang: Validation. Liang Zhou: Investigation. Lifan Chen: Software. Chuan Qin: Supervision, Writing – reviewing & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

The authors would like to thank the reviewers for their insightful comments and useful suggestions.

References

Alattar, A. M. (2004). Reversible watermark using the difference expansion of
versible data hiding scheme for medical images. First, we cluster the a generalized integer transform. IEEE Transactions on Image Processing, 13(8),
pixels into three classes according to local weighted brightness and lo- 1147–1156.
Celik, M. U., Sharma, G., Tekalp, A. M., & Saber, E. (2005). Lossless generalized-LSB
cal standard deviation. Then, a dual-branch neural network-based pixel data embedding. IEEE Transactions on Image Processing, 14(2), 253–266.
predictor is designed to extract unique and shared features of different Chen, L., Wang, W., Jin, K., Yuan, B., Tan, H., Sun, J., Guo, Y., Luo, Y., Feng, S.-
classes. In addition, the CBAM is employed to optimize the extracted t., Yu, X., Chen, M.-h., & Chen, J. (2022). Prediction of sunitinib efficacy using
features to improve the prediction accuracy. Finally, in the embedding computed tomography in patients with pancreatic neuroendocrine tumors (CTpred-
Sunitinib-panNET). The Cancer Imaging Archive, http://dx.doi.org/10.7937/SPGK-
stage, we use a class-based embedding algorithm, the data is prioritized
0P94, [dataset].
for embedding into low complexity region, which is usually NROI, Coltuc, D. (2011). Improved embedding for prediction-based reversible watermarking.
and then embedded into the low texture complexity region of ROI IEEE Transactions on Information Forensics and Security, 6(3), 873–882.


He, W., & Cai, Z. (2021). Reversible data hiding based on dual pairwise prediction-error expansion. IEEE Transactions on Image Processing, 30, 5045–5055.
He, W., Xiong, G., Weng, S., Cai, Z., & Wang, Y. (2018). Reversible data hiding using multi-pass pixel-value-ordering and pairwise prediction-error expansion. Information Sciences, 467, 784–799.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770–778).
Hong, W., Chen, T.-S., & Shiu, C.-W. (2009). Reversible data hiding for high quality images using modification of prediction errors. Journal of Systems and Software, 82(11), 1833–1842.
Hu, R., & Xiang, S. (2021a). CNN prediction based reversible data hiding. IEEE Signal Processing Letters, 28, 464–468.
Hu, R., & Xiang, S. (2021b). Reversible data hiding by using CNN prediction and adaptive embedding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(12), 10196–10208.
Huang, L.-C., Tseng, L.-Y., & Hwang, M.-S. (2013). A reversible data hiding method by histogram shifting in high quality medical images. Journal of Systems and Software, 86(3), 716–727.
Jafar, I. F., Darabkh, K. A., Al-Zubi, R. T., & Al Na’mneh, R. A. (2016). Efficient reversible data hiding using multiple predictors. The Computer Journal, 59(3), 423–438.
Kalker, T., & Willems, F. M. (2002). Capacity bounds and constructions for reversible data-hiding. In 2002 14th international conference on digital signal processing proceedings. DSP 2002 (cat. no. 02TH8628), vol. 1 (pp. 71–76). IEEE.
Karakus, S., & Avci, E. (2020). A new image steganography method with optimum pixel similarity for data hiding in medical images. Medical Hypotheses, 139, Article 109691.
Ker, A. D. (2005). Steganalysis of LSB matching in grayscale images. IEEE Signal Processing Letters, 12(6), 441–444.
Khor, H. L., Liew, S.-C., & Zain, J. M. (2017). Region of interest-based tamper detection and lossless recovery watermarking scheme (ROI-DR) on ultrasound medical images. Journal of Digital Imaging, 30, 328–349.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
Lee, C.-C., Wu, H.-C., Tsai, C.-S., & Chu, Y.-P. (2008). Adaptive lossless steganographic scheme with centralized difference expansion. Pattern Recognition, 41(6), 2097–2106.
Li, X., Li, J., Li, B., & Yang, B. (2013). High-fidelity reversible data hiding scheme based on pixel-value-ordering and prediction-error expansion. Signal Processing, 93(1), 198–205.
Li, X., Li, B., Yang, B., & Zeng, T. (2013). General framework to histogram-shifting-based reversible data hiding. IEEE Transactions on Image Processing, 22(6), 2181–2191.
Loshchilov, I., & Hutter, F. (2017). Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101.
Mortazavian, P., Jahangiri, M., & Fatemizadeh, E. (2004). A low-degradation steganography model for data hiding in medical images. In Proceeding of the fourth IASTED international conference visualization, imaging, and image processing (pp. 914–920). Citeseer.
Ni, Z., Shi, Y.-Q., Ansari, N., & Su, W. (2006). Reversible data hiding. IEEE Transactions on Circuits and Systems for Video Technology, 16(3), 354–362.
Ou, B., Li, X., Zhao, Y., Ni, R., & Shi, Y.-Q. (2013). Pairwise prediction-error expansion for efficient reversible data hiding. IEEE Transactions on Image Processing, 22(12), 5010–5021.
Parah, S. A., Sheikh, J. A., Ahad, F., Loan, N. A., & Bhat, G. M. (2017). Information hiding in medical images: a robust medical image watermarking system for E-healthcare. Multimedia Tools and Applications, 76, 10599–10633.
Qi, W., Li, X., Zhang, T., & Guo, Z. (2019). Optimal reversible data hiding scheme based on multiple histograms modification. IEEE Transactions on Circuits and Systems for Video Technology, 30(8), 2300–2312.
Qi, W., Zhang, T., Li, X., Ma, B., & Guo, Z. (2023). Reversible data hiding based on prediction-error value ordering and multiple-embedding. Signal Processing, 207, Article 108956.
Ren, F., Liu, Y., Zhang, X., & Li, Q. (2023). Reversible information hiding scheme based on interpolation and histogram shift for medical images. Multimedia Tools and Applications, 82(18), 28445–28471.
Sachnev, V., Kim, H. J., Nam, J., Suresh, S., & Shi, Y. Q. (2009). Reversible watermarking algorithm using sorting and prediction. IEEE Transactions on Circuits and Systems for Video Technology, 19(7), 989–999.
Shang, F., Lan, Y., Yang, J., Li, E., & Kang, X. (2023). Robust data hiding for JPEG images with invertible neural network. Neural Networks, 163, 219–232.
Thodi, D. M., & Rodríguez, J. J. (2007). Expansion embedding techniques for reversible watermarking. IEEE Transactions on Image Processing, 16(3), 721–730.
Tian, J. (2003). Reversible data embedding using a difference expansion. IEEE Transactions on Circuits and Systems for Video Technology, 13(8), 890–896.
Tianchi (2020). Spinal disease dataset. URL https://tianchi.aliyun.com/dataset/dataDetail?dataId=79463.
Tong, T., & Li, M. (2022). Abdominal or pelvic enhanced CT images within 10 days before surgery of 230 patients with stage II colorectal cancer (StageII-Colorectal-CT). The Cancer Imaging Archive, http://dx.doi.org/10.7937/p5k5-tg43, [dataset].
Tsai, E., Simpson, S., Lungren, M., Hershman, M., Roshkovan, L., Colak, E., Erickson, B., Shih, G., Stein, A., Kalpathy-Cramer, J., Shen, J., Hafez, M., John, S., Rajiah, P., Pogatchnik, B., Mongan, J., Altinmakas, E., Ranschaert, E., Kitamura, F., .... Wu, C. (2020). Data from the medical imaging data resource center – RSNA International COVID radiology database release 1a – Chest CT Covid+ (MIDRC-RICORD-1A). The Cancer Imaging Archive, http://dx.doi.org/10.7937/VTW4-X588, [dataset].
Wang, J., Chen, X., Ni, J., Mao, N., & Shi, Y. (2019). Multiple histograms-based reversible data hiding: Framework and realization. IEEE Transactions on Circuits and Systems for Video Technology, 30(8), 2313–2328.
Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., & Summers, R. M. (2017). Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2097–2106).
Woo, S., Park, J., Lee, J.-Y., & Kweon, I. S. (2018). Cbam: Convolutional block attention module. In Proceedings of the European conference on computer vision (pp. 3–19).
Wu, H.-T., Huang, J., & Shi, Y.-Q. (2015). A reversible data hiding method with contrast enhancement for medical images. Journal of Visual Communication and Image Representation, 31, 146–153.
Wu, K.-C., & Wang, C.-M. (2014). Steganography using reversible texture synthesis. IEEE Transactions on Image Processing, 24(1), 130–139.
Xuan, G., Shi, Y. Q., Ni, Z., Chen, J., Yang, C., Zhen, Y., & Zheng, J. (2004). High capacity lossless data hiding based on integer wavelet transform. In 2004 IEEE international symposium on circuits and systems, vol. 2 (pp. II–29). IEEE.
Yang, X., & Huang, F. (2022). New CNN-based predictor for reversible data hiding. IEEE Signal Processing Letters, 29, 2627–2631.
Yildirim, M. (2021). Steganography-based voice hiding in medical images of COVID-19 patients. Nonlinear Dynamics, 105(3), 2677–2692.
