Accurate Magnetic Resonance Image Super-Resolution Using Deep Networks and Gaussian Filtering in The Stationary Wavelet Domain

This document summarizes a research paper that proposes an accurate magnetic resonance (MR) image super-resolution method using a deep residual network and Gaussian filtering in the stationary wavelet domain. The method trains a very deep residual network on low-resolution and high-resolution image subbands obtained via 2D stationary wavelet transform. It then uses the trained network to generate residual subbands for a low-resolution test image. The residual subbands are added to the original subbands to produce super-resolved subbands. Finally, Gaussian filtering is applied to the super-resolved subbands to maintain intrinsic image structure. Experimental results showed the proposed method outperformed existing approaches in terms of objective metrics and subjective quality.

Received April 22, 2021, accepted May 1, 2021, date of publication May 5, 2021, date of current version May 19, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3077611

Accurate Magnetic Resonance Image Super-Resolution Using Deep Networks and Gaussian Filtering in the Stationary Wavelet Domain
GUNNAM SURYANARAYANA 1,2 , KARTHIK CHANDRAN 3 , (Senior Member, IEEE),
OSAMAH IBRAHIM KHALAF4 , YOUSEEF ALOTAIBI 5 , ABDULMAJEED ALSUFYANI 6 ,
AND SALEH AHMED ALGHAMDI7
1 Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
2 Department of Electronics and Communications, V R Siddhartha Engineering College, Vijayawada 520007, India
3 Department of Mechatronics, Jyothi Engineering College, Thrissur 679531, India
4 Al-Nahrain Nanorenewable Energy Research Center, Al-Nahrain University, Baghdad 64074, Iraq
5 Department of Computer Science, College of Computers and Information Systems, Umm Al-Qura University, Makkah 77207, Saudi Arabia
6 Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia
7 Department of Information Technology, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia

Corresponding authors: Gunnam Suryanarayana ([email protected]) and Karthik Chandran ([email protected])


This work was supported by the Taif University, Taif, Saudi Arabia, through Taif University Researchers Supporting under Project
TURSP-2020/115.

ABSTRACT In this correspondence, we present an accurate Magnetic Resonance (MR) image Super-Resolution (SR) method that uses a Very Deep Residual network (VDR-net) in the training phase. By applying the 2D Stationary Wavelet Transform (SWT), we decompose each Low Resolution (LR)-High Resolution (HR) example image pair into its low-frequency and high-frequency subbands. These LR-HR subbands are used to train the VDR-net through the input and output channels. The trained parameters are then used to generate the residual subbands of a given LR test image. The obtained residuals are added to their LR subbands to produce the SR subbands. Finally, we attempt to maintain the intrinsic structure of images by applying a Gaussian edge-preservation step to the SR subbands. Our extensive experimental results show that the proposed MR-SR method outperforms the existing methods in terms of four different objective metrics and subjective quality.

INDEX TERMS Deep learning, edge-preservation, MR imaging, residual network, stationary wavelet
decomposition, super-resolution.

I. INTRODUCTION
The goal of single-image Super-Resolution (SR) is to generate a High Resolution (HR) image from an input Low Resolution (LR) image. The generated HR image is expected to possess adequate edge information with minimal artifacts. Single-image SR algorithms are in active use in both industrial and academic applications, including medical imaging, crime investigation, video surveillance, infrared image processing, and consumer electronics [1]–[5]. In this work, we focus on improving the resolution of LR Magnetic Resonance (MR) images to benefit clinical applications. As a result, we can overcome the limitations of MR imaging, such as a lower signal-to-noise ratio and longer scan time. Research on single-image SR has been classified into three categories, namely interpolation methods, dictionary learning methods, and deep learning methods.
Bilinear and bicubic methods are the most widely used traditional interpolation algorithms in practice [6]. The bilinear method considers the four closest neighbor pixels, and the bicubic method considers sixteen pixels, to compute each unknown pixel. Every missing pixel in the HR grid is obtained from the neighboring pixels using simple isotropic kernels. As a result, the bilinear and bicubic methods fail to preserve the intrinsic edge structures, leading to severe blurring and jaggy artifacts. Jaggies are unwanted high-frequency components

The associate editor coordinating the review of this manuscript and approving it for publication was Li He.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
71406 VOLUME 9, 2021
G. Suryanarayana et al.: Accurate MR Image SR Using Deep Networks and Gaussian Filtering

that appear along straight lines or curved edges in the reconstructed image. These artifacts are widespread in digital display devices and remain an issue in the design of printing devices. To overcome these deficiencies, edge-directed interpolation methods have been proposed [7], [8]. These methods attempt to preserve the overall edge structures of images but often produce speckle noise around edges. Besides, the performance of these methods degrades very rapidly for scale factors above 2. To address this issue, interpolation techniques are combined with wavelet transforms. The Discrete Wavelet Transform (DWT) [9], [10], Stationary Wavelet Transform (SWT) [11], lifting wavelet transform [12], and dual-tree complex wavelet transform [13] are used for wavelet domain interpolation. Recently, a Rational Fractal Interpolation (RFI) model was constructed by Zhang et al. [14] and Shao et al. [15] for preserving textural details. The RFI function is more accurate and works better than polynomial interpolation kernels [7], [8]. However, the performance of interpolation methods [6]–[8], [14] is inferior to that of learning-based SR methods.
Learning-based methods use an external LR-HR training dataset to find the missing high-frequency details in an LR image. These methods can be further categorized into two types. The first category is based on dictionary learning, and the second is based on deep learning. Dictionary learning methods solve the SR problem based on statistical analysis or intuitive understanding of many natural images. Neighbor embedding approaches [16], [17] and sparse coding approaches [18]–[24] are the commonly used dictionary-based methods in practice. In neighbor embedding approaches, an input LR image is usually expressed as a weighted linear combination of example patches from an LR dictionary. The same weight combination is used with the corresponding HR dictionary to reconstruct the output patch. This is because LR and HR images share similar local geometric structures. However, this approach leads to a quick expansion of dictionaries when the desired training dataset is large. On the other hand, sparse coding approaches address this issue by enforcing an efficient representation using a sparse linear combination. Yang et al. [18] applied this idea to the SR problem by jointly training the LR-HR dictionaries using similar sparse representations. Their algorithm first computes a sparse representation vector from a given input patch and uses the same sparse prior for computing the output patch. Zeyde et al. [19] made significant improvements to Yang et al.’s model [18] in terms of speed and quality. Other attempts used centralized and nonlocally centralized sparse representations [20], [21], and a statistical prediction model without the sparse invariance assumption [22]. Timofte et al. [23], [24] attempted to reduce the running time of sparse coding approaches by combining sparse learned LR-HR dictionaries with neighbor embedding approaches. However, these methods fail to yield superior SR results compared to the recent deep learning-based methods [25]–[28].
The second category of learning-based SR methods uses deep learning technology by applying Convolutional Neural Networks (CNNs) [25]–[39]. This class of SR methods has gained considerable attention from many researchers in recent times. Dong et al. [29], [30] introduced the first deep convolutional network to solve the SR problem. Motivated by this work, deep networks have also produced state-of-the-art results on related problems such as depth map SR [31] and face hallucination [32]. Unlike the traditional learning-based methods [16]–[24], Dong et al.’s method [30] directly learns a non-linear mapping between LR and HR spaces. This end-to-end mapping in deep networks trains all the model parameters more efficiently, leading to an accurate inference. Inspired by this, Wang et al. [33] developed a network with a set of cascaded sparse coding networks in each mapping layer. However, the sparse coding solver cannot guarantee the optimal mapping accuracy of this approach.
Dong et al. [34] modified their network [29], [30] in terms of the number of mapping layers, filter sizes, and feature dimensions. The resultant network provides fast upscaling with improved accuracy. In addition, it operates directly on LR images without the initial bicubic interpolation. In other approaches, generative adversarial networks were used to recover fine textures and edges [35], [36]. Self-exemplars were used in [37], where LR-HR training examples exploit self-similarity to enhance the output SR quality. Cui et al. [38] introduced a deep cascade network for gradual upscaling of LR patches after each layer. A deep joint SR model was developed in [39] using a highly complex convolutional auto-encoder network.
In this paper, we exploit the idea of residual learning [26] in the stationary wavelet domain and attempt to preserve the intrinsic structure of images. The input LR image is subjected to 2D SWT to decompose it into its low-frequency and high-frequency subbands. We prefer SWT to overcome the shift variance and inferior directionality of DWT. The decomposed LR subbands are fed forward through the trained VDR-nets using the four input channels to produce the corresponding residual subbands. These residuals are added to the LR subbands to yield the SR subbands. Finally, we apply the edge-preservation step on the SR subbands using a Gaussian operation and then fuse the resulting subbands to generate the output SR image.
In the following, we discuss the deep learning methods most closely related to our work in Section 2. The network architecture and SR reconstruction process of our proposed algorithm are given in Section 3. Section 4 describes the image datasets, methods, metrics, and implementation details, followed by a discussion on experimental results. Finally, conclusions are drawn in Section 5.

II. RELATED WORK
More recently, Very Deep Residual networks (VDR-nets) [25]–[28] have shown great improvement over the existing deep learning networks, with faster convergence rates and accurate SR performance. Unlike traditional networks, the VDR-net does not directly reconstruct HR images. Instead, it emphasizes the residuals between LR-HR image


FIGURE 1. VDR-net configuration for prediction of LR-HR output residual image.
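
As a rough aid to reading Fig. 1, the sketch below tallies the layers of a VDR-net of depth M built only from 3 × 3 filters (64 per layer, except the single-filter regression layer) and reports its learnable parameter count and receptive field. The function names are illustrative, and the single-channel input (for example, a luminance image or one wavelet subband) is an assumption of this sketch rather than a detail stated in the figure.

```python
def vdr_net_layers(depth, in_channels=1, width=64, k=3):
    """List (name, num_filters, kernel_h, kernel_w, in_ch) for a depth-layer VDR-net.

    in_channels=1 is an assumption of this sketch (single-channel input).
    """
    if depth < 3:
        raise ValueError("need an input layer, a middle layer, and a regression layer")
    layers = [("input", width, k, k, in_channels)]
    # depth - 2 identically parameterized middle layers
    layers += [(f"middle_{i}", width, k, k, width) for i in range(1, depth - 1)]
    layers.append(("regression", 1, k, k, width))
    return layers


def parameter_count(layers):
    """Convolution weights plus one bias per filter."""
    return sum(n * kh * kw * c + n for _, n, kh, kw, c in layers)


def receptive_field(depth, k=3):
    """Each stride-1 k x k convolution widens the receptive field by k - 1."""
    return 1 + depth * (k - 1)


layers = vdr_net_layers(20)
print(len(layers), parameter_count(layers), receptive_field(20))
```

For a depth of 20, the receptive field is 41 × 41 pixels, the same size as the 41 × 41 training sub-images reported in the parameter settings.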

pairs and reduces the training time significantly. In particular, we focus on the VDR-net presented by Kim et al. [25], [40] in their work.

A. VDR-NET
The structure of the VDR-net is shown in Fig. 1. As illustrated, the network has M layers, namely an image input layer, M − 2 middle layers, and a regression layer. The filter details of these layers are as follows.
• Image input layer: This is the first layer in the network that operates on input image patches. It consists of 64 filters of size 3 × 3.
• Middle layers: All the middle layers are similarly parameterized with 64 filters of size 3 × 3 × 64 in each layer.
• Regression layer: This is the last layer in the network, which reconstructs the output residual image. It has a single 3 × 3 × 64 sized filter.
All these convolution layers except the regression layer are followed by Rectified Linear Unit (ReLU) layers. The function of ReLU is to introduce nonlinearity in the network by replacing negative values with zeroes. As a result, the time required to estimate the output residual is reduced.

B. DATA MODELING
To compute the output residual for a given LR image, we train the VDR-net using numerous LR-HR example images. Let $\{L^{(i)}, H^{(i)}\}_{i=1}^{N}$ represent the training dataset, where $L$ denotes an interpolated LR image and $H$ represents its HR version. The network learns a model $f$, which accepts the LR training images and predicts the corresponding HR images. The model can be represented as
$$\hat{H} = f(L), \tag{1}$$
where $\hat{H}$ is an estimate of the original HR image $H$. The resultant mean square error is modeled as a loss function which has to be minimized by averaging over the training dataset.
$$\text{loss} = \frac{1}{2}\,\lVert H - f(L) \rVert_2^2. \tag{2}$$
In Eq. (2), the mean squared error ($l_2$) is considered while computing the loss function. $l_2$ is convex and differentiable, which is very advantageous when solving optimization problems. These properties led to the widespread adoption of $l_2$ in regression problems and in signal and image processing as the dominant error measure.
The objective of the VDR-net is to estimate the residual between input and output images rather than the HR images directly. Now, if $R = H - L$ represents the residual between LR-HR images, the loss function can be modified as
$$\text{loss} = \frac{1}{2}\,\lVert R - f(L) \rVert_2^2. \tag{3}$$
The VDR-net is trained by minimizing the loss function in Eq. (3), and the weights and biases are computed. The Stochastic Gradient Descent with Momentum (SGDM) technique [41] is used for obtaining the optimum parameters of the network. These parameters are used in the testing phase of our method to generate the residual image for a given LR image.

III. PROPOSED METHOD
In this section, we present the details of our network structure, training, and SR reconstruction process.

A. NETWORK STRUCTURE AND LOSS FUNCTION
The task of SR is to restore as much of the lost high-frequency detail of an image as possible. An efficient way to achieve this is to process the low-frequency and high-frequency content of an image separately. For this, we apply SWT on a given image $I$ to decompose it into its low-frequency $LL$ and high-frequency $LH$, $HL$, and $HH$ subbands.
$$\{LL, LH, HL, HH\} = \mathrm{SWT}(I) \tag{4}$$
Eq. (4) represents the subband decomposition of an image $I$ using the SWT operation. Here $LL$ represents the approximation coefficients, whereas $LH$, $HL$, and $HH$ are the detail coefficients along the horizontal, vertical, and diagonal directions. Similarly, for a given training dataset $\{L^{(i)}, H^{(i)}\}_{i=1}^{N}$, the LR subbands are given as $\{LL_L, LH_L, HL_L, HH_L\}$ and the HR subbands as $\{LL_H, LH_H, HL_H, HH_H\}$. Considering all these subbands together, we solve the SR problem in the wavelet domain.
Fig. 2 illustrates the structure of our deep CNN used in the training phase. The network has four input and four output channels connected to the VDR-net. The input channels couple each LR subband with the VDR-net, which transforms it into the corresponding output residual subband at the output channels. The output residuals corresponding to the


FIGURE 2. Proposed network structure.
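
To make the training objective behind the structure in Fig. 2 concrete, the following minimal sketch builds the residual target R = H − L for one subband and evaluates the loss of Eq. (3) for a candidate prediction. Nested Python lists stand in for subband arrays, the helper names are hypothetical, and the numbers are toy values rather than data from the paper.

```python
def residual_target(hr_band, lr_band):
    """Element-wise residual R = H - L for one subband."""
    return [[h - l for h, l in zip(hr_row, lr_row)]
            for hr_row, lr_row in zip(hr_band, lr_band)]


def channel_loss(target, predicted):
    """Half the squared l2 norm of (target - predicted), as in Eq. (3)."""
    return 0.5 * sum((t - p) ** 2
                     for t_row, p_row in zip(target, predicted)
                     for t, p in zip(t_row, p_row))


# Toy 2 x 2 subbands (illustrative values only)
hr = [[1.0, 2.0], [3.0, 4.0]]
lr = [[0.5, 2.0], [2.0, 4.0]]

target = residual_target(hr, lr)      # [[0.5, 0.0], [1.0, 0.0]]
zero_guess = [[0.0, 0.0], [0.0, 0.0]]  # a network that predicts no residual
print(channel_loss(target, zero_guess))
```

A prediction equal to the target gives a loss of zero, so the network is only asked to model what the interpolated LR subband is missing.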

FIGURE 3. Proposed SR reconstruction.
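
The reconstruction flow of Fig. 3 can be sketched in plain Python as follows: a one-level undecimated Haar decomposition, addition of predicted residuals to each subband, a Gaussian-based edge boost, and inverse SWT. This is a sketch under stated assumptions, not the authors' implementation: the periodic boundary in the transform, the 3σ kernel radius, the edge replication in the blur, the unit gain on the edge term, and the `predict_residual` callback (a stand-in for the trained VDR-net) are all choices made here for illustration, and subband naming conventions differ between toolboxes.

```python
import math

SQRT2 = math.sqrt(2.0)


def _transpose(m):
    return [list(col) for col in zip(*m)]


def _haar_rows(m):
    """Undecimated Haar analysis along each row (periodic boundary, assumed)."""
    lo = [[(r[j] + r[(j + 1) % len(r)]) / SQRT2 for j in range(len(r))] for r in m]
    hi = [[(r[j] - r[(j + 1) % len(r)]) / SQRT2 for j in range(len(r))] for r in m]
    return lo, hi


def _ihaar_rows(lo, hi):
    """Inverse of _haar_rows: average the two redundant estimates of each sample."""
    return [[(a[j] + d[j] + a[j - 1] - d[j - 1]) / (2 * SQRT2) for j in range(len(a))]
            for a, d in zip(lo, hi)]


def swt2(img):
    """One-level 2D SWT returning (LL, LH, HL, HH) under one common naming convention."""
    lo, hi = _haar_rows(img)
    ll, lh = (_transpose(b) for b in _haar_rows(_transpose(lo)))
    hl, hh = (_transpose(b) for b in _haar_rows(_transpose(hi)))
    return ll, lh, hl, hh


def iswt2(ll, lh, hl, hh):
    lo = _transpose(_ihaar_rows(_transpose(ll), _transpose(lh)))
    hi = _transpose(_ihaar_rows(_transpose(hl), _transpose(hh)))
    return _ihaar_rows(lo, hi)


def gaussian_blur(m, sigma=0.9):
    """Separable Gaussian low-pass; 3-sigma radius and edge replication are assumptions."""
    radius = max(1, math.ceil(3 * sigma))
    k = [math.exp(-(t * t) / (2 * sigma * sigma)) for t in range(-radius, radius + 1)]
    total = sum(k)
    k = [v / total for v in k]

    def blur_rows(x):
        n = len(x[0])
        return [[sum(w * r[min(max(j + t - radius, 0), n - 1)] for t, w in enumerate(k))
                 for j in range(n)] for r in x]

    return _transpose(blur_rows(_transpose(blur_rows(m))))


def add(a, b, scale=1.0):
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def reconstruct(lr_interp, predict_residual, sigma=0.9):
    """lr_interp: bicubic-upscaled LR image; predict_residual: stand-in for the VDR-net."""
    sr_bands = []
    for band in swt2(lr_interp):                       # Eq. (6)
        sr = add(band, predict_residual(band))         # Eq. (7)
        edges = add(sr, gaussian_blur(sr, sigma), -1)  # SR subband minus its smoothed version
        sr_bands.append(add(sr, edges))                # add the extracted edges back
    return iswt2(*sr_bands)
```

With a callback that returns all-zero residuals and a constant input image, `reconstruct` returns the input unchanged, which is a convenient sanity check that the transform pair and the edge step are wired consistently.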

LR subbands $\{LL_L, LH_L, HL_L, HH_L\}$ are represented by $\{LL_R, LH_R, HL_R, HH_R\}$.
As described in Section 2, the VDR-net has an image input layer, M − 2 middle layers, and a regression layer. The input layer consists of 64 filters of size 3 × 3, the middle layers have 64 filters of size 3 × 3 × 64 in each layer, and the regression layer has a single 3 × 3 × 64 sized filter. Here the objective of the VDR-net is to train a model $f$ that accepts the LR subbands and outputs the residuals between LR-HR subbands. From Eq. (3), the loss function corresponding to each channel is given as
$$CH_i^{\text{loss}} = \frac{1}{2}\,\lVert J_{i,R} - f(J_{i,L}) \rVert_2^2, \tag{5}$$
where $J_i \in \{LL, LH, HL, HH\}$ for $i = 1, 2, 3, 4$. We minimize the loss function using the SGDM technique [41] and solve for the optimum network parameters. The momentum and L2 regularization parameter are set to 0.9 and 0.0001, respectively, during the optimization process. The obtained network parameters are used in the testing phase to generate the residual subbands for a given set of LR image subbands.

B. SR RECONSTRUCTION
The proposed SR reconstruction scheme is outlined as a block diagram in Fig. 3. The given LR test image $L$ is first interpolated using the bicubic filter with a factor $s$ and then subjected to 2D SWT decomposition. From Eq. (4):
$$\{LL_L, LH_L, HL_L, HH_L\} = \mathrm{SWT}(L). \tag{6}$$
The LR subbands $\{LL_L, LH_L, HL_L, HH_L\}$ are fed forward through the VDR-net using the four input channels. The VDR-net transforms the LR subbands into the corresponding


output residual subbands $\{LL_R, LH_R, HL_R, HH_R\}$. Adding these residuals to the LR subbands produces the SR subbands $\{LL_S, LH_S, HL_S, HH_S\}$:
$$J_{i,S} = J_{i,L} + J_{i,R}, \tag{7}$$
where
$$J_i \in \{LL, LH, HL, HH\} \quad \text{for } i = 1, 2, 3, 4. \tag{8}$$
In the SR problem, bicubic interpolation in the preprocessing stage leads to the loss of high-frequency components. This loss is due to the averaging effect caused by the interpolation filters. As a result, the SR subbands are deficient in edge information, causing poor reconstruction quality. To increase the quality of the SR image, preserving edge information is essential. We apply the Gaussian low pass filter on all the SR subbands and subtract the Gaussian smoothed output from the SR subbands.

FIGURE 4. SR result pairs on the IXI-MR test images: (a) without Gaussian (b) with Gaussian (c) without Gaussian (d) with Gaussian.

The necessity of Gaussian edge-preservation can be understood from Fig. 4. In Fig. 4(a)-Fig. 4(b), we present one pair of the proposed SR results without and with the Gaussian operation, respectively. Similarly, Fig. 4(c)-Fig. 4(d) represent another pair of our SR results. From these figures, we can see that the SR images in Fig. 4(a) and Fig. 4(c) suffer from poor edge information, whereas the SR images in Fig. 4(b) and Fig. 4(d) have better edge details and are visually pleasing. The Gaussian process extracts the edge details from each SR subband, which are then added back to the SR subbands. The resultant SR subbands have more edge details, leading to superior SR reconstruction quality when fused using the Inverse SWT (ISWT).

IV. RESULTS AND DISCUSSION
In this section, we conduct experiments to evaluate the performance of our method. First, we present the details about image datasets, methods, and metrics used for comparison. Next, the parameter settings for training and testing are given. Finally, we provide a discussion on the results and assess the quality of SR algorithms.

A. EXPERIMENTAL CONFIGURATION
1) IMAGE DATASETS
The VDR-net is trained using the public IXI-MR image dataset (http://brain-development.org/ixi-dataset) [42]. The dataset has 600 MR-HR images collected from three different hospitals with Philips 3T, Philips 1.5T, and GE 3T systems, respectively. We have randomly selected 500 subjects with augmentation for training and performed selective testing on the other 100 subjects.
For a fair comparison with the state-of-the-art methods, we also train the VDR-net by employing three benchmark datasets, namely the 91 image dataset from Yang et al. [18], and BSDS200 and BSDS300 from the Berkeley segmentation datasets [43]. A total of 591 HR images are used in the training phase with data augmentation. For SR image reconstruction, we employ two benchmark datasets in the testing phase, ‘Set5’ [17] and ‘Set14’ [19], containing 5 and 14 images, respectively. In total, 19 images are used with three different scaling factors (2, 3, and 4). We use the luminance channel information alone for both training and testing phases. This is because humans are more sensitive to changes in intensity than in color.

2) METHODS AND METRICS
To compare the SR performance, twelve classic and recent state-of-the-art methods are involved in testing. These methods are based on interpolation, dictionary learning, and deep learning. Interpolation methods include bicubic, new edge directed interpolation (NEDI) [7], and local RFI (LRFI) [15]. SR via sparse representation (SCSR) [18], the statistical prediction model based on sparse representations (SPMSR) [22], and adjusted anchored neighborhood regression (A+) [24] are dictionary learning techniques. Cascaded deep sparse coding based networks (SCN) [33], SR using deep CNNs (SRCNN) [30], accelerating SRCNN (FSRCNN) [34], SR using very deep CNNs (VDSR) [25], deep wavelet prediction for SR (DWSR) [27], and multi-scale saliency and deep CNNs for SR (MSSCNN) [28] are based on deep learning.
To assess the performance of SR methods, four full-reference objective assessment indices are reported. They are Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), perceptual blur ($Q_{blur}$), and ringing ($Q_{ring}$) metrics [44]. PSNR and SSIM metrics have been widely used in the field of digital imaging for several decades. However, many researchers argue that these simple fidelity metrics do not provide a precise correlation with human perception. Hence, we report two additional metrics, $Q_{blur}$ and $Q_{ring}$, along with PSNR and SSIM. The $Q_{blur}$ and $Q_{ring}$ metrics enable us to measure the artifacts present in SR images and better assess the algorithms. For better performance, high PSNR and SSIM indices and low $Q_{blur}$ and $Q_{ring}$ indices are desired.

B. PARAMETER SETTINGS
1) TRAINING
During the training phase, the images are subjected to one level 2D SWT with the Haar wavelet function. It decomposes the HR images into $\{LL_H, LH_H, HL_H, HH_H\}$ subbands. These subbands are then down-sampled and upscaled using the bicubic filter by the same factor $s$ ($s$ = 2, 3, and 4) to yield the LR image subbands $\{LL_L, LH_L, HL_L, HH_L\}$. Subtracting the LR subbands from the HR subbands produces the residual subbands $\{LL_R, LH_R, HL_R, HH_R\}$. The VDR-net is trained using the LR and residual subbands individually for each subband type. The subbands are cropped to 41 × 41


FIGURE 5. Factor 4 SR results on the IXI-MR test image: (a) Ground truth image (b) bicubic (c) LRFI (d) VDSR (e) MSSCNN (f) Proposed.

FIGURE 6. Factor 3 SR results on the IXI-MR test image: (a) Ground truth image (b) bicubic (c) LRFI (d) VDSR (e) MSSCNN (f) Proposed.

FIGURE 7. Factor 2 SR results on the IXI-MR test image: (a) Ground truth image (b) bicubic (c) LRFI (d) VDSR (e) MSSCNN (f) Proposed.

sub-images with no overlap. The depth of the VDR-net is set to M = 20, and the weights and biases are updated by minimizing the loss function in Eq. (5) using the SGDM optimizer. For SGDM optimization, the momentum and L2 regularization parameter are set to 0.9 and 0.0001, respectively. The gradient threshold is set to 0.01 using norm clipping. The initial learning rate is 0.1 and decreases by a factor of 10 every 10 epochs. After 100 epochs, the VDR-net is fully converged, and it takes about 12 hours to train all the subbands with a single GPU.

2) TESTING
We consider the ground truth images from the IXI-MR dataset, ‘Set5’ [17], and ‘Set14’ [19], for full-reference image quality assessment. These images are down-sampled and upscaled using bicubic interpolation by a factor $s$ ($s$ = 2, 3, and 4) to generate the LR test images. We use one level 2D SWT with the Haar wavelet function for subband decomposition. The Gaussian low pass filtering is applied on the SR subbands with standard deviation σ = 0.9. For SR reconstruction by a factor $s$, we use the weights and biases of the VDR-net trained with the same factor $s$.

C. QUALITY ASSESSMENT
We assess the quality of SR algorithms for three different upscaling factors. Figs. 5-7 present the SR results on three different test images from the IXI-MR dataset for $s$ = 4, $s$ = 3, and $s$ = 2, respectively. Figs. 5(a)-7(a) represent the ground truth MR images. The results based on interpolation are depicted in Figs. 5(b)-7(b) for bicubic and Figs. 5(c)-7(c) for LRFI. In Figs. 5(d)-7(d) and Figs. 5(e)-7(e), we present the results of the deep learning methods VDSR and MSSCNN, respectively. When compared with the interpolation methods and deep learning methods in Figs. 5(b)-7(b), Figs. 5(c)-7(c), Figs. 5(d)-7(d), and Figs. 5(e)-7(e), the SR images of the proposed method in Figs. 5(f)-7(f) appear close to the ground truth images. Besides, the proposed method has high PSNR and SSIM indices with minimized $Q_{blur}$ and $Q_{ring}$ artifacts.
To demonstrate the effectiveness of the proposed method, we further trained the VDR-net using the Yang et al. [18] dataset and the BSDS200 and BSDS300 datasets [43]. The results are tested on the standard ‘Set5’ [17] and ‘Set14’ [19] images. Figs. 8-13 present visual comparisons of SR methods on ‘Set5’ and ‘Set14’ test images. Fig. 8 and Fig. 9 show the SR results for $s$ = 4, Fig. 10 and Fig. 11 show the SR results for $s$ = 3,


FIGURE 8. Factor 4 SR results on Comic image: (a) Ground truth image (b) bicubic (c) SCSR (d) SPMSR (e) A+ (f) SCN (g) SRCNN (h) FSRCNN (i) VDSR (j) DWSR (k) MSSCNN (l) Proposed.

FIGURE 9. Factor 4 SR results on Woman image: (a) Ground truth image (b) bicubic (c) SCSR (d) SPMSR (e) A+ (f) SCN (g) SRCNN (h) FSRCNN (i) VDSR (j) DWSR (k) MSSCNN (l) Proposed.

and Fig. 12 and Fig. 13 show the SR results for $s$ = 2. Fig. 8(a) and Fig. 9(a) represent the ground truth images of Comic and Woman images. Fig. 8(b) and Fig. 9(b) are the results obtained using bicubic interpolation. The results of dictionary learning methods, namely SCSR, SPMSR, and A+ are depicted in Figs. 8(c)-8(e) and Figs. 9(c)-9(e). Whereas


FIGURE 10. Factor 3 SR results on Bird image: (a) Ground truth image (b) bicubic (c) SCSR (d) SPMSR (e) A+ (f) SCN (g) SRCNN (h) FSRCNN (i) VDSR (j) DWSR (k) MSSCNN (l) Proposed.

FIGURE 11. Factor 3 SR results on Flowers image: (a) Ground truth image (b) bicubic (c) SCSR (d) SPMSR (e) A+ (f) SCN (g) SRCNN (h) FSRCNN (i) VDSR (j) DWSR (k) MSSCNN (l) Proposed.

the deep learning methods SCN, SRCNN, FSRCNN, VDSR, DWSR, and MSSCNN are shown in Figs. 8(f)-8(k) and Figs. 9(f)-9(k). The SR reconstruction results of the proposed method are outlined in Fig. 8(l) and Fig. 9(l). The bicubic method suffers from blurring and ringing artifacts. The SR images of dictionary learning methods are distorted and fuzzy in nature. The deep learning methods generate images with less distortion and fewer artifacts. However, the details of the flower (Figs. 8(f)-8(k)) and the textures of the hat (Figs. 9(f)-9(k)) in the cropped regions are not well preserved. The flower details of the Comic image and the hat textures of the Woman image are better preserved by the proposed method, as shown in Fig. 8(l) and Fig. 9(l).
Figs. 10(a)-13(a) represent the ground truth images of Bird, Flowers, Baby, and Coastguard images, respectively. The results of different SR methods are shown in Figs. 10(b)-10(l) and Figs. 11(b)-11(l) for $s$ = 3, and in Figs. 12(b)-12(l) and Figs. 13(b)-13(l) for $s$ = 2. Figs. 10(b)-13(b) show the bicubic interpolated images. Figs. 10(c)-13(c), Figs. 10(d)-13(d), and Figs. 10(e)-13(e) represent the dictionary learning methods. The images generated by deep learning methods are given in Figs. 10(f)-13(f), Figs. 10(g)-13(g), Figs. 10(h)-13(h), Figs. 10(i)-13(i), Figs. 10(j)-13(j), and Figs. 10(k)-13(k). The proposed method is outlined in Figs. 10(l)-13(l). As shown in Figs. 10-13, the bicubic method has severe blurring and ringing artifacts. The performance of the dictionary learning methods SCSR, SPMSR, and A+ is inferior to that of the deep learning methods SCN, SRCNN, FSRCNN, VDSR, DWSR, and MSSCNN. However, the edge-preservation in some of these


FIGURE 12. Factor 2 SR results on Baby image: (a) Ground truth image (b) bicubic (c) SCSR (d) SPMSR (e) A+ (f) SCN (g) SRCNN (h) FSRCNN (i) VDSR (j) DWSR (k) MSSCNN (l) Proposed.

FIGURE 13. Factor 2 SR results on Coastguard image: (a) Ground truth image (b) bicubic (c) SCSR (d) SPMSR (e) A+ (f) SCN (g) SRCNN (h) FSRCNN (i) VDSR (j) DWSR (k) MSSCNN (l) Proposed.

methods is still unsatisfactory (e.g., Fig. 10(f), Fig. 11(k), and Fig. 13(j)). The proposed method performs better than the other methods and is capable of preserving the sharpness of edges (e.g., Fig. 10(l) and Fig. 11(l)).
Table 1 presents PSNR, SSIM, $Q_{blur}$, and $Q_{ring}$ indices for $s$ = 4. Each metric value in the table represents the average over the 19 test images from ‘Set5’ and ‘Set14’. Similarly, the average metric indices for $s$ = 3 and $s$ = 2 are listed in Table 2 and Table 3, respectively.
It can be noticed that the proposed method achieves better results in terms of PSNR for all scaling factors. The SSIM index of the proposed method for $s$ = 4 is slightly inferior to that of the MSSCNN method, but superior for $s$ = 3 and $s$ = 2. In addition, our method produces low $Q_{blur}$ and $Q_{ring}$ indices for the three scaling factors and is comparable with A+ and the deep learning methods. We noticed that LRFI also yields low $Q_{blur}$ values for $s$ = 2 and $s$ = 3. However, its $Q_{ring}$ values are higher than those of A+ and the deep learning methods. Also, the SCSR and SPMSR methods have low $Q_{blur}$ and $Q_{ring}$ values for $s$ = 2, but much higher values for $s$ = 3 and $s$ = 4. For better performance of an SR algorithm, high PSNR and SSIM indices and low $Q_{blur}$ and $Q_{ring}$ indices are desired. Considering all the four objective metrics together, it is evident that the proposed method is quantitatively superior to the existing methods.
In Table 4, we show the running times of different SR methods for $s$ = 4. Each value in the table represents the

71414 VOLUME 9, 2021


G. Suryanarayana et al.: Accurate MR Image SR Using Deep Networks and Gaussian Filtering

TABLE 1. PSNR, SSIM, Qblur and Qring metrics for upscaling factor 4. TABLE 4. Comparison of running times for upscaling factor 4.

TABLE 5. Comparison of running times with 1024 × 1024 pixel images for
upscaling factor 4.

method consume less than 5 seconds. NEDI, SPMSR, and


SRCNN require 10 seconds approximately. LRFI and SCSR
are the slowest methods, with execution times 37 seconds and
117 seconds.
In Table 5, we show the running times of different SR
methods for s = 4. However, all the 19 test images are first
TABLE 2. PSNR, SSIM, Qblur and Qring metrics for upscaling factor 3. resized to 1024 × 1024 before downsampling and upscaling
operation by a factor s. We can notice that bicubic, A+,
DWSR, and MSSCNN execute in 1 second approximately.
SCN, FSRCNN, and the proposed method require less than
10 seconds. NEDI, LRFI, SPMSR, and SRCNN have an
execution range between 20 to 50 seconds. Whereas VDSR
and SCSR demand 65 seconds and 598 seconds.
From the discussion based on Tables 1-3 and Figs. 5-13,
we summarize that our method is superior in terms of objec-
tive and subjective quality assessment. Besides, the low com-
putational time is advantageous for real-time implementation.

V. CONCLUSION
In this work, we presented a new single-image MR-SR algo-
rithm using the VDR-net in the stationary wavelet domain.
The idea of residual learning and the wavelet subbands
TABLE 3. PSNR, SSIM, Qblur and Qring metrics for upscaling factor 2. increase data sparsity in the training and testing phases.
As a result, our algorithm has less computational complexity
and hence suitable for 24fps real-time implementation. SWT
promises shift-invariance and superior directionality features
when compared to DWT. Besides, the edge-preservation
using Gaussian operation helps to maintain the intrinsic
structure of the SR images. We have shown improve-
ments over the conventional and state-of-the-art SR meth-
ods in PSNR, SSIM, Qblur , and Qring metrics. In addition,
the edge-preserving nature of the proposed SR method can
be identified from the subjective analysis.

REFERENCES
[1] J. Tian and K.-K. Ma, ‘‘A survey on super-resolution imaging,’’ Signal,
Image Video Process., vol. 5, no. 3, pp. 329–342, 2011.
[2] S. C. Park, M. K. Park, and M. G. Kang, ‘‘Super-resolution image recon-
struction: A technical overview,’’ IEEE signal Process. Mag., vol. 20, no. 3,
pp. 21–36, Oct. 2003.
[3] M. Krichen, S. Mechti, R. Alroobaea, E. Said, P. Singh, O. I. Khalaf, and
M. Masud, ‘‘A formal testing model for operating room control system
average of 19 test images. The experiments are conducted in using Internet of Things,’’ Comput., Mater. Continua, vol. 66, no. 3,
Matlab on a system with 8 GB RAM and Intel(R) Core(TM) pp. 2997–3011, 2021.
i5-7400 CPU: 3.00 GHz. It is noticed that bicubic, A+, [4] O. I. Khalaf, K. A. Ogudo, and M. Singh, ‘‘A fuzzy-based optimiza-
tion technique for the energy and spectrum efficiencies trade-off in
DWSR, and MSSCNN are the fastest methods that can exe- cognitive radio-enabled 5G network,’’ Symmetry, vol. 13, no. 1, p. 47,
cute in 1 second. SCN, FSRCNN, VDSR, and the proposed Dec. 2020.

VOLUME 9, 2021 71415


G. Suryanarayana et al.: Accurate MR Image SR Using Deep Networks and Gaussian Filtering

[5] X. Xiang, Q. Li, S. Khan, and O. I. Khalaf, ‘‘Urban water resource [29] C. Dong, C. C. Loy, K. He, and X. Tang, ‘‘Learning a deep convolutional
management for sustainable environment planning using artificial intel- network for image super-resolution,’’ in Proc. Eur. Conf. Comput. Vis.,
ligence techniques,’’ Environ. Impact Assessment Rev., vol. 86, Jan. 2021, 2014, pp. 184–199.
Art. no. 106515. [30] C. Dong, C. C. Loy, K. He, and X. Tang, ‘‘Image super-resolution using
[6] H. Hou and H. Andrews, ‘‘Cubic splines for image interpolation and digital deep convolutional networks,’’ IEEE Trans. Pattern Anal. Mach. Intell.,
filtering,’’ IEEE Trans. Acoust., Speech, Signal Process., vol. 26, no. 6, vol. 38, no. 2, pp. 295–307, Feb. 2016.
pp. 508–517, Dec. 1978. [31] T.-W. Hui, C. C. Loy, and X. Tang, ‘‘Depth map super-resolution by
[7] X. Li and M. T. Orchard, ‘‘New edge-directed interpolation,’’ IEEE Trans. deep multi-scale guidance,’’ in Proc. Eur. Conf. Comput. Vis., 2016,
Image Process., vol. 10, no. 10, pp. 1521–1527, Mar. 2001. pp. 353–369.
[8] L. Zhang and X. Wu, ‘‘An edge-guided image interpolation algorithm via [32] S. Zhu, S. Liu, C. C. Loy, and X. Tang, ‘‘Deep cascaded bi-network for
directional filtering and data fusion,’’ IEEE Trans. Image Process., vol. 15, face hallucination,’’ in Proc. Eur. Conf. Comput. Vis., 2016, pp. 614–630.
no. 8, pp. 2226–2238, Aug. 2006. [33] Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang, ‘‘Deep networks for image
[9] H. Chavez-Roman and V. Ponomaryov, ‘‘Super resolution image gen- super-resolution with sparse prior,’’ in Proc. Int. Conf. Comput. Vis., 2015,
eration using wavelet domain interpolation with edge extraction via a pp. 370–378.
sparse representation,’’ IEEE Geosci. Remote Sens. Lett., vol. 11, no. 10, [34] C. Dong, C. C. Loy, and X. Tang, ‘‘Accelerating the super-resolution
pp. 1777–1781, Oct. 2014. convolutional neural network,’’ in Proc. Eur. Conf. Comput. Vis., 2016,
[10] Y. Chen, K. Niu, Z. Zeng, and Y. Pan, ‘‘A wavelet based deep learn- pp. 391–407.
ing method for underwater image super resolution reconstruction,’’ IEEE [35] C. Ledig, ‘‘Photo-realistic single image super-resolution using a generative
Access, vol. 8, pp. 117759–117769, 2020. adversarial network,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,
[11] G. Suryanarayana and R. Dhuli, ‘‘Super-resolution image reconstruc- Oct. 2017, pp. 4681–4690.
tion using dual-mode complex diffusion-based shock filter and singular [36] X. Wang, K. Yu, C. Dong, and C. Change Loy, ‘‘Recovering realistic tex-
value decomposition,’’ Circuits, Syst., Signal Process., vol. 36, no. 8, ture in image super-resolution by deep spatial feature transform,’’ in Proc.
pp. 3409–3425, 2017. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 606–615.
[12] M. V. Rao and V. B. Raju, ‘‘Image resolution enhancement technique [37] J.-B. Huang, A. Singh, and N. Ahuja, ‘‘Single image super-resolution from
using lifting wavelet and discrete wavelet transforms,’’ in Innovations in transformed self-exemplars,’’ in Proc. IEEE Conf. Comput. Vis. Pattern
Computer Science and Engineering. Hyderabad, India: Springer, 2016, Recognit., 2015, pp. 5197–5206.
pp. 235–239. [38] Z. Cui, H. Chang, S. Shan, B. Zhong, and X. Chen, ‘‘Deep network
[13] M. Z. Iqbal, A. Ghafoor, A. M. Siddiqui, M. M. Riaz, and U. Khalid, cascade for image super-resolution,’’ in Proc. Eur. Conf. Comput. Vis.,
‘‘Dual-tree complex wavelet transform and SVD based medical image res- 2014, pp. 49–64.
olution enhancement,’’ Signal Process., vol. 105, pp. 430–437, Dec. 2014. [39] Z. Wang, Y. Yang, Z. Wang, S. Chang, W. Han, J. Yang, and T. Huang,
[14] Y. Zhang, Q. Fan, F. Bao, Y. Liu, and C. Zhang, ‘‘Single-image super- ‘‘Self-tuned deep super resolution,’’ in Proc. IEEE Conf. Comput. Vis.
resolution based on rational fractal interpolation,’’ IEEE Trans. Image Pattern Recognit. Workshops, 2015, pp. 1–8.
Process., vol. 27, no. 8, pp. 3782–3797, Aug. 2018. [40] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for
[15] K. Shao, Q. Fan, Y. Zhang, F. Bao, and C. Zhang, ‘‘Noisy single image large-scale image recognition,’’ 2014, arXiv:1409.1556. [Online]. Avail-
super-resolution based on local fractal feature analysis,’’ IEEE Access, able: http://arxiv.org/abs/1409.1556
vol. 9, pp. 33385–33395, 2021. [41] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, ‘‘Gradient-based learn-
[16] H. Chang, D.-Y. Yeung, and Y. Xiong, ‘‘Super-resolution through neighbor ing applied to document recognition,’’ Proc. IEEE, vol. 86, no. 11,
embedding,’’ in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern pp. 2278–2324, Oct. 1998.
Recognit., vol. 1, Jul. 2004, p. 1. [42] Y. Huang, L. Shao, and A. F. Frangi, ‘‘Simultaneous super-resolution and
[17] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel, ‘‘Low- cross-modality synthesis of 3D medical images using weakly-supervised
complexity single-image super-resolution based on nonnegative neighbor joint convolutional sparse coding,’’ in Proc. IEEE Conf. Comput. Vis.
embedding,’’ in Proc. BMVC, 2012, pp. 1–5. Pattern Recognit. (CVPR), Jul. 2017, pp. 5787–5796.
[18] J. Yang, J. Wright, T. S. Huang, and Y. Ma, ‘‘Image super-resolution [43] D. Martin, ‘‘A database of human segmented natural images and its appli-
via sparse representation,’’ IEEE Trans. Image Process., vol. 19, no. 11, cation to evaluating segmentation algorithms and measuring ecological
pp. 2861–2873, Nov. 2010. statistics,’’ in Proc. ICCCV, Vancouver, BC, Canada, 2001, pp. 1–10.
[19] R. Zeyde, M. Elad, and M. Protter, ‘‘On single image scale-up using sparse- [44] P. Marziliano, F. Dufaux, S. Winkler, and T. Ebrahimi, ‘‘Perceptual blur
representations,’’ in Proc. Int. Conf. curves Surf., 2010, pp. 711–730. and ringing metrics: Application to JPEG2000,’’ Signal Process., Image
[20] W. Dong, L. Zhang, and G. Shi, ‘‘Centralized sparse representation Commun., vol. 19, no. 2, pp. 163–172, Feb. 2004.
for image restoration,’’ in Proc. Int. Conf. Comput. Vis., Nov. 2011,
pp. 1259–1266.
[21] W. Dong, L. Zhang, G. Shi, and X. Li, ‘‘Nonlocally centralized sparse
representation for image restoration,’’ IEEE Trans. Image Process., vol. 22,
no. 4, pp. 1620–1630, Apr. 2013.
[22] T. Peleg and M. Elad, ‘‘A statistical prediction model based on sparse
representations for single image super-resolution,’’ IEEE Trans. Image
Process., vol. 23, no. 6, pp. 2569–2582, Jun. 2014.
[23] R. Timofte, V. De, and L. V. Gool, ‘‘Anchored neighborhood regression for
fast example-based super-resolution,’’ in Proc. IEEE Int. Conf. Comput.
Vis., Dec. 2013, pp. 1920–1927.
[24] R. Timofte, V. De, and L. V. Gool, ‘‘A+: Adjusted anchored neighborhood
regression for fast super-resolution,’’ in Proc. Asian Conf. Comput. Vis.,
2014, pp. 111–126.
[25] J. Kim, J. Kwon Lee, and K. Mu Lee, ‘‘Accurate image super-resolution GUNNAM SURYANARAYANA received the
using very deep convolutional networks,’’ in Proc. IEEE Conf. Comput.
B.Tech. and M.Tech. degrees from Jawaharlal
Vis. Pattern Recognit., 2016, pp. 1646–1654.
Nehru Technological University, India, in 2008 and
[26] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for
image recognition,’’ in Proc. IEEE Conf. Comput. Vis. pattern Recognit.,
2010, respectively, and the Ph.D. degree from the
Oct. 2016, pp. 770–778. School of Electronics Engineering, VIT Univer-
[27] T. Guo, H. S. Mousavi, T. H. Vu, and V. Monga, ‘‘Deep wavelet prediction sity, Vellore, India, in 2016. He is currently a Post-
for image super-resolution,’’ in Proc. IEEE Conf. Comput. Vis. Pattern doctoral Researcher with the Institute of Image
Recognit. Workshops (CVPRW), Jul. 2017, pp. 104–113. Processing and Pattern Recognition, Shanghai Jiao
[28] G. Suryanarayana, E. Tu, and J. Yang, ‘‘Infrared super-resolution imaging Tong University, Shanghai, China. His research
using multi-scale saliency and deep wavelet residuals,’’ Infr. Phys. Tech- interests include image super-resolution, image
nol., vol. 97, pp. 177–186, Mar. 2019. fusion, and medical imaging.


KARTHIK CHANDRAN (Senior Member, IEEE) received the B.E. degree in electronics and instrumentation engineering from the Kamaraj College of Engineering and Technology, India, in 2007, and the M.Tech. degree in control and instrumentation engineering and the Ph.D. degree from the Kalasalingam Academy of Research and Education (KARE), in 2011 and 2017, respectively. He served as a Postdoctoral Researcher with Shanghai Jiao Tong University, China, from 2018 to 2020. He is currently serving as an Associate Professor with the Department of Mechatronics Engineering, Jyothi Engineering College, Thrissur. He has published several international journal and conference papers. His research interests include time delay control problems, nonlinear system identification, cascade control systems, and unmanned vehicles.

OSAMAH IBRAHIM KHALAF received the B.Sc. degree in software engineering from Al-Rafidain University College, Iraq, the M.Sc. degree in computer engineering from Belarusian National Technical University, and the Ph.D. degree in computer networks from the Faculty of Computer Systems and Software Engineering, University Malaysia Pahang. He is currently a Senior Engineering and Telecommunications Lecturer with Al-Nahrain University. He has 17 years of university-level teaching experience in computer science and network technology, and has a strong CV in research activities in computer science and information technology projects. He has had many published articles indexed in ISI/Thomson Reuters and has also participated and presented at numerous international conferences. He has a patent and has received several medals and awards due to his innovative work and research activities. He has good skills in software engineering, including experience with Net, SQL development, database management, mobile applications design, mobile techniques, Java development, Android development, iOS mobile development, cloud system and computations, and Website design. He was the editor in chief and a main guest editor in many Scopus and SCI indexed journals. His personal strengths include being a highly self-motivated team player who can work independently with minimum supervision, strong leadership skills, and an outgoing personality. He has overseas work experience with Binary University, Malaysia, and University Malaysia Pahang.

YOUSEEF ALOTAIBI received the master's degree in information technology (computer network) from La Trobe University, Melbourne, Australia, in 2009, and the Ph.D. degree from the Department of Computer Science and Computer Engineering, La Trobe University, in 2014. He is currently an Associate Professor with the Department of Computer Science, College of Computer and Information Systems, Umm Al-Qura University, Saudi Arabia. He has published several international journal and conference papers. His research interests include business process modelling, business process reengineering, information systems, security, business and IT alignment, software engineering, system analysis and design, sustainability, and smart cities development.

ABDULMAJEED ALSUFYANI received the bachelor's degree (Hons.) in computer science from Taif University, Saudi Arabia, in 2006, and the master's and Ph.D. degrees in computer science from the University of Kent, U.K., in 2010 and 2015, respectively. He is currently an Associate Professor of computer science with the College of Computers and Information Technology, Taif University. His research interests include computational intelligence, computational neuroscience, and machine learning algorithms.

SALEH AHMED ALGHAMDI received the Bachelor of Education degree (Hons.) from the Department of Computer Science, Teachers College, Riyadh, Saudi Arabia, in 2004, the Master of Information Technology degree from La Trobe University, Melbourne, Australia, in 2010, and the Doctor of Philosophy degree in computer science from the Royal Melbourne Institute of Technology (RMIT) University, Melbourne, in 2014, with the thesis title A Context-aware Navigational Autonomy Aid for the Blind. He is currently an Associate Professor with the Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia. His research interests include context awareness, positioning and navigation, and visually impaired assistance.
