NAFNet
Introduction
- NAFNet is a lightweight neural network proposed by Liangyu
Chen*, Xiaojie Chu*, et al. in their research paper “Simple Baselines for Image
Restoration”
- Deep learning models have significantly outperformed classical
methods such as per-pixel band filtering, mean filtering,
Gaussian filtering, and Fourier-transform-based approaches.
- Not only does this model outperform these traditional methods, it
also achieves higher accuracy in both denoising and
deblurring; for our purposes, we will focus on denoising
alone.
Key Features
- Nonlinear Activation Free
This corresponds to the NAF in NAFNet
NAFNet demonstrates that you can match or exceed the
performance of current state-of-the-art deep learning models for
denoising while removing nonlinear
activation functions such as ReLU, Softmax, and Sigmoid
The authors found that the GELU used in the baseline model can be
viewed as a special case of GLU (Gated Linear Unit)
Gaussian Error Linear Unit -> Gated Linear Unit
GLU itself was introduced and validated in
“Language Modeling with Gated Convolutional Networks”
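The GELU-as-GLU relationship above can be sketched numerically: GELU(x) = x · Φ(x), which matches the generic GLU form value · σ(gate) when the gate is the input itself and σ is the Gaussian CDF Φ. A minimal stdlib-only sketch (function names here are illustrative, not from the paper's code):

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def glu(value, gate, sigma):
    """Generic GLU form: value * sigma(gate)."""
    return value * sigma(gate)

# GELU falls out of the GLU form when the gate is the input itself
# and the gate nonlinearity is the Gaussian CDF Phi.
phi = lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
assert abs(gelu(0.7) - glu(0.7, 0.7, phi)) < 1e-12
```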
Viewing GELU as a GLU, the gating can be further reduced to a
simple element-wise product of feature maps.
Apart from this, CA (Channel Attention) has also been shown to be
replaceable by a form of GLU, called SCA (Simplified Channel
Attention)
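The element-wise product gating described above (SimpleGate in the paper) can be sketched with NumPy: split the channel dimension in half and multiply the two halves, with no activation function involved.

```python
import numpy as np

def simple_gate(x):
    """SimpleGate sketch: split the channel dimension in half and take
    the element-wise product of the two halves.
    x: feature map of shape (C, H, W) with C even."""
    c = x.shape[0]
    return x[: c // 2] * x[c // 2 :]

feat = np.arange(24, dtype=float).reshape(6, 2, 2)
out = simple_gate(feat)
assert out.shape == (3, 2, 2)  # channel count is halved
```

Note that the output has half the channels of the input, so surrounding layers must account for this.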
As we saw, Channel Attention squeezes spatial information into
channel-wise statistics, passes them through convolutions, and
applies a nonlinear activation to produce per-channel weights.
When reduced in a GLU fashion, only the two most important roles of
channel attention are retained: aggregating
global information and channel information interaction.
The result is SCA.
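The two retained roles can be sketched as follows: global average pooling aggregates global information per channel, a pointwise (1x1) linear map provides channel interaction, and the result rescales the feature map in a GLU-style product. The weight matrix `w` here is a hypothetical stand-in for the learned 1x1 convolution, not the paper's actual parameters.

```python
import numpy as np

def simplified_channel_attention(x, w):
    """SCA sketch. x: feature map of shape (C, H, W);
    w: (C, C) channel-mixing weights (stand-in for a learned 1x1 conv)."""
    pooled = x.mean(axis=(1, 2))        # aggregate global information -> (C,)
    weights = w @ pooled                # channel information interaction -> (C,)
    return x * weights[:, None, None]   # GLU-style per-channel rescaling

feat = np.random.randn(8, 4, 4)
w = np.random.randn(8, 8) * 0.1
out = simplified_channel_attention(feat, w)
assert out.shape == feat.shape
```

Unlike full Channel Attention, there is no nonlinear activation anywhere in this path.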
- Efficiency and Performance
After removing and simplifying the nonlinear
activations, the model processes features largely through linear
transformations, which is efficient.
The memory and time complexity of self-attention is reduced by
computing a channel-wise attention map rather than a spatial one;
this also allows features to be captured block-wise.
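The efficiency gain from channel-wise attention can be illustrated by comparing map sizes: a spatial attention map has (HW)² entries and grows with image resolution, while a channel-wise map has only C² entries, independent of resolution. A small NumPy sketch (shapes chosen for illustration):

```python
import numpy as np

# Compare attention-map sizes for a feature map with C channels
# flattened over H*W spatial positions.
C, H, W = 32, 64, 64
x = np.random.randn(C, H * W)

spatial_map_size = (H * W) ** 2   # entries in an (HW x HW) spatial map
channel_map = x @ x.T             # (C x C) channel-wise attention map
assert channel_map.shape == (C, C)
assert channel_map.size < spatial_map_size  # 1,024 vs 16,777,216 entries
```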
The proposed baseline and NAFNet achieve state-of-the-art results
while being computationally efficient:
33.69 dB PSNR on GoPro (deblurring), exceeding the previous SOTA by 0.38 dB
with only 8.4% of its computational cost
40.30 dB PSNR on SIDD (denoising), exceeding the previous SOTA by 0.28 dB
with less than half of its computational cost
Dataset (SIDD)
- SIDD, the Smartphone Image Denoising Dataset, consists of ~30,000
noisy images captured from 10 scenes under different lighting
conditions using five representative smartphone cameras
- Each real noisy image is paired with a ground-truth image
generated for it
- The dataset accounts for the effect of spatial
misalignment among images due to lens motion (i.e., optical
stabilization) and radial distortion, and the effect of clipped
intensities due to low-light conditions or over-exposure.
- They have used a direct current (DC) light source to avoid the
flickering effect of alternating current (AC) lights.
- Five smartphone cameras were used:
- Apple iPhone 7
- Google Pixel
- Samsung Galaxy S6 Edge
- Motorola Nexus 6
- LG G4
- 10 different scenes
- 15 different ISO levels ranging from 50 up to 10,000 to obtain a
variety of noise levels
- 3 Illumination temperatures to simulate the effect of different light
sources: 3200K for tungsten or halogen, 4400K for fluorescent
lamps, and 5500K for daylight.
- Three light brightness levels: low, normal, and high
- (10 scenes x 5 cameras x 4 conditions x 150 images) => 30,000
images
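The image count above follows directly from the capture grid; a quick arithmetic check:

```python
# Capture grid stated above: 10 scenes x 5 cameras x 4 conditions
# x 150 images per setting.
scenes, cameras, conditions, images_per_setting = 10, 5, 4, 150
total = scenes * cameras * conditions * images_per_setting
assert total == 30_000
```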