Low-Light-Image-Enhancement-Using-MIRNet

The project has two main objectives: to understand the entire architecture of the underlying neural network, and to implement a deep-learning-based image processing algorithm on an embedded device, namely the Arduino Nano 33 BLE Sense.

Introduction

Recently, convolutional neural networks (CNNs) have played a key role in various image processing tasks. Existing CNN-based methods typically operate either on full-resolution or on progressively lower-resolution representations. In the former case, the results are spatially precise but contextually less robust; in the latter case, the outputs are semantically reliable but spatially less accurate. The MIRNet architecture is designed to achieve both goals together: maintaining spatially precise high-resolution representations through the entire network while receiving strong contextual information from low-resolution representations. We have adopted the MIRNet architecture for this project, with reference to the research paper that describes it.

Technologies Used

MIRNet Architecture

The core of the MIRNet Architecture is a multi-scale residual block containing several key elements:
(a) parallel multi-resolution convolution streams for extracting multi-scale features.
(b) information exchange across the multi-resolution streams.
(c) spatial and channel attention mechanisms for capturing contextual information.
(d) attention based multi-scale feature aggregation.

Framework of the MIRNet Architecture

Overall Pipeline

Let 𝐈 ∈ ℝ^{H×W×3} be the input image. The network first applies a convolutional layer to extract low-level features 𝐗_0 ∈ ℝ^{H×W×C}. Next, the feature maps 𝐗_0 pass through N recursive residual groups (RRGs), which yield deep features 𝐗_d ∈ ℝ^{H×W×C}. Each RRG contains several multi-scale residual blocks. One more convolutional layer is then applied to the deep features 𝐗_d to obtain a residual image 𝐑 ∈ ℝ^{H×W×3}. The restored image is obtained as Î = 𝐈 + 𝐑. The network is optimized with the Charbonnier loss: $$\mathcal{L}(\hat{I}, I^*) = \sqrt{\|\hat{I} - I^*\|^2 + \varepsilon^2}$$ where
𝐈* denotes the ground-truth image,
ε is a constant, empirically set to 10^-3 for all experiments.
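
The loss above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the training code from the notebooks: the function name `charbonnier_loss` and the toy 2x2 image values are assumptions made for the example, and the norm is taken over all pixels exactly as the formula is written (some implementations average over pixels instead).

```python
import numpy as np

def charbonnier_loss(restored, target, eps=1e-3):
    """Charbonnier loss as written above: sqrt(||restored - target||^2 + eps^2)."""
    diff = np.asarray(restored, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    return float(np.sqrt(np.sum(diff ** 2) + eps ** 2))

# Toy 2x2 RGB example with made-up values (not taken from the dataset).
low = np.full((2, 2, 3), 0.2)        # input image I
residual = np.full((2, 2, 3), 0.3)   # residual R, standing in for the network output
target = np.full((2, 2, 3), 0.5)     # ground-truth image I*
restored = low + residual            # restored image: I + R
loss = charbonnier_loss(restored, target)
```

Because of the ε term, the loss stays differentiable even when the restored image matches the ground truth; in that case its value is ε rather than 0.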

Multi-scale Residual Block

The research paper proposes a multi-scale residual block (MRB) that generates a spatially precise output by maintaining high-resolution representations while receiving rich contextual information from low-resolution representations. The MRB consists of multiple (three in the paper) fully convolutional streams connected in parallel. It allows information exchange across the parallel streams so that the high-resolution features are consolidated with the help of the low-resolution features, and vice versa. The components of the MRB are described as follows:

1. Selective Kernel Feature Fusion (SKFF)

The SKFF module dynamically adjusts receptive fields via two operations: Fuse and Select. The Fuse operator generates global feature descriptors by combining the information from the multi-resolution streams. The Select operator uses these descriptors to recalibrate the feature maps of the different streams and then aggregates them.

Schematic for SKFF
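
The Fuse and Select steps can be sketched roughly as follows in NumPy. This is an illustrative approximation, not the code from the notebooks: the function name `skff`, the weight arrays `w_down` and `w_up`, the ReLU activation, and the softmax across streams are assumptions, the weights are random placeholders rather than trained values, and the streams are assumed to have already been resized to a common shape.

```python
import numpy as np

def skff(streams, w_down, w_up):
    """Fuse same-shaped feature maps (H, W, C) coming from several streams.

    w_down: (C, r) weights of a channel-reducing layer (hypothetical).
    w_up:   (n_streams, r, C) weights of one channel-expanding layer per stream.
    """
    stacked = np.stack(streams)                  # (n, H, W, C)
    # Fuse: element-wise sum, then global average pooling over H and W.
    fused = stacked.sum(axis=0)                  # (H, W, C)
    s = fused.mean(axis=(0, 1))                  # (C,) global descriptor
    z = np.maximum(s @ w_down, 0.0)              # (r,) compact descriptor
    # Select: one descriptor per stream, normalised across streams.
    v = np.einsum("r,nrc->nc", z, w_up)          # (n, C)
    v = v - v.max(axis=0, keepdims=True)         # for numerical stability
    attn = np.exp(v) / np.exp(v).sum(axis=0, keepdims=True)
    # Recalibrate each stream with its weights, then aggregate.
    out = (stacked * attn[:, None, None, :]).sum(axis=0)
    return out, attn

rng = np.random.default_rng(0)
H, W, C, r = 4, 4, 8, 2
streams = [rng.standard_normal((H, W, C)) for _ in range(3)]
out, attn = skff(streams, rng.standard_normal((C, r)), rng.standard_normal((3, r, C)))
```

For every channel the attention weights of the three streams sum to 1, so the output is a per-channel weighted combination of the streams rather than a plain sum or concatenation.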

2. Dual Attention Unit (DAU)

While the SKFF block fuses information across multi-resolution branches, a mechanism is also needed to share information within a feature tensor, along both the spatial and the channel dimensions. The DAU serves this purpose: it recalibrates features using channel and spatial attention mechanisms.
The channel attention branch exploits the inter-channel relationships of the convolutional feature maps by applying squeeze and excitation operations.
The spatial attention branch is designed to exploit the inter-spatial dependencies of the convolutional features.

DAU with channel and spatial attention mechanisms
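
A rough NumPy sketch of the two attention branches is given below. It is a simplification made for illustration only: all function and weight names are hypothetical, the convolutions that precede the attention branches are omitted, and a scalar mix of the average and max maps stands in for the convolution in the spatial branch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style gating over channels. x: (H, W, C)."""
    squeezed = x.mean(axis=(0, 1))             # squeeze: one value per channel, (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)    # excitation, reduced to (r,)
    gate = sigmoid(hidden @ w2)                # (C,), values in (0, 1)
    return x * gate                            # rescale every channel

def spatial_attention(x, w_avg, w_max):
    """Gating over spatial positions. x: (H, W, C)."""
    avg_map = x.mean(axis=2)                   # (H, W) average across channels
    max_map = x.max(axis=2)                    # (H, W) maximum across channels
    # Simplification: a scalar mix of the two maps replaces the convolution.
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return x * gate[:, :, None]                # rescale every position

def dual_attention_unit(x, w1, w2, w_avg, w_max, w_fuse):
    """Run both branches, concatenate, mix back to C channels, add the input."""
    ca = channel_attention(x, w1, w2)
    sa = spatial_attention(x, w_avg, w_max)
    merged = np.concatenate([sa, ca], axis=2)  # (H, W, 2C)
    return x + merged @ w_fuse                 # 1x1 mixing plus skip connection

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 4, 6))
w1, w2 = rng.standard_normal((6, 2)), rng.standard_normal((2, 6))
y = dual_attention_unit(x, w1, w2, 0.5, 0.5, rng.standard_normal((12, 6)))
```

The output has the same shape as the input, so a unit like this can be placed anywhere inside a stream without changing the tensor dimensions.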

3. Residual Resizing Modules (RRMs)

To maintain the residual nature of the architecture, that is, the use of skip connections that let information flow directly from one layer to another while bypassing one or more intermediate layers, MIRNet uses residual resizing modules to perform the downsampling and upsampling operations.

RRMs to perform upsampling and downsampling
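
The idea of resizing on two branches and adding the results can be sketched as below. This is only an assumption-laden illustration: the helper names, the 2x2 average pooling, the nearest-neighbour upsampling, the ReLU, and the channel counts are choices made for the example and may differ from the resizing operators and layers used in the notebooks.

```python
import numpy as np

def avg_pool2(x):
    """Halve H and W with 2x2 average pooling. x: (H, W, C), H and W even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample2(x):
    """Double H and W by nearest-neighbour repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def residual_resize(x, resize, w_in, w_main, w_skip):
    """Resize on a main branch and a skip branch, then add the two.

    w_in:   (C_in, C_in)  channel mixing before the resize (hypothetical).
    w_main: (C_in, C_out) channel mixing on the main branch (hypothetical).
    w_skip: (C_in, C_out) channel mixing on the skip branch (hypothetical).
    """
    hidden = np.maximum(x @ w_in, 0.0)      # transformed features
    main = resize(hidden) @ w_main          # main branch
    skip = resize(x) @ w_skip               # skip branch keeps a direct path
    return main + skip

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 8, 4))
down = residual_resize(x, avg_pool2, rng.standard_normal((4, 4)),
                       rng.standard_normal((4, 8)), rng.standard_normal((4, 8)))
up = residual_resize(down, upsample2, rng.standard_normal((8, 8)),
                     rng.standard_normal((8, 4)), rng.standard_normal((8, 4)))
```

Here `down` has half the spatial size of `x` and `up` returns to the original size, which is what lets streams at different resolutions exchange features.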

Dataset

For image enhancement, the architecture is trained on the LOL (LOw-Light) dataset, which was created for the low-light image enhancement problem. It consists of 485 image pairs for training and 15 image pairs for testing. Each image pair in LOL consists of a low-light input image and its corresponding well-exposed reference image.
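
A small helper for collecting such image pairs might look like the sketch below. The function name `pair_images`, the `low` and `high` folder names, the `.png` suffix, and the assumption that both images of a pair share a file name are all illustrative; adapt them to the actual layout of the downloaded dataset.

```python
import tempfile
from pathlib import Path

def pair_images(low_dir, high_dir, suffix=".png"):
    """Return sorted (low, high) path pairs that share a file name."""
    low_dir, high_dir = Path(low_dir), Path(high_dir)
    pairs = []
    for low_path in sorted(low_dir.glob(f"*{suffix}")):
        high_path = high_dir / low_path.name
        if high_path.exists():               # skip inputs without a reference
            pairs.append((low_path, high_path))
    return pairs

# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    low, high = Path(root, "low"), Path(root, "high")
    low.mkdir()
    high.mkdir()
    for name in ("1.png", "2.png"):
        (low / name).touch()
        (high / name).touch()
    (low / "3.png").touch()                  # has no matching reference image
    demo_pairs = [(l.name, h.name) for l, h in pair_images(low, high)]
```

Matching by file name keeps every low-light input aligned with its reference image, and inputs without a reference are left out instead of being silently mispaired.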

Performance and Results

References

Project Mentors:

Project Mentees:

Dev Goti
K V Srinanda
Shubham Swadi
Sushree Shivani Sethi
D Jagannadha Reddy

