Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

Last update: Jan 07, 2023

The results below were obtained by our SCUNet trained on purely synthetic data! We did not use the paired noisy/clean data from DND and SIDD during training!

Swin-Conv-UNet (SCUNet) denoising network

The architecture of the proposed Swin-Conv-UNet (SCUNet) denoising network. SCUNet uses the swin-conv (SC) block as the main building block of a UNet backbone. In each SC block, the input is first passed through a 1×1 convolution and then split evenly into two groups of feature maps. One group is fed into a swin transformer (SwinT) block and the other into a residual 3×3 convolutional (RConv) block. The outputs of the SwinT block and the RConv block are then concatenated and passed through another 1×1 convolution to produce the residual of the input. “SConv” and “TConv” denote a 2×2 strided convolution with stride 2 and a 2×2 transposed convolution with stride 2, respectively.
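To make the data flow in the description above concrete, the following sketch traces tensor shapes only (no learned weights) through one SC block and through one SConv/TConv pair. The function names, the 64-channel width and the 128×128 input are illustrative assumptions rather than values taken from the SCUNet code. The sketch also assumes that the 1×1 convolutions and both branches preserve the shape of their input. The size formulas are the standard ones for a convolution and a transposed convolution without output padding.

```python
def conv_out_size(size, kernel, stride, padding=0):
    """Spatial output size of a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def tconv_out_size(size, kernel, stride, padding=0):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def sc_block_shapes(channels, height, width):
    """Trace (C, H, W) through one swin-conv block as described in the text."""
    if channels % 2:
        raise ValueError("channels must be even to split evenly")
    half = channels // 2
    full = (channels, height, width)
    branch = (half, height, width)
    return [
        ("input", full),
        ("1x1 conv", full),                 # assumed to keep the channel count
        ("SwinT branch", branch),           # first half of the split
        ("RConv branch", branch),           # second half of the split
        ("concat", (2 * half, height, width)),
        ("1x1 conv (residual)", full),
        ("input + residual", full),         # block output
    ]


for name, shape in sc_block_shapes(64, 128, 128):
    print(f"{name:22s} {shape}")

# SConv halves the spatial size, TConv doubles it again.
down = conv_out_size(128, kernel=2, stride=2)
up = tconv_out_size(down, kernel=2, stride=2)
print("SConv:", 128, "->", down, "| TConv:", down, "->", up)
```

Because kernel size and stride are both 2, the strided convolution tiles the input with non-overlapping windows, so every downsampling step halves the resolution exactly and the matching transposed convolution restores it.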

New data synthesis pipeline for real image denoising

Schematic illustration of the proposed pipeline for synthesizing paired training patches. For a high-quality image, a randomly shuffled degradation sequence is applied to produce a noisy image. Meanwhile, resizing and reverse-forward tone mapping are applied to produce the corresponding clean image. Paired noisy/clean training patches are then cropped for training a deep blind denoising model. Note that, since Poisson noise is signal-dependent, the dashed arrow for “Poisson” means that the clean image is used to generate the Poisson noise. To tackle the color shift issue, the dashed arrow for “Camera Sensor” means that the reverse-forward tone mapping is also performed on the clean image.
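The idea of a randomly shuffled degradation sequence, with Poisson noise driven by the clean image, can be sketched as follows. This is a toy illustration on flat lists of pixel values in [0, 1], not the authors' pipeline. The three degradations, their parameters (sigma, peak, levels) and every function name are assumptions made for the example, and resizing and reverse-forward tone mapping are left out entirely.

```python
import math
import random


def clip01(x):
    return min(1.0, max(0.0, x))


def add_gaussian(noisy, clean, rng, sigma=0.05):
    """Signal-independent noise: uses only the current noisy image."""
    return [clip01(v + rng.gauss(0.0, sigma)) for v in noisy]


def sample_poisson(lam, rng):
    """Knuth's algorithm; adequate for the small rates used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def add_poisson(noisy, clean, rng, peak=30.0):
    """Signal-dependent noise: the Poisson rate is taken from the CLEAN
    image, and the resulting fluctuation is added to the noisy image."""
    out = []
    for n, c in zip(noisy, clean):
        fluctuation = sample_poisson(c * peak, rng) / peak - c
        out.append(clip01(n + fluctuation))
    return out


def quantize(noisy, clean, rng, levels=32):
    """Hypothetical stand-in for a lossy step such as compression."""
    return [round(v * (levels - 1)) / (levels - 1) for v in noisy]


def synthesize_pair(clean, seed=0):
    """Apply the degradations in a random order; return (noisy, clean, order)."""
    rng = random.Random(seed)
    steps = [add_gaussian, add_poisson, quantize]
    rng.shuffle(steps)
    noisy = list(clean)
    for step in steps:
        noisy = step(noisy, clean, rng)
    return noisy, list(clean), [s.__name__ for s in steps]


noisy, clean, order = synthesize_pair([0.1, 0.5, 0.9] * 4, seed=1)
print("degradation order:", order)
```

Passing the clean image to every step, even those that ignore it, keeps the steps interchangeable so that the sequence can be shuffled freely.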

Synthesized noisy/clean patch pairs produced by our proposed training data synthesis pipeline. The high-quality image patches are 544×544, and the noisy/clean training patches are 128×128.
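For a noisy/clean pair to be usable for training, both patches must be cropped from the same location. A minimal sketch of such a paired crop is given below, assuming the two images are equally sized and stored as lists of rows. The helper name and the 150×160 toy image are hypothetical, and the resizing between the 544×544 source patch and the final 128×128 patches is not modeled.

```python
import random


def paired_random_crop(noisy, clean, patch=128, rng=None):
    """Crop the same patch x patch window from two equally sized 2-D images."""
    rng = rng or random.Random()
    height, width = len(clean), len(clean[0])
    if (len(noisy), len(noisy[0])) != (height, width):
        raise ValueError("noisy and clean images must have the same size")
    if height < patch or width < patch:
        raise ValueError("image is smaller than the requested patch")

    # One shared offset keeps the two patches pixel-aligned.
    top = rng.randint(0, height - patch)
    left = rng.randint(0, width - patch)

    def crop(img):
        return [row[left:left + patch] for row in img[top:top + patch]]

    return crop(noisy), crop(clean), (top, left)


# Toy images whose pixel values record their own coordinates.
clean_img = [[(r, c) for c in range(160)] for r in range(150)]
noisy_img = [[(r, c, "noisy") for c in range(160)] for r in range(150)]
noisy_patch, clean_patch, offset = paired_random_crop(
    noisy_img, clean_img, patch=128, rng=random.Random(0)
)
print("crop offset:", offset, "patch size:", len(clean_patch), "x", len(clean_patch[0]))
```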

Web Demo

A Replicate web demo is available for the SCUNet models.

Codes

Download SCUNet models

python main_download_pretrained_models.py --models "SCUNet" --model_dir "model_zoo"

Gaussian denoising

grayscale images

python main_test_scunet_gray_gaussian.py --model_name scunet_gray_25 --noise_level_img 25 --testset_name set12

color images

python main_test_scunet_color_gaussian.py --model_name scunet_color_25 --noise_level_img 25 --testset_name bsd68
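In the Gaussian denoising commands above, a value such as --noise_level_img 25 is conventionally read as the standard deviation of additive white Gaussian noise on the 0-255 intensity scale. The sketch below shows that convention on a toy list of pixel values in [0, 1]. It is an assumption about how the noisy test inputs are synthesized, not an excerpt from the test scripts, and the helper name is hypothetical.

```python
import random
import statistics


def add_awgn(pixels, noise_level=25, seed=0):
    """Add zero-mean Gaussian noise with std noise_level / 255 to [0, 1] pixels."""
    rng = random.Random(seed)
    sigma = noise_level / 255.0
    return [p + rng.gauss(0.0, sigma) for p in pixels]


clean = [0.5] * 20000
noisy = add_awgn(clean, noise_level=25)
residual = [n - c for n, c in zip(noisy, clean)]

# The empirical noise std, rescaled to 0-255, should be close to 25.
print(round(statistics.pstdev(residual) * 255, 1))
```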

Blind real image denoising

python main_test_scunet_real_application.py --model_name scunet_color_real_psnr --testset_name real3

Results on Gaussian denoising

Results on real image denoising

@article{zhang2022practical,
title={Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis},
author={Zhang, Kai and Li, Yawei and Liang, Jingyun and Cao, Jiezhang and Zhang, Yulun and Tang, Hao and Timofte, Radu and Van Gool, Luc},
journal={arXiv preprint},
year={2022}
}

Owner

Kai Zhang
