SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020, Oral)

Overview

SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020 Oral)

Python 3.7 pytorch 1.2.0 pyqt5 5.13.0

image Figure: Face image editing controlled via style images and segmentation masks with SEAN

We propose semantic region-adaptive normalization (SEAN), a simple but effective building block for Generative Adversarial Networks conditioned on segmentation masks that describe the semantic regions in the desired output image. Using SEAN normalization, we can build a network architecture that can control the style of each semantic region individually, e.g., we can specify one style reference image per region. SEAN is better suited to encode, transfer, and synthesize style than the best previous method in terms of reconstruction quality, variability, and visual quality. We evaluate SEAN on multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than the current state of the art. SEAN also pushes the frontier of interactive image editing. We can interactively edit images by changing segmentation masks or the style for any given region. We can also interpolate styles from two reference images per region.

SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka
Computer Vision and Pattern Recognition CVPR 2020, Oral

[Paper] [Project Page] [Demo]

Installation

Clone this repo.

git clone https://github.com/ZPdesu/SEAN.git
cd SEAN/

This code requires PyTorch, python 3+ and Pyqt5. Please install dependencies by

pip install -r requirements.txt

This model requires a lot of memory and time to train. To speed up the training, we recommend using 4 V100 GPUs

Dataset Preparation

This code uses CelebA-HQ and CelebAMask-HQ dataset. The prepared dataset can be directly downloaded here. After unzipping, put the entire CelebA-HQ folder in the datasets folder. The complete directory should look like ./datasets/CelebA-HQ/train/ and ./datasets/CelebA-HQ/test/.

Generating Images Using Pretrained Models

Once the dataset is prepared, the reconstruction results be got using pretrained models.

  1. Create ./checkpoints/ in the main folder and download the tar of the pretrained models from the Google Drive Folder. Save the tar in ./checkpoints/, then run

    cd checkpoints
    tar CelebA-HQ_pretrained.tar.gz
    cd ../
    
  2. Generate the reconstruction results using the pretrained model.

    python test.py --name CelebA-HQ_pretrained --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/test/labels --image_dir datasets/CelebA-HQ/test/images --label_nc 19 --no_instance --gpu_ids 0
  3. The reconstruction images are saved at ./results/CelebA-HQ_pretrained/ and the corresponding style codes are stored at ./styles_test/style_codes/.

  4. Pre-calculate the mean style codes for the UI mode. The mean style codes can be found at ./styles_test/mean_style_code/.

    python calculate_mean_style_code.py

Training New Models

To train the new model, you need to specify the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, and --no_instance to denote the dataset doesn't have instance maps.

python train.py --name [experiment_name] --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/train/labels --image_dir datasets/CelebA-HQ/train/images --label_nc 19 --no_instance --batchSize 32 --gpu_ids 0,1,2,3

If you only have single GPU with small memory, please use --batchSize 2 --gpu_ids 0.

UI Introduction

We provide a convenient UI for the users to do some extension works. To run the UI mode, you need to:

  1. run the step Generating Images Using Pretrained Models to save the style codes of the test images and the mean style codes. Or you can directly download the style codes from here. (Note: if you directly use the downloaded style codes, you have to use the pretrained model.

  2. Put the visualization images of the labels used for generating in ./imgs/colormaps/ and the style images in ./imgs/style_imgs_test/. Some example images are provided in these 2 folders. Note: the visualization image and the style image should be picked from ./datasets/CelebAMask-HQ/test/vis/ and ./datasets/CelebAMask-HQ/test/labels/, because only the style codes of the test images are saved in ./styles_test/style_codes/. If you want to use your own images, please prepare the images, labels and visualization of the labels in ./datasets/CelebAMask-HQ/test/ with the same format, and calculate the corresponding style codes.

  3. Run the UI mode

    python run_UI.py --name CelebA-HQ_pretrained --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/test/labels --image_dir datasets/CelebA-HQ/test/images --label_nc 19 --no_instance --gpu_ids 0
  4. How to use the UI. Please check the detail usage of the UI from our Video.

    image

Other Datasets

Will be released soon.

License

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International) The code is released for academic research use only.

Citation

If you use this code for your research, please cite our papers.

@InProceedings{Zhu_2020_CVPR,
author = {Zhu, Peihao and Abdal, Rameen and Qin, Yipeng and Wonka, Peter},
title = {SEAN: Image Synthesis With Semantic Region-Adaptive Normalization},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

Acknowledgments

We thank Wamiq Reyaz Para for helpful comments. This code borrows heavily from SPADE. We thank Taesung Park for sharing his codes. This work was supported by the KAUST Office of Sponsored Research (OSR) under AwardNo. OSR-CRG2018-3730.

Owner
Peihao Zhu
CS PhD at KAUST
Peihao Zhu
A library for optimization on Riemannian manifolds

TensorFlow RiemOpt A library for manifold-constrained optimization in TensorFlow. Installation To install the latest development version from GitHub:

Oleg Smirnov 83 Dec 27, 2022
A PyTorch implementation of the paper "Semantic Image Synthesis via Adversarial Learning" in ICCV 2017

Semantic Image Synthesis via Adversarial Learning This is a PyTorch implementation of the paper Semantic Image Synthesis via Adversarial Learning. Req

Seonghyeon Nam 146 Nov 25, 2022
This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object Tracking with TRansformer.

MOTR: End-to-End Multiple-Object Tracking with TRansformer This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object

348 Jan 07, 2023
GPU Accelerated Non-rigid ICP for surface registration

GPU Accelerated Non-rigid ICP for surface registration Introduction Preivous Non-rigid ICP algorithm is usually implemented on CPU, and needs to solve

Haozhe Wu 144 Jan 04, 2023
Official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".

One-Shot Voice Conversion with Weight Adaptive Instance Normalization By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain. This rep

31 Dec 07, 2022
social humanoid robots with GPGPU and IoT

Social humanoid robots with GPGPU and IoT Social humanoid robots with GPGPU and IoT Paper Authors Mohsen Jafarzadeh, Stephen Brooks, Shimeng Yu, Balak

0 Jan 07, 2022
AITUS - An atomatic notr maker for CYTUS

AITUS an automatic note maker for CYTUS. 利用AI根据指定乐曲生成CYTUS游戏谱面。 效果展示:https://www

GradiusTwinbee 6 Feb 24, 2022
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

Microsoft 29 Dec 29, 2022
On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification

Understanding Bayesian Classification This repository hosts the code to reproduce the results presented in the paper On Uncertainty, Tempering, and Da

Sanyam Kapoor 18 Nov 17, 2022
Real-time pose estimation accelerated with NVIDIA TensorRT

trt_pose Want to detect hand poses? Check out the new trt_pose_hand project for real-time hand pose and gesture recognition! trt_pose is aimed at enab

NVIDIA AI IOT 803 Jan 06, 2023
The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Easy-to-use toolkit for retrieval-based Chatbot Recent Activity Our released RRS corpus can be found here. Our released BERT-FP post-training checkpoi

GMFTBY 32 Nov 13, 2022
Pytorch implementation of our method for regularizing nerual radiance fields for few-shot neural volume rendering.

InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering Pytorch implementation of our method for regularizing nerual radiance fields f

106 Jan 06, 2023
This repo is duplication of jwyang/faster-rcnn.pytorch

Faster RCNN Pytorch This repo is duplication of jwyang/faster-rcnn.pytorch C/C++ code are removed and easier to study. Python 3.8.5 Ubuntu 20.04.1 LTS

Kim Jihwan 1 Jan 14, 2022
Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

This is a Pytorch implementation of Janai, J., Güney, F., Ranjan, A., Black, M. and Geiger, A., Unsupervised Learning of Multi-Frame Optical Flow with

Anurag Ranjan 110 Nov 02, 2022
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

Jihye Back 520 Jan 04, 2023
The official PyTorch implementation for NCSNv2 (NeurIPS 2020)

Improved Techniques for Training Score-Based Generative Models This repo contains the official implementation for the paper Improved Techniques for Tr

174 Dec 26, 2022
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt

Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt. This is done by

Mehdi Cherti 135 Dec 30, 2022
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022
Dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
implicit displacement field

Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields [project page][paper][cite] Geometry-Consistent Neural Shape Represe

Yifan Wang 100 Dec 19, 2022