A denoising autoencoder + adversarial losses and attention mechanisms for face swapping.

Overview

faceswap-GAN

Adding Adversarial loss and perceptual loss (VGGface) to deepfakes'(reddit user) auto-encoder architecture.

Updates

Date    Update
2018-08-27     Colab support: A colab notebook for faceswap-GAN v2.2 is provided.
2018-07-25     Data preparation: Add a new notebook for video pre-processing in which MTCNN is used for face detection as well as face alignment.
2018-06-29     Model architecture: faceswap-GAN v2.2 now supports different output resolutions: 64x64, 128x128, and 256x256. Default RESOLUTION = 64 can be changed in the config cell of v2.2 notebook.
2018-06-25     New version: faceswap-GAN v2.2 has been released. The main improvements of v2.2 model are its capability of generating realistic and consistent eye movements (results are shown below, or Ctrl+F for eyes), as well as higher video quality with face alignment.
2018-06-06     Model architecture: Add a self-attention mechanism proposed in SAGAN into V2 GAN model. (Note: There is still no official code release for SAGAN, the implementation in this repo. could be wrong. We'll keep an eye on it.)

Google Colab support

Here is a playground notebook for faceswap-GAN v2.2 on Google Colab. Users can train their own model in the browser.

[Update 2019/10/04] There seems to be import errors in the latest Colab environment due to inconsistent version of packages. Please make sure that the Keras and TensorFlow follow the version number shown in the requirement section below.

Descriptions

faceswap-GAN v2.2

  • FaceSwap_GAN_v2.2_train_test.ipynb

    • Notebook for model training of faceswap-GAN model version 2.2.
    • This notebook also provides code for still image transformation at the bottom.
    • Require additional training images generated through prep_binary_masks.ipynb.
  • FaceSwap_GAN_v2.2_video_conversion.ipynb

    • Notebook for video conversion of faceswap-GAN model version 2.2.
    • Face alignment using 5-points landmarks is introduced to video conversion.
  • prep_binary_masks.ipynb

    • Notebook for training data preprocessing. Output binary masks are save in ./binary_masks/faceA_eyes and ./binary_masks/faceB_eyes folders.
    • Require face_alignment package. (An alternative method for generating binary masks (not requiring face_alignment and dlib packages) can be found in MTCNN_video_face_detection_alignment.ipynb.)
  • MTCNN_video_face_detection_alignment.ipynb

    • This notebook performs face detection/alignment on the input video.
    • Detected faces are saved in ./faces/raw_faces and ./faces/aligned_faces for non-aligned/aligned results respectively.
    • Crude eyes binary masks are also generated and saved in ./faces/binary_masks_eyes. These binary masks can serve as a suboptimal alternative to masks generated through prep_binary_masks.ipynb.

Usage

  1. Run MTCNN_video_face_detection_alignment.ipynb to extract faces from videos. Manually move/rename the aligned face images into ./faceA/ or ./faceB/ folders.
  2. Run prep_binary_masks.ipynb to generate binary masks of training images.
    • You can skip this pre-processing step by (1) setting use_bm_eyes=False in the config cell of the train_test notebook, or (2) use low-quality binary masks generated in step 1.
  3. Run FaceSwap_GAN_v2.2_train_test.ipynb to train models.
  4. Run FaceSwap_GAN_v2.2_video_conversion.ipynb to create videos using the trained models in step 3.

Miscellaneous

Training data format

  • Face images are supposed to be in ./faceA/ or ./faceB/ folder for each taeget respectively.
  • Images will be resized to 256x256 during training.

Generative adversarial networks for face swapping

1. Architecture

enc_arch3d

dec_arch3d

dis_arch3d

2. Results

  • Improved output quality: Adversarial loss improves reconstruction quality of generated images. trump_cage

  • Additional results: This image shows 160 random results generated by v2 GAN with self-attention mechanism (image format: source -> mask -> transformed).

  • Evaluations: Evaluations of the output quality on Trump/Cage dataset can be found here.

The Trump/Cage images are obtained from the reddit user deepfakes' project on pastebin.com.

3. Features

  • VGGFace perceptual loss: Perceptual loss improves direction of eyeballs to be more realistic and consistent with input face. It also smoothes out artifacts in the segmentation mask, resulting higher output quality.

  • Attention mask: Model predicts an attention mask that helps on handling occlusion, eliminating artifacts, and producing natrual skin tone.

  • Configurable input/output resolution (v2.2): The model supports 64x64, 128x128, and 256x256 outupt resolutions.

  • Face tracking/alignment using MTCNN and Kalman filter in video conversion:

    • MTCNN is introduced for more stable detections and reliable face alignment (FA).
    • Kalman filter smoothen the bounding box positions over frames and eliminate jitter on the swapped face. comp_FA
  • Eyes-aware training: Introduce high reconstruction loss and edge loss in eyes area, which guides the model to generate realistic eyes.

Frequently asked questions and troubleshooting

1. How does it work?

  • The following illustration shows a very high-level and abstract (but not exactly the same) flowchart of the denoising autoencoder algorithm. The objective functions look like this. flow_chart

2. Previews look good, but it does not transform to the output videos?

  • Model performs its full potential when the input images are preprocessed with face alignment methods.
    • readme_note001

Requirements

Acknowledgments

Code borrows from tjwei, eriklindernoren, fchollet, keras-contrib and reddit user deepfakes' project. The generative network is adopted from CycleGAN. Weights and scripts of MTCNN are from FaceNet. Illustrations are from irasutoya.

[ICCV 2021] Deep Hough Voting for Robust Global Registration

Deep Hough Voting for Robust Global Registration, ICCV, 2021 Project Page | Paper | Video Deep Hough Voting for Robust Global Registration Junha Lee1,

Junha Lee 10 Dec 02, 2022
Crowd-sourced Annotation of Human Motion.

Motion Annotation Tool Live: https://motion-annotation.humanoids.kit.edu Paper: The KIT Motion-Language Dataset Installation Start by installing all P

Matthias Plappert 4 May 25, 2020
LegoDNN: a block-grained scaling tool for mobile vision systems

Table of contents 1 Introduction 1.1 Major features 1.2 Architecture 2 Code and Installation 2.1 Code 2.2 Installation 3 Repository of DNNs in vision

41 Dec 24, 2022
Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Decentralized Reinforcement Learning This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Ec

40 Oct 30, 2022
BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins

BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins Deep learning has brought most profound contributio

Narinder Singh Punn 12 Dec 04, 2022
Train Scene Graph Generation for Visual Genome and GQA in PyTorch >= 1.2 with improved zero and few-shot generalization.

Scene Graph Generation Object Detections Ground truth Scene Graph Generated Scene Graph In this visualization, woman sitting on rock is a zero-shot tr

Boris Knyazev 93 Dec 28, 2022
Survival analysis in Python

What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu

Cameron Davidson-Pilon 2k Jan 08, 2023
An implementation of Deep Graph Infomax (DGI) in PyTorch

DGI Deep Graph Infomax (Veličković et al., ICLR 2019): https://arxiv.org/abs/1809.10341 Overview Here we provide an implementation of Deep Graph Infom

Petar Veličković 491 Jan 03, 2023
Tensorflow-Project-Template - A best practice for tensorflow project template architecture.

Tensorflow Project Template A simple and well designed structure is essential for any Deep Learning project, so after a lot of practice and contributi

Mahmoud G. Salem 3.6k Dec 22, 2022
Connecting Java/ImgLib2 + Python/NumPy

imglyb imglyb aims at connecting two worlds that have been seperated for too long: Python with numpy Java with ImgLib2 imglyb uses jpype to access num

ImgLib2 29 Dec 21, 2022
Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

J K Terry 32 Nov 09, 2021
Implementation of 'lightweight' GAN, proposed in ICLR 2021, in Pytorch. High resolution image generations that can be trained within a day or two

512x512 flowers after 12 hours of training, 1 gpu 256x256 flowers after 12 hours of training, 1 gpu Pizza 'Lightweight' GAN Implementation of 'lightwe

Phil Wang 1.5k Jan 02, 2023
Adversarial Learning for Modeling Human Motion

Adversarial Learning for Modeling Human Motion This repository contains the open source code which reproduces the results for the paper: Adversarial l

wangqi 6 Jun 15, 2021
Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow.

Denoised-Smoothing-TF Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow. Denoised Smoothing is

Sayak Paul 19 Dec 11, 2022
This project uses Template Matching technique for object detecting by detection of template image over base image.

Object Detection Project Using OpenCV This project uses Template Matching technique for object detecting by detection the template image over base ima

Pratham Bhatnagar 7 May 29, 2022
Hypersearch weight debugging and losses tutorial

tutorial Activate tensorboard option Running TensorBoard remotely When working on a remote server, you can use SSH tunneling to forward the port of th

1 Dec 11, 2021
A Pythonic library for Nvidia Codec.

A Pythonic library for Nvidia Codec. The project is still in active development; expect breaking changes. Why another Python library for Nvidia Codec?

Zesen Qian 12 Dec 27, 2022
Code for "Learning to Regrasp by Learning to Place"

Learning2Regrasp Learning to Regrasp by Learning to Place, CoRL 2021. Introduction We propose a point-cloud-based system for robots to predict a seque

Shuo Cheng (成硕) 18 Aug 27, 2022
ANN model for prediction a spatio-temporal distribution of supercooled liquid in mixed-phase clouds using Doppler cloud radar spectra.

VOODOO Revealing supercooled liquid beyond lidar attenuation Explore the docs » Report Bug · Request Feature Table of Contents About The Project Built

remsens-lim 2 Apr 28, 2022
For the paper entitled ''A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining''

Summary This is the source code for the paper "A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining", which was accepted as fu

1 Nov 10, 2021