Official PyTorch implementation of the 3DV 2021 paper "SAFA: Structure Aware Face Animation".

Last update: Dec 23, 2022

Overview

SAFA: Structure Aware Face Animation (3DV2021)

Getting Started

git clone https://github.com/Qiulin-W/SAFA.git

Installation

Python 3.6 or higher is recommended.

1. Install PyTorch3D

Follow the guidance from: https://github.com/facebookresearch/pytorch3d/blob/master/INSTALL.md.

2. Install Other Dependencies

To install the other dependencies, run:

pip install -r requirements.txt

Usage

1. Preparation

a. Download the FLAME model (choose FLAME 2020), unzip it, and put generic_model.pkl under ./modules/data.

b. Download head_template.obj, landmark_embedding.npy, uv_face_eye_mask.png and uv_face_mask.png from DECA/data, and put them under ./modules/data.

c. Download the SAFA model checkpoint from Google Drive and put it under ./ckpt.

d. (Optional, required by the face swap demo) Download the pretrained face parser from face-parsing.PyTorch and put it under ./face_parsing/cp.
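Before running the demos, it can help to confirm that the files from steps a to c are in place. The following is a minimal sketch, not part of the repository: the function name missing_assets is hypothetical, it assumes all five FLAME/DECA files sit in a single data directory (default modules/data, adjustable via data_dir), and because the checkpoint file name is not given here it only checks that ./ckpt contains at least one file.

```python
from pathlib import Path

# File names taken from preparation steps a and b.
DATA_FILES = [
    "generic_model.pkl",
    "head_template.obj",
    "landmark_embedding.npy",
    "uv_face_eye_mask.png",
    "uv_face_mask.png",
]


def missing_assets(repo_root=".", data_dir="modules/data", ckpt_dir="ckpt"):
    """Return a list of required assets that are not present yet.

    data_dir and ckpt_dir are assumptions about the layout; change them
    if your checkout uses different directories.
    """
    root = Path(repo_root)
    missing = [
        f"{data_dir}/{name}"
        for name in DATA_FILES
        if not (root / data_dir / name).is_file()
    ]
    ckpt = root / ckpt_dir
    # The checkpoint file name is unknown, so accept any file in ckpt_dir.
    if not ckpt.is_dir() or not any(p.is_file() for p in ckpt.iterdir()):
        missing.append(f"{ckpt_dir}/<checkpoint>")
    return missing


if __name__ == "__main__":
    for item in missing_assets():
        print("missing:", item)
```

The optional face parser under ./face_parsing/cp is deliberately not checked, since only the face swap demo needs it.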

2. Demos

We provide demos for animation and face swap.

a. Animation demo

python animation_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --relative --adapt_scale --find_best_frame

b. Face swap demo

We adopt face-parsing.PyTorch to identify the face regions in both the source and driving images.

For preprocessed source images and driving videos, run:

python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video

For arbitrary images and videos, we use a face detector to detect and swap the corresponding face parts. Cropped images are resized to 256×256 to fit our model.

python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --use_detection
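For the detection path, one plausible way to turn a detector's bounding box into a square crop before resizing to 256×256 is sketched here. The function name square_crop_box, the scale parameter and the clamping policy are assumptions for illustration; the cropping that face_swap_demo.py actually performs may differ.

```python
def square_crop_box(bbox, img_w, img_h, scale=1.0):
    """Expand a face box (x0, y0, x1, y1) to a square that stays in the image.

    scale > 1 adds context around the face. The square is centred on the
    box where possible and shifted inwards near the image border.
    """
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Use the longer side so the whole face fits, capped by the image size.
    side = min(max(x1 - x0, y1 - y0) * scale, img_w, img_h)
    side = int(round(side))
    # Clamp the top-left corner so the square does not leave the image.
    left = min(max(int(round(cx - side / 2.0)), 0), img_w - side)
    top = min(max(int(round(cy - side / 2.0)), 0), img_h - side)
    return left, top, left + side, top + side


if __name__ == "__main__":
    print(square_crop_box((100, 50, 200, 250), img_w=640, img_h=480))
```

With Pillow, the resulting box could then be applied as img.crop(box).resize((256, 256)). A square crop avoids distorting the face when the image is scaled to the model's input size.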

Training

Our training framework is adapted from that of the First Order Motion Model. Instead of torch.nn.DataParallel (DP), we adopt torch.nn.parallel.DistributedDataParallel (DDP) for faster training and a more balanced GPU memory load. The training procedure is divided into two steps: (1) pre-training the 3DMM estimator, and (2) end-to-end training.
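With DDP, torch.distributed.launch starts one process per GPU and tells each process which GPU it owns. It passes a --local_rank argument to the script (spelled --local-rank in recent PyTorch releases) and sets environment variables such as RANK and WORLD_SIZE. The sketch below shows how a training script might read these values; read_launch_info is a hypothetical helper, and how run_ddp.py actually does this is not described here.

```python
import argparse
import os


def read_launch_info(argv=None, env=None):
    """Collect the per-process settings provided by torch.distributed.launch.

    argv and env default to the real command line and environment; they are
    parameters only so the function can be exercised without a launcher.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(add_help=False)
    # Accept both spellings used by different PyTorch versions.
    parser.add_argument("--local_rank", "--local-rank",
                        dest="local_rank", type=int, default=None)
    # Ignore the script's other options (e.g. --config).
    args, _unknown = parser.parse_known_args(argv)

    local_rank = args.local_rank
    if local_rank is None:
        # Fall back to the environment, or to 0 for a single-process run.
        local_rank = int(env.get("LOCAL_RANK", 0))
    return {
        "local_rank": local_rank,
        "rank": int(env.get("RANK", local_rank)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }


if __name__ == "__main__":
    print(read_launch_info())
```

A DDP script would typically use local_rank to select its CUDA device before wrapping the model, so that each process holds one replica on one GPU instead of a single process scattering batches across all GPUs as DP does.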

3DMM Estimator Pre-training

CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/pretrain.yaml

End-to-end Training

CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/end2end.yaml --tdmm_checkpoint path/to/tdmm_checkpoint_pth
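The two training commands differ only in the config file and the extra --tdmm_checkpoint option, and both must keep --nproc_per_node equal to the number of visible GPUs. A small helper that builds them for an arbitrary GPU list could look like the sketch below. build_ddp_command is a hypothetical name; the flags are exactly the ones shown in the commands, and nothing else about run_ddp.py is assumed.

```python
def build_ddp_command(gpu_ids, config, tdmm_checkpoint=None):
    """Return (env, argv) for a single-node DDP training run.

    env holds CUDA_VISIBLE_DEVICES; argv is the launcher command line.
    One process is started per listed GPU.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    argv = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(len(gpu_ids)),
        "run_ddp.py", "--config", config,
    ]
    if tdmm_checkpoint is not None:
        # Only the end-to-end stage takes the pre-trained 3DMM estimator.
        argv += ["--tdmm_checkpoint", tdmm_checkpoint]
    return env, argv


if __name__ == "__main__":
    env, argv = build_ddp_command([0, 1], "config/pretrain.yaml")
    print(env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
```

The result can be run with subprocess.run(argv, env={**os.environ, **env}, check=True), which keeps the process count and the visible device list in sync when training on fewer or more than four GPUs.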

Evaluation / Inference

Video Reconstruction

python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode reconstruction

Image Animation

python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode animation

3D Face Reconstruction

python tdmm_inference.py --data_dir directory/to/images --tdmm_checkpoint path/to/tdmm_checkpoint_pth

Dataset and Preprocessing

We use VoxCeleb1 to train and evaluate our model. The original YouTube videos are downloaded, cropped and split following the instructions from video-preprocessing.

a. To obtain the facial landmark metadata from the preprocessed videos, run:

python video_ldmk_meta.py --video_dir directory/to/preprocessed_videos --out_dir directory/to/output_meta_files

b. (Optional) Extract images from videos for 3DMM pretraining:

python extract_imgs.py

Citation

If you find our work useful to your research, please consider citing:

@article{wang2021safa,
  title={SAFA: Structure Aware Face Animation},
  author={Wang, Qiulin and Zhang, Lu and Li, Bo},
  journal={arXiv preprint arXiv:2111.04928},
  year={2021}
}

License

Please refer to the LICENSE file.

Acknowledgement

Here we provide the list of external sources that we use or adapt from:

Code is heavily borrowed from First Order Motion Model, LICENSE.
Some code is also borrowed from:
a. FLAME_PyTorch, LICENSE
b. generative-inpainting-pytorch, LICENSE
c. face-parsing.PyTorch, LICENSE
d. video-preprocessing
We adopt FLAME model resources from:
a. DECA, LICENSE
b. FLAME, LICENSE
External libraries:
a. PyTorch3D, LICENSE
b. face-alignment, LICENSE

Owner

QiulinW
