The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)

Last update: Nov 10, 2022

Overview

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis (NeurIPS 2021)

Project Page | Paper

Xudong Xu, Xingang Pan, Dahua Lin and Bo Dai

GOF can synthesize high-quality images with high 3D consistency and simultaneously learn compact and smooth object surfaces.

Requirements

Python 3.8 is used. Basic requirements are listed in requirements.txt:

pip install -r requirements.txt

Training

We provide several bash files for the BFM, CelebA, and Cats datasets in auto_bash for reference. The hyperparameters adopted in our paper are listed in the curriculums.py file.

If you want to train on your own dataset, set the hyperparameters carefully, especially those related to the camera pose distribution. As with the settings in the curriculums.py file, you can use a camera pose predictor to obtain rough values for 'h_stddev' and 'v_stddev', then tune them according to the resulting performance. You should also add a dataset class to dataset.py and modify the reference bash file to fit your dataset.
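As a rough illustration of the pose-statistics step above, the sketch below estimates starting values for 'h_stddev' and 'v_stddev' from per-image yaw and pitch angles. It assumes an external camera pose predictor has already produced these angles in radians. The helper name `estimate_pose_stddev` and the input format are hypothetical and not part of this codebase, and the unit expected by the curriculum should be checked against curriculums.py.

```python
import statistics


def estimate_pose_stddev(yaws, pitches):
    """Return (h_stddev, v_stddev) as population std devs of yaw and pitch.

    Hypothetical helper: `yaws` and `pitches` are per-image angles (assumed
    to be in radians) produced by any external camera pose predictor.
    """
    if len(yaws) != len(pitches) or len(yaws) == 0:
        raise ValueError("need equally sized, non-empty angle lists")
    h_stddev = statistics.pstdev(yaws)     # horizontal (yaw) spread
    v_stddev = statistics.pstdev(pitches)  # vertical (pitch) spread
    return h_stddev, v_stddev


# Toy angles purely for illustration; replace with real predictor output.
yaws = [-0.3, -0.1, 0.0, 0.1, 0.3]
pitches = [-0.1, 0.0, 0.0, 0.0, 0.1]
h_stddev, v_stddev = estimate_pose_stddev(yaws, pitches)
print(f"h_stddev={h_stddev:.3f}, v_stddev={v_stddev:.3f}")
```

The resulting values are only a starting point; as noted above, they still need tuning against the observed training performance.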

Evaluation

Evaluation Metrics

To calculate FID/IS/KID scores, please run

python eval_metrics.py path/to/generator.pth --real_image_dir path/to/real_images --curriculum CURRICULUM

To calculate the weighted variance proposed in the paper, please run

python cal_weighted_var.py path/to/generator.pth --curriculum CURRICULUM

Render Multi-view Images

python render_multiview_images.py path/to/generator.pth --curriculum CURRICULUM --seeds_start 0 --seeds_end 100

Render Videos

python render_video.py path/to/generator.pth --curriculum CURRICULUM --seed 0

After running, you will obtain a series of images in a specific folder. You can then convert them into a video with ffmpeg:

ffmpeg -r 15 -f image2 -i xxx.png -c:v libx264 -crf 25 -pix_fmt yuv420p xxx.mp4
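The command above uses xxx.png and xxx.mp4 as placeholders. The sketch below builds the same argument list from Python so that the input pattern and output path can be set explicitly. It assumes the rendered frames follow a zero-padded sequential pattern such as `frames/%04d.png`; that pattern, the `out.mp4` name, and the helper `build_ffmpeg_cmd` are illustrative assumptions, so check the actual filenames written by render_video.py before using it.

```python
import shlex


def build_ffmpeg_cmd(frame_pattern, output_path, fps=15, crf=25):
    """Build the ffmpeg argument list with the same flags as the command above.

    Hypothetical helper: `frame_pattern` is an image2-style pattern such as
    "frames/%04d.png" (assumed naming, not confirmed by this codebase).
    """
    return [
        "ffmpeg",
        "-r", str(fps),        # input frame rate
        "-f", "image2",        # read a numbered image sequence
        "-i", frame_pattern,
        "-c:v", "libx264",     # H.264 encoding
        "-crf", str(crf),      # quality factor (lower means higher quality)
        "-pix_fmt", "yuv420p", # widely compatible pixel format
        output_path,
    ]


cmd = build_ffmpeg_cmd("frames/%04d.png", "out.mp4")
print(shlex.join(cmd))
# To actually encode, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list avoids shell quoting problems when paths contain spaces.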

Similarly, you can render videos that interpolate between given latent codes/seeds with:

python render_video_interpolation.py path/to/generator.pth --curriculum CURRICULUM --seeds 0 1 2 3

Extract 3D Shapes

You should first generate a voxel npy file by running:

python extract_shapes.py path/to/generator.pth --curriculum CURRICULUM --seed 0

Then render it into the corresponding multi-view images with the render_meshimg.py script.

Pretrained Models

We provide pretrained models for BFM, CelebA, and Cats. Please refer to this link.

As mentioned in the supplementary material, the training of all models starts from an early pretrained model (about 2K iterations) with the correct outward-facing faces. We also provide the early pretrained models for the three datasets in this link. If you want to start from them, replace the 'load_dir' value in the bash files in auto_bash with the path to the corresponding pretrained model. Since the optimizer parameters are not provided here, you may need to comment out L138~139.

Citation

If you find this codebase useful for your research, please cite:

@inproceedings{xu2021generative,
  title={Generative Occupancy Fields for 3D Surface-Aware Image Synthesis},
  author={Xu, Xudong and Pan, Xingang and Lin, Dahua and Dai, Bo},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2021}
}

Acknowledgement

The structure of this codebase is borrowed from pi-GAN.

