DeepVoxels is an object-specific, persistent 3D feature embedding.

Overview

DeepVoxels

DeepVoxels is an object-specific, persistent 3D feature embedding. It is found by globally optimizing over all available 2D observations of an object in a deeplearning framework. At test time, the training set can be discarded, and DeepVoxels can be used to render novel views of the same object.

deepvoxels_video

Usage

Installation

This code was developed in python 3.7 and pytorch 1.0. I recommend to use anaconda for dependency management. You can create an environment with name "deepvoxels" with all dependencies like so:

conda env create -f environment.yml

High-Level structure

The code is organized as follows:

  • dataio.py loads training and testing data.
  • data_util.py and util.py contain utility functions.
  • run_deepvoxels.py contains the training and testing code as well as setting up the dataset, dataloading, command line arguments etc.
  • deep_voxels.py contains the core DeepVoxels model.
  • custom_layers.py contains implementations of the integration and occlusion submodules.
  • projection.py contains utility functions for 3D and projective geometry.

Data

The datasets have been rendered from a set of high-quality 3D scans of a variety of objects. The datasets are available for download here. Each object has its own directory, which is the directory that the "data_root" command-line argument of the run_deepvoxels.py script is pointed to.

Coordinate and camera parameter conventions

This code uses an "OpenCV" style camera coordinate system, where the Y-axis points downwards (the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in a "camera2world" format, i.e., they denote the matrix transform that transforms camera coordinates to world coordinates.

The code also reads an "intrinsics.txt" file from the dataset directory. This file is expected to be structured as follows:

f cx cy
origin_x origin_y origin_z
near_plane (if 0, defaults to sqrt(3)/2)
scale
img_height img_width

The focal length, cx and cy are in pixels. (origin_x, origin_y, origin_z) denotes the origin of the voxel grid in world coordinates. The near plane is also expressed in world units. Per default, each voxel has a sidelength of 1 in world units - the scale is a factor that scales the sidelength of each voxel. Finally, height and width are the resolution of the image.

To create your own dataset, I recommend using the amazing, open-source Colmap. Follow the instructions on the website to install it. I have written a little wrapper in python that will automatically reconstruct a directory of images, and then extract the camera extrinsic & intrinsic camera parameters. It can be used like so:

python colmap_wrapper.py --img_dir [path to directory with images] \
                         --trgt_dir [path where output will be written to] 

To get the scale and origin of the voxel grid as well as the near plane, one has to inspect the reconstructed point cloud and manually edit the intrinsics.txt file written out by colmap_wrapper.py.

Training

  • See python run_deepvoxels.py --help for all train options. Example train call:
python run_deepvoxels.py --train_test train \
                         --data_root [path to directory with dataset] \
                         --logging_root [path to directory where tensorboard summaries and checkpoints should be written to] 

To monitor progress, the training code writes tensorboard summaries every 100 steps into a "runs" subdirectory in the logging_root.

Testing

Example test call:

python run_deepvoxels.py --train_test test \
                         --data_root [path to directory with dataset] ]
                         --logging_root [path to directoy where test output should be written to] \
                         --checkpoint [path to checkpoint]

Misc

Citation:

If you find our work useful in your research, please consider citing:

@inproceedings{sitzmann2019deepvoxels,
	author = {Sitzmann, Vincent 
	          and Thies, Justus 
	          and Heide, Felix 
	          and Nie{\ss}ner, Matthias 
	          and Wetzstein, Gordon 
	          and Zollh{\"o}fer, Michael},
	title = {DeepVoxels: Learning Persistent 3D Feature Embeddings},
	booktitle = {Proc. CVPR},
	year={2019}
}

Follow-up work

Check out our new project, Scene Representation Networks, where we replace the voxel grid with a continuous function that naturally generalizes across scenes and smoothly parameterizes scene surfaces!

Submodule "pytorch_prototyping"

The code in the subdirectory "pytorch_prototyping" comes from a little library of custom pytorch modules that I use throughout my research projects. You can find it here.

Other cool projects

Some of the code in this project is based on code from these two very cool papers:

Check them out!

Contact:

If you have any questions, please email Vincent Sitzmann at [email protected].

Owner
Vincent Sitzmann
Incoming Assistant Professor @mit EECS. I'm researching neural scene representations - the way neural networks learn to represent information on our world.
Vincent Sitzmann
clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

README clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation CVPR 2021 Authors: Suprosanna Shit and Johannes C. Paetzo

110 Dec 29, 2022
Semantic similarity computation with different state-of-the-art metrics

Semantic similarity computation with different state-of-the-art metrics Description • Installation • Usage • License Description TaxoSS is a semantic

6 Jun 22, 2022
Implementation of Shape and Electrostatic similarity metric in deepFMPO.

DeepFMPO v3D Code accompanying the paper "On the value of using 3D-shape and electrostatic similarities in deep generative methods". The paper can be

34 Nov 28, 2022
Efficient 6-DoF Grasp Generation in Cluttered Scenes

Contact-GraspNet Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes Martin Sundermeyer, Arsalan Mousavian, Rudolph Triebel, Dieter

NVIDIA Research Projects 148 Dec 28, 2022
Lightweight plotting to the terminal. 4x resolution via Unicode.

Uniplot Lightweight plotting to the terminal. 4x resolution via Unicode. When working with production data science code it can be handy to have plotti

Olav Stetter 203 Dec 29, 2022
A Fast Monotone Rotating Shallow Water model

pyRSW A Fast Monotone Rotating Shallow Water model How fast? As fast as a sustained 2 Gflop/s per core on a 2.5 GHz cpu (or 2048 Gflop/s with 1024 cor

Guillaume Roullet 13 Sep 28, 2022
[CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy Codes for this paper: [CVPR 2022] The Pr

VITA 16 Nov 26, 2022
Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project

Semantic Code Search Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project. The model

Chen Wu 24 Nov 29, 2022
JittorVis - Visual understanding of deep learning models

JittorVis: Visual understanding of deep learning model JittorVis is an open-source library for understanding the inner workings of Jittor models by vi

thu-vis 182 Jan 06, 2023
[NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Animashree Anandkumar, and Zhangyang Wang.

AugMax: Adversarial Composition of Random Augmentations for Robust Training Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, an

VITA 112 Nov 07, 2022
Source code for CVPR2022 paper "Abandoning the Bayer-Filter to See in the Dark"

Abandoning the Bayer-Filter to See in the Dark (CVPR 2022) Paper: https://arxiv.org/abs/2203.04042 (Arxiv version) This code includes the training and

74 Dec 15, 2022
Source Code for Simulations in the Publication "Can the brain use waves to solve planning problems?"

Code for Simulations in the Publication Can the brain use waves to solve planning problems? Installing Required Python Packages Please use Python vers

EMD Group 2 Jul 01, 2022
Pytorch for Segmentation

Pytorch for Semantic Segmentation This repo has been deprecated currently and I will not maintain it. Meanwhile, I strongly recommend you can refer to

ycszen 411 Nov 22, 2022
Double pendulum simulator using a symplectic Euler's method and Hamiltonian mechanics

Symplectic Double Pendulum Simulator Double pendulum simulator using a symplectic Euler's method. The program calculates the momentum and position of

Scott Marino 1 Jan 12, 2022
The 7th edition of NTIRE: New Trends in Image Restoration and Enhancement workshop will be held on June 2022 in conjunction with CVPR 2022.

NTIRE 2022 - Image Inpainting Challenge Important dates 2022.02.01: Release of train data (input and output images) and validation data (only input) 2

Andrés Romero 37 Nov 27, 2022
2021 Artificial Intelligence Diabetes Datathon

A.I.D.D. 2021 2021 Artificial Intelligence Diabetes Datathon A.I.D.D. 2021은 ‘2021 인공지능 학습용 데이터 구축사업’을 통해 만들어진 학습용 데이터를 활용하여 당뇨병을 효과적으로 예측할 수 있는가에 대한 A

2 Dec 27, 2021
Let's Git - Versionsverwaltung & Open Source Hausaufgabe

Let's Git - Versionsverwaltung & Open Source Hausaufgabe Herzlich Willkommen zu dieser Hausaufgabe für unseren MOOC: Let's Git! Wir hoffen, dass Du vi

1 Dec 13, 2021
OpenFed: A Comprehensive and Versatile Open-Source Federated Learning Framework

OpenFed: A Comprehensive and Versatile Open-Source Federated Learning Framework Introduction OpenFed is a foundational library for federated learning

25 Dec 12, 2022
Twin-deep neural network for semi-supervised learning of materials properties

Deep Semi-Supervised Teacher-Student Material Synthesizability Prediction Citation: Semi-supervised teacher-student deep neural network for materials

MLEG 3 Dec 14, 2022
Contrastive Learning of Structured World Models

Contrastive Learning of Structured World Models This repository contains the official PyTorch implementation of: Contrastive Learning of Structured Wo

Thomas Kipf 371 Jan 06, 2023