[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting

Last update: Jan 01, 2023

Overview

[Paper] [Project Website] [Google Colab]

We propose a method for converting a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view. We use a Layered Depth Image with explicit pixel connectivity as the underlying representation, and present a learning-based inpainting model that iteratively synthesizes new local color-and-depth content into the occluded region in a spatial context-aware manner. The resulting 3D photos can be efficiently rendered with motion parallax using standard graphics engines. We validate the effectiveness of our method on a wide range of challenging everyday scenes and show fewer artifacts than state-of-the-art methods.

3D Photography using Context-aware Layered Depth Inpainting
Meng-Li Shih, Shih-Yang Su, Johannes Kopf, and Jia-Bin Huang
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

Prerequisites

Linux (tested on Ubuntu 18.04.4 LTS)
Anaconda
Python 3.7 (tested on 3.7.4)
PyTorch 1.4.0 (tested on 1.4.0 for execution)

and the Python dependencies listed in requirements.txt

To get started, please run the following commands:

```
conda create -n 3DP python=3.7 anaconda
conda activate 3DP
pip install -r requirements.txt
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit==10.1.243 -c pytorch
```
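
After the environment is created, it can help to confirm that the interpreter and packages match the tested versions listed above. The following is a minimal sketch, not part of the repository: the function names (`matches`, `installed_version`, `check_environment`) are hypothetical, and the check only compares version prefixes.

```python
import importlib
import sys

# Versions taken from the prerequisites and install commands in this README.
TESTED = {"python": "3.7", "torch": "1.4.0", "torchvision": "0.5.0"}


def matches(installed, tested):
    """Return True if `installed` begins with the dotted components of `tested`.

    Local build suffixes such as "+cu101" are ignored.
    """
    installed_parts = installed.split("+")[0].split(".")
    tested_parts = tested.split(".")
    return installed_parts[: len(tested_parts)] == tested_parts


def installed_version(package):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package)
    except Exception:  # a missing or broken install both count as "not available"
        return None
    return getattr(module, "__version__", None)


def check_environment():
    """Map each component to a tuple (installed version or None, matches tested?)."""
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    report = {"python": (python_version, matches(python_version, TESTED["python"]))}
    for package in ("torch", "torchvision"):
        found = installed_version(package)
        ok = found is not None and matches(found, TESTED[package])
        report[package] = (found, ok)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        status = "OK" if ok else "differs from tested version"
        print("{:12s} installed={} tested={} ({})".format(name, found, TESTED[name], status))
```

A mismatch does not necessarily mean the code will fail; it only means the setup differs from the one reported as tested.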

Next, please download the model weights using the following commands:
```
chmod +x download.sh
./download.sh
```

Quick start

Please follow the instructions in this section. This should allow you to reproduce our results. For more detailed instructions, please refer to DOCUMENTATION.md.

Execute

Put .jpg files (e.g., test.jpg) into the image folder.
- E.g., image/moon.jpg
Run the following command:
```
python main.py --config argument.yml
```
- Note: The 3D photo generation process usually takes about 2-3 minutes depending on the available computing resources.
The results are stored in the following directories:
- Corresponding depth map estimated by MiDaS
  - E.g. depth/moon.npy, depth/moon.png
  - You can edit depth/moon.png manually.
    - To use the manually edited depth/moon.png as input for the 3D photo, set the following two flags:
      - depth_format: '.png'
      - require_midas: False
- Inpainted 3D mesh (optional: requires enabling the save_ply flag)
  - E.g. mesh/moon.ply
- Rendered videos with zoom-in motion
  - E.g. video/moon_zoom-in.mp4
- Rendered videos with swing motion
  - E.g. video/moon_swing.mp4
- Rendered videos with circle motion
  - E.g. video/moon_circle.mp4
- Rendered videos with dolly zoom-in effect
  - E.g. video/moon_dolly-zoom-in.mp4
  - Note: We assume that the object of focus is located at the center of the image.
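
The output layout listed above can be sketched as a small helper that derives the expected result files from an input image name and reports which ones are missing after a run. This is an illustrative sketch only: `expected_outputs` and `missing_outputs` are hypothetical names, and the file-name pattern is inferred from the moon.jpg examples in this section.

```python
from pathlib import Path

# Motion suffixes taken from the example file names in this section.
VIDEO_MOTIONS = ("zoom-in", "swing", "circle", "dolly-zoom-in")


def expected_outputs(image_path, root="."):
    """Return the result files described in this section for one input image."""
    stem = Path(image_path).stem  # "image/moon.jpg" -> "moon"
    root = Path(root)
    return {
        "depth": [root / "depth" / (stem + ".npy"), root / "depth" / (stem + ".png")],
        # The mesh is only written when the save_ply flag is enabled.
        "mesh": [root / "mesh" / (stem + ".ply")],
        "video": [root / "video" / "{}_{}.mp4".format(stem, motion) for motion in VIDEO_MOTIONS],
    }


def missing_outputs(image_path, root=".", include_mesh=False):
    """Return the expected result files that do not exist under `root`."""
    outputs = expected_outputs(image_path, root)
    wanted = outputs["depth"] + outputs["video"]
    if include_mesh:
        wanted += outputs["mesh"]
    return [path for path in wanted if not path.exists()]


if __name__ == "__main__":
    for path in missing_outputs("image/moon.jpg", include_mesh=True):
        print("missing:", path)
```

Running it from the repository root after `python main.py --config argument.yml` prints any expected file that was not produced.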
(Optional) If you want to change the default configuration, please read DOCUMENTATION.md and modify argument.yml.
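
As an example of modifying argument.yml, the two flags for a manually edited depth map (depth_format and require_midas) could be switched with a short script instead of by hand. The sketch below uses only the standard library and assumes argument.yml is a flat file with one `key: value` pair per line; that layout is an assumption, and `set_yaml_flags` is a hypothetical helper, not part of the repository. The sample text in the usage part is made up for illustration.

```python
import re

# Flag values from this section for using a manually edited depth/moon.png.
MANUAL_DEPTH_FLAGS = {"depth_format": "'.png'", "require_midas": "False"}


def set_yaml_flags(text, updates):
    """Return `text` with top-level `key: value` lines rewritten per `updates`.

    Assumes a flat YAML file (one `key: value` per line). Any trailing comment
    on a rewritten line is dropped. Keys missing from `text` are appended.
    """
    lines = text.splitlines()
    remaining = dict(updates)
    for index, line in enumerate(lines):
        match = re.match(r"^([A-Za-z_][\w-]*)\s*:", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            lines[index] = "{}: {}".format(key, remaining.pop(key))
    for key, value in remaining.items():
        lines.append("{}: {}".format(key, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical excerpt of a config file, used only to show the rewrite.
    sample = "depth_format: '.npy'\nrequire_midas: True\nsave_ply: False\n"
    print(set_yaml_flags(sample, MANUAL_DEPTH_FLAGS))
```

To apply it to the real file, read argument.yml into a string, pass it through `set_yaml_flags`, and write the result back, ideally after keeping a copy of the original.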

License

This work is licensed under the MIT License. See LICENSE for details.

If you find our code/models useful, please consider citing our paper:

@inproceedings{Shih3DP20,
  author = {Shih, Meng-Li and Su, Shih-Yang and Kopf, Johannes and Huang, Jia-Bin},
  title = {3D Photography using Context-aware Layered Depth Inpainting},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}
}

Acknowledgments

We thank Pratul Srinivasan for clarifying the method of Srinivasan et al. (CVPR 2019).
We thank the authors of Zhou et al. 2018, Choi et al. 2019, Mildenhall et al. 2019, Srinivasan et al. 2019, Wiles et al. 2020, and Niklaus et al. 2019 for providing their implementations online.
Our code builds upon EdgeConnect, MiDaS, and pytorch-inpainting-with-partial-conv.

Owner

Virginia Tech Vision and Learning Lab