Robust Consistent Video Depth Estimation

Overview

[CVPR 2021] Robust Consistent Video Depth Estimation

Open in Colab

This repository contains Python and C++ implementation of Robust Consistent Video Depth, as described in the paper

Johannes Kopf, Xuejian Rong, and Jia-Bin Huang. Robust Consistent Video Despth Estimation. CVPR 2021

Project | Paper | Video | Colab

We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video. We integrate a learning-based depth prior, in the form of a convolutional neural network trained for single-image depth estimation, with geometric optimization, to estimate a smooth camera trajectory as well as detailed and stable depth reconstruction.

teaser

Changelog

  • [June 2021] Released the companion Colab notebook.
  • [June 2021] Initial release of Robust CVD.

Installation

Please refer to the colab notebook for how to install the dependencies.

Running

Please refer to the colab notebook for how to run the cli tool for now.

Result Folder Structure

frames.txt              # meta data about number of frames, image resolution and timestamps for each frame
color_full/             # extracted frames in the original resolution
color_down/             # extracted frames in the resolution for disparity estimation 
color_down_png/      
color_flow/             # extracted frames in the resolution for flow estimation
flow_list.json          # indices of frame pairs to finetune the model with
flow/                   # optical flow 
mask/                   # mask of consistent flow estimation between frame pairs.
vis_flow/               # optical flow visualization. Green regions contain inconsistent flow. 
vis_flow_warped/        # visualzing flow accuracy by warping one frame to another using the estimated flow. e.g., frame_000000_000032_warped.png warps frame_000032 to frame_000000.
depth_${model_type}/    # initial disparity estimation using the original monocular depth model before test-time training
R_hierarchical2_${model_type}/ 
    flow_list_0.20.json                 # indices of frame pairs passing overlap ratio test of threshold 0.2. Same content as ../flow_list.json.
    videos/                             # video visualization of results 
    B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam/
        checkpoints/                    # checkpoint after each epoch
        depth/                          # final disparity map results after finishing test-time training
        eval/                           # intermediate losses and disparity maps after each epoch 
        tensorboard/                    # tensorboard log for the test-time training process

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{kopf2021rcvd,
 title={Robust Consistent Video Depth Estimation},
 author={Kopf, Johannes and Rong, Xuejian and Huang, Jia-Bin},
 year={2021},
 booktitle=IEEE/CVF Conference on Computer Vision and Pattern Recognition
}

License

See the LICENSE for more details.

Issues & Help

For help or issues using Robust CVD, please submit a GitHub issue or a PR request.

Before you do this, make sure you have checked CODE_OF_CONDUCT, CONTRIBUTING, ISSUE_TEMPLATE, and PR_TEMPLATE.

Acknowledgements

Check our previous work on Consistent Video Depth Estimation.

We also thank the authors for releasing PyTorch, Ceres Solver, OpenCV, Eigen, MiDaS, RAFT, and detectron2.

This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Intro This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales Vehicle Sam

39 Jul 21, 2022
The PyTorch improved version of TPAMI 2017 paper: Face Alignment in Full Pose Range: A 3D Total Solution.

Face Alignment in Full Pose Range: A 3D Total Solution By Jianzhu Guo. [Updates] 2020.8.30: The pre-trained model and code of ECCV-20 are made public

Jianzhu Guo 3.4k Jan 02, 2023
Ἀνατομή is a PyTorch library to analyze representation of neural networks

Ἀνατομή is a PyTorch library to analyze representation of neural networks

Ryuichiro Hataya 50 Dec 05, 2022
Code Repository for Liquid Time-Constant Networks (LTCs)

Liquid time-constant Networks (LTCs) [Update] A Pytorch version is added in our sister repository: https://github.com/mlech26l/keras-ncp This is the o

Ramin Hasani 553 Dec 27, 2022
2D Human Pose estimation using transformers. Implementation in Pytorch

PE-former: Pose Estimation Transformer Vision transformer architectures perform very well for image classification tasks. Efforts to solve more challe

Panteleris Paschalis 23 Oct 17, 2022
The implementation of the paper "A Deep Feature Aggregation Network for Accurate Indoor Camera Localization".

A Deep Feature Aggregation Network for Accurate Indoor Camera Localization This is the PyTorch implementation of our paper "A Deep Feature Aggregation

9 Dec 09, 2022
Layered Neural Atlases for Consistent Video Editing

Layered Neural Atlases for Consistent Video Editing Project Page | Paper This repository contains an implementation for the SIGGRAPH Asia 2021 paper L

Yoni Kasten 353 Dec 27, 2022
PaRT: Parallel Learning for Robust and Transparent AI

PaRT: Parallel Learning for Robust and Transparent AI This repository contains the code for PaRT, an algorithm for training a base network on multiple

Mahsa 0 May 02, 2022
A JAX-based research framework for writing differentiable numerical simulators with arbitrary discretizations

jaxdf - JAX-based Discretization Framework Overview | Example | Installation | Documentation ⚠️ This library is still in development. Breaking changes

UCL Biomedical Ultrasound Group 65 Dec 23, 2022
YOLOX-CondInst - Implement CondInst which is a instances segmentation method on YOLOX

YOLOX CondInst -- YOLOX 实例分割 前言 本项目是自己学习实例分割时,复现的代码. 通过自己编程,让自己对实例分割有更进一步的了解。 若想

DDGRCF 16 Nov 18, 2022
Libraries, tools and tasks created and used at DeepMind Robotics.

Libraries, tools and tasks created and used at DeepMind Robotics.

DeepMind 270 Nov 30, 2022
Rate-limit-semaphore - Semaphore implementation with rate limit restriction for async-style (any core)

Rate Limit Semaphore Rate limit semaphore for async-style (any core) There are t

Yan Kurbatov 4 Jun 21, 2022
Download and preprocess popular sequential recommendation datasets

Sequential Recommendation Datasets This repository collects some commonly used sequential recommendation datasets in recent research papers and provid

125 Dec 06, 2022
An official TensorFlow implementation of “CLCC: Contrastive Learning for Color Constancy” accepted at CVPR 2021.

CLCC: Contrastive Learning for Color Constancy (CVPR 2021) Yi-Chen Lo*, Chia-Che Chang*, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang,

Yi-Chen (Howard) Lo 58 Dec 17, 2022
This is the official released code for our paper, The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos

The-Emergence-of-Objectness This is the official released code for our paper, The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos

44 Oct 08, 2022
Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021)

MSAD Multi-Scale Aligned Distillation for Low-Resolution Detection Lu Qi*, Jason Kuen*, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya J

DV Lab 115 Dec 23, 2022
[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Fudan Zhang Vision Group 897 Jan 05, 2023
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

Rowel Atienza 198 Dec 27, 2022
Official implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

CrossViT This repository is the official implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. ArXiv If

International Business Machines 168 Dec 29, 2022
Godot RL Agents is a fully Open Source packages that allows video game creators

Godot RL Agents The Godot RL Agents is a fully Open Source packages that allows video game creators, AI researchers and hobbiest the opportunity to le

Edward Beeching 326 Dec 30, 2022