Towards Part-Based Understanding of RGB-D Scans

Last update: Nov 23, 2022

Overview

Towards Part-Based Understanding of RGB-D Scans (CVPR 2021)

We propose the task of part-based scene understanding of real-world 3D environments: from an RGB-D scan of a scene, we detect objects, and for each object predict its decomposition into geometric part masks, which composed together form the complete geometry of the observed object.
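
To make the idea of part masks composing into the complete object concrete, here is a minimal sketch in Python. It is not the data structure used in this repository: the `PartNode` class, its fields, and the toy "chair" are assumptions made for illustration, and masks are stored as sets of voxel indices rather than dense grids.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple

Voxel = Tuple[int, int, int]  # (x, y, z) index into the object's voxel grid


@dataclass
class PartNode:
    """One node of a hypothetical part tree (illustrative, not the repo's class)."""
    label: str
    mask: FrozenSet[Voxel] = frozenset()  # occupied voxels; only used by leaf parts
    children: List["PartNode"] = field(default_factory=list)

    def geometry(self) -> FrozenSet[Voxel]:
        """Return the union of all leaf part masks below this node."""
        if not self.children:
            return self.mask
        voxels = set()
        for child in self.children:
            voxels |= child.geometry()
        return frozenset(voxels)


# Toy object: a seat and a base made of two legs.
chair = PartNode("chair", children=[
    PartNode("seat", mask=frozenset({(0, 1, 0), (1, 1, 0)})),
    PartNode("base", children=[
        PartNode("leg", mask=frozenset({(0, 0, 0)})),
        PartNode("leg", mask=frozenset({(1, 0, 0)})),
    ]),
])

# The complete geometry is the composition of the part masks.
complete = chair.geometry()
```

Here `complete` contains every voxel covered by some leaf part, which mirrors the statement that the part masks together form the full geometry of the object.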

Download Paper (.pdf)

Get started

The core of this repository is a network that takes preprocessed voxel crops of scans as input and produces voxelized part trees. Data preparation is a substantial step that must be completed before training or inference, so we release prepared training data and a checkpoint for inference. To launch training with our data, follow the steps below:

1. Clone the repository: git clone https://github.com/alexeybokhovkin/part-based-scan-understanding.git
2. Download the data and/or the checkpoint:
   - ScanNet MLCVNet crops (fine-tuning) [894M]
   - ScanNet clean crops (pretraining) [995M]
   - PartNet GT trees [103M]
   - Parts priors [169M]
   - Checkpoint [19M]
3. For training, prepare an augmented version of the ScanNet crops with the script dataproc/prepare_rot_aug_data.py. Then create a folder with all the necessary dataset metadata using the script dataproc/gather_all_shapes.py.
4. Create a config file similar to configs/config_gnn_scannet_allshapes.yaml (you need to provide paths to some directories and files).
5. Launch training with train_gnn_scannet.py.
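
Since the config file requires paths to several directories and files, it can help to check that they exist before launching training. The sketch below is one possible way to do that and is not part of the repository: the helper `missing_paths` and every key and path in `example_config` are hypothetical. It assumes the config has already been loaded into a dictionary, for example with a YAML parser.

```python
import os
from typing import Any, Dict, List


def missing_paths(config: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted keys whose string values look like paths but do not exist."""
    missing = []
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections of the config.
            missing.extend(missing_paths(value, prefix=name + "."))
        elif isinstance(value, str) and (os.sep in value or "/" in value):
            # Treat any string containing a path separator as a path.
            if not os.path.exists(value):
                missing.append(name)
    return missing


# Hypothetical keys and paths, for illustration only.
example_config = {
    "data": {"crops_dir": "/path/to/scannet_crops", "trees_dir": os.getcwd()},
    "checkpoint": "/path/to/checkpoint.ckpt",
    "batch_size": 8,
}
print(missing_paths(example_config))
```

The heuristic only looks at string values that contain a path separator, so plain names and numeric settings are ignored. Adjust it if your config stores relative file names without separators.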

Citation

If you use this framework, please cite:

@article{Bokhovkin2020TowardsPU,
  title={Towards Part-Based Understanding of RGB-D Scans},
  author={Alexey Bokhovkin and V. Ishimtsev and Emil Bogomolov and D. Zorin and A. Artemov and Evgeny Burnaev and Angela Dai},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.02094}
}

Comments

scannet_shape_ids files and part segmentation
First of all, thanks for the great work! I have two questions about this repo and your paper:

It seems that txt files for scannet_shape_ids are required for prepare_rot_aug_data.py. But I cannot find them in the provided dataset files.

Could you explain more details about part segmentation on 3D scans? I'm confused if the part segmentation labels for 3d scans are generated by 1) aligning PartNet data, 2) assigning part labels to overlapped regions. Do you provide point-wise (or voxel-wise) part segmentation annotation?
opened by jeonghyunkeem

Releases (v0.1)

v0.1 (Jun 18, 2021)
