[ICCV 2021 Oral] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

Last update: Dec 26, 2022

Overview

PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

Created by Xumin Yu*, Yongming Rao*, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie Zhou

This repository contains PyTorch implementation for PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers (ICCV 2021 Oral Presentation) [arXiv].

PoinTr is a transformer-based model for point cloud completion. By representing the point cloud as a set of unordered groups of points with position embeddings, we convert the point cloud to a sequence of point proxies and employ a transformer encoder-decoder architecture for generation. We also propose two more challenging benchmarks ShapeNet-55/34 with more diverse incomplete point clouds that can better reflect the real-world scenarios to promote future research.

Pretrained Models

We provide pretrained PoinTr models:

dataset	url
ShapeNet-55	[Tsinghua Cloud] / [Google Drive] / [BaiDuYun] (code:erdh)
ShapeNet-34	[Tsinghua Cloud] / [Google Drive] / [BaiDuYun] (code:atbb )
PCN	[Tsinghua Cloud] / [Google Drive] / [BaiDuYun] (code:9g79)
KITTI	coming soon

Usage

Requirements

PyTorch >= 1.7.0
python >= 3.7
CUDA >= 9.0
GCC >= 4.9
torchvision
timm
open3d
tensorboardX

pip install -r requirements.txt

Building Pytorch Extensions for Chamfer Distance, PointNet++ and kNN

NOTE: PyTorch >= 1.7 and GCC >= 4.9 are required.

# Chamfer Distance
bash install.sh
# PointNet++
pip install "git+git://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl

Dataset

The details of our new ShapeNet-55/34 datasets and other existing datasets can be found in DATASET.md.

Evaluation

To evaluate a pre-trained PoinTr model on the Three Dataset with single GPU, run:

bash ./scripts/test.sh <GPU_IDS> --ckpts <path> --config <config> --exp_name <name> [--mode <easy/median/hard>]

Some examples:

Test the PoinTr pretrained model on the PCN benchmark:

bash ./scripts/test.sh 0 --ckpts ./pretrained/PoinTr_PCN.pth --config ./cfgs/PCN_models/PoinTr.yaml --exp_name example

Test the PoinTr pretrained model on ShapeNet55 benchmark (easy mode):

bash ./scripts/test.sh 0 --ckpts ./pretrained/PoinTr_ShapeNet55.pth --config ./cfgs/ShapeNet55_models/PoinTr.yaml --mode easy --exp_name example

Test the PoinTr pretrained model on the KITTI benchmark:

bash ./scripts/test.sh 0 --ckpts ./pretrained/PoinTr_KITTI.pth --config ./cfgs/KITTI_models/PoinTr.yaml --exp_name example

Training

To train a point cloud completion model from scratch, run:

# Use DistributedDataParallel (DDP)
bash ./scripts/dist_train.sh <NUM_GPU> <port> --config <config> --exp_name <name> [--resume] [--start_ckpts <path>] [--val_freq <int>]
# or just use DataParallel (DP)
bash ./scripts/train.sh <GPUIDS> --config <config> --exp_name <name> [--resume] [--start_ckpts <path>] [--val_freq <int>]

Some examples:

Train a PoinTr model on PCN benchmark with 2 gpus:

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/dist_train.sh 2 13232 --config ./cfgs/PCN_models/PoinTr.yaml --exp_name example

Resume a checkpoint:

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/dist_train.sh 2 13232 --config ./cfgs/PCN_models/PoinTr.yaml --exp_name example --resume

Finetune a PoinTr on PCNCars

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/dist_train.sh 2 13232 --config ./cfgs/KITTI_models/PoinTr.yaml --exp_name example --start_ckpts ./weight.pth

Train a PoinTr model with a single GPU:

bash ./scripts/train.sh 0 --config ./cfgs/KITTI_models/PoinTr.yaml --exp_name example

We also provide the Pytorch implementation of several baseline models including GRNet, PCN, TopNet and FoldingNet. For example, to train a GRNet model on ShapeNet-55, run:

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/dist_train.sh 2 13232 --config ./cfgs/ShapeNet55_models/GRNet.yaml --exp_name example

Completion Results on ShapeNet55 and KITTI-Cars

License

MIT License

Acknowledgements

Our code is inspired by GRNet and mmdetection3d.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{yu2021pointr,
  title={PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers},
  author={Yu, Xumin, Rao, Yongming and Wang, Ziyi and Liu, Zuyan, and Lu, Jiwen and Zhou, Jie},
  booktitle={ICCV},
  year={2021}
}

[ICCV 2021 Oral] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

Related tags

Overview

PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

Pretrained Models

Usage

Requirements

Building Pytorch Extensions for Chamfer Distance, PointNet++ and kNN

Dataset

Evaluation

Some examples:

Training

Some examples:

Completion Results on ShapeNet55 and KITTI-Cars

License

Acknowledgements

Citation

Owner

Xumin Yu

Pytorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting

FMA: A Dataset For Music Analysis

Simple and Distributed Machine Learning

Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

PyMatting: A Python Library for Alpha Matting

SIEM Logstash parsing for more than hundred technologies

PyTorch/GPU re-implementation of the paper Masked Autoencoders Are Scalable Vision Learners

Official implementation for "Style Transformer for Image Inversion and Editing" (CVPR 2022)

Classical OCR DCNN reproduction based on PaddlePaddle framework.

Open Source Differentiable Computer Vision Library for PyTorch

This repository contains the source codes for the paper AtlasNet V2 - Learning Elementary Structures.

Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification.

Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)

Exploring Simple Siamese Representation Learning

PN-Net a neural field-based framework for depth estimation from single-view RGB images.

Repository for the paper titled: "When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer"

Single Image Deraining Using Bilateral Recurrent Network (TIP 2020)

Repo for the ACMMM20 submission: "Personalized breath based biometric authentication with wearable multimodality".

Some methods for comparing network representations in deep learning and neuroscience.

An implementation of shampoo