Implementation of the "Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos" paper.

Last update: Dec 29, 2022

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos

Introduction

Point cloud videos exhibit irregularities and a lack of order along the spatial dimension, where points emerge inconsistently across different frames. To capture the dynamics in point cloud videos, point tracking is usually employed. However, as points may flow in and out across frames, computing accurate point trajectories is extremely difficult. Moreover, tracking usually relies on point colors and thus may fail to handle colorless point clouds. In this paper, to avoid point tracking, we propose a novel Point 4D Transformer (P4Transformer) network to model raw point cloud videos. Specifically, P4Transformer consists of (i) a point 4D convolution to embed the spatio-temporal local structures present in a point cloud video and (ii) a transformer to capture the appearance and motion information across the entire video by performing self-attention on the embedded local features. In this fashion, related or similar local areas are merged using attention weights rather than by explicit tracking.
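To make the last point concrete, here is a minimal, dependency-free sketch of single-head scaled dot-product self-attention over a handful of local-area features. It is an illustration only, not the code in this repository: the names `self_attention` and `local_features` are hypothetical, and the learned query/key/value projections of a real transformer are omitted (queries, keys and values are all the raw features, which is an assumption made for brevity).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with Q = K = V = features.

    features: list of equal-length feature vectors, one per local area.
    Returns (merged, weights), where weights[i][j] is how strongly
    local area i attends to local area j.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in features:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in features]
        weights.append(softmax(scores))
    # Each output is the attention-weighted sum of all feature vectors.
    merged = [
        [sum(w * v[j] for w, v in zip(row, features)) for j in range(d)]
        for row in weights
    ]
    return merged, weights


# Hypothetical embedded features for three local areas; the first two are similar.
local_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
merged, weights = self_attention(local_features)
```

In this toy input the first local area gives more weight to the second (similar) area than to the third, which is the sense in which similar regions are merged without any explicit point correspondence.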

Installation

The code is tested with Red Hat Enterprise Linux Workstation release 7.7 (Maipo), g++ (GCC) 8.3.1, PyTorch (both v1.4.0 and v1.8.1 are supported), CUDA 10.2 and cuDNN v7.6.

Compile the CUDA layers for PointNet++, which we used for furthest point sampling (FPS) and radius neighbouring search:

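# Rename the directory that matches your PyTorch version (1.4.0 or 1.8.1) to "modules":
mv modules-pytorch-1.8.1 modules    # or: mv modules-pytorch-1.4.0 modules
cd modules
python setup.py install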
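For readers unfamiliar with the two operations the compiled layers provide, the following is a small pure-Python sketch of furthest point sampling and a radius neighbour search. It is meant only to show what the operations compute; it is not the CUDA implementation built above, and the function names `farthest_point_sampling` and `ball_query` are illustrative choices rather than this repository's API.

```python
import math


def farthest_point_sampling(points, num_samples, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen.

    points: list of coordinate tuples. Returns the indices of the selected points.
    """
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(num_samples, len(points)):
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [
            min(d, math.dist(p, points[nxt])) for d, p in zip(min_dist, points)
        ]
    return selected


def ball_query(points, center, radius):
    """Indices of all points lying within `radius` of `center`."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= radius]


# Toy point cloud (made-up coordinates for illustration).
cloud = [(0, 0, 0), (0.1, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 5)]
anchors = farthest_point_sampling(cloud, num_samples=3)
neighbours = ball_query(cloud, cloud[anchors[0]], radius=0.5)
```

This version costs O(N) per selected point and runs on the CPU, so it is only suitable for small examples; the compiled CUDA layers are the ones to use for actual training.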

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{fan21p4transformer,
  author    = {Hehe Fan and
               Yi Yang and
               Mohan Kankanhalli},
  title     = {Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos},
  booktitle = {{IEEE/CVF} Conference on Computer Vision and Pattern Recognition, {CVPR}},
  year      = {2021}
}

Related Repos

PointNet++ PyTorch implementation: https://github.com/facebookresearch/votenet/tree/master/pointnet2
MeteorNet: https://github.com/xingyul/meteornet
3DV: https://github.com/3huo/3DV-Action
PSTNet: https://github.com/hehefan/Point-Spatio-Temporal-Convolution
Transformer: https://github.com/lucidrains/vit-pytorch
PointRNN (TensorFlow implementation): https://github.com/hehefan/PointRNN
PointRNN (PyTorch implementation): https://github.com/hehefan/PointRNN-PyTorch


Owner

Hehe Fan
