Improving 3D Object Detection with Channel-wise Transformer

Last update: Dec 20, 2022

Overview

"Improving 3D Object Detection with Channel-wise Transformer"

Thanks to OpenPCDet: this implementation of CT3D is mainly based on pcdet v0.3. Our paper was published at ICCV 2021.

Overview of CT3D. The raw points are first fed into the RPN to generate 3D proposals. The raw points, together with the corresponding proposals, are then processed by the channel-wise Transformer, which consists of the proposal-to-point encoding module and the channel-wise decoding module. Specifically, the proposal-to-point encoding module modulates each point feature with global proposal-aware context information. The channel-wise decoding module then transforms the encoded point features into an effective proposal feature representation for confidence prediction and box regression.
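To make "proposal-aware" point features more concrete, here is a minimal sketch of one way such an encoding can be built: each raw point is described by its offsets to the proposal box center and to the 8 box corners, plus its reflectance. This is an assumption for illustration, not the code in this repository. The function names `box_corners` and `proposal_to_point_features` are hypothetical, and in a real model these hand-built features would be passed through learned layers before the Transformer.

```python
import math


def box_corners(cx, cy, cz, length, width, height, yaw):
    """Return the 8 corners of a 3D box rotated by `yaw` around the z axis."""
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx in (-0.5, 0.5):
        for sy in (-0.5, 0.5):
            for sz in (-0.5, 0.5):
                # Corner in the box's local frame, then rotate and translate.
                lx, ly, lz = sx * length, sy * width, sz * height
                corners.append((
                    cx + lx * cos_y - ly * sin_y,
                    cy + lx * sin_y + ly * cos_y,
                    cz + lz,
                ))
    return corners


def proposal_to_point_features(points, box):
    """Build one 28-value feature per point (illustrative assumption).

    points: list of (x, y, z, reflectance)
    box:    (cx, cy, cz, length, width, height, yaw)
    Each feature holds the offsets to the box center and its 8 corners
    (9 keypoints x 3 values) followed by the point's reflectance.
    """
    keypoints = [tuple(box[:3])] + box_corners(*box)
    features = []
    for x, y, z, reflectance in points:
        feature = []
        for kx, ky, kz in keypoints:
            feature.extend((x - kx, y - ky, z - kz))
        feature.append(reflectance)
        features.append(feature)
    return features
```

Because every point is expressed relative to the same proposal keypoints, the features carry information about where the point sits inside the proposal, not only where it sits in the scene.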

	AP@R11	AP@R40	Download
Only Car	86.06	85.79	model-car
3-Category (Car)	85.04	84.97	model-3cat
3-Category (Pedestrian)	56.28	55.58	-
3-Category (Cyclist)	71.71	71.88	-

1. Recommended Environment

Linux (tested on Ubuntu 16.04)
Python 3.6+
PyTorch 1.1 or higher (tested on PyTorch 1.6)
CUDA 9.0 or higher (PyTorch 1.3+ needs CUDA 9.2+)
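The version constraints above can be checked with a small script. The sketch below is only an illustration; `parse_version` and `check_environment` are hypothetical helper names, not part of this repository, and the rules encoded are exactly the ones listed above.

```python
import re


def parse_version(text):
    """Turn a version string such as '1.6.0+cu101' into a tuple like (1, 6, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(python, torch, cuda):
    """Return a list of problems; an empty list means the versions look fine."""
    problems = []
    if parse_version(python) < (3, 6):
        problems.append("Python 3.6+ is required")
    if parse_version(torch) < (1, 1):
        problems.append("PyTorch 1.1 or higher is required")
    # PyTorch 1.3+ needs CUDA 9.2+, older versions only need CUDA 9.0+.
    min_cuda = (9, 2) if parse_version(torch) >= (1, 3) else (9, 0)
    if parse_version(cuda) < min_cuda:
        problems.append("CUDA %d.%d or higher is required for this PyTorch" % min_cuda)
    return problems
```

In a real environment the three arguments could come from `platform.python_version()`, `torch.__version__` and `torch.version.cuda`. Note that `torch.version.cuda` is `None` on CPU-only builds, so that case needs to be handled before calling the helper.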

2. Set the Environment

pip install -r requirement.txt
python setup.py develop

3. Data Preparation

Prepare KITTI dataset and road planes

# Download KITTI and organize it into the following form:
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │── testing
│   │   │   ├──calib & velodyne & image_2

# Generate data infos:
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
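Before generating the data infos, it can help to confirm that the folders match the layout above. The following is a minimal sketch; `missing_kitti_dirs` and `REQUIRED` are hypothetical names introduced here, and the optional `planes` folder is deliberately not checked.

```python
from pathlib import Path

# Sub-folders expected under data/kitti/<split>/, taken from the layout above.
REQUIRED = {
    "training": ["calib", "velodyne", "label_2", "image_2"],
    "testing": ["calib", "velodyne", "image_2"],
}


def missing_kitti_dirs(data_root):
    """Return the folders (relative to <data_root>/kitti) that do not exist."""
    kitti = Path(data_root) / "kitti"
    missing = []
    if not (kitti / "ImageSets").is_dir():
        missing.append("ImageSets")
    for split, subdirs in REQUIRED.items():
        for name in subdirs:
            if not (kitti / split / name).is_dir():
                missing.append(f"{split}/{name}")
    return missing
```

Calling `missing_kitti_dirs("data")` from the repository root returns an empty list when the layout is complete.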

Prepare Waymo dataset

# Download Waymo and organize it into the following form:
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
│   │   │── waymo_processed_data
│   │   │   │── segment-xxxxxxxx/
│   │   │   │── ...
│   │   │── pcdet_gt_database_train_sampled_xx/
│   │   │── pcdet_waymo_dbinfos_train_sampled_xx.pkl

# Install tf 2.1.0
# Install the official waymo-open-dataset by running the following command:
pip3 install --upgrade pip
pip3 install waymo-open-dataset-tf-2-1-0 --user

# Extract point cloud data from tfrecord and generate data infos:
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
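Extraction of the Waymo tfrecords can take a long time, so it is useful to see which segments still lack a processed folder. The sketch below assumes, based on the layout above, that each `segment-xxxxxxxx.tfrecord` in `raw_data` produces a folder of the same name (without the extension) in `waymo_processed_data`. The helper name `unprocessed_segments` is hypothetical.

```python
from pathlib import Path


def unprocessed_segments(waymo_root):
    """List raw segments that have no matching folder in waymo_processed_data."""
    root = Path(waymo_root)
    # 'segment-123.tfrecord' -> 'segment-123'
    raw = {p.stem for p in (root / "raw_data").glob("segment-*.tfrecord")}
    done = {
        p.name
        for p in (root / "waymo_processed_data").glob("segment-*")
        if p.is_dir()
    }
    return sorted(raw - done)
```

For example, `unprocessed_segments("data/waymo")` returns the segment names that still need to be extracted.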

4. Train

Train with a single GPU

python train.py --cfg_file ${CONFIG_FILE}

# e.g.,
python train.py --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml

Train with multiple GPUs or multiple machines

bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE}
# or 
bash scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_GPUS} --cfg_file ${CONFIG_FILE}

# e.g.,
bash scripts/dist_train.sh 8 --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml
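When launching runs from a Python script, the single-GPU and multi-GPU commands above can be chosen programmatically. This is a small sketch under the assumption that the commands are run from the same directory as in the examples; `build_train_command` is a hypothetical helper, and the SLURM variant is left out.

```python
def build_train_command(cfg_file, num_gpus=1):
    """Return the training command from this section as an argument list."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single-GPU training calls train.py directly.
        return ["python", "train.py", "--cfg_file", cfg_file]
    # Multi-GPU training goes through the distributed launch script.
    return ["bash", "scripts/dist_train.sh", str(num_gpus), "--cfg_file", cfg_file]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`.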

5. Test

Test with a pretrained model:

python test.py --cfg_file ${CONFIG_FILE} --ckpt ${CKPT}

# e.g., 
python test.py --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml --ckpt output/kitti_models/second_ct3d/default/kitti_val.pth
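In the example above, the checkpoint for `tools/cfgs/kitti_models/second_ct3d.yaml` lives under `output/kitti_models/second_ct3d/default/`. The sketch below reproduces that mapping so a checkpoint folder can be guessed from a config path. It is inferred only from this one example pair, so treat it as an assumption and not as the framework's actual rule; `guess_output_dir` and its `extra_tag` parameter are hypothetical names.

```python
from pathlib import PurePosixPath


def guess_output_dir(cfg_file, extra_tag="default"):
    """Guess the output folder for a config, following the example in this section."""
    path = PurePosixPath(cfg_file)
    parts = path.parts
    # Keep the folders between 'cfgs' and the file name, e.g. ('kitti_models',).
    group = parts[parts.index("cfgs") + 1:-1] if "cfgs" in parts else ()
    return str(PurePosixPath("output", *group, path.stem, extra_tag))
```

With the config from the example, `guess_output_dir("tools/cfgs/kitti_models/second_ct3d.yaml")` gives `output/kitti_models/second_ct3d/default`.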

Owner

Hualian Sheng
