《Train in Germany, Test in The USA: Making 3D Object Detectors Generalize》(CVPR 2020)

Overview

Train in Germany, Test in The USA: Making 3D Object Detectors Generalize

This paper has been accpeted by Conference on Computer Vision and Pattern Recognition (CVPR) 2020.

Train in Germany, Test in The USA: Making 3D Object Detectors Generalize

by Yan Wang*, Xiangyu Chen*, Yurong You, Li Erran, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, Wei-Lun Chao*

Figure

Dependencies

Usage

Prepare Datasets (Jupyter notebook)

We develop our method on these datasets:

  1. Configure dataset_path in config_path.py.

    Raw datasets will be organized as the following structure:

     dataset_path/
         | kitti/               # KITTI object detection 3D dataset
             | training/
             | testing/
         | argo/                # Argoverse dataset v1.1
             | train1/
             | train2/
             | train3/
             | train4/
             | val/
             | test/
         | nusc/                # nuScenes dataset v1.0
             | maps/
             | samples/
             | sweeps/
             | v1.0-trainval/
         | lyft/                # Lyft Level 5 dataset v1.02
             | v1.02-train/
         | waymo/               # Waymo dataset v1.0
             | training/
             | validation/
     
  2. Download all datasets.

    For KITTI, Argoverse and Waymo, we provide scripts for automatic download.

    cd scripts/
    python download.py [--datasets kitti+argo+waymo]

    nuScenes and Lyft need to downloaded manually.

  3. Convert all datasets to KITTI format.

    cd scripts/
    python -m pip install -r convert_requirements.txt
    python convert.py [--datasets argo+nusc+lyft+waymo]
  4. Split validation set

    We provide the train/val split used in our experiments under split folder.

    cd split/
    python replace_split.py
  5. Generate car subset

    We filter scenes and only keep those with cars.

    cd scripts/
    python gen_car_split.py

Statistical Normalization (Jupyter notebook)

  1. Compute car size statistics of each dataset. The computed statistics are stored as label_stats_{train/val/test}.json under KITTI format dataset root.

    cd stat_norm/
    python stat.py
  2. Generate rescaled datasets according to car size statistics. The rescaled datasets are stored under $dataset_path/rescaled_datasets by default.

    cd stat_norm/
    python norm.py [--path $PATH]

Training (To be updated)

We use PointRCNN to validate our method.

  1. Setup PointRCNN

    cd pointrcnn/
    ./build_and_install.sh
  2. Build datasets in PointRCNN format.

    cd pointrcnn/tools/
    python generate_multi_data.py
    python generate_gt_database.py --root ...
  3. Download the models pretrained on source domains from google drive using gdrive.

    cd pointrcnn/tools/
    gdrive download -r 14MXjNImFoS2P7YprLNpSmFBsvxf5J2Kw
  4. Adapt to a new domain by re-training with rescaled data.

    cd pointrcnn/tools/
    
    python train_rcnn.py --cfg_file ...

Inference

cd pointrcnn/tools/
python eval_rcnn.py --ckpt /path/to/checkpoint.pth --dataset $dataset --output_dir $output_dir 

Evaluation

We provide evaluation code with

  • old (based on bbox height) and new (based on distance) difficulty metrics
  • output transformation functions to locate domain gap
python evaluate/
python evaluate.py --result_path $predictions --dataset_path $dataset_root --metric [old/new]

Citation

@inproceedings{wang2020train,
  title={Train in germany, test in the usa: Making 3d object detectors generalize},
  author={Yan Wang and Xiangyu Chen and Yurong You and Li Erran and Bharath Hariharan and Mark Campbell and Kilian Q. Weinberger and Wei-Lun Chao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11713-11723},
  year={2020}
}
Owner
Xiangyu Chen
Ph.D. Student in Computer Science
Xiangyu Chen
SPEAR: Semi suPErvised dAta progRamming

Semi-Supervised Data Programming for Data Efficient Machine Learning SPEAR is a library for data programming with semi-supervision. The package implem

decile-team 91 Dec 06, 2022
A TensorFlow implementation of DeepMind's WaveNet paper

A TensorFlow implementation of DeepMind's WaveNet paper This is a TensorFlow implementation of the WaveNet generative neural network architecture for

Igor Babuschkin 5.3k Dec 28, 2022
Adversarial Autoencoders

Adversarial Autoencoders (with Pytorch) Dependencies argparse time torch torchvision numpy itertools matplotlib Create Datasets python create_datasets

Felipe Ducau 188 Jan 01, 2023
Pansharpening by convolutional neural networks in the full resolution framework

Z-PNN: Zoom Pansharpening Neural Network Pansharpening by convolutional neural networks in the full resolution framework is a deep learning method for

20 Nov 24, 2022
Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving This is the source code for our paper Frequency Domain Image Tran

Mu Cai 52 Dec 23, 2022
TDmatch is a Python library developed to perform matching tasks in three categories:

TDmatch TDmatch is a Python library developed to perform matching tasks in three categories: Text to Data which matches tuples of a table to text docu

Naser Ahmadi 5 Aug 11, 2022
Semantic segmentation task for ADE20k & cityscapse dataset, based on several models.

semantic-segmentation-tensorflow This is a Tensorflow implementation of semantic segmentation models on MIT ADE20K scene parsing dataset and Cityscape

HsuanKung Yang 83 Oct 13, 2022
a pytorch implementation of auto-punctuation learned character by character

Learning Auto-Punctuation by Reading Engadget Articles Link to Other of my work 🌟 Deep Learning Notes: A collection of my notes going from basic mult

Ge Yang 137 Nov 09, 2022
a general-purpose Transformer based vision backbone

Swin Transformer By Ze Liu*, Yutong Lin*, Yue Cao*, Han Hu*, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo. This repo is the official implement

Microsoft 9.9k Jan 08, 2023
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network Paddle-PANet 目录 结果对比 论文介绍 快速安装 结果对比 CTW1500 Method Backbone Fine

7 Aug 08, 2022
A embed able annotation tool for end to end cross document co-reference

CoRefi CoRefi is an emebedable web component and stand alone suite for exaughstive Within Document and Cross Document Coreference Anntoation. For a de

PythicCoder 39 Dec 12, 2022
🥇 LG-AI-Challenge 2022 1위 솔루션 입니다.

LG-AI-Challenge-for-Plant-Classification Dacon에서 진행된 농업 환경 변화에 따른 작물 병해 진단 AI 경진대회 에 대한 코드입니다. (colab directory에 코드가 잘 정리 되어있습니다.) Requirements python

siwooyong 10 Jun 30, 2022
the official implementation of the paper "Isometric Multi-Shape Matching" (CVPR 2021)

Isometric Multi-Shape Matching (IsoMuSh) Paper-CVF | Paper-arXiv | Video | Code Citation If you find our work useful in your research, please consider

Maolin Gao 9 Jul 17, 2022
Learning multiple gaits of quadruped robot using hierarchical reinforcement learning

Learning multiple gaits of quadruped robot using hierarchical reinforcement learning We propose a method to learn multiple gaits of quadruped robot us

Yunho Kim 17 Dec 11, 2022
DAN: Unfolding the Alternating Optimization for Blind Super Resolution

DAN-Basd-on-Openmmlab DAN: Unfolding the Alternating Optimization for Blind Super Resolution We reproduce DAN via mmediting based on open-sourced code

AlexZou 72 Dec 13, 2022
Official code for "Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes", CVPR2022

[CVPR 2022] Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes Dongkwon Jin, Wonhui Park, Seong-Gyun Jeong, Heeyeon Kwon, and Cha

Dongkwon Jin 106 Dec 29, 2022
CLIPImageClassifier wraps clip image model from transformers

CLIPImageClassifier CLIPImageClassifier wraps clip image model from transformers. CLIPImageClassifier is initialized with the argument classes, these

Jina AI 6 Sep 12, 2022
A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

Meta Research 431 Jan 07, 2023
Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption

SG-GAN TensorFlow implementation of SG-GAN. Prerequisites TensorFlow (implemented in v1.3) numpy scipy pillow Getting Started Train Prepare dataset. W

lplcor 61 Jun 07, 2022
🔅 Shapash makes Machine Learning models transparent and understandable by everyone

🎉 What's new ? Version New Feature Description Tutorial 1.6.x Explainability Quality Metrics To help increase confidence in explainability methods, y

MAIF 2.1k Dec 27, 2022