Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

Overview

SA-AutoAug

Scale-aware Automatic Augmentation for Object Detection

Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia

[Paper] [BibTeX]


This project provides the implementation for the CVPR 2021 paper "Scale-aware Automatic Augmentation for Object Detection". Scale-aware AutoAug provides a new search space and search metric to find effective data agumentation policies for object detection. It is implemented on maskrcnn-benchmark and FCOS. Both search and training codes have been released. To facilitate more use, we re-implement the training code based on Detectron2.

Installation

For maskrcnn-benchmark code, please follow INSTALL.md for instruction.

For FCOS code, please follow INSTALL.md for instruction.

For Detectron2 code, please follow INSTALL.md for instruction.

Search

(You can skip this step and directly train on our searched policies.)

To search with 8 GPUs, run:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/search.py --config-file configs/SA_AutoAug/retinanet_R-50-FPN_search.yaml OURPUT_DIR /path/to/searchlog_dir

Since we finetune on an existing baseline model during search, a baseline model is needed. You can download this model for search, or you can use other Retinanet baseline model trained by yourself.

Training

To train the searched policies on maskrcnn-benchmark (FCOS)

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file configs/SA_AutoAug/CONFIG_FILE  OUTPUT_DIR /path/to/traininglog_dir

For example, to train the retinanet ResNet-50 model with our searched data augmentation policies in 6x schedule:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file configs/SA_AutoAug/retinanet_R-50-FPN_6x.yaml  OUTPUT_DIR models/retinanet_R-50-FPN_6x_SAAutoAug

To train the searched policies on detectron2

cd /path/to/SA-AutoAug/detectron2
python3 ./tools/train_net.py --num-gpus 8 --config-file ./configs/COCO-Detection/SA_AutoAug/CONFIG_FILE OUTPUT_DIR /path/to/traininglog_dir

For example, to train the retinanet ResNet-50 model with our searched data augmentation policies in 6x schedule:

cd /path/to/SA-AutoAug/detectron2
python3 ./tools/train_net.py --num-gpus 8 --config-file ./configs/COCO-Detection/SA_AutoAug/retinanet_R_50_FPN_6x.yaml OUTPUT_DIR output_retinanet_R_50_FPN_6x_SAAutoAug

Results

We provide the results on COCO val2017 set with pretrained models.

Based on maskrcnn-benchmark

Method Backbone APbbox Download
Faster R-CNN ResNet-50 41.8 Model
Faster R-CNN ResNet-101 44.2 Model
RetinaNet ResNet-50 41.4 Model
RetinaNet ResNet-101 42.8 Model
Mask R-CNN ResNet-50 42.8 Model
Mask R-CNN ResNet-101 45.3 Model

Based on FCOS

Method Backbone APbbox Download
FCOS ResNet-50 42.6 Model
FCOS ResNet-101 44.0 Model
ATSS ResNext-101-32x8d-dcnv2 48.5 Model
ATSS ResNext-101-32x8d-dcnv2 (1200 size) 49.6 Model

Based on Detectron2

Method Backbone APbbox Download
Faster R-CNN ResNet-50 41.9 Model - Metrics
Faster R-CNN ResNet-101 44.2 Model - Metrics
RetinaNet ResNet-50 40.8 Model - Metrics
RetinaNet ResNet-101 43.1 Model - Metrics
Mask R-CNN ResNet-50 42.9 Model - Metrics
Mask R-CNN ResNet-101 45.6 Model - Metrics

Citing SA-AutoAug

Consider cite SA-Autoaug in your publications if it helps your research.

@inproceedings{saautoaug,
  title={Scale-aware Automatic Augmentation for Object Detection},
  author={Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Acknowledgments

This training code of this project is built on maskrcnn-benchmark, Detectron2, FCOS, and ATSS. The search code of this project is modified from DetNAS. Some augmentation code and settings follow AutoAug-Det. We thanks a lot for the authors of these projects.

Note that:

(1) We also provides script files for search and training in maskrcnn-benchmark, FCOS, and, detectron2.

(2) Any issues or pull requests on this project are welcome. In addition, if you meet problems when applying the augmentations to other datasets or codebase, feel free to contact Yukang Chen ([email protected]).

Owner
DV Lab
Deep Vision Lab
DV Lab
Time Dependent DFT in Tamm-Dancoff Approximation

Density Function Theory Program - kspy-tddft(tda) This is an implementation of Time-Dependent Density Functional Theory(TDDFT) using the Tamm-Dancoff

Peter Borthwick 2 Nov 17, 2022
Certifiable Outlier-Robust Geometric Perception

Certifiable Outlier-Robust Geometric Perception About This repository holds the implementation for certifiably solving outlier-robust geometric percep

83 Dec 31, 2022
TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize

Aritra Roy Gosthipaty 23 Dec 24, 2022
This project is based on our SIGGRAPH 2021 paper, ROSEFusion: Random Optimization for Online DenSE Reconstruction under Fast Camera Motion .

ROSEFusion 🌹 This project is based on our SIGGRAPH 2021 paper, ROSEFusion: Random Optimization for Online DenSE Reconstruction under Fast Camera Moti

219 Dec 27, 2022
The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift

TwoStageAlign The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift Pa

Shi Guo 32 Dec 15, 2022
《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

The Most Important Thing. Our code is developed based on: LXMERT: Learning Cross-Modality Encoder Representations from Transformers

53 Dec 16, 2022
Safe Policy Optimization with Local Features

Safe Policy Optimization with Local Feature (SPO-LF) This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization wi

Akifumi Wachi 6 Jun 05, 2022
This repo generates the training data and the model for Morpheus-Deblend

Morpheus-Deblend This repo generates the training data and the model for Morpheus-Deblend. This is the active development repo for the project and as

Ryan Hausen 2 Apr 18, 2022
A curated list of neural rendering resources.

Awesome-of-Neural-Rendering A curated list of neural rendering and related resources. Please feel free to pull requests or open an issue to add papers

Zhiwei ZHANG 43 Dec 09, 2022
Revisiting Self-Training for Few-Shot Learning of Language Model.

SFLM This is the implementation of the paper Revisiting Self-Training for Few-Shot Learning of Language Model. SFLM is short for self-training for few

15 Nov 19, 2022
(ICCV 2021) PyTorch implementation of Paper "Progressive Correspondence Pruning by Consensus Learning"

CLNet (ICCV 2021) PyTorch implementation of Paper "Progressive Correspondence Pruning by Consensus Learning" [project page] [paper] Citing CLNet If yo

Chen Zhao 22 Aug 26, 2022
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"

Prompt-Tuning Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" Currently, we support the following huggigface models: Bart

Andrew Zeng 36 Dec 19, 2022
Official PyTorch implementation of "Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning" (ICCV2021 Oral)

MeTAL - Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning (ICCV2021 Oral) Sungyong Baik, Janghoon Choi, Heewon Kim, Dohee Cho, Jaes

Sungyong Baik 44 Dec 29, 2022
The official repository for BaMBNet

BaMBNet-Pytorch Paper

Junjun Jiang 18 Dec 04, 2022
Everything about being a TA for ITP/AP course!

تی‌ای بودن! تی‌ای یا دستیار استاد از نقش‌های رایج بین دانشجویان مهندسی است، این ریپوزیتوری قرار است نکات مهم درمورد تی‌ای بودن و تی ای شدن را به ما نش

<a href=[email protected]"> 14 Sep 10, 2022
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis (CVPR2022)

Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis Multi-View Consistent Generative Adversarial Networks for 3D-aware

Xuanmeng Zhang 78 Dec 10, 2022
Diffusion Probabilistic Models for 3D Point Cloud Generation (CVPR 2021)

Diffusion Probabilistic Models for 3D Point Cloud Generation [Paper] [Code] The official code repository for our CVPR 2021 paper "Diffusion Probabilis

Shitong Luo 323 Jan 05, 2023
Official Repsoitory for "Activate or Not: Learning Customized Activation." [CVPR 2021]

CVPR 2021 | Activate or Not: Learning Customized Activation. This repository contains the official Pytorch implementation of the paper Activate or Not

184 Dec 27, 2022
Light-weight network, depth estimation, knowledge distillation, real-time depth estimation, auxiliary data.

light-weight-depth-estimation Boosting Light-Weight Depth Estimation Via Knowledge Distillation, https://arxiv.org/abs/2105.06143 Junjie Hu, Chenyou F

Junjie Hu 13 Dec 10, 2022
Fake News Detection Using Machine Learning Methods

Fake-News-Detection-Using-Machine-Learning-Methods Fake news is always a real and dangerous issue. However, with the presence and abundance of various

Achraf Safsafi 1 Jan 11, 2022