DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Overview

NVIDIA Source Code License Python 3.8

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Paper | Project page | Demo (Youtube) | Demo (Bilibili)

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.
Shiyi Lan, Zhiding Yu, Chris Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry Davis, Anima Anandkumar
International Conference on Computer Vision (ICCV) 2021

This repository contains the official Pytorch implementation of training & evaluation code and pretrained models for DiscoBox. DiscoBox is a state of the art framework that can jointly predict high quality instance segmentation and semantic correspondence from box annotations.

We use MMDetection v2.10.0 as the codebase.

All of our models are trained and tested using automatic mixed precision, which leverages float16 for speedup and less GPU memory consumption.

Installation

This implementation is based on PyTorch==1.9.0, mmcv==2.13.0, and mmdetection==2.10.0

Please refer to get_started.md for installation.

Or you can download the docker image from our dockerhub repository.

Models

Results on COCO val 2017

Backbone Weights AP [email protected] [email protected] [email protected] [email protected] [email protected]
ResNet-50 download 30.7 52.6 30.6 13.3 34.1 45.6
ResNet-101-DCN download 35.3 59.1 35.4 16.9 39.2 53.0
ResNeXt-101-DCN download 37.3 60.4 39.1 17.8 41.1 55.4

Results on COCO test-dev

We also evaluate the models in the section Results on COCO val 2017 with the same weights on COCO test-dev.

Backbone Weights AP [email protected] [email protected] [email protected] [email protected] [email protected]
ResNet-50 download 32.0 53.6 32.6 11.7 33.7 48.4
ResNet-101-DCN download 35.8 59.8 36.4 16.9 38.7 52.1
ResNeXt-101-DCN download 37.9 61.4 40.0 18.0 41.1 53.9

Training

COCO

ResNet-50 (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py 8

ResNet-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py 8

ResNeXt-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x.py 8

Pascal VOC 2012

ResNet-50 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_6x.py 4

ResNet-101 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_6x.py 4

Testing

COCO

ResNet-50 (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py \
     work_dirs/coco_r50_fpn_3x.pth 8 --eval segm

ResNet-101-DCN (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py \
     work_dirs/coco_r101_dcn_fpn_3x.pth 8 --eval segm

ResNeXt-101-DCN (GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x_fp16.py \
     work_dirs/coco_x101_dcn_fpn_3x.pth 8 --eval segm

Pascal VOC 2012 (COCO API)

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 --eval segm

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 --eval segm

Pascal VOC 2012 (Matlab)

Step 1: generate results

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r50_results.json"

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r101_results.json"

Step 2: format conversion

ResNet-50:

python tools/json2mat.pywork_dirs/voc_r50_results.json work_dirs/voc_r50_results.mat

ResNet-101:

python tools/json2mat.pywork_dirs/voc_r101_results.json work_dirs/voc_r101_results.mat

Step 3: evaluation

Please visit BBTP for the evaluation code written in Matlab.

PF-Pascal

Please visit this repository.

LICENSE

Please check the LICENSE file. DiscoBox may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{lan2021discobox,
  title={DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision},
  author={Lan, Shiyi and Yu, Zhiding and Choy, Christopher and Radhakrishnan, Subhashree and Liu, Guilin and Zhu, Yuke and Davis, Larry S and Anandkumar, Anima},
  journal={arXiv preprint arXiv:2105.06464},
  year={2021}
}
Owner
Shiyi Lan
PhD Candidate. Research Interests: Object Detection, Instance segmentation, 3D Object Detection, 3D vehicle trajectory, Weakly/Semi-supervised learning
Shiyi Lan
[ICCV'21] Official implementation for the paper Social NCE: Contrastive Learning of Socially-aware Motion Representations

CrowdNav with Social-NCE This is an official implementation for the paper Social NCE: Contrastive Learning of Socially-aware Motion Representations by

VITA lab at EPFL 125 Dec 23, 2022
JAXDL: JAX (Flax) Deep Learning Library

JAXDL: JAX (Flax) Deep Learning Library Simple and clean JAX/Flax deep learning algorithm implementations: Soft-Actor-Critic (arXiv:1812.05905) Transf

Patrick Hart 4 Nov 27, 2022
Contextual Attention Localization for Offline Handwritten Text Recognition

CALText This repository contains the source code for CALText model introduced in "CALText: Contextual Attention Localization for Offline Handwritten T

0 Feb 17, 2022
A GPT, made only of MLPs, in Jax

MLP GPT - Jax (wip) A GPT, made only of MLPs, in Jax. The specific MLP to be used are gMLPs with the Spatial Gating Units. Working Pytorch implementat

Phil Wang 53 Sep 27, 2022
Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

Sber AI 43 Nov 28, 2022
Python library for loading and using triangular meshes.

Trimesh is a pure Python (2.7-3.4+) library for loading and using triangular meshes with an emphasis on watertight surfaces. The goal of the library i

Michael Dawson-Haggerty 2.2k Jan 07, 2023
PyQt6 configuration in yaml format providing the most simple script.

PyamlQt(ぴゃむるきゅーと) PyQt6 configuration in yaml format providing the most simple script. Requirements yaml PyQt6, ( PyQt5 ) Installation pip install Pya

Ar-Ray 7 Aug 15, 2022
[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

SSVC The source code for paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning] samples of the

7 Oct 26, 2022
This repository contains the code for designing risk bounded motion plans for car-like robot using Carla Simulator.

Nonlinear Risk Bounded Robot Motion Planning This code simulates the bicycle dynamics of car by steering it on the road by avoiding another static car

8 Sep 03, 2022
Discriminative Condition-Aware PLDA

DCA-PLDA This repository implements the Discriminative Condition-Aware Backend described in the paper: L. Ferrer, M. McLaren, and N. Brümmer, "A Speak

Luciana Ferrer 31 Aug 05, 2022
Loopy belief propagation for factor graphs on discrete variables, in JAX!

PGMax implements general factor graphs for discrete probabilistic graphical models (PGMs), and hardware-accelerated differentiable loopy belief propagation (LBP) in JAX.

Vicarious 62 Dec 23, 2022
PyTorch implementation of ''Background Activation Suppression for Weakly Supervised Object Localization''.

Background Activation Suppression for Weakly Supervised Object Localization PyTorch implementation of ''Background Activation Suppression for Weakly S

35 Jan 06, 2023
thundernet ncnn

MMDetection_Lite 基于mmdetection 实现一些轻量级检测模型,安装方式和mmdeteciton相同 voc0712 voc 0712训练 voc2007测试 coco预训练 thundernet_voc_shufflenetv2_1.5 input shape mAP 320

DayBreak 39 Dec 05, 2022
PyTorch implementation of HDN(Homography Decomposition Networks) for planar object tracking

Homography Decomposition Networks for Planar Object Tracking This project is the offical PyTorch implementation of HDN(Homography Decomposition Networ

CaptainHook 48 Dec 15, 2022
torchlm is aims to build a high level pipeline for face landmarks detection, it supports training, evaluating, exporting, inference(Python/C++) and 100+ data augmentations

💎A high level pipeline for face landmarks detection, supports training, evaluating, exporting, inference and 100+ data augmentations, compatible with torchvision and albumentations, can easily instal

DefTruth 142 Dec 25, 2022
Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.

counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code

Networks Learning 11 Dec 09, 2022
Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples"

KSTER Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples" [paper]. Usage Download the processed datas

jiangqn 23 Nov 24, 2022
使用yolov5训练自己数据集(详细过程)并通过flask部署

使用yolov5训练自己的数据集(详细过程)并通过flask部署 依赖库 torch torchvision numpy opencv-python lxml tqdm flask pillow tensorboard matplotlib pycocotools Windows,请使用 pycoc

HB.com 19 Dec 28, 2022
Pairwise learning neural link prediction for ogb link prediction

Pairwise Learning for Neural Link Prediction for OGB (PLNLP-OGB) This repository provides evaluation codes of PLNLP for OGB link property prediction t

Zhitao WANG 31 Oct 10, 2022
A Dataset of Python Challenges for AI Research

Python Programming Puzzles (P3) This repo contains a dataset of python programming puzzles which can be used to teach and evaluate an AI's programming

Microsoft 850 Dec 24, 2022