Learning Visual Words for Weakly-Supervised Semantic Segmentation

Last update: Oct 05, 2022

Related tags

Overview

[IJCAI 2021] Learning Visual Words for Weakly-Supervised Semantic Segmentation

Implementation of IJCAI 2021 paper Learning Visual Words for Weakly-Supervised Semantic Segmentation, a work on generating better CAM for Weakly-Supervised Semantic Segmentation.

Abstract

Current weakly-supervised semantic segmentation (WSSS) methods with image-level labels mainly adopt class activation maps (CAM) to generate the initial pseudo labels. However, CAM usually only identifies the most discriminative object extents, which is attributed to the fact that the network doesn't need to discover the integral object to recognize image-level labels. In this work, to tackle this problem, we proposed to simultaneously learn the image-level labels and local visual word labels. Specifically, in each forward propagation, the feature maps of the input image will be encoded to visual words with a learnable codebook. By enforcing the network to classify the encoded fine-grained visual words, the generated CAM could cover more semantic regions. Besides, we also proposed a hybrid spatial pyramid pooling module that could preserve local maximum and global average values of feature maps, so that more object details and less background were considered. Based on the proposed methods, we conducted experiments on the PASCAL VOC 2012 dataset. Our proposed method achieved 67.2% mIoU on the val set and 67.3% mIoU on the test set, which outperformed recent state-of-the-art methods.

Start

Create and activate conda environment

conda create --name py36 python=3.6
conda activate py36
pip install -r requirments.txt

Clone this repo

git clone https://github.com/rulixiang/vwe.git
cd vwe

train & infer & evaluate

# train network
python train_cam.py --gpu 0,1 --configs/voc.yaml
# infer cam
python infer_cam.py --gpu 0,1 --configs/voc.yaml
# evaluate cam
python eval_cam.py

Comparasion of the generated CAM

Evaulation on the PASCAL VOC 2012 Dataset

	bkg	aero	bike	bird	boat	bottle	bus	car	cat	chair	cow	table	dog	horse	motor	person	plant	sheep	sofa	train	tv	mIOU
val	89.2	75.7	31.1	82.4	66.1	61.7	87.5	77.8	82.8	32.2	81.4	34.1	77.4	77.6	76.7	75.0	51.2	78.1	42.4	71.4	59.6	67.2
test	90.3	76.9	31.6	89.1	54.8	57.8	86.4	77.8	81.8	32.2	76.0	39.3	80.3	80.3	81.8	74.3	44.5	80.1	54.5	63.0	60.4	67.3

Citation

@inproceedings{
  ru2021learning,
  title={Learning Visual Words for Weakly-Supervised Semantic Segmentation},
  author={Lixiang Ru and Bo Du and Chen Wu},
  booktitle={International Joint Conference on Artificial Intelligence},
  year={2021},
}

Acknowledgement

This repo is heavily borrowed from IRNet. Thanks for their brilliant work!

Learning Visual Words for Weakly-Supervised Semantic Segmentation

Related tags

Overview

[IJCAI 2021] Learning Visual Words for Weakly-Supervised Semantic Segmentation

Abstract

Start

Create and activate conda environment

Clone this repo

train & infer & evaluate

Comparasion of the generated CAM

Evaulation on the PASCAL VOC 2012 Dataset

Citation

Acknowledgement

Owner

Lixiang Ru

Optimizaciones incrementales al problema N-Body con el fin de evaluar y comparar las prestaciones de los traductores de Python en el ámbito de HPC.

A quantum game modeling of pandemic (QHack 2022)

The official implementation of Equalization Loss v1 & v2 (CVPR 2020, 2021) based on MMDetection.

PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Time should be taken seer-iously

Training DiffWave using variational method from Variational Diffusion Models.

Code for the Active Speakers in Context Paper (CVPR2020)

generate-2D-quadrilateral-mesh-with-neural-networks-and-tree-search

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs

An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation

Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionaries

Mahadi-Now - This Is Pakistani Just Now Login Tools

code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning" by Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing.

Transferable Unrestricted Attacks, which won 1st place in CVPR’21 Security AI Challenger: Unrestricted Adversarial Attacks on ImageNet.

unet for image segmentation

Towards Part-Based Understanding of RGB-D Scans

Learning Visual Words for Weakly-Supervised Semantic Segmentation

Related tags

Overview

[IJCAI 2021] Learning Visual Words for Weakly-Supervised Semantic Segmentation

Abstract

Start

Create and activate conda environment

Clone this repo

train & infer & evaluate

Comparasion of the generated CAM

Evaulation on the PASCAL VOC 2012 Dataset

Citation

Acknowledgement

Owner

Lixiang Ru

Optimizaciones incrementales al problema N-Body con el fin de evaluar y comparar las prestaciones de los traductores de Python en el ámbito de HPC.

A quantum game modeling of pandemic (QHack 2022)

The official implementation of Equalization Loss v1 & v2 (CVPR 2020, 2021) based on MMDetection.

PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Time should be taken seer-iously

Training DiffWave using variational method from Variational Diffusion Models.

Code for the Active Speakers in Context Paper (CVPR2020)

generate-2D-quadrilateral-mesh-with-neural-networks-and-tree-search

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs

An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation

Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionaries

Mahadi-Now - This Is Pakistani Just Now Login Tools

code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning" by Zhongzheng Ren*, Raymond A. Yeh*, Alexander G. Schwing.

Transferable Unrestricted Attacks, which won 1st place in CVPR’21 Security AI Challenger: Unrestricted Adversarial Attacks on ImageNet.

unet for image segmentation

Towards Part-Based Understanding of RGB-D Scans

code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning" by Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing.