PyTorch implementation of SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation

Last update: Dec 26, 2022

Overview

SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation

Efficient Self-Ensemble Framework for Semantic Segmentation by Walid Bousselham, Guillaume Thibault, Lucas Pagano, Archana Machireddy, Joe Gray, Young Hwan Chang and Xubo Song.

This repository contains the official PyTorch implementation of the training and evaluation code, along with the pretrained models, for SenFormer.

💾 Code Snippet (SenFormer) | ⌨️ Code Snippet (FPNT) | 📜 Paper | Paper (Chinese)

🔨 Installation

Conda environment

Clone this repository and enter it: git clone git@github.com:WalBouss/SenFormer.git && cd SenFormer
Create a conda environment with conda create -n senformer python=3.8, then activate it with conda activate senformer.
Install PyTorch and torchvision: conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2 -c pytorch (you may switch to other versions by changing the version numbers).
Install the MMCV library: pip install mmcv-full==1.4.0
Install the MMSegmentation library by running pip install -e . in the SenFormer directory.
Install the other requirements: pip install timm einops

Here is a full script for setting up a conda environment to use SenFormer (with CUDA 10.2 and PyTorch 1.7.1):

conda create -n senformer python=3.8
conda activate senformer
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2 -c pytorch

git clone git@github.com:WalBouss/SenFormer.git && cd SenFormer
pip install mmcv-full==1.4.0
pip install -e .
pip install timm einops
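
After running the script, it can help to confirm that the pinned versions actually landed in the environment. The following is a minimal sketch, not part of the repository: it uses only the standard library, and the helper names (installed_version, check_environment) are made up for illustration. It assumes the packages are visible under the distribution names torch, torchvision, mmcv-full, timm and einops.

```python
from importlib import metadata

# Versions pinned by the installation steps above; timm and einops are unpinned.
EXPECTED = {
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "mmcv-full": "1.4.0",
    "timm": None,
    "einops": None,
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Return a list of human-readable problems; an empty list means no mismatch."""
    problems = []
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed")
        # Ignore local build suffixes such as "+cu102" when comparing versions.
        elif wanted is not None and found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the pinned versions"]:
        print(line)
```

If you deliberately installed other PyTorch or CUDA versions, adjust EXPECTED accordingly before running the check.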

Datasets

For dataset preparation, please refer to the MMSegmentation guidelines.

Pretrained weights

ResNet pretrained weights will be automatically downloaded before training.

For Swin Transformer ImageNet pretrained weights, you can either:

Run bash tools/download_swin_weights.sh in the SenFormer project to download all Swin Transformer pretrained weights (it places the weights under the pretrain/ folder).
Download the desired backbone weights (Swin-T, Swin-S, Swin-B or Swin-L) and place them under the pretrain/ folder.
Download the weights from the official repository, then convert them to the MMSegmentation format following the MMSegmentation guidelines.

🎯 Model Zoo

SenFormer models with ResNet and Swin backbones, trained on ADE20K, COCO-Stuff 10K, Pascal Context and Cityscapes.

ADE20K

Backbone	mIoU	mIoU (MS)	#params	FLOPs	Resolution	Download
ResNet-50	44.6	45.6	144M	179G	512x512	model	config
ResNet-101	46.5	47.0	163M	199G	512x512	model	config
Swin-Tiny	46.0	46.4	144M	179G	512x512	model	config
Swin-Small	49.2	50.4	165M	202G	512x512	model	config
Swin-Base	51.8	53.2	204M	242G	640x640	model	config
Swin-Large	53.1	54.2	314M	546G	640x640	model	config

COCO-Stuff 10K

Backbone	mIoU	mIoU (MS)	#params	Resolution	Download
ResNet-50	39.0	39.7	144M	512x512	model	config
ResNet-101	39.6	40.6	163M	512x512	model	config
Swin-Large	49.1	50.1	314M	512x512	model	config

Pascal Context

Backbone	mIoU	mIoU (MS)	#params	Resolution	Download
ResNet-50	53.2	54.3	144M	480x480	model	config
ResNet-101	55.1	56.6	163M	480x480	model	config
Swin-Large	62.4	64.0	314M	480x480	model	config

Cityscapes

Backbone	mIoU	mIoU (MS)	#params	Resolution	Download
ResNet-50	78.8	80.1	144M	512x1024	model	config
ResNet-101	80.3	81.4	163M	512x1024	model	config
Swin-Large	82.2	83.3	314M	512x1024	model	config

🔭 Inference

First, download one of the checkpoints listed above, for example SenFormer with a ResNet-50 backbone on ADE20K.

Inference on a dataset

# Single-gpu testing
python tools/test.py senformer_configs/senformer/ade20k/senformer_fpnt_r50_512x512_160k_ade20k.py /path/to/checkpoint_file

# Multi-gpu testing
./tools/dist_test.sh senformer_configs/senformer/ade20k/senformer_fpnt_r50_512x512_160k_ade20k.py /path/to/checkpoint_file <GPU_NUM>

# Multi-gpu, multi-scale testing
./tools/dist_test.sh senformer_configs/senformer/ade20k/senformer_fpnt_r50_512x512_160k_ade20k.py /path/to/checkpoint_file <GPU_NUM> --aug-test
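
The three variants above differ only in the launcher, the GPU count and the --aug-test flag. As an illustration, the selection logic can be sketched as a small helper that assembles the command line. The function build_test_command and its parameters are hypothetical and not part of the repository; the sketch also assumes that tools/test.py accepts --aug-test directly, as it does in upstream MMSegmentation 0.x.

```python
import shlex


def build_test_command(config, checkpoint, num_gpus=1, multi_scale=False, extra_args=()):
    """Build the argument list for tools/test.py or tools/dist_test.sh."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single-GPU testing goes through the Python entry point.
        cmd = ["python", "tools/test.py", config, checkpoint]
    else:
        # Multi-GPU testing goes through the distributed launcher script.
        cmd = ["./tools/dist_test.sh", config, checkpoint, str(num_gpus)]
    if multi_scale:
        cmd.append("--aug-test")
    cmd.extend(extra_args)
    return cmd


if __name__ == "__main__":
    cmd = build_test_command(
        "senformer_configs/senformer/ade20k/senformer_fpnt_r50_512x512_160k_ade20k.py",
        "/path/to/checkpoint_file",
        num_gpus=8,
        multi_scale=True,
    )
    print(shlex.join(cmd))
```

The extra_args parameter is there because upstream MMSegmentation's test script usually expects an operation flag such as --eval mIoU; whether this repository requires one is not stated here, so treat that as something to verify.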

Inference on custom data

To generate segmentation maps for your own data, run the following command:

python demo/image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE}

Run python demo/image_demo.py --help for additional options.
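
The same inference can be driven from Python. The sketch below assumes this repository keeps the mmseg.apis functions init_segmentor and inference_segmentor from upstream MMSegmentation 0.x, which the source does not confirm explicitly; the wrapper segment_image is a made-up name for illustration.

```python
def segment_image(image_path, config_path, checkpoint_path, device="cuda:0"):
    """Return the predicted label map (H x W array of class indices) for one image."""
    # Imported lazily so this file can be imported without MMSegmentation installed.
    from mmseg.apis import inference_segmentor, init_segmentor

    # Build the model from the config and load the checkpoint weights.
    model = init_segmentor(config_path, checkpoint_path, device=device)
    # inference_segmentor returns a list with one label map per input image.
    result = inference_segmentor(model, image_path)
    return result[0]
```

Calling segment_image with the same image, config and checkpoint paths used for demo/image_demo.py should yield the raw class-index map rather than a rendered overlay. Pass device="cpu" if no GPU is available.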

🔩 Training

Follow the instructions above to download the ImageNet pretrained backbone weights, then run one of the following commands:

# Single-gpu training
python tools/train.py path/to/model/config 

# Multi-gpu training
./tools/dist_train.sh path/to/model/config <GPU_NUM>

For example, to train SenFormer with a ResNet-50 backbone on ADE20K:

# Single-gpu training
python tools/train.py senformer_configs/senformer/ade20k/senformer_fpnt_r50_512x512_160k_ade20k.py 

# Multi-gpu training
./tools/dist_train.sh senformer_configs/senformer/ade20k/senformer_fpnt_r50_512x512_160k_ade20k.py <GPU_NUM>

Note that the default learning rate and training schedule are set for an effective batch size of 16 (e.g. 8 GPUs with 2 images per GPU).
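
If you train with a different number of GPUs or images per GPU, the effective batch size changes and the learning rate usually needs adjusting. The sketch below applies the linear scaling rule, a common heuristic that this repository does not prescribe. The base learning rate of 1e-4 is a placeholder, not the value from the configs; read the actual value from the config you train with.

```python
def effective_batch_size(num_gpus, imgs_per_gpu):
    """Effective batch size is the number of images processed per optimizer step."""
    return num_gpus * imgs_per_gpu


def scaled_learning_rate(base_lr, num_gpus, imgs_per_gpu, reference_batch_size=16):
    """Scale base_lr linearly with the ratio of actual to reference batch size."""
    return base_lr * effective_batch_size(num_gpus, imgs_per_gpu) / reference_batch_size


if __name__ == "__main__":
    base_lr = 1e-4  # placeholder; take the real value from the model config
    for gpus, imgs in [(8, 2), (4, 2), (1, 2)]:
        lr = scaled_learning_rate(base_lr, gpus, imgs)
        print(f"{gpus} GPUs x {imgs} imgs/gpu -> batch {effective_batch_size(gpus, imgs)}, lr {lr:g}")
```

With the reference setup of 8 GPUs and 2 images per GPU the learning rate is left unchanged; halving the effective batch size halves it under this rule.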

⭐ Acknowledgement

This code is built on the MMSegmentation library and also uses timm and einops.

📚 Citation

If you find this repository useful, please consider citing our work 📝 and giving a star 🌟 :

@article{bousselham2021senformer,
  title={Efficient Self-Ensemble Framework for Semantic Segmentation},
  author={Walid Bousselham and Guillaume Thibault and Lucas Pagano and Archana Machireddy and Joe Gray and Young Hwan Chang and Xubo Song},
  journal={arXiv preprint arXiv:2111.13280},
  year={2021}
}
