Semi-supervised Semantic Segmentation with High- and Low-level Consistency

Overview

This PyTorch repository contains the code for our work Semi-supervised Semantic Segmentation with High- and Low-level Consistency. The approach uses two network branches that link semi-supervised classification with semi-supervised segmentation, including self-training. It improves significantly over existing methods, especially when trained with very few labeled samples. On several standard benchmarks (PASCAL VOC 2012, PASCAL-Context, and Cityscapes) the approach achieves a new state of the art in semi-supervised learning.

We propose a two-branch approach to the task of semi-supervised semantic segmentation. The lower branch predicts pixel-wise class labels and is referred to as the Semi-Supervised Semantic Segmentation GAN (s4GAN). The upper branch performs image-level classification and is denoted as the Multi-Label Mean Teacher (MLMT).

This repository contains the source code for the s4GAN branch. The MLMT branch is adapted from the Mean-Teacher work for semi-supervised classification. Instructions for setting up the MLMT branch are given below.

Package pre-requisites

The code runs on Python 3 and PyTorch 0.4. The following packages are required:

pip install scipy tqdm matplotlib numpy opencv-python
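
Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of this repository: the helper name `missing_modules` is hypothetical, and the import names are assumed from the pip package names (`opencv-python` is imported as `cv2`, PyTorch as `torch`).

```python
import importlib.util
import sys

# Import names assumed for the packages listed above
# (opencv-python is imported as cv2, PyTorch as torch).
REQUIRED_MODULES = ["torch", "scipy", "tqdm", "matplotlib", "numpy", "cv2"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info.major < 3:
        raise SystemExit("Python 3 is required")
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

The check only tests whether each module can be located; it does not verify versions such as PyTorch 0.4.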

Dataset preparation

Download the ImageNet-pretrained ResNet-101 (Link) and place it in ./pretrained_models/

PASCAL VOC

Download the dataset (Link) and extract it in ./data/voc_dataset/

PASCAL Context

Download the annotations (Link) and extract them in ./data/pcontext_dataset/

Cityscapes

Download the dataset from the Cityscapes dataset server (Link). Download the files named 'gtFine_trainvaltest.zip' and 'leftImg8bit_trainvaltest.zip' and extract them in ./data/city_dataset/
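
The directory layout described above can be checked with a short script before launching training. This is a minimal sketch, assuming it is run from the repository root; `missing_dirs` is a hypothetical helper, and only the directories named in this section are checked, not their contents.

```python
from pathlib import Path

# Locations named in the instructions above, relative to the repository root.
EXPECTED_DIRS = [
    "pretrained_models",
    "data/voc_dataset",
    "data/pcontext_dataset",
    "data/city_dataset",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dirs():
        print("missing directory:", d)
```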

Training and Validation on PASCAL-VOC Dataset

Results in the paper are averaged over 3 random splits. The same splits are used to report baseline performance, for a fair comparison.
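
To illustrate what a random labeled/unlabeled split and the averaging over splits look like, here is a minimal sketch. It is an assumption for illustration only: the function names `make_split` and `average_over_splits` are hypothetical, and the repository's own split logic and the format of its split .pkl files are not reproduced here.

```python
import random
from statistics import mean


def make_split(num_images, labeled_ratio, seed):
    """Shuffle image indices with a fixed seed and split them into labeled/unlabeled."""
    indices = list(range(num_images))
    random.Random(seed).shuffle(indices)
    num_labeled = int(num_images * labeled_ratio)
    return indices[:num_labeled], indices[num_labeled:]


def average_over_splits(scores):
    """Average one metric value per split into a single reported number."""
    return mean(scores)


if __name__ == "__main__":
    # Three splits with different seeds, matching the 3 random splits mentioned above.
    for seed in range(3):
        labeled, unlabeled = make_split(1000, 0.125, seed)
        print(seed, len(labeled), len(unlabeled))
```

Fixing the seed per split is what allows the baseline and the semi-supervised model to be trained on identical labeled subsets.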

Training fully-supervised Baseline (FSL)

python train_full.py    --dataset pascal_voc  \
                        --checkpoint-dir ./checkpoints/voc_full \
                        --ignore-label 255 \
                        --num-classes 21 

Training semi-supervised s4GAN (SSL)

python train_s4GAN.py   --dataset pascal_voc  \
                        --checkpoint-dir ./checkpoints/voc_semi_0_125 \
                        --labeled-ratio 0.125 \
                        --ignore-label 255 \
                        --num-classes 21

Validation

python evaluate.py --dataset pascal_voc  \
                   --num-classes 21 \
                   --restore-from ./checkpoints/voc_semi_0_125/VOC_30000.pth 
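
Semantic segmentation on these benchmarks is usually scored with mean intersection-over-union (mIoU). The sketch below shows how that metric treats the ignore label; it assumes flattened lists of per-pixel class indices and is not the code in evaluate.py, whose exact output is not described here. The function name `mean_iou` is hypothetical.

```python
def mean_iou(predictions, labels, num_classes, ignore_label=255):
    """Mean IoU over classes that occur in the predictions or the labels."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred, gt in zip(predictions, labels):
        if gt == ignore_label:
            continue  # pixels carrying the ignore label are not scored
        if pred == gt:
            intersection[gt] += 1
            union[gt] += 1
        else:
            union[pred] += 1  # false positive for the predicted class
            union[gt] += 1    # false negative for the true class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

Classes that never appear in either the predictions or the labels are left out of the average rather than counted as zero.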

Training MLMT Branch

python train_mlmt.py \
        --batch-size-lab 16 \
        --batch-size-unlab 80 \
        --labeled-ratio 0.125 \
        --exp-name voc_semi_0_125_MLMT \
        --pkl-file ./checkpoints/voc_semi_0_125/train_voc_split.pkl

Final Evaluation S4GAN + MLMT

python evaluate.py --dataset pascal_voc  \
                   --num-classes 21 \
                   --restore-from ./checkpoints/voc_semi_0_125/VOC_30000.pth \
                   --with-mlmt \
                   --mlmt-file ./mlmt_output/voc_semi_0_125_MLMT/output_ema_raw_100.txt
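
One plausible way to combine the two branches is to let the image-level classifier restrict which classes the segmentation may assign. The sketch below is an assumption made for illustration: the function name `fuse_with_classifier` is hypothetical, and neither the logic behind --with-mlmt nor the format of the MLMT output file is taken from this repository.

```python
def fuse_with_classifier(pixel_scores, present_classes):
    """Pick, per pixel, the best-scoring class among those marked as present.

    pixel_scores: one list of per-class scores for each pixel.
    present_classes: class indices the image-level classifier predicts as present.
    """
    allowed = sorted(present_classes)
    if not allowed:
        raise ValueError("at least one class must be marked as present")
    return [max(allowed, key=lambda c: scores[c]) for scores in pixel_scores]
```

With this kind of filtering, a class the classifier considers absent can never appear in the final label map, which removes false-positive regions for that class.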
    

Training and Validation on PASCAL-Context Dataset

python train_full.py    --dataset pascal_context  \
                        --checkpoint-dir ./checkpoints/pc_full \
                        --ignore-label -1 \
                        --num-classes 60

python train_s4GAN.py  --dataset pascal_context  \
                       --checkpoint-dir ./checkpoints/pc_semi_0_125 \
                       --labeled-ratio 0.125 \
                       --ignore-label -1 \
                       --num-classes 60 \
                       --split-id ./splits/pc/split_0.pkl \
                       --num-steps 60000

python evaluate.py     --dataset pascal_context  \
                       --num-classes 60 \
                       --restore-from ./checkpoints/pc_semi_0_125/VOC_40000.pth

Training and Validation on Cityscapes Dataset

python train_full.py    --dataset cityscapes \
                        --checkpoint-dir ./checkpoints/city_full_0_125 \
                        --ignore-label 250 \
                        --num-classes 19 \
                        --input-size '256,512'  

python train_s4GAN.py   --dataset cityscapes \
                        --checkpoint-dir ./checkpoints/city_semi_0_125 \
                        --labeled-ratio 0.125 \
                        --ignore-label 250 \
                        --num-classes 19 \
                        --split-id ./splits/city/split_0.pkl \
                        --input-size '256,512' \
                        --threshold-st 0.7 \
                        --learning-rate-D 1e-5 

python evaluate.py      --dataset cityscapes \
                        --num-classes 19 \
                        --restore-from ./checkpoints/city_semi_0_125/VOC_30000.pth 

Acknowledgement

Parts of the code have been adapted from: DeepLab-Resnet-Pytorch, AdvSemiSeg, PyTorch-Encoding

Citation

@ARTICLE{8935407,
  author={S. {Mittal} and M. {Tatarchenko} and T. {Brox}},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={Semi-Supervised Semantic Segmentation With High- and Low-Level Consistency}, 
  year={2021},
  volume={43},
  number={4},
  pages={1369-1379},
  doi={10.1109/TPAMI.2019.2960224}}