List of awesome things around semantic segmentation 🎉

Awesome Semantic Segmentation

Semantic segmentation is a computer vision task in which we label each region of an image according to what it shows. Semantic segmentation answers the question: "What's in this image, and where in the image is it located?"

Semantic segmentation is a critical module in robotics-related applications, especially autonomous driving, and in remote sensing. Most research on semantic segmentation focuses on improving accuracy, with less attention paid to computationally efficient solutions.

(Image: self-driving car)

The recent approach to semantic segmentation uses deep neural networks, specifically Fully Convolutional Networks (a.k.a. FCN). The trend in semantic segmentation approaches can be followed at: paper-with-code.

Evaluation metrics: mIoU (mean Intersection over Union), pixel accuracy, speed, ...
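The two accuracy-style metrics named above can be sketched in a few lines. This is a minimal illustration, not taken from any of the listed repositories: the function names `mean_iou` and `pixel_accuracy` are hypothetical, label maps are assumed to be flattened into plain lists of integer class ids, and classes absent from both prediction and ground truth are assumed to be skipped when averaging (benchmarks differ on this detail).

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes.

    pred and target are flat sequences of integer class ids, one per pixel.
    Classes that appear in neither sequence are skipped (an assumption here).
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the ground truth."""
    correct = sum(1 for p, t in zip(pred, target) if p == t)
    return correct / len(target)


# A 2x3 image flattened row by row: class 0 is background, class 1 is the object.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]

print(mean_iou(pred, target, num_classes=2))
print(pixel_accuracy(pred, target))
```

Pixel accuracy can look high when one class dominates the image, which is one reason mIoU is the headline number on benchmarks such as PASCAL VOC.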

State-Of-The-Art (SOTA) methods of Semantic Segmentation

| Method | Paper | Benchmark on PASCAL VOC12 | Release | Implementation |
|---|---|---|---|---|
| EfficientNet-L2+NAS-FPN | Rethinking Pre-training and Self-training | 90.5% | NeurIPS 2020 | TF |
| DeepLab V3+ | Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | 89% | ECCV 2018 | TF, Keras, PyTorch, Demo |
| DeepLab V3 | Rethinking Atrous Convolution for Semantic Image Segmentation | 86.9% | 17 Jun 2017 | TF, TF |
| Smooth Network with Channel Attention Block | Learning a Discriminative Feature Network for Semantic Segmentation | 86.2% | CVPR 2018 | PyTorch |
| PSPNet | Pyramid Scene Parsing Network | 85.4% | CVPR 2017 | Keras, PyTorch, PyTorch |
| ResNet-38 MS COCO | Wider or Deeper: Revisiting the ResNet Model for Visual Recognition | 84.9% | 30 Nov 2016 | MXNet |
| RefineNet | RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | 84.2% | CVPR 2017 | Matlab, Keras |
| GCN | Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network | 83.6% | CVPR 2017 | TF |
| CRF-RNN | Conditional Random Fields as Recurrent Neural Networks | 74.7% | ICCV 2015 | Matlab, TF |
| ParseNet | ParseNet: Looking Wider to See Better | 69.8% | 15 Jun 2015 | Caffe |
| Dilated Convolutions | Multi-Scale Context Aggregation by Dilated Convolutions | 67.6% | 23 Nov 2015 | Caffe |
| FCN | Fully Convolutional Networks for Semantic Segmentation | 67.2% | CVPR 2015 | Caffe |

Variants

  • FCN with VGG (ResNet, DenseNet) backbone: PyTorch
  • The easiest implementation of fully convolutional networks (FCN8s VGG): PyTorch
  • TernausNet (UNet model with VGG11 encoder pre-trained on Kaggle Carvana dataset), paper: PyTorch
  • TernausNetV2: Fully Convolutional Network for Instance Segmentation: PyTorch

Review list of Semantic Segmentation

  • Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey 2020 (University of Gour Banga, India)
  • A peek of Semantic Segmentation 2018 (mc.ai)
  • Semantic Segmentation guide 2018 (towardsdatascience)
  • An overview of semantic image segmentation (jeremyjordan.me)
  • Recent progress in semantic image segmentation 2018 (arxiv, towardsdatascience)
  • A 2017 Guide to Semantic Segmentation Deep Learning Review (blog.qure.ai)
  • Review popular network architecture (medium, towardsdatascience)
  • Lecture 11 - Detection and Segmentation - CS231n (slides, video)
  • A Survey of Semantic Segmentation 2016 (arxiv)

Case studies

  • Dstl Satellite Imagery Competition, 3rd Place Winners' Interview: Vladimir & Sergey: Blog, Code
  • Carvana Image Masking Challenge–1st Place Winner's Interview: Blog, Code
  • Data Science Bowl 2017, Predicting Lung Cancer: Solution Write-up, Team Deep Breath: Blog
  • MICCAI 2017 Robotic Instrument Segmentation: Code and explanation
  • 2018 Data Science Bowl, Find the nuclei in divergent images to advance medical discovery: 1st place, 2nd, 3rd, 4th, 5th, 10th
  • Airbus Ship Detection Challenge: 4th place, 6th

Most used loss functions

  • Pixel-wise cross entropy loss
  • Dice loss: works well on class-imbalanced datasets, since it scores region overlap rather than per-pixel error
  • Focal loss
  • Lovasz-Softmax loss
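As a rough illustration of the first three losses in the list, here is a minimal sketch for the binary (foreground vs. background) case. It assumes flattened lists of predicted foreground probabilities and 0/1 labels; the function names, the clamping constant `EPS`, the Dice smoothing term, and the default `gamma` and `alpha` values are illustrative choices, not taken from any listed implementation. Lovasz-Softmax is left out because it needs a sorting-based surrogate that does not fit in a short example.

```python
import math

EPS = 1e-7  # clamp probabilities away from 0 and 1 to keep log() finite


def _clamp(p):
    return min(max(p, EPS), 1.0 - EPS)


def cross_entropy_loss(probs, labels):
    """Mean pixel-wise binary cross entropy."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = _clamp(p)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(probs) + sum(labels) + smooth)."""
    overlap = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)


def focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Mean binary focal loss: cross entropy down-weighted on easy pixels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = _clamp(p)
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(labels)


# Four pixels: one foreground pixel among three background pixels.
labels = [1, 0, 0, 0]
probs = [0.6, 0.1, 0.2, 0.1]

print(cross_entropy_loss(probs, labels))
print(dice_loss(probs, labels))
print(focal_loss(probs, labels))
```

With `gamma=0` and `alpha=0.5`, the focal loss in this sketch reduces to half the cross entropy, which is a handy sanity check when tuning the two parameters.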

Datasets

Frameworks for segmentation

Related techniques

Feel free to show your ❤️ by giving a star

🎁 Check Out the List of Contributors - Feel free to add your details here!

Owner
Dam Minh Tien
Tech enthusiast