PyTorch Implementation Collection of Attention Modules and Other Plug & Play Modules Used in Computer Vision

Overview

Awesome-Attention-Mechanism-in-cv

Introduction

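This repository provides PyTorch implementations of a variety of attention mechanisms used in network design for computer vision, and also collects some plug-and-play modules. Because of limited ability and time, many modules are probably not included yet. If you have suggestions or improvements, please open an issue or submit a PR.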

Attention Mechanism

| Paper | Publish | Link | Main Idea | Blog |
| --- | --- | --- | --- | --- |
| Global Second-order Pooling Convolutional Networks | CVPR19 | GSoPNet | Combines higher-order statistics with attention in the intermediate layers of the network | |
| Neural Architecture Search for Lightweight Non-Local Networks | CVPR20 | AutoNL | NAS + LightNL | |
| Squeeze and Excitation Network | CVPR18 | SENet | The classic channel attention | zhihu |
| Selective Kernel Network | CVPR19 | SKNet | SE + dynamic selection | zhihu |
| Convolutional Block Attention Module | ECCV18 | CBAM | Spatial + channel attention in series | zhihu |
| BottleNeck Attention Module | BMVC18 | BAM | Spatial + channel attention in parallel | zhihu |
| Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks | MICCAI18 | scSE | Spatial + channel attention in parallel | zhihu |
| Non-local Neural Networks | CVPR18 | Non-Local(NL) | Self-attention | zhihu |
| GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | ICCVW19 | GCNet | Improves on NL | zhihu |
| CCNet: Criss-Cross Attention for Semantic Segmentation | ICCV19 | CCNet | Improves on NL | |
| SA-Net: Shuffle Attention for Deep Convolutional Neural Networks | ICASSP21 | SANet | SGE + channel shuffle | zhihu |
| ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | CVPR20 | ECANet | An improvement on SE | |
| Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks | CoRR19 | SGENet | Group + spatial + channel | |
| FcaNet: Frequency Channel Attention Networks | CoRR20 | FcaNet | SE operation in the frequency domain | |
| $A^2\text{-}Nets$: Double Attention Networks | NeurIPS18 | DANet | Applies the NL idea to both the spatial and channel dimensions | |
| Asymmetric Non-local Neural Networks for Semantic Segmentation | ICCV19 | APNB | SPP + NL | |
| Efficient Attention: Attention with Linear Complexities | CoRR18 | EfficientAttention | Reduces the computational cost of NL | |
| Image Restoration via Residual Non-local Attention Networks | ICLR19 | RNAN | | |
| Exploring Self-attention for Image Recognition | CVPR20 | SAN | Strongly theoretical, yet simple to implement | |
| An Empirical Study of Spatial Attention Mechanisms in Deep Networks | ICCV19 | None | MSRA survey of self-attention | |
| Object-Contextual Representations for Semantic Segmentation | ECCV20 | OCRNet | A complex interaction mechanism that works well in practice | |
| IAUnet: Global Context-Aware Feature Learning for Person Re-Identification | TNNLS20 | IAUNet | Introduces temporal information | |
| ResNeSt: Split-Attention Networks | CoRR20 | ResNeSt | SK + ResNeXt | |
| Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks | NeurIPS18 | GENet | Follow-up to SE | |
| Improving Convolutional Networks with Self-calibrated Convolutions | CVPR20 | SCNet | Self-calibrated convolution | |
| Rotate to Attend: Convolutional Triplet Attention Module | WACV21 | TripletAttention | Pairwise interaction among the C, H, and W dimensions | |
| Dual Attention Network for Scene Segmentation | CVPR19 | DANet | Self-attention | |
| Relation-Aware Global Attention for Person Re-identification | CVPR20 | RGA | Used for person re-identification | |
| Attentional Feature Fusion | WACV21 | AFF | Attention-based feature fusion | |
| An Attentive Survey of Attention Models | CoRR19 | None | Covers attention mechanisms in NLP, CV, recommender systems, and more | |
| Stand-Alone Self-Attention in Vision Models | NeurIPS19 | FullAttention | Replaces all convolutions with self-attention | |
| BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation | ECCV18 | BiSeNet | FPN-like feature fusion | zhihu |
| DCANet: Learning Connected Attentions for Convolutional Neural Networks | CoRR20 | DCANet | Strengthens the information flow between attention modules | |
| An Empirical Study of Spatial Attention Mechanisms in Deep Networks | ICCV19 | None | Targeted analysis of spatial attention | |
| Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition | CVPR17 Oral | RA-CNN | Fine-grained recognition | |
| Guided Attention Network for Object Detection and Counting on Drones | ACM MM20 | GANet | Targets object detection | |
| Attention Augmented Convolutional Networks | ICCV19 | AANet | Multi-head attention + additional feature maps | |
| Global Self-Attention Networks for Image Recognition | ICLR21 | GSA | A new global attention module | |
| Attention-Guided Hierarchical Structure Aggregation for Image Matting | CVPR20 | HAttMatting | Application to image matting: channel attention is applied at the high levels, and spatial attention then guides the low levels | |
| Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks | ECCV20 | None | A weight-excitation mechanism complementary to SE | |
| Expectation-Maximization Attention Networks for Semantic Segmentation | ICCV19 Oral | EMANet | EM + attention | |
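
Several entries in the list above are described relative to SE (channel attention). As a rough illustration of what such a module looks like, here is a minimal sketch of a squeeze-and-excitation style block. The class name `SEBlock`, the default `reduction=16`, and the layer layout are assumptions made for this example; this is not the code shipped in this repository.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Illustrative channel-attention block (hypothetical name, not the repo's code)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)  # bottleneck width, at least 1
        self.pool = nn.AdaptiveAvgPool2d(1)     # squeeze: global average pooling
        self.fc = nn.Sequential(                # excitation: two-layer bottleneck MLP
            nn.Linear(channels, hidden, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels, bias=False),
            nn.Sigmoid(),                       # per-channel gate in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)             # (B, C) channel descriptor
        w = self.fc(w).view(b, c, 1, 1)         # (B, C, 1, 1) channel weights
        return x * w                            # rescale each channel


# Usage: the block keeps the input shape, so it can be dropped after any conv stage.
feat = torch.randn(2, 64, 8, 8)
out = SEBlock(64)(feat)
```

Because the output has the same shape as the input, a block like this can be inserted into an existing backbone without changing the surrounding layers.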

Plug and Play Module

  • ACBlock
  • Swish、wish Activation
  • ASPP Block
  • DepthWise Convolution
  • Fused Conv & BN
  • MixedDepthwise Convolution
  • PSP Module
  • RFBModule
  • SematicEmbbedBlock
  • SSH Context Module
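  • Some other useful tools, such as concatenating feature maps and flattening feature maps
  • WeightedFeatureFusion: the fusion method used by the FPN in EfficientDet
  • StripPooling: core code of Strip Pooling (CVPR 2020)
  • GhostModule: core module of GhostNet (CVPR 2020)
  • SlimConv: SlimConv3x3
  • Context Gating: video classification
  • EffNetBlock: EffNet
  • ECCV2020 BorderDet: Border alignment module
  • CVPR2019 DANet: Dual Attention
  • Object Contextual Representation for semantic segmentation: OCRModule
  • FPT: includes Self Transform, Grounding Transform, and Rendering Transform
  • DOConv: Depthwise Over-parameterized Convolution, proposed by Alibaba
  • PyConv: pyramidal convolution, proposed by the Inception Institute of Artificial Intelligence (IIAI)
  • ULSAM: an ultra-lightweight subspace attention module for compact CNNs
  • DGC: ECCV 2020, dynamic group convolution for accelerating convolutional neural networks
  • DCANet: ECCV 2020, learning connected attentions for convolutional neural networks
  • PSConv: ECCV 2020, squeezes the feature pyramid into a compact multi-scale convolutional layer
  • Dynamic Convolution: CVPR 2020, dynamic filter convolution (unofficial)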
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference
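
To give a flavor of one item in the list above, "Fused Conv & BN", the following is a minimal sketch of folding a BatchNorm layer into the preceding convolution for inference. The function name `fuse_conv_bn` is hypothetical, and the sketch assumes an affine `BatchNorm2d` in eval mode and a convolution with zero padding; the module in this repository may be organized differently.

```python
import torch
import torch.nn as nn


def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return a single conv equivalent to bn(conv(x)) at inference time.

    Assumes bn is affine and uses its running statistics (eval mode).
    """
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding,
        dilation=conv.dilation, groups=conv.groups, bias=True,
    )
    with torch.no_grad():
        # BN computes gamma * (y - mean) / sqrt(var + eps) + beta per output channel.
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
        fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
        conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
        fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)
    return fused
```

The fused layer removes one normalization pass per convolution, which is why this trick is typically applied only after training, once the running statistics are frozen.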

Evaluation

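Modules are given a preliminary evaluation using CIFAR10 + ResNet + the module under test. The evaluation code comes from another repository: https://github.com/kuangliu/pytorch-cifar/. No pretrained weights are used in the experiments; all models are randomly initialized.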

| Model | top1 acc | time | params |
| --- | --- | --- | --- |
| SENet18 | 95.28% | 1:27:50 | 11,260,354 |
| ResNet18 | 95.16% | 1:13:03 | 11,173,962 |
| ResNet50 | 95.50% | 4:24:38 | 23,520,842 |
| ShuffleNetV2 | 91.90% | 1:02:50 | 1,263,854 |
| GoogLeNet | 91.90% | 1:02:50 | 6,166,250 |
| MobileNetV2 | 92.66% | 2:04:57 | 2,296,922 |
| SA-ResNet50 | 89.83% | 2:10:07 | 23,528,758 |
| SA-ResNet18 | 95.07% | 1:39:38 | 11,171,394 |
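
The values in the last column read as raw parameter counts. As a small worked example, the sketch below converts a count to an approximate size in MiB and computes the overhead that SENet18 adds over ResNet18. It assumes 4 bytes per parameter (float32 storage), which the table does not state; the helper name `params_to_mib` is made up for this example.

```python
# Parameter counts copied from the evaluation table above.
PARAMS = {
    "SENet18": 11_260_354,
    "ResNet18": 11_173_962,
}


def params_to_mib(n_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight size in MiB, assuming float32 (4 bytes) by default."""
    return n_params * bytes_per_param / 2**20


# Extra parameters introduced by the SE modules on top of ResNet18.
extra = PARAMS["SENet18"] - PARAMS["ResNet18"]
relative = extra / PARAMS["ResNet18"]

print(f"ResNet18 weights: {params_to_mib(PARAMS['ResNet18']):.2f} MiB")
print(f"SE overhead: {extra} parameters ({relative:.2%})")
```

In this table the SE variant adds well under one percent to the parameter count, while the reported training time grows by a larger margin.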

Paper List

SENet paper: https://arxiv.org/abs/1709.01507 Explanation: https://zhuanlan.zhihu.com/p/102035721

Contribute

You are welcome to open an issue suggesting additional papers along with links to their code.

Owner
PJDong
Computer vision learner, deep learner