Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Last update: Dec 24, 2022

Overview

Deformable Attention

Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DETR. The relative positional embedding has also been modified for better extrapolation, using the Continuous Positional Embedding proposed in SwinV2.

Install

$ pip install deformable-attention

Usage

import torch
from deformable_attention import DeformableAttention

attn = DeformableAttention(
    dim = 512,                   # feature dimensions
    dim_head = 64,               # dimension per head
    heads = 8,                   # attention heads
    dropout = 0.,                # dropout
    downsample_factor = 4,       # downsample factor (r in paper)
    offset_scale = 4,            # scale of offset, maximum offset
    offset_groups = None,        # number of offset groups, should be multiple of heads
    offset_kernel_size = 6,      # offset kernel size
)

x = torch.randn(1, 512, 64, 64)
attn(x) # (1, 512, 64, 64)

3d deformable attention

import torch
from deformable_attention import DeformableAttention3D

attn = DeformableAttention3D(
    dim = 512,                          # feature dimensions
    dim_head = 64,                      # dimension per head
    heads = 8,                          # attention heads
    dropout = 0.,                       # dropout
    downsample_factor = (2, 8, 8),      # downsample factor (r in paper)
    offset_scale = (2, 8, 8),           # scale of offset, maximum offset
    offset_kernel_size = (4, 10, 10),   # offset kernel size
)

x = torch.randn(1, 512, 10, 32, 32) # (batch, dimension, frames, height, width)
attn(x) # (1, 512, 10, 32, 32)

1d deformable attention for good measure

import torch
from deformable_attention import DeformableAttention1D

attn = DeformableAttention1D(
    dim = 128,
    downsample_factor = 4,
    offset_scale = 2,
    offset_kernel_size = 6
)

x = torch.randn(1, 128, 512)
attn(x) # (1, 128, 512)

Citation

@misc{xia2022vision,
    title   = {Vision Transformer with Deformable Attention}, 
    author  = {Zhuofan Xia and Xuran Pan and Shiji Song and Li Erran Li and Gao Huang},
    year    = {2022},
    eprint  = {2201.00520},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@misc{liu2021swin,
    title   = {Swin Transformer V2: Scaling Up Capacity and Resolution},
    author  = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
    year    = {2021},
    eprint  = {2111.09883},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

Some code of the implements of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

3D-GMPDCNN Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network PyTorch implementation of "Geological Modeling Usin

5 Nov 21, 2022

MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

Deformable 3D Convolution for Video Super-Resolution Pytorch implementation of l

28 Dec 15, 2022

3D2Unet: 3D Deformable Unet for Low-Light Video Enhancement (PRCV2021)

3DDUNET This is the code for 3D2Unet: 3D Deformable Unet for Low-Light Video Enhancement (PRCV2021) Conference Paper Link Dataset We use SMOID dataset

1 Jan 7, 2022

Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

HaloNet - Pytorch Implementation of the Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones. This re

189 Nov 22, 2022

Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Memory Efficient Attention Pytorch Implementation of a memory efficient multi-head attention as proposed in the paper, Self-attention Does Not Need O(

180 Jan 5, 2023

Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

272 Dec 23, 2022

Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

Relational Self-Attention: What's Missing in Attention for Video Understanding This repository is the official implementation of "Relational Self-Atte

43 Dec 7, 2022

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

TimeSformer This is an official pytorch implementation of Is Space-Time Attention All You Need for Video Understanding?. In this repository, we provid

1k Dec 31, 2022

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Neural Attention Distillation This is an implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep

84 Jan 4, 2023

Comments

The relationship between 'dim' and 'inner_dim'

Hi, I have a question about DeformableAttention module,

I calculated the output volumes of the forward processes step by step, According to my calculation, the code works only when 'dim' and 'inner_dim' is same.

Is there any reason why you implement it this way?

Best regards, Hankyu

opened by hanq0212 4
TypeError: meshgrid() got an unexpected keyword argument 'indexing'

@lucidrains while trying to perform import torch from deformable_attention import DeformableAttention

attn = DeformableAttention( dim = 512, # feature dimensions dim_head = 64, # dimension per head heads = 8, # attention heads dropout = 0., # dropout downsample_factor = 4, # downsample factor (r in paper) offset_scale = 4, # scale of offset, maximum offset offset_groups = None, # number of offset groups, should be multiple of heads offset_kernel_size = 6, # offset kernel size )

x = torch.randn(1, 512, 64, 64) attn(x)

Got error below from the line.. Kindly help

https://github.com/lucidrains/deformable-attention/blob/9f3c0ae35652ce54687e0db409921018bfca3310/deformable_attention/deformable_attention_2d.py#L26

opened by ChidanandKumarKS 1

Releases(0.0.16)

0.0.16(Apr 21, 2022)

Source code(tar.gz)
Source code(zip)
0.0.15(Mar 24, 2022)

Source code(tar.gz)
Source code(zip)
0.0.14(Mar 24, 2022)

Source code(tar.gz)
Source code(zip)
0.0.12(Mar 24, 2022)

Source code(tar.gz)
Source code(zip)
0.0.11(Mar 24, 2022)

Source code(tar.gz)
Source code(zip)
0.0.10(Mar 24, 2022)

Source code(tar.gz)
Source code(zip)
0.0.9(Mar 21, 2022)

Source code(tar.gz)
Source code(zip)
0.0.8(Mar 21, 2022)

Source code(tar.gz)
Source code(zip)
0.0.7(Mar 21, 2022)

Source code(tar.gz)
Source code(zip)
0.0.6(Mar 21, 2022)

Source code(tar.gz)
Source code(zip)
0.0.5(Mar 17, 2022)

Source code(tar.gz)
Source code(zip)
0.0.4(Mar 17, 2022)

Source code(tar.gz)
Source code(zip)
0.0.3(Mar 17, 2022)

Source code(tar.gz)
Source code(zip)
0.0.2(Mar 17, 2022)

Source code(tar.gz)
Source code(zip)
0.0.1(Mar 17, 2022)

Source code(tar.gz)
Source code(zip)

Owner

Phil Wang

Working with Attention. It's all we need

GitHub Repository

Official implementation of Long-Short Transformer in PyTorch.

Long-Short Transformer (Transformer-LS) This repository hosts the code and models for the paper: Long-Short Transformer: Efficient Transformers for La

198 Dec 29, 2022

A DCGAN to generate anime faces using custom mined dataset

Anime-Face-GAN-Keras A DCGAN to generate anime faces using custom dataset in Keras. Dataset The dataset is created by crawling anime database websites

190 Jan 03, 2023

A Simple Example for Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env

Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env This repository implements a simple algorithm for imitation learning: DAGGER. In thi

66 Nov 23, 2022

Generative Handwriting using LSTM Mixture Density Network with TensorFlow

Generative Handwriting Demo using TensorFlow An attempt to implement the random handwriting generation portion of Alex Graves' paper. See my blog post

686 Nov 24, 2022

“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

[email protected]"> 18 Sep 10, 2022

Official implementation of the ICLR 2021 paper

You Only Need Adversarial Supervision for Semantic Image Synthesis Official PyTorch implementation of the ICLR 2021 paper "You Only Need Adversarial S

272 Dec 28, 2022

Angora is a mutation-based fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without symbolic execution.

Angora Angora is a mutation-based coverage guided fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without s

833 Jan 07, 2023

Causal-Adversarial-Instruments - PyTorch Implementation for Developing Library of Investigating Adversarial Examples on A Causal View by Instruments

Causal-Adversarial-Instruments This is a PyTorch Implementation code for develop

26 Dec 28, 2022

PyTorch implementation of our paper How robust are discriminatively trained zero-shot learning models?

How robust are discriminatively trained zero-shot learning models? This repository contains the PyTorch implementation of our paper How robust are dis

5 Feb 04, 2022

Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track (SIGIR 2021 Full Paper).

Optimizing Dense Retrieval Model Training with Hard Negatives Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma This repo provi

99 Dec 27, 2022

This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Prompt-Based Multi-Modal Image Segmentation This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation". The sys

305 Dec 30, 2022

Python implementation of "Elliptic Fourier Features of a Closed Contour"

PyEFD An Python/NumPy implementation of a method for approximating a contour with a Fourier series, as described in [1]. Installation pip install pyef

71 Dec 09, 2022

Udacity's CS101: Intro to Computer Science - Building a Search Engine

Udacity's CS101: Intro to Computer Science - Building a Search Engine All soluti

0 Feb 26, 2022

Pseudo-rng-app - whos needs science to make a random number when you have pseudoscience?

Pseudo-random numbers with pseudoscience rng is so complicated! Why cant we have a horoscopic, vibe-y way of calculating a random number? Why cant rng

1 Dec 27, 2021

bio_inspired_min_nets_improve_the_performance_and_robustness_of_deep_networks

Code Submission for: Bio-inspired Min-Nets Improve the Performance and Robustness of Deep Networks Run with docker To build a docker environment, chan

0 Dec 09, 2021

Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, as a standalone package for Pytorch

Triangle Multiplicative Module - Pytorch Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or c

22 Oct 28, 2022

Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Related tags

Overview

Deformable Attention

Install

Usage

Citation

You might also like...

Some code of the implements of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

3D2Unet: 3D Deformable Unet for Low-Light Video Enhancement (PRCV2021)

Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Comments

The relationship between 'dim' and 'inner_dim'

TypeError: meshgrid() got an unexpected keyword argument 'indexing'

Releases(0.0.16)

0.0.16(Apr 21, 2022)

0.0.15(Mar 24, 2022)

0.0.14(Mar 24, 2022)

0.0.12(Mar 24, 2022)

0.0.11(Mar 24, 2022)

0.0.10(Mar 24, 2022)

0.0.9(Mar 21, 2022)

0.0.8(Mar 21, 2022)

0.0.7(Mar 21, 2022)

0.0.6(Mar 21, 2022)

0.0.5(Mar 17, 2022)

0.0.4(Mar 17, 2022)

0.0.3(Mar 17, 2022)

0.0.2(Mar 17, 2022)

0.0.1(Mar 17, 2022)