This is the code repository for the paper OODformer: Out-Of-Distribution Detection Transformer.

Last update: Dec 02, 2022

Overview

OODformer: Out-Of-Distribution Detection Transformer

This repository is the official PyTorch implementation of OODformer: Out-Of-Distribution Detection Transformer, using CIFAR as an illustrative example.

Getting started

First, install all the dependencies:

pip install -r requirement.txt

Datasets

Download all the in-distribution datasets (CIFAR-10, CIFAR-100, ImageNet-30) and out-of-distribution datasets (LSUN_resize, ImageNet_resize, Places-365, DTD, Stanford Dogs, Food-101, Caltech-256, CUB-200) into the data folder under the root directory.

Training

To train the Vision Transformer (ViT) and its data-efficient variant (DeiT), first download the corresponding pre-trained weights from the ViT and DeiT repositories.

To fine-tune a vision transformer on any in-distribution dataset in a multi-GPU setting:

srun --gres=gpu:4 python vit/src/train.py --exp-name name_of_the_experiment --tensorboard --model-arch b16 --checkpoint-path path/to/checkpoint --image-size 224 --data-dir data/ImageNet30 --dataset ImageNet --num-classes 30 --train-steps 4590 --lr 0.01 --wd 1e-5 --n-gpu 4 --num-workers 16 --batch-size 512 --method SupCE

--model-arch : selects the ViT or DeiT variant (see vit/src/config.py)
--method : only supervised cross-entropy (SupCE) is currently supported
--train-steps : total number of training steps; a cyclic learning-rate scheduler is used, and the number of training epochs is (train steps × batch size) / number of training samples
--checkpoint-path : path to the pre-trained weights of the chosen model architecture
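
The relation between training steps and epochs given above can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, the batch size is assumed to be the global batch size across all GPUs, and the 50,000 training samples are the size of the CIFAR-10 training set, used here purely for illustration. Substitute the sample count of your own in-distribution dataset.

```python
import math


def steps_to_epochs(train_steps, batch_size, num_train_samples):
    """Number of epochs covered by a given number of training steps."""
    return train_steps * batch_size / num_train_samples


def epochs_to_steps(epochs, batch_size, num_train_samples):
    """Smallest number of steps that covers at least `epochs` epochs."""
    return math.ceil(epochs * num_train_samples / batch_size)


# Values taken from the example command: 4590 steps with batch size 512.
# 50_000 is an assumption (CIFAR-10 training set size).
epochs = steps_to_epochs(train_steps=4590, batch_size=512, num_train_samples=50_000)
steps = epochs_to_steps(epochs=47, batch_size=512, num_train_samples=50_000)

print(f"{epochs:.2f} epochs for 4590 steps")
print(f"{steps} steps for 47 epochs")
```

Going the other way, `epochs_to_steps` is useful for choosing a value for --train-steps once the desired number of epochs is known.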

Training Support

OODformer can also be trained with various supervised and self-supervised losses.

Training Base ResNet model

To train ResNet variants (e.g., ResNet-50, Wide ResNet) as the base model on an in-distribution dataset:

srun --gres=gpu:4 python main_ce.py --batch_size 512 --epochs 500 --model resnet34 --learning_rate 0.8 --cosine --warm --dataset cifar10
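
The --cosine and --warm flags suggest a cosine learning-rate schedule with a warm-up phase. The sketch below shows one common form of such a schedule, not necessarily what main_ce.py does: the function name, the 10 warm-up epochs, the starting rate of 0.01, and the decay to zero are all assumptions made for illustration. Only the base learning rate of 0.8 and the 500 epochs come from the command above.

```python
import math


def lr_at_epoch(epoch, base_lr=0.8, total_epochs=500,
                warmup_epochs=10, warmup_from=0.01):
    """Learning rate at a given epoch (hypothetical schedule)."""
    if epoch < warmup_epochs:
        # Linear warm-up from `warmup_from` towards `base_lr`.
        frac = epoch / warmup_epochs
        return warmup_from + frac * (base_lr - warmup_from)
    # Cosine decay from `base_lr` down to zero over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 5, 10, 255, 500):
    print(epoch, round(lr_at_epoch(epoch), 4))
```

Warm-up is often paired with large learning rates such as 0.8 so that the first updates, taken from a random initialization, stay small.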

Evaluation

To evaluate the distance of a sample from the mean embedding of an in-distribution class (e.g., in CIFAR-10), several distance metrics (Mahalanobis, Cosine, Euclidean, and Softmax) can be used with OODformer, as shown below:

srun --gres=gpu:1 python OOD_Distance.py --ckpt checkpoint_path --model vit --model_arch b16 --distance Mahalanobis --dataset id_dataset --out_dataset ood_dataset
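
To make the idea of scoring by distance to class-mean embeddings concrete, here is a minimal pure-Python sketch of the Euclidean and cosine variants. It is not the code in OOD_Distance.py: all function names and the toy 2-D embeddings are made up for illustration, and the Mahalanobis variant, which additionally needs a covariance estimate of the in-distribution embeddings, is left out to keep the example dependency-free.

```python
import math
from collections import defaultdict


def class_means(embeddings, labels):
    """Average the embeddings of each in-distribution class."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in groups.items()
    }


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def ood_score(embedding, means, distance=euclidean):
    """Distance to the nearest class mean; larger means more likely OOD."""
    return min(distance(embedding, mean) for mean in means.values())


# Toy 2-D embeddings for two in-distribution classes (made-up values).
train_embeddings = [[0.0, 1.0], [0.0, 3.0], [4.0, 0.0], [6.0, 0.0]]
train_labels = [0, 0, 1, 1]
means = class_means(train_embeddings, train_labels)

in_dist_score = ood_score([0.0, 2.5], means)      # close to the class-0 mean
ood_like_score = ood_score([-5.0, -5.0], means)   # far from both class means
print(in_dist_score, ood_like_score)
```

A sample would then be flagged as out-of-distribution when its score exceeds a threshold chosen on held-out in-distribution data. Passing `distance=cosine_distance` switches the metric without changing the rest of the pipeline.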

Visualization

Embedding visualizations can be generated with generate_tsne.py:

(1) UMAP of in-distribution embedding

(2) UMAP of combined in- and out-of-distribution embedding

Reference

@article{koner2021oodformer,
  title={OODformer: Out-Of-Distribution Detection Transformer},
  author={Koner, Rajat and Sinhamahapatra, Poulami and Roscher, Karsten and G{\"u}nnemann, Stephan and Tresp, Volker},
  journal={arXiv preprint arXiv:2107.08976},
  year={2021}
}

Acknowledgments

Part of this code is inspired by HobbitLong/SupContrast.
