Face Transformer for Recognition

Last update: Nov 30, 2022

Face-Transformer

This repository contains the code for Face Transformer for Recognition (https://arxiv.org/abs/2103.14803v2).

Transformers have recently attracted great interest not only in NLP but also in computer vision. We ask whether Transformers can be used for face recognition and whether they outperform CNNs, and therefore investigate the performance of Transformer models on this task. The models are trained on the large-scale face recognition database MS-Celeb-1M and evaluated on several mainstream benchmarks: LFW, SLLFW, CALFW, CPLFW, TALFW, CFP-FP, AGEDB, and IJB-C. We demonstrate that Transformer models achieve performance comparable to CNNs with a similar number of parameters and MACs.

Usage Instructions

1. Preparation

The code is mainly adapted from Vision Transformer and DeiT. In addition to PyTorch and torchvision, install vit_pytorch by Phil Wang and timm==0.3.2 by Ross Wightman. We sincerely appreciate their contributions.

pip install vit-pytorch

pip install timm==0.3.2

Copy the files in the folder "copy-to-vit_pytorch-path" into the directory where vit-pytorch is installed.

.
├── __init__.py
├── vit_face.py
└── vits_face.py
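The copy step can also be scripted. The following is a minimal sketch, assuming the package installed by `pip install vit-pytorch` is importable as `vit_pytorch` and that the script runs from the repository root; the helper names `find_package_dir` and `copy_python_files` are hypothetical and not part of this repository. Note that copying `__init__.py` overwrites the one shipped with the installed package, as the manual step does.

```python
import importlib.util
import shutil
from pathlib import Path


def find_package_dir(package_name):
    """Return the directory of an installed package, or None if it cannot be found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def copy_python_files(src_dir, dst_dir):
    """Copy every .py file from src_dir into dst_dir, overwriting same-named files."""
    copied = []
    for src in sorted(Path(src_dir).glob("*.py")):
        shutil.copy2(src, Path(dst_dir) / src.name)
        copied.append(src.name)
    return copied
```

A typical call would be `copy_python_files("copy-to-vit_pytorch-path", find_package_dir("vit_pytorch"))`, after checking that `find_package_dir` did not return `None`.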

2. Databases

Download the training database, MS-Celeb-1M (version ms1m-retinaface), and put it in the folder 'Data'.

Download the testing databases listed below and put them in the folder 'eval'.

LFW: Baidu Netdisk(password: dfj0), Google Drive
SLLFW: Baidu Netdisk(password: l1z6), Google Drive
CALFW: Baidu Netdisk(password: vvqe), Google Drive
CPLFW: Baidu Netdisk(password: jyp9), Google Drive
TALFW: Baidu Netdisk(password: izrg), Google Drive
CFP_FP: Baidu Netdisk(password: 4fem), Google Drive--refer to Insightface
AGEDB: Baidu Netdisk(password: rlqf), Google Drive--refer to Insightface
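Before starting a long training run, it can help to confirm that both folders are in place. This is a small sketch, assuming 'Data' and 'eval' sit directly under the repository root; the function `missing_folders` is hypothetical, and the file names inside each folder are not checked because they are not specified here.

```python
from pathlib import Path


def missing_folders(root, expected=("Data", "eval")):
    """Return the names in `expected` that are not directories under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]
```

Calling `missing_folders(".")` from the repository root returns an empty list when both folders exist.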

3. Train Models

ViT-P8S8

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VIT -head CosFace --outdir ./results/ViT-P8S8_ms1m_cosface_s1 --warmup-epochs 1 --lr 3e-4 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VIT -head CosFace --outdir ./results/ViT-P8S8_ms1m_cosface_s2 --warmup-epochs 0 --lr 1e-4 -r path_to_model 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VIT -head CosFace --outdir ./results/ViT-P8S8_ms1m_cosface_s3 --warmup-epochs 0 --lr 5e-5 -r path_to_model

ViT-P12S8

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s1 --warmup-epochs 1 --lr 3e-4 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s2 --warmup-epochs 0 --lr 1e-4 -r path_to_model 

CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s3 --warmup-epochs 0 --lr 5e-5 -r path_to_model
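Both models follow the same three-stage schedule: a first stage with one warmup epoch at learning rate 3e-4, then two stages resumed from a checkpoint at 1e-4 and 5e-5. The sketch below only assembles the command strings shown above so the pattern is easier to see; the function `build_stage_commands` is hypothetical, `path_to_model` remains a placeholder for the previous stage's checkpoint, and the `CUDA_VISIBLE_DEVICES` prefix is left out.

```python
import shlex

# (stage suffix, warmup epochs, learning rate), copied from the commands above.
STAGES = [("s1", 1, "3e-4"), ("s2", 0, "1e-4"), ("s3", 0, "5e-5")]


def build_stage_commands(network, run_name, resume_from="path_to_model"):
    """Return the three train.py command lines for one model.

    network  -- value passed to -n ("VIT" or "VITs")
    run_name -- prefix of the output directory, e.g. "ViT-P8S8"
    """
    commands = []
    for suffix, warmup, lr in STAGES:
        args = [
            "python3", "-u", "train.py",
            "-b", "480", "-w", "0,1,2,3", "-d", "retina",
            "-n", network, "-head", "CosFace",
            "--outdir", f"./results/{run_name}_ms1m_cosface_{suffix}",
            "--warmup-epochs", str(warmup), "--lr", lr,
        ]
        if suffix != "s1":
            # Stages 2 and 3 resume from an earlier checkpoint.
            args += ["-r", resume_from]
        commands.append(shlex.join(args))
    return commands


for command in build_stage_commands("VITs", "ViT-P12S8"):
    print(command)
```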

4. Pretrained Models and Test Models (on LFW, SLLFW, CALFW, CPLFW, TALFW, CFP_FP, AGEDB)

You can download the following pretrained models:

ViT-P8S8: Baidu Netdisk(password: spkf), Google Drive
ViT-P12S8: Baidu Netdisk(password: 7caa), Google Drive

To test a model, pass its checkpoint with --model and the matching network name with --network (VIT for ViT-P8S8, VITs for ViT-P12S8):

python test.py --model path_to_model --network VIT

python test.py --model ./results/ViT-P12S8_ms1m_cosface/Backbone_VITs_Epoch_2_Batch_12000_Time_2021-03-17-04-05_checkpoint.pth --network VITs
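The internals of test.py are not shown here. As background, LFW-style benchmarks are pair-verification tasks: each pair of face images is labeled as same or different identity, and a common way to score them is to threshold the cosine similarity of the two embeddings. The sketch below illustrates that general idea only; the function names, the toy embeddings, and the threshold are all assumptions and may differ from what test.py actually does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_accuracy(pairs, labels, threshold):
    """Fraction of pairs whose thresholded similarity agrees with the label.

    pairs     -- list of (embedding_a, embedding_b) tuples
    labels    -- list of booleans, True when both faces share an identity
    threshold -- similarity at or above which a pair is predicted "same"
    """
    correct = 0
    for (a, b), same in zip(pairs, labels):
        predicted_same = cosine_similarity(a, b) >= threshold
        correct += int(predicted_same == same)
    return correct / len(labels)


# Toy 2-D embeddings, for illustration only.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
labels = [True, False]
print(verification_accuracy(pairs, labels, threshold=0.5))
```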

Owner

Zhong Yaoyao
