[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

Overview

F8Net
Fixed-Point 8-bit Only Multiplication for Network Quantization (ICLR 2022 Oral)

OpenReview | arXiv | PDF | Model Zoo | BibTex

PyTorch implementation of neural network quantization with fixed-point 8-bit only multiplication.

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin1,2, Jian Ren1, Richard Zhuang1, Sumant Hanumante1, Zhengang Li2, Zhiyu Chen3, Yanzhi Wang2, Kaiyuan Yang3, Sergey Tulyakov1
1Snap Inc., 2Northeastern University, 3Rice University
ICLR 2022 Oral.

Overview Neural network quantization implements efficient inference via reducing the weight and input precisions. Previous methods for quantization can be categorized as simulated quantization, integer-only quantization, and fixed-point quantization, with the former two involving high-precision multiplications with 32-bit floating-point or integer scaling. In contrast, fixed-point models can avoid such high-demanding requirements but demonstrates inferior performance to the other two methods. In this work, we study the problem of how to train such models. Specifically, we conduct statistical analysis on values for quantization and propose to determine the fixed-point format from data during training with some semi-empirical formula. Our method demonstrates that high-precision multiplication is not necessary for the quantized model to achieve comparable performance as their full-precision counterparts.

Getting Started

Requirements
  1. Please check the requirements and download packages.

  2. Prepare ImageNet-1k data following pytorch example, and create a softlink to the ImageNet data path to data under current the code directory (ln -s /path/to/imagenet data).

Model Training
Conventional training
  • We train the model with the file distributed_run.sh and the command
    bash distributed_run.sh /path/to/yml_file batch_size
    
  • We set batch_size=2048 for conventional training of floating-/fixed-point ResNet18 and MobileNet V1/V2.
  • Before training, please update the dataset_dir and log_dir arguments in the yaml files for training the floating-/fixed-point models.
  • To train the floating-point model, please use the yaml file ***_floating_train.yml in the conventional subfolder under the corresponding folder of the model.
  • To train the fixed-point model, please first train the floating-point model as the initialization. Please use the yaml file ***_fix_quant_train.yml in the conventional subfolder under the corresponding folder of the model. Please make sure the argument fp_pretrained_file directs to the correct path for the corresponding floating-point checkpoint. We also provide our pretrained floating-point models in the Model Zoo below.
Tiny finetuning
  • We finetune the model with the file run.sh and the command

    bash run.sh /path/to/yml_file batch_size
    
  • We set batch_size=128 and use one GPU for tiny-finetuning of fixed-point ResNet18/50.

  • Before fine-tuning, please update the dataset_dir and log_dir arguments in the yaml files for finetuning the fixed-point models.

  • To finetune the fixed-point model, please use the yaml file ***_fix_quant_***_pretrained_train.yml in the tiny_finetuning subfolder under the corresponding folder of the model. For model pretrained with PytorchCV (Baseline of ResNet18 and Baseline#1 of ResNet50), the floating-point checkpoint will be downloaded automatically during code running. For the model pretrained by Nvidia (Baseline#2 of ResNet50), please download the checkpoint first and make sure the argument nvidia_pretrained_file directs to the correct path of this checkpoint.

Model Testing
  • We test the model with the file run.sh and the command

    bash run.sh /path/to/yml_file batch_size
    
  • We set batch_size=128 and use one GPU for model testing.

  • Before testing, please update the dataset_dir and log_dir arguments in the yaml files. Please update the argument integize_file_path and int_op_only_file_path arguments in the yaml files ***_fix_quant_test***_integize.yml and ***_fix_quant_test***_int_op_only.yml, respectively. Please also update other arguments like nvidia_pretrained_file if necessary (even if they are not used during testing).

  • We use the yaml file ***_floating_test.yml for testing the floating-point model; ***_fix_quant***_test.yml for testing the fixed-point model with the same setting as during training/tiny-finetuning; ***_fix_quant***_test_int_model.yml for testing the fixed-point model on GPU with all quantized weights, bias and inputs implemented with integers (but with float dtype as GPU does not support integer operations) and use the original modules during training (e.g. with batch normalization layers); ***_fix_quant***_test_integize.yml for testing the fixed-point model on GPU with all quantized weights, bias and inputs implemented with integers (but with float dtype as GPU does not support integer operations) and a new equivalent model with only convolution, pooling and fully-connected layers; ***_fix_quant***_test_int_op_only.yml for testing the fixed-point model on CPU with all quantized weights, bias and inputs implemented with integers (with int dtype) and a new equivalent model with only convolution, pooling and fully-connected layers. Note that the accuracy from the four testing files can differ a little due to numerical error.

Model Export
  • We export fixed-point model with integer weights, bias and inputs to run on GPU and CPU during model testing with ***_fix_quant_test_integize.yml and ***_fix_quant_test_int_op_only.yml files, respectively.

  • The exported onnx files are saved to the path given by the arguments integize_file_path and int_op_only_file_path.

F8Net Model Zoo

All checkpoints and onnx files are available at here.

Conventional

Model Type Top-1 Acc.a Checkpoint
ResNet18 FP
8-bit
70.3
71.0
Res18_32
Res18_8
MobileNet-V1 FP
8-bit
72.4
72.8
MBV1_32
MBV1_8
MobileNet-V2b FP
8-bit
72.7
72.6
MBV2b_32
MBV2b_8

Tiny Finetuning

Model Type Top-1 Acc.a Checkpoint
ResNet18 FP
8-bit
73.1
72.3
Res18_32p
Res18_8p
ResNet50b (BL#1) FP
8-bit
77.6
77.6
Res50b_32p
Res50b_8p
ResNet50b (BL#2) FP
8-bit
78.5
78.1
Res50b_32n
Res50b_8n

a The accuracies are obtained from the inference step during training. Test accuracy for the final exported model might have some small accuracy difference due to numerical error.

Technical Details

The main techniques for neural network quantization with 8-bit fixed-point multiplication includes the following:

  • Quantized methods/modules including determining fixed-point formats from statistics or by grid-search, fusing convolution and batch normalization layers, and reformulating PACT with fixed-point quantization are implemented in models/fix_quant_ops.
  • Clipping-level sharing and private fractional length for residual blocks are implemented in the ResNet (models/fix_resnet) and MobileNet V2 (models/fix_mobilenet_v2).

Acknowledgement

This repo is based on AdaBits.

Citation

If our code or models help your work, please cite our paper:

@inproceedings{
  jin2022fnet,
  title={F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization},
  author={Qing Jin and Jian Ren and Richard Zhuang and Sumant Hanumante and Zhengang Li and Zhiyu Chen and Yanzhi Wang and Kaiyuan Yang and Sergey Tulyakov},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=_CfpJazzXT2}
}
Owner
Snap Research
Snap Research
Image-to-image regression with uncertainty quantification in PyTorch

Image-to-image regression with uncertainty quantification in PyTorch. Take any dataset and train a model to regress images to images with rigorous, distribution-free uncertainty quantification.

Anastasios Angelopoulos 25 Dec 26, 2022
Implementation supporting the ICCV 2017 paper "GANs for Biological Image Synthesis"

GANs for Biological Image Synthesis This codes implements the ICCV-2017 paper "GANs for Biological Image Synthesis". The paper and its supplementary m

Anton Osokin 95 Nov 25, 2022
Harmonic Memory Networks for Graph Completion

HMemNetworks Code and documentation for Harmonic Memory Networks, a series of models for compositionally assembling representations of graph elements

mlalisse 0 Oct 27, 2021
[ECCVW2020] Robust Long-Term Object Tracking via Improved Discriminative Model Prediction (RLT-DiMP)

Feel free to visit my homepage Robust Long-Term Object Tracking via Improved Discriminative Model Prediction (RLT-DIMP) [ECCVW2020 paper] Presentation

Seokeon Choi 35 Oct 26, 2022
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

Introduction YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and ind

7.7k Jan 03, 2023
Pytorch implemenation of Stochastic Multi-Label Image-to-image Translation (SMIT)

SMIT: Stochastic Multi-Label Image-to-image Translation This repository provides a PyTorch implementation of SMIT. SMIT can stochastically translate a

Biomedical Computer Vision Group @ Uniandes 37 Mar 01, 2022
Repository providing a wide range of self-supervised pretrained models for computer vision tasks.

Hierarchical Pretraining: Research Repository This is a research repository for reproducing the results from the project "Self-supervised pretraining

Colorado Reed 53 Nov 09, 2022
A light and fast one class detection framework for edge devices. We provide face detector, head detector, pedestrian detector, vehicle detector......

A Light and Fast Face Detector for Edge Devices Big News: LFD, which is a big update of LFFD, now is released (2021.03.09). It is strongly recommended

YonghaoHe 1.3k Dec 25, 2022
This is a file about Unet implemented in Pytorch

Unet this is an implemetion of Unet in Pytorch and it's architecture is as follows which is the same with paper of Unet component of Unet Convolution

Dragon 1 Dec 03, 2021
Credit fraud detection in Python using a Jupyter Notebook

Credit-Fraud-Detection - Credit fraud detection in Python using a Jupyter Notebook , using three classification models (Random Forest, Gaussian Naive Bayes, Logistic Regression) from the sklearn libr

Ali Akram 4 Dec 28, 2021
PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement.

DECOR-GAN PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement, Zhiqin Chen, Vladimir G. Kim, Matthew Fish

Zhiqin Chen 72 Dec 31, 2022
Torch implementation of various types of GAN (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN, LSGAN)

gans-collection.torch Torch implementation of various types of GANs (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN). Note that EBGAN and

Minchul Shin 53 Jan 22, 2022
Official Implementation of DE-DETR and DELA-DETR in "Towards Data-Efficient Detection Transformers"

DE-DETRs By Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, and Dacheng Tao This repository is an official implementation of DE-DETR and DELA-DETR in

Wen Wang 61 Dec 12, 2022
This repo is duplication of jwyang/faster-rcnn.pytorch

Faster RCNN Pytorch This repo is duplication of jwyang/faster-rcnn.pytorch C/C++ code are removed and easier to study. Python 3.8.5 Ubuntu 20.04.1 LTS

Kim Jihwan 1 Jan 14, 2022
Revisiting Weakly Supervised Pre-Training of Visual Perception Models

SWAG: Supervised Weakly from hashtAGs This repository contains SWAG models from the paper Revisiting Weakly Supervised Pre-Training of Visual Percepti

Meta Research 134 Jan 05, 2023
Code accompanying the paper on "An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers" published at NeurIPS, 2021

Code for "An Empirical Investigation of Domian Generalization with Empirical Risk Minimizers" (NeurIPS 2021) Motivation and Introduction Domain Genera

Meta Research 15 Dec 27, 2022
Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network"

M3D-VTON: A Monocular-to-3D Virtual Try-On Network Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network" Paper | Suppl

109 Dec 29, 2022
Train CNNs for the fruits360 data set in NTOU CS「Machine Vision」class.

CNNs fruits360 Train CNNs for the fruits360 data set in NTOU CS「Machine Vision」class. CNN on a pretrained model Build a CNN on a pretrained model, Res

Ricky Chuang 1 Mar 07, 2022
Code for 'Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning' (AAAI 2022)

Blockwise Sequential Model Learning Code for 'Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning' (AAAI 2022) For ins

2 Jun 17, 2022
Source Code for ICSE 2022 Paper - ``Can We Achieve Fairness Using Semi-Supervised Learning?''

Fair-SSL Source Code for ICSE 2022 Paper - Can We Achieve Fairness Using Semi-Supervised Learning? Ethical bias in machine learning models has become

1 Dec 18, 2021