Model Zoo for AI Model Efficiency Toolkit

Overview

Qualcomm Innovation Center, Inc.

We provide a collection of popular neural network models and compare their floating-point and quantized performance. The results demonstrate that quantized models can provide accuracy comparable to floating-point models. Along with the results, we also provide recipes for users to quantize floating-point models using the AI Model Efficiency ToolKit (AIMET).

Introduction

Quantized inference is significantly faster than floating-point inference, and enables models to run in a power-efficient manner on mobile and edge devices. We use AIMET, a library that includes state-of-the-art techniques for quantization, to quantize various models available in TensorFlow and PyTorch frameworks. The list of models is provided in the sections below.

An original FP32 source model is quantized using either the post-training quantization (PTQ) or the quantization-aware training (QAT) technique available in AIMET. Example scripts for evaluation are provided for each model. When PTQ is needed, the evaluation script performs PTQ before evaluation. Wherever QAT is used, the fine-tuned model checkpoint is also provided.
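To make the post-training path more concrete, the sketch below simulates 8-bit affine quantization in plain Python: it derives a scale and zero-point from the observed min/max range, rounds each value to an integer level, and maps it back to a float. This only illustrates the general idea under simple assumptions. It does not use AIMET, and the names `compute_encoding` and `quantize_dequantize` are hypothetical.

```python
def compute_encoding(values, bitwidth=8):
    """Derive (scale, zero_point) from the min/max of the observed values."""
    qmax = 2 ** bitwidth - 1
    # Extend the range to include zero so that 0.0 maps to an exact level.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize_dequantize(values, scale, zero_point, bitwidth=8):
    """Round to the integer grid, clamp to [0, 2^bitwidth - 1], map back to float."""
    qmax = 2 ** bitwidth - 1
    restored = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(0, min(qmax, q))
        restored.append((q - zero_point) * scale)
    return restored


# Hypothetical FP32 weights standing in for one layer of a model.
weights = [-0.51, -0.12, 0.0, 0.07, 0.33, 0.98]
scale, zero_point = compute_encoding(weights)
simulated = quantize_dequantize(weights, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, simulated))
print(f"scale={scale:.6f} zero_point={zero_point} max_error={max_error:.6f}")
```

In this sketch no value is clipped, so the rounding error stays within half a quantization step. QAT differs in that the model is fine-tuned with this rounding simulated during training, which is why a separate checkpoint is provided for QAT models.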

TensorFlow Models

Model Zoo

| Network | Model Source [1] | Floating Pt (FP32) Model [2] | Quantized Model [3] | Results [4] | Documentation |
| --- | --- | --- | --- | --- | --- |
| ResNet-50 (v1) | GitHub Repo | Pretrained Model | See Documentation | (ImageNet) Top-1 Accuracy; FP32: 75.21%; INT8: 74.96% | ResNet50.md |
| MobileNet-v2-1.4 | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy; FP32: 75%; INT8: 74.21% | MobileNetV2.md |
| EfficientNet Lite | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy; FP32: 74.93%; INT8: 74.99% | EfficientNetLite.md |
| SSD MobileNet-v2 | GitHub Repo | Pretrained Model | See Example | (COCO) Mean Avg. Precision (mAP); FP32: 0.2469; INT8: 0.2456 | SSDMobileNetV2.md |
| RetinaNet | GitHub Repo | Pretrained Model | See Example | (COCO) mAP; FP32: 0.35; INT8: 0.349 (see Detailed Results) | RetinaNet.md |
| Pose Estimation | Based on Ref. | Based on Ref. | Quantized Model | (COCO) mAP; FP32: 0.383; INT8: 0.379. Mean Avg. Recall (mAR); FP32: 0.452; INT8: 0.446 | PoseEstimation.md |
| SRGAN | GitHub Repo | Pretrained Model | See Example | (BSD100) PSNR/SSIM; FP32: 25.45/0.668; INT8: 24.78/0.628; INT8W/INT16Act.: 25.41/0.666 (see Detailed Results) | SRGAN.md |

[1] Original FP32 model source
[2] FP32 model checkpoint
[3] Quantized Model: For models quantized with a post-training technique, this refers to the FP32 model, which can then be quantized using AIMET. For models optimized with QAT, it refers to the model checkpoint with fine-tuned weights. 8-bit weights and activations are typically used. For some models, 8-bit weights and 16-bit activations (INT8W/INT16Act.) are used to further improve the performance of post-training quantization.
[4] Results comparing float and quantized performance
[5] Script for quantized evaluation using the model referenced in “Quantized Model” column
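Footnote [3] mentions that some models use 16-bit activations to recover accuracy. The sketch below illustrates why wider activations can help: it quantizes the same values onto an 8-bit and a 16-bit grid and compares the worst-case rounding error. The activation samples are synthetic and the helper `quantization_error` is hypothetical; this is not how AIMET computes its encodings.

```python
import math


def quantization_error(values, bitwidth):
    """Worst-case absolute error of min/max quantize-dequantize at a bit width."""
    levels = 2 ** bitwidth - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    worst = 0.0
    for v in values:
        q = round((v - lo) / scale)   # integer grid index in [0, levels]
        restored = lo + q * scale
        worst = max(worst, abs(v - restored))
    return worst


# Synthetic activation samples with a fairly wide dynamic range (an assumption).
activations = [math.sin(0.37 * i) * (1 + i % 7) for i in range(200)]

err8 = quantization_error(activations, 8)
err16 = quantization_error(activations, 16)
print(f"8-bit worst error:  {err8:.6f}")
print(f"16-bit worst error: {err16:.8f}")
```

The 16-bit grid has 256 times more steps across the same range (plus one), so its rounding error is correspondingly smaller, which is consistent with the SRGAN row where INT8W/INT16Act. lands closer to FP32 than INT8.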

Detailed Results

RetinaNet

(COCO dataset)

| Metric | IoU | Area | maxDets | FP32 | INT8 |
| --- | --- | --- | --- | --- | --- |
| Average Precision | 0.50:0.95 | all | 100 | 0.350 | 0.349 |
| Average Precision | 0.50 | all | 100 | 0.537 | 0.536 |
| Average Precision | 0.75 | all | 100 | 0.374 | 0.372 |
| Average Precision | 0.50:0.95 | small | 100 | 0.191 | 0.187 |
| Average Precision | 0.50:0.95 | medium | 100 | 0.383 | 0.381 |
| Average Precision | 0.50:0.95 | large | 100 | 0.472 | 0.472 |
| Average Recall | 0.50:0.95 | all | 1 | 0.306 | 0.305 |
| Average Recall | 0.50:0.95 | all | 10 | 0.491 | 0.490 |
| Average Recall | 0.50:0.95 | all | 100 | 0.533 | 0.532 |
| Average Recall | 0.50:0.95 | small | 100 | 0.345 | 0.341 |
| Average Recall | 0.50:0.95 | medium | 100 | 0.577 | 0.577 |
| Average Recall | 0.50:0.95 | large | 100 | 0.681 | 0.679 |
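The rows above are indexed by IoU (intersection over union) threshold: in COCO-style evaluation, a predicted box matches a ground-truth box only if their IoU reaches the threshold, and the 0.50:0.95 setting averages over thresholds from 0.50 to 0.95 in steps of 0.05. As a minimal sketch of the underlying quantity, the function below computes IoU for two axis-aligned boxes. The `box_iou` name and the `(x_min, y_min, x_max, y_max)` layout are assumptions made for illustration, not the format used by the evaluation scripts.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground-truth boxes.
prediction = (10, 10, 50, 50)
ground_truth = (30, 30, 70, 70)
iou = box_iou(prediction, ground_truth)
print(f"IoU = {iou:.4f}, counts as a match at 0.50: {iou >= 0.50}")
```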

SRGAN

| Model | Dataset | PSNR | SSIM |
| --- | --- | --- | --- |
| FP32 | Set5/Set14/BSD100 | 29.17/26.17/25.45 | 0.853/0.719/0.668 |
| INT8/ACT8 | Set5/Set14/BSD100 | 28.31/25.55/24.78 | 0.821/0.684/0.628 |
| INT8/ACT16 | Set5/Set14/BSD100 | 29.12/26.15/25.41 | 0.851/0.719/0.666 |
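For readers unfamiliar with the PSNR column, the sketch below shows the standard definition, PSNR = 10 · log10(MAX² / MSE), on a handful of pixel values. It is a general illustration only: the pixel values are made up, and the SRGAN evaluation script may differ in details such as color space or border cropping, which are not described here.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel values and a slightly degraded copy.
reference = [52, 55, 61, 66, 70, 61, 64, 73]
degraded = [54, 55, 60, 67, 69, 62, 64, 71]
value = psnr(reference, degraded)
print(f"PSNR = {value:.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a higher value means a reconstruction closer to the reference.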

PyTorch Models

Model Zoo

| Network | Model Source [1] | Floating Pt (FP32) Model [2] | Quantized Model [3] | Results [4] | Documentation |
| --- | --- | --- | --- | --- | --- |
| MobileNetV2 | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy; FP32: 71.67%; INT8: 71.14% | MobileNetV2.md |
| EfficientNet-lite0 | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy; FP32: 75.42%; INT8: 74.44% | EfficientNet-lite0.md |
| DeepLabV3+ | GitHub Repo | Pretrained Model | Quantized Model | (PascalVOC) mIOU; FP32: 72.62%; INT8: 72.22% | DeepLabV3.md |
| MobileNetV2-SSD-Lite | GitHub Repo | Pretrained Model | Quantized Model | (PascalVOC) mAP; FP32: 68.7%; INT8: 68.6% | MobileNetV2-SSD-lite.md |
| Pose Estimation | Based on Ref. | Based on Ref. | Quantized Model | (COCO) mAP; FP32: 0.364; INT8: 0.359. mAR; FP32: 0.436; INT8: 0.432 | PoseEstimation.md |
| SRGAN | GitHub Repo | Pretrained Model (older version from here) | See Example | (BSD100) PSNR/SSIM; FP32: 25.51/0.653; INT8: 25.5/0.648 (see Detailed Results) | SRGAN.md |
| DeepSpeech2 | GitHub Repo | Pretrained Model | See Example | (Librispeech Test Clean) WER; FP32: 9.92%; INT8: 10.22% | DeepSpeech2.md |

[1] Original FP32 model source
[2] FP32 model checkpoint
[3] Quantized Model: For models quantized with a post-training technique, this refers to the FP32 model, which can then be quantized using AIMET. For models optimized with QAT, it refers to the model checkpoint with fine-tuned weights. 8-bit weights and activations are typically used. For some models, 8-bit weights and 16-bit activations are used to further improve the performance of post-training quantization.
[4] Results comparing float and quantized performance
[5] Script for quantized evaluation using the model referenced in “Quantized Model” column
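To read the PyTorch results at a glance, the small script below computes the FP32-to-INT8 drop, in percentage points, for the rows of the table above that report a percentage metric where higher is better. The numbers are copied from the table; the dictionary layout and variable names are only illustrative.

```python
# (FP32, INT8) values in percent, copied from the PyTorch Model Zoo table.
results = {
    "MobileNetV2 (Top-1)": (71.67, 71.14),
    "EfficientNet-lite0 (Top-1)": (75.42, 74.44),
    "DeepLabV3+ (mIOU)": (72.62, 72.22),
    "MobileNetV2-SSD-Lite (mAP)": (68.7, 68.6),
}

# Drop in percentage points when going from FP32 to INT8.
drops = {name: round(fp32 - int8, 2) for name, (fp32, int8) in results.items()}
largest = max(drops, key=drops.get)

for name, drop in drops.items():
    print(f"{name}: {drop:.2f} points")
print(f"Largest drop: {largest}")
```

All four drops are under one percentage point, which matches the claim in the overview that quantized accuracy is comparable to floating point for these models.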

Detailed Results

SRGAN PyTorch

| Model | Dataset | PSNR | SSIM |
| --- | --- | --- | --- |
| FP32 | Set5/Set14/BSD100 | 29.93/26.58/25.51 | 0.851/0.709/0.653 |
| INT8 | Set5/Set14/BSD100 | 29.86/26.59/25.55 | 0.845/0.705/0.648 |

Examples

Install AIMET

Before you can run the example script for a specific model, you need to install the AI Model Efficiency ToolKit (AIMET) software. Please see this Getting Started page for an overview. Then install AIMET and its dependencies using these Installation instructions.

NOTE: To obtain the exact version of AIMET software that was used to test this model zoo, please install release 1.13.0 when following the above instructions.

Running the scripts

Download the datasets and code required to run the example for the model of interest. The examples run quantized evaluation and, if necessary, apply AIMET techniques to improve quantized model performance. They generate the final accuracy results noted in the tables above. Refer to the Docs folder for TensorFlow or PyTorch to access the documentation and procedures for a specific model.

Team

AIMET Model Zoo is a project maintained by Qualcomm Innovation Center, Inc.

License

Please see the LICENSE file for details.

Comments
  • Added PyTorch FFNet model, added INT4 to several models

    Added the following new model: PyTorch FFNet.

    Added INT4 quantization support to the following models:

    • PyTorch Classification (regnet_x_3_2gf, resnet18, resnet50)
    • PyTorch HRNet Posenet
    • PyTorch HRNet
    • PyTorch EfficientNet Lite0
    • PyTorch DeeplabV3-MobileNetV2

    Signed-off-by: Bharath Ramaswamy

    opened by quic-bharathr 0
  • Added TensorFlow MobileDet-EdgeTPU and PyTorch InverseForm models

    Added two new models: TensorFlow MobileDet-EdgeTPU and PyTorch InverseForm. Fixed the TF version for 2 models in the README file. Minor updates to the TensorFlow EfficientNet Lite-0 doc and the PyTorch ssd_mobilenetv2 script.

    Signed-off-by: Bharath Ramaswamy

    opened by quic-bharathr 0
  • Updated pose estimation evaluation code and documentation for updated model .pth file with weights state-dict

    Fixed the model loading problem by including the model definition in pose_estimation_quanteval.py. Added Quantizer Op Assumptions to the Pose Estimation document.

    Signed-off-by: Bharath Ramaswamy

    opened by quic-bharathr 0
  • Error when running the pose estimation example

    $ python3.6 pose_estimation_quanteval.py pe_weights.pth ./data/

    2022-05-24 22:37:22,500 - root - INFO - AIMET defining network with shared weights
    Traceback (most recent call last):
      File "pose_estimation_quanteval.py", line 700, in <module>
        pose_estimation_quanteval(args)
      File "pose_estimation_quanteval.py", line 687, in pose_estimation_quanteval
        sim = quantsim.QuantizationSimModel(model, dummy_input=(1, 3, 128, 128), quant_scheme=args.quant_scheme)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/quantsim.py", line 157, in __init__
        self.connected_graph = ConnectedGraph(self.model, dummy_input)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 132, in __init__
        self._construct_graph(model, model_input)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 254, in _construct_graph
        module_tensor_shapes_map = ConnectedGraph._generate_module_tensor_shapes_lookup_table(model, model_input)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 244, in _generate_module_tensor_shapes_lookup_table
        run_hook_for_layers_with_given_input(model, model_input, forward_hook, leaf_node_only=False)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/utils.py", line 277, in run_hook_for_layers_with_given_input
        _ = model(*input_tensor)
      File "/home/jlchen/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1071, in _call_impl
        result = forward_call(*input, **kwargs)
    TypeError: forward() takes 2 positional arguments but 5 were given

    opened by sundyCoder 0
  • I tried to quantize the DeepSpeech demo, but an error happened

    ImportError: /home/mi/anaconda3/envs/aimet/lib/python3.7/site-packages/aimet_common/x86_64-linux-gnu/aimet_tensor_quantizer-0.0.0-py3.7-linux-x86_64.egg/AimetTensorQuantizer.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor8data_ptrIfEEPT_v

    Platform: Ubuntu 18.04; GPU: NVIDIA 2070; CUDA: 11.1; PyTorch; Python: 3.7

    opened by fmbao 0
  • Request for the MobileNet-V1-1.0 quantized (INT8) model.

    Thank you for sharing these valuable models. I'd like to evaluate and look into the 'MobileNet-v1-1.0' model quantized by the DFQ. I'd appreciate it if you could provide the quantized MobileNet-v1-1.0 model either in TF or in PyTorch.

    opened by yschoi-dev 0
  • What's the runtime and AI Framework for DeepSpeech2?

    For DeepSpeech2, may I know what the runtime is for its quantized (INT8) model: Hexagon DSP, NPU, or something else? And what is the AI framework: SNPE, Hexagon NN, or something else? Thanks.

    opened by sunfangxun 0
  • Unable to replicate DeepLabV3 Pytorch Tutorial numbers

    I've been working through the DeepLabV3 PyTorch tutorial, which can be found here: https://github.com/quic/aimet-model-zoo/blob/develop/zoo_torch/Docs/DeepLabV3.md.

    However, when running the evaluation script using the optimized checkpoint, I am unable to replicate the mIOU result listed in the table. The number I got was 0.67, while the number reported by Qualcomm was 0.72. Has anyone had this issue before, and how can it be resolved?

    opened by LLNLanLeN 3