How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

Last update: Sep 20, 2022

Related tags

Overview

AdamBNN

This is the pytorch implementation of our paper "How Do Adam and Training Strategies Help BNNs Optimization?", published in ICML 2021.

In this work, we explore the intrisic reasons why Adam is superior to other optimizers like SGD for BNN optimization and provide analytical explanations that support specific training strategies. By visualizing the optimization trajectory, we show that the optimization lies in extremely rugged loss landscape and the second-order momentum in Adam is crucial to revitalize the weights that are dead due to the activation saturation in BNNs. Based on analysis, we derive a specific training scheme and achieve 70.5% top-1 accuracy on the ImageNet dataset using the same achitecture as ReActNet while achieving 1.1% higher accuracy.

Citation

If you find our code useful for your research, please consider citing:

@conference{liu2021how,
title = {How do adam and training strategies help bnns optimization?},
author = {Liu, Zechun and Shen, Zhiqiang and Li, Shichao and Helwegen, Koen and Huang, Dong and Cheng, Kwang-Ting},
booktitle = {International Conference on Machine Learning},
year = {2021},
organization={PMLR}
}

Run

1. Requirements:

python3, pytorch 1.7.1, torchvision 0.8.2

2. Data:

Download ImageNet dataset

3. Steps to run:

(1) Step1: binarizing activations

Change directory to ./step1/
run bash run.sh

(2) Step2: binarizing weights + activations

Change directory to ./step2/
run bash run.sh

Models

Methods	Backbone	Top1-Acc	FLOPs	Trained Model
ReActNet	ReActNet-A	69.4%	0.87 x 10^8	Model-ReAct
AdamBNN	ReActNet-A	70.5%	0.87 x 10^8	Model-ReAct-AdamBNN-Training

Contact

Zechun Liu, HKUST and CMU (zliubq at connect.ust.hk / zechunl at andrew.cmu.edu)

Zhiqiang Shen, CMU (zhiqians at andrew.cmu.edu)

How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

Related tags

Overview

AdamBNN

Citation

Run

1. Requirements:

2. Data:

3. Steps to run:

Models

Contact

Owner

Zechun Liu

PuppetGAN - Cross-Domain Feature Disentanglement and Manipulation just got way better! 🚀

Implementation of Google Brain's WaveGrad high-fidelity vocoder

Here is the implementation of our paper S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations.

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.

Object tracking using YOLO and a tracker(KCF, MOSSE, CSRT) in openCV

A Python library created to assist programmers with complex mathematical functions

The official implementation of Theme Transformer

Proof-Of-Concept Piano-Drums Music AI Model/Implementation

Code and data for paper "Deep Photo Style Transfer"

CSD: Consistency-based Semi-supervised learning for object Detection

Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch

Pytorch implementation for "Adversarial Robustness under Long-Tailed Distribution" (CVPR 2021 Oral)

Totally Versatile Miscellanea for Pytorch

Iris prediction model is used to classify iris species created julia's DecisionTree, DataFrames, JLD2, PlotlyJS and Statistics packages.

A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

simple demo codes for Learning to Teach with Dynamic Loss Functions

Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss （ATVGnet）

The mini-AlphaStar (mini-AS, or mAS) - mini-scale version (non-official) of the AlphaStar (AS)