Neuron Merging: Compensating for Pruned Neurons (NeurIPS 2020)

Overview

Neuron Merging: Compensating for Pruned Neurons

Pytorch implementation of Neuron Merging: Compensating for Pruned Neurons, accepted at 34th Conference on Neural Information Processing Systems (NeurIPS 2020).

Requirements

To install requirements:

conda env create -f ./environment.yml

Python environment & main libraries:

  • python 3.8
  • pytorch 1.5.0
  • scikit-learn 0.22.1
  • torchvision 0.6.0

LeNet-300-100

To test LeNet-300-100 model on FashionMNIST, run:

bash scripts/LeNet_300_100_FashionMNIST.sh -t [model type] -c [criterion] -r [pruning ratio]

You can use three arguments for this script:

  • model type: original | prune | merge
  • pruning criterion : l1-norm | l2-norm | l2-GM
  • pruning ratio : 0.0 ~ 1.0

For example, to test the model after pruning 50% of the neurons with $l_1$-norm criterion, run:

bash scripts/LeNet_300_100_FashionMNIST.sh -t prune -c l1-norm -r 0.5

To test the model after merging , run:

bash scripts/LeNet_300_100_FashionMNIST.sh -t merge -c l1-norm -r 0.5

VGG-16

To test VGG-16 model on CIFAR-10, run:

bash scripts/VGG16_CIFAR10.sh -t [model type] -c [criterion]

You can use two arguments for this script

  • model type: original | prune | merge
  • pruning criterion: l1-norm | l2-norm | l2-GM

As a pretrained model on CIFAR-100 is not included, you must train it first. To train VGG-16 on CIFAR-100, run:

bash scripts/VGG16_CIFAR100_train.sh

All the hyperparameters are as described in the supplementary material.

After training, to test VGG-16 model on CIFAR-100, run:

bash scripts/VGG16_CIFAR100.sh -t [model type] -c [criterion]

You can use two arguments for this script

  • model type: original | prune | merge
  • pruning criterion: l1-norm | l2-norm | l2-GM

ResNet

To test ResNet-56 model on CIFAR-10, run:

bash scripts/ResNet56_CIFAR10.sh -t [model type] -c [criterion] -r [pruning ratio]

You can use three arguments for this script

  • model type: original | prune | merge
  • pruning method : l1-norm | l2-norm | l2-GM
  • pruning ratio : 0.0 ~ 1.0

To test WideResNet-40-4 model on CIFAR-10, run:

bash scripts/WideResNet_40_4_CIFAR10.sh -t [model type] -c [criterion] -r [pruning ratio]

You can use three arguments for this script

  • model type: original | prune | merge
  • pruning method : l1-norm | l2-norm | l2-GM
  • pruning ratio : 0.0 ~ 1.0

Results

Our model achieves the following performance on (without fine-tuning) :

Image classification of LeNet-300-100 on FashionMNIST

Baseline Accuracy : 89.80%

Pruning Ratio Prune ($l_1$-norm) Merge
50% 88.40% 88.69%
60% 85.17% 86.92%
70% 71.26% 82.75%
80% 66.76 80.02%

Image classification of VGG-16 on CIFAR-10

Baseline Accuracy : 93.70%

Criterion Prune Merge
$l_1$-norm 88.70% 93.16%
$l_2$-norm 89.14% 93.16%
$l_2$-GM 87.85% 93.10%

Citation

@inproceedings{kim2020merging,
  title     = {Neuron Merging: Compensating for Pruned Neurons},
  author    = {Kim, Woojeong and Kim, Suhyun and Park, Mincheol and Jeon, Geonseok},
  booktitle = {Advances in Neural Information Processing Systems 33},
  year      = {2020}
}
Owner
Woojeong Kim
Woojeong Kim
Repository for paper "Non-intrusive speech intelligibility prediction from discrete latent representations"

Non-Intrusive Speech Intelligibility Prediction from Discrete Latent Representations Official repository for paper "Non-Intrusive Speech Intelligibili

Alex McKinney 5 Oct 25, 2022
A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.

IconQA About IconQA is a new diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and c

Pan Lu 24 Dec 30, 2022
Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21)

AdvRush Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21) Environmental Set-up Python == 3.6.12, PyTorch =

11 Dec 10, 2022
The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.

SeaLion is designed to teach today's aspiring ml-engineers the popular machine learning concepts of today in a way that gives both intuition and ways of application. We do this through concise algori

Anish 324 Dec 27, 2022
DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation This project hosts the code for implementing the DCT-MASK algorithms

Alibaba Cloud 57 Nov 27, 2022
A high-performance anchor-free YOLO. Exceeding yolov3~v5 with ONNX, TensorRT, NCNN, and Openvino supported.

YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our rep

7.7k Jan 06, 2023
Attempt at implementation of a simple GAN using Keras

Simple GAN This is my attempt to make a wrapper class for a GAN in keras which can be used to abstract the whole architecture process. Simple GAN Over

Deven96 7 May 23, 2019
Simulate genealogical trees and genomic sequence data using population genetic models

msprime msprime is a population genetics simulator based on tskit. Msprime can simulate random ancestral histories for a sample of individuals (consis

Tskit developers 150 Dec 14, 2022
Implementation of "Efficient Regional Memory Network for Video Object Segmentation" (Xie et al., CVPR 2021).

RMNet This repository contains the source code for the paper Efficient Regional Memory Network for Video Object Segmentation. Cite this work @inprocee

Haozhe Xie 76 Dec 14, 2022
LONG-TERM SERIES FORECASTING WITH QUERYSELECTOR – EFFICIENT MODEL OF SPARSEATTENTION

Query Selector Here you can find code and data loaders for the paper https://arxiv.org/pdf/2107.08687v1.pdf . Query Selector is a novel approach to sp

MORAI 62 Dec 17, 2022
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)

Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19) Tianyu Wang*, Xin Yang*, Ke Xu, Shaozhe Chen, Qiang Zhang, Ry

Steve Wong 177 Dec 01, 2022
It is an open dataset for object detection in remote sensing images.

RSOD-Dataset It is an open dataset for object detection in remote sensing images. The dataset includes aircraft, oiltank, playground and overpass. The

136 Dec 08, 2022
Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Readme File for "Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis" by Ham, Imai, and Janson. (2022) All scripts were written and

0 Jan 27, 2022
Python wrappers to the C++ library SymEngine, a fast C++ symbolic manipulation library.

SymEngine Python Wrappers Python wrappers to the C++ library SymEngine, a fast C++ symbolic manipulation library. Installation Pip See License section

136 Dec 28, 2022
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming

Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification.

YerevaNN 75 Nov 06, 2022
ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
Explainability of the Implications of Supervised and Unsupervised Face Image Quality Estimations Through Activation Map Variation Analyses in Face Recognition Models

Explainable_FIQA_WITH_AMVA Note This is the official repository of the paper: Explainability of the Implications of Supervised and Unsupervised Face I

3 May 08, 2022
Bagua is a flexible and performant distributed training algorithm development framework.

Bagua is a flexible and performant distributed training algorithm development framework.

786 Dec 17, 2022