Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks.

Overview

The Lottery Ticket Hypothesis for Pre-trained BERT Networks

License: MIT

Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks. [NeurIPS 2020]

Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin.

Our implementation is based on Huggingface repo. Details are referred to README here. Pre-trained subnetworks are coming soon.

Overview

The Existence of Matching Subnetworks in BERT

Transfer Learning for BERT Winning Tickets

Method

Reproduce Details

Prerequisites and Installation

Details are referred to README here.

Iterative Magnitude Pruning (IMP)

MLM task:

python -u LT_pretrain.py 
	   --output_dir LT_pretrain_model
	   --model_type bert 
	   --model_name_or_path bert-base-uncased 
	   --train_data_file pretrain_data/en.train 
	   --do_train 
	   --eval_data_file pretrain_data/en.valid 
	   --do_eval 
	   --per_gpu_train_batch_size 16 
	   --per_gpu_eval_batch_size 16 
	   --evaluate_during_training 
	   --num_train_epochs 1 
	   --logging_steps 10000 
	   --save_steps 10000 
	   --mlm 
	   --overwrite_output_dir 
	   --seed 57

Glue task:

python -u LT_glue.py
	   --output_dir tmp/mnli 
	   --logging_steps 36813 
	   --task_name MNLI 
	   --data_dir glue_data/MNLI 
	   --model_type bert 
	   --model_name_or_path bert-base-uncased 
	   --do_train 
	   --do_eval 
	   --do_lower_case 
	   --max_seq_length 128 
	   --per_gpu_train_batch_size 32 
	   --learning_rate 2e-5 
	   --num_train_epochs 30 
	   --overwrite_output_dir 
	   --evaluate_during_training 
	   --save_steps 36813
	   --eval_all_checkpoints 
	   --seed 57

SQuAD task:

python -u squad_trans.py 
	   --output_dir tmp/530/squad 
	   --model_type bert 
	   --model_name_or_path bert-base-uncased 
       --do_train 
       --do_eval 
       --do_lower_case 
       --train_file SQuAD/train-v1.1.json 
       --predict_file SQuAD/dev-v1.1.json 
       --per_gpu_train_batch_size 16 
       --learning_rate 3e-5 
       --num_train_epochs 40 
       --max_seq_length 384 
       --doc_stride 128 
       --evaluate_during_training 
       --eval_all_checkpoints 
       --overwrite_output_dir 
       --logging_steps 22000 
       --save_steps 22000 
       --seed 57

One-shot Magnitude Pruning (OMP)

python oneshot.py --weight [pre or rand] --model [glue or squad or pretrain] --rate 0.5

Fine-tuning

MLM task:

python -u pretrain_trans.py 
	   --dir pre\  [using random weight or official pretrain weight]
	   --weight_pertub tmp/shuffle_weight.pt\ [weight for Bert (not required)]
	   --mask_dir tmp/dif_mask/pretrain_mask.pt \ [mask file]
	   --output_dir tmp/530/pre 
	   --model_type bert 
	   --model_name_or_path bert-base-uncased 
	   --train_data_file pretrain_data/en.train 
	   --do_train --eval_data_file pretrain_data/en.valid 
	   --do_eval 
	   --per_gpu_train_batch_size 8 
	   --per_gpu_eval_batch_size 8 
	   --evaluate_during_training 
	   --num_train_epochs 1 
	   --logging_steps 2000 
	   --save_steps 0 
	   --max_steps 20000  
	   --mlm 
	   --overwrite_output_dir 
	   --seed 57

Glue task:

python -u glue_trans.py 
       --dir pre \  [using random weight or official pretrain weight]
       --weight_pertub tmp/shuffle_weight.pt \ [weight for Bert (not required)]
       --mask_dir tmp/dif_mask/mnli_mask.pt \ [mask file]
       --output_dir tmp/530/mnli 
       --logging_steps 12271 
       --task_name MNLI 
       --data_dir glue_data/MNLI 
       --model_type bert 
       --model_name_or_path bert-base-uncased 
       --do_train 
       --do_eval 
       --do_lower_case 
       --max_seq_length 128 
       --per_gpu_train_batch_size 32 
       --learning_rate 2e-5 
       --num_train_epochs 3 
       --overwrite_output_dir 
       --evaluate_during_training 
       --save_steps 0 
       --eval_all_checkpoints 
       --seed 5

SQuAD task:

python -u squad_trans.py 
	   --dir pre \  [using random weight or official pretrain weight]
	   --weight_pertub tmp/shuffle_weight.pt \ [weight for Bert (not required)]
	   --mask_dir tmp/dif_mask/squad_mask.pt \ [mask file]
	   --output_dir tmp/530/squad 
	   --model_type bert 
	   --model_name_or_path bert-base-uncased 
	   --do_train 
	   --do_eval 
	   --do_lower_case 
	   --train_file SQuAD/train-v1.1.json 
	   --predict_file SQuAD/dev-v1.1.json 
	   --per_gpu_train_batch_size 16 
	   --learning_rate 3e-5 
	   --num_train_epochs 4 
	   --max_seq_length 384 
	   --doc_stride 128 
	   --evaluate_during_training 
	   --eval_all_checkpoints 
	   --overwrite_output_dir 
	   --logging_steps 5500 
	   --save_steps 0 
	   --seed 57

Subnetwork with Ramdomly Suffuled Pre-trined Weight

python pertub_weight.py

Citation

If you use this code for your research, please cite our paper:

@misc{chen2020lottery,
    title={The Lottery Ticket Hypothesis for Pre-trained BERT Networks},
    author={Tianlong Chen and Jonathan Frankle and Shiyu Chang and Sijia Liu and Yang Zhang and Zhangyang Wang and Michael Carbin},
    year={2020},
    eprint={2007.12223},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Acknowlegement

We would like to express our deepest gratitude to the MIT-IBM Watson AI Lab. In particular, we would like to thank John Cohn for his generous help in providing us with the computing resources necessary to conduct this research.

Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
Official implementation for “Unsupervised Low-Light Image Enhancement via Histogram Equalization Prior”

HEP Unsupervised Low-Light Image Enhancement via Histogram Equalization Prior Implementation Python3 PyTorch=1.0 NVIDIA GPU+CUDA Training process The

FengZhang 34 Dec 04, 2022
An index of recommendation algorithms that are based on Graph Neural Networks.

An index of recommendation algorithms that are based on Graph Neural Networks.

FIB LAB, Tsinghua University 564 Jan 07, 2023
banditml is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.

banditml is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services. This library is developed by Bandit ML and ex-authors of Facebook's app

Bandit ML 51 Dec 22, 2022
Data-depth-inference - Data depth inference with python

Welcome! This readme will guide you through the use of the code in this reposito

Marco 3 Feb 08, 2022
Implementation of RegretNet with Pytorch

Dependencies are Python 3, a recent PyTorch, numpy/scipy, tqdm, future and tensorboard. Plotting with Matplotlib. Implementation of the neural network

Horris zhGu 1 Nov 05, 2021
This is the repository of shape matching algorithm Iterative Rotations and Assignments (IRA)

Description This is the repository of shape matching algorithm Iterative Rotations and Assignments (IRA), described in the publication [1]. Directory

MAMMASMIAS Consortium 6 Nov 14, 2022
Realistic lighting in ursina!

Ursina Lighting Realistic lighting in ursina! If you want to have realistic lighting in ursina, import the UrsinaLighting.py in your project and use t

17 Jul 07, 2022
This repository includes different versions of the prescribed-time controller as Simulink blocks and MATLAB script codes for engineering applications.

Prescribed-time Control Prescribed-time control (PTC) blocks in Simulink environment, MATLAB R2020b. For more theoretical details, refer to the papers

Amir Shakouri 1 Mar 11, 2022
Additional environments compatible with OpenAI gym

Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning A codebase for training reinforcement learning policies for quad

Zhehui Huang 40 Dec 06, 2022
RoadMap and preparation material for Machine Learning and Data Science - From beginner to expert.

ML-and-DataScience-preparation This repository has the goal to create a learning and preparation roadMap for Machine Learning Engineers and Data Scien

33 Dec 29, 2022
An efficient PyTorch library for Global Wheat Detection using YOLOv5. The project is based on this Kaggle competition Global Wheat Detection (2021).

Global-Wheat-Detection An efficient PyTorch library for Global Wheat Detection using YOLOv5. The project is based on this Kaggle competition Global Wh

Chuxin Wang 11 Sep 25, 2022
Binary classification for arrythmia detection with ECG datasets.

HEART DISEASE AI DATATHON 2021 [Eng] / [Kor] #English This is an AI diagnosis modeling contest that uses the heart disease echocardiography and electr

HY_Kim 3 Jul 14, 2022
The official codes of "Semi-supervised Models are Strong Unsupervised Domain Adaptation Learners".

SSL models are Strong UDA learners Introduction This is the official code of paper "Semi-supervised Models are Strong Unsupervised Domain Adaptation L

Yabin Zhang 26 Dec 26, 2022
A mini-course offered to Undergrad chemistry students

The best way to use this material is by forking it by click the Fork button at the top, right corner. Then you will get your own copy to play with! Th

Raghu 19 Dec 19, 2022
Colour detection is necessary to recognize objects, it is also used as a tool in various image editing and drawing apps.

Colour Detection On Image Colour detection is the process of detecting the name of any color. Simple isn’t it? Well, for humans this is an extremely e

Astitva Veer Garg 1 Jan 13, 2022
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings

Yan Song 55 Dec 15, 2022
ETMO: Evolutionary Transfer Multiobjective Optimization

ETMO: Evolutionary Transfer Multiobjective Optimization To promote the research on ETMO, benchmark problems are of great importance to ETMO algorithm

Songbai Liu 0 Mar 16, 2021
Code for paper "Learning to Reweight Examples for Robust Deep Learning"

learning-to-reweight-examples Code for paper Learning to Reweight Examples for Robust Deep Learning. [arxiv] Environment We tested the code on tensorf

Uber Research 261 Jan 01, 2023
Code accompanying the paper "How Tight Can PAC-Bayes be in the Small Data Regime?"

How Tight Can PAC-Bayes be in the Small Data Regime? This is the code to reproduce all experiments for the following paper: @inproceedings{Foong:2021:

5 Dec 21, 2021
A python bot to move your mouse every few seconds to appear active on Skype, Teams or Zoom as you go AFK. 🐭 🤖

PyMouseBot If you're from GT and annoyed with SGVPN idle timeouts while working on development laptop, You might find this useful. A python cli bot to

Oaker Min 6 Oct 24, 2022