Revisiting Self-Training for Few-Shot Learning of Language Model.

Last update: Nov 19, 2022

Related tags

Overview

SFLM

This is the implementation of the paper Revisiting Self-Training for Few-Shot Learning of Language Model. SFLM is short for self-training for few-shot learning of language model.

Requirements

To run our code, please install all the dependency packages by using the following command:

pip install -r requirements.txt

Preprocess

The original data can be found from LM-BFF. To generate data for the few-shot experiments, please run the below command:

python tools/generate_data.py

The original data shall be in ./data/original, and the sampled data will be in ./data/few-shot/$K-$MU-$SEED. Please refer to ./tools/generate_data.py for more options.

Train

Our code can be run as the below example:

python3 run.py \
  --task_name SST-2 \
  --data_dir data/few-shot/SST-2/16-4-100 \
  --do_train \
  --do_eval \
  --do_predict \
  --evaluate_during_training \
  --model_name_or_path roberta-base \
  --few_shot_type prompt-demo \
  --num_k 16 \
  --max_seq_length 256 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 16 \
  --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --max_steps 1000 \
  --logging_steps 100 \
  --eval_steps 100 \
  --num_train_epochs 0 \
  --output_dir result/SST-2-16-4-100 \
  --save_logit_dir result/SST-2-16-4-100 \
  --seed 100 \
  --template "*cls**sent_0*_It_was*mask*.*sep+*" \
  --mapping "{'0':'terrible','1':'great'}" \
  --num_sample 16 \
  --threshold 0.95 \
  --lam1 0.5 \
  --lam2 0.1

Most arguments are the same as LM-BFF, and the same manual prompts are used in our experiments. We list additional arguments used in SFLM:

threshold: The threshold used to filter out low-confidence samples for self-training loss
lam1: The weight of self-training loss
lam2: The weight of self-supervised loss

Citation

Please cite our paper if you use SFLM in your work:

@inproceedings{chen2021revisit,        
    title={Revisiting Self-Training for Few-Shot Learning of Language Model},         
    author={Chen, Yiming and Zhang, Yan and Zhang, Chen and Lee, Grandee and Cheng, Ran and Li, Haizhou},         
    booktitle={EMNLP},        
    year={2021},
}

Acknowledgements

Code is implemented based on LM-BFF. We would like to thank the authors of LM-BFF for making their code public.

Revisiting Self-Training for Few-Shot Learning of Language Model.

Related tags

Overview

SFLM

Requirements

Preprocess

Train

Citation

Acknowledgements

Owner

Implementation of algorithms for continuous control (DDPG and NAF).

Implicit Graph Neural Networks

LAMDA: Label Matching Deep Domain Adaptation

(NeurIPS '21 Spotlight) IQ-Learn: Inverse Q-Learning for Imitation

Feedback is important: response-aware feedback mechanism for background based conversation

Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds."

🐦 Opytimizer is a Python library consisting of meta-heuristic optimization techniques.

Conditional Generative Adversarial Networks (CGAN) for Mobility Data Fusion

I will implement Fastai in each projects present in this repository.

Code release for General Greedy De-bias Learning

General purpose Slater-Koster tight-binding code for electronic structure calculations

This is a collection of our NAS and Vision Transformer work.

Rule-based Customer Segmentation

CBREN: Convolutional Neural Networks for Constant Bit Rate Video Quality Enhancement

Benchmarks for Object Detection in Aerial Images

A package for music online and offline rhythmic information analysis including music Beat, downbeat, tempo and meter tracking.

A Real-World Benchmark for Reinforcement Learning based Recommender System

Keras code and weights files for popular deep learning models.

Boosted neural network for tabular data