Code for "AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling"

Last update: Oct 26, 2022

Overview

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

This repository contains PyTorch evaluation code, training code and pretrained models for AttentiveNAS.

For details, see "AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling" by Dilin Wang, Meng Li, Chengyue Gong and Vikas Chandra.

If you find this project useful in your research, please consider citing:

@article{wang2020attentivenas,
  title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
  author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
  journal={arXiv preprint arXiv:2011.09011},
  year={2020}
}

Pretrained models and data

Download our pretrained AttentiveNAS models and the (sub-network, FLOPs) lookup table from Google Drive, and place them in the ./attentive_nas_data folder.

Evaluation

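To evaluate one of our pretrained AttentiveNAS models (AttentiveNAS-A0 through A6) on the ImageNet validation set with a single GPU, run the command below, replacing a[0-6] with the model to evaluate (a0, a1, ..., a6):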

python test_attentive_nas.py --config-file ./configs/eval_attentive_nas_models.yml --model a[0-6]

Expected results:

Name	MFLOPs	Top-1 (%)
AttentiveNAS-A0	203	77.3
AttentiveNAS-A1	279	78.4
AttentiveNAS-A2	317	78.8
AttentiveNAS-A3	357	79.1
AttentiveNAS-A4	444	79.8
AttentiveNAS-A5	491	80.1
AttentiveNAS-A6	709	80.7
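
To evaluate all seven models in one pass, the command above can be generated once per model. The following is a minimal sketch, not part of the repository: the helper names eval_commands and within_tolerance, and the 0.1-point tolerance, are assumptions made for illustration. The expected Top-1 values are copied from the table above.

```python
import shlex

# Config path taken from the evaluation command in this README.
CONFIG = "./configs/eval_attentive_nas_models.yml"

# Expected Top-1 accuracy (%) per model, copied from the results table.
EXPECTED_TOP1 = {
    "a0": 77.3, "a1": 78.4, "a2": 78.8, "a3": 79.1,
    "a4": 79.8, "a5": 80.1, "a6": 80.7,
}


def eval_commands(first=0, last=6):
    """Return one argument list per model, a0 through a6 by default."""
    return [
        ["python", "test_attentive_nas.py",
         "--config-file", CONFIG, "--model", f"a{i}"]
        for i in range(first, last + 1)
    ]


def within_tolerance(model, measured_top1, tol=0.1):
    """Check a measured Top-1 against the table (tol is an assumed margin)."""
    return abs(measured_top1 - EXPECTED_TOP1[model]) <= tol


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run to execute it.
    for cmd in eval_commands():
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or each argument list can be passed to subprocess.run(cmd, check=True) from the repository root.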

Training

To train our AttentiveNAS models from scratch, run the following on each machine:

python train_supernet.py --config-file configs/train_attentive_nas_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}

We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU; all training hyper-parameters are specified in train_attentive_nas_models.yml.
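
With 64 GPUs and 32 samples per GPU, the global mini-batch is 64 × 32 = 2048 samples per step. The sketch below works through that arithmetic and builds the training command for each machine rank. It is illustrative only: the helper names, the number of machines, and the example dist_url value are assumptions, and only the flags shown in the command above are used.

```python
import shlex


def global_batch_size(num_gpus=64, per_gpu_batch=32):
    """Total samples per optimizer step across all GPUs."""
    return num_gpus * per_gpu_batch


def launch_commands(num_machines, dist_url):
    """Return the training command for every machine rank.

    dist_url is passed through unchanged; its exact format depends on
    the cluster setup and is an assumption in the example below.
    """
    return [
        ["python", "train_supernet.py",
         "--config-file", "configs/train_attentive_nas_models.yml",
         "--machine-rank", str(rank),
         "--num-machines", str(num_machines),
         "--dist-url", dist_url]
        for rank in range(num_machines)
    ]


if __name__ == "__main__":
    print(global_batch_size())  # 64 * 32 = 2048
    # Hypothetical setup: 8 machines and a placeholder rendezvous address.
    for cmd in launch_commands(8, "tcp://10.0.0.1:23456"):
        print(shlex.join(cmd))
```

If you train on fewer GPUs, the global batch size shrinks proportionally, so the learning rate and related settings in the config file may need adjusting.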

License

The majority of AttentiveNAS is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Once For All is licensed under the Apache 2.0 license.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Owner

Facebook Research
