Scaling Vision with Sparse Mixture of Experts

This repository contains the code for training and fine-tuning Sparse MoE models for vision (V-MoE) on ImageNet-21k, reproducing the results presented in the paper:

Scaling Vision with Sparse Mixture of Experts, by Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby.

We will soon provide a colab analysing one of the models that we have released, as well as "config" files to train from scratch and fine-tune checkpoints. Stay tuned.

Installation

Simply clone this repository.

The file requirements.txt lists the requirements that can be installed from PyPI. However, we recommend installing jax, flax and optax directly from GitHub, since we rely on recent features that are not yet part of any release.

In addition, you also have to clone the Vision Transformer repository, since we use some parts of it.

If you want to use RandAugment to train models (which we recommend if you train on ImageNet-21k or ILSVRC2012 from scratch), you must also clone the Cloud TPU repository and name the cloned directory cloud_tpu.
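Once everything is cloned and installed, a quick import check can catch a missing dependency before a training job starts. The snippet below is only a sketch: the helper name missing_modules is hypothetical, vit_jax is assumed to be the package name provided by the Vision Transformer repository, and cloud_tpu is assumed to be reachable from the Python import path (it is only needed for RandAugment).

```python
import importlib.util


def missing_modules(names):
    """Returns the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; adjust them to match your local checkout.
    required = ["jax", "flax", "optax", "vit_jax"]
    optional = ["cloud_tpu"]  # Only needed when training with RandAugment.
    print("Missing required modules:", missing_modules(required))
    print("Missing optional modules:", missing_modules(optional))
```

An empty list for the required modules means that all of them can be located. It does not verify that the installed versions are recent enough.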

Checkpoints

We release checkpoints containing the weights of some models that we trained on ImageNet (either ILSVRC2012 or ImageNet-21k). Each checkpoint consists of an index file (with .index extension) and one or more data files (with extension .data-nnnnn-of-NNNNN, called shards). In the following list, we indicate only the prefix of each checkpoint. We recommend using gsutil to list all the files of a checkpoint and to download them.

- V-MoE S/32, 8 experts on the last two odd blocks, trained from scratch on ILSVRC2012 with RandAugment: gs://vmoe_checkpoints/vmoe_s32_last2_ilsvrc2012_randaug_medium
- V-MoE B/16, 8 experts on every odd block, trained from scratch on ImageNet-21k with RandAugment: gs://vmoe_checkpoints/vmoe_b16_imagenet21k_randaug_strong
  - Fine-tuned on ILSVRC2012: gs://vmoe_checkpoints/vmoe_b16_imagenet21k_randaug_strong_ft_ilsvrc2012
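Since each prefix expands to an index file plus several shards, it can help to check that a download is complete. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes that shards are numbered from zero with five zero-padded digits, as in the usual TensorFlow checkpoint naming. The copy command uses the pattern prefix.* rather than prefix*, so that a prefix which is also the start of another checkpoint name (as with the B/16 checkpoint and its fine-tuned version) does not match both.

```python
import re

# Matches the shard suffix described above, e.g. ".data-00003-of-00008".
_SHARD_SUFFIX = re.compile(r"\.data-(\d{5})-of-(\d{5})$")


def gsutil_copy_command(prefix, destination):
    """Builds a gsutil command that copies all files of one checkpoint."""
    return ["gsutil", "-m", "cp", prefix + ".*", destination]


def expected_files(prefix, num_shards):
    """Lists the index file and all shard files (assumes zero-based shards)."""
    shards = [
        f"{prefix}.data-{i:05d}-of-{num_shards:05d}" for i in range(num_shards)
    ]
    return [prefix + ".index"] + shards


def missing_files(prefix, present):
    """Returns the expected files of a checkpoint that are not in `present`."""
    present = set(present)
    totals = set()
    for name in present:
        match = _SHARD_SUFFIX.search(name)
        if name.startswith(prefix + ".data-") and match:
            totals.add(int(match.group(2)))
    if len(totals) != 1:
        raise ValueError("Cannot infer a unique shard count from the files.")
    return [f for f in expected_files(prefix, totals.pop()) if f not in present]
```

For example, the list returned by gsutil_copy_command can be passed to subprocess.run to perform the download, and missing_files can then be applied to the names of the downloaded files. The shard count is read from the file names themselves, so at least one shard must be present.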

Disclaimers

This is not an officially supported Google product.

Owner

Google Research
