This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Last update: Nov 02, 2021

Related tags

Overview

Wide-Networks

This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameterizations of NNs as defined by (Yang & Hu 2021). Although an equivalent description can be given using only ac-parameterizations, we keep the 3 scales (a, b and c) in the code to allow more flexibility depending on how we want to approach the problem of dealing with infinitely wide NNs.

Structure of the code

The BaseModel class

All the code related to neural networks is in the directory pytorch. The different models we have implemented are in this directory along with the base class found in the file base_model.py which implements the generic attributes and methods all our NNs classes will share.

The BaseModel class inherits from the Pytorch Lightning module, and essentially defines the necessary attributes for any NN to work properly, namely the architecture (which is defined in the _build_model() method), the activation function (we consider the same activation function at each layer), the loss function, the optimizer and the initializer for the parameters of the network.

Optionally, the BaseModel class can define attributes for the normalization (e.g. BatchNorm, LayerNorm, etc) and the scheduler, and any of the aforementioned attributes (optional or not) can be customized depending on the needs (see examples for the scheduler of ipllr and the initializer of abc_param).

The ModelConfig class

All the hyper-parameters which define the model (depth, width, activation function name, loss name, optimizer name, etc) have to be passed as argument to _init_() as an object of the class ModelConfig (pytorch/configs/model.py). This class reads from a yaml config file which defines all the necessary objects for a NN (see examples in pytorch/configs). Essentially, the class ModelConfig is here so that one only has to set the yaml config file properly and then the attributes are correctly populated in BaseModel via the class ModelConfig.

abc-parameterizations

The code for abc-parameterizations (Yang & Hu 2021) can be found in pytorch/abc_params. There we define the base class for abc-parameterizations, mainly setting the layer, init and lr scales from the values of a,b,c, as well as defining the initial parameters through Gaussians of appropriate variance depending on the value of b and the activation function.

Everything that is architecture specific (fully-connected, conv, residual, etc) is left out of this base class and has to be implemented in the _build_model() method of the child class (see examples in pytorch/abc_params/fully_connected). We also define there the base classes for the ntk, muP (Yang & Hu 2021), ip and ipllr parameterizations, and there fully-connected implementations in pytorch/abc_params/fully_connected.

Experiment runs

Setup

Before running any experiment, make sure you first install all the necessary packages:

pip3 install -r requirements.txt

You can optionally create a virtual environment through

python3 -m venv your_env_dir

then activate it with

source your_env_dir/bin/activate

and then install the requirements once the environment is activated. Now, if you haven't installed the wide-networks library in site-packages, before running the command for your experiment, make sure you first add the wide-networks library to the PYTHONPATH by running the command

export PYTHONPATH=$PYTHONPATH:"$PWD"

from the root directory (wide-networks/.) of where the wide-networks library is located.

Python jobs

We define python jobs which can be run with arguments from the command line in the directory jobs. Mainly, those jobs launch a training / val / test pipeline for a given model using the Lightning module, and the results are collected in a dictionary which is saved to a pickle file a the end of training for later examination. Additionally, metrics are logged in TensorBoard and can be visualized during training with the command

tensorboard --logdir=`your_experiment_dir`

We have written jobs to launch experiments on MNIST and CIFAR-10 with the fully connected version of different models such as muP (Yang & Hu 2021), IP-LLR, Naive-IP which can be found in jobs/abc_parameterizations. Arguments can be passed to those Python scripts through the command line, but they are optional and the default values will be used if the parameters of the script are not manually set. For example, the command

python3 jobs/abc_parameterizations/fc_muP_run.py --activation="relu" --n_steps=600 --dataset="mnist"

will launch a training / val / test pipeline with ReLU as the activation function, 600 SGD steps and the MNIST dataset. The other parameters of the run (e.g. the base learning rate and batch size) will have their default values. The jobs will automatically create a directory (and potentially subdirectories) for the experiment and save there the python logs, the tensorboard events and the results dictionary saved to a pickle file as well as the checkpoints saved for the network.

Visualizing results

To visualize the results after training for a given experiment, one can launch the notebook experiments-results.ipynb located in pytorch/notebooks/training/abc_parameterizations, and simply change the arguments in the "Set variables" cell to load the results from the corresponding experiment. Then running all the cells will produce (and save) some figures related to the training phase (e.g. loss vs. steps).

This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Related tags

Overview

Wide-Networks

Structure of the code

The BaseModel class

The ModelConfig class

abc-parameterizations

Experiment runs

Setup

Python jobs

Visualizing results

Owner

Karl Hajjar

PyTorch code accompanying the paper "Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning" (NeurIPS 2021).

Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

Accelerating BERT Inference for Sequence Labeling via Early-Exit

3rd Place Solution for ICCV 2021 Workshop SSLAD Track 3A - Continual Learning Classification Challenge

PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

A scientific and useful toolbox, which contains practical and effective long-tail related tricks with extensive experimental results

Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Disease Informed Neural Networks (DINNs) — neural networks capable of learning how diseases spread, forecasting their progression, and finding their unique parameters (e.g. death rate).

Code for "Diffusion is All You Need for Learning on Surfaces"

An official source code for paper Deep Graph Clustering via Dual Correlation Reduction, accepted by AAAI 2022

Global-Local Context Network for Person Search

Tool for live presentations using manim

Predicting Tweet Sentiment Maching Learning and streamlit

Source code for Transformer-based Multi-task Learning for Disaster Tweet Categorisation (UCD's participation in TREC-IS 2020A, 2020B and 2021A).

Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

Code for reproducing key results in the paper "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets"

FairEdit: Preserving Fairness in Graph Neural Networks through Greedy Graph Editing

This is a code repository for the paper "Graph Auto-Encoders for Financial Clustering".

Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code

PyTorch wrapper for Taichi data-oriented class