[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets

Introduction

This repo contains the source code accompanying the paper:

Well-tuned Simple Nets Excel on Tabular Datasets

Authors: Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka

Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performance of neural networks lies in rethinking the joint and simultaneous application of a large set of modern regularization techniques. As a result, we propose regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination/cocktail of 13 regularization techniques for each dataset using a joint optimization over the decision on which regularizers to apply and their subsidiary hyperparameters.

We empirically assess the impact of these regularization cocktails for MLPs in a large-scale study comprising 40 tabular datasets and demonstrate that: (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.

News: Our work has been accepted at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021).

Setting up the virtual environment

Our work is built on top of AutoPyTorch. To look at our implementation of the regularization cocktail ingredients, you can do the following:

git clone https://github.com/automl/Auto-PyTorch.git
cd Auto-PyTorch/
git checkout regularization_cocktails

To install the version of AutoPyTorch that features our work, you can use these additional commands:

# The following commands assume the user is in the cloned directory
conda create -n reg_cocktails python=3.8
conda activate reg_cocktails
conda install gxx_linux-64 gcc_linux-64 swig
cat requirements.txt | xargs -n 1 -L 1 pip install
python setup.py install

Running the Regularization Cocktail code

The main files for running the regularization cocktails are main_experiment.py and refit_experiment.py, both in the cocktails folder. The first module starts a full HPO search. The second refits on datasets where the time budget did not suffice to both perform the full HPO search and complete the refit of the incumbent hyperparameter configuration.

The main arguments for main_experiment.py:

  • --task_id: The OpenML task id, which determines the dataset used in the experiment.
  • --wall_time: The total runtime, covering both the HPO search and the final refit.
  • --func_eval_time: The maximal time for one function evaluation parametrized by a certain hyperparameter configuration.
  • --epochs: The number of epochs for one hyperparameter configuration to be evaluated on.
  • --seed: The seed to be used for the run.
  • --tmp_dir: The temporary directory for the results to be stored in.
  • --output_dir: The output directory for the results to be stored in.
  • --nr_workers: The number of workers which corresponds to the number of hyperparameter configurations run in parallel.
  • --nr_threads: The number of threads.
  • --cash_cocktail: An important flag that activates the regularization cocktail formulation.
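As an illustration, the arguments listed above could be declared with argparse roughly as follows. This is a minimal sketch, not the parser used in main_experiment.py: the types, the default values, and the str_to_bool helper are assumptions made for this example, chosen so that a value such as `--cash_cocktail True` parses as a boolean.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: turn 'True'/'False' style strings into booleans."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser; all types and defaults here are assumptions."""
    parser = argparse.ArgumentParser(description="Regularization cocktail run (sketch)")
    parser.add_argument("--task_id", type=int, required=True, help="OpenML task id")
    parser.add_argument("--wall_time", type=int, default=600, help="total runtime in seconds")
    parser.add_argument("--func_eval_time", type=int, default=60, help="limit per evaluation in seconds")
    parser.add_argument("--epochs", type=int, default=10, help="epochs per configuration")
    parser.add_argument("--seed", type=int, default=42, help="random seed")
    parser.add_argument("--tmp_dir", type=str, default="./tmp", help="temporary directory")
    parser.add_argument("--output_dir", type=str, default="./output", help="output directory")
    parser.add_argument("--nr_workers", type=int, default=1, help="configurations run in parallel")
    parser.add_argument("--nr_threads", type=int, default=1, help="number of threads")
    parser.add_argument("--cash_cocktail", type=str_to_bool, default=False,
                        help="activate the regularization cocktail formulation")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv to keep the sketch self-contained.
    args = build_parser().parse_args(["--task_id", "233088", "--cash_cocktail", "True"])
    print(args.task_id, args.wall_time, args.cash_cocktail)
```

Check main_experiment.py itself for the authoritative argument definitions and defaults.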

A minimal example of running the regularization cocktails:

python main_experiment.py --task_id 233088 --wall_time 600 --func_eval_time 60 --epochs 10 --seed 42 --cash_cocktail True

The example above will run the regularization cocktails for 10 minutes on task 233088, with a limit of 60 seconds per function evaluation. Every hyperparameter configuration will be evaluated for 10 epochs, and the seed 42 will be used for the experiment and the data splits.
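To get a feeling for how the two time limits in the command interact, the following back-of-the-envelope sketch counts how many evaluations fit into the wall time if every evaluation uses its full limit. The function name is hypothetical, and the calculation is an assumption-laden simplification: it ignores the share of the wall time spent on the final refit as well as any scheduling overhead, so real runs will differ.

```python
def worst_case_evaluations(wall_time: int, func_eval_time: int, nr_workers: int = 1) -> int:
    """Evaluations that fit if each one runs for the full func_eval_time.

    Simplification: refit time and optimizer overhead are not accounted for.
    """
    if wall_time < 0 or func_eval_time <= 0 or nr_workers <= 0:
        raise ValueError("wall_time must be >= 0; func_eval_time and nr_workers must be > 0")
    return (wall_time // func_eval_time) * nr_workers


# Values from the command above: --wall_time 600 --func_eval_time 60
print(worst_case_evaluations(600, 60))  # 10
```

Evaluations that finish before their limit leave room for more configurations, so this number is a pessimistic estimate rather than a cap.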

A minimal example of running only one regularization method:

python main_experiment.py --task_id 233088 --wall_time 600 --func_eval_time 60 --epochs 10 --seed 42 --use_weight_decay

If you would like to investigate individual regularization methods, look at the arguments that control them in main_experiment.py. Additionally, if you want to remove the limit on the number of hyperparameter configurations, you can remove the following lines:

smac_scenario_args={
    'runcount_limit': number_of_configurations_limit,
}
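Instead of deleting those lines, the limit could also be made optional. The sketch below is only one possible way to do this: the helper name is hypothetical, and how the resulting dictionary is passed on to AutoPyTorch is not shown here and should be taken from main_experiment.py.

```python
from typing import Any, Dict, Optional


def build_smac_scenario_args(number_of_configurations_limit: Optional[int]) -> Dict[str, Any]:
    """Hypothetical helper: include 'runcount_limit' only when a limit is given."""
    smac_scenario_args: Dict[str, Any] = {}
    if number_of_configurations_limit is not None:
        smac_scenario_args["runcount_limit"] = number_of_configurations_limit
    return smac_scenario_args


print(build_smac_scenario_args(840))   # {'runcount_limit': 840}
print(build_smac_scenario_args(None))  # {}
```

The value 840 is an arbitrary placeholder for this example, not a setting taken from the repository.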

Plots

The plots included in our paper were generated by the functions in the module results.py. Most of the functions that produce the baseline diagrams and plots expect the following folder structure, which is also noted in most of their docstrings:

common_result_folder/baseline/results.csv

There are functions inside the module itself that generate the results.csv files.
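To check that a result folder follows this layout before plotting, something like the following sketch could be used. It is not part of results.py: the function names are hypothetical, and because the columns of results.csv are not described here, the rows are read generically as dictionaries.

```python
import csv
from pathlib import Path
from typing import Dict, List


def find_result_files(common_result_folder: str) -> Dict[str, Path]:
    """Map each baseline sub-folder name to its results.csv, if present."""
    root = Path(common_result_folder)
    if not root.is_dir():
        return {}
    return {path.parent.name: path for path in sorted(root.glob("*/results.csv"))}


def read_rows(path: Path) -> List[Dict[str, str]]:
    """Read a results.csv into a list of row dictionaries (column names unknown here)."""
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # 'common_result_folder' is the placeholder name from the layout above.
    for baseline, csv_path in find_result_files("common_result_folder").items():
        print(baseline, len(read_rows(csv_path)))
```

A baseline folder that is missing its results.csv simply does not appear in the returned mapping, which makes gaps easy to spot.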

Baselines

The code for running the baselines can be found in the baselines folder.

  • TabNet, XGBoost, CatBoost can be found in the baselines/bohb folder.
  • The other baselines, such as AutoGluon, auto-sklearn and Node, can be found in the folders named after them.

TabNet, XGBoost, CatBoost and AutoGluon have the same two main files as our regularization cocktails, main_experiment.py and refit_experiment.py.

Citation

@article{kadra2021regularization,
  title={Regularization is all you Need: Simple Neural Nets can Excel on Tabular Data},
  author={Kadra, Arlind and Lindauer, Marius and Hutter, Frank and Grabocka, Josif},
  journal={arXiv preprint arXiv:2106.11189},
  year={2021}
}