OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Related tags

Deep Learningoreo
Overview

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Video demo

We here provide a video demo from confounded Enduro environment (see Figure 8 of the main draft). We also visualize the spatial attention map from a convolutional encoder trained with BC (medium) and OREO (right).

Enduro_total_demo_cropped

Installation

OREO requires CUDA 10.1 to run.

Install the dependencies:

conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install dopamine_rl sklearn tqdm kornia dropblock atari-py==0.2.6 gsutil

Download DQN Replay dataset for expert demonstrations on Atari environments:

mkdir DATAPATH
cp download.sh DATAPATH
cd DATAPATH
sh download.sh

Pre-training

We here provide beta-VAE (for CCIL) and VQ-VAE (for CRLR and OREO) pretraining scripts. For other datasets, change the --env option.

beta-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_beta_vae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1 --ch_div 4 --lmd 10

VQ-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_vqvae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1

Training BC policy

We here provide training scripts for baselines and OREO. For other datasets, change the --env, --beta_vae_path, and --vqvae_path options.

Behavioral cloning

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --num_episodes 20 --num_eval_episodes 100

Dropout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --original_dropout --prob 0.5 --num_episodes 20 --num_eval_episodes 100

DropBlock

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --dropblock --prob 0.3 --num_episodes 20 --num_eval_episodes 100

Cutout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --input_cutout --num_episodes 20 --num_eval_episodes 100

RandomShift

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --random_shift --num_episodes 20 --num_eval_episodes 100

CCIL (w/o interaction)

CUDA_VISIBLE_DEVICES=0 python atari_beta_vae_actor.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --ch_div 4 --beta_vae_path models_beta_vae_coord_conv_chdiv4_actor_lmd10.0/KungFuMaster_s1_epi20_con1_seed1_zdim50_beta4_kltol0_ep1000_beta_vae.pth

CRLR

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor_crlr.py --fixed_size 15000 --num_sub_iters 10 --eval_interval 10 --save_interval 10 --n_epochs 10 --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth

OREO

CUDA_VISIBLE_DEVICES=0 python atari_vqvae_oreo.py --env=KungFuMaster --datapath DATAPATH --num_mask 5 --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth
Learning Logic Rules for Document-Level Relation Extraction

LogiRE Learning Logic Rules for Document-Level Relation Extraction We propose to introduce logic rules to tackle the challenges of doc-level RE. Equip

41 Dec 26, 2022
duralava is a neural network which can simulate a lava lamp in an infinite loop.

duralava duralava is a neural network which can simulate a lava lamp in an infinite loop. Example This is not a real lava lamp but a "fake" one genera

Maximilian Bachl 87 Dec 20, 2022
PyTorch implementation for paper Neural Marching Cubes.

NMC PyTorch implementation for paper Neural Marching Cubes, Zhiqin Chen, Hao Zhang. Paper | Supplementary Material (to be updated) Citation If you fin

Zhiqin Chen 109 Dec 27, 2022
[arXiv] What-If Motion Prediction for Autonomous Driving ❓🚗💨

WIMP - What If Motion Predictor Reference PyTorch Implementation for What If Motion Prediction [PDF] [Dynamic Visualizations] Setup Requirements The W

William Qi 96 Dec 29, 2022
[TPDS'21] COSCO: Container Orchestration using Co-Simulation and Gradient Based Optimization for Fog Computing Environments

COSCO Framework COSCO is an AI based coupled-simulation and container orchestration framework for integrated Edge, Fog and Cloud Computing Environment

imperial-qore 39 Dec 25, 2022
UMPNet: Universal Manipulation Policy Network for Articulated Objects

UMPNet: Universal Manipulation Policy Network for Articulated Objects Zhenjia Xu, Zhanpeng He, Shuran Song Columbia University Robotics and Automation

Columbia Artificial Intelligence and Robotics Lab 33 Dec 03, 2022
B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search

B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search This is the offical implementation of the

SNU ADSL 0 Feb 07, 2022
Additional code for Stable-baselines3 to load and upload models from the Hub.

Hugging Face x Stable-baselines3 A library to load and upload Stable-baselines3 models from the Hub. Installation With pip Examples [Todo: add colab t

Hugging Face 34 Dec 10, 2022
PyTorch implementation of some learning rate schedulers for deep learning researcher.

pytorch-lr-scheduler PyTorch implementation of some learning rate schedulers for deep learning researcher. Usage WarmupReduceLROnPlateauScheduler Visu

Soohwan Kim 59 Dec 08, 2022
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis

bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P

DRo 393 Dec 20, 2022
A library for researching neural networks compression and acceleration methods.

A library for researching neural networks compression and acceleration methods.

Intel Labs 100 Dec 29, 2022
Yolov5 + Deep Sort with PyTorch

딥소트 수정중 Yolov5 + Deep Sort with PyTorch Introduction This repository contains a two-stage-tracker. The detections generated by YOLOv5, a family of obj

1 Nov 26, 2021
PyTorch code accompanying the paper "Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning" (NeurIPS 2021).

HIGL This is a PyTorch implementation for our paper: Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning (NeurIPS 2021). Our cod

Junsu Kim 20 Dec 14, 2022
Count GitHub Stars ⭐

Count GitHub Stars per Day ⭐ Track GitHub stars per day over a date range to measure the open-source popularity of different repositories. Requirement

Ultralytics 20 Nov 20, 2022
QuadTree Attention for Vision Transformers (ICLR2022)

This repository contains codes for quadtree attention. This repo contains codes for feature matching, image classficiation, object detection and seman

tangshitao 222 Dec 28, 2022
CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer

CycleTransGAN-EVC CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer Demo emotion CycleTransGAN CycleTransGAN Cycle

24 Dec 15, 2022
Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechanism

Period-alternatives-of-Softmax Experimental Demo for our paper 'Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechani

slwang9353 0 Sep 06, 2021
PyTorch implementation of CloudWalk's recent work DenseBody

densebody_pytorch PyTorch implementation of CloudWalk's recent paper DenseBody. Note: For most recent updates, please check out the dev branch. Update

Lingbo Yang 401 Nov 19, 2022
Contains source code for the winning solution of the xView3 challenge

Winning Solution for xView3 Challenge This repository contains source code and pretrained models for my (Eugene Khvedchenya) solution to xView 3 Chall

Eugene Khvedchenya 51 Dec 30, 2022
Speedy Implementation of Instance-based Learning (IBL) agents in Python

A Python library to create single or multi Instance-based Learning (IBL) agents that are built based on Instance Based Learning Theory (IBLT) 1 Instal

0 Nov 18, 2021