Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Last update: Nov 22, 2022

This is the official repository for Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning. We provide the commands to run the PETS and PlaNet experiments included in the paper. The repository is deliberately kept minimal for ease of experimentation.

Installation

This repository requires Python 3.6 and PyTorch 1.3 or above. Run the following command to create a conda environment (tested with CUDA 10.2):

conda env create -f environment.yml

Experiments

To run the PETS experiments on the HalfCheetah environment used in our ablation study, run:

cd cap-pets

CAP

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --learn_kappa --seed 1

CAP with fixed kappa

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --kappa 1.0 --seed 1

CCEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --seed 1

CEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--seed 1
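
The four PETS commands above differ only in their constraint and penalty flags, so a sweep over several seeds can be scripted. The following is a minimal sketch, not part of the repository: the variant labels (cap, cap-fixed, ccem, cem), the helper functions, and the DRY_RUN and SEEDS variables are hypothetical names introduced here, while every flag is copied from the commands above. By default the script only prints the commands; it assumes it is launched from the same directory as the commands above.

```shell
#!/usr/bin/env bash
# Sketch: sweep the four PETS variants over several seeds.
# DRY_RUN, SEEDS and the variant labels are hypothetical helpers, not repository options.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command; 0 = actually run it
SEEDS="${SEEDS:-1 2 3}"   # seeds to sweep (the commands above only show --seed 1)

# Flags shared by every variant, copied from the commands above.
COMMON="--algo cem --env HalfCheetah-v3 --cost_lim 152"

# Print the extra flags that distinguish one variant from the others.
variant_flags() {
  case "$1" in
    cap)       echo "--cost_constrained --penalize_uncertainty --learn_kappa" ;;
    cap-fixed) echo "--cost_constrained --penalize_uncertainty --kappa 1.0" ;;
    ccem)      echo "--cost_constrained" ;;
    cem)       echo "" ;;
    *)         echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

# Assemble the full command line for one variant and one seed.
build_cmd() {
  local flags
  flags="$(variant_flags "$1")"
  # ${flags:+ $flags} adds a leading space only when flags is non-empty.
  echo "python cap-pets/run_cap_pets.py ${COMMON}${flags:+ $flags} --seed $2"
}

for variant in cap cap-fixed ccem cem; do
  for seed in $SEEDS; do
    cmd="$(build_cmd "$variant" "$seed")"
    if [ "$DRY_RUN" = "1" ]; then
      echo "$cmd"
    else
      # Unquoted on purpose so the string splits into the program and its arguments.
      $cmd
    fi
  done
done
```

Setting DRY_RUN=0 runs the jobs one after another; since each run is independent, the printed commands could also be handed to a job scheduler instead.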

The commands for the PlaNet experiment on the CarRacing environment are:

CAP

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--learn-kappa --penalty-kappa 0.1 \
--id CarRacing-cap --seed 1

CAP with fixed kappa

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--penalty-kappa 1.0 \
--id CarRacing-kappa1 --seed 1

CCEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained \
--id CarRacing-ccem --seed 1

CEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--id CarRacing-cem --seed 1

Contact

If you have any questions regarding the code or paper, feel free to contact [email protected] or open an issue on this repository.

Acknowledgement

This repository contains code adapted from the following repositories: PETS and PlaNet. We thank the authors and contributors for open-sourcing their code.
