Source code for Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Last update: Sep 16, 2022

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Official implementation of ACC, described in the paper "Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning". The source code is based on the PyTorch implementation of TQC, which in turn is based on TD3. We thank the authors for making their source code publicly available.

Requirements

Install MuJoCo

Download and install MuJoCo 1.50 from the MuJoCo website. We assume that the MuJoCo files are extracted to the default location (~/.mujoco/mjpro150).
Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjkey.txt.
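
Before installing the Python dependencies, it can help to confirm that both paths mentioned above exist. The following is a minimal sketch, not part of the repository; the helper name `missing_mujoco_files` is hypothetical, and it only checks that the two paths are present, not that the license key is valid.

```python
from pathlib import Path


def missing_mujoco_files(home=None):
    """Return the expected MuJoCo paths that do not exist under `home`.

    `home` defaults to the current user's home directory. This helper is
    an illustration only; it is not part of the ACC code base.
    """
    home = Path(home) if home is not None else Path.home()
    expected = [
        home / ".mujoco" / "mjpro150",   # extracted MuJoCo 1.50 files
        home / ".mujoco" / "mjkey.txt",  # license key
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_mujoco_files()
    if missing:
        print("Missing MuJoCo files:")
        for path in missing:
            print("  " + path)
    else:
        print("MuJoCo files found in the default location.")
```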

Install

We recommend using an Anaconda environment. In our experiments we used Python 3.7 and the following dependencies:

pip install gym==0.17.2 mujoco-py==1.50.1.68 numpy==1.19.1 torch==1.6.0 torchvision==0.7.0
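
To check that an environment matches the pinned versions above, something like the following sketch could be used. It is an illustration rather than part of the repository: the names `EXPECTED`, `installed_version`, and `version_mismatches` are hypothetical, and a local build tag (for example a `+cu101` suffix that some torch builds carry) is ignored when comparing.

```python
# Versions copied from the pip command above.
EXPECTED = {
    "gym": "0.17.2",
    "mujoco-py": "1.50.1.68",
    "numpy": "1.19.1",
    "torch": "1.6.0",
    "torchvision": "0.7.0",
}


def installed_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.7 (ships with setuptools)
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each mismatching package to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop a local build tag such as "+cu101" before comparing.
        base = found.split("+")[0] if found is not None else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches().items():
        print("{}: expected {}, found {}".format(name, wanted, found))
```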

Running ACC

You can run ACC for TQC on one of the Gym continuous control environments by calling:

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --seed 0

To run the data-efficient variant with 4 critic update steps per environment step, you can call:

python main.py --env "HalfCheetah-v3" --max_timesteps 1000000 --num_critic_updates 4 --seed 0

Example scripts that run the experiments for 10 seeds and all environments are provided in run_experiment.sh and run_experiment_data_efficient.sh.
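
For a custom sweep, the same command lines can be generated programmatically. The sketch below is not the content of the shipped scripts; the helper `build_commands` is hypothetical and uses only the flags shown above (`--env`, `--max_timesteps`, `--num_critic_updates`, `--seed`). The environment list is left to the caller, since the full set of environments is defined in the shipped scripts rather than here.

```python
import shlex


def build_commands(envs, seeds, max_timesteps, num_critic_updates=None):
    """Build one `python main.py ...` argument list per (env, seed) pair."""
    commands = []
    for env in envs:
        for seed in seeds:
            cmd = ["python", "main.py",
                   "--env", env,
                   "--max_timesteps", str(max_timesteps)]
            # Only the data-efficient variant passes this flag.
            if num_critic_updates is not None:
                cmd += ["--num_critic_updates", str(num_critic_updates)]
            cmd += ["--seed", str(seed)]
            commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Print the commands for 10 seeds on one environment; run them with a
    # shell or with subprocess.run(cmd, check=True) as preferred.
    for cmd in build_commands(["HalfCheetah-v3"], range(10), 5000000):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Returning argument lists instead of joined strings keeps the commands safe to pass to subprocess.run without shell quoting issues.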

You can speed up the experiments by using fewer networks in the TQC ensemble. This trades a small amount of performance for a faster runtime (see the Appendix of the paper). The number of networks is controlled with the flag --n_nets. For example:

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --n_nets 2 --seed 0
