ProMP: Proximal Meta-Policy Search

Last update: Dec 20, 2022

Implementations corresponding to ProMP (Rothfuss et al., 2018). Overall this repository consists of two branches:

master: lightweight branch that provides the necessary code to run Meta-RL algorithms such as ProMP, E-MAML, MAML. This branch is meant to provide an easy start with Meta-RL and can be integrated into other projects and setups.
full-code: branch that provides the comprehensive code that was used to produce the experimental results in Rothfuss et al. (2018). This includes experiment scripts and plotting scripts that can be used to reproduce the experimental results in the paper.

The code is written in Python 3 and builds on Tensorflow. Many of the provided reinforcement learning environments require the Mujoco physics engine. The code was developed with modularity and computational efficiency in mind. Many components of the Meta-RL algorithm are parallelized using either MPI or Tensorflow so that all CPU cores are used efficiently.

Documentation

An API specification and explanation of the code components can be found here. The documentation can also be built locally by running the following commands:

# ensure that you are in the root folder of the project
cd docs
# install the sphinx documentation tool dependencies
pip install -r requirements.txt
# build the documentation
make clean && make html
# now the html documentation can be found under docs/build/html/index.html

Installation / Dependencies

The provided code can be run either A) in a docker container provided by us or B) using python on your local machine. The latter requires several installation steps to set up the dependencies.

A. Docker

If not installed yet, set up docker on your machine. Pull our docker container jonasrothfuss/promp from docker-hub:

docker pull jonasrothfuss/promp

All the necessary dependencies are already installed inside the docker container.

B. Anaconda or Virtualenv

B.1. Installing MPI

Ensure that you have a working MPI implementation (see here for more instructions).

For Ubuntu you can install MPI through the package manager:

sudo apt-get install libopenmpi-dev

B.2. Create either venv or conda environment and activate it

Virtualenv

pip install --upgrade virtualenv
virtualenv <venv-name>
source <venv-name>/bin/activate

Anaconda

If not done yet, install anaconda by following the instructions here. Then create an anaconda environment, activate it and install the requirements in requirements.txt.

conda create -n <env-name> python=3.6
source activate <env-name>

B.3. Install the required python dependencies

pip install -r requirements.txt

B.4. Set up the Mujoco physics engine and mujoco-py

For running the majority of the provided Meta-RL environments, the Mujoco physics engine as well as a corresponding python wrapper are required. For setting up Mujoco and mujoco-py, please follow the instructions here.
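
Once steps B.1 to B.4 are done, a quick sanity check can save a failed first run. The following is a minimal sketch, not part of the repository: it assumes the MPI launcher is called mpirun and that Tensorflow and mujoco-py are importable under the module names tensorflow and mujoco_py. The function name check_environment is made up for illustration.

```python
import importlib.util
import shutil
import sys


def check_environment(modules=("tensorflow", "mujoco_py"), mpi_launcher="mpirun"):
    """Return a dict mapping each requirement to True (found) or False (missing).

    The default module names and the launcher name are assumptions,
    adjust them to match your setup.
    """
    report = {"python>=3": sys.version_info.major >= 3}
    # shutil.which returns None when the executable is not on PATH
    report[mpi_launcher] = shutil.which(mpi_launcher) is not None
    for name in modules:
        # find_spec locates a top-level module without importing it
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(f"{requirement}: {'found' if ok else 'MISSING'}")
```

A missing mujoco_py entry is not a problem if you only intend to run the point environment, which does not need Mujoco.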

Running ProMP

To run the ProMP algorithm in the point environment (no Mujoco needed) with the default configuration, execute:

python run_scripts/pro-mp_run_point_mass.py

To run the ProMP algorithm in a Mujoco environment with default configurations:

python run_scripts/pro-mp_run_mujoco.py

The run configuration can be changed either in the run script directly or by providing a JSON configuration file with all the necessary hyperparameters. The JSON configuration file can be provided through the config_file flag. Additionally, the dump path can be specified through the dump_path flag:

python run_scripts/pro-mp_run.py --config_file <config_file_path> --dump_path <dump_path>
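
As an illustration of the JSON route, the sketch below writes a configuration file from Python and assembles the matching command line. The hyperparameter names and values in example_params are hypothetical placeholders, not the repository's actual keys. A real configuration file has to contain all the hyperparameters the run script expects, so consult the run script for the full set. The helpers write_config and build_command are likewise made up for this example.

```python
import json
import os
import tempfile


def write_config(params, directory):
    """Write hyperparameters to config.json inside directory and return its path."""
    path = os.path.join(directory, "config.json")
    with open(path, "w") as f:
        json.dump(params, f, indent=2, sort_keys=True)
    return path


def build_command(config_path, dump_path, script="run_scripts/pro-mp_run.py"):
    """Assemble the command line as an argument list suitable for subprocess.run."""
    return ["python", script, "--config_file", config_path, "--dump_path", dump_path]


# Hypothetical hyperparameter names and values, for illustration only
example_params = {"seed": 1, "meta_batch_size": 40, "n_itr": 1000}

workdir = tempfile.mkdtemp()
config_path = write_config(example_params, workdir)
command = build_command(config_path, os.path.join(workdir, "run_0"))
print(" ".join(command))
```

Passing the list to subprocess.run(command, check=True) from the root folder of the project would start the run. Building the command as a list avoids shell quoting issues with paths that contain spaces.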

Additionally, to run the gradient-based meta-learning methods MAML and E-MAML (Finn et al., 2017 and Stadie et al., 2018) in a Mujoco environment with the default configuration, execute, respectively:

python run_scripts/maml_run_mujoco.py 
python run_scripts/e-maml_run_mujoco.py
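
When comparing the three algorithms, it can be convenient to select the run script by name. The sketch below is one possible way to do that. The script paths are the ones listed above, while the dictionary keys and the helper mujoco_command are names chosen here for illustration and do not exist in the repository.

```python
# Mujoco run scripts named in this README, keyed by labels chosen for this example
RUN_SCRIPTS = {
    "promp": "run_scripts/pro-mp_run_mujoco.py",
    "maml": "run_scripts/maml_run_mujoco.py",
    "e-maml": "run_scripts/e-maml_run_mujoco.py",
}


def mujoco_command(algorithm):
    """Return the argument list that launches the given algorithm's Mujoco script."""
    key = algorithm.lower()
    if key not in RUN_SCRIPTS:
        known = ", ".join(sorted(RUN_SCRIPTS))
        raise ValueError(f"unknown algorithm {algorithm!r}, expected one of: {known}")
    return ["python", RUN_SCRIPTS[key]]


print(mujoco_command("maml"))
```

From the root folder of the project, subprocess.run(mujoco_command("maml"), check=True) would then launch the MAML run with its default configuration.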

Cite

To cite ProMP please use

@article{rothfuss2018promp,
  title={ProMP: Proximal Meta-Policy Search},
  author={Rothfuss, Jonas and Lee, Dennis and Clavera, Ignasi and Asfour, Tamim and Abbeel, Pieter},
  journal={arXiv preprint arXiv:1810.06784},
  year={2018}
}

Acknowledgements

This repository includes environments introduced in (Duan et al., 2016, Finn et al., 2017).

Owner

Jonas Rothfuss
