Implementation of MA-Trace - a general-purpose multi-agent RL algorithm for cooperative environments.

Off-Policy Correction For Multi-Agent Reinforcement Learning

This repository is the official implementation of Off-Policy Correction For Multi-Agent Reinforcement Learning. It is based on SEED RL, commit 5f07ba2a072c7a562070b5a0b3574b86cd72980f.

Requirements

Our code runs inside a Docker container, so you must first install Docker according to the instructions provided by its authors. The project-specific requirements are defined in a Dockerfile (docker/Dockerfile.starcraft) and are installed inside the container the first time the run script is executed. Before training, build the base image by running:

./docker_base/marlgrid/docker/build_base.sh

Note that to execute docker commands you may need to use sudo or install Docker in rootless mode.

Training

To train an MA-Trace model, run the following command:

./run_local.sh starcraft vtrace [nb of actors] [configuration]

The [nb of actors] argument specifies the number of workers used for training. It must be a positive natural number.

The [configuration] specifies the hyperparameters of training.

The most important hyperparameters are:

  • learning_rate: the learning rate
  • entropy_cost: the initial entropy cost
  • target_entropy: the final entropy cost
  • entropy_cost_adjustment_speed: how fast the entropy cost is adjusted towards the final value
  • frames_stacked: the number of stacked frames
  • batch_size: the size of the training batches
  • discounting: the discount factor
  • full_state_critic: whether to use the full state as input to the critic network; set to False to use only the agents' observations
  • is_centralized: whether to perform centralized or decentralized training
  • task_name: the name of the SMAC task to train on, see the section below

Other configurable parameters are listed in the files, but they are of minor importance.
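
The [configuration] argument is passed as one quoted string. A minimal sketch of assembling that string from individual settings, assuming the --name=value flag syntax used in the example commands of this README. The helper build_config is hypothetical and not part of the repository, and the command is only printed, not executed:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.

# Joins "name=value" pairs into one string of --name=value flags.
build_config() {
  config=""
  for pair in "$@"; do
    # Add a separating space only when the string is already non-empty.
    config="${config:+$config }--$pair"
  done
  printf '%s\n' "$config"
}

CONFIG="$(build_config full_state_critic=False frames_stacked=3 "task_name='MMM2'")"

# Print the resulting command instead of running it.
printf './run_local.sh starcraft vtrace 1 "%s"\n' "$CONFIG"
```

Keeping each setting as a separate argument makes it easier to comment single flags in or out than editing one long quoted string.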

The run script reports evaluation metrics during training. They are displayed in tmux, so it is worth checking its navigation controls.

For example, to use default parameters and one actor, run:

./run_local.sh starcraft vtrace 1 ""
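
Since [nb of actors] must be a positive natural number, the argument can be checked before launching. A minimal sketch; the functions is_positive_int and launch_cmd are hypothetical and not part of the repository, and the training command is only printed:

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repository.

# Succeeds only when $1 is a positive natural number (1, 2, 3, ...).
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  case "$1" in
    *[1-9]*) return 0 ;;       # at least one non-zero digit
    *) return 1 ;;             # only zeros, e.g. "0" or "00"
  esac
}

# Prints the training command for the given actor count and configuration,
# or reports an error when the actor count is invalid.
launch_cmd() {
  actors="$1"
  config="${2:-}"
  if ! is_positive_int "$actors"; then
    echo "error: [nb of actors] must be a positive natural number, got '$actors'" >&2
    return 1
  fi
  printf './run_local.sh starcraft vtrace %s "%s"\n' "$actors" "$config"
}

launch_cmd 1 ""
```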

To train the algorithms specified in the paper:

  • MA-Trace (obs): ./run_local.sh starcraft vtrace 1 "--full_state_critic=False"
  • MA-Trace (full): ./run_local.sh starcraft vtrace 1 "--full_state_critic=True"
  • DecMa-Trace: ./run_local.sh starcraft vtrace 1 "--is_centralized=False"
  • MA-Trace (obs) with 3 stacked observations: ./run_local.sh starcraft vtrace 1 "--full_state_critic=False --frames_stacked=3"
  • MA-Trace (full) with 4 stacked observations: ./run_local.sh starcraft vtrace 1 "--full_state_critic=True --frames_stacked=4"

Note that matching the performance presented in the paper requires a higher number of actors, e.g. 20.

StarCraft Multi-Agent Challenge

We evaluate our models on the StarCraft Multi-Agent Challenge (SMAC) benchmark (latest version, i.e. 4.10). The challenge consists of 14 tasks: '2s_vs_1sc', '2s3z', '3s5z', '1c3s5z', '10m_vs_11m', '2c_vs_64zg', 'bane_vs_bane', '5m_vs_6m', '3s_vs_5z', '3s5z_vs_3s6z', '6h_vs_8z', '27m_vs_30m', 'MMM2' and 'corridor'.

To train on a chosen task, e.g. 'MMM2', add --task_name='MMM2' to the configuration, e.g.

./run_local.sh starcraft vtrace 1 "--full_state_critic=False --task_name='MMM2'"
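
To prepare runs for every task, the task list can be looped over. A minimal sketch that only prints one command per task, so it is safe to run anywhere; the function and variable names are illustrative and not part of the repository, and the chosen flags simply mirror the MA-Trace (obs) example:

```shell
#!/usr/bin/env bash
# The 14 SMAC tasks listed in this README.
TASKS="2s_vs_1sc 2s3z 3s5z 1c3s5z 10m_vs_11m 2c_vs_64zg bane_vs_bane 5m_vs_6m 3s_vs_5z 3s5z_vs_3s6z 6h_vs_8z 27m_vs_30m MMM2 corridor"

# Prints one training command per task (illustrative helper).
print_task_cmds() {
  for task in $TASKS; do
    echo "./run_local.sh starcraft vtrace 1 \"--full_state_critic=False --task_name='$task'\""
  done
}

print_task_cmds
```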

Results

Our model achieves the following performance on SMAC:

results.png
