Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations

Trevor Ablett, Daniel (Yifan) Zhai, Jonathan Kelly

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’21)

Paper website: https://papers.starslab.ca/multiview-manipulation/
arXiv paper: https://arxiv.org/abs/2104.13907
DOI: https://doi.org/10.1109/IROS51168.2021.9636440


This work was motivated by a relatively simple question: will increasingly popular end-to-end visuomotor policies work on a mobile manipulator, where the angle of the base will not be repeatable from one execution of a task to another? We conducted a variety of experiments showing that policies naively trained on fixed-base data with imitation learning do not generalize to various base poses, and we generated multiview datasets and corresponding multiview policies to remedy the problem.

This repository contains the source code for reproducing our results and plots.

Requirements

We have only tested with Python 3.7. Our simulated environments use PyBullet, and our training code uses TensorFlow 2.x, relying specifically on our manipulator-learning package. All requirements (for the simulated environments) are installed automatically by following Setup below.

Our policies also use the groups argument of TensorFlow's Conv2D layer, which requires a GPU.
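As an optional sanity check (our suggestion, not part of the repository), the short snippet below builds a grouped convolution of the kind our policies rely on; the layer sizes are purely illustrative, and on a CPU-only install the call will typically fail, since TensorFlow only provides kernels for groups > 1 on GPU.

# Optional check: grouped convolutions in TensorFlow require a GPU.
# The layer sizes below are illustrative only, not the ones used by our policies.
import tensorflow as tf

layer = tf.keras.layers.Conv2D(filters=64, kernel_size=3, groups=4)
x = tf.random.normal((1, 64, 64, 32))  # dummy NHWC image batch
print(layer(x).shape)                  # (1, 62, 62, 64) when a GPU kernel is available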

Setup

Preliminary note on TensorFlow install

This repository uses TensorFlow with GPU support, which can, of course, be a bit of a pain to install. If you already have it installed, skip ahead to Regular Installation. Otherwise, we have found the following procedure to work:

  1. Install conda.
  2. Create a new conda env to use for this work and activate it.
  3. Run the following to install a version of TensorFlow that may work with conda:
conda install cudatoolkit cudnn
pip install tensorflow==2.6.* tensorflow-probability==0.14
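Optionally, verify the install before moving on (our suggestion, not part of the original instructions) by running the following in a Python interpreter inside the same conda environment:

# Confirm that TensorFlow imports and can see the GPU.
import tensorflow as tf
print(tf.__version__)                          # expect 2.6.x
print(tf.config.list_physical_devices("GPU"))  # expect at least one GPU entry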

Now you can continue with the regular installation.

Regular Installation

Clone this repository and install in your python environment with pip.

git clone [email protected]:utiasSTARS/multiview-manipulation.git && cd multiview-manipulation
pip install -e .

A Note on Environment Names

The simulated environments that we use are all available in our manipulator-learning package and are called:

  • ThingLiftXYZImage
  • ThingLiftXYZMultiview
  • ThingStackSameImageV2
  • ThingStackSameMultiviewV2
  • ThingPickAndInsertSucDoneImage
  • ThingPickAndInsertSucDoneMultiview
  • ThingDoorImage
  • ThingDoorMultiview
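These environments follow a standard gym-style interface and can be instantiated directly from manipulator-learning. The sketch below shows roughly how; the import path is an assumption based on that package's README, so double-check it against the manipulator-learning repository.

# Rough sketch only: the import path and interface are assumptions based on the
# manipulator-learning README; consult that package if anything has changed.
import manipulator_learning.sim.envs as manlearn_envs

env = getattr(manlearn_envs, "ThingDoorMultiview")()  # any name from the list above
obs = env.reset()
next_obs, reward, done, info = env.step(env.action_space.sample())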

The real environments we use with our mobile manipulator will, of course, be harder to reproduce, but were generated using our thing-gym-ros repository and are called:

  • ThingRosPickAndInsertCloser6DOFImageMB
  • ThingRosDrawerRanGrip6DOFImageMB
  • ThingRosDoorRanGrip6DOFImage
  • ThingRosDoorRanGrip6DOFImageMB

Running and Training Behavioural Cloning (BC) policies

The script in this repository can train and test (multiple) policies all in one shot.

  1. Choose one of:

    i. Train and test policies all at once. Download and uncompress any of the simulated expert data (generated using an HTC Vive hand tracker) from this Google Drive Folder.
    ii. Generate policies using the procedure outlined in the following section.
    iii. Download policies from this Google Drive Folder. We'll assume that you downloaded ThingDoorMultiview_bc_models.zip.

    If you choose i., your folder structure should be:

     .
     └── multiview-manipulation/
         ├── multiview_manipulation/
         └── data/
             ├── bc_models/
             └── demonstrations/
                 └── ThingDoorMultiview/
                     ├── depth/
                     ├── img/
                     ├── data.npz
                     └── data_swp.npz
    

    If you choose ii. or iii., your folder structure should be:

    .
    └── multiview-manipulation/
        ├── multiview_manipulation/
        └── data/
            └── bc_models/
                ├── ThingDoorMultiview_25_trajs_1/
                ├── ThingDoorMultiview_25_trajs_2/
                ├── ThingDoorMultiview_25_trajs_3/
                ├── ThingDoorMultiview_25_trajs_4/
                ├── ThingDoorMultiview_25_trajs_5/   
                ├── ThingDoorMultiview_50_trajs_1/   
                └── ...   
    
  2. Modify the following options in multiview_manipulation/policies/test_policies.py to match your system and selected data (example settings for steps 2-4 are sketched after this list):

    • main_data_dir: top level data directory (default: data)
    • bc_models_dir: top level trained BC models directory (default: bc_models)
    • expert_data_dir: top level expert data directory (default: demonstrations, only required if option i. above was selected).
  3. Change the following options to choose whether you want to test policies in a different environment from the one they were trained in (e.g., as stated in the paper, you can test a ThingDoorMultiview policy in both ThingDoorMultiview and ThingDoorImage):

    • env_name: environment to test policy in
    • policy_env_name: name of the environment that the policy's training data was generated in.
  4. Modify the options for choosing which policies to train/test:

    • bc_ckpts_num_traj: The different numbers of trajectories to use for training/trained policies (default: range(200, 24, -25))
    • seeds: Which seeds to use (default: [1, 2, 3, 4, 5])
  5. Run the script:

python multiview_manipulation/policies/test_policies.py
  6. Your results will show up in data/bc_results/{env_name}_{env_seed}_{experiment_name}.
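For reference, the options described in steps 2-4 correspond to variables set near the top of test_policies.py. The sketch below shows one plausible configuration using the defaults listed above; the exact variable layout in the script may differ, so treat it as illustrative.

# Hypothetical example of the options described above, as they might be set in
# multiview_manipulation/policies/test_policies.py (exact layout may differ).
main_data_dir = "data"                    # top level data directory
bc_models_dir = "bc_models"               # top level trained BC models directory
expert_data_dir = "demonstrations"        # only required for option i.

env_name = "ThingDoorImage"               # environment to test the policy in
policy_env_name = "ThingDoorMultiview"    # environment the policy's data came from

bc_ckpts_num_traj = range(200, 24, -25)   # numbers of trajectories per policy
seeds = [1, 2, 3, 4, 5]                   # seeds to test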

Training policies with Behavioural Cloning (BC) only

  1. Download and uncompress any of the simulated expert data from this Google Drive Folder. We'll assume that you downloaded ThingDoorMultiview.tar.gz and uncompressed it as ThingDoorMultiview.

  2. Modify the following options in multiview_manipulation/policies/gen_policies.py to match your system and selected data (example settings are sketched after this list):

    • bc_models_dir: top level directory for trained BC models (default: data/bc_models)
    • expert_data_dir: top level directory for expert data (default: data/demonstrations)
    • dataset_dir: the name of the directory containing depth/, img/, data.npz and data_swp.npz.
    • env_str: The string corresponding to the name of the environment (only used for the saved BC policy name)

    For example, if you're using the default folder structure, your setup should look like this:

    .
    └── multiview-manipulation/
        ├── multiview_manipulation/
        └── data/
            ├── bc_models/
            └── demonstrations/
                └── ThingDoorMultiview/
                    ├── depth/
                    ├── img/
                    ├── data.npz
                    └── data_swp.npz
    
  3. Modify the options for choosing which policies to train:

    • bc_ckpts_num_traj: The different numbers of trajectories to use for training policies (default: range(25, 201, 25))
    • seeds: Which seeds to train for (default: [1, 2, 3, 4, 5])
  4. Run the file:

python multiview_manipulation/policies/gen_policies.py
  5. Your trained policies will show up in individual folders under the bc_models folder as {env_str}_{num_trajs}_trajs_{seed}/.
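As with the testing script, the options above correspond to variables in gen_policies.py. A plausible configuration for the ThingDoorMultiview data is sketched below; the exact variable layout in the script may differ.

# Hypothetical example of the options described above, as they might be set in
# multiview_manipulation/policies/gen_policies.py (exact layout may differ).
bc_models_dir = "data/bc_models"          # where trained BC policies are written
expert_data_dir = "data/demonstrations"   # top level expert data directory
dataset_dir = "ThingDoorMultiview"        # contains depth/, img/, data.npz, data_swp.npz
env_str = "ThingDoorMultiview"            # only used for the saved BC policy name

bc_ckpts_num_traj = range(25, 201, 25)    # numbers of trajectories to train with
seeds = [1, 2, 3, 4, 5]                   # seeds to train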

Collecting Demonstrations

All of our demonstrations were collected using the collect_demos.py file from the manipulator-learning package and an HTC Vive Hand Tracker. To collect demonstrations, you would use, for example:

git clone [email protected]:utiasSTARS/manipulator-learning.git && cd manipulator-learning
pip install -e .
pip install -r device_requirements.txt
python manipulator_learning/learning/imitation/collect_demos.py --device vr --directory demonstrations --demo_name ThingDoorMultiview01 --environment ThingDoorMultiview

You can also try using the keyboard with:

python manipulator_learning/learning/imitation/collect_demos.py --device keyboard --directory demonstrations --demo_name ThingDoorMultiview01 --environment ThingDoorMultiview

More instructions can be found in the manipulator-learning README.

Real Environments

Although it would be nearly impossible to exactly reproduce our results in our real environments, the code we used to generate those environments can be found in our thing-gym-ros repository.

Citation

If you use this in your work, please cite:

@inproceedings{2021_Ablett_Seeing,
    address = {Prague, Czech Republic},
    author = {Trevor Ablett and Yifan Zhai and Jonathan Kelly},
    booktitle = {Proceedings of the {IEEE/RSJ} International Conference on Intelligent Robots and Systems {(IROS'21)}},
    date = {2021-09-27/2021-10-01},
    month = {Sep. 27--Oct. 1},
    site = {https://papers.starslab.ca/multiview-manipulation/},
    title = {Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations},
    url = {http://arxiv.org/abs/2104.13907},
    video1 = {https://youtu.be/oh0JMeyoswg},
    year = {2021}
}