A new codebase for Group Activity Recognition (GAR). It contains the code for the ICCV 2021 paper "Spatio-Temporal Dynamic Inference Network for Group Activity Recognition" and several other methods.

Last update: Dec 12, 2022

Overview

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

The source code for the ICCV 2021 paper "Spatio-Temporal Dynamic Inference Network for Group Activity Recognition".
[paper] [supplemental material] [arXiv]

If you find our work or the codebase inspiring and useful for your research, please cite:

@inproceedings{yuan2021DIN,
  title={Spatio-Temporal Dynamic Inference Network for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong and Wang, Mang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={7476--7485},
  year={2021}
}

Dependencies

Software Environment: Linux (CentOS 7)
Hardware Environment: NVIDIA TITAN RTX
Python 3.6
PyTorch 1.2.0, Torchvision 0.4.0
RoIAlign for PyTorch
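
To compare an existing environment against the versions listed above, a small helper along these lines may help. It is a minimal sketch and not part of the repository: the names `EXPECTED`, `check_versions`, and `installed_versions` are hypothetical, and the comparison is a plain prefix match on version strings.

```python
import sys

# Versions listed in the Dependencies section above.
EXPECTED = {"python": "3.6", "torch": "1.2.0", "torchvision": "0.4.0"}


def check_versions(found, expected=EXPECTED):
    """Return (name, expected, found) tuples for every mismatched entry.

    A version matches when the found string starts with the expected one,
    so a build suffix such as "1.2.0+cu92" still counts as 1.2.0.
    """
    mismatches = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None or not have.startswith(want):
            mismatches.append((name, want, have))
    return mismatches


def installed_versions():
    """Collect the versions available in the current interpreter."""
    found = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in ("torch", "torchvision"):
        try:
            found[name] = __import__(name).__version__
        except ImportError:
            pass  # left out of `found`, so it is reported as missing
    return found
```

Calling `check_versions(installed_versions())` returns an empty list when everything matches. Other versions may also work; the list above only records the environment the authors report.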

Prepare Datasets

Download the publicly available datasets from the following links: Volleyball dataset and Collective Activity dataset.
Unzip each dataset file into data/volleyball or data/collective, respectively.
Download the file tracks_normalized.pkl from cvlab-epfl/social-scene-understanding and put it into data/volleyball/videos.
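
Before training, it can be useful to confirm that the files ended up where the steps above place them. The following is a minimal sketch under that assumption; `REQUIRED_PATHS` and `missing_dataset_paths` are hypothetical names, and only the three paths mentioned above are checked, not the contents of the datasets.

```python
import os

# Paths named in the steps above, relative to PROJECT_PATH.
REQUIRED_PATHS = [
    "data/volleyball",
    "data/collective",
    "data/volleyball/videos/tracks_normalized.pkl",
]


def missing_dataset_paths(project_path, required=REQUIRED_PATHS,
                          exists=os.path.exists):
    """Return the required paths that are not present under project_path.

    `exists` is injectable so the check can be exercised without touching
    the real filesystem.
    """
    return [
        rel for rel in required
        if not exists(os.path.join(project_path, *rel.split("/")))
    ]
```

If only one of the two datasets is used, the entry for the other one can simply be ignored or removed from the list.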

Using Docker

Check out the repository and cd into PROJECT_PATH
Build the Docker container

docker build -t din_gar https://github.com/JacobYuan7/DIN_GAR.git#main

Run the Docker container

docker run --shm-size=2G -v data/volleyball:/opt/DIN_GAR/data/volleyball -v result:/opt/DIN_GAR/result --rm -it din_gar

--shm-size=2G: Extends the container's shared memory size to prevent the error "ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)." Alternatively, use --ipc=host.
-v data/volleyball:/opt/DIN_GAR/data/volleyball: Makes the host's folder data/volleyball available inside the container at /opt/DIN_GAR/data/volleyball.
-v result:/opt/DIN_GAR/result: Makes the host's folder result available inside the container at /opt/DIN_GAR/result.
-it and --rm: Start the container with an interactive session (PROJECT_PATH is /opt/DIN_GAR) and remove the container after the session is closed.
din_gar: The name/tag of the image.
Optional: --gpus='"device=7"' restricts the GPU devices the container can access.
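
The options above can also be assembled programmatically. The sketch below is an illustration rather than part of the repository, and `docker_run_command` is a hypothetical name. It resolves the host folders to absolute paths, on the assumption that some Docker versions treat a non-absolute host path in `-v` as a volume name instead of a folder.

```python
import os
import shlex


def docker_run_command(project_path, image="din_gar", gpus=None):
    """Build the `docker run` argument list described above.

    Host folders are resolved to absolute paths under project_path.
    """
    root = os.path.abspath(project_path)
    cmd = ["docker", "run", "--shm-size=2G"]
    for folder in ("data/volleyball", "result"):
        host = os.path.join(root, *folder.split("/"))
        cmd += ["-v", "{}:/opt/DIN_GAR/{}".format(host, folder)]
    if gpus is not None:
        # e.g. gpus='"device=7"' to expose only GPU 7
        cmd.append("--gpus={}".format(gpus))
    cmd += ["--rm", "-it", image]
    return cmd


# Print a copy-pasteable command for the current directory.
print(" ".join(shlex.quote(part) for part in docker_run_command(".")))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting issues with the `--gpus` value.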

Get Started

Train the Base Model: Fine-tune the base model for the dataset.

# Volleyball dataset
cd PROJECT_PATH 
python scripts/train_volleyball_stage1.py

# Collective Activity dataset
cd PROJECT_PATH 
python scripts/train_collective_stage1.py
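
When both datasets are trained from one driver script, the stage-1 commands above can be wrapped as follows. This is a minimal sketch, assuming the scripts are run from PROJECT_PATH exactly as shown; `STAGE1_SCRIPTS`, `stage1_command`, and `run_stage1` are hypothetical names, not part of the repository.

```python
import subprocess
import sys

# Script names taken from the commands above.
STAGE1_SCRIPTS = {
    "volleyball": "scripts/train_volleyball_stage1.py",
    "collective": "scripts/train_collective_stage1.py",
}


def stage1_command(dataset):
    """Return the argument list that launches stage-1 training."""
    try:
        script = STAGE1_SCRIPTS[dataset]
    except KeyError:
        raise ValueError("unknown dataset: {!r}".format(dataset))
    # sys.executable keeps the run inside the current Python environment.
    return [sys.executable, script]


def run_stage1(dataset, project_path="."):
    """Run stage-1 training from PROJECT_PATH and raise if it fails."""
    subprocess.run(stage1_command(dataset), cwd=project_path, check=True)
```

For example, `run_stage1("volleyball")` is equivalent to running the first command above from the project root.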

Train with the reasoning module: Append the reasoning modules onto the base model to get a reasoning model.
1. Volleyball dataset
  - DIN
```
python scripts/train_volleyball_stage2_dynamic.py
```
  - lite DIN
    We can run the lite version of DIN by setting cfg.lite_dim = 128 in scripts/train_volleyball_stage2_dynamic.py.
```
python scripts/train_volleyball_stage2_dynamic.py
```
  - ST-factorized DIN
    We can run ST-factorized DIN by setting cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.hierarchical_inference = True.
    
    Note that if you set cfg.hierarchical_inference = False, cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.num_DIN = 2, then multiple interaction fields run in parallel.
```
python scripts/train_volleyball_stage2_dynamic.py
```
  Other models re-implemented by us according to their papers or publicly available code:
  - AT
```
python scripts/train_volleyball_stage2_at.py
```
  - PCTDM
```
python scripts/train_volleyball_stage2_pctdm.py
```
  - SACRF
```
python scripts/train_volleyball_stage2_sacrf_biute.py
```
  - ARG
```
python scripts/train_volleyball_stage2_arg.py
```
  - HiGCIN
```
python scripts/train_volleyball_stage2_higcin.py
```
2. Collective Activity dataset
  - DIN
```
python scripts/train_collective_stage2_dynamic.py
```
  - lite DIN
    We can run the lite version of DIN by setting cfg.lite_dim = 128 in scripts/train_collective_stage2_dynamic.py.
```
python scripts/train_collective_stage2_dynamic.py
```
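
The DIN variants described above differ only in a few `cfg` attributes, which can be summarized as follows. This is a sketch for illustration: `DIN_VARIANTS` and `apply_variant` are hypothetical names, and `SimpleNamespace` stands in for the `cfg` object that the training scripts build. Only the settings quoted above are included; all other options keep whatever defaults the scripts define.

```python
from types import SimpleNamespace

# Settings quoted from the variant descriptions above.
DIN_VARIANTS = {
    "lite": {"lite_dim": 128},
    "st_factorized": {
        "ST_kernel_size": [(1, 3), (3, 1)],
        "hierarchical_inference": True,
    },
    # Multiple interaction fields running in parallel.
    "parallel_fields": {
        "ST_kernel_size": [(1, 3), (3, 1)],
        "hierarchical_inference": False,
        "num_DIN": 2,
    },
}


def apply_variant(cfg, variant):
    """Set the attributes for one named variant on cfg and return cfg."""
    for key, value in DIN_VARIANTS[variant].items():
        setattr(cfg, key, value)
    return cfg


# SimpleNamespace is a stand-in for the cfg object of the training script.
cfg = apply_variant(SimpleNamespace(), "st_factorized")
print(cfg.ST_kernel_size, cfg.hierarchical_inference)
```

In the repository itself, the same effect is obtained by editing the corresponding `cfg` lines in the stage-2 script before running it.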

Our other work, which approaches GAR by incorporating visual context, is also available.

@inproceedings{yuan2021visualcontext,
  title={Learning Visual Context for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={4},
  pages={3261--3269},
  year={2021}
}
