Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

Overview

Music Trees

Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

train-test splits and hierarchies.

  • For all experiments, we used the instrument-based split in /music_trees/assets/partitions/mdb-aug.json.
  • To view our Hornbostel-Sachs class hierarchy, see /music_trees/assets/taxonomies/deeper-mdb.yaml. Note that not all of the instruments on this taxonomy are used in our experiments.
  • All random taxonomies are in /music_trees/assets/taxonomies/scrambled-*.yaml

Installation

first, clone the medleydb repo and install using pip install -e:

  • medleydb from marl

Now, download the medleydb and mdb 2.0 datasets from zenodo.

install some utilities for visualizing the embedding space:

git clone https://github.com/hugofloresgarcia/embviz.git
cd embviz
pip install -e .

then, clone this repo and install with

pip install -e .

Usage

1. Generate data

Make sure the MEDLEYDB_PATH environment variable is set (see the medleydb repo for more instructions ). Then, run the generation script:

python -m music_trees.generate \
                --dataset mdb \
                --name mdb-aug \
                --example_length 1.0 \
                --augment true \
                --hop_length 0.5 \
                --sample_rate 16000 \

This will generate both augmented and unaugmented data for MedleyDB. NOTE: There was a bug in the code that disabled data augmentation silently. This bug has been left in the code for the sake of reproducibility. This is why we don't report any data augmentation in the paper, as none was applied at the time of experiments.

2. Partition data

The partition file used for all experiments is available at /music_trees/assets/partitions/mdb-aug.json.

3. Run experiments

The search script will train all models for a particular experiment. It will grab as many GPUs are available (use CUDA_VISIBLE_DEVICES to change the availability of GPUs) and train as many models as it can in parallel.

Each model will be stored under /runs/<NAME>/<VERSION>.

Arbitrary Hierarchies

python music_trees/search.py --name scrambled-tax

Height Search (note that height=0 and height=1 are the baseline and proposed model, respectively)

python music_trees/search.py --name height-v1

Loss Ablation

python music_trees/search.py --name loss-alpha

train the additional BCE baseline:

python music_trees/train.py --model_name hprotonet --height 4 --d_root 128 --loss_alpha 1 --name "flat (BCE)" --dataset mdb-aug --learning_rate 0.03 --loss_weight_fn cross-entropy

4. Evaluate

Perform evaluation on a model. Make sure to pass the path to the run that you wish to evaluate.

To evaluate a model:

python music_trees/eval.py --exp_dir <PATH_TO_RUN>/<VERSION>

Each model will store its evaluation results under /results/<NAME>/<VERSION>

5. Analyze

To compare models and generate analysis figures and tables, place of all the results folders you would like to analyze under a single folder. The resulting folder should look like this:

my_experiment/trial1/version_0
my_experiment/trial2/version_0
my_experiment/trial3/version_0

Then, run analysis using

python music_trees analyze.py my_experiment   <OUTPUT_NAME> 

the figures will be created under /analysis/<OUTPUT_NAME>

To generate paper-ready figures, see scripts/figures.ipynb.

Owner
Hugo Flores García
PhD @interactiveaudiolab
Hugo Flores García
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
[arXiv'22] Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

Panoptic NeRF Project Page | Paper | Dataset Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation Xiao Fu*, Shangzhan zhang*,

Xiao Fu 111 Dec 16, 2022
Twin-deep neural network for semi-supervised learning of materials properties

Deep Semi-Supervised Teacher-Student Material Synthesizability Prediction Citation: Semi-supervised teacher-student deep neural network for materials

MLEG 3 Dec 14, 2022
A deep learning framework for historical document image analysis

DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https

9 Aug 04, 2022
TensorFlow implementation of "Variational Inference with Normalizing Flows"

[TensorFlow 2] Variational Inference with Normalizing Flows TensorFlow implementation of "Variational Inference with Normalizing Flows" [1] Concept Co

YeongHyeon Park 7 Jun 08, 2022
Out-of-Town Recommendation with Travel Intention Modeling (AAAI2021)

TrainOR_AAAI21 This is the official implementation of our AAAI'21 paper: Haoran Xin, Xinjiang Lu, Tong Xu, Hao Liu, Jingjing Gu, Dejing Dou, Hui Xiong

Jack Xin 13 Oct 19, 2022
LETR: Line Segment Detection Using Transformers without Edges

LETR: Line Segment Detection Using Transformers without Edges Introduction This repository contains the official code and pretrained models for Line S

mlpc-ucsd 157 Jan 06, 2023
Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax

Clockwork VAEs in JAX/Flax Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax, ported

Julius Kunze 26 Oct 05, 2022
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation

PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation Winner method of the ICCV-2021 SemKITTI-DVPS Challenge. [arxiv] [

Yuan Haobo 38 Jan 03, 2023
PyTorch implementation of Algorithm 1 of "On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models"

Code for On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models This repository will reproduce the main results from our pape

Mitch Hill 32 Nov 25, 2022
QMagFace: Simple and Accurate Quality-Aware Face Recognition

Quality-Aware Face Recognition 26.11.2021 start readme QMagFace: Simple and Accurate Quality-Aware Face Recognition Research Paper Implementation - To

Philipp Terhörst 59 Jan 04, 2023
AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages

AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages This repository contains the code for the pa

Kelechi 40 Nov 24, 2022
Img-process-manual - Utilize Python Numpy and Matplotlib to realize OpenCV baisc image processing function

Img-process-manual - Opencv Library basic graphic processing algorithm coding reproduction based on Numpy and Matplotlib library

Jack_Shaw 2 Dec 12, 2022
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. It is a large, multi-sourced, diverse dataset for product attr

Google Research Datasets 89 Jan 08, 2023
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
Unified tracking framework with a single appearance model

Paper: Do different tracking tasks require different appearance model? [ArXiv] (comming soon) [Project Page] (comming soon) UniTrack is a simple and U

ZhongdaoWang 300 Dec 24, 2022
Crowd-Kit is a powerful Python library that implements commonly-used aggregation methods for crowdsourced annotation and offers the relevant metrics and datasets

Crowd-Kit: Computational Quality Control for Crowdsourcing Documentation Crowd-Kit is a powerful Python library that implements commonly-used aggregat

Toloka 125 Dec 30, 2022
Fast EMD for Python: a wrapper for Pele and Werman's C++ implementation of the Earth Mover's Distance metric

PyEMD: Fast EMD for Python PyEMD is a Python wrapper for Ofir Pele and Michael Werman's implementation of the Earth Mover's Distance that allows it to

William Mayner 433 Dec 31, 2022
Image Fusion Transformer

Image-Fusion-Transformer Platform Python 3.7 Pytorch =1.0 Training Dataset MS-COCO 2014 (T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ram

Vibashan VS 68 Dec 23, 2022
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

107 Dec 02, 2022