FMA: A Dataset For Music Analysis

Overview

FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson.
International Society for Music Information Retrieval Conference (ISMIR), 2017.

We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.

Data

All metadata and features for all tracks are distributed in fma_metadata.zip (342 MiB). The below tables can be used with pandas or any other data analysis tool. See the paper or the usage.ipynb notebook for a description.

  • tracks.csv: per track metadata such as ID, title, artist, genres, tags and play counts, for all 106,574 tracks.
  • genres.csv: all 163 genres with name and parent (used to infer the genre hierarchy and top-level genres).
  • features.csv: common features extracted with librosa.
  • echonest.csv: audio features provided by Echonest (now Spotify) for a subset of 13,129 tracks.

Then, you got various sizes of MP3-encoded audio data:

  1. fma_small.zip: 8,000 tracks of 30s, 8 balanced genres (GTZAN-like) (7.2 GiB)
  2. fma_medium.zip: 25,000 tracks of 30s, 16 unbalanced genres (22 GiB)
  3. fma_large.zip: 106,574 tracks of 30s, 161 unbalanced genres (93 GiB)
  4. fma_full.zip: 106,574 untrimmed tracks, 161 unbalanced genres (879 GiB)

See the wiki (or #41) for known issues (errata).

Code

The following notebooks, scripts, and modules have been developed for the dataset.

  1. usage.ipynb: shows how to load the datasets and develop, train, and test your own models with it.
  2. analysis.ipynb: exploration of the metadata, data, and features. Creates the figures used in the paper.
  3. baselines.ipynb: baseline models for genre recognition, both from audio and features.
  4. features.py: features extraction from the audio (used to create features.csv).
  5. webapi.ipynb: query the web API of the FMA. Can be used to update the dataset.
  6. creation.ipynb: creation of the dataset (used to create tracks.csv and genres.csv).
  7. creation.py: creation of the dataset (long-running data collection and processing).
  8. utils.py: helper functions and classes.

Usage

Binder   Click the binder badge to play with the code and data from your browser without installing anything.

  1. Clone the repository.

    git clone https://github.com/mdeff/fma.git
    cd fma
  2. Create a Python 3.6 environment.
    # with https://conda.io
    conda create -n fma python=3.6
    conda activate fma
    
    # with https://github.com/pyenv/pyenv
    pyenv install 3.6.0
    pyenv virtualenv 3.6.0 fma
    pyenv activate fma
    
    # with https://pipenv.pypa.io
    pipenv --python 3.6
    pipenv shell
    
    # with https://docs.python.org/3/tutorial/venv.html
    python3.6 -m venv ./env
    source ./env/bin/activate
  3. Install dependencies.

    pip install --upgrade pip setuptools wheel
    pip install numpy==1.12.1  # workaround resampy's bogus setup.py
    pip install -r requirements.txt

    Note: you may need to install ffmpeg or graphviz depending on your usage.
    Note: install CUDA to train neural networks on GPUs (see Tensorflow's instructions).

  4. Download some data, verify its integrity, and uncompress the archives.

    cd data
    
    curl -O https://os.unil.cloud.switch.ch/fma/fma_metadata.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_small.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_medium.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_large.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_full.zip
    
    echo "f0df49ffe5f2a6008d7dc83c6915b31835dfe733  fma_metadata.zip" | sha1sum -c -
    echo "ade154f733639d52e35e32f5593efe5be76c6d70  fma_small.zip"    | sha1sum -c -
    echo "c67b69ea232021025fca9231fc1c7c1a063ab50b  fma_medium.zip"   | sha1sum -c -
    echo "497109f4dd721066b5ce5e5f250ec604dc78939e  fma_large.zip"    | sha1sum -c -
    echo "0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab  fma_full.zip"     | sha1sum -c -
    
    unzip fma_metadata.zip
    unzip fma_small.zip
    unzip fma_medium.zip
    unzip fma_large.zip
    unzip fma_full.zip
    
    cd ..

    Note: try 7zip if decompression errors. It might be an unsupported compression issue.

  5. Fill a .env configuration file (at repository's root) with the following content.

    AUDIO_DIR=./data/fma_small/  # the path to a decompressed fma_*.zip
    FMA_KEY=MYKEY  # only if you want to query the freemusicarchive.org API
    
  6. Open Jupyter or run a notebook.

    jupyter notebook
    make usage.ipynb

Impact, coverage, and resources

100+ research papers

Full list on Google Scholar. Some picks below.

2 derived works
~10 posts
5 events
~10 dataset lists

Contributing

Contribute by opening an issue or a pull request. Let this repository be a hub around the dataset!

History

2017-05-09 pre-publication release

  • paper: arXiv:1612.01840v2
  • code: git tag rc1
  • fma_metadata.zip sha1: f0df49ffe5f2a6008d7dc83c6915b31835dfe733
  • fma_small.zip sha1: ade154f733639d52e35e32f5593efe5be76c6d70
  • fma_medium.zip sha1: c67b69ea232021025fca9231fc1c7c1a063ab50b
  • fma_large.zip sha1: 497109f4dd721066b5ce5e5f250ec604dc78939e
  • fma_full.zip sha1: 0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab
  • known issues: see #41

2016-12-06 beta release

  • paper: arXiv:1612.01840v1
  • code: git tag beta
  • fma_small.zip sha1: e731a5d56a5625f7b7f770923ee32922374e2cbf
  • fma_medium.zip sha1: fe23d6f2a400821ed1271ded6bcd530b7a8ea551

Acknowledgments and Licenses

We are grateful to the Swiss Data Science Center (EPFL and ETHZ) for hosting the dataset.

Please cite our work if you use our code or data.

@inproceedings{fma_dataset,
  title = {{FMA}: A Dataset for Music Analysis},
  author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
  booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
  year = {2017},
  archiveprefix = {arXiv},
  eprint = {1612.01840},
  url = {https://arxiv.org/abs/1612.01840},
}
@inproceedings{fma_challenge,
  title = {Learning to Recognize Musical Genre from Audio},
  subtitle = {Challenge Overview},
  author = {Defferrard, Micha\"el and Mohanty, Sharada P. and Carroll, Sean F. and Salath\'e, Marcel},
  booktitle = {The 2018 Web Conference Companion},
  year = {2018},
  publisher = {ACM Press},
  isbn = {9781450356404},
  doi = {10.1145/3184558.3192310},
  archiveprefix = {arXiv},
  eprint = {1803.05337},
  url = {https://arxiv.org/abs/1803.05337},
}
Owner
Michaël Defferrard
Research on machine learning and graphs. Open science, source, data.
Michaël Defferrard
The official implementation of CSG-Stump: A Learning Friendly CSG-Like Representation for Interpretable Shape Parsing

CSGStumpNet The official implementation of CSG-Stump: A Learning Friendly CSG-Like Representation for Interpretable Shape Parsing Paper | Project page

Daxuan 39 Dec 26, 2022
Human Pose estimation with TensorFlow framework

Human Pose Estimation with TensorFlow Here you can find the implementation of the Human Body Pose Estimation algorithm, presented in the DeeperCut and

Eldar Insafutdinov 1.1k Dec 29, 2022
Official Pytorch Implementation of Unsupervised Image Denoising with Frequency Domain Knowledge

Unsupervised Image Denoising with Frequency Domain Knowledge (BMVC 2021 Oral) : Official Project Page This repository provides the official PyTorch im

Donggon Jang 12 Sep 26, 2022
Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery (ICCV 2021)

Change is Everywhere Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery by Zhuo Zheng, Ailong Ma, Liangpei Zhang and Yanfei

Zhuo Zheng 125 Dec 13, 2022
Optimising chemical reactions using machine learning

Summit Summit is a set of tools for optimising chemical processes. We’ve started by targeting reactions. What is Summit? Currently, reaction optimisat

Sustainable Reaction Engineering Group 75 Dec 14, 2022
A fast implementation of bss_eval metrics for blind source separation

fast_bss_eval Do you have a zillion BSS audio files to process and it is taking days ? Is your simulation never ending ? Fear no more! fast_bss_eval i

Robin Scheibler 99 Dec 13, 2022
Let's Git - Versionsverwaltung & Open Source Hausaufgabe

Let's Git - Versionsverwaltung & Open Source Hausaufgabe Herzlich Willkommen zu dieser Hausaufgabe für unseren MOOC: Let's Git! Wir hoffen, dass Du vi

1 Dec 13, 2021
Implementation of "Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis"

Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis Abstract: This work targets at using a general deep lea

163 Dec 14, 2022
Supervised multi-SNE (S-multi-SNE): Multi-view visualisation and classification

S-multi-SNE Supervised multi-SNE (S-multi-SNE): Multi-view visualisation and classification A repository containing the code to reproduce the findings

Theodoulos Rodosthenous 3 Apr 15, 2022
ESP32 python application to read data from a Tilt™ Hydrometer for homebrewing

TitlESP32 ESP32 MicroPython application to read and log data from a Tilt™ Hydrometer. Requirements A board with an ESP32 chip USB cable - USB A / micr

IoBeer 5 Dec 01, 2022
A novel pipeline framework for multi-hop complex KGQA task. About the paper title: Improving Multi-hop Embedded Knowledge Graph Question Answering by Introducing Relational Chain Reasoning

Rce-KGQA A novel pipeline framework for multi-hop complex KGQA task. This framework mainly contains two modules, answering_filtering_module and relati

金伟强 -上海大学人工智能小渣渣~ 16 Nov 18, 2022
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Memory Efficient Attention Pytorch Implementation of a memory efficient multi-head attention as proposed in the paper, Self-attention Does Not Need O(

Phil Wang 180 Jan 05, 2023
The missing CMake project initializer

cmake-init - The missing CMake project initializer Opinionated CMake project initializer to generate CMake projects that are FetchContent ready, separ

1k Jan 01, 2023
A tiny, friendly, strong baseline code for Person-reID (based on pytorch).

Pytorch ReID Strong, Small, Friendly A tiny, friendly, strong baseline code for Person-reID (based on pytorch). Strong. It is consistent with the new

Zhedong Zheng 3.5k Jan 08, 2023
2021 Artificial Intelligence Diabetes Datathon

A.I.D.D. 2021 2021 Artificial Intelligence Diabetes Datathon A.I.D.D. 2021은 ‘2021 인공지능 학습용 데이터 구축사업’을 통해 만들어진 학습용 데이터를 활용하여 당뇨병을 효과적으로 예측할 수 있는가에 대한 A

2 Dec 27, 2021
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity Indic TTS Samples can be found at https://peter-yh-wu.github.io/cross-

Peter Wu 1 Nov 12, 2022
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

22 Dec 12, 2022
FAST-RIR: FAST NEURAL DIFFUSE ROOM IMPULSE RESPONSE GENERATOR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Anton Jeran Ratnarajah 89 Dec 22, 2022
Quantum-enhanced transformer neural network

Example of a Quantum-enhanced transformer neural network Get the code: git clone https://github.com/rdisipio/qtransformer.git cd qtransformer Create

Riccardo Di Sipio 61 Nov 08, 2022
DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021)

DPT This repo is the official implementation of DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021). We provide code and model

CASIA-IVA-Lab 111 Dec 21, 2022