Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Overview

ToeplitzLDA

Code for the ToeplitzLDA classifier proposed in here. The classifier conforms sklearn and can be used as a drop-in replacement for other LDA classifiers. For in-depth usage refer to the learning from label proportions (LLP) example or the example script.

Note we used Ubuntu 20.04 with python 3.8.10 to generate our results.

Getting Started / User Setup

If you only want to use this library, you can use the following setup. Note that this setup is based on a fresh Ubuntu 20.04 installation.

Getting fresh ubuntu ready

apt install python3-pip python3-venv

Python package installation

In this setup, we assume you want to run the examples that actually make use of real EEG data or the actual unsupervised speller replay. If you only want to employ ToeplitzLDA in your own spatiotemporal data / without mne and moabb then you can remove the package extra neuro, i.e. pip install toeplitzlda or pip install toeplitzlda[solver]

  1. (Optional) Install fortran Compiler. On ubuntu: apt install gfortran
  2. Create virtual environment: python3 -m venv toeplitzlda_venv
  3. Activate virtual environment: source toeplitzlda_venv/bin/activate
  4. Install toeplitzlda: pip install toeplitzlda[neuro,solver], if you dont have a fortran compiler: pip install toeplitzlda[neuro]

Check if everything works

Either clone this repo or just download the scripts/example_toeplitz_lda_bci_data.py file and run it: python example_toeplitz_lda_bci_data.py. Note that this will automatically download EEG data with a size of around 650MB.

Alternatively, you can use the scripts/example_toeplitz_lda_generated_data.py where artificial data is generated. Note however, that only stationary background noise is modeled and no interfering artifacts as is the case in, e.g., real EEG data. As a result, the overfitting effect of traditional slda on these artifacts is reduced.

Using ToeplitzLDA in place of traditional shrinkage LDA from sklearn

If you have already your own pipeline, you can simply add toeplitzlda as a dependency in your project and then replace sklearns LDA, i.e., instead of:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")  # or eigen solver

use

from toeplitzlda.classification import ToeplitzLDA
clf = ToeplitzLDA(n_channels=your_n_channels)

where your_n_channels is the number of channels of your signal and needs to be provided for this method to work.

If you prefer using sklearn, you can only replace the covariance estimation part, note however, that this in practice (on our data) yields worse performance, as sklearn estimates the class-wise covariance matrices and averages them afterwards, whereas we remove the class-wise means and the estimate one covariance matrix from the pooled data.

So instead of:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")  # or eigen solver

you would use

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from toeplitzlda.classification.covariance import ToepTapLW
toep_cov = ToepTapLW(n_channels=your_n_channels)
clf = LinearDiscriminantAnalysis(solver="lsqr", covariance_estimator=toep_cov)  # or eigen solver

Development Setup

We use a fortran compiler to provide speedups for solving block-Toeplitz linear equation systems. If you are on ubuntu you can install gfortran.

We use poetry for dependency management. If you have it installed you can simply use poetry install to set up the virtual environment with all dependencies. All extra features can be installed with poetry install -E solver,neuro.

If setup does not work for you, please open an issue. We cannot guarantee support for many different platforms, but could provide a singularity image.

Learning from label proportions

Use the run_llp.py script to apply ToeplitzLDA in the LLP scenario and create the results file for the different preprocessing parameters. These can then be visualized using visualize_llp.py to create the plots shown in our publication. Note that running LLP takes a while and the two datasets will be downloaded automatically and are approximately 16GB in size. Alternatively, you can use the results provided by us that are stored in scripts/usup_replay/provided_results by moving/copying them to the location that visualize_llp.py looks for.

ERP benchmark

This is not yet available.

Note this benchmark will take quite a long time if you do not have access to a computing cluster. The public datasets (including the LLP datasets) total a size of approximately 120GB.

BLOCKING TODO: How should we handle the private datasets?

FAQ

Why is my classification performance for my stationary spatiotemporal data really bad?

Check if your data is in channel-prime order, i.e., in the flattened feature vector, you first enumerate over all channels (or some other spatially distributed sensors) for the first time point and then for the second time point and so on. If this is not the case, tell the classifier: e.g. ToeplitzLDA(n_channels=16, data_is_channel_prime=False)

Owner
Jan Sosulski
Jan Sosulski
PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning.

neural-combinatorial-rl-pytorch PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning. I have implemented the basic

Patrick E. 454 Jan 06, 2023
[Arxiv preprint] Causality-inspired Single-source Domain Generalization for Medical Image Segmentation (code&data-processing pipeline)

Causality-inspired Single-source Domain Generalization for Medical Image Segmentation Arxiv preprint Repository under construction. Might still be bug

Cheng 31 Dec 27, 2022
Fashion Recommender System With Python

Fashion-Recommender-System Thr growing e-commerce industry presents us with a la

Omkar Gawade 2 Feb 02, 2022
Machine Learning Time-Series Platform

cesium: Open-Source Platform for Time Series Inference Summary cesium is an open source library that allows users to: extract features from raw time s

632 Dec 26, 2022
Awesome Human Pose Estimation

Human Pose Estimation Related Publication

Zhe Wang 1.2k Dec 26, 2022
Source code and dataset of the paper "Contrastive Adaptive Propagation Graph Neural Networks forEfficient Graph Learning"

CAPGNN Source code and dataset of the paper "Contrastive Adaptive Propagation Graph Neural Networks forEfficient Graph Learning" Paper URL: https://ar

1 Mar 12, 2022
Datasets, Transforms and Models specific to Computer Vision

vision Datasets, Transforms and Models specific to Computer Vision Installation First install the nightly version of OneFlow python3 -m pip install on

OneFlow 68 Dec 07, 2022
Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation

Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation The skip connections in U-Net pass features from the levels of enc

Boheng Cao 1 Dec 29, 2021
HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.

HTSeq DEVS: https://github.com/htseq/htseq DOCS: https://htseq.readthedocs.io A Python library to facilitate programmatic analysis of data from high-t

HTSeq 57 Dec 20, 2022
Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

beringresearch 285 Jan 04, 2023
[RSS 2021] An End-to-End Differentiable Framework for Contact-Aware Robot Design

DiffHand This repository contains the implementation for the paper An End-to-End Differentiable Framework for Contact-Aware Robot Design (RSS 2021). I

Jie Xu 60 Jan 04, 2023
Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

OSCAR Project Page | Paper This repository contains the codebase used in OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Ma

NVIDIA Research Projects 74 Dec 22, 2022
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS 2020) Introduction AdaShare is a novel and differentiable approach fo

94 Dec 22, 2022
General Multi-label Image Classification with Transformers

General Multi-label Image Classification with Transformers Jack Lanchantin, Tianlu Wang, Vicente Ordóñez Román, Yanjun Qi Conference on Computer Visio

QData 154 Dec 21, 2022
Code for Efficient Visual Pretraining with Contrastive Detection

Code for DetCon This repository contains code for the ICCV 2021 paper "Efficient Visual Pretraining with Contrastive Detection" by Olivier J. Hénaff,

DeepMind 56 Nov 13, 2022
Code accompanying the paper "Knowledge Base Completion Meets Transfer Learning"

Knowledge Base Completion Meets Transfer Learning This code accompanies the paper Knowledge Base Completion Meets Transfer Learning published at EMNLP

14 Nov 27, 2022
A Keras implementation of YOLOv4 (Tensorflow backend)

keras-yolo4 请使用更完善的版本: https://github.com/miemie2013/Keras-YOLOv4 Please visit here for more complete model: https://github.com/miemie2013/Keras-YOLOv

384 Nov 29, 2022
Deep learning with TensorFlow and earth observation data.

Deep Learning with TensorFlow and EO Data Complete file set for Jupyter Book Autor: Development Seed Date: 04 October 2021 ISBN: (to come) Notebook tu

Development Seed 20 Nov 16, 2022
Tensorflow implementation and notebooks for Implicit Maximum Likelihood Estimation

tf-imle Tensorflow 2 and PyTorch implementation and Jupyter notebooks for Implicit Maximum Likelihood Estimation (I-MLE) proposed in the NeurIPS 2021

NEC Laboratories Europe 69 Dec 13, 2022
The code of paper 'Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection'

Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection Pytorch implemetation of paper 'Learning to Aggregate and Personalize

Tencent YouTu Research 136 Dec 29, 2022