MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

Last update: Dec 26, 2022

Overview

MINIROCKET

arXiv:2012.08791 (preprint)

Until recently, the most accurate methods for time series classification were limited by high computational complexity. ROCKET achieves state-of-the-art accuracy with a fraction of the computational expense of most existing methods by transforming input time series using random convolutional kernels, and using the transformed features to train a linear classifier. We reformulate ROCKET into a new method, MINIROCKET, making it up to 75 times faster on larger datasets, and making it almost deterministic (and optionally, with additional computational expense, fully deterministic), while maintaining essentially the same accuracy. Using this method, it is possible to train and test a classifier on all of 109 datasets from the UCR archive to state-of-the-art accuracy in less than 10 minutes. MINIROCKET is significantly faster than any other method of comparable accuracy (including ROCKET), and significantly more accurate than any other method of even roughly-similar computational expense. As such, we suggest that MINIROCKET should now be considered and used as the default variant of ROCKET.

Please cite as:

@article{dempster_etal_2020,
  author  = {Dempster, Angus and Schmidt, Daniel F and Webb, Geoffrey I},
  title   = {{MINIROCKET}: A Very Fast (Almost) Deterministic Transform for Time Series Classification},
  year    = {2020},
  journal = {arXiv:2012.08791}
}

`sktime`* / Multivariate

MINIROCKET (including a basic multivariate implementation) is also available through sktime. See the examples.

* for larger datasets (10,000+ training examples), integrate the sktime methods with SGD or similar, as per softmax.py: replace the calls to fit(...) and transform(...) from minirocket.py with calls to the corresponding sktime methods

Results

UCR Archive (109 Datasets, 30 Resamples)
- Mean Accuracy + Training/Test Times
- Accuracy Per Resample
Scalability / Training Set Size*
- MosquitoSound (139,780 × 3,750)
- InsectSound (25,000 × 600)
- FruitFlies (17,259 × 5,000)
Scalability / Time Series Length
- DucksAndGeese (50 × 236,784)

* num_training_examples does not include the validation set of 2,048 training examples, but the transform time for the validation set is included in time_training_seconds

Requirements*

Python, NumPy, pandas
Numba (0.50+)
scikit-learn or similar
PyTorch or similar (for larger datasets)

* all pre-packaged with or otherwise available through Anaconda

Code

`minirocket.py`

`minirocket_dv.py` (MINIROCKET_DV)

`softmax.py` (PyTorch / 10,000+ Training Examples)

`minirocket_multivariate.py` (equivalent to sktime/MiniRocketMultivariate)

`minirocket_variable.py` (variable-length input; experimental)

Important Notes

Compilation

The functions in minirocket.py and minirocket_dv.py are compiled by Numba on import, which may take some time. By default, the compiled functions are now cached, so this should only happen once (i.e., on the first import).

Input Data Type

Input data should be of type np.float32. Alternatively, you can change the Numba signatures to accept, e.g., np.float64.
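
As a minimal sketch of the conversion (the array X below is made-up stand-in data, not part of minirocket.py), a float64 array can be cast with NumPy before it is passed to fit(...) or transform(...):

```python
import numpy as np

# hypothetical example data: 4 time series of length 6 (float64 by default)
X = np.random.default_rng(0).normal(size=(4, 6))

# cast to np.float32, the type expected by the compiled functions
X_32 = X.astype(np.float32)

print(X.dtype, "->", X_32.dtype)
```

Casting to np.float32 halves the memory footprint relative to np.float64, at the cost of some precision.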

Normalisation

Unlike ROCKET, MINIROCKET does not require the input time series to be normalised. (However, whether or not it makes sense to normalise the input time series may depend on your particular application.)
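
If normalisation does make sense for your application, one common choice is per-series z-normalisation. The sketch below is an illustration only: z_normalise and its eps parameter are hypothetical names, not part of minirocket.py.

```python
import numpy as np

def z_normalise(X, eps=1e-8):
    """Scale each row (one time series per row) to zero mean and unit standard deviation."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    # eps guards against division by zero for constant series
    return ((X - mean) / (std + eps)).astype(np.float32)

# hypothetical toy data: 2 time series of length 4
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 10.0, 30.0, 30.0]], dtype=np.float32)

X_normalised = z_normalise(X)
```

The result stays np.float32, so it can be passed on without a further cast.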

Examples

MINIROCKET

import numpy as np
from minirocket import fit, transform
from sklearn.linear_model import RidgeClassifierCV

[...] # load data, etc.

# note:
# * input time series do *not* need to be normalised
# * input data should be np.float32

parameters = fit(X_training)

X_training_transform = transform(X_training, parameters)

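# note: the normalize argument was removed in scikit-learn 1.2; with newer
# versions, drop it and scale the features first (e.g., StandardScaler)
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)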
classifier.fit(X_training_transform, Y_training)

X_test_transform = transform(X_test, parameters)

predictions = classifier.predict(X_test_transform)

MINIROCKET_DV

import numpy as np
from minirocket_dv import fit_transform
from minirocket import transform
from sklearn.linear_model import RidgeClassifierCV

[...] # load data, etc.

# note:
# * input time series do *not* need to be normalised
# * input data should be np.float32

parameters, X_training_transform = fit_transform(X_training)

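# note: the normalize argument was removed in scikit-learn 1.2; with newer
# versions, drop it and scale the features first (e.g., StandardScaler)
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)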
classifier.fit(X_training_transform, Y_training)

X_test_transform = transform(X_test, parameters)

predictions = classifier.predict(X_test_transform)

PyTorch / 10,000+ Training Examples

from softmax import train, predict

model_etc = train("InsectSound_TRAIN_shuffled.csv", num_classes = 10, training_size = 22952)
# note: 22,952 = 25,000 - 2,048 (validation)

predictions, accuracy = predict("InsectSound_TEST.csv", *model_etc)

Variable-Length Input (Experimental)

import numpy as np
from minirocket_variable import fit, transform, filter_by_length
from sklearn.linear_model import RidgeClassifierCV

[...] # load data, etc.

# note:
# * input time series do *not* need to be normalised
# * input data should be np.float32

# special instructions for variable-length input:
# * concatenate variable-length input time series into a single 1d numpy array
# * provide another 1d array with the lengths of each of the input time series
# * input data should be np.float32 (as above); lengths should be np.int32

# optionally, use a different reference length when setting dilation (default is
# the length of the longest time series), and use fit(...) with time series of
# at least this length, e.g.:
# >>> reference_length = X_training_lengths.mean()
# >>> X_training_1d_filtered, X_training_lengths_filtered = \
# >>> filter_by_length(X_training_1d, X_training_lengths, reference_length)
# >>> parameters = fit(X_training_1d_filtered, X_training_lengths_filtered, reference_length)

parameters = fit(X_training_1d, X_training_lengths)

X_training_transform = transform(X_training_1d, X_training_lengths, parameters)

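# note: the normalize argument was removed in scikit-learn 1.2; with newer
# versions, drop it and scale the features first (e.g., StandardScaler)
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)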
classifier.fit(X_training_transform, Y_training)

X_test_transform = transform(X_test_1d, X_test_lengths, parameters)

predictions = classifier.predict(X_test_transform)
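
The special instructions for variable-length input (a single concatenated 1d array plus an array of lengths) can be sketched with plain NumPy. This is an illustration under assumptions: pack_variable_length is a hypothetical helper, not a function in minirocket_variable.py, and the toy series are made up.

```python
import numpy as np

def pack_variable_length(series_list):
    """Concatenate 1d series into one float32 array, plus an int32 array of their lengths."""
    lengths = np.array([len(s) for s in series_list], dtype=np.int32)
    X_1d = np.concatenate([np.asarray(s, dtype=np.float32) for s in series_list])
    return X_1d, lengths

# hypothetical toy data: three series of lengths 3, 5 and 2
series = [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0, 4.0, 5.0], [7.0, 8.0]]

X_training_1d, X_training_lengths = pack_variable_length(series)

# start offset of each series within the packed array, used to slice one back out
offsets = np.concatenate(([0], np.cumsum(X_training_lengths)[:-1]))
second = X_training_1d[offsets[1]:offsets[1] + X_training_lengths[1]]
```

The packed array and the lengths array are the two inputs the example above passes to fit(...) and transform(...); the offsets are only needed if you want to recover an individual series.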

Acknowledgements

We thank Professor Eamonn Keogh and all the people who have contributed to the UCR time series classification archive. Figures in our paper showing mean ranks were produced using code from Ismail Fawaz et al. (2019).

🚀
