The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

Overview

CircleCI Github Actions Codecov Documentation Status Pypi Version Black Python Versions DOI

This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory.

Additionally, contributions at the algorithm level are available in the package mlresearch.

Installation

A Python distribution of version 3.8 or 3.9 is required to run this project. Due to the computational limitations of the free tiers in CI/CD platforms, currently we cannot ensure compatibility with earlier Python versions.

ML-Research requires:

  • numpy (>= 1.14.6)
  • pandas (>= 1.3.5)
  • sklearn (>= 1.0.0)
  • imblearn (>= 0.8.0)
  • rich (>= 10.16.1)
  • matplotlib (>= 2.2.3)
  • seaborn (>= 0.9.0)
  • rlearn (>= 0.2.1)
  • pytorch (>= 1.10.1)
  • torchvision (>= 0.11.2)
  • pytorch_lightning (>= 1.5.8)

User Installation

If you already have a working installation of numpy and scipy, the easiest way to install scikit-learn is using pip :

pip install -U ml-research

The documentation includes more detailed installation instructions.

Installing from source

The following commands should allow you to setup the development version of the project with minimal effort:

# Clone the project.
git clone https://github.com/joaopfonseca/ml-research.git
cd ml-research

# Create and activate an environment 
make environment 
conda activate mlresearch # Adapt this line accordingly if you're not running conda

# Install project requirements and the research package
pip install .[tests,docs]

Citing ML-Research

If you use ML-Research in a scientific publication, we would appreciate citations to the following paper:

@article{Fonseca2021,
  doi = {10.3390/RS13132619},
  url = {https://doi.org/10.3390/RS13132619},
  keywords = {SMOTE,active learning,artificial data generation,land use/land cover classification,oversampling},
  year = {2021},
  month = {jul},
  publisher = {Multidisciplinary Digital Publishing Institute},
  volume = {13},
  pages = {2619},
  author = {Fonseca, Joao and Douzas, Georgios and Bacao, Fernando},
  title = {{Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification}},
  journal = {Remote Sensing}
}
You might also like...
A collection of 100 Deep Learning images and visualizations
A collection of 100 Deep Learning images and visualizations

A collection of Deep Learning images and visualizations. The project has been developed by the AI Summer team and currently contains almost 100 images.

ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.
ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.

ManimML ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.

Easily pull telemetry data and create beautiful visualizations for analysis.
Easily pull telemetry data and create beautiful visualizations for analysis.

This repository is a work in progress. Anything and everything is subject to change. Porpo Table of Contents Porpo Table of Contents General Informati

Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.
Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.

mtomo Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation.

The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible
The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible

pyrelational is a python active learning library developed by Relation Therapeutics for rapidly implementing active learning pipelines from data management, model development (and Bayesian approximation), to creating novel active learning strategies.

Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.
Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.

Rayvens augments Ray with events. With Rayvens, Ray applications can subscribe to event streams, process and produce events. Rayvens leverages Apache

Memoized coduals - Shows that it is possible to implement reverse mode autodiff using a variation on the dual numbers called the codual numbers This repository contains the source code and data for reproducing results of Deep Continuous Clustering paper
This repository contains the source code and data for reproducing results of Deep Continuous Clustering paper

Deep Continuous Clustering Introduction This is a Pytorch implementation of the DCC algorithms presented in the following paper (paper): Sohil Atul Sh

Comments
  • Consider modifying default BYOL hyper-parameters for smaller batch sizes

    Consider modifying default BYOL hyper-parameters for smaller batch sizes

    Applicable to both BYOL and SimSiam: Some hyperparameters might need to be added. Some are hard-coded to the default values.

    Taken from the BYOL paper: Screenshot from 2022-03-18 17-54-43

    opened by joaopfonseca 1
  • Remove computer vision models, augmentations and datasets

    Remove computer vision models, augmentations and datasets

    They will be removed in the next release since:

    1. I'm not going to used these methods anytime soon and I don't have the time to test them properly
    2. They are out of scope of the library. It is meant to be used for machine learning techniques, focused on tabular data. In the feature it may be worth considering the development of another library for computer vision, for example.
    3. Setting Pytorch as a dependency for a reduced part of the library isn't particularly efficient.
    wontfix 
    opened by joaopfonseca 0
  • Host all raw data from datasets submodule elsewhere

    Host all raw data from datasets submodule elsewhere

    With Python 3.11, downloading some datasets returns an SSL error (when unsafe legacy renegotiation disabled). It happens when the server doesn't support "RFC 5746 secure renegotiation" and the client is using OpenSSL 3, which enforces that standard by default (source).

    Hosting the raw data elsewhere should fix this issue.

    bug 
    opened by joaopfonseca 0
  • Review and add examples to documentation

    Review and add examples to documentation

    The readthedocs page is getting a bit outdated:

    • [x] Add support for Python 3.10
    • [ ] Add support for Python 3.11
    • [ ] Check for missing, deleted or renamed functions and objects
    • [ ] Review content as a whole
    • [ ] Add examples to documentation
    • [ ] Add dependency groups to documentation
    • [ ] README contains dependencies that will no longer be used
    documentation 
    opened by joaopfonseca 0
Releases(v0.4a2)
  • v0.4a2(Jan 2, 2023)

    NOTE: This pre-release contains implementations of algorithms for Self-supervised learning (BYOL and SimSiam). This release also contains objects to download image data from Pytorch and general definitions for image augmentations. They will be removed in the next release since:

    1. I'm not going to used these methods anytime soon and I don't have the time to test them properly
    2. They are out of scope of the library. It is meant to be used for machine learning techniques, focused on tabular data. In the feature it may be worth considering the development of another library for computer vision, for example.
    3. Setting Pytorch as a dependency for a reduced part of the library isn't particularly efficient.

    Full Changelog: https://github.com/joaopfonseca/ml-research/compare/v0.4a1...v0.4a2

    Source code(tar.gz)
    Source code(zip)
  • v0.4a1(Apr 14, 2022)

  • v0.3.4(Feb 14, 2022)

  • v0.3.3(Feb 14, 2022)

  • v0.3.2(Feb 14, 2022)

  • v0.3.1(Feb 14, 2022)

  • v0.3.0(Feb 14, 2022)

  • v0.2.1(Feb 14, 2022)

  • v0.2.0(Feb 14, 2022)

  • 0.1.0(Feb 14, 2022)

Owner
João Fonseca
PhD student | Researcher | Invited lecturer @ NOVA Information Management School
João Fonseca
Functional deep learning

Pipeline abstractions for deep learning. Full documentation here: https://lf1-io.github.io/padl/ PADL: is a pipeline builder for PyTorch. may be used

LF1 101 Nov 09, 2022
Repo for "Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks"

Summary This is the code for the paper Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks by Yanxiang Wang, Xian Zh

zhangxian 54 Jan 03, 2023
U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."

Xuebin Qin 6.5k Jan 09, 2023
[SIGGRAPH 2021 Asia] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning

DeepVecFont This is the official Pytorch implementation of the paper: Yizhi Wang and Zhouhui Lian. DeepVecFont: Synthesizing High-quality Vector Fonts

Yizhi Wang 146 Dec 18, 2022
Visual Question Answering in Pytorch

Visual Question Answering in pytorch /!\ New version of pytorch for VQA available here: https://github.com/Cadene/block.bootstrap.pytorch This repo wa

Remi 672 Jan 01, 2023
RANZCR-CLiP 7th Place Solution

RANZCR-CLiP 7th Place Solution This repository is WIP. (18 Mar 2021) Installation git clone https://github.com/analokmaus/kaggle-ranzcr-clip-public.gi

Hiroshechka Y 21 Oct 22, 2022
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 EAMLP will come soon Jitto

MenghaoGuo 357 Dec 11, 2022
Share a benchmark that can easily apply reinforcement learning in Job-shop-scheduling

Gymjsp Gymjsp is an open source Python library, which uses the OpenAI Gym interface for easily instantiating and interacting with RL environments, and

134 Dec 08, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
GUI for TOAD-GAN, a PCG-ML algorithm for Token-based Super Mario Bros. Levels.

If you are using this code in your own project, please cite our paper: @inproceedings{awiszus2020toadgan, title={TOAD-GAN: Coherent Style Level Gene

Maren A. 13 Dec 14, 2022
BigbrotherBENL - Face recognition on the Big Brother episodes in Belgium and the Netherlands.

BigbrotherBENL - Face recognition on the Big Brother episodes in Belgium and the Netherlands. Keeping statistics of whom are most visible and recognisable in the series and wether or not it has an im

Frederik 2 Jan 04, 2022
Code for our NeurIPS 2021 paper 'Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation'

Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation (NeurIPS 2021) Code for our NeurIPS 2021 paper 'Exploiting the Intri

Shiqi Yang 53 Dec 25, 2022
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines.

The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines. It includes tools for downloading pipelines and their dependencies and tools for measuring their performace

8 Dec 04, 2022
Music Classification: Beyond Supervised Learning, Towards Real-world Applications

Music Classification: Beyond Supervised Learning, Towards Real-world Applications

104 Dec 15, 2022
Multilingual Image Captioning

Multilingual Image Captioning Authors: Bhavitvya Malik, Gunjan Chhablani Demo Link: https://huggingface.co/spaces/flax-community/multilingual-image-ca

Gunjan Chhablani 32 Nov 25, 2022
Ranger deep learning optimizer rewrite to use newest components

Ranger21 - integrating the latest deep learning components into a single optimizer Ranger deep learning optimizer rewrite to use newest components Ran

Less Wright 266 Dec 28, 2022
TrackTech: Real-time tracking of subjects and objects on multiple cameras

TrackTech: Real-time tracking of subjects and objects on multiple cameras This project is part of the 2021 spring bachelor final project of the Bachel

5 Jun 17, 2022
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Population-Based Bandits (PB2) Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimiza

Jack Parker-Holder 22 Nov 16, 2022
Prototypical Networks for Few shot Learning in PyTorch

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 835 Jan 08, 2023