Traingenerator 🧙 A web app to generate template code for machine learning ✨

Overview

Traingenerator

🧙   A web app to generate template code for machine learning ✨

Gitter Heroku Code style: black



🎉 Traingenerator is now live! 🎉

Try it out:
https://traingenerator.jrieke.com


Generate custom template code for PyTorch & sklearn, using a simple web UI built with streamlit. Traingenerator offers multiple options for preprocessing, model setup, training, and visualization (using Tensorboard or comet.ml). It exports to .py, Jupyter Notebook, or Google Colab. The perfect tool to jumpstart your next machine learning project!


For updates, follow me on Twitter, and if you like this project, please consider sponsoring ☺




Adding new templates

You can add your own template in 4 easy steps (see below), without changing any code in the app itself. Your new template will be automatically discovered by Traingenerator and shown in the sidebar. That's it! 🎈

Want to share your magic? 🧙 PRs are welcome! Please have a look at CONTRIBUTING.md and write on Gitter.

Some ideas for new templates: Keras/Tensorflow, Pytorch Lightning, object detection, segmentation, text classification, ...

  1. Create a folder under ./templates. The folder name should be the task that your template solves (e.g. Image classification). Optionally, you can add a framework name (e.g. Image classification_PyTorch). Both names are automatically shown in the first two dropdowns in the sidebar (see image). ✨ Tip: Copy the example template to get started more quickly.
  2. Add a file sidebar.py to the folder (see example). It needs to contain a method show(), which displays all template-specific streamlit components in the sidebar (i.e. everything below Task) and returns a dictionary of user inputs.
  3. Add a file code-template.py.jinja to the folder (see example). This Jinja2 template is used to generate the code. You can write normal Python code in it and modify it (through Jinja) based on the user inputs in the sidebar (e.g. insert a parameter value from the sidebar or show different code parts based on the user's selection).
  4. Optional: Add a file test-inputs.yml to the folder (see example). This simple YAML file should define a few possible user inputs that can be used for testing. If you run pytest (see below), it will automatically pick up this file, render the code template with its values, and check that the generated code runs without errors. This file is optional – but it's required if you want to contribute your template to this repo.

Installation

Note: You only need to install Traingenerator if you want to contribute or run it locally. If you just want to use it, go here.

git clone https://github.com/jrieke/traingenerator.git
cd traingenerator
pip install -r requirements.txt

Optional: For the "Open in Colab" button to work you need to set up a Github repo where the notebook files can be stored (Colab can only open public files if they are on Github). After setting up the repo, create a file .env with content:

GITHUB_TOKEN=<your-github-access-token>
REPO_NAME=<user/notebooks-repo>

If you don't set this up, the app will still work but the "Open in Colab" button will only show an error message.

Running locally

streamlit run app/main.py

Make sure to run always from the traingenerator dir (not from the app dir), otherwise the app will not be able to find the templates.

Deploying to Heroku

First, install heroku and login. To create a new deployment, run inside traingenerator:

heroku create
git push heroku main
heroku open

To update the deployed app, commit your changes and run:

git push heroku main

Optional: If you set up a Github repo to enable the "Open in Colab" button (see above), you also need to run:

heroku config:set GITHUB_TOKEN=
   
    
heroku config:set REPO_NAME=
    

    
   

Testing

First, install pytest and required plugins via:

pip install -r requirements-dev.txt

To run all tests:

pytest ./tests

Note that this only tests the code templates (i.e. it renders them with different input values and makes sure that the code executes without error). The streamlit app itself is not tested at the moment.

You can also test an individual template by passing the name of the template dir to --template, e.g.:

pytest ./tests --template "Image classification_scikit-learn"

The mage image used in Traingenerator is from Twitter's Twemoji library and released under Creative Commons Attribution 4.0 International Public License.

Owner
Johannes Rieke
Product manager dev experience @streamlit
Johannes Rieke
Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with Dask to handle millions of rows.

Auto_TS: Auto_TimeSeries Automatically build multiple Time Series models using a Single Line of Code. Now updated with Dask. Auto_timeseries is a comp

AutoViz and Auto_ViML 519 Jan 03, 2023
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Horovod Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make dis

Horovod 12.9k Jan 07, 2023
A Pythonic framework for threat modeling

pytm: A Pythonic framework for threat modeling Introduction Traditional threat modeling too often comes late to the party, or sometimes not at all. In

Izar Tarandach 644 Dec 20, 2022
Code Repository for Machine Learning with PyTorch and Scikit-Learn

Code Repository for Machine Learning with PyTorch and Scikit-Learn

Sebastian Raschka 1.4k Jan 03, 2023
ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions

A library for debugging/inspecting machine learning classifiers and explaining their predictions

154 Dec 17, 2022
A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al.

pyUpSet A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al. Contents Purpose How to install How it work

288 Jan 04, 2023
Combines Bayesian analyses from many datasets.

PosteriorStacker Combines Bayesian analyses from many datasets. Introduction Method Tutorial Output plot and files Introduction Fitting a model to a d

Johannes Buchner 19 Feb 13, 2022
Traingenerator 🧙 A web app to generate template code for machine learning ✨

Traingenerator 🧙 A web app to generate template code for machine learning ✨ 🎉 Traingenerator is now live! 🎉

Johannes Rieke 1.2k Jan 07, 2023
Python bindings for MPI

MPI for Python Overview Welcome to MPI for Python. This package provides Python bindings for the Message Passing Interface (MPI) standard. It is imple

MPI for Python 604 Dec 29, 2022
K-Means clusternig example with Python and Scikit-learn

Unsupervised-Machine-Learning Flat Clustering K-Means clusternig example with Python and Scikit-learn Flat clustering Clustering algorithms group a se

Emin 1 Dec 13, 2021
Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores

Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores

Oracle 95 Dec 28, 2022
Model factory is a ML training platform to help engineers to build ML models at scale

Model Factory Machine learning today is powering many businesses today, e.g., search engine, e-commerce, news or feed recommendation. Training high qu

16 Sep 23, 2022
TorchDrug is a PyTorch-based machine learning toolbox designed for drug discovery

A powerful and flexible machine learning platform for drug discovery

MilaGraph 1.1k Jan 08, 2023
Hierarchical Time Series Forecasting using Prophet

htsprophet Hierarchical Time Series Forecasting using Prophet Credit to Rob J. Hyndman and research partners as much of the code was developed with th

Collin Rooney 131 Dec 02, 2022
ArviZ is a Python package for exploratory analysis of Bayesian models

ArviZ (pronounced "AR-vees") is a Python package for exploratory analysis of Bayesian models. Includes functions for posterior analysis, data storage, model checking, comparison and diagnostics

ArviZ 1.3k Jan 05, 2023
Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Trading Tesla with Machine Learning and Sentiment Analysis An interactive program to train a Random Forest Classifier to predict Tesla daily prices us

Renato Votto 31 Nov 17, 2022
MLR - Machine Learning Research

Machine Learning Research 1. Project Topic 1.1. Exsiting research Benmark: https://paperswithcode.com/sota ACL anthology for NLP papers: http://www.ac

Charles 69 Oct 20, 2022
Data Efficient Decision Making

Data Efficient Decision Making

Microsoft 197 Jan 06, 2023
The Emergence of Individuality

The Emergence of Individuality

16 Jul 20, 2022
pandas, scikit-learn, xgboost and seaborn integration

pandas, scikit-learn and xgboost integration.

299 Dec 30, 2022