Accelerating model creation and evaluation.

Last update: Dec 06, 2021

Overview

EmeraldML

A machine learning library for streamlining the process of
(1) cleaning and splitting data,
(2) training, optimizing, and testing various models based on the task, and
(3) scoring and ranking them
during the exploratory phase for an elementary analysis of which models perform better for a specific dataset.

Installation

Dependencies

Python (>= 3.7)
NumPy (>= 1.21.2)
pandas (>= 1.3.3)
scikit-learn (>= 0.24.2)
statsmodels (>= 0.12.2)

User installation

pip install emeraldml

Development

Source code

You can check the latest sources with the command:

git clone https://github.com/yu3ufff/emeraldml.git

Demo

Getting the data:

import pandas as pd
audi = pd.read_csv('audi.csv')
audi.head()

|    | model   |   year |   price | transmission   |   mileage | fuelType   |   tax |   mpg |   engineSize |
|---:|:--------|-------:|--------:|:---------------|----------:|:-----------|------:|------:|-------------:|
|  0 | A1      |   2017 |   12500 | Manual         |     15735 | Petrol     |   150 |  55.4 |          1.4 |
|  1 | A6      |   2016 |   16500 | Automatic      |     36203 | Diesel     |    20 |  64.2 |          2   |
|  2 | A1      |   2016 |   11000 | Manual         |     29946 | Petrol     |    30 |  55.4 |          1.4 |
|  3 | A4      |   2017 |   16800 | Automatic      |     25952 | Diesel     |   145 |  67.3 |          2   |
|  4 | A3      |   2019 |   17300 | Manual         |      1998 | Petrol     |   145 |  49.6 |          1   |

Using EmeraldML:

import emerald
from emerald.boa import RegressionBoa

rboa = RegressionBoa(random_state=3)
rboa.hunt(data=audi, target='price')
rboa.ladder

[(OptimalRFRegressor, 0.9624889664024406),
 (OptimalDTRegressor, 0.9514992411732952),
 (OptimalKNRegressor, 0.9511411883559433),
 (OptimalLinearRegression, 0.8876961846248467),
 (OptimalABRegressor, 0.8491539140007975)]

for i in range(len(rboa)):
    print(rboa.model(i))

RandomForestRegressor(min_samples_split=5, n_estimators=500, random_state=3)
DecisionTreeRegressor(max_depth=15, min_samples_split=10, random_state=3)
KNeighborsRegressor(n_neighbors=3, p=1)
LinearRegression()
AdaBoostRegressor(learning_rate=0.1, n_estimators=100, random_state=3)

Accelerating model creation and evaluation.

Related tags

Overview

EmeraldML

Installation

Dependencies

User installation

Development

Source code

Demo

Owner

Yusuf

Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

vortex particles for simulating smoke in 2d

Skforecast is a python library that eases using scikit-learn regressors as multi-step forecasters

PySurvival is an open source python package for Survival Analysis modeling

scikit-multimodallearn is a Python package implementing algorithms multimodal data.

Accelerating model creation and evaluation.

A collection of video resources for machine learning

K-means clustering is a method used for clustering analysis, especially in data mining and statistics.

Bodywork deploys machine learning projects developed in Python, to Kubernetes.

Transform ML models into a native code with zero dependencies

cuML - RAPIDS Machine Learning Library

Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Library for machine learning stacking generalization.

CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system

Drug prediction

Causal Inference and Machine Learning in Practice with EconML and CausalML: Industrial Use Cases at Microsoft, TripAdvisor, Uber

A Time Series Library for Apache Spark

Laporan Proyek Machine Learning - Azhar Rizki Zulma

Napari sklearn decomposition

Fit interpretable models. Explain blackbox machine learning.