A workshop with several modules to help learn Feast, an open-source feature store

Last update: Jan 05, 2023

Overview

Workshop: Learning Feast

This workshop aims to teach users about Feast, an open-source feature store.

We explain concepts & best practices by example, and also showcase how to address common use cases.

What is Feast?

Feast is an operational system for managing and serving machine learning features to models in production. It can serve features from a low-latency online store (for real-time prediction) or from an offline store (for batch scoring).

Why Feast?

Feast solves several common challenges teams face:

Lack of feature reuse across teams
Complex point-in-time-correct data joins for generating training data
Difficulty operationalizing features for online inference while minimizing training / serving skew
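To make the second challenge concrete, here is a minimal pure-Python sketch of a point-in-time-correct join. It is an illustration only, not Feast's implementation: the function name point_in_time_join, the tuple layouts, and the driver IDs are all hypothetical, and real feature stores add details this sketch ignores (for example, a TTL that discards feature values that are too old). For each training row, the sketch attaches the most recent feature value recorded at or before that row's event timestamp, so no value from the future leaks into the training data.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each (entity_id, event_ts) row the latest feature value
    whose timestamp is <= event_ts, or None if no such value exists.

    entity_rows:  iterable of (entity_id, event_ts)
    feature_rows: iterable of (entity_id, feature_ts, value)
    """
    # Group feature observations per entity, ordered by timestamp.
    by_entity = {}
    for entity_id, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        timestamps, values = by_entity.setdefault(entity_id, ([], []))
        timestamps.append(ts)
        values.append(value)

    joined = []
    for entity_id, event_ts in entity_rows:
        timestamps, values = by_entity.get(entity_id, ([], []))
        # Index of the first feature timestamp strictly after event_ts.
        idx = bisect_right(timestamps, event_ts)
        joined.append((entity_id, event_ts, values[idx - 1] if idx else None))
    return joined


# Hypothetical example data: a per-driver feature observed on two dates.
example_features = [
    ("driver_1", datetime(2023, 1, 1), 0.5),
    ("driver_1", datetime(2023, 1, 3), 0.9),
]
example_entities = [
    ("driver_1", datetime(2023, 1, 2)),  # only the Jan 1 value is visible
    ("driver_1", datetime(2023, 1, 4)),  # the Jan 3 value is the latest
    ("driver_2", datetime(2023, 1, 2)),  # no feature data for this driver
]
example_joined = point_in_time_join(example_entities, example_features)
```

A naive join on entity ID alone would attach the Jan 3 value to the Jan 2 training row, which is exactly the kind of leakage a point-in-time join avoids.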

Pre-requisites

This workshop assumes you have the following installed:

A local development environment that supports running Jupyter notebooks (e.g. VS Code with the Jupyter extension)
Python 3.7+
Java 11 (for Spark, e.g. brew install java11)
pip
Docker & Docker Compose (e.g. brew install docker docker-compose)
Terraform (docs)
AWS CLI
An AWS account set up with credentials via aws configure (e.g. see AWS credentials quickstart)
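A quick way to check the command-line tools in the list above is to look for them on your PATH. The sketch below is not part of the workshop; the executable names are assumptions about a typical installation (for example, pip may be installed as pip3, and newer Docker releases ship Compose as the docker compose subcommand rather than a docker-compose binary), so adjust the list to your setup. It also does not verify versions or AWS credentials.

```python
import shutil

# Assumed executable names for the prerequisites listed above; adjust as needed.
REQUIRED_TOOLS = [
    "python3",
    "java",
    "pip",
    "docker",
    "docker-compose",
    "terraform",
    "aws",
]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tool names that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```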

Since we'll be learning how to leverage Feast in CI/CD, you'll also need to fork this workshop repository.

Caveats

Development on M1 MacBooks is untested with this flow. See also How to run / develop for Feast on M1 Macs.
Windows development has only been tested with WSL. You will need to follow this guide to get Docker working with WSL.

Modules

The modules are meant to be done mostly in order, with each example building on concepts from the previous ones.

Time (min)	Description	Module
30-45	Setting up Feast projects & CI/CD + powering batch predictions	Module 0
15-20	Streaming ingestion & online feature retrieval with Kafka, Spark, Redis	Module 1
10-15	Real-time feature engineering with on demand transformations	Module 2
TBD	Feature server deployment (embed, as a service, AWS Lambda)	TBD
TBD	Versioning features / models in Feast	TBD
TBD	Data quality monitoring in Feast	TBD
TBD	Batch transformations	TBD
TBD	Stream transformations	TBD

Owner

Feast
