Repository for playing with computer vision apps: people analytics on Raspberry Pi.

Last update: Sep 23, 2021

Overview

play-with-torch

Tools

Tested Hardware

Raspberry Pi 4 Model B, RAM: 4 GB, processor: 4-core @ 1.5 GHz
microSD card, 64 GB
5M USB Retractable Clip 120 Degrees WebCam Web Wide-angle Camera Laptop U7 Mini, or Raspberry Pi Camera

Tested Software

Ubuntu Desktop 20.10 (aarch64, 64-bit), installed on Raspberry Pi 4
PyTorch: torch 1.6.0 aarch64 and torchvision 0.7.0 aarch64
Python: minimum version 3.6 (3.8 recommended)
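
A quick way to compare a board against the software list above is a small script like the one below. It is only a sketch: the helper names `python_status` and `is_aarch64` are made up for this example and are not part of the repository.

```python
import platform
import sys

MIN_PYTHON = (3, 6)          # minimum version listed above
RECOMMENDED_PYTHON = (3, 8)  # recommended version listed above


def python_status(version=None):
    """Classify a (major, minor, ...) version against the list above."""
    major_minor = tuple(version or sys.version_info)[:2]
    if major_minor < MIN_PYTHON:
        return "unsupported"
    if major_minor == RECOMMENDED_PYTHON:
        return "recommended"
    return "supported"


def is_aarch64(machine=None):
    """True when the machine string reports a 64-bit ARM system."""
    return (machine or platform.machine()).lower() in ("aarch64", "arm64")


if __name__ == "__main__":
    print("Python:", platform.python_version(), "->", python_status())
    print("Machine:", platform.machine(), "-> aarch64:", is_aarch64())
```

Running it on the Raspberry Pi before installing anything shows whether the interpreter and architecture match the tested setup.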

Install the prerequisites

Install packages

$ sudo apt install build-essential make cmake git python3-pip libatlas-base-dev
$ sudo apt install libssl-dev
$ sudo apt install libopenblas-dev libblas-dev m4 python3-yaml
$ sudo apt install libomp-dev

Increase swap space to 2048 MB

$ free -h
$ sudo swapoff -a
$ sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ free -h
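
Besides `free -h`, the swap size can be checked from Python by reading `/proc/meminfo`. This is a minimal sketch that assumes a Linux system. The function name `swap_total_mib` is chosen for this example. The threshold is 2047 rather than 2048 because a swap file reports slightly less than its file size.

```python
from pathlib import Path


def swap_total_mib(meminfo_text):
    """Return SwapTotal from /proc/meminfo-style text, in MiB (0 if absent)."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:  2097148 kB"
            kib = int(line.split()[1])
            return kib // 1024
    return 0


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        total = swap_total_mib(meminfo.read_text())
        print(f"Swap total: {total} MiB")
        if total < 2047:
            print("Swap is smaller than the 2048 MB configured above")
    else:
        print("/proc/meminfo not found; this check only works on Linux")
```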

Install torch 1.6.0

$ pip3 install torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl
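
The wheel filename encodes which interpreter and platform it was built for: `cp38` means CPython 3.8, and `linux_aarch64` means 64-bit ARM Linux. That is presumably why Python 3.8 is recommended above. The sketch below splits the filename into its tags and does a rough compatibility check. `wheel_tags` and `matches_interpreter` are illustrative names, and pip itself uses a more complete tag-matching procedure.

```python
import platform
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three dash-separated fields are the python, ABI and platform tags
    return tuple(parts[-3:])


def matches_interpreter(filename, version=None, machine=None):
    """Rough check that a CPython wheel fits this interpreter and CPU."""
    python_tag, _abi_tag, platform_tag = wheel_tags(filename)
    major, minor = tuple(version or sys.version_info)[:2]
    machine = machine or platform.machine()
    return python_tag == f"cp{major}{minor}" and platform_tag.endswith("_" + machine)


if __name__ == "__main__":
    wheel = "torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl"
    print(wheel_tags(wheel))
    print("fits this interpreter:", matches_interpreter(wheel))
```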

Folder Structure

play-with-torch/
├── config/
│   ├── config.json - holds configuration for training
│   └── parse_config.py - class to handle config file and CLI options
│
├── docker/
│   ├── Dockerfile
│   └── requirements.txt
│
├── data/ - default directory for storing input data
│
├── docs/ - for documentation
│   └── play-with-torch.tex
│
├── models/ - models, losses, and metrics
│   ├── model.py
│   ├── metric.py
│   └── loss.py
│
├── samples/
│
├── saved/
│   ├── checkpoints/
│   ├── traced_model/
│   ├── models/ - trained models are saved here
│   └── logs/ - default logdir for tensorboard and logging output
│
├── site
├── templates/ - for serving model on Flask
│   └── index.html
├── tests/
├── utils/ - small utility functions
│   ├── data/
│   └── ...
│
├── inference.py - main script to run inference with a model
├── README.md
├── trace_model.py - main script to convert (trace) a model
└── train.py - main script to start training
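
Some of the directories in the tree, such as those under `saved/`, may not exist in a fresh checkout. The sketch below reports and creates the missing ones. It is not part of the repository: `EXPECTED_DIRS` is copied from the tree above, the helper names are made up for this example, and the demo at the bottom runs in a scratch directory.

```python
import tempfile
from pathlib import Path

# Directories taken from the tree above; adjust if your checkout differs
EXPECTED_DIRS = (
    "config", "data", "models", "utils",
    "saved/checkpoints", "saved/traced_model", "saved/models", "saved/logs",
)


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """List the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in expected if not (root / d).is_dir()]


def create_missing(repo_root, expected=EXPECTED_DIRS):
    """Create every expected directory that is missing under repo_root."""
    for d in missing_dirs(repo_root, expected):
        (Path(repo_root) / d).mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    # Demonstrate on a scratch directory so a real checkout is left untouched
    scratch = tempfile.mkdtemp()
    print("missing before:", missing_dirs(scratch))
    create_missing(scratch)
    print("missing after:", missing_dirs(scratch))
```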

Usage

Run inference

$ git clone https://github.com/mheriyanto/play-with-torch.git
$ cd play-with-torch/
$ python3 inference.py video --config config/nanodet-m.yml --model saved/models/nanodet_m.ckpt --path video.mp4
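
The inference command takes one positional argument (`video`) followed by three options. The `argparse` sketch below shows how that command line breaks down. It is a hypothetical stand-in, not the actual parser in `inference.py`: the argument name `mode` and the help texts are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts the same shape of command as shown above."""
    parser = argparse.ArgumentParser(
        description="Hypothetical stand-in for the inference.py command line"
    )
    # Only the value 'video' appears in this README; other values are not documented here
    parser.add_argument("mode", help="input type, e.g. video")
    parser.add_argument("--config", required=True, help="path to the model config (YAML)")
    parser.add_argument("--model", required=True, help="path to the checkpoint (.ckpt)")
    parser.add_argument("--path", required=True, help="path to the input file")
    return parser


def parse_command(argv):
    """Parse a list of command-line tokens into a namespace."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_command([
        "video",
        "--config", "config/nanodet-m.yml",
        "--model", "saved/models/nanodet_m.ckpt",
        "--path", "video.mp4",
    ])
    print(args)
```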

Convert model

$ python3 trace_model.py --cfg_path config/nanodet-m.yml --model_path saved/models/nanodet_m.ckpt --input_shape 320,320
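
Converting a model to TorchScript by tracing means running it once on a dummy input of the given shape and recording the operations. The sketch below shows the general idea with `torch.jit.trace`; it is not the code in `trace_model.py`. It assumes `--input_shape` is given as height,width and that the model takes a single 3-channel image batch. The function names are illustrative.

```python
def parse_input_shape(text, channels=3, batch=1):
    """Turn 'H,W' (e.g. '320,320') into an NCHW shape tuple."""
    height, width = (int(v) for v in text.split(","))
    if height <= 0 or width <= 0:
        raise ValueError("input shape must be positive: " + text)
    return (batch, channels, height, width)


def trace_to_file(model, input_shape, out_path):
    """Trace `model` with a dummy input of `input_shape` and save the result."""
    import torch  # imported lazily so the shape helper works without PyTorch

    model.eval()
    dummy = torch.zeros(input_shape)
    with torch.no_grad():
        traced = torch.jit.trace(model, dummy)
    traced.save(out_path)
    return traced


if __name__ == "__main__":
    print(parse_input_shape("320,320"))
```

A saved file can later be loaded with `torch.jit.load` without the original model class being importable.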

Training

$ python3 train.py config/nanodet_custom_xml_dataset.yml

TO DO

Implement unit tests using Test-Driven Development (TDD)

Credit to

Share PyTorch binaries built for Raspberry Pi

Reference

NanoDet: Super fast and lightweight anchor-free object detection model
Yunjey Choi: PyTorch Tutorial for Deep Learning Researchers
Victor Huang: PyTorch Template Project

Owner

eMHa