Make OpenCV camera loops less of a chore by skipping the boilerplate and getting right to the interesting stuff

camloop

Forget the boilerplate from OpenCV camera loops and get to coding the interesting stuff

Usage

This is a simple project that cuts down the boilerplate you write when prototyping computer vision applications. Stop worrying about opening/closing video captures, handling key presses, etc., and just focus on doing the cool stuff!

The project was developed in Python 3.8 and tested with physical local webcams. If you end up using it in any other context, please consider letting me know if it worked or not for whatever use case you had :)

Install

The project is distributed on PyPI, so just:

$ pip install pycamloop

As usual, conda or venv are recommended to manage your local environments.

Quickstart

To run a webcam loop and process each frame, define a function that takes the frame as returned by cv2.VideoCapture's read() method (i.e., a np.ndarray) and wrap it with the @camloop decorator. The function must take the frame as its first argument and return it so the loop can display it:

import cv2
from camloop import camloop

@camloop()
def grayscale_example(frame):
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return frame

# calling the function will start the loop and show the results with the cv2.imshow method
grayscale_example()

The window can be closed at any time by pressing "q" on the keyboard. You can also take a screenshot at any time by pressing the "s" key. By default, screenshots are saved in the current directory (see "Configuring the loop" for how to customize this and other options).
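As a rough illustration of the key handling described above, the sketch below maps a raw key code (such as the integer returned by cv2.waitKey) to an action. The function name classify_key and the action strings are invented for this example and are not part of camloop's API.

```python
def classify_key(raw_code, exit_key="q", screenshot_key="s"):
    """Map a raw key code to "exit", "screenshot", or None.

    raw_code is expected to look like the return value of cv2.waitKey():
    -1 when no key was pressed, otherwise an integer key code.
    """
    if raw_code < 0:
        return None
    # Keep only the low byte, a common idiom for portable key comparisons.
    code = raw_code & 0xFF
    if code == ord(exit_key):
        return "exit"
    if code == ord(screenshot_key):
        return "screenshot"
    return None
```

A loop built this way would break out on "exit" and write the current frame to disk on "screenshot".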

More advanced use cases

Now, let's say that instead of just converting the frame to grayscale and visualizing it, you want to pass some other arguments, perform more complex operations, and/or persist information every loop. All of this can be done inside the function wrapped by the camloop decorator, and external dependencies can be passed as arguments to your function. For example, let's say we want to run a face detector and save the results to a file called "face-detection-results.txt":

import datetime

import cv2
from camloop import camloop

# for simplicity, we use cv2's own haar face detector
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

@camloop()
def face_detection_example(frame, face_cascade, results_fp=None):
    grayscale_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(grayscale_frame, 1.2, 5)
    for bbox in faces:
        x1, y1 = bbox[:2]
        x2 = x1 + bbox[2]
        y2 = y1 + bbox[3]
        cv2.rectangle(frame, (x1, y1), (x2, y2), (180, 0, 180), 5)

    # append one line per frame to the results file, if a path was given
    if results_fp is not None:
        with open(results_fp, 'a+') as f:
            f.write(f"{datetime.datetime.now().isoformat()} - {len(faces)} face(s) found: {faces}\n")
    return frame

face_detection_example(face_cascade, results_fp="face-detection-results.txt")

Camloop can handle any positional and keyword arguments you define in your function, as long as the frame is the first one. When calling the wrapped function, pass only the extra arguments; the frame is handled implicitly.
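To see why the frame never appears in the call, here is a minimal sketch of a decorator factory that injects each frame as the first positional argument and forwards everything else. It is not camloop's actual implementation: frame_loop and the list of fake frames are hypothetical stand-ins for the real capture loop.

```python
import functools


def frame_loop(frames):
    """Decorator factory: call the wrapped function once per frame.

    `frames` is any iterable of frames; a real loop would read them
    from a camera instead (assumption made for illustration).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            results = []
            for frame in frames:
                # The frame is injected first; caller arguments follow.
                results.append(func(frame, *args, **kwargs))
            return results
        return wrapper
    return decorator


@frame_loop(frames=["frame-0", "frame-1"])
def tag(frame, prefix, suffix=""):
    return f"{prefix}{frame}{suffix}"


# The caller passes only the extra arguments, never the frame.
tagged = tag("<", suffix=">")
```

Because the wrapper forwards *args and **kwargs untouched, the wrapped function keeps whatever signature you give it after the frame parameter.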

Configuring the loop

Since most of the boilerplate is now hidden, camloop exposes a configuration object that lets you modify several aspects of its behavior. The options are:

| parameter | type | default | description |
|---|---|---|---|
| source | int | 0 | Index of the camera to use as source for the loop (passed to cv2.VideoCapture()) |
| mirror | bool | False | Whether to flip the frames horizontally |
| resolution | tuple[int, int] | None | Desired resolution (H, W) of the frames. Passed to the cv2.VideoCapture.set method. Default values and acceptance of custom ones depend on the webcam. |
| output | string | '.' | Directory where artifacts are saved by default (e.g., captured screenshots) |
| sequence_format | string | None | Format for rendering the sequence of frames. Acceptable formats are "gif" or "mp4". If specified, a video/GIF will be saved to the output folder. |
| fps | float | None | FPS value used for rendering the sequence of frames. If unspecified, the program will try to estimate it from the length of the recording and the number of frames. |
| exit_key | string | 'q' | Keyboard key used to exit the loop |
| screenshot_key | string | 's' | Keyboard key used to capture a screenshot |
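
The fallback described for the fps option above amounts to dividing the number of captured frames by the length of the recording. A minimal sketch, assuming the loop tracks a frame count and start/end timestamps in seconds (estimate_fps is a hypothetical helper, not a camloop function):

```python
def estimate_fps(num_frames, start_time, end_time):
    """Estimate frames per second from a frame count and two timestamps."""
    elapsed = end_time - start_time
    if num_frames <= 0 or elapsed <= 0:
        raise ValueError("need at least one frame and a positive duration")
    return num_frames / elapsed


# 237 frames captured over 10 seconds
fps = estimate_fps(237, start_time=0.0, end_time=10.0)
```

An estimate like this depends on how long each frame takes to process, so setting fps explicitly is the safer choice when playback speed matters.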

If you want to use something other than the defaults, define a dictionary object with the desired configuration and pass it to the camloop decorator.

For example, here we want to mirror the frames horizontally, and save an MP4 video of the recording at 23.7 FPS to the test directory:

import cv2
from camloop import camloop

config = {
    'mirror': True,
    'output': "test/",
    'fps': 23.7,
    'sequence_format': "mp4",
}

@camloop(config)
def grayscale_example(frame):
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return frame

grayscale_example()
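
One way to think about the configuration dictionary is as a set of overrides applied on top of the documented defaults. The sketch below illustrates that idea. The DEFAULTS dict copies the default values listed above, while resolve_config and its rejection of unknown keys are assumptions made for illustration, not camloop's actual behavior.

```python
DEFAULTS = {
    "source": 0,
    "mirror": False,
    "resolution": None,
    "output": ".",
    "sequence_format": None,
    "fps": None,
    "exit_key": "q",
    "screenshot_key": "s",
}


def resolve_config(overrides=None):
    """Return a new dict: the defaults updated with the user's overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


resolved = resolve_config(
    {"mirror": True, "output": "test/", "fps": 23.7, "sequence_format": "mp4"}
)
```

Options you leave out, such as exit_key here, keep their default values.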

Demo

The repo includes a demonstration script that runs out of the box, so you can verify your camloop install and see its main features. There are a few different samples you can check out, including the grayscale and face detection examples shown in this README.

To run the demo, install camloop and clone the repo:

$ pip install pycamloop
$ git clone https://github.com/glefundes/pycamloop.git
$ cd pycamloop/

Then run it by specifying which demo you want and passing any of the optional arguments (run python3 demo.py -h for more info on them). In this case, we're mirroring the frames from the "face detection" demo and saving a video of the recording in the "demo-videos" directory:

$ mkdir demo-videos
$ python3 demo.py face-detection --mirror --save-sequence mp4 -o demo-videos/
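
For readers curious how command-line flags like these could translate into a camloop configuration, here is a hypothetical sketch using argparse. Only the flags shown in the command above are modeled; the real demo.py may parse its arguments differently, and build_parser and args_to_config are names invented for this example.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the example command."""
    parser = argparse.ArgumentParser(description="camloop demo (sketch)")
    parser.add_argument("demo", help="name of the demo to run, e.g. face-detection")
    parser.add_argument("--mirror", action="store_true", help="flip frames horizontally")
    parser.add_argument("--save-sequence", choices=["gif", "mp4"], default=None)
    parser.add_argument("-o", dest="output", default=".", help="output directory")
    return parser


def args_to_config(args):
    """Translate parsed arguments into a camloop-style config dict."""
    return {
        "mirror": args.mirror,
        "sequence_format": args.save_sequence,
        "output": args.output,
    }


args = build_parser().parse_args(
    ["face-detection", "--mirror", "--save-sequence", "mp4", "-o", "demo-videos/"]
)
demo_config = args_to_config(args)
```

The resulting dictionary uses the same keys as the configuration options documented in "Configuring the loop", so it could be passed straight to the decorator.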

About The Project

I work as a computer vision engineer and often find myself having to prototype or debug projects locally using my own webcam as a source. This, of course, means I frequently have to write the same boilerplate OpenCV camera loop in multiple places. Eventually I got tired of copy-pasting the same 20 lines from file to file and decided to write a 100-ish-line package to make my work a little more efficient and less boring, and my code less bloated overall. That's pretty much it. Also, it was a nice chance to practice playing with decorators.

TODO

  • Verify functionality with other types of video sources (video files, streams, etc.)

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Gabriel Lefundes Vieira - [email protected]

Owner
Gabriel Lefundes
Data Scientist, Computer Vision Engineer @ Amigo Edu.