Form Segmentation

Let's explore how we can extract text from any form or scanned page.

Objectives

The goal is to find an algorithm that extracts as much information as possible from a given page (JPG format), so that we can feed it to another system (business logic, a neural network, a classifier, etc.). The overall process may not be perfect, but it would be great if it can find enough information to identify the type of document and the identities involved.

Parse any form or scanned page and extract any text data (printed and handwritten), with no prior knowledge of the layout or structure of the document.
Automatic extraction process (no human interaction, so it can scale out).
Reasonably fast (or able to speed up the task with more machines or CPUs).
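Since each page can be processed independently, the scale-out objective can be sketched with a worker pool from the standard library. This is only a minimal sketch: `extract_page` is a hypothetical placeholder for the real per-page pipeline, and `extract_all` and its parameters are names invented for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def extract_page(path: str) -> dict:
    """Hypothetical per-page worker.

    A real worker would load the JPEG at `path`, run the segmentation
    and recognition steps, and return the extracted fields.
    """
    return {"path": path, "text": ""}


def extract_all(
    paths: Iterable[str],
    worker: Callable[[str], dict] = extract_page,
    use_processes: bool = True,
    max_workers: Optional[int] = None,
) -> List[dict]:
    """Run `worker` on every page and return results in input order.

    Processes suit CPU-bound workers; threads are enough when the worker
    mostly waits on I/O or calls native code that releases the GIL.
    """
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=max_workers) as pool:
        return list(pool.map(worker, paths))
```

Calling `extract_all(["page_001.jpg", "page_002.jpg"])` would process both (hypothetical) files in parallel. When using processes, the call should be made from a script's `if __name__ == "__main__":` block so that worker processes can import the module safely. Spreading work across several machines would need a job queue, which is outside this sketch.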

Challenges

There are many challenges to overcome, but the main problem is to identify which parts of the form contain text.
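One simple way to get a first guess at where text sits is a horizontal projection profile: count the ink pixels in each row of a binarized page and group consecutive non-empty rows into bands. The sketch below is an illustration of that idea in plain Python, not the method used by this project. It assumes the page is already deskewed and binarized (1 = ink, 0 = background), and the function name and `min_ink` parameter are made up for the example.

```python
from typing import List, Sequence, Tuple


def find_text_bands(
    binary: Sequence[Sequence[int]], min_ink: int = 1
) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, inclusive, that contain ink.

    `binary` is a row-major grid where 1 marks an ink pixel.
    A row counts as text when it holds at least `min_ink` ink pixels.
    """
    bands: List[Tuple[int, int]] = []
    start = None  # first row of the band currently being tracked
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(binary) - 1))
    return bands


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
print(find_text_bands(page))  # [(1, 2), (4, 4)]
```

On real forms, ruled lines and box borders also produce ink rows, so they would likely need to be removed or filtered before a profile like this is useful. Applying the same idea column-wise inside each band gives rough word or field boundaries.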

Some other challenges:

Black border removal
ICR (Intelligent Character Recognition): recognize and convert handwritten characters into text
Scanned pages (detect edges and apply a perspective transform to obtain a top-down view of the document)
Noise removal (blur, Otsu thresholding, adaptive thresholding with OpenCV)
Shape detection and extraction
OCR (not a real issue, since Tesseract 4 works well for printed text)
Handwriting recognition
Minimizing errors
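To make the thresholding step in the list above more concrete, here is a small pure-Python sketch of Otsu's method, which picks the gray level that best separates dark ink from light paper by maximizing the between-class variance. It is meant to show the idea only. In practice OpenCV provides this through `cv2.threshold` with the `cv2.THRESH_OTSU` flag, usually after a blur. The function names and the tiny sample image are assumptions made for the example.

```python
from typing import List, Sequence


def otsu_threshold(gray: Sequence[Sequence[int]]) -> int:
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]            # pixels with value <= t
        sum_dark += t * hist[t]
        w_light = total - w_dark     # pixels with value > t
        if w_dark == 0:
            continue                 # no dark class yet
        if w_light == 0:
            break                    # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray: Sequence[Sequence[int]], threshold: int) -> List[List[int]]:
    """Mark dark pixels (value <= threshold) as ink (1), the rest as 0."""
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


sample = [
    [10, 12, 200],
    [11, 210, 205],
]
t = otsu_threshold(sample)
print(binarize(sample, t))  # [[1, 1, 0], [1, 0, 0]]
```

A single global threshold like this tends to struggle with uneven lighting or shadows on scanned pages, which is where adaptive thresholding is usually the better fit.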

Owner

Philip Doxakis
