OCR system for Arabic language that converts images of typed text to machine-encoded text.

Last update: Jan 05, 2023

Overview

Arabic OCR

An OCR system for the Arabic language that converts images of typed text to machine-encoded text.
The system currently supports letters only (29 letters): ا-ى and the ligature لا.
It targets a simplified OCR problem: images that contain only Arabic characters (see the dataset link below for sample images).

Setup

Install Python, then run this command:

pip install -r requirements.txt

Run

Put the images in the src/test directory.
Go to the src directory and run the following command:
```
python OCR.py
```
An output folder will be created containing:
- a text folder with one text file per input image.
- a running_time file with the time taken to process each image.
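
The output layout described above can be sketched as a small driver loop. This is only an illustration, not the project's OCR.py: `recognize` is a hypothetical placeholder for the real recognition step, and the exact names `output`, `text`, and `running_time.txt`, as well as the accepted image extensions, are assumptions.

```python
import time
from pathlib import Path


def recognize(image_path):
    """Hypothetical placeholder for the real OCR step; returns the recognized text."""
    return ""


def run(test_dir="test", out_dir="output"):
    """Process every image in test_dir and write text files plus a timing file."""
    test_dir, out_dir = Path(test_dir), Path(out_dir)
    text_dir = out_dir / "text"
    text_dir.mkdir(parents=True, exist_ok=True)

    timings = []
    for image_path in sorted(test_dir.iterdir()):
        # Assumed set of image extensions; other files are skipped.
        if image_path.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
            continue
        start = time.perf_counter()
        text = recognize(image_path)
        elapsed = time.perf_counter() - start

        # One text file per image, named after the image.
        (text_dir / f"{image_path.stem}.txt").write_text(text, encoding="utf-8")
        timings.append(f"{image_path.name}\t{elapsed:.3f}")

    # One line per image: file name and seconds taken.
    (out_dir / "running_time.txt").write_text("\n".join(timings), encoding="utf-8")
    return len(timings)
```

Writing the text files as UTF-8 matters here, since the recognized text is Arabic.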

Pipeline

Dataset

Link to the dataset of images and the corresponding text: here.
We used 1000 images to generate the character dataset used for training.

Examples

Line Segmentation
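
This README does not describe how lines are found, so the following is only a sketch of one common approach, a horizontal projection profile, and not necessarily what this project does. It assumes a binarized image given as rows of 0/1 values where 1 means ink; `segment_lines` is a hypothetical helper name.

```python
def segment_lines(image):
    """Split a binary image (rows of 0/1, 1 = ink) into (start, end) row ranges.

    `end` is exclusive. A line is a maximal run of rows that contain ink.
    """
    lines = []
    start = None
    for y, row in enumerate(image):
        has_ink = sum(row) > 0  # horizontal projection of this row
        if has_ink and start is None:
            start = y  # first inked row of a new line
        elif not has_ink and start is not None:
            lines.append((start, y))  # blank row closes the current line
            start = None
    if start is not None:
        lines.append((start, len(image)))  # line runs to the bottom edge
    return lines


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(segment_lines(page))  # [(1, 3), (4, 5)]
```

The same idea applied to column sums inside a single line gives candidate gaps between words.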

Word Segmentation

Character Segmentation

Performance

Average accuracy: 95%.
Average time per image: 16 seconds.
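
The README does not state how accuracy was measured. As an assumption, one common way to score OCR output is character-level accuracy based on edit distance between the predicted and the expected text; the sketch below shows that metric, with `edit_distance` and `accuracy` as hypothetical helper names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def accuracy(predicted, expected):
    """Character accuracy in [0, 1], relative to the longer of the two strings."""
    longest = max(len(predicted), len(expected))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(predicted, expected) / longest
```

Averaging `accuracy` over all test images would give a single figure comparable to the one reported above, under the stated assumption.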

NOTE

We achieved these results using only the flattened character image as the feature.
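
As a minimal sketch of what a flattened-image feature looks like: the 2D pixel grid of a character is read row by row into a single vector. The 3x3 image and the `flatten` helper are illustrative only; the sketch assumes every character image has already been resized to the same dimensions so that all vectors have equal length.

```python
def flatten(image):
    """Turn a 2D character image (list of rows) into a 1D feature vector, row by row."""
    return [pixel for row in image for pixel in row]


char_image = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
features = flatten(char_image)
print(features)  # [0, 1, 0, 1, 1, 1, 0, 1, 0]
```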

Owner

Hussein Youssef
