Fine tuning keras-ocr python package with custom synthetic dataset from scratch

Overview

OCR-Pipeline-with-Keras

The keras-ocr package generally consists of two parts: a Detector and a Recognizer:

  • Detector is responsible for creating bounding boxes for the words of the text.
  • Recognizer is responsible for processing batch of cropped parts of the initial image.

Keras-ocr connects this two parts into seamless pipeline. "Out of the box", it can handle a wide range of images with texts. But in a specific task, when the field of possible images with texts is greatly narrowed, it shows itself badly in the Recognizer part of the task.

In this regard, the task of fine-tuning Recognizer on a custom dataset was set.


Virtual environment and packages

$ python3 -m venv keras_ocr
$ pip install keras-ocr

And TRDG library for synthetic text generation.

$ pip install trdg

Synthetic data generation

We will use the TRDG library to generate synthetic text. All necessary code presented in the data_generation.py. Things you need to know:

  • You choose template for generating text, e.g. if template is "({}{}/{})", then all brackets will be randomly filled with symbols from alphabet. You need to specify your own instance of StringTemplate classs.

  • You choose the alphabet. In our example case it contains only digits. P.S. Some of the repeated in data_generation.py, hence emperical distribution probability for each symbol defined as fraction of n_repeats to alphabet_size.

  • You can choose your own fonts. To do this, follow instruction:

    1. Download needed fonts as .ttf files
    2. Go to trdg fonts directory ./keras_ocr/lib/python3.8/site-packages/trdg/fonts/
    3. Create directory $ mkdir cs (cs means custom fonts), you can chooce the disered name
    4. Place fonts files in this dir
    5. (For Mac users only) Don't forget to remove .DS_Store from this folder
  • You can chooce image background for text. When creating instance of GeneratorFromStrings in function generate_data_units(...), provide folder with images with arg image_dir

High-level API in the data_generation.py
data_generator = DataGenerator(string_templates=[StringTemplate('{}{}{}{}{}{}{}', 7)])

data_generator.generate(n_patches=20000, n_total_samples=550, path='DigitsBracketsDataset/train')
  • n_patches -- number of different strings from provided template
  • n_total_samples -- number of total samples from patches
  • path -- dir to save samples

Fine tuning Recognizer

Follow instruction in fine_tuning.ipynb. Don't forget to add function get_custom_dataset(...) to datasets.py in keras-ocr package directory (./keras_ocr/lib/python3.8/site-packages/keras_ocr/datasets.py):

def get_custom_dataset(path: str, split: str):
    """
    param: path: path to dataset root dir (include train/test dirs)
    Returns:
        A recognition dataset as a list of (filepath, box, word) tuples
    """
    data = []
    if split == 'train':
        train_dir = os.path.join(path, 'train')
        data.extend(
            _read_born_digital_labels_file(
                labels_filepath=os.path.join(train_dir, "gt.txt"),
                image_folder=train_dir,
            )
        )
    elif split == 'test':
        test_dir = os.path.join(path, 'test')
        data.extend(
            _read_born_digital_labels_file(
                labels_filepath=os.path.join(test_dir, 'gt.txt'), 
                image_folder=test_dir
            )
        )
    return data 
Owner
Eugene
Eugene
Kornia is a open source differentiable computer vision library for PyTorch.

Open Source Differentiable Computer Vision Library

kornia 7.6k Jan 06, 2023
Links to awesome OCR projects

Awesome OCR This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR). Contribution

Konstantin Baierer 2.2k Jan 02, 2023
This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network Introduction This is a tensorflow re-implementation of PSENet: Shape Robu

Michael liu 498 Dec 30, 2022
Binarize document images

Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to

QURATOR-SPK 48 Jan 02, 2023
Source code of RRPN ---- Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Paper source Arbitrary-Oriented Scene Text Detection via Rotation Proposals https://arxiv.org/abs/1703.01086 News We update RRPN in pytorch 1.0! View

428 Nov 22, 2022
pyntcloud is a Python library for working with 3D point clouds.

pyntcloud is a Python library for working with 3D point clouds.

David de la Iglesia Castro 1.2k Jan 07, 2023
Solution for Problem 1 by team codesquad for AIDL 2020. Uses ML Kit for OCR and OpenCV for image processing

CodeSquad PS1 Solution for Problem Statement 1 for AIDL 2020 conducted by @unifynd technologies. Problem Given images of bills/invoices, the task was

Burhanuddin Udaipurwala 111 Nov 27, 2022
Scan the MRZ code of a passport and extract the firstname, lastname, passport number, nationality, date of birth, expiration date and personal numer.

PassportScanner Works with 2 and 3 line identity documents. What is this With PassportScanner you can use your camera to scan the MRZ code of a passpo

Edwin Vermeer 441 Dec 24, 2022
This pyhton script converts a pdf to Image then using tesseract as OCR engine converts Image to Text

Script_Convertir_PDF_IMG_TXT Este script de pyhton convierte un pdf en Imagen luego utilizando tesseract como motor OCR convierte la Imagen a Texto. p

alebogado 1 Jan 27, 2022
A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well

ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy

Hiromu OCHIAI 541 Dec 28, 2022
Image Smoothing and Blurring Using OpenCV

Image-Smoothing-and-Blurring-Using-OpenCV This repository contains codes for performing image smoothing and blurring using OpenCV. There are different

Happy N. Monday 3 Feb 15, 2022
Dirty, ugly, and hopefully useful OCR of Facebook Papers docs released by Gizmodo

Quick and Dirty OCR of Facebook Papers Gizmodo has been working through the Facebook Papers and releasing the docs that they process and review. As lu

Bill Fitzgerald 2 Oct 28, 2021
A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports

A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports "with"-syntax.

Patrice Matz 0 Oct 30, 2021
Erosion and dialation using structure element in OpenCV python

Erosion and dialation using structure element in OpenCV python

Tamzid hasan 2 Nov 11, 2021
OpenGait is a flexible and extensible gait recognition project

A flexible and extensible framework for gait recognition. You can focus on designing your own models and comparing with state-of-the-arts easily with the help of OpenGait.

Shiqi Yu 335 Dec 22, 2022
Code for CVPR 2022 paper "Bailando: 3D dance generation via Actor-Critic GPT with Choreographic Memory"

Bailando Code for CVPR 2022 (oral) paper "Bailando: 3D dance generation via Actor-Critic GPT with Choreographic Memory" [Paper] | [Project Page] | [Vi

Li Siyao 237 Dec 29, 2022
This repository contains the code for the paper "SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks"

SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks (CVPR 2021 Oral) This repository contains the official PyTorch implementation

Shunsuke Saito 235 Dec 18, 2022
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp

Jacobo José Guijarro Villalba 75 Oct 21, 2022
OCR, Object Detection, Number Plate, Real Time

README.md PrePareded anaconda env requirements.txt clova AI → deep text recognition → trained weights (ex, .pth) wpod-net weights (ex, .h5 , .json) ht

Kaven Lee 7 Dec 06, 2022
Histogram specification using openCV in python .

histogram specification using openCV in python . Have to input miu and sigma to draw gausssian distribution which will be used to map the input image . Example input can be miu = 128 sigma = 30

Tamzid hasan 6 Nov 17, 2021