The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.

Last update: Dec 26, 2022

Related tags

Text Data & NLP DocTr

Overview

Good news! Our new work exhibits state-of-the-art performances on DocUNet benchmark dataset: DocScanner: Robust Document Image Rectification with Progressive Learning

DocTr

DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction
ACM MM 2021 Oral

Any questions or discussions are welcomed!

Training

For geometric unwarping, we train the GeoTr network using the Doc3d dataset.
For illumination correction, we train the IllTr network based on the DRIC dataset.

Inference

Download the pretrained models here and put them to $ROOT/model_pretrained/.
Geometric unwarping:
```
python inference.py
```
Geometric unwarping and illumination rectification:
```
python inference.py --ill_rec True
```

Evaluation

We use the same evaluation code as DocUNet benchmark dataset based on Matlab 2019a.
Please compare the scores according to your Matlab version.
Use the images available here for reproducing the quantitative performance reported in the paper and further comparison.

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{feng2021doctr,
  title={DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction},
  author={Feng, Hao and Wang, Yuechen and Zhou, Wengang and Deng, Jiajun and Li, Houqiang},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={273--281},
  year={2021}
}

@article{feng2021docscanner,
  title={DocScanner: Robust Document Image Rectification with Progressive Learning},
  author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Tian, Qi and Li, Houqiang},
  journal={arXiv preprint arXiv:2110.14968},
  year={2021}
}

The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.

Related tags

Overview

DocTr

Training

Inference

Evaluation

Citation

Owner

Hao Feng

Implementaion of our ACL 2022 paper Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

TTS is a library for advanced Text-to-Speech generation.

A calibre plugin that generates Word Wise and X-Ray files then sends them to Kindle. Supports KFX, AZW3 and MOBI eBooks. X-Ray supports 18 languages.

Code for "Generating Disentangled Arguments with Prompts: a Simple Event Extraction Framework that Works"

2021 2학기 데이터크롤링 기말프로젝트

SDL: Synthetic Document Layout dataset

Phrase-Based & Neural Unsupervised Machine Translation

Semi-automated vocabulary generation from semantic vector models

💛 Code and Dataset for our EMNLP 2021 paper: "Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes"

초성 해석기 based on ko-BART

An algorithm that can solve the word puzzle Wordle with an optimal number of guesses on HARD mode.

The code from the whylogs workshop in DataTalks.Club on 29 March 2022

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Club chatbot

Facilitating the design, comparison and sharing of deep text matching models.

BookNLP, a natural language processing pipeline for books

Python powered crossword generator with database with 20k+ polish words

Rhyme with AI

Pre-Training with Whole Word Masking for Chinese BERT

Natural language computational chemistry command line interface.