Code for Text Prior Guided Scene Text Image Super-Resolution

Last update: Dec 26, 2022

Overview

Text Prior Guided Scene Text Image Super-Resolution

https://arxiv.org/abs/2106.15368

Jianqi Ma, Shi Guo, Lei Zhang
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China

Recovering TextZoom samples

Environment:

Other Python packages that may be needed: pyyaml, cv2 (OpenCV), Pillow and imgaug.

Main idea

Single stage with loss

Multi-stage version

Configure your training

Download the pretrained recognizer from:

Aster: https://github.com/ayumiymk/aster.pytorch  
MORAN: https://github.com/Canjie-Luo/MORAN_v2  
CRNN: https://github.com/meijieru/crnn.pytorch

Unzip the code and change into '$TPGSR_ROOT$/', then place the pretrained recognizer weights in '$TPGSR_ROOT$/'.
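Before training, it can help to confirm that the recognizer weights really are in '$TPGSR_ROOT$/'. The following is a minimal sketch, not part of the released code: the helper `find_missing_weights` and the checkpoint file names are hypothetical placeholders, and the `TPGSR_ROOT` environment variable is an assumption made for illustration.

```python
import os
from pathlib import Path


def find_missing_weights(root, filenames):
    """Return the names in `filenames` that are not regular files directly under `root`."""
    root = Path(root)
    return [name for name in filenames if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical checkpoint names: replace them with the files you downloaded.
    expected = ["aster.pth.tar", "moran.pth", "crnn.pth"]
    # Assumption: TPGSR_ROOT is exported in the shell; fall back to the current directory.
    root = os.environ.get("TPGSR_ROOT", ".")
    missing = find_missing_weights(root, expected)
    if missing:
        print("Missing recognizer weights in", root, ":", ", ".join(missing))
    else:
        print("All expected recognizer weights found in", root)
```

Only the recognizer you actually train with needs to be present, so trim the list accordingly.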

Download the TextZoom dataset:

https://github.com/JasonBoy1/TextZoom

Train the corresponding model (e.g. TPGSR-TSRN):

chmod a+x train_TPGSR-TSRN.sh
./train_TPGSR-TSRN.sh
or
# --arch         The architecture
# --batch_size   The batch size
# --STN          Use the STN net for alignment
# --mask         Use the contour mask
# --use_distill  Use the TP loss
# --gradient     Use the Gradient Prior Loss
# --sr_share     Share weights for the SR module
# --stu_iter     The number of iterations in the multi-stage version
# --vis_dir      The checkpoint directory
python3 main.py --arch="tsrn_tl_cascade" \
                --batch_size=48 \
                --STN \
                --mask \
                --use_distill \
                --gradient \
                --sr_share \
                --stu_iter=1 \
                --vis_dir='vis_TPGSR-TSRN'
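The same invocation can also be assembled programmatically, which is convenient when sweeping over settings such as `--stu_iter`. This is a minimal sketch under stated assumptions: `build_train_command` and `TPGSR_TSRN_OPTIONS` are hypothetical names introduced here, only the flags listed in the command above are used, and the sketch prints the command rather than running it.

```python
import shlex


def build_train_command(options, script="main.py", python="python3"):
    """Turn an options dict into an argument list for main.py.

    True  -> bare switch (e.g. --STN)
    False / None -> flag omitted
    other -> --name=value
    """
    cmd = [python, script]
    for name, value in options.items():
        if value is True:
            cmd.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            cmd.append(f"--{name}={value}")
    return cmd


# Settings copied from the TPGSR-TSRN example command.
TPGSR_TSRN_OPTIONS = {
    "arch": "tsrn_tl_cascade",
    "batch_size": 48,
    "STN": True,
    "mask": True,
    "use_distill": True,
    "gradient": True,
    "sr_share": True,
    "stu_iter": 1,
    "vis_dir": "vis_TPGSR-TSRN",
}

if __name__ == "__main__":
    # Print a shell-quoted command line; pass the list to subprocess.run to launch it.
    print(shlex.join(build_train_command(TPGSR_TSRN_OPTIONS)))
```

Building the command as a list avoids shell line-continuation problems, since no backslashes or inline comments are involved.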

Run the test-prefixed shell script to test the corresponding model.

Add '--go_test' in the shell file.
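If the command is built as an argument list rather than edited by hand in the shell file, the test switch can be appended in one place. This is a minimal sketch that assumes '--go_test' is a plain switch added to the same main.py invocation used for training; the helper `with_go_test` is a hypothetical name introduced here.

```python
def with_go_test(command):
    """Return a copy of `command` with '--go_test' appended exactly once."""
    command = list(command)  # copy so the caller's list is left unchanged
    if "--go_test" not in command:
        command.append("--go_test")
    return command


if __name__ == "__main__":
    train_cmd = ["python3", "main.py", "--arch=tsrn_tl_cascade", "--STN", "--mask"]
    print(" ".join(with_go_test(train_cmd)))
```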

Cite this paper:

@article{ma2021text,
title={Text Prior Guided Scene Text Image Super-resolution},
author={Ma, Jianqi and Guo, Shi and Zhang, Lei},
journal={arXiv preprint arXiv:2106.15368},
year={2021}
}
