A Japanese Medical Information Extraction Toolkit

Last update: Dec 12, 2022


Overview

JaMIE: a Japanese Medical Information Extraction toolkit

Joint Japanese Medical Problem, Modality and Relation Recognition

The Train/Test phases require all train, dev, and test files to be converted to CONLL-style. Please check data_converter.py

Installation (python3.8)

git clone https://github.com/racerandom/JaMIE.git
cd JaMIE

Required Python packages

pip install -r requirements.txt

Morphological analyzer required:

jumanpp
mecab (juman-dict)
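
Before converting data, it can help to confirm that the analyzers are actually reachable from the shell. The following is a minimal sketch, assuming the executables are installed under the names `jumanpp` and `mecab` (adjust these if your installation differs). The helper `find_missing_tools` is hypothetical and not part of JaMIE.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; they are not taken from the JaMIE code base.
    missing = find_missing_tools(["jumanpp", "mecab"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All analyzers found.")
```

Note that this only checks that the binaries exist; it does not verify that mecab is configured with the juman dictionary.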

Pretrained BERT required:

NICT-BERT (NICT_BERT-base_JapaneseWikipedia_32K_BPE)

Train:

CUDA_VISIBLE_DEVICES=$SEED python clinical_joint.py \
--pretrained_model $PRETRAINED_BERT \
--train_file $TRAIN_FILE \
--dev_file $DEV_FILE \
--dev_output $DEV_OUT \
--saved_model $MODEL_DIR_TO_SAVE \
--enc_lr 2e-5 \
--batch_size 4 \
--warmup_epoch 2 \
--num_epoch 20 \
--do_train \
--fp16  # apex required
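
If you prefer to launch training from a Python script, the same call could be assembled as in the sketch below. It uses only the flags listed above and leaves out `--fp16`, since that flag needs apex. The helper names `build_train_command` and `run_training` are hypothetical and not part of JaMIE. `CUDA_VISIBLE_DEVICES` expects GPU indices (for example `"0"`), and the sketch passes it through the environment.

```python
import os
import subprocess


def build_train_command(pretrained_model, train_file, dev_file, dev_output, saved_model):
    """Assemble the clinical_joint.py training call from the flags shown above."""
    return [
        "python", "clinical_joint.py",
        "--pretrained_model", pretrained_model,
        "--train_file", train_file,
        "--dev_file", dev_file,
        "--dev_output", dev_output,
        "--saved_model", saved_model,
        "--enc_lr", "2e-5",
        "--batch_size", "4",
        "--warmup_epoch", "2",
        "--num_epoch", "20",
        "--do_train",
    ]


def run_training(command, gpu_ids="0"):
    """Run `command` with CUDA_VISIBLE_DEVICES restricted to `gpu_ids`."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    # check=True raises CalledProcessError if training exits with an error.
    return subprocess.run(command, env=env, check=True)
```

Calling `run_training(build_train_command(...))` from the JaMIE directory would then mirror the shell command, with your own paths filled in.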

The models trained on radiography interpretation reports of Lung Cancer (LC) and general medical reports of Idiopathic Pulmonary Fibrosis (IPF) will be made available: link1, link2.

Test:

CUDA_VISIBLE_DEVICES=$SEED python clinical_joint.py \
--saved_model $SAVED_MODEL \
--test_file $TEST_FILE \
--test_output $TEST_OUT \
--batch_size 4

Batch Converter from XML (or raw text) to CONLL for Train/Test

Convert XML files to CONLL files for Train/Test. You can also convert raw text to CONLL-style for Test.

# --cv_num 5: 5-fold cross-validation; 0 generates a single conll file
# --doc_level: generate document-level ([SEP] denotes sentence boundaries) or sentence-level conll files
# --segmenter mecab: please use mecab and NICT BERT currently
python data_converter.py \
--mode xml2conll \
--xml $XML_FILES_DIR \
--conll $OUTPUT_CONLL_DIR \
--cv_num 5 \
--doc_level \
--segmenter mecab \
--bert_dir $PRETRAINED_BERT
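
For readers unfamiliar with what a setting such as `--cv_num 5` implies, the idea of k-fold cross-validation can be sketched as follows. This is a generic illustration, not the splitting logic of data_converter.py. The function name, the round-robin assignment, and the file names are all assumptions.

```python
def k_fold_splits(items, k):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    if k < 2:
        raise ValueError("k must be at least 2")
    # Round-robin assignment of items to k folds.
    folds = [items[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [x for i, fold in enumerate(folds) if i != held_out for x in fold]
        yield train, test


if __name__ == "__main__":
    docs = [f"doc{i}.xml" for i in range(10)]  # hypothetical file names
    for n, (train, test) in enumerate(k_fold_splits(docs, 5)):
        print(f"fold {n}: {len(train)} train / {len(test)} test")
```

Each document is held out once, so every fold can be trained and evaluated separately with the Train and Test commands.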

Batch Converter from predicted CONLL to XML

python data_converter.py \
--mode conll2xml \
--xml $XML_FILES_DIR \
--conll $OUTPUT_CONLL_DIR

Citation

If you use our code in your research, please cite our work:

@inproceedings{cheng2021jamie,
   title={JaMIE: A Pipeline Japanese Medical Information Extraction System},
   author={Fei Cheng and Shuntaro Yada and Ribeka Tanaka and Eiji Aramaki and Sadao Kurohashi},
   booktitle={arXiv},
   year={2021}
}

