From f9a918192bad28fa53a20443b3f9fc568764e233 Mon Sep 17 00:00:00 2001
From: RangeKing <rangekinghz@gmail.com>
Date: Tue, 28 Dec 2021 09:58:11 +0800
Subject: [PATCH] Create kie_en.md

---
 ppstructure/docs/kie_en.md | 77 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 ppstructure/docs/kie_en.md
diff --git a/ppstructure/docs/kie_en.md b/ppstructure/docs/kie_en.md
new file mode 100644
index 00000000..571903ce
--- /dev/null
+++ b/ppstructure/docs/kie_en.md
@@ -0,0 +1,77 @@
+
+
+# Key Information Extraction (KIE)
+
+This section provides a tutorial on how to quickly use, train, and evaluate a key information extraction (KIE) model, [SDMGR](https://arxiv.org/abs/2103.14470), in PaddleOCR.
+
+[SDMGR (Spatial Dual-Modality Graph Reasoning)](https://arxiv.org/abs/2103.14470) is a KIE algorithm that classifies each detected text region into predefined categories, such as order ID, invoice number, and amount.
+
+
+* [1. Quick Use](#1-----)
+* [2. Model Training](#2-----)
+* [3. Model Evaluation](#3-----)
+
+<a name="1-----"></a>
+
+## 1. Quick Use
+
+This tutorial uses the [Wildreceipt dataset](https://paperswithcode.com/dataset/wildreceipt), which contains 1765 photos, 25 classes, and 50000 text boxes. Download and extract it with wget:
+
+```
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/wildreceipt.tar && tar xf wildreceipt.tar
+```
+
+Download the pretrained model and run prediction:
+
+```
+cd PaddleOCR/
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/kie_vgg16.tar && tar xf kie_vgg16.tar
+python3.7 tools/infer_kie.py -c configs/kie/kie_unet_sdmgr.yml -o Global.checkpoints=kie_vgg16/best_accuracy  Global.infer_img=../wildreceipt/1.txt
+```
+
+The prediction result is saved to the file `./output/sdmgr_kie/predicts_kie.txt`, and the visualization results are saved in the folder `./output/sdmgr_kie/kie_results/`.
+
+The visualization result is shown in the figure below:
+
+<div align="center">
+    <img src="./imgs/0.png" width="800">
+</div>
+
+<a name="2-----"></a>
+## 2. Model Training
+
+Create a symbolic link to the dataset inside the folder `PaddleOCR/train_data`:
+```
+cd PaddleOCR/ && mkdir train_data && cd train_data
+
+ln -s ../../wildreceipt ./
+```
+
+The configuration file used for training is `configs/kie/kie_unet_sdmgr.yml`, and its default training data path is `train_data/wildreceipt`. After preparing the data, start training with the following command:
+```
+python3.7 tools/train.py -c configs/kie/kie_unet_sdmgr.yml -o Global.save_model_dir=./output/kie/
+```
+<a name="3-----"></a>
+
+## 3. Model Evaluation
+
+After training, evaluate the model with the following command:
+
+```
+python3.7 tools/eval.py -c configs/kie/kie_unet_sdmgr.yml -o Global.checkpoints=./output/kie/best_accuracy
+```
+
+**Reference:**
+
+<!-- [ALGORITHM] -->
+
+```bibtex
+@misc{sun2021spatial,
+      title={Spatial Dual-Modality Graph Reasoning for Key Information Extraction},
+      author={Hongbin Sun and Zhanghui Kuang and Xiaoyu Yue and Chenhao Lin and Wayne Zhang},
+      year={2021},
+      eprint={2103.14470},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
+```
-- 
GitLab