From 2be35217d3723e0c5987f543a8734c597ce89dfb Mon Sep 17 00:00:00 2001 From: WenmuZhou <572459439@qq.com> Date: Wed, 17 Aug 2022 02:53:16 +0000 Subject: [PATCH] update doc --- doc/doc_ch/table_recognition.md | 41 ++++++++++++++++++++++++-- doc/doc_en/table_recognition_en.md | 47 +++++++++++++++++++++++++++--- ppstructure/table/README.md | 9 +++--- ppstructure/table/README_ch.md | 12 ++------ 4 files changed, 89 insertions(+), 20 deletions(-) diff --git a/doc/doc_ch/table_recognition.md b/doc/doc_ch/table_recognition.md index 558adf7b..aafdaf74 100644 --- a/doc/doc_ch/table_recognition.md +++ b/doc/doc_ch/table_recognition.md @@ -3,7 +3,7 @@ 本文提供了PaddleOCR表格识别模型的全流程指南,包括数据准备、模型训练、调优、评估、预测,各个阶段的详细说明: - [1. 数据准备](#1-数据准备) - - [1.1. 准备数据集](#11-数据集格式) + - [1.1. 数据集格式](#11-数据集格式) - [1.2. 数据下载](#12-数据下载) - [1.3. 数据集生成](#13-数据集生成) - [2. 开始训练](#2-开始训练) @@ -19,6 +19,8 @@ - [3.1. 指标评估](#31-指标评估) - [3.2. 测试表格结构识别效果](#32-测试表格结构识别效果) - [4. 模型导出与预测](#4-模型导出与预测) + - [4.1 模型导出](#41-模型导出) + - [4.2 模型预测](#42-模型预测) - [5. FAQ](#5-faq) # 1. 数据准备 @@ -33,7 +35,7 @@ img_label ``` 每一行的json格式为: -```json +```txt { 'filename': PMC5755158_010_01.png, # 图像名 'split': ’train‘, # 图像属于训练集还是验证集 @@ -236,6 +238,12 @@ DCU设备上运行需要设置环境变量 `export HIP_VISIBLE_DEVICES=0,1,2,3` python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/table/SLANet.yml -o Global.checkpoints={path/to/weights}/best_accuracy ``` +运行完成后,会输出模型的acc指标,如对英文表格识别模型进行评估,会见到如下输出。 +```bash +[2022/08/16 07:59:55] ppocr INFO: acc:0.7622245132160782 +[2022/08/16 07:59:55] ppocr INFO: fps:30.991640622573044 +``` + ## 3.2. 测试表格结构识别效果 使用 PaddleOCR 训练好的模型,可以通过以下脚本进行快速预测。 @@ -278,6 +286,8 @@ python3 tools/infer_table.py -c configs/table/SLANet.yml -o Global.pretrained_mo # 4. 
模型导出与预测 +## 4.1 模型导出 + inference 模型(`paddle.jit.save`保存的模型) 一般是模型训练,把模型结构和模型参数保存在文件中的固化模型,多用于预测部署场景。 训练过程中保存的模型是checkpoints模型,保存的只有模型的参数,多用于恢复训练等。 @@ -303,6 +313,33 @@ inference/SLANet/ └── inference.pdmodel # inference模型的program文件 ``` +## 4.2 模型预测 + +模型导出后,使用如下命令即可完成inference模型的预测 + +```python +python3.7 table/predict_structure.py \ + --table_model_dir={path/to/inference model} \ + --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \ + --image_dir=docs/table/table.jpg \ + --output=../output/table +``` + +预测图片: + +![](../../ppstructure/docs/table/table.jpg) + +得到输入图像的预测结果: + +``` +['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '
', '', ''],[[320.0562438964844, 197.83375549316406, 350.0928955078125, 214.4309539794922], ... , [318.959228515625, 271.0166931152344, 353.7394104003906, 286.4538269042969]] +``` + +单元格坐标可视化结果为 + +![](../../ppstructure/docs/imgs/slanet_result.jpg) + + # 5. FAQ Q1: 训练模型转inference 模型之后预测效果不一致? diff --git a/doc/doc_en/table_recognition_en.md b/doc/doc_en/table_recognition_en.md index f4c49cc5..5e03ce9b 100644 --- a/doc/doc_en/table_recognition_en.md +++ b/doc/doc_en/table_recognition_en.md @@ -3,9 +3,9 @@ This article provides a full-process guide for the PaddleOCR table recognition model, including data preparation, model training, tuning, evaluation, prediction, and detailed descriptions of each stage: - [1. Data Preparation](#1-data-preparation) - - [1.1. DataSet Preparation](#11-dataset-preparation) + - [1.1. DataSet Format](#11-dataset-format) - [1.2. Data Download](#12-data-download) - - [1.3. Dataset Generation](#13-dataset-format) + - [1.3. Dataset Generation](#13-dataset-generation) - [2. Training](#2-training) - [2.1. Start Training](#21-start-training) - [2.2. Resume Training](#22-resume-training) @@ -19,7 +19,9 @@ This article provides a full-process guide for the PaddleOCR table recognition m - [3.1. Evaluation](#31-evaluation) - [3.2. Test table structure recognition effect](#32-test-table-structure-recognition-effect) - [4. Model export and prediction](#4-model-export-and-prediction) - - [5. FAQ](#5-faq) + - [4.1 Model export](#41-model-export) + - [4.2 Prediction](#42-prediction) +- [5. FAQ](#5-faq) # 1. Data Preparation @@ -243,6 +245,13 @@ The model parameters during training are saved in the `Global.save_model_dir` di python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/table/SLANet.yml -o Global.checkpoints={path/to/weights}/best_accuracy ``` +After the operation is completed, the acc indicator of the model will be output. If you evaluate the English table recognition model, you will see the following output. 
+ +```bash +[2022/08/16 07:59:55] ppocr INFO: acc:0.7622245132160782 +[2022/08/16 07:59:55] ppocr INFO: fps:30.991640622573044 +``` + ## 3.2. Test table structure recognition effect Using the model trained by PaddleOCR, you can quickly get prediction through the following script. @@ -287,6 +296,8 @@ The cell coordinates are visualized as # 4. Model export and prediction +## 4.1 Model export + inference model (model saved by `paddle.jit.save`) Generally, it is model training, a solidified model that saves the model structure and model parameters in a file, and is mostly used to predict deployment scenarios. The model saved during the training process is the checkpoints model, and only the parameters of the model are saved, which are mostly used to resume training. @@ -313,7 +324,35 @@ inference/SLANet/ └── inference.pdmodel # The program file of model ``` -## 5. FAQ +## 4.2 Prediction + +After the model is exported, use the following command to complete the prediction of the inference model + +```python +python3.7 table/predict_structure.py \ + --table_model_dir={path/to/inference model} \ + --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \ + --image_dir=docs/table/table.jpg \ + --output=../output/table +``` + +Input image: + +![](../../ppstructure/docs/table/table.jpg) + +Get the prediction result of the input image: + +``` +['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '
', '', ''],[[320.0562438964844, 197.83375549316406, 350.0928955078125, 214.4309539794922], ... , [318.959228515625, 271.0166931152344, 353.7394104003906, 286.4538269042969]] +``` + +The cell coordinates are visualized as + +![](../../ppstructure/docs/imgs/slanet_result.jpg) + + + +# 5. FAQ Q1: After the training model is transferred to the inference model, the prediction effect is inconsistent? diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md index 67326848..05ea69e8 100644 --- a/ppstructure/table/README.md +++ b/ppstructure/table/README.md @@ -36,13 +36,12 @@ We evaluated the algorithm on the PubTabNet[1] eval dataset, and the | EDD[2] |x| 88.3 | | TableRec-RARE(ours) |73.8%| 93.32 | | SLANet(ours) | 76.2%| 94.98 |SLANet | + ## 3. Result -![图片](http://agroup.baidu-int.com/file/stream/bj/bj-e50a465becdbde9bffb84a84d41d196ac1acf1b6) -![图片](http://agroup.baidu-int.com/file/stream/bj/bj-17ea53b181408a35d977c6c26b1ea308b4c27a79) -![图片](http://agroup.baidu-int.com/file/stream/bj/bj-b905f57beca7115d54b907deac70c10056274858) -![图片](http://agroup.baidu-int.com/file/stream/bj/bj-894694c9558fe7deb8cc896f9411fdfd252bca72) -![图片](http://agroup.baidu-int.com/file/stream/bj/bj-03a0a67378b41a353257bd2fe8a1e9a864c89cb5) +![](../docs/imgs/table_ch_result1.jpg) +![](../docs/imgs/table_ch_result2.jpg) +![](../docs/imgs/table_ch_result3.jpg) ## 4. How to use diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md index f82aa786..4f475185 100644 --- a/ppstructure/table/README_ch.md +++ b/ppstructure/table/README_ch.md @@ -44,15 +44,9 @@ ## 3. 
效果演示
 
-![图片](http://agroup.baidu-int.com/file/stream/bj/bj-e50a465becdbde9bffb84a84d41d196ac1acf1b6)
-
-![图片](http://agroup.baidu-int.com/file/stream/bj/bj-17ea53b181408a35d977c6c26b1ea308b4c27a79)
-
-![图片](http://agroup.baidu-int.com/file/stream/bj/bj-b905f57beca7115d54b907deac70c10056274858)
-
-![图片](http://agroup.baidu-int.com/file/stream/bj/bj-894694c9558fe7deb8cc896f9411fdfd252bca72)
-
-![图片](http://agroup.baidu-int.com/file/stream/bj/bj-03a0a67378b41a353257bd2fe8a1e9a864c89cb5)
+![](../docs/imgs/table_ch_result1.jpg)
+![](../docs/imgs/table_ch_result2.jpg)
+![](../docs/imgs/table_ch_result3.jpg)
 
 ## 4. 使用
 
-- 
GitLab