diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
index fb62a53d141b03367e2a753c495741b44a7f1214..a8d10b79e507ab59ef2481982a33902e4a95e73e 100644
--- a/ppstructure/table/README.md
+++ b/ppstructure/table/README.md
@@ -15,9 +15,18 @@ The table recognition flow chart is as follows
 3. The recognition result of each cell is obtained by combining the coordinates and recognition results of the single text lines with the coordinates of the cells.
 4. The cell recognition results and the table structure together construct the HTML string of the table.
 
-## 2. How to use
+## 2. Performance
+We evaluated the algorithm on the PubTabNet[1] evaluation dataset, and the performance is as follows:
 
-### 2.1 quick start
+
+|Method|[TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src)|
+| --- | --- |
+| EDD[2] | 88.3 |
+| Ours | 93.32 |
+
+## 3. How to use
+
+### 3.1 Quick start
 
 ```bash
 cd PaddleOCR/ppstructure
@@ -38,7 +47,7 @@ Note: The above model is trained on the PubLayNet dataset and only supports Engl
 After running, the Excel sheet of each image will be saved in the directory specified by the `output` field
 
-### 2.2 Train
+### 3.2 Train
 
 In this chapter, we only introduce the training of the table structure model. For model training of [text detection](../../doc/doc_en/detection_en.md) and [text recognition](../../doc/doc_en/recognition_en.md), please refer to the corresponding documents.
 
@@ -68,9 +77,9 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
 **Note**: The priority of `Global.checkpoints` is higher than that of `Global.pretrain_weights`, that is, when the two parameters are specified at the same time, the model specified by `Global.checkpoints` will be loaded first. If the model path specified by `Global.checkpoints` is wrong, the one specified by `Global.pretrain_weights` will be loaded.
 
-### 2.3 Eval
+### 3.3 Eval
 
-The table uses [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src)) as the evaluation metric of the model. Before the model evaluation, the three models in the pipeline need to be exported as inference models (we have provided them), and the gt for evaluation needs to be prepared. Examples of gt are as follows:
+The table uses [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) as the evaluation metric of the model. Before the model evaluation, the three models in the pipeline need to be exported as inference models (we have provided them), and the gt for evaluation needs to be prepared. An example of the gt is as follows:
 ```json
 {"PMC4289340_004_00.png": [
     ["<html>", "<body>", "<table>", "<thead>", "<tr>", "<td>", "</td>", "<td>", "</td>", "<td>", "</td>", "</tr>", "</thead>", "<tbody>", "<tr>", "<td>", "</td>", "<td>", "</td>", "<td>", "</td>", "</tr>", "</tbody>", "</table>", "</body>", "</html>"],
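Steps 3 and 4 of the pipeline boil down to filling the recognized cell texts into the predicted sequence of structure tokens (the same token format as in the gt above). The following is a minimal Python sketch of that merge, not PaddleOCR's actual implementation; it assumes the single-line recognition results have already been assigned to cells by their coordinates, and `structure_tokens` / `cell_texts` are hypothetical inputs:

```python
def build_html(structure_tokens, cell_texts):
    """Fill recognized cell texts into a predicted structure-token
    sequence to form the table's HTML string (steps 3-4 above)."""
    html, cell_idx = [], 0
    for token in structure_tokens:
        html.append(token)
        if token == "<td>" and cell_idx < len(cell_texts):
            # a cell just opened: insert the text matched to it by coordinates
            html.append(cell_texts[cell_idx])
            cell_idx += 1
    return "".join(html)

print(build_html(
    ["<table>", "<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>", "</table>"],
    ["Feature", "Value"],
))
# -> <table><tr><td>Feature</td><td>Value</td></tr></table>
```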
", "", "", "
", "", "", "
", "", ""], @@ -91,13 +100,17 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di If the PubLatNet eval dataset is used, it will be output ```bash -teds: 94.85 +teds: 93.32 ``` -### 2.4 Inference +### 3.4 Inference ```python cd PaddleOCR/ppstructure python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output ../output/table ``` After running, the excel sheet of each picture will be saved in the directory specified by the output field + +Reference +1. https://github.com/ibm-aur-nlp/PubTabNet +2. https://arxiv.org/pdf/1911.10683 \ No newline at end of file diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md index 232b34efa0725b4b89e66ea6259d2e96b04d701f..2ded403c371984a447f94268d23ca1c6240cf432 100644 --- a/ppstructure/table/README_ch.md +++ b/ppstructure/table/README_ch.md @@ -17,9 +17,18 @@ 3. 由单行文字的坐标、识别结果和单元格的坐标一起组合出单元格的识别结果。 4. 单元格的识别结果和表格结构一起构造表格的html字符串。 -## 2. 使用 +## 2. 性能 +我们在 PubTabNet[1] 评估数据集上对算法进行了评估,性能如下 -### 2.1 快速开始 + +|算法|[TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src)| +| --- | --- | +| EDD[2] | 88.3 | +| Ours | 93.32 | + +## 3. 使用 + +### 3.1 快速开始 ```python cd PaddleOCR/ppstructure @@ -40,7 +49,7 @@ python3 table/predict_table.py --det_model_dir=inference/en_ppocr_mobile_v2.0_ta note: 上述模型是在 PubLayNet 数据集上训练的表格识别模型,仅支持英文扫描场景,如需识别其他场景需要自己训练模型后替换 `det_model_dir`,`rec_model_dir`,`table_model_dir`三个字段即可。 -### 2.2 训练 +### 3.2 训练 在这一章节中,我们仅介绍表格结构模型的训练,[文字检测](../../doc/doc_ch/detection.md)和[文字识别](../../doc/doc_ch/recognition.md)的模型训练请参考对应的文档。 #### 数据准备 @@ -67,9 +76,9 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo **注意**:`Global.checkpoints`的优先级高于`Global.pretrain_weights`的优先级,即同时指定两个参数时,优先加载`Global.checkpoints`指定的模型,如果`Global.checkpoints`指定的模型路径有误,会加载`Global.pretrain_weights`指定的模型。 -### 2.3 评估 +### 3.3 评估 -表格使用 [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src)) 作为模型的评估指标。在进行模型评估之前,需要将pipeline中的三个模型分别导出为inference模型(我们已经提供好),还需要准备评估的gt, gt示例如下: +表格使用 [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) 作为模型的评估指标。在进行模型评估之前,需要将pipeline中的三个模型分别导出为inference模型(我们已经提供好),还需要准备评估的gt, gt示例如下: ```json {"PMC4289340_004_00.png": [ ["", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "
", "", "", "
", "", "", "
", "", ""], @@ -89,13 +98,16 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di ``` 如使用PubLatNet评估数据集,将会输出 ```bash -teds: 94.85 +teds: 93.32 ``` -### 2.4 预测 +### 3.4 预测 ```python cd PaddleOCR/ppstructure python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output ../output/table ``` +Reference +1. https://github.com/ibm-aur-nlp/PubTabNet +2. https://arxiv.org/pdf/1911.10683 \ No newline at end of file