From ee7dfddb9d42781146364eae0d07802e81ee0396 Mon Sep 17 00:00:00 2001
From: WenmuZhou <572459439@qq.com>
Date: Tue, 16 Aug 2022 14:21:32 +0000
Subject: [PATCH] add table result

---
 ppstructure/table/README.md    | 28 ++++++++++++++++++----------
 ppstructure/table/README_ch.md | 29 +++++++++++++++++++----------
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
index 4204f1f2..67326848 100644
--- a/ppstructure/table/README.md
+++ b/ppstructure/table/README.md
@@ -4,11 +4,12 @@ English | [简体中文](README_ch.md)
 
 - [1. pipeline](#1-pipeline)
 - [2. Performance](#2-performance)
-- [3. How to use](#3-how-to-use)
-  - [3.1 quick start](#31-quick-start)
-  - [3.2 Train](#32-train)
-  - [3.3 Calculate TEDS](#33-calculate-teds)
-- [4. Reference](#4-reference)
+- [3. Result](#3-result)
+- [4. How to use](#4-how-to-use)
+  - [4.1 Quick start](#41-quick-start)
+  - [4.2 Train](#42-train)
+  - [4.3 Calculate TEDS](#43-calculate-teds)
+- [5. Reference](#5-reference)
 
 ## 1. pipeline
 
@@ -35,10 +36,17 @@ We evaluated the algorithm on the PubTabNet[1] eval dataset, and the
 | EDD[2] |x| 88.3 |
 | TableRec-RARE(ours) |73.8%| 93.32 |
 | SLANet(ours) | 76.2%| 94.98 |SLANet |
+## 3. Result
 
-## 3. How to use
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-e50a465becdbde9bffb84a84d41d196ac1acf1b6)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-17ea53b181408a35d977c6c26b1ea308b4c27a79)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-b905f57beca7115d54b907deac70c10056274858)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-894694c9558fe7deb8cc896f9411fdfd252bca72)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-03a0a67378b41a353257bd2fe8a1e9a864c89cb5)
 
-### 3.1 quick start
+## 4. How to use
+
+### 4.1 Quick start
 
 Use the following commands to quickly complete the identification of a table.
 
@@ -68,7 +76,7 @@ python3.7 table/predict_table.py \
 
 After the operation is completed, the excel table of each image will be saved to the directory specified by the output field, and an html file will be produced in the directory to visually view the cell coordinates and the recognized table.
 
-### 3.2 Train
+### 4.2 Train
 
 The training, evaluation and inference process of the text detection model can be referred to [detection](../../doc/doc_en/detection_en.md)
 
@@ -76,7 +84,7 @@ The training, evaluation and inference process of the text recognition model can
 
 The training, evaluation and inference process of the table recognition model can be referred to [table_recognition](../../doc/doc_en/table_recognition_en.md)
 
-### 3.3 Calculate TEDS
+### 4.3 Calculate TEDS
 
 The table uses [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) as the evaluation metric of the model. Before the model evaluation, the three models in the pipeline need to be exported as inference models (we have provided them), and the gt for evaluation needs to be prepared. Examples of gt are as follows:
 ```txt
@@ -108,6 +116,6 @@ If the PubLatNet eval dataset is used, it will be output
 teds: 94.98
 ```
 
-## 4. Reference
+## 5. Reference
 1. https://github.com/ibm-aur-nlp/PubTabNet
 2. https://arxiv.org/pdf/1911.10683
diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md
index d7e82658..a21b3d1c 100644
--- a/ppstructure/table/README_ch.md
+++ b/ppstructure/table/README_ch.md
@@ -4,11 +4,12 @@
 
 - [1. 表格识别 pipeline](#1-表格识别-pipeline)
 - [2. 性能](#2-性能)
-- [3. 使用](#3-使用)
-  - [3.1 快速开始](#31-快速开始)
-  - [3.2 训练](#32-训练)
-  - [3.3 计算TEDS](#33-计算teds)
-- [4. Reference](#4-reference)
+- [3. 效果演示](#3-效果演示)
+- [4. 使用](#4-使用)
+  - [4.1 快速开始](#41-快速开始)
+  - [4.2 训练](#42-训练)
+  - [4.3 计算TEDS](#43-计算teds)
+- [5. Reference](#5-reference)
 
 ## 1. 表格识别 pipeline
 
@@ -41,9 +42,17 @@
 | TableRec-RARE(ours) |73.8%| 93.32 |
 | SLANet(ours) | 76.2%| 94.98 |
 
-## 3. 使用
+## 3. 效果演示
 
-### 3.1 快速开始
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-e50a465becdbde9bffb84a84d41d196ac1acf1b6)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-17ea53b181408a35d977c6c26b1ea308b4c27a79)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-b905f57beca7115d54b907deac70c10056274858)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-894694c9558fe7deb8cc896f9411fdfd252bca72)
+![图片](http://agroup.baidu-int.com/file/stream/bj/bj-03a0a67378b41a353257bd2fe8a1e9a864c89cb5)
+
+## 4. 使用
+
+### 4.1 快速开始
 
 使用如下命令即可快速完成一张表格的识别。
 ```python
@@ -70,7 +79,7 @@ python table/predict_table.py \
 ```
 运行完成后,每张图片的excel表格会保存到output字段指定的目录下,同时在该目录下回生产一个html文件,用于可视化查看单元格坐标和识别的表格。
 
-### 3.2 训练
+### 4.2 训练
 
 文本检测模型的训练、评估和推理流程可参考 [detection](../../doc/doc_ch/detection.md)
 
@@ -78,7 +87,7 @@ python table/predict_table.py \
 
 表格识别模型的训练、评估和推理流程可参考 [table_recognition](../../doc/doc_ch/table_recognition.md)
 
-### 3.3 计算TEDS
+### 4.3 计算TEDS
 
 表格使用 [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) 作为模型的评估指标。在进行模型评估之前,需要将pipeline中的三个模型分别导出为inference模型(我们已经提供好),还需要准备评估的gt, gt示例如下:
 ```txt
@@ -110,6 +119,6 @@ python3 table/eval_table.py \
 teds: 94.98
 ```
 
-## 4. Reference
+## 5. Reference
 1. https://github.com/ibm-aur-nlp/PubTabNet
 2. https://arxiv.org/pdf/1911.10683
-- 
GitLab
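
Review note (not part of the patch): both READMEs renumber their headings and the matching table-of-contents anchors by hand, so a quick consistency check may be useful before merging. The sketch below assumes GitHub-style anchor generation (lowercase the heading, drop punctuation, turn spaces into hyphens). `slugify` and `check_toc` are hypothetical helper names for illustration, not part of PaddleOCR.

```python
import re
from typing import List


def slugify(heading: str) -> str:
    """Approximate a GitHub-style anchor for a markdown heading (assumed rules)."""
    text = heading.strip().lower()
    # Drop punctuation; keep word characters (including CJK), hyphens and spaces.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


def check_toc(markdown: str) -> List[str]:
    """Return TOC anchors that do not correspond to any heading in the text."""
    headings = re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    known = {slugify(h) for h in headings}
    # TOC entries are assumed to be list items of the form "- [title](#anchor)".
    anchors = re.findall(
        r"^[ \t]*-[ \t]+\[[^\]]*\]\(#([^)]+)\)", markdown, flags=re.MULTILINE
    )
    return [a for a in anchors if a not in known]
```

Running `check_toc` on the full text of each patched README should give an empty list if every renumbered TOC entry still points at an existing heading. Lines inside code fences that start with `#` are also counted as headings by this simple version, which can hide a mismatch but never reports a false one.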