@@ -258,7 +258,7 @@ PP-Structurev2中,我们参考TableMaster中的token处理方法,将`
`
## 5. 关键信息抽取
-关键信息抽取指的是针对文档图像的文字内容,提取出用户关注的关键信息,如身份证中的姓名、住址等字段。PP-Structure中支持了基于多模态LayoutLM系列模型的语义实体识别 (Semantic Entity Recognition, SER) 以及关系抽取 (Relation Extraction, RE) 任务。PP-Structurev2中,我们对模型结构以及下游任务训练方法进行升级,提出了VI-LayoutXLM(Visual-feature Independent LayoutXLM),具体流程图如下所示。
+关键信息抽取指的是针对文档图像的文字内容,提取出用户关注的关键信息,如身份证中的姓名、住址等字段。PP-Structure中支持了基于多模态LayoutLM系列模型的语义实体识别 (Semantic Entity Recognition, SER) 以及关系抽取 (Relation Extraction, RE) 任务。PP-StructureV2中,我们对模型结构以及下游任务训练方法进行升级,提出了VI-LayoutXLM(Visual-feature Independent LayoutXLM),具体流程图如下所示。
@@ -394,7 +394,7 @@ RE任务的可视化结果如下所示。
| 实验序号 | 策略 | F1-score |
|:------:|:------:|:------:|
| 1 | LayoutXLM | 82.28% |
-| 2 | PP-Structurev2 SER | **87.79%** |
+| 2 | PP-StructureV2 SER | **87.79%** |
**RE任务结果**
@@ -402,7 +402,7 @@ RE任务的可视化结果如下所示。
| 实验序号 | 策略 | F1-score |
|:------:|:------:|:------:|
| 1 | LayoutXLM | 53.13% |
-| 2 | PP-Structurev2 SER | **74.87%** |
+| 2 | PP-StructureV2 SER | **74.87%** |
## 6. Reference
diff --git a/ppstructure/docs/inference.md b/ppstructure/docs/inference.md
index 516db82784ce98abba6db14c795fe7323be508e0..d9b06707306feb03576b5e91dfb6f44530d06a09 100644
--- a/ppstructure/docs/inference.md
+++ b/ppstructure/docs/inference.md
@@ -16,13 +16,13 @@ cd ppstructure
下载模型
```bash
mkdir inference && cd inference
-# 下载PP-Structurev2版面分析模型并解压
+# 下载PP-StructureV2版面分析模型并解压
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
# 下载PP-OCRv3文本检测模型并解压
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# 下载PP-OCRv3文本识别模型并解压
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
-# 下载PP-Structurev2表格识别模型并解压
+# 下载PP-StructureV2表格识别模型并解压
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
```
diff --git a/ppstructure/docs/inference_en.md b/ppstructure/docs/inference_en.md
index 71019ec70f80e44bc16d2b0d07b0bb93b475b7e7..4068ae7485e9f3d09d2e6cb27494d8af444c111b 100644
--- a/ppstructure/docs/inference_en.md
+++ b/ppstructure/docs/inference_en.md
@@ -18,13 +18,13 @@ download model
```bash
mkdir inference && cd inference
-# Download the PP-Structurev2 layout analysis model and unzip it
+# Download the PP-StructureV2 layout analysis model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
# Download the PP-OCRv3 text detection model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# Download the PP-OCRv3 text recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
-# Download the PP-Structurev2 form recognition model and unzip it
+# Download the PP-StructureV2 form recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
```
diff --git a/ppstructure/docs/ppstructurev2_pipeline.png b/ppstructure/docs/ppstructurev2_pipeline.png
deleted file mode 100644
index b53a290a6dbc396449374cc694dd01c304325739..0000000000000000000000000000000000000000
Binary files a/ppstructure/docs/ppstructurev2_pipeline.png and /dev/null differ
diff --git a/ppstructure/kie/how_to_do_kie.md b/ppstructure/kie/how_to_do_kie.md
index e7ac562b1e567ac2da30becb966193ba8e16979b..0c47315d6e484720633a8e9709fee693f055810a 100644
--- a/ppstructure/kie/how_to_do_kie.md
+++ b/ppstructure/kie/how_to_do_kie.md
@@ -42,7 +42,7 @@
## 2. 关键信息抽取任务流程
-PaddleOCR中实现了LayoutXLM等算法(基于Token),同时,在PP-Structurev2中,对LayoutXLM多模态预训练模型的网络结构进行简化,去除了其中的Visual backbone部分,设计了视觉无关的VI-LayoutXLM模型,同时引入符合人类阅读顺序的排序逻辑以及UDML知识蒸馏策略,最终同时提升了关键信息抽取模型的精度与推理速度。
+PaddleOCR中实现了LayoutXLM等算法(基于Token),同时,在PP-StructureV2中,对LayoutXLM多模态预训练模型的网络结构进行简化,去除了其中的Visual backbone部分,设计了视觉无关的VI-LayoutXLM模型,同时引入符合人类阅读顺序的排序逻辑以及UDML知识蒸馏策略,最终同时提升了关键信息抽取模型的精度与推理速度。
下面介绍怎样基于PaddleOCR完成关键信息抽取任务。
@@ -115,7 +115,7 @@ Train:
数据量方面,一般来说,对于比较固定的场景,**50张**左右的训练图片即可达到可以接受的效果,可以使用[PPOCRLabel](../../PPOCRLabel/README_ch.md)完成KIE的标注过程。
-模型方面,推荐使用PP-Structurev2中提出的VI-LayoutXLM模型,它基于LayoutXLM模型进行改进,去除其中的视觉特征提取模块,在精度基本无损的情况下,进一步提升了模型推理速度。更多教程请参考:[VI-LayoutXLM算法介绍](../../doc/doc_ch/algorithm_kie_vi_layoutxlm.md)与[KIE关键信息抽取使用教程](../../doc/doc_ch/kie.md)。
+模型方面,推荐使用PP-StructureV2中提出的VI-LayoutXLM模型,它基于LayoutXLM模型进行改进,去除其中的视觉特征提取模块,在精度基本无损的情况下,进一步提升了模型推理速度。更多教程请参考:[VI-LayoutXLM算法介绍](../../doc/doc_ch/algorithm_kie_vi_layoutxlm.md)与[KIE关键信息抽取使用教程](../../doc/doc_ch/kie.md)。
#### 2.2.2 SER + RE
@@ -145,7 +145,7 @@ Train:
数据量方面,一般来说,对于比较固定的场景,**50张**左右的训练图片即可达到可以接受的效果,可以使用PPOCRLabel完成KIE的标注过程。
-模型方面,推荐使用PP-Structurev2中提出的VI-LayoutXLM模型,它基于LayoutXLM模型进行改进,去除其中的视觉特征提取模块,在精度基本无损的情况下,进一步提升了模型推理速度。更多教程请参考:[VI-LayoutXLM算法介绍](../../doc/doc_ch/algorithm_kie_vi_layoutxlm.md)与[KIE关键信息抽取使用教程](../../doc/doc_ch/kie.md)。
+模型方面,推荐使用PP-StructureV2中提出的VI-LayoutXLM模型,它基于LayoutXLM模型进行改进,去除其中的视觉特征提取模块,在精度基本无损的情况下,进一步提升了模型推理速度。更多教程请参考:[VI-LayoutXLM算法介绍](../../doc/doc_ch/algorithm_kie_vi_layoutxlm.md)与[KIE关键信息抽取使用教程](../../doc/doc_ch/kie.md)。
## 3. 参考文献
diff --git a/ppstructure/kie/how_to_do_kie_en.md b/ppstructure/kie/how_to_do_kie_en.md
index 23b2394f5aa3911a1311d3bc3be8f362861d34af..400bd1c2d29bb24b01818c9060220ecad1e9806c 100644
--- a/ppstructure/kie/how_to_do_kie_en.md
+++ b/ppstructure/kie/how_to_do_kie_en.md
@@ -48,7 +48,7 @@ For more detailed introduction of the algorithms, please refer to Chapter 6 of [
## 2. KIE Pipeline
-Token based methods such as LayoutXLM are implemented in PaddleOCR. What's more, in PP-Structurev2, we simplify the LayoutXLM model and proposed VI-LayoutXLM, in which the visual feature extraction module is removed for speed-up. The textline sorting strategy conforming to the human reading order and UDML knowledge distillation strategy are utilized for higher model accuracy.
+Token based methods such as LayoutXLM are implemented in PaddleOCR. What's more, in PP-StructureV2, we simplify the LayoutXLM model and proposed VI-LayoutXLM, in which the visual feature extraction module is removed for speed-up. The textline sorting strategy conforming to the human reading order and UDML knowledge distillation strategy are utilized for higher model accuracy.
In the non end-to-end KIE method, KIE needs at least ** 2 steps**. Firstly, the OCR model is used to extract the text and its position. Secondly, the KIE model is used to extract the key information according to the image, text position and text content.
@@ -125,7 +125,7 @@ Take the ID card scenario as an example. The key information generally includes
In terms of data, generally speaking, for relatively fixed scenes, **50** training images can achieve acceptable effects. You can refer to [PPOCRLabel](../../PPOCRLabel/README.md) for finish the labeling process.
-In terms of model, it is recommended to use the VI-layoutXLM model proposed in PP-Structurev2. It is improved based on the LayoutXLM model, removing the visual feature extraction module, and further improving the model inference speed without the significant reduction on model accuracy. For more tutorials, please refer to [VI-LayoutXLM introduction](../../doc/doc_en/algorithm_kie_vi_layoutxlm_en.md) and [KIE tutorial](../../doc/doc_en/kie_en.md).
+In terms of model, it is recommended to use the VI-layoutXLM model proposed in PP-StructureV2. It is improved based on the LayoutXLM model, removing the visual feature extraction module, and further improving the model inference speed without the significant reduction on model accuracy. For more tutorials, please refer to [VI-LayoutXLM introduction](../../doc/doc_en/algorithm_kie_vi_layoutxlm_en.md) and [KIE tutorial](../../doc/doc_en/kie_en.md).
#### 2.2.2 SER + RE
@@ -155,7 +155,7 @@ For each textline, you need to add 'ID' and 'linking' field information. The 'ID
In terms of data, generally speaking, for relatively fixed scenes, about **50** training images can achieve acceptable effects.
-In terms of model, it is recommended to use the VI-layoutXLM model proposed in PP-Structurev2. It is improved based on the LayoutXLM model, removing the visual feature extraction module, and further improving the model inference speed without the significant reduction on model accuracy. For more tutorials, please refer to [VI-LayoutXLM introduction](../../doc/doc_en/algorithm_kie_vi_layoutxlm_en.md) and [KIE tutorial](../../doc/doc_en/kie_en.md).
+In terms of model, it is recommended to use the VI-layoutXLM model proposed in PP-StructureV2. It is improved based on the LayoutXLM model, removing the visual feature extraction module, and further improving the model inference speed without the significant reduction on model accuracy. For more tutorials, please refer to [VI-LayoutXLM introduction](../../doc/doc_en/algorithm_kie_vi_layoutxlm_en.md) and [KIE tutorial](../../doc/doc_en/kie_en.md).
diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
index 1d082f3878c56e42d175d13c75e1fe17916e7781..cebbd1ccafbde0aee7fa9f50398682a86cb1c8dd 100644
--- a/ppstructure/table/README.md
+++ b/ppstructure/table/README.md
@@ -66,7 +66,7 @@ mkdir inference && cd inference
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# Download the PP-OCRv3 text recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
-# Download the PP-Structurev2 form recognition model and unzip it
+# Download the PP-StructureV2 form recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
# run
diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md
index feccb70adfe20fa8c1cd06f33a10ee6fa043e69e..72b7f5cbeb176cd28102c2f4da576f7af3f0c275 100644
--- a/ppstructure/table/README_ch.md
+++ b/ppstructure/table/README_ch.md
@@ -71,7 +71,7 @@ mkdir inference && cd inference
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# 下载PP-OCRv3文本识别模型并解压
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
-# 下载PP-Structurev2中文表格识别模型并解压
+# 下载PP-StructureV2中文表格识别模型并解压
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
# 执行表格识别
|