Merge pull request #5405 from WenmuZhou/vqa

fix link in doc & test=document_fix

15a645ea · Double_V · GitHub · 26ad6512 · 9d4aed11 · 15a645ea
8 changed files
--- a/PPOCRLabel/README.md
+++ b/PPOCRLabel/README.md
@@ -79,7 +79,7 @@ PPOCRLabel # run

 ```bash
 cd PaddleOCR/PPOCRLabel
-python3 setup.py bdist_wheel 
+python3 setup.py bdist_wheel
 pip3 install dist/PPOCRLabel-1.0.2-py2.py3-none-any.whl
 ```

@@ -171,7 +171,7 @@ python PPOCRLabel.py
 - Model language switching: Changing the built-in model language is supportable by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languagesinclude French, German, Korean, and Japanese.
  For specific model download links, please refer to [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md#multilingual-recognition-modelupdating)

-- **Custom Model**: If users want to replace the built-in model with their own inference model, they can follow the [Custom Model Code Usage](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_en/whl_en.md#31-use-by-code) by modifying PPOCRLabel.py for [Instantiation of PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/release/ 2.3/PPOCRLabel/PPOCRLabel.py#L116) :
+- **Custom Model**: If users want to replace the built-in model with their own inference model, they can follow the [Custom Model Code Usage](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_en/whl_en.md#31-use-by-code) by modifying PPOCRLabel.py for [Instantiation of PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/PPOCRLabel/PPOCRLabel.py#L86) :

  add parameter `det_model_dir`  in `self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=gpu, lang=lang) `

@@ -235,4 +235,4 @@ For some data that are difficult to recognize, the recognition results will not

 ### 4. Related

-1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
\ No newline at end of file
+1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
--- a/README.md
+++ b/README.md
@@ -92,7 +92,7 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr
 | ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
 | Chinese and English ultra-lightweight PP-OCRv2 model（11.6M） |  ch_PP-OCRv2_xx |Mobile & Server|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)| [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar)|
 | Chinese and English ultra-lightweight PP-OCR model (9.4M)       | ch_ppocr_mobile_v2.0_xx      | Mobile & server   |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar)      |
-| Chinese and English general PP-OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_traingit.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar)  |
+| Chinese and English general PP-OCR model (143.4M)               | ch_ppocr_server_v2.0_xx      | Server            |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar)    |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar)  |


 For more model downloads (including multiple languages), please refer to [PP-OCR series model downloads](./doc/doc_en/models_list_en.md).

--- a/README_ch.md
+++ b/README_ch.md
@@ -99,7 +99,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力
 - [PP-Structure信息提取](./ppstructure/README_ch.md)
    - [版面分析](./ppstructure/layout/README_ch.md)
    - [表格识别](./ppstructure/table/README_ch.md)
-    - [DocVQA](./ppstructure/vqa/README_ch.md)
+    - [DocVQA](./ppstructure/vqa/README.md)
    - [关键信息提取](./ppstructure/docs/kie.md)
 - OCR学术圈
    - [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md)

--- a/deploy/lite/readme.md
+++ b/deploy/lite/readme.md
@@ -42,7 +42,7 @@ git checkout release/v2.9

 注意：编译Paddle-Lite获得预测库时，需要打开`--with_cv=ON --with_extra=ON`两个选项，`--arch`表示`arm`版本，这里指定为armv8，
 更多编译命令
-介绍请参考 [链接](https://paddle-lite.readthedocs.io/zh/latest/source_compile/compile_andriod.html) 。
+介绍请参考 [链接](https://paddle-lite.readthedocs.io/zh/release-v2.10_a/source_compile/linux_x86_compile_android.html) 。

 直接下载预测库并解压后，可以得到`inference_lite_lib.android.armv8/`文件夹，通过编译Paddle-Lite得到的预测库位于
 `Paddle-Lite/build.lite.android.armv8.gcc/inference_lite_lib.android.armv8/`文件夹下。

--- a/deploy/lite/readme_en.md
+++ b/deploy/lite/readme_en.md
@@ -44,7 +44,7 @@ git checkout release/v2.8

 Note: When compiling Paddle-Lite to obtain the Paddle-Lite library, you need to turn on the two options `--with_cv=ON --with_extra=ON`, `--arch` means the `arm` version, here is designated as armv8,

-More compilation commands refer to the introduction [link](https://paddle-lite.readthedocs.io/zh/latest/source_compile/compile_andriod.html) 。
+More compilation commands refer to the introduction [link](https://paddle-lite.readthedocs.io/zh/release-v2.10_a/source_compile/linux_x86_compile_android.html) 。

 After directly downloading the Paddle-Lite library and decompressing it, you can get the `inference_lite_lib.android.armv8/` folder, and the Paddle-Lite library obtained by compiling Paddle-Lite is located
 `Paddle-Lite/build.lite.android.armv8.gcc/inference_lite_lib.android.armv8/` folder.

--- a/doc/doc_en/models_list_en.md
+++ b/doc/doc_en/models_list_en.md
@@ -4,13 +4,14 @@
 > 2. Compared with [models 1.1](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md), which are trained with static graph programming paradigm, models 2.0 are the dynamic graph trained version and achieve close performance.
 > 3. All models in this tutorial are all ppocr-series models, for more introduction of algorithms and models based on public dataset, you can refer to [algorithm overview tutorial](./algorithm_overview_en.md).

-- [1. Text Detection Model](#Detection)
-- [2. Text Recognition Model](#Recognition)
-    - [2.1 Chinese Recognition Model](#Chinese)
-    - [2.2 English Recognition Model](#English)
-    - [2.3 Multilingual Recognition Model](#Multilingual)
-- [3. Text Angle Classification Model](#Angle)
-- [4. Paddle-Lite Model](#Paddle-Lite)
+- [OCR Model List（V2.1, updated on 2021.9.6）](#ocr-model-listv21-updated-on-202196)
+  - [1. Text Detection Model](#1-text-detection-model)
+  - [2. Text Recognition Model](#2-text-recognition-model)
+    - [2.1 Chinese Recognition Model](#21-chinese-recognition-model)
+    - [2.2 English Recognition Model](#22-english-recognition-model)
+    - [2.3 Multilingual Recognition Model（Updating...）](#23-multilingual-recognition-modelupdating)
+  - [3. Text Angle Classification Model](#3-text-angle-classification-model)
+  - [4. Paddle-Lite Model](#4-paddle-lite-model)

 The downloadable models provided by PaddleOCR include `inference model`, `trained model`, `pre-trained model` and `slim model`. The differences between the models are as follows:

@@ -44,7 +45,7 @@ Relationship of the above models is as follows.
 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
 |ch_PP-OCRv2_rec_slim|[New] Slim qunatization with distillation lightweight model, supporting Chinese, English, multilingual text recognition|[ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml)| 9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_train.tar) |
-|ch_PP-OCRv2_rec|[New] Original lightweight model, supporting Chinese, English, multilingual text recognition|[ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml)|8.5M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
+|ch_PP-OCRv2_rec|[New] Original lightweight model, supporting Chinese, English, multilingual text recognition|[ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml)|8.5M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
 |ch_ppocr_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)| 6M | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_slim_train.tar) |
 |ch_ppocr_mobile_v2.0_rec|Original lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)|5.2M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
 |ch_ppocr_server_v2.0_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |

--- a/ppstructure/README.md
+++ b/ppstructure/README.md
@@ -1,16 +1,18 @@
 English | [简体中文](README_ch.md)

-- [1. Introduction](#1)
-- [2. Update log](#2)
-- [3. Features](#3)
-- [4. Results](#4)
-  * [4.1 Layout analysis and table recognition](#41)
-  * [4.2 DOC-VQA](#42)
-- [5. Quick start](#5)
-- [6. PP-Structure System](#6)
-  * [6.1 Layout analysis and table recognition](#61)
-  * [6.2 DOC-VQA](#62)
-- [7. Model List](#7)
+- [1. Introduction](#1-introduction)
+- [2. Update log](#2-update-log)
+- [3. Features](#3-features)
+- [4. Results](#4-results)
+  - [4.1 Layout analysis and table recognition](#41-layout-analysis-and-table-recognition)
+  - [4.2 DOC-VQA](#42-doc-vqa)
+- [5. Quick start](#5-quick-start)
+- [6. PP-Structure System](#6-pp-structure-system)
+  - [6.1 Layout analysis and table recognition](#61-layout-analysis-and-table-recognition)
+    - [6.1.1 Layout analysis](#611-layout-analysis)
+    - [6.1.2 Table recognition](#612-table-recognition)
+  - [6.2 DOC-VQA](#62-doc-vqa)
+- [7. Model List](#7-model-list)

 <a name="1"></a>

@@ -54,8 +56,8 @@ The figure shows the pipeline of layout analysis + table recognition. The image
 ### 4.2 DOC-VQA

 * SER
-
-![](./vqa/images/result_ser/zh_val_0_ser.jpg) | ![](./vqa/images/result_ser/zh_val_42_ser.jpg)
+*
+![](../doc/vqa/result_ser/zh_val_0_ser.jpg) | ![](../doc/vqa/result_ser/zh_val_42_ser.jpg)
 ---|---

 Different colored boxes in the figure represent different categories. For xfun dataset, there are three categories: query, answer and header:
@@ -69,7 +71,7 @@ The corresponding category and OCR recognition results are also marked at the to

 * RE

-![](./vqa/images/result_re/zh_val_21_re.jpg) | ![](./vqa/images/result_re/zh_val_40_re.jpg)
+![](../doc/vqa/result_re/zh_val_21_re.jpg) | ![](../doc/vqa/result_re/zh_val_40_re.jpg)
 ---|---


@@ -96,7 +98,7 @@ In PP-Structure, the image will be divided into 5 types of areas **text, title,

 #### 6.1.1 Layout analysis

-Layout analysis classifies image by region, including the use of Python scripts of layout analysis tools, extraction of designated category detection boxes, performance indicators, and custom training layout analysis models. For details, please refer to [document](layout/README_en.md).
+Layout analysis classifies image by region, including the use of Python scripts of layout analysis tools, extraction of designated category detection boxes, performance indicators, and custom training layout analysis models. For details, please refer to [document](layout/README.md).

 #### 6.1.2 Table recognition


--- a/ppstructure/README_ch.md
+++ b/ppstructure/README_ch.md
@@ -1,16 +1,18 @@
 [English](README.md) | 简体中文

-- [1. 简介](#1)
-- [2. 近期更新](#2)
-- [3. 特性](#3)
-- [4. 效果展示](#4)
-  * [4.1 版面分析和表格识别](#41)
-  * [4.2 DOC-VQA](#42)
-- [5. 快速体验](#5)
-- [6. PP-Structure 介绍](#6)
-  * [6.1 版面分析+表格识别](#61)
-  * [6.2 DOC-VQA](#62)
-- [7. 模型库](#7)
+- [1. 简介](#1-简介)
+- [2. 近期更新](#2-近期更新)
+- [3. 特性](#3-特性)
+- [4. 效果展示](#4-效果展示)
+  - [4.1 版面分析和表格识别](#41-版面分析和表格识别)
+  - [4.2 DOC-VQA](#42-doc-vqa)
+- [5. 快速体验](#5-快速体验)
+- [6. PP-Structure 介绍](#6-pp-structure-介绍)
+  - [6.1 版面分析+表格识别](#61-版面分析表格识别)
+    - [6.1.1 版面分析](#611-版面分析)
+    - [6.1.2 表格识别](#612-表格识别)
+  - [6.2 DOC-VQA](#62-doc-vqa)
+- [7. 模型库](#7-模型库)

 <a name="1"></a>

@@ -54,7 +56,7 @@ PP-Structure的主要特性如下：

 * SER

-![](./vqa/images/result_ser/zh_val_0_ser.jpg) | ![](./vqa/images/result_ser/zh_val_42_ser.jpg)
+![](../doc/vqa/result_ser/zh_val_0_ser.jpg) | ![](../doc/vqa/result_ser/zh_val_42_ser.jpg)
 ---|---

 图中不同颜色的框表示不同的类别，对于XFUN数据集，有`QUESTION`, `ANSWER`, `HEADER` 3种类别
@@ -67,7 +69,7 @@ PP-Structure的主要特性如下：

 * RE

-![](./vqa/images/result_re/zh_val_21_re.jpg) | ![](./vqa/images/result_re/zh_val_40_re.jpg)
+![](../doc/vqa/result_re/zh_val_21_re.jpg) | ![](../doc/vqa/result_re/zh_val_40_re.jpg)
 ---|---


@@ -134,4 +136,4 @@ PP-Structure系列模型列表（更新中）
 |PP-Layout_v1.0_re_pretrained|基于LayoutXLM在xfun中文数据集上训练的RE模型|1.4G|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/PP-Layout_v1.0_re_pretrained.tar) |


-更多模型下载，可以参考 [PPOCR model_list](../doc/doc_en/models_list.md) and  [PPStructure model_list](./docs/model_list.md)
\ No newline at end of file
+更多模型下载，可以参考 [PPOCR model_list](../doc/doc_en/models_list.md) and  [PPStructure model_list](./docs/model_list.md)
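
Reviewer note: the table-of-contents hunks in `doc/doc_en/models_list_en.md`, `ppstructure/README.md` and `ppstructure/README_ch.md` replace short anchors such as `#1` or `#Detection` with anchors derived from the heading text. The sketch below shows roughly how such an anchor can be computed, assuming the commonly observed GitHub rule (lowercase the heading, drop punctuation other than hyphens and underscores, turn spaces into hyphens). `heading_anchor` is a hypothetical helper, not part of PaddleOCR, and it ignores edge cases such as duplicate headings.

```python
import re


def heading_anchor(heading: str) -> str:
    """Approximate the anchor GitHub generates for a Markdown heading.

    Assumption: lowercase, remove every character that is not a word
    character (Unicode letters, digits, underscore), a hyphen, or a space,
    then replace each space with a hyphen.
    """
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)  # drops '.', '+', fullwidth parentheses, etc.
    return slug.replace(" ", "-")


if __name__ == "__main__":
    # Headings taken from the tables of contents touched by this commit.
    for title in (
        "4.1 Layout analysis and table recognition",
        "2.3 Multilingual Recognition Model（Updating...）",
        "6.1 版面分析+表格识别",
    ):
        print(f"[{title}](#{heading_anchor(title)})")
```

Running a check like this over every heading before committing may catch anchors that no longer resolve. The rule is only an approximation, so verify the rendered page as well.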