update doc

7c8abcc6 · 文幕地方 · ec9eb720 · 7c8abcc6 · 7c8abcc6

Showing 2 changed files with 41 additions and 21 deletions

deploy/hubserving/readme.md +17 -8

deploy/hubserving/readme_en.md +24 -13

--- a/deploy/hubserving/readme.md
+++ b/deploy/hubserving/readme.md
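@@ -19,13 +19,14 @@ PaddleOCR provides 2 service deployment methods:
 # Service deployment based on PaddleHub Serving
-The hubserving service deployment directory includes three service packages: detection, recognition, and two-stage series connection. Select the service package you need, then install and start it. The directory structure is as follows:
+The hubserving service deployment directory includes four service packages: detection, recognition, two-stage series connection, and table recognition. Select the service package you need, then install and start it. The directory structure is as follows: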
 ```
 deploy/hubserving/
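  └─  ocr_cls     angle classification module service package
  └─  ocr_det     detection module service package
  └─  ocr_rec     recognition module service package
  └─  ocr_system  detection + recognition series service package
+  └─  structure_table  table recognition service package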
 ```
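 Each service package contains 3 files. Taking the 2-stage series connection service package as an example, the directory is as follows: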
@@ -43,7 +44,7 @@ deploy/hubserving/ocr_system/
 ```shell
 # Install paddlehub  
 # paddlehub requires python>3.6.2
-pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
+pip3 install paddlehub==2.1.0 --upgrade -i https://mirror.baidu.com/pypi/simple
 ```
 ### 2. Download the inference model
@@ -52,12 +53,13 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim
 detection model: ./inference/ch_PP-OCRv2_det_infer/
 recognition model: ./inference/ch_PP-OCRv2_rec_infer/
 angle classifier: ./inference/ch_ppocr_mobile_v2.0_cls_infer/
+table structure recognition model: ./inference/en_ppocr_mobile_v2.0_table_structure_infer/
 ```  
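-**The model path can be viewed and modified in `params.py`.** More models can be downloaded from the [model library](../../doc/doc_ch/models_list.md) provided by PaddleOCR, or you can substitute models that you have trained and converted yourself.
+**The model path can be viewed and modified in `params.py`.** More models can be downloaded from the PaddleOCR model libraries [PP-OCR](../../doc/doc_ch/models_list.md) and [PP-Structure](../../ppstructure/docs/models_list.md), or you can substitute models that you have trained and converted yourself.
 ### 3. Install the service module
-PaddleOCR provides 3 kinds of service modules. Install the ones you need.
+PaddleOCR provides 5 kinds of service modules. Install the ones you need.
 * On Linux, install as follows: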
 ```shell
@@ -72,6 +74,9 @@ hub install deploy/hubserving/ocr_rec/
 # Or, install the detection + recognition series service module:  
 hub install deploy/hubserving/ocr_system/
+# Or, install the table recognition service module:  
+hub install deploy/hubserving/structure_table/
 ```
 * On Windows (the folder separator is `\`), install as follows:
@@ -87,6 +92,9 @@ hub install deploy\hubserving\ocr_rec\
 # Or, install the detection + recognition series service module:
 hub install deploy\hubserving\ocr_system\
+# Or, install the table recognition service module:
+hub install deploy\hubserving\structure_table\
 ```
 ### 4. Start the service
@@ -102,7 +110,7 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
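 **Parameters:**  
 |Parameter|Usage|  
-|-|-|  
+|---|---|  
 |--modules/-m|PaddleHub Serving pre-installed models, listed as multiple Module==Version key-value pairs<br>*`When Version is not specified, the latest version is selected by default`*|  
 |--port/-p|Service port, default is 8866|  
 |--use_multiprocess|Whether to enable concurrent mode; the default is single-process mode, and concurrent mode is recommended for multi-core CPU machines<br>*`Windows only supports single-process mode`*|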
@@ -157,11 +165,12 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
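 Two parameters need to be passed to the script:  
 - **server_url**: service address, in the format  
 `http://[ip_address]:[port]/predict/[module_name]`  
-For example, if the classification, detection, recognition, and 3-stage detection + classification + recognition services are started with the configuration files, the request URLs will be, respectively:  
+For example, if the detection, classification, recognition, 3-stage (detection + classification + recognition), and table recognition services are started with the configuration files, the request URLs will be, respectively:  
 `http://127.0.0.1:8865/predict/ocr_det`  
 `http://127.0.0.1:8866/predict/ocr_cls`  
 `http://127.0.0.1:8867/predict/ocr_rec`  
 `http://127.0.0.1:8868/predict/ocr_system`  
+`http://127.0.0.1:8869/predict/structure_table`  
 - **image_dir**: test image path, either a single image path or the path of a directory of images  
 - **visualize**: whether to visualize the results; the default is False  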
@@ -172,7 +181,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
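 The returned result is a list. Each item in the list is a dict, which may contain up to 3 kinds of fields, as follows:
 |Field name|Data type|Meaning|
-|----|----|----|
+|---|---|---|
 |angle|str|text angle|
 |text|str|text content|
 |confidence|float|text recognition confidence or text angle classification confidence|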
@@ -182,7 +191,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
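 Different modules return different fields. For example, the result returned by the text recognition service module does not contain the `text_region` field. The details are as follows:
 | Field name / Module name | ocr_det | ocr_cls | ocr_rec | ocr_system | structure_table |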
-|  ----  |  ----  |  ----  |  ----  |  ----  | ----  |
+|  ---  |  ---  |  ---  |  ---  |  ---  | ---  |
 |angle| | ✔ | | ✔ | |
 |text| | |✔|✔| |
 |confidence| |✔ |✔| | |

--- a/deploy/hubserving/readme_en.md
+++ b/deploy/hubserving/readme_en.md
@@ -19,13 +19,14 @@ PaddleOCR provides 2 service deployment methods:
 # Service deployment based on PaddleHub Serving  
-The hubserving service deployment directory includes three service packages: detection, recognition, and two-stage series connection. Please select the corresponding service package to install and start service according to your needs. The directory is as follows:  
+The hubserving service deployment directory includes four service packages: text detection, text recognition, two-stage series connection, and table recognition. Please select the corresponding service package to install and start the service according to your needs. The directory is as follows:  
 ```
 deploy/hubserving/
-  └─  ocr_det     detection module service package
+  └─  ocr_det     text detection module service package
-  └─  ocr_cls     angle class module service package
+  └─  ocr_cls     text angle classification module service package
-  └─  ocr_rec     recognition module service package
+  └─  ocr_rec     text recognition module service package
  └─  ocr_system  two-stage series connection service package
+  └─  structure_table  table recognition service package
 ```
 Each service pack contains 3 files. Take the 2-stage series connection service package as an example, the directory is as follows:  
@@ -50,29 +51,33 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim
 ### 2. Download inference model
 Before installing the service module, you need to prepare the inference model and put it in the correct path. By default, the PP-OCRv2 models are used, and the default model path is:  
 ```
-detection model: ./inference/ch_PP-OCRv2_det_infer/
+text detection model: ./inference/ch_PP-OCRv2_det_infer/
-recognition model: ./inference/ch_PP-OCRv2_rec_infer/
+text recognition model: ./inference/ch_PP-OCRv2_rec_infer/
-text direction classifier: ./inference/ch_ppocr_mobile_v2.0_cls_infer/
+text angle classifier: ./inference/ch_ppocr_mobile_v2.0_cls_infer/
+table structure recognition model: ./inference/en_ppocr_mobile_v2.0_table_structure_infer/
 ```  
 **The model path can be found and modified in `params.py`.** More models provided by PaddleOCR can be obtained from the [model library](../../doc/doc_en/models_list_en.md). You can also use models trained by yourself.
 ### 3. Install Service Module
-PaddleOCR provides 3 kinds of service modules, install the required modules according to your needs.
+PaddleOCR provides 5 kinds of service modules. Install the modules you need.
 * On Linux platform, the examples are as follows.
 ```shell
-# Install the detection service module:
+# Install the text detection service module:
 hub install deploy/hubserving/ocr_det/
-# Or, install the angle class service module:
+# Or, install the text angle classification service module:
 hub install deploy/hubserving/ocr_cls/
-# Or, install the recognition service module:
+# Or, install the text recognition service module:
 hub install deploy/hubserving/ocr_rec/
 # Or, install the 2-stage series service module:
 hub install deploy/hubserving/ocr_system/
+# Or, install the table recognition service module:
+hub install deploy/hubserving/structure_table/
 ```
 * On Windows platform, the examples are as follows.
@@ -88,6 +93,9 @@ hub install deploy\hubserving\ocr_rec\
 # Or, install the 2-stage series service module:
 hub install deploy\hubserving\ocr_system\
+# Or, install the table recognition service module:
+hub install deploy\hubserving\structure_table\
 ```
 ### 4. Start service
@@ -103,7 +111,7 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
 **parameters：**  
 |parameters|usage|  
-|-|-|  
+|---|---|  
 |--modules/-m|PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairs<br>*`When Version is not specified, the latest version is selected by default`*|
 |--port/-p|Service port, default is 8866|  
 |--use_multiprocess|Enable concurrent mode, the default is single-process mode, this mode is recommended for multi-core CPU machines<br>*`Windows operating system only supports single-process mode`*|
@@ -162,11 +170,13 @@ python tools/test_hubserving.py server_url image_path
 Two parameters need to be passed to the script:
 - **server_url**：service address，format of which is
 `http://[ip_address]:[port]/predict/[module_name]`  
-For example, if the detection, recognition and 2-stage serial services are started with provided configuration files, the respective `server_url` would be:  
+For example, if the text detection, text angle classification, text recognition, 3-stage (detection + classification + recognition), and table recognition services are started with the provided configuration files, the respective `server_url` values would be:  
 `http://127.0.0.1:8865/predict/ocr_det`  
 `http://127.0.0.1:8866/predict/ocr_cls`  
 `http://127.0.0.1:8867/predict/ocr_rec`  
 `http://127.0.0.1:8868/predict/ocr_system`  
+`http://127.0.0.1:8869/predict/structure_table`
 - **image_dir**：Test image path, can be a single image path or an image directory path
 - **visualize**：Whether to visualize the results, the default value is False
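
As an illustration of the `server_url` format above, here is a minimal Python sketch of a client. It is not the project's `tools/test_hubserving.py`. The helper names (`build_server_url`, `build_payload`, `send_image`) are hypothetical, and the JSON body shape `{"images": [<base64 string>]}` is an assumption that should be checked against the service you actually run.

```python
import base64
import json
from urllib import request


def build_server_url(ip_address: str, port: int, module_name: str) -> str:
    """Fill in the pattern http://[ip_address]:[port]/predict/[module_name]."""
    return f"http://{ip_address}:{port}/predict/{module_name}"


def build_payload(image_bytes: bytes) -> bytes:
    """Encode raw image bytes as a JSON request body.

    Assumption: the service expects {"images": [<base64 string>]}.
    """
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"images": [encoded]}).encode("utf8")


def send_image(server_url: str, image_path: str, timeout: float = 30.0) -> dict:
    """POST one image file to a running service and return the decoded JSON reply."""
    with open(image_path, "rb") as f:
        body = build_payload(f.read())
    req = request.Request(
        server_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf8"))
```

With a service listening locally, a call might look like `send_image(build_server_url("127.0.0.1", 8869, "structure_table"), "path/to/image.jpg")`, where the image path is a placeholder.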
@@ -184,6 +194,7 @@ The returned result is a list. Each item in the list is a dict. The dict may con
 |text|str|text content|
 |confidence|float|text recognition confidence|
 |text_region|list|text location coordinates|
+|html|str|table HTML string|
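
Because each module returns only some of these fields, client code is safer when it treats every field as optional. The sketch below is one way to do that. The function names and the sample values in the usage note are hypothetical, and the field list is taken from the tables in this readme rather than from the service code.

```python
# Field names taken from the readme tables; every one of them is optional.
DOCUMENTED_FIELDS = ("angle", "text", "confidence", "text_region", "html")


def pick_documented_fields(item: dict) -> dict:
    """Keep only the documented fields that are present in one result dict."""
    return {name: item[name] for name in DOCUMENTED_FIELDS if name in item}


def describe_results(results: list) -> list:
    """Apply pick_documented_fields to every item of a returned result list."""
    return [pick_documented_fields(item) for item in results]
```

For instance, `describe_results([{"text": "hello", "confidence": 0.98}])` keeps both keys, while an item from a table recognition call would typically carry `html` instead.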
 The fields returned by different modules are different. For example, the results returned by the text recognition service module do not contain `text_region`. The details are as follows: