From 7c8abcc6a641d76268edebccb827b7d816e860f8 Mon Sep 17 00:00:00 2001
From: WenmuZhou <572459439@qq.com>
Date: Wed, 30 Mar 2022 11:24:45 +0000
Subject: [PATCH] update doc
---
deploy/hubserving/readme.md | 25 +++++++++++++++--------
deploy/hubserving/readme_en.md | 37 ++++++++++++++++++++++------------
2 files changed, 41 insertions(+), 21 deletions(-)
diff --git a/deploy/hubserving/readme.md b/deploy/hubserving/readme.md
index ff04aa8b..91356155 100755
--- a/deploy/hubserving/readme.md
+++ b/deploy/hubserving/readme.md
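@@ -19,13 +19,14 @@ PaddleOCR provides 2 service deployment methods:
# Service deployment based on PaddleHub Serving
-The hubserving service deployment directory includes three service packages: detection, recognition, and 2-stage series connection. Please select the corresponding service package to install and start according to your needs. The directory structure is as follows:
+The hubserving service deployment directory includes four service packages: detection, recognition, 2-stage series connection, and table recognition. Please select the corresponding service package to install and start according to your needs. The directory structure is as follows: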
```
deploy/hubserving/
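└─ ocr_cls angle classification module service package
└─ ocr_det detection module service package
└─ ocr_rec recognition module service package
└─ ocr_system detection + recognition series service package
+ └─ structure_table table recognition service package
```
Each service package contains 3 files. Taking the 2-stage series connection service package as an example, the directory is as follows: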
@@ -43,7 +44,7 @@ deploy/hubserving/ocr_system/
```shell
# Install paddlehub
# paddlehub requires python>3.6.2
-pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
+pip3 install paddlehub==2.1.0 --upgrade -i https://mirror.baidu.com/pypi/simple
```
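### 2. Download inference model
@@ -52,12 +53,13 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim
detection model: ./inference/ch_PP-OCRv2_det_infer/
recognition model: ./inference/ch_PP-OCRv2_rec_infer/
text direction classifier: ./inference/ch_ppocr_mobile_v2.0_cls_infer/
+table structure recognition model: ./inference/en_ppocr_mobile_v2.0_table_structure_infer/
```
-**The model path can be viewed and modified in `params.py`.** More models can be downloaded from the [model library](../../doc/doc_ch/models_list.md) provided by PaddleOCR, or replaced with models you have trained and converted yourself.
+**The model path can be viewed and modified in `params.py`.** More models can be downloaded from the PaddleOCR model libraries [PP-OCR](../../doc/doc_ch/models_list.md) and [PP-Structure](../../ppstructure/docs/models_list.md), or replaced with models you have trained and converted yourself.
### 3. Install service module
-PaddleOCR provides 3 kinds of service modules. Install the modules you need.
+PaddleOCR provides 5 kinds of service modules. Install the modules you need.
* On Linux, installation examples are as follows: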
```shell
@@ -72,6 +74,9 @@ hub install deploy/hubserving/ocr_rec/
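# Or, install the detection + recognition series service module:
hub install deploy/hubserving/ocr_system/
+
+# Or, install the table recognition service module:
+hub install deploy/hubserving/structure_table/
```
* On Windows (the folder separator is `\`), installation examples are as follows:
@@ -87,6 +92,9 @@ hub install deploy\hubserving\ocr_rec\
# Or, install the detection + recognition series service module:
hub install deploy\hubserving\ocr_system\
+
+# Or, install the table recognition service module:
+hub install deploy\hubserving\structure_table\
```
### 4. Start service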
@@ -102,7 +110,7 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
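**Parameters:**
|Parameter|Usage|
-|-|-|
+|---|---|
|--modules/-m|PaddleHub Serving pre-installed models, listed as multiple Module==Version key-value pairs
*`When Version is not specified, the latest version is selected by default`*|
|--port/-p|Service port, default is 8866|
|--use_multiprocess|Whether to enable concurrent mode; the default is single-process mode. Concurrent mode is recommended for multi-core CPU machines
*`Windows only supports single-process mode`*|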
@@ -157,11 +165,12 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
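2 parameters need to be passed to the script:
- **server_url**: service address, in the format
`http://[ip_address]:[port]/predict/[module_name]`
-For example, if the detection, classification, recognition, and 3-stage (detection + classification + recognition) services are started with the configuration files, the request URLs will respectively be:
+For example, if the detection, classification, recognition, 3-stage (detection + classification + recognition), and table recognition services are started with the configuration files, the request URLs will respectively be:
`http://127.0.0.1:8865/predict/ocr_det`
`http://127.0.0.1:8866/predict/ocr_cls`
`http://127.0.0.1:8867/predict/ocr_rec`
`http://127.0.0.1:8868/predict/ocr_system`
+`http://127.0.0.1:8869/predict/structure_table`
- **image_dir**: test image path; can be a single image path or an image directory path
- **visualize**: whether to visualize the results; the default is False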
@@ -172,7 +181,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
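The returned result is a list. Each item in the list is a dict. The dict may contain 3 kinds of fields, as follows:
|Field name|Data type|Meaning|
-|----|----|----|
+|---|---|---|
|angle|str|text angle|
|text|str|text content|
|confidence|float|text recognition confidence or text angle classification confidence|
@@ -182,7 +191,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
Different modules return different fields. For example, the result returned by the text recognition service module does not contain the `text_region` field. Details are as follows:
| Field name / Module name | ocr_det | ocr_cls | ocr_rec | ocr_system | structure_table |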
-| ---- | ---- | ---- | ---- | ---- | ---- |
+| --- | --- | --- | --- | --- | --- |
|angle| | ✔ | | ✔ | |
|text| | |✔|✔| |
|confidence| |✔ |✔| | |
diff --git a/deploy/hubserving/readme_en.md b/deploy/hubserving/readme_en.md
index f8fd3fa1..03c1630d 100755
--- a/deploy/hubserving/readme_en.md
+++ b/deploy/hubserving/readme_en.md
@@ -19,13 +19,14 @@ PaddleOCR provides 2 service deployment methods:
# Service deployment based on PaddleHub Serving
-The hubserving service deployment directory includes three service packages: detection, recognition, and two-stage series connection. Please select the corresponding service package to install and start service according to your needs. The directory is as follows:
+The hubserving service deployment directory includes four service packages: text detection, text recognition, two-stage series connection, and table recognition. Please select the corresponding service package to install and start according to your needs. The directory is as follows:
```
deploy/hubserving/
- └─ ocr_det detection module service package
- └─ ocr_cls angle class module service package
- └─ ocr_rec recognition module service package
+ └─ ocr_det text detection module service package
+ └─ ocr_cls text angle class module service package
+ └─ ocr_rec text recognition module service package
└─ ocr_system two-stage series connection service package
+ └─ structure_table table recognition service package
```
Each service pack contains 3 files. Take the 2-stage series connection service package as an example, the directory is as follows:
@@ -50,29 +51,33 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim
### 2. Download inference model
Before installing the service module, you need to prepare the inference model and put it in the correct path. By default, the PP-OCRv2 models are used, and the default model path is:
```
-detection model: ./inference/ch_PP-OCRv2_det_infer/
-recognition model: ./inference/ch_PP-OCRv2_rec_infer/
-text direction classifier: ./inference/ch_ppocr_mobile_v2.0_cls_infer/
+text detection model: ./inference/ch_PP-OCRv2_det_infer/
+text recognition model: ./inference/ch_PP-OCRv2_rec_infer/
+text angle classifier: ./inference/ch_ppocr_mobile_v2.0_cls_infer/
+table structure recognition model: ./inference/en_ppocr_mobile_v2.0_table_structure_infer/
```
**The model path can be found and modified in `params.py`.** More models provided by PaddleOCR can be obtained from the [model library](../../doc/doc_en/models_list_en.md). You can also use models trained by yourself.
### 3. Install Service Module
-PaddleOCR provides 3 kinds of service modules, install the required modules according to your needs.
+PaddleOCR provides 5 kinds of service modules. Install the required modules according to your needs.
* On Linux platform, the examples are as follows.
```shell
-# Install the detection service module:
+# Install the text detection service module:
hub install deploy/hubserving/ocr_det/
-# Or, install the angle class service module:
+# Or, install the text angle class service module:
hub install deploy/hubserving/ocr_cls/
-# Or, install the recognition service module:
+# Or, install the text recognition service module:
hub install deploy/hubserving/ocr_rec/
# Or, install the 2-stage series service module:
hub install deploy/hubserving/ocr_system/
+
+# Or, install the table recognition service module:
+hub install deploy/hubserving/structure_table/
```
* On Windows platform, the examples are as follows.
@@ -88,6 +93,9 @@ hub install deploy\hubserving\ocr_rec\
# Or, install the 2-stage series service module:
hub install deploy\hubserving\ocr_system\
+
+# Or, install the table recognition service module:
+hub install deploy\hubserving\structure_table\
```
### 4. Start service
@@ -103,7 +111,7 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
**parameters:**
|parameters|usage|
-|-|-|
+|---|---|
|--modules/-m|PaddleHub Serving pre-installed model, listed in the form of multiple Module==Version key-value pairs
*`When Version is not specified, the latest version is selected by default`*|
|--port/-p|Service port, default is 8866|
|--use_multiprocess|Enable concurrent mode, the default is single-process mode, this mode is recommended for multi-core CPU machines
*`Windows operating system only supports single-process mode`*|
@@ -162,11 +170,13 @@ python tools/test_hubserving.py server_url image_path
Two parameters need to be passed to the script:
- **server_url**:service address,format of which is
`http://[ip_address]:[port]/predict/[module_name]`
-For example, if the detection, recognition and 2-stage serial services are started with provided configuration files, the respective `server_url` would be:
+For example, if the text detection, text angle classification, text recognition, 3-stage (detection + classification + recognition), and table recognition services are started with the provided configuration files, the respective `server_url` values would be:
+
`http://127.0.0.1:8865/predict/ocr_det`
`http://127.0.0.1:8866/predict/ocr_cls`
`http://127.0.0.1:8867/predict/ocr_rec`
`http://127.0.0.1:8868/predict/ocr_system`
+`http://127.0.0.1:8869/predict/structure_table`
- **image_dir**:Test image path, can be a single image path or an image directory path
- **visualize**:Whether to visualize the results, the default value is False
@@ -184,6 +194,7 @@ The returned result is a list. Each item in the list is a dict. The dict may con
|text|str|text content|
|confidence|float|text recognition confidence|
|text_region|list|text location coordinates|
+|html|str|HTML string of the recognized table|
The fields returned by different modules are different. For example, the results returned by the text recognition service module do not contain `text_region`. The details are as follows:
--
GitLab