Commit 6077d952, authored by tink2123

update readme, return bbox

Parent d529f7ae
...@@ -31,8 +31,6 @@ PaddleOCR operating environment and Paddle Serving operating environment are nee ...@@ -31,8 +31,6 @@ PaddleOCR operating environment and Paddle Serving operating environment are nee
1. Please prepare PaddleOCR operating environment reference [link](../../doc/doc_ch/installation.md). 1. Please prepare PaddleOCR operating environment reference [link](../../doc/doc_ch/installation.md).
Download the paddle whl package that matches your environment; version 2.2.2 is recommended.
2. The steps of PaddleServing operating environment prepare are as follows: 2. The steps of PaddleServing operating environment prepare are as follows:
...@@ -191,6 +189,15 @@ The recognition model is the same. ...@@ -191,6 +189,15 @@ The recognition model is the same.
``` ```
## C++ Serving ## C++ Serving
Python-based service deployment makes secondary development convenient. Real applications, however, often need better performance, so PaddleServing also provides a faster C++ deployment option.
C++ service deployment matches the Python deployment in the environment setup and data preparation stages; it differs in how the service is started and how the client sends requests.
| Language | Speed | Secondary development | Compilation required |
|-----|-----|---------|------------|
| C++ | Fast | Slightly difficult | Not needed for single-model prediction; needed for multi-model concatenation |
| Python | Average | Easy | Not needed for single-model or multi-model prediction |
1. Compile Serving 1. Compile Serving
To improve predictive performance, C++ services also provide multiple model concatenation services. Unlike Python Pipeline services, multiple model concatenation requires the pre - and post-model processing code to be written on the server side, so local recompilation is required to generate serving. Specific may refer to the official document: [how to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_EN.md) To improve predictive performance, C++ services also provide multiple model concatenation services. Unlike Python Pipeline services, multiple model concatenation requires the pre - and post-model processing code to be written on the server side, so local recompilation is required to generate serving. Specific may refer to the official document: [how to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_EN.md)
...@@ -198,12 +205,28 @@ The recognition model is the same. ...@@ -198,12 +205,28 @@ The recognition model is the same.
2. Run the following command to start the service. 2. Run the following command to start the service.
``` ```
# Start the service and save the running log in log.txt # Start the service and save the running log in log.txt
python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralRecOp --port 9293 &>log.txt & python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
``` ```
After the service is successfully started, a log similar to the following will be printed in log.txt After the service is successfully started, a log similar to the following will be printed in log.txt
![](./imgs/start_server.png) ![](./imgs/start_server.png)
3. Send service request 3. Send service request
Because pre- and post-processing run inside the C++ server, the client sends only the base64-encoded string of the image to speed up the request. You therefore need to manually change the feed_type and shape fields in ppocrv2_det_client/serving_client_conf.prototxt to the following:
```
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 20
shape: 1
}
```
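The manual edit above can also be scripted. The following is a minimal sketch, not part of the commit: `patch_feed_var` is a hypothetical helper, and the sample input (a `feed_type: 1` entry with three `shape` lines) is an assumed starting state rather than the real generated file. It rewrites the first `feed_var` block so that it carries `feed_type: 20` and a single `shape: 1`.

```python
import re


def patch_feed_var(conf_text: str) -> str:
    """Return conf_text with the first feed_var block set to feed_type 20 and shape 1.

    Assumes the feed_var block contains no nested braces.
    """
    match = re.search(r"feed_var\s*\{[^}]*\}", conf_text)
    if match is None:
        raise ValueError("no feed_var block found")
    block = match.group(0)
    # Overwrite whatever feed_type value is currently present.
    block = re.sub(r"feed_type:\s*-?\d+", "feed_type: 20", block)
    # Drop every existing shape line, then append a single `shape: 1`.
    block = re.sub(r"[ \t]*shape:\s*-?\d+[ \t]*\n", "", block)
    block = block.rstrip("}").rstrip() + "\n  shape: 1\n}"
    return conf_text[:match.start()] + block + conf_text[match.end():]


# Assumed example input; the real file may differ.
sample_conf = """feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: -1
  shape: -1
}
fetch_var {
  name: "y"
}
"""

patched_conf = patch_feed_var(sample_conf)
print(patched_conf)
```

Working on the text in memory keeps the sketch free of file paths; to apply it, read `serving_client_conf.prototxt`, pass its contents through the helper, and write the result back after reviewing it.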
Start the client:
``` ```
python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
``` ```
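The request described above carries only the base64-encoded string of the image. As an illustration of that encoding step, here is a minimal sketch using only the Python standard library. The helper names `encode_image_bytes` and `decode_image_string` are hypothetical, and the sketch makes no claim about how `ocr_cpp_client.py` is actually written.

```python
import base64


def encode_image_bytes(image_bytes: bytes) -> str:
    """Encode raw image file bytes as a base64 text string."""
    return base64.b64encode(image_bytes).decode("utf8")


def decode_image_string(image_str: str) -> bytes:
    """Recover the raw image bytes from a base64 string (the receiving side)."""
    return base64.b64decode(image_str)


# Stand-in bytes; in practice these would come from open(path, "rb").read().
fake_image = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"not a real JPEG"
encoded = encode_image_bytes(fake_image)

# The round trip must be lossless, otherwise the server cannot decode the image.
assert decode_image_string(encoded) == fake_image
```

Sending a single string per image is what makes the `shape: 1` setting in the client configuration plausible: the feed is one string value rather than a multi-dimensional tensor.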
......
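...@@ -6,6 +6,8 @@ PaddleOCR provides 2 service deployment methods: ...@@ -6,6 +6,8 @@ PaddleOCR provides 2 service deployment methods:
- Deployment based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; for usage, see the [documentation](../../deploy/hubserving/readme.md) - Deployment based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; for usage, see the [documentation](../../deploy/hubserving/readme.md)
- Deployment based on PaddleServing: the code path is "`./deploy/pdserving`"; follow this tutorial. - Deployment based on PaddleServing: the code path is "`./deploy/pdserving`"; follow this tutorial.
* For an AIStudio demo, see [基于PaddleServing的OCR服务化部署实战](https://aistudio.baidu.com/aistudio/projectdetail/3630726) (hands-on OCR service deployment with PaddleServing)
# Service deployment based on PaddleServing # Service deployment based on PaddleServing
This document describes how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) tool to deploy the PP-OCR dynamic graph model as a pipeline online service. This document describes how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) tool to deploy the PP-OCR dynamic graph model as a pipeline online service.
...@@ -32,8 +34,6 @@ PaddleOCR provides 2 service deployment methods: ...@@ -32,8 +34,6 @@ PaddleOCR provides 2 service deployment methods:
- Prepare the PaddleOCR operating environment: [link](../../doc/doc_ch/installation.md) - Prepare the PaddleOCR operating environment: [link](../../doc/doc_ch/installation.md)
Download the paddlepaddle whl package that matches your environment; version 2.2.2 is recommended
- Prepare the PaddleServing operating environment as follows - Prepare the PaddleServing operating environment as follows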
```bash ```bash
...@@ -197,9 +197,24 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \ ...@@ -197,9 +197,24 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
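C++ service deployment is the same as Python in the environment setup and data preparation stages; it differs in how the service is started and how the client sends requests. C++ service deployment is the same as Python in the environment setup and data preparation stages; it differs in how the service is started and how the client sends requests.
| Language | Speed | Secondary development | Compilation required |
|-----|-----|---------|------------|
| C++ | Very fast | Slightly difficult | Not needed for single-model prediction; needed for multi-model concatenation |
| Python | Average | Easy | Not needed for single-model or multi-model prediction |
1. Prepare the Serving environment 1. Prepare the Serving environment
To improve prediction performance, the C++ service also provides multi-model concatenation. Unlike the Python pipeline service, multi-model concatenation requires the model pre- and post-processing code to be written on the server side, so Serving must be recompiled locally. For details, see the official document: [how to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_CN.md) To improve prediction performance, the C++ service also provides multi-model concatenation. Unlike the Python pipeline service, multi-model concatenation requires the model pre- and post-processing code to be written on the server side, so Serving must be recompiled locally.
First, download the Serving repository and copy the OCR text detection preprocessing code into it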
```
git clone https://github.com/PaddlePaddle/Serving
cp -rf general_detection_op.cpp Serving/core/general-server/op
```
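For details, see the official document: [how to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_CN.md). Note that the WITH_OPENCV option must be enabled.
After compiling, install the three compiled whl packages and set the SERVING_BIN environment variable. After compiling, install the three compiled whl packages and set the SERVING_BIN environment variable.
...@@ -209,12 +224,25 @@ C++ service deployment is the same as Python in the environment setup and data preparation stages; the difference ...@@ -209,12 +224,25 @@ C++ service deployment is the same as Python in the environment setup and data preparation stages; the difference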
``` ```
# Start the service and save the running log in log.txt # Start the service and save the running log in log.txt
python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralRecOp --port 9293 &>log.txt & python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
``` ```
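After the service starts successfully, log.txt will contain a log similar to the following After the service starts successfully, log.txt will contain a log similar to the following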
![](./imgs/start_server.png) ![](./imgs/start_server.png)
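3. Send a service request: 3. Send a service request:
Because pre- and post-processing run inside the C++ server, only the base64-encoded string of the image is sent to the C++ server to speed up the request. You therefore need to manually change
the feed_type and shape fields in ppocrv2_det_client/serving_client_conf.prototxt to the following: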
```
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 20
shape: 1
}
```
Start the client
``` ```
python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
``` ```
......
...@@ -47,9 +47,7 @@ for img_file in os.listdir(test_img_dir): ...@@ -47,9 +47,7 @@ for img_file in os.listdir(test_img_dir):
res_list = [] res_list = []
fetch_map = client.predict( fetch_map = client.predict(
feed={"x": image}, fetch=["save_infer_model/scale_0.tmp_1"], batch=True) feed={"x": image}, fetch=["save_infer_model/scale_0.tmp_1"], batch=True)
print("fetrch map:", fetch_map)
one_batch_res = ocr_reader.postprocess(fetch_map, with_score=True) one_batch_res = ocr_reader.postprocess(fetch_map, with_score=True)
for res in one_batch_res: for res in one_batch_res:
res_list.append(res[0])
res = {"res": str(res_list)} res = {"res": str(res_list)}
print(res) print(res)
...@@ -15,6 +15,7 @@ from paddle_serving_server.web_service import WebService, Op ...@@ -15,6 +15,7 @@ from paddle_serving_server.web_service import WebService, Op
import logging import logging
import numpy as np import numpy as np
import copy
import cv2 import cv2
import base64 import base64
# from paddle_serving_app.reader import OCRReader # from paddle_serving_app.reader import OCRReader
...@@ -29,14 +30,16 @@ _LOGGER = logging.getLogger() ...@@ -29,14 +30,16 @@ _LOGGER = logging.getLogger()
class DetOp(Op): class DetOp(Op):
def init_op(self): def init_op(self):
self.det_preprocess = Sequential([ self.det_preprocess = Sequential([
DetResizeForTest(), Div(255), DetResizeForTest(),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose( Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
Transpose(
(2, 0, 1)) (2, 0, 1))
]) ])
self.filter_func = FilterBoxes(10, 10) self.filter_func = FilterBoxes(10, 10)
self.post_func = DBPostProcess({ self.post_func = DBPostProcess({
"thresh": 0.3, "thresh": 0.3,
"box_thresh": 0.5, "box_thresh": 0.6,
"max_candidates": 1000, "max_candidates": 1000,
"unclip_ratio": 1.5, "unclip_ratio": 1.5,
"min_size": 3 "min_size": 3
...@@ -79,8 +82,10 @@ class RecOp(Op): ...@@ -79,8 +82,10 @@ class RecOp(Op):
raw_im = input_dict["image"] raw_im = input_dict["image"]
data = np.frombuffer(raw_im, np.uint8) data = np.frombuffer(raw_im, np.uint8)
im = cv2.imdecode(data, cv2.IMREAD_COLOR) im = cv2.imdecode(data, cv2.IMREAD_COLOR)
dt_boxes = input_dict["dt_boxes"] self.dt_list = input_dict["dt_boxes"]
dt_boxes = self.sorted_boxes(dt_boxes) self.dt_list = self.sorted_boxes(self.dt_list)
# deepcopy to save origin dt_boxes
dt_boxes = copy.deepcopy(self.dt_list)
feed_list = [] feed_list = []
img_list = [] img_list = []
max_wh_ratio = 0 max_wh_ratio = 0
...@@ -126,25 +131,29 @@ class RecOp(Op): ...@@ -126,25 +131,29 @@ class RecOp(Op):
imgs[id] = norm_img imgs[id] = norm_img
feed = {"x": imgs.copy()} feed = {"x": imgs.copy()}
feed_list.append(feed) feed_list.append(feed)
return feed_list, False, None, "" return feed_list, False, None, ""
def postprocess(self, input_dicts, fetch_data, data_id, log_id): def postprocess(self, input_dicts, fetch_data, data_id, log_id):
res_list = [] rec_list = []
dt_num = len(self.dt_list)
if isinstance(fetch_data, dict): if isinstance(fetch_data, dict):
if len(fetch_data) > 0: if len(fetch_data) > 0:
rec_batch_res = self.ocr_reader.postprocess( rec_batch_res = self.ocr_reader.postprocess(
fetch_data, with_score=True) fetch_data, with_score=True)
for res in rec_batch_res: for res in rec_batch_res:
res_list.append(res[0]) rec_list.append(res[0])
elif isinstance(fetch_data, list): elif isinstance(fetch_data, list):
for one_batch in fetch_data: for one_batch in fetch_data:
one_batch_res = self.ocr_reader.postprocess( one_batch_res = self.ocr_reader.postprocess(
one_batch, with_score=True) one_batch, with_score=True)
for res in one_batch_res: for res in one_batch_res:
res_list.append(res[0]) rec_list.append(res[0])
result_list = []
res = {"res": str(res_list)} for i in range(dt_num):
text = rec_list[i]
dt_box = self.dt_list[i]
result_list.append([text,dt_box.tolist()])
res = {"result": str(result_list)}
return res, None, "" return res, None, ""
......
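With this change, `postprocess` returns `{"result": str(result_list)}`, where each entry pairs a recognized text with its detection box. Because the value is the `str()` of a nested Python list, a consumer can turn it back into data with `ast.literal_eval`. The sketch below assumes each text is a plain Python string and each box is a nested list of numbers (as produced by `dt_box.tolist()`); `parse_result` and the sample values are hypothetical.

```python
import ast


def parse_result(result_value: str):
    """Parse the stringified [[text, box], ...] list into (text, box) tuples."""
    pairs = ast.literal_eval(result_value)
    return [(text, box) for text, box in pairs]


# Illustrative value shaped like the new "result" field; not real service output.
sample_result = str([
    ["hello", [[10.0, 20.0], [110.0, 20.0], [110.0, 50.0], [10.0, 50.0]]],
    ["world", [[10.0, 60.0], [120.0, 60.0], [120.0, 90.0], [10.0, 90.0]]],
])

parsed = parse_result(sample_result)
for text, box in parsed:
    # Each box is a list of four [x, y] corner points.
    print(text, box[0], box[2])
```

`ast.literal_eval` only accepts Python literals, so it is a safer choice than `eval` for a string received over the network. Note that clients written against the old response should read the `result` key instead of `res`.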