diff --git a/deploy/pdserving/README.md b/deploy/pdserving/README.md
index 07b019280ae160f9b9e3c98713c7a34e924d8a9e..d3ba7d4cfbabb111831a6ecbce28c1ac352066fe 100644
--- a/deploy/pdserving/README.md
+++ b/deploy/pdserving/README.md
@@ -36,7 +36,6 @@ PaddleOCR operating environment and Paddle Serving operating environment are nee
 1. Please prepare PaddleOCR operating environment reference [link](../../doc/doc_ch/installation.md).
    Download the corresponding paddlepaddle whl package according to the environment, it is recommended to install version 2.2.2.
-
 2. The steps of PaddleServing operating environment prepare are as follows:
@@ -194,6 +193,52 @@ The recognition model is the same.
     2021-05-13 03:42:36,979 chl2(In: ['rec'], Out: ['@DAGExecutor']) size[0/0]
     ```
+## C++ Serving
+
+Python-based service deployment makes secondary development convenient, but real applications often need better performance. PaddleServing also provides a higher-performance C++ deployment.
+
+C++ service deployment matches the Python version in the environment setup and data preparation stages; it differs in how the service is started and how the client sends requests.
+
+| Language | Speed | Secondary development | Compilation required? |
+|-----|-----|---------|------------|
+| C++ | fast | Slightly difficult | Not for single-model prediction; required for multi-model concatenation |
+| python | average | easy | Not for single-model or multi-model |
+
+1. Compile Serving
+
+    To improve prediction performance, the C++ service also supports multi-model concatenation. Unlike the Python Pipeline service, multi-model concatenation requires the model pre- and post-processing code to be written on the server side, so Serving must be recompiled locally.
+    For details, refer to the official document: [how to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_EN.md)
+
+2. Run the following command to start the service.
+    ```
+    # Start the service and save the running log in log.txt
+    python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
+    ```
+    After the service starts successfully, a log similar to the following is printed in log.txt:
+    ![](./imgs/start_server.png)
+
+3. Send service request
+
+    Because pre- and post-processing run inside the C++ server, only the base64-encoded string of the image is passed to the server, which speeds up input. The client configuration must therefore be modified manually.
+    Change the feed_type field and shape field in ppocrv2_det_client/serving_client_conf.prototxt to the following:
+
+    ```
+    feed_var {
+      name: "x"
+      alias_name: "x"
+      is_lod_tensor: false
+      feed_type: 20
+      shape: 1
+    }
+    ```
+
+    Start the client:
+
+    ```
+    python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
+    ```
+    After a successful run, the model's prediction results are printed in the command-line window.
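+    An example of the result is:
+    ![](./imgs/results.png)
+
 ## WINDOWS Users
 Windows does not support Pipeline Serving. To launch Paddle Serving on Windows, use Web Service; for more information, please refer to [Paddle Serving for Windows Users](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Windows_Tutorial_EN.md)
diff --git a/deploy/pdserving/README_CN.md b/deploy/pdserving/README_CN.md
index afd355bac098a3c13c36476e2967d8f94e8cd306..28ec45a046bbceb512d124377c2dcb3cf4d3c417 100644
--- a/deploy/pdserving/README_CN.md
+++ b/deploy/pdserving/README_CN.md
@@ -6,6 +6,8 @@ PaddleOCR provides 2 service deployment methods:
 - Deployment based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; for usage, refer to the [documentation](../../deploy/hubserving/readme.md);
 - Deployment based on PaddleServing: the code path is "`./deploy/pdserving`"; follow this tutorial.
+* For an AIStudio demo, see [Hands-on OCR service deployment based on PaddleServing](https://aistudio.baidu.com/aistudio/projectdetail/3630726)
+
 # Service deployment based on PaddleServing
 This document describes how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) tool to deploy the PP-OCR dynamic graph model as a pipeline online service.
@@ -30,7 +32,6 @@ PaddleOCR provides 2 service deployment methods:
 Both the PaddleOCR operating environment and the Paddle Serving operating environment need to be prepared.
 - Prepare the PaddleOCR operating environment: [link](../../doc/doc_ch/installation.md)
-
   Download the paddlepaddle whl package that matches your environment; version 2.2.2 is recommended
 - Prepare the PaddleServing operating environment as follows
@@ -106,7 +107,7 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
 1. Download the PaddleOCR code; skip this step if it is already downloaded
     ```
     git clone https://github.com/PaddlePaddle/PaddleOCR
-
+
     # Enter the working directory
     cd PaddleOCR/deploy/pdserving/
     ```
@@ -187,6 +188,73 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
     2021-05-13 03:42:36,979 chl2(In: ['rec'], Out: ['@DAGExecutor']) size[0/0]
     ```
+
+## Paddle Serving C++ deployment
+
+Python-based service deployment makes secondary development convenient, but production applications often need better performance. PaddleServing also provides a higher-performance C++ deployment.
+
+C++ service deployment matches the Python version in the environment setup and data preparation stages; it differs in how the service is started and how the client sends requests.
+
+| Language | Speed | Secondary development | Compilation required? |
+|-----|-----|---------|------------|
+| C++ | very fast | Slightly difficult | Not for single-model prediction; required for multi-model concatenation |
+| python | average | easy | Not for single-model or multi-model |
+
+1. Prepare the Serving environment
+
+To improve prediction performance, the C++ service also supports multi-model concatenation. Unlike the Python pipeline service, multi-model concatenation requires the model pre- and post-processing code to be written on the server side, so Serving must be recompiled locally.
+
+First download the Serving repository, then copy the OCR text detection preprocessing code into it:
+```
+git clone https://github.com/PaddlePaddle/Serving
+
+cp -rf general_detection_op.cpp Serving/core/general-server/op
+
+```
+
+For details, refer to the official document [How to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_CN.md); note that the WITH_OPENCV option must be enabled.
+
+After compilation, be sure to install the three compiled whl packages and set the SERVING_BIN environment variable.
+
+2. Start the service with the following command:
+
+To chain two models in one service, pass the relative paths of the model folders in order after --model, and pass the custom C++ OP class names in order after --op:
+
+    ```
+    # Start the service and save the running log in log.txt
+    python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
+    ```
+    After the service starts successfully, a log similar to the following is printed in log.txt:
+    ![](./imgs/start_server.png)
+
+3. Send a service request:
+
+    Because pre- and post-processing run inside the C++ server, only the base64-encoded string of the image is passed to the server, which speeds up input. You therefore need to manually modify
+    the feed_type field and shape field in ppocrv2_det_client/serving_client_conf.prototxt as follows:
+    ```
+    feed_var {
+      name: "x"
+      alias_name: "x"
+      is_lod_tensor: false
+      feed_type: 20
+      shape: 1
+    }
+    ```
+    Start the client:
+    ```
+    python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
+    ```
+
+    After a successful run, the model's prediction results are printed in the command-line window. An example of the result is:
+    ![](./imgs/results.png)
+
+    Enter the server ip:port in a browser to see the real-time QPS of the current service. (The port number must be in the range 8000-9000.)
+
+    Tested on 200 real images with the detection long side limited to 960, peak QPS on a T4 GPU reaches about 51, roughly 2.12 times that of the pipeline.
+
+    ![](./imgs/c++_qps.png)
+
+
 ## Windows users
diff --git a/deploy/pdserving/ocr_cpp_client.py b/deploy/pdserving/ocr_cpp_client.py
index 2baa7565ac78b9551c788c7b36457bce38828eb5..cb42943923879d1138e065881a15da893a505083 100755
--- a/deploy/pdserving/ocr_cpp_client.py
+++ b/deploy/pdserving/ocr_cpp_client.py
@@ -45,10 +45,8 @@ for img_file in os.listdir(test_img_dir):
         image_data = file.read()
     image = cv2_to_base64(image_data)
     res_list = []
-    #print(image)
     fetch_map = client.predict(
         feed={"x": image}, fetch=["save_infer_model/scale_0.tmp_1"], batch=True)
-    print("fetrch map:", fetch_map)
     one_batch_res = ocr_reader.postprocess(fetch_map,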
                                            with_score=True)
     for res in one_batch_res:
         res_list.append(res[0])
diff --git a/deploy/pdserving/web_service.py b/deploy/pdserving/web_service.py
index b97c6e1f564a61bb9792542b9e9f1e88d782e80d..e6b13fcc23adf6d71e01b5a65622ec3eb81142e8 100644
--- a/deploy/pdserving/web_service.py
+++ b/deploy/pdserving/web_service.py
@@ -15,6 +15,7 @@
 from paddle_serving_server.web_service import WebService, Op
 import logging
 import numpy as np
+import copy
 import cv2
 import base64
 # from paddle_serving_app.reader import OCRReader
@@ -36,7 +37,7 @@ class DetOp(Op):
         self.filter_func = FilterBoxes(10, 10)
         self.post_func = DBPostProcess({
             "thresh": 0.3,
-            "box_thresh": 0.5,
+            "box_thresh": 0.6,
             "max_candidates": 1000,
             "unclip_ratio": 1.5,
             "min_size": 3
@@ -79,8 +80,10 @@ class RecOp(Op):
         raw_im = input_dict["image"]
         data = np.frombuffer(raw_im, np.uint8)
         im = cv2.imdecode(data, cv2.IMREAD_COLOR)
-        dt_boxes = input_dict["dt_boxes"]
-        dt_boxes = self.sorted_boxes(dt_boxes)
+        self.dt_list = input_dict["dt_boxes"]
+        self.dt_list = self.sorted_boxes(self.dt_list)
+        # deepcopy to save origin dt_boxes
+        dt_boxes = copy.deepcopy(self.dt_list)
         feed_list = []
         img_list = []
         max_wh_ratio = 0
@@ -126,25 +129,29 @@ class RecOp(Op):
                 imgs[id] = norm_img
             feed = {"x": imgs.copy()}
             feed_list.append(feed)
-
         return feed_list, False, None, ""
 
     def postprocess(self, input_dicts, fetch_data, data_id, log_id):
-        res_list = []
+        rec_list = []
+        dt_num = len(self.dt_list)
         if isinstance(fetch_data, dict):
             if len(fetch_data) > 0:
                 rec_batch_res = self.ocr_reader.postprocess(
                     fetch_data, with_score=True)
                 for res in rec_batch_res:
-                    res_list.append(res[0])
+                    rec_list.append(res[0])
         elif isinstance(fetch_data, list):
             for one_batch in fetch_data:
                 one_batch_res = self.ocr_reader.postprocess(
                     one_batch, with_score=True)
                 for res in one_batch_res:
-                    res_list.append(res[0])
-
-        res = {"res": str(res_list)}
+                    rec_list.append(res[0])
+        result_list = []
+        for i in range(dt_num):
+            text = rec_list[i]
+            dt_box = self.dt_list[i]
+            result_list.append([text, dt_box.tolist()])
+        res = {"result": str(result_list)}
         return res, None, ""
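
The last hunk changes `RecOp.postprocess` to pair each recognized text with its detection box by index and return the pairs as a string under the `"result"` key. The following is a minimal sketch of that pairing outside the service, and of how a client could parse the string back into a list. `pair_text_with_boxes` and `to_list` are hypothetical helpers, not part of `web_service.py`, and the boxes and texts are made-up sample data. The sketch uses `zip()` instead of indexing with `range(dt_num)`, which is an assumption about how one might guard against `rec_list` being shorter than `dt_list`; the diff itself indexes directly.

```python
import ast


def to_list(box):
    """Convert a box to plain lists; NumPy arrays expose tolist(), lists are copied."""
    return box.tolist() if hasattr(box, "tolist") else list(box)


def pair_text_with_boxes(rec_list, dt_list):
    """Pair each recognized text with its detection box, in order.

    Hypothetical helper: zip() stops at the shorter input, so a length
    mismatch between texts and boxes cannot raise IndexError.
    """
    return [[text, to_list(box)] for text, box in zip(rec_list, dt_list)]


# Made-up sample data: two quadrilateral boxes and two recognized strings.
dt_list = [
    [[0, 0], [10, 0], [10, 5], [0, 5]],
    [[0, 8], [12, 8], [12, 14], [0, 14]],
]
rec_list = ["hello", "world"]

# Same response shape as the diff: the pairs are serialized with str().
res = {"result": str(pair_text_with_boxes(rec_list, dt_list))}

# Client side: the string is a Python literal, so ast.literal_eval can parse it.
for text, box in ast.literal_eval(res["result"]):
    print(text, box)
```

Serializing with `str()` keeps the response a single string field, at the cost of requiring the client to parse a Python literal rather than JSON.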