diff --git a/deploy/pdserving/README.md b/deploy/pdserving/README.md
index 7ee001423084be2ed300135f706ed22f7e63a3ab..7ed52af90df653251e2501a032b26a00d9b96984 100644
--- a/deploy/pdserving/README.md
+++ b/deploy/pdserving/README.md
@@ -30,29 +30,31 @@ The introduction and tutorial of Paddle Serving service deployment framework ref
 PaddleOCR operating environment and Paddle Serving operating environment are needed.

 1. Please prepare PaddleOCR operating environment reference [link](../../doc/doc_ch/installation.md).
-   Download the corresponding paddlepaddle whl package according to the environment, it is recommended to install version 2.2.2.
+   Download the corresponding PaddlePaddle whl package according to your environment; version 2.2.2 is recommended.

 2. The steps of PaddleServing operating environment prepare are as follows:

-    ```bash
-    # Install serving which used to start the service
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
-    pip3 install paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
-    # Install paddle-serving-server for cuda10.1
-    # wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
-    # pip3 install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
-    # Install serving which used to start the service
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.7.0-cp37-none-any.whl
+```bash
+# Install paddle-serving-server, used to start the service
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
+pip3 install paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
+# For a CUDA 10.1 environment, install paddle-serving-server with the following commands
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl
+# pip3 install paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl

-    # Install serving-app
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
-    pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
-    ```
+# Install paddle-serving-client, used to send requests to the service
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.8.3-cp37-none-any.whl
+pip3 install paddle_serving_client-0.8.3-cp37-none-any.whl

-    **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
+# Install paddle-serving-app
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.8.3-py3-none-any.whl
+pip3 install paddle_serving_app-0.8.3-py3-none-any.whl
+```

+**Note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Latest_Packages_CN.md).

@@ -187,6 +189,26 @@ The recognition model is the same.
 2021-05-13 03:42:36,979 chl1(In: ['det'], Out: ['rec']) size[6/0]
 2021-05-13 03:42:36,979 chl2(In: ['rec'], Out: ['@DAGExecutor']) size[0/0]
 ```
+## C++ Serving
+
+1. Compile Serving
+
+   To improve prediction performance, the C++ service also supports chaining multiple models into a single service. Unlike the Python pipeline service, chaining models requires the models' pre- and post-processing code to be written on the server side, so Serving has to be recompiled locally to generate the serving binary. For details, refer to the official document: [How to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_EN.md).
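+
+   After compiling, install the three whl packages produced by the build (server, client and app) and set the SERVING_BIN environment variable to the compiled serving binary. A minimal sketch, assuming a GPU build whose outputs land under the usual python/dist and core/general-server paths; the exact filenames and directories depend on your Serving version and compile options:
+
+   ```
+   # Install the three whl packages generated by the compilation
+   # (run from the build directory; names vary with version and platform)
+   pip3 install python/dist/paddle_serving_server_gpu-*.whl
+   pip3 install python/dist/paddle_serving_client-*.whl
+   pip3 install python/dist/paddle_serving_app-*.whl
+   # Point SERVING_BIN at the compiled serving binary
+   export SERVING_BIN=${PWD}/core/general-server/serving
+   ```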
+
+2. Run the following command to start the service.
+
+   To chain two models in one service, pass the relative paths of the model folders, in order, after --model, and the corresponding custom C++ OP class names, in order, after --op:
+
+   ```
+   # Start the service and save the running log in log.txt
+   python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralRecOp --port 9293 &>log.txt &
+   ```
+   After the service starts successfully, a log similar to the following is printed in log.txt:
+   ![](./imgs/start_server.png)
+
+3. Send a service request:
+
+   ```
+   python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
+   ```
+   After it runs successfully, the prediction results of the model are printed in the cmd window. An example of the result is:
+   ![](./imgs/results.png)
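+
+   If you want to adapt the client, ocr_cpp_client.py follows the usual paddle_serving_client pattern. A condensed sketch of its core logic (the endpoint matches the port used in step 2; the image directory and the pre/post-processing are simplified here, see the script for the full version):
+
+   ```
+   import base64
+   import os
+   import sys
+
+   from paddle_serving_client import Client
+
+   client = Client()
+   # Load the two converted client configs (det first, then rec) passed on the command line
+   client.load_client_config(sys.argv[1:])
+   client.connect(["127.0.0.1:9293"])
+
+   test_img_dir = "imgs"  # sample image directory, adjust as needed
+   for img_file in os.listdir(test_img_dir):
+       with open(os.path.join(test_img_dir, img_file), 'rb') as f:
+           # The service expects the raw image bytes as a base64-encoded string
+           image = base64.b64encode(f.read()).decode('utf8')
+       # The fetch variable name is the output of the converted inference model
+       fetch_map = client.predict(
+           feed={"x": image}, fetch=["save_infer_model/scale_0.tmp_1"], batch=True)
+       print("fetch map:", fetch_map)
+   ```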

 ## WINDOWS Users

diff --git a/deploy/pdserving/README_CN.md b/deploy/pdserving/README_CN.md
index 0ac0f770c5b41616d66382c87ad9f6a123aebfa1..aad9e14e504481b8f9d113e6e293bfe4609d57b3 100644
--- a/deploy/pdserving/README_CN.md
+++ b/deploy/pdserving/README_CN.md
@@ -21,6 +21,7 @@ PaddleOCR provides two service deployment methods:
 - [Environment preparation](#环境准备)
 - [Model conversion](#模型转换)
 - [Paddle Serving pipeline deployment](#部署)
+- [Paddle Serving C++ deployment](#C++)
 - [Windows users](#Windows用户)
 - [FAQ](#FAQ)

@@ -30,28 +31,30 @@ PaddleOCR provides two service deployment methods:
 You need to prepare the runtime environments for both PaddleOCR and Paddle Serving.

 - Prepare the PaddleOCR runtime environment: [link](../../doc/doc_ch/installation.md)
+  Download the corresponding PaddlePaddle whl package according to your environment; version 2.2.2 is recommended.

 - Prepare the PaddleServing runtime environment as follows:

 ```bash
 # Install paddle-serving-server, used to start the service
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
-pip3 install paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
+pip3 install paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl

 # For a CUDA 10.1 environment, install paddle-serving-server with the following commands
-# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
-# pip3 install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl
+# pip3 install paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl

 # Install paddle-serving-client, used to send requests to the service
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
-pip3 install paddle_serving_client-0.7.0-cp37-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.8.3-cp37-none-any.whl
+pip3 install paddle_serving_client-0.8.3-cp37-none-any.whl
+
 # Install paddle-serving-app
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
-pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.8.3-py3-none-any.whl
+pip3 install paddle_serving_app-0.8.3-py3-none-any.whl
 ```

-**Note:** To install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
+**Note:** To install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Latest_Packages_CN.md).

@@ -106,7 +109,7 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
 1. Download the PaddleOCR code (skip this step if it is already downloaded)
    ```
    git clone https://github.com/PaddlePaddle/PaddleOCR
-
+    
    # Enter the working directory
    cd PaddleOCR/deploy/pdserving/
    ```

@@ -187,6 +190,45 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
 2021-05-13 03:42:36,979 chl2(In: ['rec'], Out: ['@DAGExecutor']) size[0/0]
 ```
+
+## Paddle Serving C++ Deployment
+
+Python-based service deployment is convenient for secondary development, but production applications often demand better performance. PaddleServing therefore also provides a C++ deployment with higher performance.
+
+Environment setup and data preparation for the C++ service are the same as for the Python service; the differences lie in how the service is started and how the client sends requests.
+
+1. Prepare the Serving environment
+
+To improve prediction performance, the C++ service also supports chaining multiple models into a single service. Unlike the Python pipeline service, chaining models requires the models' pre- and post-processing code to be written on the server side, so Serving has to be recompiled locally to generate the serving binary. For details, see the official document: [How to compile Serving](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Compile_CN.md)
+
+After compiling, make sure to install the three whl packages produced by the build and to set the SERVING_BIN environment variable.
+
+2. To start the service, run the following command:
+
+To chain two models in one service, pass the relative paths of the model folders, in order, after --model, and the corresponding custom C++ OP class names, in order, after --op:
+
+   ```
+   # Start the service and save the running log in log.txt
+   python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralRecOp --port 9293 &>log.txt &
+   ```
+   After the service starts successfully, a log similar to the following is printed in log.txt:
+   ![](./imgs/start_server.png)
+
+3. Send a service request:
+   ```
+   python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
+   ```
+
+   After it runs successfully, the prediction results of the model are printed in the cmd window. An example of the result is:
+   ![](./imgs/results.png)
+
+   Entering the server's ip:port in a browser shows the real-time QPS of the current service (the port number must be in the range 8000-9000).
+
+   Tested on 200 real images with the long side of detection limited to 960, the peak QPS on a T4 GPU reaches about 51, roughly 2.12 times that of the pipeline deployment.
+
+   ![](./imgs/c++_qps.png)
+
 ## Windows users

diff --git a/deploy/pdserving/imgs/c++_qps.png b/deploy/pdserving/imgs/c++_qps.png
new file mode 100644
index 0000000000000000000000000000000000000000..dc406acd624ea3f5fd51a56ae7c6d299c8211b48
Binary files /dev/null and b/deploy/pdserving/imgs/c++_qps.png differ
diff --git a/deploy/pdserving/ocr_cpp_client.py b/deploy/pdserving/ocr_cpp_client.py
index 2baa7565ac78b9551c788c7b36457bce38828eb5..21c5537fdfdf80363d70d2f493c8fb22386c70ac 100755
--- a/deploy/pdserving/ocr_cpp_client.py
+++ b/deploy/pdserving/ocr_cpp_client.py
@@ -45,7 +45,6 @@ for img_file in os.listdir(test_img_dir):
         image_data = file.read()
     image = cv2_to_base64(image_data)
     res_list = []
-    #print(image)
     fetch_map = client.predict(
         feed={"x": image}, fetch=["save_infer_model/scale_0.tmp_1"], batch=True)
-    print("fetrch map:", fetch_map)
+    print("fetch map:", fetch_map)
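+    # fetch_map maps each fetched variable name to its prediction result;
+    # the "save_infer_model/scale_0.tmp_1" key above is the output variable
+    # name produced when the inference model was converted for Serving.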