diff --git a/deploy/paddleserving/readme.md b/deploy/paddleserving/readme.md index 15f5be36d233ff24673d2185e229e276eb49cd20..e3de61f5e58f3f7c6dade999e26d78a2511350a3 120000 --- a/deploy/paddleserving/readme.md +++ b/deploy/paddleserving/readme.md @@ -1,4 +1,5 @@ 简体中文 | [English](./readme_en.md) + # 分类模型服务化部署 ## 目录 diff --git a/deploy/paddleserving/readme_en.md b/deploy/paddleserving/readme_en.md new file mode 100644 index 0000000000000000000000000000000000000000..55643f7537c7de1a000787df0fa7784980c18abe --- /dev/null +++ b/deploy/paddleserving/readme_en.md @@ -0,0 +1,239 @@ +English | [简体中文](./readme.md) + +# Classification model service deployment + +## Table of contents + +- [1. Introduction](#1-introduction) +- [2. Serving installation](#2-serving-installation) +- [3. Image Classification Service Deployment](#3-image-classification-service-deployment) +- [3.1 Model conversion](#31-model-conversion) +- [3.2 Service deployment and request](#32-service-deployment-and-request) + - [3.2.1 Python Serving](#321-python-serving) + - [3.2.2 C++ Serving](#322-c-serving) + + +## 1. Introduction + +[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning developers easily deploy online prediction services. It supports one-click deployment of industrial-grade serving capabilities, high-concurrency and efficient communication between client and server, and client development in multiple programming languages. + +This section takes HTTP prediction service deployment as an example to show how to use PaddleServing to deploy the model services in PaddleClas. Currently, only the Linux platform is supported; Windows is not yet supported. + + +## 2. Serving installation + +The Serving official website recommends using Docker to install and deploy the Serving environment. First, pull the Docker image and create a Serving-based container. 
+ +```shell +# start GPU docker +docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel +nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash +nvidia-docker exec -it test bash + +# start CPU docker +docker pull paddlepaddle/serving:0.7.0-devel +docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash +docker exec -it test bash +``` + +After entering docker, you need to install Serving-related python packages. +```shell +python3.7 -m pip install paddle-serving-client==0.7.0 +python3.7 -m pip install paddle-serving-app==0.7.0 +python3.7 -m pip install faiss-cpu==1.7.1post2 + +#If it is a CPU deployment environment: +python3.7 -m pip install paddle-serving-server==0.7.0 #CPU +python3.7 -m pip install paddlepaddle==2.2.0 # CPU + +#If it is a GPU deployment environment +python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6 +python3.7 -m pip install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2 + +#Other GPU environments need to confirm the environment and then choose which one to execute +python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6 +python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8 +``` + +* If the installation speed is too slow, you can change the source through `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed up the installation process. +* For other environment configuration installation, please refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_EN.md) + + + +## 3. Image Classification Service Deployment + +The following takes the classic ResNet50_vd model as an example to introduce how to deploy the image classification service. + + +### 3.1 Model conversion + +When using PaddleServing for service deployment, you need to convert the saved inference model into a Serving model. 
+- Go to the working directory: + ```shell + cd deploy/paddleserving + ``` +- Download and unzip the inference model for ResNet50_vd: + ```shell + # Download ResNet50_vd inference model + wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar + # Decompress the ResNet50_vd inference model + tar xf ResNet50_vd_infer.tar + ``` +- Use the paddle_serving_client command to convert the downloaded inference model into a model format for easy server deployment: + ```shell + # Convert ResNet50_vd model + python3.7 -m paddle_serving_client.convert \ + --dirname ./ResNet50_vd_infer/ \ + --model_filename inference.pdmodel \ + --params_filename inference.pdiparams \ + --serving_server ./ResNet50_vd_serving/ \ + --serving_client ./ResNet50_vd_client/ + ``` + The specific meaning of the parameters in the above command is shown in the following table + | parameter | type | default value | description | + | --------- | ---- | ------------- | ----------- | + | `dirname` | str | - | The storage path of the model file to be converted. The program structure file and parameter file are saved in this directory. | + | `model_filename` | str | None | The name of the file storing the model Inference Program structure that needs to be converted. If set to None, use `__model__` as the default filename | + | `params_filename` | str | None | File name where all parameters of the model to be converted are stored. It needs to be specified if and only if all model parameters are stored in a single binary file. If the model parameters are stored in separate files, set it to None | + | `serving_server` | str | `"serving_server"` | The storage path of the converted model files and configuration files. Default is serving_server | + | `serving_client` | str | `"serving_client"` | The converted client configuration file storage path. 
Default is serving_client | + + After the ResNet50_vd inference model conversion is completed, there will be additional `ResNet50_vd_serving` and `ResNet50_vd_client` folders in the current folder, with the following structure: + ```shell + ├── ResNet50_vd_serving/ + │ ├── inference.pdiparams + │ ├── inference.pdmodel + │ ├── serving_server_conf.prototxt + │ └── serving_server_conf.stream.prototxt + │ + └── ResNet50_vd_client/ + ├── serving_client_conf.prototxt + └── serving_client_conf.stream.prototxt + ``` + +- Serving supports renaming inputs and outputs so that different models can be deployed in a compatible way. When deploying a different model, you only need to modify the `alias_name` in the configuration file; the inference deployment can then be completed without modifying the code. Therefore, after the conversion, modify `serving_server_conf.prototxt` under `ResNet50_vd_serving` and `serving_client_conf.prototxt` under `ResNet50_vd_client`, changing the `alias_name` in `fetch_var` to `prediction`. The modified `serving_server_conf.prototxt` is as follows: + ```log + feed_var { + name: "inputs" + alias_name: "inputs" + is_lod_tensor: false + feed_type: 1 + shape: 3 + shape: 224 + shape: 224 + } + fetch_var { + name: "save_infer_model/scale_0.tmp_1" + alias_name: "prediction" + is_lod_tensor: false + fetch_type: 1 + shape: 1000 + } + ``` + +### 3.2 Service deployment and request + +The paddleserving directory contains the code for starting the pipeline service and the C++ Serving service and for sending prediction requests, mainly including: +```shell +__init__.py +classification_web_service.py # Script to start the pipeline server +config.yml # Configuration file to start the pipeline service +pipeline_http_client.py # Script for sending pipeline prediction requests in http mode +pipeline_rpc_client.py # Script for sending pipeline prediction requests in rpc mode +readme.md # Classification model service 
deployment document +run_cpp_serving.sh # Script to start the C++ Serving deployment +test_cpp_serving_client.py # Script for sending C++ serving prediction requests in rpc mode +``` + +#### 3.2.1 Python Serving + +- Start the service: + ```shell + # Start the service and save the running log in log.txt + python3.7 classification_web_service.py &>log.txt & + ``` + +- Send a request: + ```shell + # Send a service request + python3.7 pipeline_http_client.py + ``` + After a successful run, the model prediction results are printed in the terminal window as follows: + ```log + {'err_no': 0, 'err_msg': '', 'key': ['label', 'prob'], 'value': ["['daisy']", '[0.9341402053833008]'], 'tensors': []} + ``` +- Turn off the service +If the service program is running in the foreground, you can press `Ctrl+C` to terminate it; if it is running in the background, you can use the kill command to stop the related processes, or you can execute the following command in the directory where the service program was started: + ```bash + python3.7 -m paddle_serving_server.serve stop + ``` + After the command finishes, the `Process stopped` message appears, indicating that the service was successfully shut down. + + +#### 3.2.2 C++ Serving + +Unlike Python Serving, the C++ Serving client calls C++ OPs for prediction, so before starting the service you need to compile and install the Serving server package and set `SERVING_BIN`. + +- Compile and install the Serving server package + ```shell + # Enter the working directory + cd PaddleClas/deploy/paddleserving + # One-click compile and install Serving server, set SERVING_BIN + bash ./build_server.sh python3.7 + ``` + **Note:** The paths set in [build_server.sh](./build_server.sh#L55-L62) may need to be modified to match the actual machine environment (CUDA version, Python version, etc.) before compiling. 
+ +- Modify the client file `ResNet50_vd_client/serving_client_conf.prototxt`: change the value after `feed_type:` to 20, change the value after the first `shape:` to 1, and delete the remaining `shape` fields. + ```log + feed_var { + name: "inputs" + alias_name: "inputs" + is_lod_tensor: false + feed_type: 20 + shape: 1 + } + ``` +- Modify part of the code of [`test_cpp_serving_client`](./test_cpp_serving_client.py) + 1. Modify the code at [`load_client_config`](./test_cpp_serving_client.py#L28), changing the path after `load_client_config` to `ResNet50_vd_client/serving_client_conf.prototxt`. + 2. Modify the code at [`feed={"inputs": image}`](./test_cpp_serving_client.py#L45), changing `inputs` to match the `name` in the `feed_var` field of `ResNet50_vd_client/serving_client_conf.prototxt`. Since `name` in some model client files is `x` instead of `inputs`, pay attention to this when deploying these models with C++ Serving. + +- Start the service: + ```shell + # Start the service, the service runs in the background, and the running log is saved in nohup.txt + # CPU deployment + sh run_cpp_serving.sh + # GPU deployment and specify card 0 + sh run_cpp_serving.sh 0 + ``` + +- Send a request: + ```shell + # Send a service request + python3.7 test_cpp_serving_client.py + ``` + After a successful run, the model prediction results are printed in the terminal window as follows: + ```log + prediction: daisy, probability: 0.9341399073600769 + ``` +- Turn off the service +If the service program is running in the foreground, you can press `Ctrl+C` to terminate it; if it is running in the background, you can use the kill command to stop the related processes, or you can execute the following command in the directory where the service program was started: + ```bash + python3.7 -m paddle_serving_server.serve stop + ``` + After the execution is completed, the 
`Process stopped` message appears, indicating that the service was successfully shut down. + +## 4. FAQ + +**Q1**: No result is returned after the request is sent, or an output decoding error is reported + +**A1**: Do not set a proxy when starting the service and sending the request. You can disable the proxy before starting the service and sending the request. The commands to disable the proxy are: +```shell +unset https_proxy +unset http_proxy +``` + +**Q2**: Nothing happens after starting the service + +**A2**: Check whether the path corresponding to `model_config` in `config.yml` exists and whether the folder name is correct + +For more service deployment types, such as `RPC prediction service`, you can refer to the [examples in the Serving GitHub repository](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples) diff --git a/deploy/paddleserving/recognition/readme.md b/deploy/paddleserving/recognition/readme.md index 652faf377d2e6a653fbd9881cb8ed349ac9c9d7c..40cd8b851a96258f25a75270881663be28c464ad 100644 --- a/deploy/paddleserving/recognition/readme.md +++ b/deploy/paddleserving/recognition/readme.md @@ -1,15 +1,17 @@ 简体中文 | [English](./readme_en.md) + # 识别模型服务化部署 ## 目录 -- [1. 简介](#1-简介) -- [2. Serving 安装](#2-serving-安装) -- [3. 图像识别服务部署](#3-图像识别服务部署) - - [3.1 模型转换](#31-模型转换) - - [3.2.1 Python Serving](#321-python-serving) - - [3.2.2 C++ Serving](#322-c-serving) -- [4. FAQ](#4-faq) + - [1. 简介](#1-简介) + - [2. Serving 安装](#2-serving-安装) + - [3. 图像识别服务部署](#3-图像识别服务部署) + - [3.1 模型转换](#31-模型转换) + - [3.2 服务部署和请求](#32-服务部署和请求) + - [3.2.1 Python Serving](#321-python-serving) + - [3.2.2 C++ Serving](#322-c-serving) + - [4. FAQ](#4-faq) ## 1. 
简介 @@ -64,6 +66,7 @@ python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUD 使用 PaddleServing 做图像识别服务化部署时,**需要将保存的多个 inference 模型都转换为 Serving 模型**。 下面以 PP-ShiTu 中的超轻量图像识别模型为例,介绍图像识别服务的部署。 + ### 3.1 模型转换 - 进入工作目录: @@ -95,16 +98,16 @@ python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUD 上述命令的参数含义与[#3.1 模型转换](#3.1)相同 通用识别 inference 模型转换完成后,会在当前文件夹多出 `general_PPLCNet_x2_5_lite_v1.0_serving/` 和 `general_PPLCNet_x2_5_lite_v1.0_client/` 的文件夹,具备如下结构: ```shell - ├── general_PPLCNet_x2_5_lite_v1.0_serving/ - │ ├── inference.pdiparams - │ ├── inference.pdmodel - │ ├── serving_server_conf.prototxt - │ └── serving_server_conf.stream.prototxt - │ - └── general_PPLCNet_x2_5_lite_v1.0_client/ - ├── serving_client_conf.prototxt - └── serving_client_conf.stream.prototxt - ``` + ├── general_PPLCNet_x2_5_lite_v1.0_serving/ + │ ├── inference.pdiparams + │ ├── inference.pdmodel + │ ├── serving_server_conf.prototxt + │ └── serving_server_conf.stream.prototxt + │ + └── general_PPLCNet_x2_5_lite_v1.0_client/ + ├── serving_client_conf.prototxt + └── serving_client_conf.stream.prototxt + ``` - 转换通用检测 inference 模型为 Serving 模型: ```shell # 转换通用检测模型 @@ -118,16 +121,16 @@ python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUD 通用检测 inference 模型转换完成后,会在当前文件夹多出 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` 和 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` 的文件夹,具备如下结构: ```shell - ├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ - │ ├── inference.pdiparams - │ ├── inference.pdmodel - │ ├── serving_server_conf.prototxt - │ └── serving_server_conf.stream.prototxt - │ - └── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/ - ├── serving_client_conf.prototxt - └── serving_client_conf.stream.prototxt - ``` + ├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ + │ ├── inference.pdiparams + │ ├── inference.pdmodel + │ ├── serving_server_conf.prototxt + │ └── serving_server_conf.stream.prototxt + │ + └── 
picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/ + ├── serving_client_conf.prototxt + └── serving_client_conf.stream.prototxt + ``` 上述命令中参数具体含义如下表所示 | 参数 | 类型 | 默认值 | 描述 | | ----------------- | ---- | ------------------ | ------------------------------------------------------------ | diff --git a/deploy/paddleserving/recognition/readme_en.md b/deploy/paddleserving/recognition/readme_en.md new file mode 100644 index 0000000000000000000000000000000000000000..e2719e64b4dc7f85959b64a6713d230e5dbffa93 --- /dev/null +++ b/deploy/paddleserving/recognition/readme_en.md @@ -0,0 +1,260 @@ +English | [简体中文](./readme.md) + +# Recognition model service deployment + +## Table of contents + +- [1. Introduction](#1-introduction) +- [2. Serving installation](#2-serving-installation) +- [3. Image recognition service deployment](#3-image-recognition-service-deployment) +- [3.1 Model conversion](#31-model-conversion) +- [3.2 Service deployment and request](#32-service-deployment-and-request) + - [3.2.1 Python Serving](#321-python-serving) + - [3.2.2 C++ Serving](#322-c-serving) +- [4. FAQ](#4-faq) + + +## 1. Introduction + +[Paddle Serving](https://github.com/PaddlePaddle/Serving) aims to help deep learning developers easily deploy online prediction services. It supports one-click deployment of industrial-grade serving capabilities, high-concurrency and efficient communication between client and server, and client development in multiple programming languages. + +This section takes HTTP prediction service deployment as an example to show how to use PaddleServing to deploy the model services in PaddleClas. Currently, only the Linux platform is supported; Windows is not yet supported. + + +## 2. Serving installation + +The Serving official website recommends using Docker to install and deploy the Serving environment. First, pull the Docker image and create a Serving-based container. 
+ +```shell +# start GPU docker +docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel +nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash +nvidia-docker exec -it test bash + +# start CPU docker +docker pull paddlepaddle/serving:0.7.0-devel +docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash +docker exec -it test bash +``` + +After entering docker, you need to install Serving-related python packages. +```shell +python3.7 -m pip install paddle-serving-client==0.7.0 +python3.7 -m pip install paddle-serving-app==0.7.0 +python3.7 -m pip install faiss-cpu==1.7.1post2 + +#If it is a CPU deployment environment: +python3.7 -m pip install paddle-serving-server==0.7.0 #CPU +python3.7 -m pip install paddlepaddle==2.2.0 # CPU + +#If it is a GPU deployment environment +python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6 +python3.7 -m pip install paddlepaddle-gpu==2.2.0 # GPU with CUDA10.2 + +#Other GPU environments need to confirm the environment and then choose which one to execute +python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6 +python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8 +``` + +* If the installation speed is too slow, you can change the source through `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed up the installation process. +* For other environment configuration installation, please refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md) + + + + +## 3. Image recognition service deployment + +When using PaddleServing for image recognition service deployment, **you need to convert each of the saved inference models into a Serving model**. 
The following takes the ultra-lightweight image recognition model in PP-ShiTu as an example to introduce the deployment of image recognition services. + +### 3.1 Model conversion + +- Go to the working directory: + ```shell + cd deploy/ + ``` +- Download generic detection inference model and generic recognition inference model + ```shell + # Create and enter the models folder + mkdir models + cd models + # Download and unzip the generic recognition model + wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar + tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar + # Download and unzip the generic detection model + wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar + tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar + ``` +- Convert the generic recognition inference model to the Serving model: + ```shell + # Convert the generic recognition model + python3.7 -m paddle_serving_client.convert \ + --dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \ + --model_filename inference.pdmodel \ + --params_filename inference.pdiparams \ + --serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \ + --serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/ + ``` + The meaning of the parameters of the above command is the same as [#3.1 Model conversion](#3.1) + After the conversion of the general recognition inference model is completed, there will be additional `general_PPLCNet_x2_5_lite_v1.0_serving/` and `general_PPLCNet_x2_5_lite_v1.0_client/` folders in the current folder, with the following structure: + ```shell + ├── general_PPLCNet_x2_5_lite_v1.0_serving/ + │ ├── inference.pdiparams + │ ├── inference.pdmodel + │ ├── serving_server_conf.prototxt + │ └── serving_server_conf.stream.prototxt + │ + └── general_PPLCNet_x2_5_lite_v1.0_client/ + ├── serving_client_conf.prototxt + └── serving_client_conf.stream.prototxt + 
``` +- Convert general detection inference model to Serving model: + ```shell + # Convert generic detection model + python3.7 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \ + --model_filename inference.pdmodel \ + --params_filename inference.pdiparams \ + --serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \ + --serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/ + ``` + The meaning of the parameters of the above command is the same as [#3.1 Model conversion](#3.1) + + After the conversion of the general detection inference model is completed, there will be additional folders `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` and `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` in the current folder, with the following structure: + ```shell + ├── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ + │ ├── inference.pdiparams + │ ├── inference.pdmodel + │ ├── serving_server_conf.prototxt + │ └── serving_server_conf.stream.prototxt + │ + └── picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/ + ├── serving_client_conf.prototxt + └── serving_client_conf.stream.prototxt + ``` + The specific meaning of the parameters in the above command is shown in the following table + | parameter | type | default value | description | + | --------- | ---- | ------------- | ----------- | + | `dirname` | str | - | The storage path of the model file to be converted. The program structure file and parameter file are saved in this directory. | + | `model_filename` | str | None | The name of the file storing the model Inference Program structure that needs to be converted. If set to None, use `__model__` as the default filename | + | `params_filename` | str | None | The name of the file that stores all parameters of the model that need to be transformed. It needs to be specified if and only if all model parameters are stored in a single binary file. 
If the model parameters are stored in separate files, set it to None | + | `serving_server` | str | `"serving_server"` | The storage path of the converted model files and configuration files. Default is serving_server | + | `serving_client` | str | `"serving_client"` | The converted client configuration file storage path. Default is serving_client | + +- Download and unzip the index of the retrieval library that has been built + ```shell + # Go back to the deploy directory + cd ../ + # Download the built retrieval library index + wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar + # Decompress the built retrieval library index + tar -xf drink_dataset_v1.0.tar + ``` + +### 3.2 Service deployment and request + +**Note:** The recognition service involves multiple models, so the Pipeline deployment method is used for performance reasons. Pipeline deployment does not currently support the Windows platform. +- Go to the working directory + ```shell + cd ./deploy/paddleserving/recognition + ``` + The paddleserving directory contains the code for starting the Python Pipeline service and the C++ Serving service and for sending prediction requests, including: + ```shell + __init__.py + config.yml # The configuration file to start the python pipeline service + pipeline_http_client.py # Script for sending pipeline prediction requests in http mode + pipeline_rpc_client.py # Script for sending pipeline prediction requests in rpc mode + recognition_web_service.py # Script to start the pipeline server + readme.md # Recognition model service deployment document + run_cpp_serving.sh # Script to start C++ Pipeline Serving deployment + test_cpp_serving_client.py # Script for sending C++ Pipeline serving prediction requests by rpc + ``` + + +#### 3.2.1 Python Serving + +- Start the service: + ```shell + # Start the service and save the running log in log.txt + python3.7 recognition_web_service.py &>log.txt & + ``` + +- Send a request: + ```shell + 
python3.7 pipeline_http_client.py + ``` + After a successful run, the model prediction results are printed in the terminal window as follows: + ```log + {'err_no': 0, 'err_msg': '', 'key': ['result'], 'value': ["[{'bbox': [345, 95, 524, 576], 'rec_docs': 'Red Bull-Enhanced', 'rec_scores': 0.79903316}]"], 'tensors': []} + ``` + + +#### 3.2.2 C++ Serving + +Unlike Python Serving, the C++ Serving client calls C++ OPs for prediction, so before starting the service you need to compile and install the Serving server package and set `SERVING_BIN`. +- Compile and install the Serving server package + ```shell + # Enter the working directory + cd PaddleClas/deploy/paddleserving + # One-click compile and install Serving server, set SERVING_BIN + bash ./build_server.sh python3.7 + ``` + **Note:** The paths set in [build_server.sh](../build_server.sh#L55-L62) may need to be modified to match the actual machine environment (CUDA version, Python version, etc.) before compiling. + +- The input and output format used by C++ Serving is different from that of Python, so you need to run the following commands to copy the 4 prepared prototxt files over the corresponding 4 prototxt files in the folders generated in [3.1](#31-model-conversion). 
+ ```shell + # Enter PaddleClas/deploy directory + cd PaddleClas/deploy/ + + # Overwrite prototxt file + \cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/*.prototxt ./models/general_PPLCNet_x2_5_lite_v1.0_serving/ + \cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/*.prototxt ./models/general_PPLCNet_x2_5_lite_v1.0_client/ + \cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/*.prototxt ./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/ + \cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/*.prototxt ./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ + ``` + +- Start the service: + ```shell + # Enter the working directory + cd PaddleClas/deploy/paddleserving/recognition + + # The default port number is 9400; the running log is saved in log_PPShiTu.txt by default + # CPU deployment + sh run_cpp_serving.sh + # GPU deployment, and specify card 0 + sh run_cpp_serving.sh 0 + ``` + +- send request: + ```shell + # send service request + python3.7 test_cpp_serving_client.py + ``` + After a successful run, the results of the model predictions are printed in the client's terminal window as follows: + ```log + WARNING: Logging before InitGoogleLogging() is written to STDERR + I0614 03:01:36.273097 6084 naming_service_thread.cpp:202] brpc::policy::ListNamingService("127.0.0.1:9400"): added 1 + I0614 03:01:37.393564 6084 general_model.cpp:490] [client]logid=0,client_cost=1107.82ms,server_cost=1101.75ms. 
+ [{'bbox': [345, 95, 524, 585], 'rec_docs': 'Red Bull-Enhanced', 'rec_scores': 0.8073724}] + ``` + +- Turn off the service +If the service program is running in the foreground, you can press `Ctrl+C` to terminate it; if it is running in the background, you can use the kill command to stop the related processes, or you can execute the following command in the directory where the service program was started: + ```bash + python3.7 -m paddle_serving_server.serve stop + ``` + After the command finishes, the `Process stopped` message appears, indicating that the service was successfully shut down. + + +## 4. FAQ + +**Q1**: No result is returned after the request is sent, or an output decoding error is reported + +**A1**: Do not set a proxy when starting the service and sending the request. You can disable the proxy before starting the service and sending the request. The commands to disable the proxy are: +```shell +unset https_proxy +unset http_proxy +``` +**Q2**: Nothing happens after starting the service + +**A2**: Check whether the path corresponding to `model_config` in `config.yml` exists and whether the folder name is correct + +For more service deployment types, such as `RPC prediction service`, you can refer to the [examples in the Serving GitHub repository](https://github.com/PaddlePaddle/Serving/tree/v0.9.0/examples)