update serving deployment to v0.7

f3627b66 · stephon · d618efe7
9 changed files
--- a/deploy/paddleserving/README.md
+++ b/deploy/paddleserving/README.md
-# PaddleClas Pipeline WebService
-
-(English|[简体中文](./README_CN.md))
-
-PaddleClas provides two service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../../deploy/hubserving/readme_en.md)
- Based on **PaddleServing**: Code path is "`./deploy/paddleserving`". If you prefer a retrieval-based image recognition service, please refer to the [tutorial](./recognition/README.md); if you'd like an image classification service, please follow this tutorial.
-
-# Image Classification Service deployment based on PaddleServing  
-
-This document will introduce how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the ResNet50_vd model as a pipeline online service.
-
-Some Key Features of Paddle Serving:
- Integrate with Paddle training pipeline seamlessly, most paddle models can be deployed with one line command.
- Industrial serving features supported, such as models management, online loading, online A/B testing etc.
- Highly concurrent and efficient communication between clients and servers supported.
-
-For an introduction to the Paddle Serving deployment framework and a tutorial, refer to the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
-
-
-## Contents
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
- [FAQ](#faq)
-
-<a name="environmental-preparation"></a>
-## Environmental preparation
-
-Both the PaddleClas operating environment and the PaddleServing operating environment are needed.
-
-1. Prepare the PaddleClas operating environment by following this [link](../../docs/zh_CN/tutorials/install.md).
-   Download the paddle whl package that matches your environment; version 2.1.0 is recommended.
-
-2. Prepare the PaddleServing operating environment as follows:
-
-    Install serving, which is used to start the service
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # Other GPU environments need to confirm the environment and then choose to execute the following commands
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-
-3. Install the client to send requests to the service
-    In [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md) find the client installation package corresponding to the python version.
-    The python3.7 version is recommended here:
-
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-
-4. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-
-   **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-
-
-<a name="model-conversion"></a>
-## Model conversion
-When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
-
-Firstly, download the inference model of ResNet50_vd
-```
-# Download and unzip the ResNet50_vd model
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar  && tar xf ResNet50_vd_infer.tar
-```
-
-Then, you can use the installed paddle_serving_client tool to convert the inference model to a serving model.
-```
-#  ResNet50_vd model conversion
-python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ResNet50_vd_serving/ \
-                                         --serving_client ./ResNet50_vd_client/
-```
-
-After the ResNet50_vd inference model is converted, there will be additional folders of `ResNet50_vd_serving` and `ResNet50_vd_client` in the current folder, with the following format:
-```
-|- ResNet50_vd_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-
-|- ResNet50_vd_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-```
-
-Once you have the model files for deployment, change the alias names in `serving_server_conf.prototxt`: change `alias_name` in `feed_var` to `image`, and change `alias_name` in `fetch_var` to `prediction`.
-The modified serving_server_conf.prototxt file is as follows:
-```
-feed_var {
-  name: "inputs"
-  alias_name: "image"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "prediction"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-
-<a name="paddle-serving-pipeline-deployment"></a>
-## Paddle Serving pipeline deployment
-
-1. Download the PaddleClas code, if you have already downloaded it, you can skip this step.
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-
-    # Enter the working directory  
-    cd PaddleClas/deploy/paddleserving/
-    ```
-
-    The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml                # configuration file of starting the service
-    pipeline_http_client.py   # script to send pipeline prediction request by http
-    pipeline_rpc_client.py    # script to send pipeline prediction request by rpc
-    classification_web_service.py   # script to start the pipeline server
-    ```
-
-2. Run the following command to start the service.
-    ```
-    # Start the service and save the running log in log.txt
-    python3 classification_web_service.py &>log.txt &
-    ```
-    After the service is successfully started, a log similar to the following will be printed in log.txt
-    ![](./imgs/start_server.png)
-
-3. Send service request
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is:
-    ![](./imgs/results.png)
-
-    Adjust the `concurrency` value in config.yml to get the highest QPS.
-
-    ```
-    op:
-        concurrency: 8
-        ...
-    ```
-
-    Multiple service requests can be sent at the same time if necessary.
-
-    The predicted performance data will be automatically written into the `PipelineServingLogs/pipeline.tracer` file.
-
-<a name="faq"></a>
-## FAQ
-**Q1**: No result return after sending the request.
-
-**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and before sending the request. The command to close the proxy is:
-```
-unset https_proxy
-unset http_proxy
-```  
--- a/deploy/paddleserving/README_CN.md
+++ b/deploy/paddleserving/README_CN.md
-# PaddleClas 服务化部署
-
-([English](./README.md)|简体中文)
-
-PaddleClas提供2种服务部署方式：
- 基于PaddleHub Serving的部署：代码路径为"`./deploy/hubserving`"，使用方法参考[文档](../../deploy/hubserving/readme.md)；
- 基于PaddleServing的部署：代码路径为"`./deploy/paddleserving`"， 基于检索方式的图像识别服务参考[文档](./recognition/README_CN.md)， 图像分类服务按照本教程使用。
-
-# 基于PaddleServing的图像分类服务部署
-
-本文档以经典的ResNet50_vd模型为例，介绍如何使用[PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)工具部署PaddleClas
-动态图模型的pipeline在线服务。
-
-相比较于hubserving部署，PaddleServing具备以下优点：
- 支持客户端和服务端之间高并发和高效通信
- 支持 工业级的服务能力 例如模型管理，在线加载，在线A/B测试等
- 支持 多种编程语言 开发客户端，例如C++, Python和Java
-
-更多有关PaddleServing服务化部署框架介绍和使用教程参考[文档](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)。
-
-## 目录
- [环境准备](#环境准备)
- [模型转换](#模型转换)
- [Paddle Serving pipeline部署](#部署)
- [FAQ](#FAQ)
-
-<a name="环境准备"></a>
-## 环境准备
-
-需要准备PaddleClas的运行环境和PaddleServing的运行环境。
-
- 准备PaddleClas的[运行环境](../../docs/zh_CN/tutorials/install.md), 根据环境下载对应的paddle whl包，推荐安装2.1.0版本
-
- 准备PaddleServing的运行环境，步骤如下
-
-1. 安装serving，用于启动服务
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # 其他GPU环境需要确认环境再选择执行如下命令
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-
-2. 安装client，用于向服务发送请求
-    在[下载链接](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)中找到对应python版本的client安装包，这里推荐python3.7版本：
-
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-
-3. 安装serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-    **Note:** 如果要安装最新版本的PaddleServing参考[链接](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)。
-
-<a name="模型转换"></a>
-## 模型转换
-
-使用PaddleServing做服务化部署时，需要将保存的inference模型转换为serving易于部署的模型。
-
-首先，下载ResNet50_vd的inference模型
-```
-# 下载并解压ResNet50_vd模型
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
-```
-
-接下来，用安装的paddle_serving_client把下载的inference模型转换成易于server部署的模型格式。
-
-```
-# 转换ResNet50_vd模型
-python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ResNet50_vd_serving/ \
-                                         --serving_client ./ResNet50_vd_client/
-```
-ResNet50_vd推理模型转换完成后，会在当前文件夹多出`ResNet50_vd_serving` 和`ResNet50_vd_client`的文件夹，具备如下格式：
-```
-|- ResNet50_vd_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-
-|- ResNet50_vd_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-
-```
-得到模型文件之后，需要修改serving_server_conf.prototxt中的alias名字： 将`feed_var`中的`alias_name`改为`image`, 将`fetch_var`中的`alias_name`改为`prediction`, 
-修改后的serving_server_conf.prototxt内容如下：
-```
-feed_var {
-  name: "inputs"
-  alias_name: "image"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "prediction"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-
-<a name="部署"></a>
-## Paddle Serving pipeline部署
-
-1. 下载PaddleClas代码，若已下载可跳过此步骤
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-
-    # 进入到工作目录
-    cd PaddleClas/deploy/paddleserving/
-    ```
-    paddleserving目录包含启动pipeline服务和发送预测请求的代码，包括：
-    ```
-    __init__.py
-    config.yml                 # 启动服务的配置文件
-    pipeline_http_client.py    # http方式发送pipeline预测请求的脚本
-    pipeline_rpc_client.py     # rpc方式发送pipeline预测请求的脚本
-    classification_web_service.py    # 启动pipeline服务端的脚本
-    ```
-
-2. 启动服务可运行如下命令：
-    ```
-    # 启动服务，运行日志保存在log.txt
-    python3 classification_web_service.py &>log.txt &
-    ```
-    成功启动服务后，log.txt中会打印类似如下日志
-    ![](./imgs/start_server.png)
-
-3. 发送服务请求：
-    ```
-    python3 pipeline_http_client.py
-    ```
-    成功运行后，模型预测的结果会打印在cmd窗口中，结果示例为：
-    ![](./imgs/results.png)
-
-    调整 config.yml 中的并发个数可以获得最大的QPS
-    ```
-    op:
-        #并发数，is_thread_op=True时，为线程并发；否则为进程并发
-        concurrency: 8
-        ...
-    ```
-    有需要的话可以同时发送多个服务请求
-
-    预测性能数据会被自动写入 `PipelineServingLogs/pipeline.tracer` 文件中。
-
-<a name="FAQ"></a>
-## FAQ
-**Q1**： 发送请求后没有结果返回或者提示输出解码报错
-
-**A1**： 启动服务和发送请求时不要设置代理，可以在启动服务前和发送请求前关闭代理，关闭代理的命令是：
-```
-unset https_proxy
-unset http_proxy
-```
--- a/deploy/paddleserving/classification_web_service.py
+++ b/deploy/paddleserving/classification_web_service.py
@@ -21,6 +21,7 @@ import logging
 import numpy as np
 import base64, cv2

+
 class ImagenetOp(Op):
    def init_op(self):
        self.seq = Sequential([
@@ -46,7 +47,7 @@ class ImagenetOp(Op):
            img = self.seq(im)
            imgs.append(img[np.newaxis, :].copy())
        input_imgs = np.concatenate(imgs, axis=0)
-        return {"image": input_imgs}, False, None, ""
+        return {"inputs": input_imgs}, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, log_id):
        score_list = fetch_dict["prediction"]
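
The rename from `"image"` to `"inputs"` ties the feed key returned by `preprocess` to the `alias_name` under `feed_var` in `serving_server_conf.prototxt`, which this commit leaves at `inputs`. A minimal sketch of how that match could be checked before starting the service, assuming the flat prototxt layout shown in this commit. The `aliases` helper is hypothetical (not part of Paddle Serving) and uses a regex rather than a protobuf parser.

```python
import re

# Sample config text mirroring the serving_server_conf.prototxt shown in this commit.
SAMPLE_CONF = '''
feed_var {
  name: "inputs"
  alias_name: "inputs"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "prediction"
  is_lod_tensor: true
  fetch_type: 1
  shape: -1
}
'''


def aliases(conf_text, block):
    """Return the alias_name values found inside every `block { ... }` section."""
    found = []
    # Blocks in this file are flat (no nested braces), so a non-greedy match suffices.
    for body in re.findall(r"%s\s*\{(.*?)\}" % re.escape(block), conf_text, re.S):
        found.extend(re.findall(r'alias_name:\s*"([^"]*)"', body))
    return found


feed_aliases = aliases(SAMPLE_CONF, "feed_var")
fetch_aliases = aliases(SAMPLE_CONF, "fetch_var")

# The key returned by preprocess() and the key read in postprocess()
# should each appear among the aliases declared in the config.
print("feed key ok:", "inputs" in feed_aliases)
print("fetch key ok:", "prediction" in fetch_aliases)
```

With this check in place, a stale `"image"` feed key would show up as a mismatch before any request is sent.
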

--- a/deploy/paddleserving/imgs/results_shitu.png
+++ b/deploy/paddleserving/imgs/results_shitu.png
--- a/deploy/paddleserving/imgs/start_server_shitu.png
+++ b/deploy/paddleserving/imgs/start_server_shitu.png
--- a/deploy/paddleserving/pipeline_http_client.py
+++ b/deploy/paddleserving/pipeline_http_client.py
@@ -3,15 +3,17 @@ import json
 import base64
 import os

+
 def cv2_to_base64(image):
    return base64.b64encode(image).decode('utf8')

+
 if __name__ == "__main__":
    url = "http://127.0.0.1:18080/imagenet/prediction"
    with open(os.path.join(".", "daisy.jpg"), 'rb') as file:
        image_data1 = file.read()
    image = cv2_to_base64(image_data1)
    data = {"key": ["image"], "value": [image]}
-    for i in range(100):
+    for i in range(1):
        r = requests.post(url=url, data=json.dumps(data))
        print(r.json())
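
The client above posts a JSON body with parallel `key` and `value` lists, where the value is the base64-encoded image. A minimal offline sketch of that payload and its round trip, assuming the same body shape as `pipeline_http_client.py`. The `build_payload` helper and the stand-in bytes are hypothetical, and no request is actually sent.

```python
import base64
import json


def build_payload(image_bytes, key="image"):
    """Build the JSON body that pipeline_http_client.py posts to the service."""
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"key": [key], "value": [encoded]})


# Stand-in bytes; the real client reads daisy.jpg from disk.
fake_image = b"\xff\xd8\xff\xe0 not a real JPEG"
payload = build_payload(fake_image)

# Round trip: what a server-side preprocess step would have to undo.
decoded = json.loads(payload)
restored = base64.b64decode(decoded["value"][0])
print(decoded["key"], len(restored))
```

The `"image"` key here is the request field name, which appears to be separate from the model feed name `"inputs"` changed in `classification_web_service.py`.
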
--- a/deploy/paddleserving/recognition/README.md
+++ b/deploy/paddleserving/recognition/README.md
-# Product Recognition Service deployment based on PaddleServing  
-
-(English|[简体中文](./README_CN.md))
-
-This document will introduce how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the product recognition model based on retrieval method as a pipeline online service.
-
-Some Key Features of Paddle Serving:
- Integrate with Paddle training pipeline seamlessly, most paddle models can be deployed with one line command.
- Industrial serving features supported, such as models management, online loading, online A/B testing etc.
- Highly concurrent and efficient communication between clients and servers supported.
-
-The introduction and tutorial of Paddle Serving service deployment framework reference [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
-
-## Contents
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
- [FAQ](#faq)
-
-<a name="environmental-preparation"></a>
-## Environmental preparation
-
-PaddleClas operating environment and PaddleServing operating environment are needed.
-
-1. Please prepare PaddleClas operating environment reference [link](../../docs/zh_CN/tutorials/install.md).
-   Download the corresponding paddle whl package according to the environment, it is recommended to install version 2.1.0.
-
-2. The steps of PaddleServing operating environment prepare are as follows:
-
-    Install serving which used to start the service
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # Other GPU environments need to confirm the environment and then choose to execute the following commands
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-
-3. Install the client to send requests to the service
-    In [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md) find the client installation package corresponding to the python version.
-    The python3.7 version is recommended here:
-
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-
-4. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-
-   **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-
-
-<a name="model-conversion"></a>
-## Model conversion
-When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
-The following assumes that the current working directory is the PaddleClas root directory
-
-Firstly, download the inference model of ResNet50_vd
-```
-cd deploy
-# Download and unzip the ResNet50_vd model
-wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar
-cd models
-tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
-```
-
-Then, you can use the installed paddle_serving_client tool to convert the inference model to a serving model.
-```
-#  Product recognition model conversion
-python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \
-                                         --serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/
-```
-
-After the ResNet50_vd inference model is converted, there will be additional folders of `product_ResNet50_vd_aliproduct_v1.0_serving` and `product_ResNet50_vd_aliproduct_v1.0_client` in the current folder, with the following format:
-```
-|- product_ResNet50_vd_aliproduct_v1.0_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-
-|- product_ResNet50_vd_aliproduct_v1.0_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-```
-
-Once you have the model file for deployment, you need to change the alias name in `serving_server_conf.prototxt`:  change `alias_name` in `fetch_var` to `features`,
-The modified serving_server_conf.prototxt file is as follows:
-```
-feed_var {
-  name: "x"
-  alias_name: "x"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "features"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-
-Next, download and unpack the prebuilt index of the product gallery
-```
-cd ../
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
-```
-
-
-<a name="paddle-serving-pipeline-deployment"></a>
-## Paddle Serving pipeline deployment
-
-**Attention:** pipeline deployment mode does not support Windows platform
-
-1. Download the PaddleClas code, if you have already downloaded it, you can skip this step.
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-
-    # Enter the working directory  
-    cd PaddleClas/deploy/paddleserving/recognition
-    ```
-
-    The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml                # configuration file of starting the service
-    pipeline_http_client.py   # script to send pipeline prediction request by http
-    pipeline_rpc_client.py    # script to send pipeline prediction request by rpc
-    recognition_web_service.py   # script to start the pipeline server
-    ```
-
-2. Run the following command to start the service.
-    ```
-    # Start the service and save the running log in log.txt
-    python3 recognition_web_service.py &>log.txt &
-    ```
-    After the service is successfully started, a log similar to the following will be printed in log.txt
-    ![](../imgs/start_server_recog.png)
-
-3. Send service request
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is:
-    ![](../imgs/results_recog.png)  
-
-    Adjust the number of concurrency in config.yml to get the largest QPS. 
-
-    ```
-    op:
-        concurrency: 8
-        ...
-    ```
-
-    Multiple service requests can be sent at the same time if necessary.
-
-    The predicted performance data will be automatically written into the `PipelineServingLogs/pipeline.tracer` file.
-
-<a name="faq"></a>
-## FAQ
-**Q1**: No result return after sending the request.
-
-**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and before sending the request. The command to close the proxy is:
-```
-unset https_proxy
-unset http_proxy
-```  
--- a/deploy/paddleserving/recognition/README_CN.md
+++ b/deploy/paddleserving/recognition/README_CN.md
-# 基于PaddleServing的商品识别服务部署
-
-([English](./README.md)|简体中文)
-
-本文以商品识别为例，介绍如何使用[PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)工具部署PaddleClas动态图模型的pipeline在线服务。
-
-相比较于hubserving部署，PaddleServing具备以下优点：
- 支持客户端和服务端之间高并发和高效通信
- 支持 工业级的服务能力 例如模型管理，在线加载，在线A/B测试等
- 支持 多种编程语言 开发客户端，例如C++, Python和Java
-
-更多有关PaddleServing服务化部署框架介绍和使用教程参考[文档](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md)。
-
-## 目录
- [环境准备](#环境准备)
- [模型转换](#模型转换)
- [Paddle Serving pipeline部署](#部署)
- [FAQ](#FAQ)
-
-<a name="环境准备"></a>
-## 环境准备
-
-需要准备PaddleClas的运行环境和PaddleServing的运行环境。
-
- 准备PaddleClas的[运行环境](../../docs/zh_CN/tutorials/install.md), 根据环境下载对应的paddle whl包，推荐安装2.1.0版本
-
- 准备PaddleServing的运行环境，步骤如下
-
-1. 安装serving，用于启动服务
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # 其他GPU环境需要确认环境再选择执行如下命令
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-
-2. 安装client，用于向服务发送请求
-    在[下载链接](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)中找到对应python版本的client安装包，这里推荐python3.7版本：
-
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-
-3. 安装serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-    **Note:** 如果要安装最新版本的PaddleServing参考[链接](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)。
-
-<a name="模型转换"></a>
-## 模型转换
-
-使用PaddleServing做服务化部署时，需要将保存的inference模型转换为serving易于部署的模型。 
-以下内容假定当前工作目录为PaddleClas根目录。
-
-首先，下载商品识别的inference模型
-```
-cd deploy
-
-# 下载并解压商品识别模型
-wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar
-cd models
-tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
-```
-
-接下来，用安装的paddle_serving_client把下载的inference模型转换成易于server部署的模型格式。
-
-```
-# 转换商品识别模型
-python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \
-                                         --serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/
-```
-商品识别推理模型转换完成后，会在当前文件夹多出`product_ResNet50_vd_aliproduct_v1.0_serving` 和`product_ResNet50_vd_aliproduct_v1.0_client`的文件夹，具备如下格式：
-```
-|- product_ResNet50_vd_aliproduct_v1.0_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-
-|- product_ResNet50_vd_aliproduct_v1.0_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-
-```
-得到模型文件之后，需要修改serving_server_conf.prototxt中的alias名字： 将`fetch_var`中的`alias_name`改为`features`, 
-修改后的serving_server_conf.prototxt内容如下：
-```
-feed_var {
-  name: "x"
-  alias_name: "x"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "features"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-
-接下来，下载并解压已经构建后的商品库index
-```
-cd ../
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
-```
-
-
-<a name="部署"></a>
-## Paddle Serving pipeline部署
-
-**注意:**  pipeline部署方式不支持windows平台
-
-1. 下载PaddleClas代码，若已下载可跳过此步骤
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-
-    # 进入到工作目录
-    cd PaddleClas/deploy/paddleserving/recognition
-    ```
-    paddleserving目录包含启动pipeline服务和发送预测请求的代码，包括：
-    ```
-    __init__.py
-    config.yml                    # 启动服务的配置文件
-    pipeline_http_client.py       # http方式发送pipeline预测请求的脚本
-    pipeline_rpc_client.py        # rpc方式发送pipeline预测请求的脚本
-    recognition_web_service.py    # 启动pipeline服务端的脚本
-    ```
-
-2. 启动服务可运行如下命令：
-    ```
-    # 启动服务，运行日志保存在log.txt
-    python3 recognition_web_service.py &>log.txt &
-    ```
-    成功启动服务后，log.txt中会打印类似如下日志
-    ![](../imgs/start_server_recog.png)
-
-3. 发送服务请求：
-    ```
-    python3 pipeline_http_client.py
-    ```
-    成功运行后，模型预测的结果会打印在cmd窗口中，结果示例为：
-    ![](../imgs/results_recog.png)
-
-    调整 config.yml 中的并发个数可以获得最大的QPS
-    ```
-    op:
-        #并发数，is_thread_op=True时，为线程并发；否则为进程并发
-        concurrency: 8
-        ...
-    ```
-    有需要的话可以同时发送多个服务请求
-
-    预测性能数据会被自动写入 `PipelineServingLogs/pipeline.tracer` 文件中。
-
-<a name="FAQ"></a>
-## FAQ
-**Q1**： 发送请求后没有结果返回或者提示输出解码报错
-
-**A1**： 启动服务和发送请求时不要设置代理，可以在启动服务前和发送请求前关闭代理，关闭代理的命令是：
-```
-unset https_proxy
-unset http_proxy
-```
--- a/docs/zh_CN/inference_deployment/paddle_serving_deploy.md
+++ b/docs/zh_CN/inference_deployment/paddle_serving_deploy.md
 # 模型服务化部署
- [简介](#简介)
- [Serving安装](#Serving安装)
- [图像分类服务部署](#图像分类服务部署)
- [图像识别服务部署](#图像识别服务部署)
- [FAQ](#FAQ)
-
-<a name="简介"></a>
+- [1. 简介](#1)
+- [2. Serving 安装](#2)
+- [3. 图像分类服务部署](#3)
+    - [3.1 模型转换](#3.1)
+    - [3.2 服务部署和请求](#3.2)
+- [4. 图像识别服务部署](#4)
+  - [4.1 模型转换](#4.1)
+  - [4.2 服务部署和请求](#4.2)
+- [5. FAQ](#5)
+
+<a name="1"></a>
 ## 1. 简介
 [Paddle Serving](https://github.com/PaddlePaddle/Serving) 旨在帮助深度学习开发者轻松部署在线预测服务，支持一键部署工业级的服务能力、客户端和服务端之间高并发和高效通信、并支持多种编程语言开发客户端。

-该部分以 HTTP 预测服务部署为例，介绍怎样在 PaddleClas 中使用 PaddleServing 部署模型服务。
+该部分以 HTTP 预测服务部署为例，介绍怎样在 PaddleClas 中使用 PaddleServing 部署模型服务。目前只支持 Linux 平台部署，暂不支持 Windows 平台。

-<a name="Serving安装"></a>
-## 2. Serving安装
+<a name="2"></a>
+## 2. Serving 安装

 Serving 官网推荐使用 docker 安装并部署 Serving 环境。首先需要拉取 docker 环境并创建基于 Serving 的 docker。

 ```shell
-nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
-nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
+docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
+nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
 nvidia-docker exec -it test bash
 ```

 进入 docker 后，需要安装 Serving 相关的 python 包。

 ```shell
-pip install paddlepaddle-gpu
-pip install paddle-serving-client
-pip install paddle-serving-server-gpu
-pip install paddle-serving-app
+pip3 install paddle-serving-client==0.7.0
+pip3 install paddle-serving-server==0.7.0 # CPU
+pip3 install paddle-serving-app==0.7.0
+pip3 install paddle-serving-server-gpu==0.7.0.post102 # GPU with CUDA10.2 + TensorRT6
+# 其他GPU环境需要确认环境再选择执行哪一条
+pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
+pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
 ```

 * 如果安装速度太慢，可以通过 `-i https://pypi.tuna.tsinghua.edu.cn/simple` 更换源，加速安装过程。
+* 其他环境配置安装请参考: [使用Docker安装Paddle Serving](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)

 * 如果希望部署 CPU 服务，可以安装 serving-server 的 cpu 版本，安装命令如下。

 ```shell
 pip install paddle-serving-server
 ```
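
After running the install commands above, it can help to confirm that the pinned 0.7.0 packages are the ones actually installed. A minimal sketch using the standard-library `importlib.metadata` (Python 3.8+). The `check_versions` helper is hypothetical, and the package list is copied from the commands in this section (swap in `paddle-serving-server-gpu` for GPU installs).

```python
from importlib import metadata

# Package names and the 0.7.0 pin are taken from the install commands above.
EXPECTED = {
    "paddle-serving-client": "0.7.0",
    "paddle-serving-server": "0.7.0",
    "paddle-serving-app": "0.7.0",
}


def check_versions(expected, lookup=metadata.version):
    """Map each package to 'ok', 'missing', or the mismatching installed version."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # GPU wheels carry suffixes such as ".post102", so compare by prefix.
        report[name] = "ok" if installed.startswith(wanted) else installed
    return report


report = check_versions(EXPECTED)
for name, status in report.items():
    print(f"{name}: {status}")
```

The `lookup` parameter exists only so the check can be exercised without the packages installed.
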
-<a name="图像分类服务部署"></a>
+<a name="3"></a>
+
 ## 3. 图像分类服务部署
+<a name="3.1"></a>
 ### 3.1 模型转换
-使用PaddleServing做服务化部署时，需要将保存的inference模型转换为Serving模型。下面以经典的ResNet50_vd模型为例，介绍如何部署图像分类服务。
+使用 PaddleServing 做服务化部署时，需要将保存的 inference 模型转换为 Serving 模型。下面以经典的 ResNet50_vd 模型为例，介绍如何部署图像分类服务。
 - 进入工作目录：
 ```shell
 cd deploy/paddleserving
 ```
- 下载ResNet50_vd的inference模型：
+- 下载 ResNet50_vd 的 inference 模型：
 ```shell
-# 下载并解压ResNet50_vd模型
+# 下载并解压 ResNet50_vd 模型
 wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
 ```
- 用paddle_serving_client把下载的inference模型转换成易于Server部署的模型格式：
+- 用 paddle_serving_client 把下载的 inference 模型转换成易于 Server 部署的模型格式：
 ```
-# 转换ResNet50_vd模型
+# 转换 ResNet50_vd 模型
 python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
                                         --model_filename inference.pdmodel  \
                                         --params_filename inference.pdiparams \
                                         --serving_server ./ResNet50_vd_serving/ \
                                         --serving_client ./ResNet50_vd_client/
 ```
-ResNet50_vd推理模型转换完成后，会在当前文件夹多出`ResNet50_vd_serving` 和`ResNet50_vd_client`的文件夹，具备如下格式：
+ResNet50_vd 推理模型转换完成后，会在当前文件夹多出 `ResNet50_vd_serving` 和 `ResNet50_vd_client` 的文件夹，具备如下格式：
 ```
-|- ResNet50_vd_client/
+|- ResNet50_vd_serving/
  |- __model__  
  |- __params__
  |- serving_server_conf.prototxt  
@@ -71,14 +81,14 @@ ResNet50_vd推理模型转换完成后，会在当前文件夹多出`ResNet50_vd
  |- serving_client_conf.prototxt  
  |- serving_client_conf.stream.prototxt
 ```
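
The conversion step should leave a server folder and a client folder, each holding its prototxt configs. A minimal sketch of a check for that layout, assuming the folder names passed to `--serving_server` and `--serving_client` in the convert command. The `missing_files` helper is hypothetical, and only the prototxt files are checked, on the assumption that model and parameter file names may differ between Serving versions.

```python
import tempfile
from pathlib import Path

# Expected layout, based on the convert command arguments and the listing above.
EXPECTED_FILES = {
    "ResNet50_vd_serving": ["serving_server_conf.prototxt",
                            "serving_server_conf.stream.prototxt"],
    "ResNet50_vd_client": ["serving_client_conf.prototxt",
                           "serving_client_conf.stream.prototxt"],
}


def missing_files(root, expected=EXPECTED_FILES):
    """Return the relative paths from `expected` that do not exist under `root`."""
    root = Path(root)
    return [f"{folder}/{name}"
            for folder, names in expected.items()
            for name in names
            if not (root / folder / name).is_file()]


# Demo on a throwaway directory that imitates a partial conversion output.
with tempfile.TemporaryDirectory() as tmp:
    server_dir = Path(tmp) / "ResNet50_vd_serving"
    server_dir.mkdir()
    (server_dir / "serving_server_conf.prototxt").touch()
    (server_dir / "serving_server_conf.stream.prototxt").touch()
    demo_missing = missing_files(tmp)

print(demo_missing)
```

In a real checkout, `missing_files(".")` would be run from `deploy/paddleserving` after the conversion finishes.
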
-得到模型文件之后，需要修改serving_server_conf.prototxt中的alias名字： 将`feed_var`中的`alias_name`改为`image`, 将`fetch_var`中的`alias_name`改为`prediction`
+得到模型文件之后，需要修改 `ResNet50_vd_serving` 下文件 `serving_server_conf.prototxt` 中的 alias 名字：将 `fetch_var` 中的 `alias_name` 改为 `prediction`

-**备注**:  Serving为了兼容不同模型的部署，提供了输入输出重命名的功能。这样，不同的模型在推理部署时，只需要修改配置文件的alias_name即可，无需修改代码即可完成推理部署。
-修改后的serving_server_conf.prototxt如下所示:
+**备注**:  Serving 为了兼容不同模型的部署，提供了输入输出重命名的功能。这样，不同的模型在推理部署时，只需要修改配置文件的 alias_name 即可，无需修改代码即可完成推理部署。
+修改后的 serving_server_conf.prototxt 如下所示:
 ```
 feed_var {
  name: "inputs"
-  alias_name: "image"
+  alias_name: "inputs"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
@@ -93,8 +103,9 @@ fetch_var {
  shape: -1
 }
 ```
+<a name="3.2"></a>
 ### 3.2 服务部署和请求
-paddleserving目录包含了启动pipeline服务和发送预测请求的代码，包括：
+paddleserving 目录包含了启动 pipeline 服务和发送预测请求的代码，包括：
 ```shell
 __init__.py
 config.yml                 # 启动服务的配置文件
@@ -105,10 +116,10 @@ classification_web_service.py    # 启动pipeline服务端的脚本

 - 启动服务：
 ```shell
-# 启动服务，运行日志保存在log.txt
+# 启动服务，运行日志保存在 log.txt
 python3 classification_web_service.py &>log.txt &
 ```
-成功启动服务后，log.txt中会打印类似如下日志
+成功启动服务后，log.txt 中会打印类似如下日志
 ![](../../../deploy/paddleserving/imgs/start_server.png)

 - 发送请求：
@@ -116,14 +127,15 @@ python3 classification_web_service.py &>log.txt &
 # 发送服务请求
 python3 pipeline_http_client.py
 ```
-成功运行后，模型预测的结果会打印在cmd窗口中，结果示例为：
+成功运行后，模型预测的结果会打印在 cmd 窗口中，结果示例为：
 ![](../../../deploy/paddleserving/imgs/results.png)

-<a name="图像识别服务部署"></a>
+<a name="4"></a>
 ## 4.图像识别服务部署
-使用PaddleServing做服务化部署时，需要将保存的inference模型转换为Serving模型。 下面以PP-ShiTu中的超轻量图像识别模型为例，介绍图像识别服务的部署。
+使用 PaddleServing 做服务化部署时，需要将保存的 inference 模型转换为 Serving 模型。 下面以 PP-ShiTu 中的超轻量图像识别模型为例，介绍图像识别服务的部署。
+<a name="4.1"></a>
 ## 4.1 模型转换
- 下载通用检测inference模型和通用识别inference模型
+- 下载通用检测 inference 模型和通用识别 inference 模型
 ```
 cd deploy
 # 下载并解压通用识别模型
@@ -134,7 +146,7 @@ tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar
 wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
 tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar
 ```
- 转换识别inference模型为Serving模型：
+- 转换识别 inference 模型为 Serving 模型：
 ```
 # 转换识别模型
 python3 -m paddle_serving_client.convert --dirname ./general_PPLCNet_x2_5_lite_v1.0_infer/ \
@@ -143,8 +155,8 @@ python3 -m paddle_serving_client.convert --dirname ./general_PPLCNet_x2_5_lite_v
                                         --serving_server ./general_PPLCNet_x2_5_lite_v1.0_serving/ \
                                         --serving_client ./general_PPLCNet_x2_5_lite_v1.0_client/
 ```
-识别推理模型转换完成后，会在当前文件夹多出`general_PPLCNet_x2_5_lite_v1.0_serving/` 和`general_PPLCNet_x2_5_lite_v1.0_serving/`的文件夹。修改`general_PPLCNet_x2_5_lite_v1.0_serving/`目录下的serving_server_conf.prototxt中的alias名字： 将`fetch_var`中的`alias_name`改为`features`。
-修改后的serving_server_conf.prototxt内容如下：
+识别推理模型转换完成后，会在当前文件夹多出 `general_PPLCNet_x2_5_lite_v1.0_serving/` 和 `general_PPLCNet_x2_5_lite_v1.0_client/` 的文件夹。修改 `general_PPLCNet_x2_5_lite_v1.0_serving/` 目录下的 serving_server_conf.prototxt 中的 alias 名字： 将 `fetch_var` 中的 `alias_name` 改为 `features`。
+修改后的 serving_server_conf.prototxt 内容如下：
 ```
 feed_var {
  name: "x"
@@ -163,7 +175,7 @@ fetch_var {
  shape: -1
 }
 ```
- 转换通用检测inference模型为Serving模型：
+- 转换通用检测 inference 模型为 Serving 模型：
 ```
 # 转换通用检测模型
 python3 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/ \
@@ -172,22 +184,23 @@ python3 -m paddle_serving_client.convert --dirname ./picodet_PPLCNet_x2_5_mainbo
                                         --serving_server ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/ \
                                         --serving_client ./picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/
 ```
-检测inference模型转换完成后，会在当前文件夹多出`picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` 和`picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/`的文件夹。
+检测 inference 模型转换完成后，会在当前文件夹多出 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` 和 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/` 的文件夹。

-**注意:** 此处不需要修改`picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/`目录下的serving_server_conf.prototxt中的alias名字。
+**注意:** 此处不需要修改 `picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/` 目录下的 serving_server_conf.prototxt 中的 alias 名字。

- 下载并解压已经构建后的检索库index
+- 下载并解压已经构建后的检索库 index
 ```
 cd ../
 wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
 ```
+<a name="4.2"></a>
 ## 4.2 服务部署和请求
-**注意:**  识别服务涉及到多个模型，出于性能考虑采用PipeLine部署方式。Pipeline部署方式当前不支持windows平台。
+**注意:** 识别服务涉及到多个模型，出于性能考虑采用 Pipeline 部署方式。Pipeline 部署方式当前不支持 Windows 平台。
 - 进入到工作目录
 ```shell
 cd ./deploy/paddleserving/recognition
 ```
-paddleserving目录包含启动pipeline服务和发送预测请求的代码，包括：
+paddleserving 目录包含启动 pipeline 服务和发送预测请求的代码，包括：
 ```
 __init__.py
 config.yml                    # 启动服务的配置文件
@@ -197,20 +210,20 @@ recognition_web_service.py    # 启动pipeline服务端的脚本
 ```
 - 启动服务：
 ```
-# 启动服务，运行日志保存在log.txt
+# 启动服务，运行日志保存在 log.txt
 python3 recognition_web_service.py &>log.txt &
 ```
-成功启动服务后，log.txt中会打印类似如下日志
+成功启动服务后，log.txt 中会打印类似如下日志
 ![](../../../deploy/paddleserving/imgs/start_server_shitu.png)

 - 发送请求：
 ```
 python3 pipeline_http_client.py
 ```
-成功运行后，模型预测的结果会打印在cmd窗口中，结果示例为：
+成功运行后，模型预测的结果会打印在 cmd 窗口中，结果示例为：
 ![](../../../deploy/paddleserving/imgs/results_shitu.png)

-<a name="FAQ"></a>
+<a name="5"></a>
 ## 5.FAQ
 **Q1**： 发送请求后没有结果返回或者提示输出解码报错

@@ -220,4 +233,4 @@ unset https_proxy
 unset http_proxy
 ```

-更多的服务部署类型，如 `RPC预测服务` 等，可以参考 Serving 的[github 官网](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet)
+更多的服务部署类型，如 `RPC 预测服务` 等，可以参考 Serving 的 [GitHub 官网](https://github.com/PaddlePaddle/Serving/tree/v0.7.0/examples)
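
The FAQ fix (`unset https_proxy` / `unset http_proxy`) can also be applied inside a Python client process before any request is sent. A minimal sketch, assuming the client runs in the same process. The `clear_proxy_env` helper is hypothetical, and the upper-case variable names are an added assumption beyond the two lower-case names in the FAQ.

```python
import os

# Lower-case names come from the FAQ; upper-case variants are an assumption.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def clear_proxy_env(environ=os.environ):
    """Remove proxy variables from `environ` and return the names that were set."""
    removed = []
    for name in PROXY_VARS:
        if name in environ:
            del environ[name]
            removed.append(name)
    return removed


# Demo on a plain dict so the real environment is left untouched.
demo_env = {"http_proxy": "http://proxy.example:8080", "PATH": "/usr/bin"}
removed = clear_proxy_env(demo_env)
print(removed, demo_env)
```

Calling `clear_proxy_env()` with no argument would act on the real process environment, which only affects the current process and its children.
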