update serving deployment

cd9b8cad · stephon · 0f3e3141 · 0f3e3141 · 0f3e3141 · cd9b8cad
7 changed files
--- a/deploy/paddleserving/README.md
+++ b/deploy/paddleserving/README.md
-# PaddleClas Pipeline WebService
-(English|[简体中文](./README_CN.md))
-PaddleClas provides two service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../../deploy/hubserving/readme_en.md)
- Based on **PaddleServing**: Code path is "`./deploy/paddleserving`". If you prefer a retrieval-based image recognition service, please refer to the [tutorial](./recognition/README.md); if you'd like an image classification service, please follow this tutorial.
-# Image Classification Service deployment based on PaddleServing  
-This document introduces how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the ResNet50_vd model as an online pipeline service.
-Some key features of Paddle Serving:
- Integrates seamlessly with the Paddle training pipeline; most Paddle models can be deployed with a one-line command.
- Supports industrial serving features such as model management, online loading, and online A/B testing.
- Supports highly concurrent and efficient communication between clients and servers.
-For an introduction to the Paddle Serving deployment framework and a tutorial, refer to this [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
-## Contents
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
- [FAQ](#faq)
-<a name="environmental-preparation"></a>
-## Environmental preparation
-Both the PaddleClas operating environment and the PaddleServing operating environment are required.
-1. Prepare the PaddleClas operating environment by following this [link](../../docs/zh_CN/tutorials/install.md).
-   Download the paddle whl package that matches your environment; version 2.1.0 is recommended.
-2. Prepare the PaddleServing operating environment as follows:
-    Install the serving server, which is used to start the service
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # Other GPU environments need to confirm the environment and then choose to execute the following commands
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-3. Install the client to send requests to the service
-    Find the client installation package that matches your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-    Python 3.7 is recommended here:
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-4. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-   **Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-<a name="model-conversion"></a>
-## Model conversion
-When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
-First, download the ResNet50_vd inference model:
-```
-# Download and unzip the ResNet50_vd model
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar  && tar xf ResNet50_vd_infer.tar
-```
-Then, use the installed paddle_serving_client tool to convert the inference model into a serving model.
-```
-#  ResNet50_vd model conversion
-python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ResNet50_vd_serving/ \
-                                         --serving_client ./ResNet50_vd_client/
-```
-After the ResNet50_vd inference model is converted, two additional folders, `ResNet50_vd_serving` and `ResNet50_vd_client`, appear in the current folder, with the following layout:
-```
-|- ResNet50_vd_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-|- ResNet50_vd_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-```
-Once you have the model files for deployment, change the alias names in `serving_server_conf.prototxt`: set `alias_name` in `feed_var` to `image` and `alias_name` in `fetch_var` to `prediction`.
-The modified serving_server_conf.prototxt file is as follows:
-```
-feed_var {
-  name: "inputs"
-  alias_name: "image"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "prediction"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-<a name="paddle-serving-pipeline-deployment"></a>
-## Paddle Serving pipeline deployment
-1. Download the PaddleClas code. If you have already downloaded it, you can skip this step.
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-    # Enter the working directory  
-    cd PaddleClas/deploy/paddleserving/
-    ```
-    The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml                      # configuration file for starting the service
-    pipeline_http_client.py         # script to send pipeline prediction requests over HTTP
-    pipeline_rpc_client.py          # script to send pipeline prediction requests over RPC
-    classification_web_service.py   # script to start the pipeline server
-    ```
-2. Run the following command to start the service.
-    ```
-    # Start the service and save the running log in log.txt
-    python3 classification_web_service.py &>log.txt &
-    ```
-    After the service starts successfully, a log similar to the following is printed in log.txt
-    ![](./imgs/start_server.png)
-3. Send a service request
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After it runs successfully, the model's prediction result is printed in the terminal window. An example of the result is:
-    ![](./imgs/results.png)
-    Adjust the concurrency value in config.yml to maximize QPS.
-    ```
-    op:
-        concurrency: 8
-        ...
-    ```
-    Multiple service requests can be sent at the same time if necessary.
-    Prediction performance data is automatically written to the `PipelineServingLogs/pipeline.tracer` file.
-<a name="faq"></a>
-## FAQ
-**Q1**: No result is returned after sending the request.
-**A1**: Do not set a proxy when starting the service or sending requests. You can disable the proxy before starting the service and before sending requests with the following commands:
-```
-unset https_proxy
-unset http_proxy
-```  
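The README text removed above asks for `alias_name` to be edited by hand in `serving_server_conf.prototxt`. As a rough illustration only, the same edit could be scripted; the helper name `set_alias` and the sample text below are hypothetical and not part of PaddleClas or Paddle Serving, and the regex assumes each block is a flat `name { ... }` section without nested braces.

```python
import re


def set_alias(prototxt_text, block, alias):
    """Return prototxt_text with alias_name in each `block { ... }` section set to alias."""
    # Match from the block name up to the opening quote of its alias_name value.
    pattern = re.compile(
        r'(%s\s*\{[^}]*?alias_name:\s*")[^"]*(")' % re.escape(block)
    )
    # A function replacement avoids backslash escapes in alias being interpreted.
    return pattern.sub(lambda m: m.group(1) + alias + m.group(2), prototxt_text)


# Hypothetical sample resembling a freshly converted config.
SAMPLE = '''feed_var {
  name: "inputs"
  alias_name: "inputs"
  is_lod_tensor: false
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "save_infer_model/scale_0.tmp_1"
  is_lod_tensor: true
}
'''

updated = set_alias(SAMPLE, "feed_var", "image")
updated = set_alias(updated, "fetch_var", "prediction")
print(updated)
```

Only the `alias_name` values change; the `name` fields, which refer to the variables in the converted model, are left untouched.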
--- a/deploy/paddleserving/README_CN.md
+++ b/deploy/paddleserving/README_CN.md
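-# PaddleClas Service Deployment
-([English](./README.md)|简体中文)
-PaddleClas provides two service deployment methods:
- Based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; for usage, refer to the [document](../../deploy/hubserving/readme.md);
- Based on PaddleServing: the code path is "`./deploy/paddleserving`". For the retrieval-based image recognition service, refer to the [document](./recognition/README_CN.md); for the image classification service, follow this tutorial.
-# Image Classification Service Deployment Based on PaddleServing
-This document uses the classic ResNet50_vd model as an example to introduce how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) to deploy a PaddleClas
-dynamic graph model as an online pipeline service.
-Compared with hubserving deployment, PaddleServing has the following advantages:
- Supports highly concurrent and efficient communication between clients and servers
- Supports industrial-grade serving capabilities such as model management, online loading, and online A/B testing
- Supports client development in multiple programming languages, such as C++, Python, and Java
-For more on the PaddleServing deployment framework and its tutorials, refer to the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md).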
-## Contents
- [Environment preparation](#环境准备)
- [Model conversion](#模型转换)
- [Paddle Serving pipeline deployment](#部署)
- [FAQ](#FAQ)
-<a name="环境准备"></a>
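-## Environment preparation
-Both the PaddleClas operating environment and the PaddleServing operating environment are required.
- Prepare the PaddleClas [operating environment](../../docs/zh_CN/tutorials/install.md). Download the paddle whl package that matches your environment; version 2.1.0 is recommended
- Prepare the PaddleServing operating environment as follows
-1. Install serving, which is used to start the service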
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # For other GPU environments, confirm your environment and then choose the matching command below
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
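-2. Install the client, which is used to send requests to the service
-    Find the client installation package that matches your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md); Python 3.7 is recommended here: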
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-3. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-    **Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-<a name="模型转换"></a>
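-## Model conversion
-When using PaddleServing for service deployment, the saved inference model must be converted into a model that serving can deploy easily.
-First, download the ResNet50_vd inference model
-```
-# Download and unzip the ResNet50_vd model
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
-```
-Next, use the installed paddle_serving_client to convert the downloaded inference model into a format that is easy for the server to deploy.
-```
-# Convert the ResNet50_vd model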
-python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ResNet50_vd_serving/ \
-                                         --serving_client ./ResNet50_vd_client/
-```
-After the ResNet50_vd inference model is converted, two additional folders, `ResNet50_vd_serving` and `ResNet50_vd_client`, appear in the current folder, with the following layout:
-```
-|- ResNet50_vd_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-|- ResNet50_vd_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-```
-Once you have the model files, change the alias names in serving_server_conf.prototxt: set `alias_name` in `feed_var` to `image` and `alias_name` in `fetch_var` to `prediction`.
-The modified serving_server_conf.prototxt is as follows:
-```
-feed_var {
-  name: "inputs"
-  alias_name: "image"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "prediction"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-<a name="部署"></a>
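-## Paddle Serving pipeline deployment
-1. Download the PaddleClas code; skip this step if you have already downloaded it
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-    # Enter the working directory
-    cd PaddleClas/deploy/paddleserving/
-    ```
-    The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml                 # configuration file for starting the service
-    pipeline_http_client.py    # script to send pipeline prediction requests over HTTP
-    pipeline_rpc_client.py     # script to send pipeline prediction requests over RPC
-    classification_web_service.py    # script to start the pipeline server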
-    ```
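-2. Run the following command to start the service:
-    ```
-    # Start the service and save the running log in log.txt
-    python3 classification_web_service.py &>log.txt &
-    ```
-    After the service starts successfully, a log similar to the following is printed in log.txt
-    ![](./imgs/start_server.png)
-3. Send a service request:
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After it runs successfully, the model's prediction result is printed in the terminal window. An example of the result is:
-    ![](./imgs/results.png)
-    Adjust the concurrency value in config.yml to maximize QPS
-    ```
-    op:
-        # Concurrency: thread-level when is_thread_op=True, otherwise process-level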
-        concurrency: 8
-        ...
-    ```
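-    Multiple service requests can be sent at the same time if necessary
-    Prediction performance data is automatically written to the `PipelineServingLogs/pipeline.tracer` file.
-<a name="FAQ"></a>
-## FAQ
-**Q1**: No result is returned after sending the request, or an output decoding error is reported
-**A1**: Do not set a proxy when starting the service or sending requests. You can disable the proxy before starting the service and before sending requests with the following commands: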
-```
-unset https_proxy
-unset http_proxy
-```
--- a/deploy/paddleserving/classification_web_service.py
+++ b/deploy/paddleserving/classification_web_service.py
@@ -21,6 +21,7 @@ import logging
 import numpy as np
 import base64, cv2
 class ImagenetOp(Op):
    def init_op(self):
        self.seq = Sequential([
@@ -36,6 +37,7 @@ class ImagenetOp(Op):
                label_idx += 1
    def preprocess(self, input_dicts, data_id, log_id):
+        print("111111")
        (_, input_dict), = input_dicts.items()
        batch_size = len(input_dict.keys())
        imgs = []
@@ -46,9 +48,11 @@ class ImagenetOp(Op):
            img = self.seq(im)
            imgs.append(img[np.newaxis, :].copy())
        input_imgs = np.concatenate(imgs, axis=0)
-        return {"image": input_imgs}, False, None, ""
+        print("2222222")
+        return {"inputs": input_imgs}, False, None, ""
-    def postprocess(self, input_dicts, fetch_dict, log_id):
+    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
+        print("3333333")
        score_list = fetch_dict["prediction"]
        result = {"label": [], "prob": []}
        for score in score_list:
@@ -59,6 +63,7 @@ class ImagenetOp(Op):
            result["prob"].append(max_score)
        result["label"] = str(result["label"])
        result["prob"] = str(result["prob"])
+        print("444444444")
        return result, None, ""
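The hunks above add `data_id` to the `postprocess` signature and mark each stage with bare `print` calls. As a minimal standalone sketch of the same shape, assuming plain Python lists instead of NumPy arrays, a hypothetical three-entry `LABELS` map, and `logging` in place of the numbered prints (none of this is the file's actual code, and the elided lines of the real method are not reproduced):

```python
import logging

logger = logging.getLogger("classification_sketch")

# Hypothetical label map; the real service builds its own in init_op.
LABELS = {0: "cat", 1: "dog", 2: "bird"}


def postprocess(input_dicts, fetch_dict, data_id, log_id):
    """Pick the top-1 label and score for each row of fetch_dict["prediction"]."""
    logger.debug("postprocess start: data_id=%s log_id=%s", data_id, log_id)
    result = {"label": [], "prob": []}
    for score in fetch_dict["prediction"]:
        scores = list(score)
        # Index of the highest score in this row.
        max_idx = max(range(len(scores)), key=scores.__getitem__)
        result["label"].append(LABELS[max_idx])
        result["prob"].append(scores[max_idx])
    # The diff stringifies both lists before returning them.
    result["label"] = str(result["label"])
    result["prob"] = str(result["prob"])
    logger.debug("postprocess done: data_id=%s", data_id)
    return result, None, ""


out, err, msg = postprocess({}, {"prediction": [[0.1, 0.7, 0.2]]}, data_id=0, log_id=0)
print(out)
```

Debug-level log records stay silent unless the logger is configured to show them, so stage markers like these do not clutter `log.txt` in normal runs.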

--- a/deploy/paddleserving/recognition/README.md
+++ b/deploy/paddleserving/recognition/README.md
-# Product Recognition Service deployment based on PaddleServing  
-(English|[简体中文](./README_CN.md))
-This document introduces how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the retrieval-based product recognition model as an online pipeline service.
-Some key features of Paddle Serving:
- Integrates seamlessly with the Paddle training pipeline; most Paddle models can be deployed with a one-line command.
- Supports industrial serving features such as model management, online loading, and online A/B testing.
- Supports highly concurrent and efficient communication between clients and servers.
-For an introduction to the Paddle Serving deployment framework and a tutorial, refer to this [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
-## Contents
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
- [FAQ](#faq)
-<a name="environmental-preparation"></a>
-## Environmental preparation
-Both the PaddleClas operating environment and the PaddleServing operating environment are required.
-1. Prepare the PaddleClas operating environment by following this [link](../../docs/zh_CN/tutorials/install.md).
-   Download the paddle whl package that matches your environment; version 2.1.0 is recommended.
-2. Prepare the PaddleServing operating environment as follows:
-    Install the serving server, which is used to start the service
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # Other GPU environments need to confirm the environment and then choose to execute the following commands
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
-3. Install the client to send requests to the service
-    Find the client installation package that matches your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-    Python 3.7 is recommended here:
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-4. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-   **Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-<a name="model-conversion"></a>
-## Model conversion
-When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
-The following assumes that the current working directory is the PaddleClas root directory.
-First, download the product recognition inference model:
-```
-cd deploy
-# Download and unzip the product recognition model
-wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar
-cd models
-tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
-```
-Then, use the installed paddle_serving_client tool to convert the inference model into a serving model.
-```
-#  Product recognition model conversion
-python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \
-                                         --serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/
-```
-After the product recognition inference model is converted, two additional folders, `product_ResNet50_vd_aliproduct_v1.0_serving` and `product_ResNet50_vd_aliproduct_v1.0_client`, appear in the current folder, with the following layout:
-```
-|- product_ResNet50_vd_aliproduct_v1.0_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-|- product_ResNet50_vd_aliproduct_v1.0_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-```
-Once you have the model files for deployment, change the alias name in `serving_server_conf.prototxt`: set `alias_name` in `fetch_var` to `features`.
-The modified serving_server_conf.prototxt file is as follows:
-```
-feed_var {
-  name: "x"
-  alias_name: "x"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "features"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-Next, download and unpack the prebuilt index of the product gallery
-```
-cd ../
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
-```
-<a name="paddle-serving-pipeline-deployment"></a>
-## Paddle Serving pipeline deployment
-**Attention:** The pipeline deployment mode does not support Windows.
-1. Download the PaddleClas code. If you have already downloaded it, you can skip this step.
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-    # Enter the working directory  
-    cd PaddleClas/deploy/paddleserving/recognition
-    ```
-    The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml                   # configuration file for starting the service
-    pipeline_http_client.py      # script to send pipeline prediction requests over HTTP
-    pipeline_rpc_client.py       # script to send pipeline prediction requests over RPC
-    recognition_web_service.py   # script to start the pipeline server
-    ```
-2. Run the following command to start the service.
-    ```
-    # Start the service and save the running log in log.txt
-    python3 recognition_web_service.py &>log.txt &
-    ```
-    After the service starts successfully, a log similar to the following is printed in log.txt
-    ![](../imgs/start_server_recog.png)
-3. Send a service request
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After it runs successfully, the model's prediction result is printed in the terminal window. An example of the result is:
-    ![](../imgs/results_recog.png)  
-    Adjust the concurrency value in config.yml to maximize QPS.
-    ```
-    op:
-        concurrency: 8
-        ...
-    ```
-    Multiple service requests can be sent at the same time if necessary.
-    Prediction performance data is automatically written to the `PipelineServingLogs/pipeline.tracer` file.
-<a name="faq"></a>
-## FAQ
-**Q1**: No result is returned after sending the request.
-**A1**: Do not set a proxy when starting the service or sending requests. You can disable the proxy before starting the service and before sending requests with the following commands:
-```
-unset https_proxy
-unset http_proxy
-```  
--- a/deploy/paddleserving/recognition/README_CN.md
+++ b/deploy/paddleserving/recognition/README_CN.md
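-# Product Recognition Service Deployment Based on PaddleServing
-([English](./README.md)|简体中文)
-This document uses product recognition as an example to introduce how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) to deploy a PaddleClas dynamic graph model as an online pipeline service.
-Compared with hubserving deployment, PaddleServing has the following advantages:
- Supports highly concurrent and efficient communication between clients and servers
- Supports industrial-grade serving capabilities such as model management, online loading, and online A/B testing
- Supports client development in multiple programming languages, such as C++, Python, and Java
-For more on the PaddleServing deployment framework and its tutorials, refer to the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md).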
-## Contents
- [Environment preparation](#环境准备)
- [Model conversion](#模型转换)
- [Paddle Serving pipeline deployment](#部署)
- [FAQ](#FAQ)
-<a name="环境准备"></a>
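-## Environment preparation
-Both the PaddleClas operating environment and the PaddleServing operating environment are required.
- Prepare the PaddleClas [operating environment](../../docs/zh_CN/tutorials/install.md). Download the paddle whl package that matches your environment; version 2.1.0 is recommended
- Prepare the PaddleServing operating environment as follows
-1. Install serving, which is used to start the service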
-    ```
-    pip3 install paddle-serving-server==0.6.1 # for CPU
-    pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
-    # For other GPU environments, confirm your environment and then choose the matching command below
-    pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
-    ```
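-2. Install the client, which is used to send requests to the service
-    Find the client installation package that matches your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md); Python 3.7 is recommended here: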
-    ```
-    wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
-    pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
-    ```
-3. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.6.1
-    ```
-    **Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-<a name="模型转换"></a>
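-## Model conversion
-When using PaddleServing for service deployment, the saved inference model must be converted into a model that serving can deploy easily.
-The following assumes that the current working directory is the PaddleClas root directory.
-First, download the product recognition inference model
-```
-cd deploy
-# Download and unzip the product recognition model
-wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar
-cd models
-tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
-```
-Next, use the installed paddle_serving_client to convert the downloaded inference model into a format that is easy for the server to deploy.
-```
-# Convert the product recognition model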
-python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \
-                                         --model_filename inference.pdmodel  \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \
-                                         --serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/
-```
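-After the product recognition inference model is converted, two additional folders, `product_ResNet50_vd_aliproduct_v1.0_serving` and `product_ResNet50_vd_aliproduct_v1.0_client`, appear in the current folder, with the following layout: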
-```
-|- product_ResNet50_vd_aliproduct_v1.0_serving/
-  |- __model__  
-  |- __params__
-  |- serving_server_conf.prototxt  
-  |- serving_server_conf.stream.prototxt
-|- product_ResNet50_vd_aliproduct_v1.0_client
-  |- serving_client_conf.prototxt  
-  |- serving_client_conf.stream.prototxt
-```
-Once you have the model files, change the alias name in serving_server_conf.prototxt: set `alias_name` in `fetch_var` to `features`.
-The modified serving_server_conf.prototxt is as follows:
-```
-feed_var {
-  name: "x"
-  alias_name: "x"
-  is_lod_tensor: false
-  feed_type: 1
-  shape: 3
-  shape: 224
-  shape: 224
-}
-fetch_var {
-  name: "save_infer_model/scale_0.tmp_1"
-  alias_name: "features"
-  is_lod_tensor: true
-  fetch_type: 1
-  shape: -1
-}
-```
-Next, download and unpack the prebuilt product gallery index
-```
-cd ../
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
-```
-<a name="部署"></a>
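-## Paddle Serving pipeline deployment
-**Note:** The pipeline deployment mode does not support Windows
-1. Download the PaddleClas code; skip this step if you have already downloaded it
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleClas
-    # Enter the working directory
-    cd PaddleClas/deploy/paddleserving/recognition
-    ```
-    The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml                    # configuration file for starting the service
-    pipeline_http_client.py       # script to send pipeline prediction requests over HTTP
-    pipeline_rpc_client.py        # script to send pipeline prediction requests over RPC
-    recognition_web_service.py    # script to start the pipeline server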
-    ```
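-2. Run the following command to start the service:
-    ```
-    # Start the service and save the running log in log.txt
-    python3 recognition_web_service.py &>log.txt &
-    ```
-    After the service starts successfully, a log similar to the following is printed in log.txt
-    ![](../imgs/start_server_recog.png)
-3. Send a service request:
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After it runs successfully, the model's prediction result is printed in the terminal window. An example of the result is:
-    ![](../imgs/results_recog.png)
-    Adjust the concurrency value in config.yml to maximize QPS
-    ```
-    op:
-        # Concurrency: thread-level when is_thread_op=True, otherwise process-level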
-        concurrency: 8
-        ...
-    ```
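-    Multiple service requests can be sent at the same time if necessary
-    Prediction performance data is automatically written to the `PipelineServingLogs/pipeline.tracer` file.
-<a name="FAQ"></a>
-## FAQ
-**Q1**: No result is returned after sending the request, or an output decoding error is reported
-**A1**: Do not set a proxy when starting the service or sending requests. You can disable the proxy before starting the service and before sending requests with the following commands: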
-```
-unset https_proxy
-unset http_proxy
-```
--- a/deploy/paddleserving/recognition/recognition_web_service.py
+++ b/deploy/paddleserving/recognition/recognition_web_service.py
@@ -83,7 +83,7 @@ class DetOp(Op):
        }
        return feed_dict, False, None, ""
-    def postprocess(self, input_dicts, fetch_dict, log_id):
+    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        boxes = self.img_postprocess(fetch_dict, visualize=False)
        boxes.sort(key=lambda x: x["score"], reverse=True)
        boxes = filter(lambda x: x["score"] >= self.threshold,
@@ -173,7 +173,7 @@ class RecOp(Op):
            filtered_results.append(results[i])
        return filtered_results
-    def postprocess(self, input_dicts, fetch_dict, log_id):
+    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        batch_features = fetch_dict["features"]
        if self.feature_normalize:
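The `DetOp.postprocess` context lines above sort detection boxes by score and then keep those at or above a threshold. A minimal standalone sketch of that ranking step follows; the function name `select_boxes` and the sample detections are hypothetical, and the rest of the real method is not reproduced.

```python
def select_boxes(boxes, threshold):
    """Return the boxes with score >= threshold, highest score first."""
    # sorted() leaves the caller's list untouched, unlike list.sort().
    ranked = sorted(boxes, key=lambda b: b["score"], reverse=True)
    # A list comprehension materializes the result; filter() in Python 3 is lazy.
    return [b for b in ranked if b["score"] >= threshold]


# Hypothetical detections for illustration only.
detections = [
    {"label": "a", "score": 0.35},
    {"label": "b", "score": 0.92},
    {"label": "c", "score": 0.10},
]
kept = select_boxes(detections, threshold=0.2)
print([b["label"] for b in kept])  # prints ['b', 'a']
```

Returning a list rather than a lazy `filter` object lets the result be indexed or iterated more than once.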

--- a/docs/zh_CN/inference_deployment/paddle_serving_deploy.md
+++ b/docs/zh_CN/inference_deployment/paddle_serving_deploy.md
@@ -17,27 +17,26 @@
 The Serving official site recommends using docker to install and deploy the Serving environment. First, pull the docker image and create a Serving-based docker container.
 ```shell
-nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
+docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
-nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
+nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
 nvidia-docker exec -it test bash
 ```
 After entering the docker container, install the Serving-related python packages.
 ```shell
-pip install paddlepaddle-gpu
+pip3 install paddle-serving-client==0.7.0
-pip install paddle-serving-client
+pip3 install paddle-serving-server==0.7.0 # CPU
-pip install paddle-serving-server-gpu
+pip3 install paddle-serving-app==0.7.0
-pip install paddle-serving-app
+pip3 install paddle-serving-server-gpu==0.7.0.post102 #GPU with CUDA10.2 + TensorRT6
+# For other GPU environments, confirm your environment and then choose which command to run
+pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
+pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
 ```
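 * If installation is too slow, you can switch the package source with `-i https://pypi.tuna.tsinghua.edu.cn/simple` to speed it up.
+* For installation in other environments, refer to: [Install Paddle Serving with Docker](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Install_CN.md)
-* To deploy a CPU service, install the cpu version of serving-server with the following command.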
-```shell
-pip install paddle-serving-server
-```
 <a name="图像分类服务部署"></a>
 ## 3. Image classification service deployment
 ### 3.1 Model conversion