Merge branch 'develop' into dev_2

71cc5408 · huangjianhui · GitHub · 59285ba9 · 8d2daeff · 71cc5408
21 changed files
--- a/doc/C++_Serving/ABTest_CN.md
+++ b/doc/C++_Serving/ABTest_CN.md
@@ -30,7 +30,7 @@ pip install Shapely

 ### 启动Server端

-这里采用[Docker方式](../Run_In_Docker_CN.md)启动Server端服务。
+这里采用[Docker方式](../Install_CN.md)启动Server端服务。

 首先启动BOW Server，该服务启用`8000`端口：


--- a/doc/C++_Serving/ABTest_EN.md
+++ b/doc/C++_Serving/ABTest_EN.md
@@ -31,7 +31,7 @@ The Python code in the file will process the data `test_data/part-0` and write t

 ### Start Server

-Here, we [use docker](../Run_In_Docker_EN.md) to start the server-side service. 
+Here, we [use docker](../Docker_Images_EN.md) to start the server-side service. 

 First, start the BOW server, which enables the `8000` port:


--- a/doc/C++_Serving/Introduction_CN.md
+++ b/doc/C++_Serving/Introduction_CN.md
@@ -76,7 +76,7 @@ C++ Serving采用对称加密算法对模型进行加密，在服务加载模型
 <p>

 ### 4.2 多语言多协议Client
-BRPC网络框架支持[多种底层通信协议](#1.网络框架(BRPC))，即使用目前的C++ Serving框架的Server端，各种语言的Client端，甚至使用curl的方式，只要按照上述协议（具体支持的协议见[brpc官网](https://github.com/apache/incubator-brpc)）封装数据并发送，Server端就能够接收、处理和返回结果。
+BRPC网络框架支持[多种底层通信协议](#1网络框架BRPC)，即使用目前的C++ Serving框架的Server端，各种语言的Client端，甚至使用curl的方式，只要按照上述协议（具体支持的协议见[brpc官网](https://github.com/apache/incubator-brpc)）封装数据并发送，Server端就能够接收、处理和返回结果。

 对于支持的各种协议我们提供了部分的Client SDK示例供用户参考和使用，用户也可以根据自己的需求去开发新的Client SDK，也欢迎用户添加其他语言/协议（例如GRPC-Go、GRPC-C++ HTTP2-Go、HTTP2-Java等）Client SDK到我们的仓库供其他开发者借鉴和参考。


--- a/doc/Compile_CN.md
+++ b/doc/Compile_CN.md
@@ -2,7 +2,22 @@

 (简体中文|[English](./Compile_EN.md))

-## 编译环境设置
+## 总体概述
+
+编译Paddle Serving一共分以下几步
+
+- 编译环境准备：根据模型和运行环境的需要，选择最合适的镜像
+- 下载代码库：下载Serving代码库，按需要执行初始化操作
+- 环境变量准备：根据运行环境的需要，确定Python各个环境变量，如GPU环境还需要确定Cuda，Cudnn，TensorRT等环境变量。
+- 正式编译： 编译`paddle-serving-server`, `paddle-serving-client`, `paddle-serving-app`相关whl包
+- 安装相关whl包：安装编译出的三个whl包，并设置SERVING_BIN环境变量
+
+此外，针对某些C++二次开发场景，我们也提供了OPENCV的联编方案。
+
+
+
+
+## 编译环境准备

 |             组件             |             版本要求              |
 | :--------------------------: | :-------------------------------: |
@@ -11,7 +26,7 @@
 |           gcc-c++            |          5.4.0(Cuda 10.1) and 8.2.0         |
 |            cmake             |          3.2.0 and later          |
 |            Python            |          3.6.0 and later          |
-|              Go              |          1.9.2 and later          |
+|              Go              |          1.17.2 and later          |
 |             git              |         2.17.1 and later          |
 |         glibc-static         |               2.17                |
 |        openssl-devel         |              1.0.2k               |
@@ -25,107 +40,149 @@

 推荐使用Docker编译，我们已经为您准备好了Paddle Serving编译环境并配置好了上述编译依赖，详见[该文档](Docker_Images_CN.md)。

-## 获取代码
+我们提供了五个环境的开发镜像，分别是CPU， Cuda10.1+Cudnn7， Cuda10.2+Cudnn7，Cuda10.2+Cudnn8， Cuda11.2+Cudnn8。我们提供了Serving开发镜像涵盖以上环境。与此同时，我们也支持Paddle开发镜像。

-``` python
-git clone https://github.com/PaddlePaddle/Serving
-cd Serving && git submodule update --init --recursive
-```
+其中Serving镜像名是 **paddlepaddle/serving:${Serving开发镜像Tag}**(如果网络不佳可以访问**registry.baidubce.com/paddlepaddle/serving:${Serving开发镜像Tag}**)， Paddle开发镜像名是 **paddlepaddle/paddle:${Paddle开发镜像Tag}**。为了防止用户对两套镜像出现混淆，我们分别解释一下两套镜像的由来。

-## PYTHONROOT设置
+Serving开发镜像是Serving套件为了支持各个预测环境提供的用于编译、调试预测服务的镜像，Paddle开发镜像是Paddle在官网发布的用于编译、开发、训练模型的镜像。支持Paddle开发镜像，是为了让Paddle开发者能够在同一个容器内直接使用Serving。对于上个版本就已经使用Serving的开发者来说，Serving开发镜像应该不会陌生；而熟悉Paddle训练框架生态的开发者，应该更熟悉已有的Paddle开发镜像。为了适应所有用户的不同习惯，我们对这两套镜像都做了充分的支持。

-```shell
-# 例如python的路径为/usr/bin/python，可以设置PYTHONROOT
-export PYTHONROOT=/usr
+
+|  环境                         |   Serving开发镜像Tag               |    操作系统      | Paddle开发镜像Tag       |  操作系统            |
+| :--------------------------: | :-------------------------------: | :-------------: | :-------------------: | :----------------: |
+|  CPU                         | 0.7.0-devel                       |  Ubuntu 16.04   | 2.2.0                 | Ubuntu 18.04        |
+|  Cuda10.1+Cudnn7             | 0.7.0-cuda10.1-cudnn7-devel       |  Ubuntu 16.04   | 无                     | 无                 |
+|  Cuda10.2+Cudnn7             | 0.7.0-cuda10.2-cudnn7-devel       |  Ubuntu 16.04   | 2.2.0-cuda10.2-cudnn7 | Ubuntu 16.04        |
+|  Cuda10.2+Cudnn8             | 0.7.0-cuda10.2-cudnn8-devel       |  Ubuntu 16.04   | 无                    |  无                 |
+|  Cuda11.2+Cudnn8             | 0.7.0-cuda11.2-cudnn8-devel       |  Ubuntu 16.04   | 2.2.0-cuda11.2-cudnn8 | Ubuntu 18.04        | 
+
+我们首先要针对自己所需的环境拉取相关镜像。上表**环境**一列下，除了CPU，其余（Cuda**+Cudnn**）都属于GPU环境。
+您可以使用Serving开发镜像。
 ```
+docker pull paddlepaddle/serving:${Serving开发镜像Tag}
+
+# 如果是GPU镜像
+nvidia-docker run --rm -it  paddlepaddle/serving:${Serving开发镜像Tag} bash

-如果您使用的是Docker开发镜像，请按照如下，确定好需要编译的Python版本，设置对应的环境变量
+# 如果是CPU镜像
+docker run --rm -it  paddlepaddle/serving:${Serving开发镜像Tag} bash
 ```
-#Python3.6
-export PYTHONROOT=/usr/local/
-export PYTHON_INCLUDE_DIR=$PYTHONROOT/include/python3.6m
-export PYTHON_LIBRARIES=$PYTHONROOT/lib/libpython3.6m.so
-export PYTHON_EXECUTABLE=$PYTHONROOT/bin/python3.6
-
-#Python3.7
-export PYTHONROOT=/usr/local/
-export PYTHON_INCLUDE_DIR=$PYTHONROOT/include/python3.7m
-export PYTHON_LIBRARIES=$PYTHONROOT/lib/libpython3.7m.so
-export PYTHON_EXECUTABLE=$PYTHONROOT/bin/python3.7
-
-#Python3.8
-export PYTHONROOT=/usr/local/
-export PYTHON_INCLUDE_DIR=$PYTHONROOT/include/python3.8
-export PYTHON_LIBRARIES=$PYTHONROOT/lib/libpython3.8.so
-export PYTHON_EXECUTABLE=$PYTHONROOT/bin/python3.8

+也可以使用Paddle开发镜像。
 ```
+docker pull paddlepaddle/paddle:${Paddle开发镜像Tag}

-## 安装Python依赖
+# 如果是GPU镜像，需要使用nvidia-docker
+nvidia-docker run --rm -it paddlepaddle/paddle:${Paddle开发镜像Tag} bash

-```shell
-pip install -r python/requirements.txt -i https://mirror.baidu.com/pypi/simple
+# 如果是CPU镜像
+docker run --rm -it paddlepaddle/paddle:${Paddle开发镜像Tag} bash
 ```

-如果使用其他Python版本，请使用对应版本的`pip`。

-## GOPATH 设置
+## 下载代码库
+**注明： 如果您正在使用Paddle开发镜像，需要在下载代码库后手动运行`bash tools/paddle_env_install.sh`（如代码框的最后一行所示）**
+```
+git clone https://github.com/PaddlePaddle/Serving
+cd Serving && git submodule update --init --recursive

-默认 GOPATH 设置为 `$HOME/go`，您也可以设置为其他值。** 如果是Serving提供的Docker环境，可以不需要设置。**
-```shell
-export GOPATH=$HOME/go
-export PATH=$PATH:$GOPATH/bin
+# Paddle开发镜像需要运行如下命令，Serving开发镜像不需要运行
+bash tools/paddle_env_install.sh
 ```

-## 获取 Go packages
+## 环境变量准备

-```shell
-go env -w GO111MODULE=on
-go env -w GOPROXY=https://goproxy.cn,direct
-go get -u github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
-go get -u github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
-go get -u github.com/golang/protobuf/protoc-gen-go@v1.4.3
-go get -u google.golang.org/grpc@v1.33.0
-go env -w GO111MODULE=auto
+**设置PYTHON环境变量**
+
+如果您使用的是Serving开发镜像，请按照如下，确定好需要编译的Python版本，设置对应的环境变量，一共需要设置三个环境变量，分别是`PYTHON_INCLUDE_DIR`, `PYTHON_LIBRARIES`, `PYTHON_EXECUTABLE`。以下我们以python 3.7为例，介绍如何设置这三个环境变量。
+
+1) 设置`PYTHON_INCLUDE_DIR`
+
+搜索Python.h 所在的目录
 ```
+find / -name Python.h
+```
+通常会有类似于`**/include/python3.7/Python.h`出现，我们只需要取它的文件夹目录就好，比如找到`/usr/include/python3.7/Python.h`，那么我们只需要`export PYTHON_INCLUDE_DIR=/usr/include/python3.7/`就好。
+如果没有找到，说明 1）没有安装开发版本的Python，需重新安装；2）权限不足，无法查看相关系统目录。

+2) 设置`PYTHON_LIBRARIES`

-## 编译Server部分
+搜索 libpython3.7.so
+```
+find / -name libpython3.7.so
+```
+通常会有类似于`**/lib/libpython3.7.so`或者`**/lib/x86_64-linux-gnu/libpython3.7.so`出现，我们只需要取它的文件夹目录就好，比如找到`/usr/local/lib/libpython3.7.so`，那么我们只需要`export PYTHON_LIBRARIES=/usr/local/lib`就好。
+如果没有找到，说明 1）Python是静态编译的，需要重新安装动态编译的Python；2）权限不足，无法查看相关系统目录。

-### 集成CPU版本Paddle Inference Library
+3) 设置`PYTHON_EXECUTABLE`

-``` shell
-mkdir server-build-cpu && cd server-build-cpu
-cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-    -DSERVER=ON ..
-make -j10
+直接查看python3.7路径
 ```
+which python3.7
+```
+假如结果是`/usr/local/bin/python3.7`，那么直接设置`export PYTHON_EXECUTABLE=/usr/local/bin/python3.7`。

-可以执行`make install`把目标产出放在`./output`目录下，cmake阶段需添加`-DCMAKE_INSTALL_PREFIX=./output`选项来指定存放路径。
+设置好这三个环境变量至关重要，设置完成后，我们便可以执行下列操作（以下是Paddle Cuda 11.2的开发镜像的PYTHON环境，如果是其他镜像，请更改相应的`PYTHON_INCLUDE_DIR`, `PYTHON_LIBRARIES`, `PYTHON_EXECUTABLE`）。

-### 集成GPU版本Paddle Inference Library
+```
+# 以下三个环境变量是Paddle开发镜像Cuda11.2的环境，如其他镜像可能需要修改
+export PYTHON_INCLUDE_DIR=/usr/include/python3.7m/
+export PYTHON_LIBRARIES=/usr/lib/x86_64-linux-gnu/libpython3.7m.so
+export PYTHON_EXECUTABLE=/usr/bin/python3.7

-相比CPU环境，GPU环境需要参考以下表格,
-**需要说明的是，以下表格对非Docker编译环境作为参考，Docker编译环境已经配置好相关参数，无需在cmake过程指定。**
+export GOPATH=$HOME/go
+export PATH=$PATH:$GOPATH/bin

-| cmake环境变量         | 含义                                | GPU环境注意事项               | Docker环境是否需要 |
-|-----------------------|-------------------------------------|-------------------------------|--------------------|
-| CUDA_TOOLKIT_ROOT_DIR | cuda安装路径，通常为/usr/local/cuda | 全部环境都需要                | 否(/usr/local/cuda)                 |
-| CUDNN_LIBRARY         | libcudnn.so.*所在目录，通常为/usr/local/cuda/lib64/  | 全部环境都需要                | 否(/usr/local/cuda/lib64/)                 |
-| CUDA_CUDART_LIBRARY   | libcudart.so.*所在目录，通常为/usr/local/cuda/lib64/ | 全部环境都需要                | 否(/usr/local/cuda/lib64/)                 |
-| TENSORRT_ROOT         | libnvinfer.so.*所在目录的上一级目录，取决于TensorRT安装目录 | Cuda 9.0/10.0不需要，其他需要 | 否(/usr)                 |
+python -m pip install -r python/requirements.txt
 
-非Docker环境下，用户可以参考如下执行方式，具体的路径以当时环境为准，代码仅作为参考。TENSORRT_LIBRARY_PATH和TensorRT版本有关，要根据实际情况设置。例如在cuda10.1环境下TensorRT版本是6.0(/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/)，在cuda10.2和cuda11.0环境下TensorRT版本是7.1（/usr/local/TensorRT-7.1.3.4/targets/x86_64-linux-gnu/）。
+go env -w GO111MODULE=on
+go env -w GOPROXY=https://goproxy.cn,direct
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
+go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
+go install google.golang.org/grpc@v1.33.0
+go env -w GO111MODULE=auto
+```

-``` shell
+如果您是GPU用户需要额外设置`CUDA_PATH`, `CUDNN_LIBRARY`, `CUDA_CUDART_LIBRARY`和`TENSORRT_LIBRARY_PATH`。
+```
 export CUDA_PATH='/usr/local/cuda'
 export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
 export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"
-export TENSORRT_LIBRARY_PATH="/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/"
+export TENSORRT_LIBRARY_PATH="/usr/"
+```
+环境变量的含义如下表所示。
+
+| cmake环境变量         | 含义                                | GPU环境注意事项               | Docker环境是否需要 |
+|-----------------------|-------------------------------------|-------------------------------|--------------------|
+| CUDA_TOOLKIT_ROOT_DIR | cuda安装路径，通常为/usr/local/cuda | 全部GPU环境都需要                | 否(/usr/local/cuda)                 |
+| CUDNN_LIBRARY         | libcudnn.so.*所在目录，通常为/usr/local/cuda/lib64/  | 全部GPU环境都需要                | 否(/usr/local/cuda/lib64/)                 |
+| CUDA_CUDART_LIBRARY   | libcudart.so.*所在目录，通常为/usr/local/cuda/lib64/ | 全部GPU环境都需要                | 否(/usr/local/cuda/lib64/)                 |
+| TENSORRT_ROOT         | libnvinfer.so.*所在目录的上一级目录，取决于TensorRT安装目录 | 全部GPU环境都需要 | 否(/usr)                 |
+
+
+
+## 正式编译

-mkdir server-build-gpu && cd server-build-gpu
+我们一共需要编译三个目标，分别是`paddle-serving-server`, `paddle-serving-client`, `paddle-serving-app`，其中`paddle-serving-server`需要区分CPU或者GPU版本。如果是CPU版本请运行，
+
+### 编译paddle-serving-server
+
+```
+mkdir build_server
+cd build_server
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
+    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+    -DSERVER=ON \
+    -DWITH_GPU=OFF ..
+make -j20
+cd ..
+```
+
+如果是GPU版本，请运行，
+```
+mkdir build_server
+cd build_server
 cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
@@ -135,83 +192,76 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
    -DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
    -DSERVER=ON \
    -DWITH_GPU=ON ..
-make -j10
+make -j20
+cd ..
 ``` 

-执行`make install`可以把目标产出放在`./output`目录下。
-
-### 开启WITH_OPENCV选项编译C++ Server
-**注意：** 只有当您需要对Paddle Serving C++部分进行二次开发，且新增的代码依赖于OpenCV库时，您才需要这样做。
-
-编译Serving C++ Server部分，开启WITH_OPENCV选项时，需要已安装的OpenCV库，若尚未安装，可参考本文档后面的说明编译安装OpenCV库。
+### 编译paddle-serving-client 和 paddle-serving-app

-以开启WITH_OPENCV选项，编译CPU版本Paddle Inference Library为例，在上述编译命令基础上，加入`DOPENCV_DIR=${OPENCV_DIR}` 和 `DWITH_OPENCV=ON`选项。
-``` shell
-OPENCV_DIR=your_opencv_dir #`your_opencv_dir`为opencv库的安装路径。
-mkdir server-build-cpu && cd server-build-cpu
-cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-    -DOPENCV_DIR=${OPENCV_DIR} \
-    -DWITH_OPENCV=ON \
-    -DSERVER=ON ..
-make -j10
+接下来，我们继续编译client和app就可以了，这两个包的编译命令在所有平台通用，不区分CPU和GPU的版本。
 ```
-
-**注意：** 编译成功后，需要设置`SERVING_BIN`路径，详见后面的[注意事项](#注意事项)。
-
-
-## 编译Client部分
-
-``` shell
-mkdir client-build && cd client-build
+# 编译paddle-serving-client
+mkdir build_client
+cd build_client
 cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
    -DCLIENT=ON ..
 make -j10
-```
-
-执行`make install`可以把目标产出放在`./output`目录下。
-
-
-
-## 编译App部分
+cd ..

-```bash
-mkdir app-build && cd app-build
+# 编译paddle-serving-app
+mkdir build_app
+cd build_app
 cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
    -DAPP=ON ..
-make
+make -j10
+cd ..
 ```

+## 安装相关whl包
+```
+pip3.7 install build_server/python/dist/*.whl
+pip3.7 install build_client/python/dist/*.whl
+pip3.7 install build_app/python/dist/*.whl
+export SERVING_BIN=${PWD}/build_server/core/general-server/serving
+```

+## 注意事项

-## 安装wheel包
-
-无论是Client端，Server端还是App部分，编译完成后，安装编译过程临时目录（`server-build-cpu`、`server-build-gpu`、`client-build`、`app-build`）下的`python/dist/` 中的whl包即可。
-例如：cd server-build-cpu/python/dist && pip install -U xxxxx.whl
-
+注意到上一小节的最后一行`export SERVING_BIN`，运行python端Server时，会检查`SERVING_BIN`环境变量，如果想使用自己编译的二进制文件，请将该环境变量设置为对应二进制文件的路径，通常是`export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`。
+其中BUILD_DIR为`build_server`的绝对路径。
+可以cd build_server路径下，执行`export SERVING_BIN=${PWD}/core/general-server/serving`


+## 开启WITH_OPENCV选项编译C++ Server

-## 注意事项
+**注意：** 只有当您需要对Paddle Serving C++部分进行二次开发，且新增的代码依赖于OpenCV库时，您才需要这样做。

-运行python端Server时，会检查`SERVING_BIN`环境变量，如果想使用自己编译的二进制文件，请将设置该环境变量为对应二进制文件的路径，通常是`export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`。
-其中BUILD_DIR为server-build-cpu或server-build-gpu的绝对路径。
-可以cd server-build-cpu路径下，执行`export SERVING_BIN=${PWD}/core/general-server/serving`
+编译Serving C++ Server部分，开启WITH_OPENCV选项时，需要已安装的OpenCV库，若尚未安装，可参考本文档后面的说明编译安装OpenCV库。

+以开启WITH_OPENCV选项，编译CPU版本Paddle Inference Library为例，在上述编译命令基础上，加入`DOPENCV_DIR=${OPENCV_DIR}` 和 `DWITH_OPENCV=ON`选项。
+``` shell
+OPENCV_DIR=your_opencv_dir #`your_opencv_dir`为opencv库的安装路径。
+mkdir build_server && cd build_server
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
+    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+    -DOPENCV_DIR=${OPENCV_DIR} \
+    -DWITH_OPENCV=ON \
+    -DSERVER=ON ..
+make -j10
+```

+**注意：** 编译成功后，需要设置`SERVING_BIN`路径，详见后面的[注意事项](https://github.com/PaddlePaddle/Serving/blob/develop/doc/COMPILE_CN.md#注意事项)。

-## 如何验证

-请使用 `python/examples` 下的例子进行验证。



-## CMake选项说明
+## 附：CMake选项说明

 |     编译选项     |                    说明                    | 默认 |
 | :--------------: | :----------------------------------------: | :--: |
@@ -252,11 +302,11 @@ Paddle Serving通过PaddlePaddle预测库支持在GPU上做预测。WITH_GPU选
 | post102  |  10.2   | CuDNN 8.0.5  | 7.1.3    |
 | post11   |  11.0   | CuDNN 8.0.4  | 7.1.3    |

-### 如何让Paddle Serving编译系统探测到CuDNN库
+### 附：如何让Paddle Serving编译系统探测到CuDNN库

 从NVIDIA developer官网下载对应版本CuDNN并在本地解压后，在cmake编译命令中增加`-DCUDNN_LIBRARY`参数，指定CuDNN库所在路径。

-## 编译安装OpenCV库
+## 附：编译安装OpenCV库
 **注意：** 只有当您需要在C++代码中引入OpenCV库时，您才需要这样做。

 * 首先需要从OpenCV官网上下载在Linux环境下源码编译的包，以OpenCV3.4.7为例，下载命令如下。

--- a/doc/Compile_EN.md
+++ b/doc/Compile_EN.md
@@ -2,7 +2,19 @@

 ([简体中文](./Compile_CN.md)|English)

-## Compilation environment requirements
+## Overview
+
+Compiling Paddle Serving is divided into the following steps
+
+- Compilation Environment Preparation: According to the needs of the model and operating environment, select the most suitable image
+- Download the Serving Code Repo: Download the Serving code library, and perform initialization operations as needed
+- Environment Variable Preparation: According to the needs of the running environment, determine the various environment variables of Python. For example, the GPU environment also needs to determine the environment variables such as Cuda, Cudnn, TensorRT and so on.
+- Compilation: Compile `paddle-serving-server`, `paddle-serving-client`, `paddle-serving-app` related whl packages
+- Install Related Whl Packages: install the three compiled whl packages, and set the SERVING_BIN environment variable
+
+In addition, for C++ secondary development scenarios that need it, we also describe how to build the server together with OpenCV.
+
+## Compilation Environment Requirements

 |            module            |              version              |
 | :--------------------------: | :-------------------------------: |
@@ -11,7 +23,7 @@
 |           gcc-c++            |          5.4.0(Cuda 10.1) and 8.2.0         |
 |            cmake             |          3.2.0 and later          |
 |            Python            |          3.6.0 and later          |
-|              Go              |          1.9.2 and later          |
+|              Go              |          1.17.2 and later          |
 |             git              |         2.17.1 and later          |
 |         glibc-static         |               2.17                |
 |        openssl-devel         |              1.0.2k               |
@@ -23,111 +35,141 @@
 |            libSM             |               1.2.2               |
 |          libXrender          |              0.9.10               |

-It is recommended to use Docker for compilation. We have prepared the Paddle Serving compilation environment for you, see [this document](Docker_Images_EN.md).
+Docker compilation is recommended. We have prepared the Paddle Serving compilation environment for you and configured the above compilation dependencies. For details, please refer to [this document](Docker_Images_EN.md).

-## Get Code
+We provide development images for five environments: CPU, Cuda10.1+Cudnn7, Cuda10.2+Cudnn7, Cuda10.2+Cudnn8, and Cuda11.2+Cudnn8. The Serving development images cover all of these environments, and we also support the Paddle development images.

-``` python
-git clone https://github.com/PaddlePaddle/Serving
-cd Serving && git submodule update --init --recursive
+The Serving image name is **paddlepaddle/serving:${Serving development image Tag}** (if your network connection is poor, you can use **registry.baidubce.com/paddlepaddle/serving:${Serving development image Tag}** instead), and the Paddle development image name is **paddlepaddle/paddle:${Paddle Development Image Tag}**. To avoid confusion between the two sets of images, we explain where each one comes from.
+
+The Serving development image is provided by the Serving suite for compiling and debugging prediction services across the supported prediction environments. The Paddle development image is released by Paddle on its official website for compiling, developing, and training models. Supporting it lets Paddle developers use Serving directly in the same container. Developers who used Serving in the previous version will already be familiar with the Serving development image, while developers who know the Paddle training framework ecosystem are likely more familiar with the existing Paddle development image. To accommodate both habits, we fully support both sets of images.
+
+|  Environment           |   Serving Dev Image Tag               |    OS      | Paddle Dev Image Tag       |  OS            |
+| :--------------------------: | :-------------------------------: | :-------------: | :-------------------: | :----------------: |
+|  CPU                         | 0.7.0-devel                       |  Ubuntu 16.04   | 2.2.0                 | Ubuntu 18.04        |
+|  Cuda10.1+Cudnn7             | 0.7.0-cuda10.1-cudnn7-devel       |  Ubuntu 16.04   | N/A                   | N/A                 |
+|  Cuda10.2+Cudnn7             | 0.7.0-cuda10.2-cudnn7-devel       |  Ubuntu 16.04   | 2.2.0-cuda10.2-cudnn7 | Ubuntu 16.04        |
+|  Cuda10.2+Cudnn8             | 0.7.0-cuda10.2-cudnn8-devel       |  Ubuntu 16.04   | N/A                   | N/A                 |
+|  Cuda11.2+Cudnn8             | 0.7.0-cuda11.2-cudnn8-devel       |  Ubuntu 16.04   | 2.2.0-cuda11.2-cudnn8 | Ubuntu 18.04        |
+
+First, pull the image for the environment you need. In the **Environment** column of the table above, every entry except CPU (that is, Cuda**+Cudnn**) is a GPU environment.
+
+You can use Serving Dev Images.
 ```
+docker pull paddlepaddle/serving:${Serving Dev Image Tag}

-## PYTHONROOT settings
+# For GPU Image
+nvidia-docker run --rm -it  paddlepaddle/serving:${Serving Dev Image Tag} bash

-```shell
-# For example, the path of python is /usr/bin/python, you can set PYTHONROOT
-export PYTHONROOT=/usr
+# For CPU Image
+docker run --rm -it  paddlepaddle/serving:${Serving Dev Image Tag} bash
 ```

-If you are using a Docker development image, please follow the following to determine the Python version to be compiled, and set the corresponding environment variables
+You can also use Paddle Dev Images.
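The `${... Image Tag}` placeholders above are not valid shell variable names, so they have to be replaced by hand. As a minimal sketch, the image names can be assembled from the tags in the table; `dev_image` is a hypothetical helper introduced here for illustration, not part of Serving, and the Paddle commands mirror the ones shown in the Chinese version of this page.

```shell
# Hypothetical helper (not part of Serving): build an image name.
# $1 = repository (serving or paddle), $2 = tag from the table above,
# $3 = optional registry prefix such as "registry.baidubce.com/".
dev_image() {
    echo "${3:-}paddlepaddle/$1:$2"
}

SERVING_IMAGE="$(dev_image serving 0.7.0-cuda10.2-cudnn7-devel)"
PADDLE_IMAGE="$(dev_image paddle 2.2.0-cuda10.2-cudnn7)"
MIRROR_IMAGE="$(dev_image serving 0.7.0-devel registry.baidubce.com/)"

# Print the commands instead of running them, so the sketch has no side effects.
echo "docker pull ${PADDLE_IMAGE}"
echo "nvidia-docker run --rm -it ${PADDLE_IMAGE} bash   # GPU image"
echo "docker run --rm -it ${MIRROR_IMAGE} bash          # CPU image via the mirror registry"
```

Copy the printed commands, or drop the `echo` wrappers, once the tags match your environment.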

+## Download the Serving Code Repo
+**Note: If you are using a Paddle development image, you need to manually run `bash tools/paddle_env_install.sh` after downloading the code repository (as shown in the last line of the code box)**
 ```
-#Python3.6
-export PYTHONROOT=/usr/local/
-export PYTHON_INCLUDE_DIR=$PYTHONROOT/include/python3.6m
-export PYTHON_LIBRARIES=$PYTHONROOT/lib/libpython3.6m.so
-export PYTHON_EXECUTABLE=$PYTHONROOT/bin/python3.6
-
-#Python3.7
-export PYTHONROOT=/usr/local/
-export PYTHON_INCLUDE_DIR=$PYTHONROOT/include/python3.7m
-export PYTHON_LIBRARIES=$PYTHONROOT/lib/libpython3.7m.so
-export PYTHON_EXECUTABLE=$PYTHONROOT/bin/python3.7
-
-#Python3.8
-export PYTHONROOT=/usr/local/
-export PYTHON_INCLUDE_DIR=$PYTHONROOT/include/python3.8
-export PYTHON_LIBRARIES=$PYTHONROOT/lib/libpython3.8.so
-export PYTHON_EXECUTABLE=$PYTHONROOT/bin/python3.8
+git clone https://github.com/PaddlePaddle/Serving
+cd Serving && git submodule update --init --recursive

+# Paddle development image needs to run the following commands, Serving development image does not need to run
+bash tools/paddle_env_install.sh
 ```

-## Install Python dependencies
+## Environment Variables Preparation

-```shell
-pip install -r python/requirements.txt -i https://mirror.baidu.com/pypi/simple
-```
+**Set PYTHON environment variable**

-If you use other Python version, please use the right `pip` accordingly.
+If you are using a Serving development image, follow the steps below to decide which Python version to build against and set the corresponding environment variables. Three environment variables need to be set: `PYTHON_INCLUDE_DIR`, `PYTHON_LIBRARIES`, and `PYTHON_EXECUTABLE`. Below we take Python 3.7 as an example to show how to set them.

-## GOPATH Setting
-The default GOPATH is set to `$HOME/go`, you can also set it to other values. **If it is the Docker environment provided by Serving, you do not need to set up.**
+1) Set `PYTHON_INCLUDE_DIR`

-```shell
-export GOPATH=$HOME/go
-export PATH=$PATH:$GOPATH/bin
+Search the directory where Python.h is located
 ```
+find / -name Python.h
+```
+The search usually returns something like `**/include/python3.7/Python.h`, and we only need its containing directory. For example, if you find `/usr/include/python3.7/Python.h`, then `export PYTHON_INCLUDE_DIR=/usr/include/python3.7/` is all that is needed.
+If nothing is found, either 1) the development version of Python is not installed and Python needs to be reinstalled, or 2) you lack the permissions to view the relevant system directories.

-## Get go packages
+2) Set `PYTHON_LIBRARIES`

-```shell
-go env -w GO111MODULE=on
-go env -w GOPROXY=https://goproxy.cn,direct
-go get -u github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
-go get -u github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
-go get -u github.com/golang/protobuf/protoc-gen-go@v1.4.3
-go get -u google.golang.org/grpc@v1.33.0
-go env -w GO111MODULE=auto
+Search for libpython3.7.so
+```
+find / -name libpython3.7.so
 ```
+The search usually returns something like `**/lib/libpython3.7.so` or `**/lib/x86_64-linux-gnu/libpython3.7.so`, and we only need its containing directory. For example, if you find `/usr/local/lib/libpython3.7.so`, then `export PYTHON_LIBRARIES=/usr/local/lib` is all that is needed.
+If nothing is found, either 1) Python was compiled statically and you need to reinstall a dynamically compiled Python, or 2) you lack the permissions to view the relevant system directories.

+3) Set `PYTHON_EXECUTABLE`

-## Compile Server
+View the python3.7 path directly
+```
+which python3.7
+```
+If the result is `/usr/local/bin/python3.7`, then directly set `export PYTHON_EXECUTABLE=/usr/local/bin/python3.7`.
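As an alternative to searching the whole filesystem with `find`, the interpreter can report its own paths through the standard-library `sysconfig` module. The following is only a sketch, assuming the interpreter named by `PY` (defaulting to `python3` here) is the one you intend to build against. It sets `PYTHON_EXECUTABLE` and `PYTHON_INCLUDE_DIR` and only prints hints for `PYTHON_LIBRARIES`, which you should still set as described in step 2.

```shell
# Assumption: PY names the interpreter to build against (e.g. python3.7).
PY="${PY:-python3}"

if command -v "$PY" >/dev/null 2>&1; then
    PYTHON_EXECUTABLE="$(command -v "$PY")"
    PYTHON_INCLUDE_DIR="$("$PY" -c 'import sysconfig; print(sysconfig.get_paths()["include"])')"
    export PYTHON_EXECUTABLE PYTHON_INCLUDE_DIR

    # Hints for PYTHON_LIBRARIES: the library directory and library file name
    # that this interpreter was configured with.
    "$PY" -c 'import sysconfig; print("LIBDIR    =", sysconfig.get_config_var("LIBDIR")); print("LDLIBRARY =", sysconfig.get_config_var("LDLIBRARY"))'

    # Python.h is only present when the development headers are installed.
    if [ -f "${PYTHON_INCLUDE_DIR}/Python.h" ]; then
        echo "Python.h found in ${PYTHON_INCLUDE_DIR}"
    else
        echo "Python.h not found in ${PYTHON_INCLUDE_DIR}; install the Python development package" >&2
    fi
else
    echo "interpreter not found: ${PY}" >&2
fi
```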

-### Integrated CPU version paddle inference library
+Setting these three environment variables correctly is essential. Once they are set, we can run the following commands (the values below are the Python environment of the Paddle development image for Cuda 11.2; for any other image, change `PYTHON_INCLUDE_DIR`, `PYTHON_LIBRARIES`, and `PYTHON_EXECUTABLE` accordingly).

-``` shell
-mkdir server-build-cpu && cd server-build-cpu
-cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-    -DSERVER=ON ..
-make -j10
 ```
+# The three variables below match the Paddle development image for Cuda11.2; other images may need different values
+export PYTHON_INCLUDE_DIR=/usr/include/python3.7m/
+export PYTHON_LIBRARIES=/usr/lib/x86_64-linux-gnu/libpython3.7m.so
+export PYTHON_EXECUTABLE=/usr/bin/python3.7

-you can execute `make install` to put targets under directory `./output`, you need to add`-DCMAKE_INSTALL_PREFIX=./output`to specify output path to cmake command shown above.
+export GOPATH=$HOME/go
+export PATH=$PATH:$GOPATH/bin

-### Integrated GPU version paddle inference library
+python -m pip install -r python/requirements.txt
 
-Compared with CPU environment, GPU environment needs to refer to the following table,
-**It should be noted that the following table is used as a reference for non-Docker compilation environment. The Docker compilation environment has been configured with relevant parameters and does not need to be specified in cmake process. **
+go env -w GO111MODULE=on
+go env -w GOPROXY=https://goproxy.cn,direct
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
+go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
+go install google.golang.org/grpc@v1.33.0
+go env -w GO111MODULE=auto
+```
+
+If you are a GPU user, you additionally need to set `CUDA_PATH`, `CUDNN_LIBRARY`, `CUDA_CUDART_LIBRARY` and `TENSORRT_LIBRARY_PATH`.
+```
+export CUDA_PATH='/usr/local/cuda'
+export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
+export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"
+export TENSORRT_LIBRARY_PATH="/usr/"
+```
+The meaning of environment variables is shown in the table below.

 | cmake environment variable | meaning | GPU environment considerations | whether Docker environment is needed |
 |-----------------------|-------------------------------------|-------------------------------|--------------------|
 | CUDA_TOOLKIT_ROOT_DIR | cuda installation path, usually /usr/local/cuda | Required for all environments | No (/usr/local/cuda) |
 | CUDNN_LIBRARY | The directory where libcudnn.so.* is located, usually /usr/local/cuda/lib64/ | Required for all environments | No (/usr/local/cuda/lib64/) |
 | CUDA_CUDART_LIBRARY | The directory where libcudart.so.* is located, usually /usr/local/cuda/lib64/ | Required for all environments | No (/usr/local/cuda/lib64/) |
-| TENSORRT_ROOT | The upper level directory of the directory where libnvinfer.so.* is located, depends on the TensorRT installation directory | Cuda 9.0/10.0 does not need, other needs | No (/usr) |
+| TENSORRT_ROOT | The parent directory of the directory where libnvinfer.so.* is located, depends on the TensorRT installation directory | Required for all GPU environments | No (/usr) |
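Before running cmake for a GPU build, it can help to confirm that each of these variables points at a directory that actually exists. The following is a minimal sketch; `check_dir` is a hypothetical helper introduced here, and the fallback paths are simply the example values from the export block above.

```shell
# Hypothetical helper: report whether a variable points at an existing directory.
# $1 = variable name (used only in the message), $2 = its value.
check_dir() {
    if [ -d "$2" ]; then
        echo "ok: $1=$2"
    else
        echo "missing: $1=$2" >&2
        return 1
    fi
}

# "|| true" keeps the script going so that every variable gets reported.
check_dir CUDA_PATH "${CUDA_PATH:-/usr/local/cuda}" || true
check_dir CUDNN_LIBRARY "${CUDNN_LIBRARY:-/usr/local/cuda/lib64/}" || true
check_dir CUDA_CUDART_LIBRARY "${CUDA_CUDART_LIBRARY:-/usr/local/cuda/lib64/}" || true
check_dir TENSORRT_LIBRARY_PATH "${TENSORRT_LIBRARY_PATH:-/usr/}" || true
```

A `missing:` line for any of the four variables usually means the cmake step would fail to locate that library.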

-If not in Docker environment, users can refer to the following execution methods. The specific path is subject to the current environment, and the code is only for reference.TENSORRT_LIBRARY_PATH is related to the TensorRT version and should be set according to the actual situation。For example, in the cuda10.1 environment, the TensorRT version is 6.0 (/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/)，In the cuda10.2 and cuda11.0 environment, the TensorRT version is 7.1 (/usr/local/TensorRT-7.1.3.4/targets/x86_64-linux-gnu/).

-``` shell
-export CUDA_PATH='/usr/local/cuda'
-export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
-export CUDA_CUDART_LIBRARY="/usr/local/cuda/lib64/"

-export TENSORRT_LIBRARY_PATH="/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/"
+## Compilation
+
+We need to compile three targets in total: `paddle-serving-server`, `paddle-serving-client`, and `paddle-serving-app`. Of these, `paddle-serving-server` comes in separate CPU and GPU versions. For the CPU version, run the commands in the first code block below.
+
+### Compile paddle-serving-server
+
+```
+mkdir build_server
+cd build_server
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
+     -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+     -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+     -DSERVER=ON \
+     -DWITH_GPU=OFF ..
+make -j20
+cd ..
+```

-mkdir server-build-gpu && cd server-build-gpu
+For the GPU version, run:
+```
+mkdir build_server
+cd build_server
 cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
     -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
     -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
@@ -137,85 +179,84 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
     -DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
     -DSERVER=ON \
     -DWITH_GPU=ON ..
-make -j10
+make -j20
+cd ..
 ```
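The `-j20` and `-j10` values in the build commands are fixed job counts. As a sketch, the job count can instead be derived from the number of available cores. This assumes `nproc` from GNU coreutils is present in the image, and the fallback value of 4 is an arbitrary choice for illustration.

```shell
# nproc prints the number of available processing units;
# fall back to 4 jobs if the command is unavailable.
JOBS="$(nproc 2>/dev/null || echo 4)"

# Printed rather than executed here; run it from inside the build directory.
echo "make -j${JOBS}"
```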

-Execute `make install` to put the target output in the `./output` directory.
-
-### Compile C++ Server under the condition of WITH_OPENCV=ON
-**Note:** Only when you need to redevelop the paddle serving C + + part, and the new code depends on the OpenCV library, you need to do so.
-
-First of all , OpenCV library should be installed, if not, please refer to the `Compile and install OpenCV` section later in this article.
-
-In the compile command, add `DOPENCV_DIR=${OPENCV_DIR}` and `DWITH_OPENCV=ON`，for example：
-``` shell
-OPENCV_DIR=your_opencv_dir #`your_opencv_dir` is the installation path of OpenCV library。
-mkdir server-build-cpu && cd server-build-cpu
-cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-    -DOPENCV_DIR=${OPENCV_DIR} \
-    -DWITH_OPENCV=ON \
-    -DSERVER=ON ..
-make -j10
-```
 **Note:** After the compilation is successful, you need to set the `SERVING_BIN` path, see the following [Notes](Compile_EN.md#Note).

-## Compile Client
+### Compile paddle-serving-client and paddle-serving-app

-``` shell
-mkdir client-build && cd client-build
+Next, compile the client and the app. The compilation commands for these two packages are the same on all platforms and do not distinguish between CPU and GPU versions.
+```
+# Compile paddle-serving-client
+mkdir build_client
+cd build_client
 cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
    -DCLIENT=ON ..
 make -j10
-```
-
-execute `make install` to put targets under directory `./output`
-
-
+cd ..

-## Compile the App
-
-```bash
-mkdir app-build && cd app-build
+# Compile paddle-serving-app
+mkdir build_app
+cd build_app
 cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
    -DAPP=ON ..
-make
+make -j10
+cd ..
 ```

+## Install Related Whl Packages
+```
+pip3.7 install build_server/python/dist/*.whl
+pip3.7 install build_client/python/dist/*.whl
+pip3.7 install build_app/python/dist/*.whl
+export SERVING_BIN=${PWD}/build_server/core/general-server/serving
+```
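
If a build step failed, the `*.whl` glob matches nothing and the shell passes the literal pattern to pip, which gives a confusing error. The following is a minimal sketch, assuming the three build directories used above; `list_wheels` is a hypothetical helper name, not part of Paddle Serving. It prints the wheels found under a build directory and reports any directory that has none.

```shell
# Hypothetical helper: print every .whl under <build_dir>/python/dist,
# or return non-zero if that directory holds no wheel.
list_wheels() {
    dist_dir="$1/python/dist"
    found=1
    for whl in "$dist_dir"/*.whl; do
        # When the glob matches nothing, $whl is the literal pattern.
        [ -e "$whl" ] || continue
        echo "$whl"
        found=0
    done
    return "$found"
}

# Check all three build directories before running pip.
for dir in build_server build_client build_app; do
    list_wheels "$dir" || echo "no wheel found under $dir/python/dist" >&2
done
```

Running this from the directory that contains the build folders shows which wheels pip would receive, so a missing package is noticed before installation.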

+## Precautions

-## Install wheel package
-
-Regardless of the client, server or App part, after compiling, install the whl package in `python/dist/` in the temporary directory(`server-build-cpu`, `server-build-gpu`, `client-build`,`app-build`) of the compilation process.
-for example：cd server-build-cpu/python/dist && pip install -U xxxxx.whl
+Note the last line, `export SERVING_BIN`, in the previous section. When the Python server starts, it checks the `SERVING_BIN` environment variable. To use a binary that you compiled yourself, set this variable to the path of that binary: `export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`,
+where `BUILD_DIR` is the absolute path of `build_server`.
+Alternatively, `cd` into `build_server` and run `export SERVING_BIN=${PWD}/core/general-server/serving`.
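
A mistyped path in `SERVING_BIN` is easy to miss until the server is started. The sketch below, which assumes nothing beyond the variable name given above, checks that `SERVING_BIN` is set and points at an executable file. `check_serving_bin` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: succeed only if SERVING_BIN names an executable file.
check_serving_bin() {
    if [ -z "${SERVING_BIN:-}" ]; then
        echo "SERVING_BIN is not set" >&2
        return 1
    fi
    # -x is also true for searchable directories, so exclude them explicitly.
    if [ ! -x "$SERVING_BIN" ] || [ -d "$SERVING_BIN" ]; then
        echo "SERVING_BIN is not an executable file: $SERVING_BIN" >&2
        return 1
    fi
    echo "using serving binary: $SERVING_BIN"
}

check_serving_bin || echo "set SERVING_BIN before starting the Python server" >&2
```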

+## Enable WITH_OPENCV option to compile C++ Server

-## Note
+**Note:** You only need to do this when you need to do secondary development on the Paddle Serving C++ part and the newly added code depends on the OpenCV library.

-When running the python server, it will check the `SERVING_BIN` environment variable. If you want to use your own compiled binary file, set the environment variable to the path of the corresponding binary file, usually`export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`.
-BUILD_DIR is the absolute path of server build CPU or server build GPU。
-for example: cd server-build-cpu && export SERVING_BIN=${PWD}/core/general-server/serving
+Compiling the Serving C++ Server with the WITH_OPENCV option turned on requires an installed OpenCV library. If OpenCV is not installed yet, refer to the instructions later in this document to compile and install it.
+
+Taking a CPU build with WITH_OPENCV enabled as an example, add the `-DOPENCV_DIR=${OPENCV_DIR}` and `-DWITH_OPENCV=ON` options to the compilation command above.
+``` shell
+OPENCV_DIR=your_opencv_dir #`your_opencv_dir` is the installation path of the opencv library.
+mkdir build_server && cd build_server
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
+    -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+    -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+    -DOPENCV_DIR=${OPENCV_DIR} \
+    -DWITH_OPENCV=ON \
+    -DSERVER=ON ..
+make -j10
+```

+**Note:** After compilation succeeds, you need to set the `SERVING_BIN` path; see [Notes](https://github.com/PaddlePaddle/Serving/blob/develop/doc/COMPILE_CN.md#Notes).

-## Verify

-Please use the example under `python/examples` to verify.



-## CMake Option Description
+## Attached: CMake option description

-| Compile Options  |                    Description             | Default |
-| :--------------: | :----------------------------------------: | :--: |
+| Compilation Options | Description | Default |
+| :--------------: | :----------------------------------------: | :--: |
 | WITH_AVX | Compile Paddle Serving with AVX intrinsics | OFF |
 | WITH_MKL | Compile Paddle Serving with MKL support | OFF |
 | WITH_GPU | Compile Paddle Serving with NVIDIA GPU | OFF |
+| WITH_TRT | Compile Paddle Serving with TensorRT | OFF |
 | WITH_OPENCV | Compile Paddle Serving with OPENCV | OFF |
 | CUDNN_LIBRARY | Define CuDNN library and header path | |
 | CUDA_TOOLKIT_ROOT_DIR | Define CUDA PATH | |
@@ -225,24 +266,23 @@ Please use the example under `python/examples` to verify.
 | APP | Compile Paddle Serving App package | OFF |
 | PACK | Compile for whl | OFF |

-### WITH_GPU Option
+### WITH_GPU option

-Paddle Serving supports prediction on the GPU through the PaddlePaddle inference library. The WITH_GPU option is used to detect basic libraries such as CUDA/CUDNN on the system. If an appropriate version is detected, the GPU Kernel will be compiled when PaddlePaddle is compiled.
+Paddle Serving supports prediction on the GPU through the PaddlePaddle prediction library. The WITH_GPU option is used to detect basic libraries such as CUDA/CuDNN on the system. If a suitable version is detected, the GPU version of the OP Kernel will be compiled when PaddlePaddle is compiled.

 To compile the Paddle Serving GPU version on bare metal, you need to install these basic libraries:

- CUDA
- CuDNN
+- CUDA
+- CuDNN

 To compile the TensorRT version, you need to install the TensorRT library.

-Note here:
-
-1. The basic library versions such as CUDA/CUDNN installed on the system where Serving is compiled, needs to be compatible with the actual GPU device. For example, the Tesla V100 card requires at least CUDA 9.0. If the version of the basic library such as CUDA used during compilation is too low, the generated GPU code is not compatible with the actual hardware device, which will cause the Serving process to fail to start or serious problems such as coredump.
-2. Install the CUDA driver compatible with the actual GPU device on the system running Paddle Serving, and install the basic library compatible with the CUDA/CuDNN version used during compilation. If the version of CUDA/CuDNN installed on the system running Paddle Serving is lower than the version used at compile time, it may cause some cuda function call failures and other problems.
+The things to note here are:

+1. The versions of the basic libraries such as CUDA/CuDNN installed on the system where Serving is compiled must be compatible with the actual GPU device. For example, the Tesla V100 card requires at least CUDA 9.0. If the basic libraries such as CUDA used during compilation are too old, the generated GPU code is incompatible with the actual hardware, and the Serving process may fail to start or hit serious problems such as a coredump.
+2. On the system running Paddle Serving, install a CUDA driver compatible with the actual GPU device, and install basic libraries compatible with the CUDA/CuDNN versions used during compilation. If the CUDA/CuDNN version installed on the system running Paddle Serving is lower than the version used during compilation, CUDA function calls may fail in unexpected ways.
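
The second point above amounts to a version comparison between the build machine and the runtime machine. Here is a minimal sketch of that comparison, assuming GNU `sort` with the `-V` (version sort) option is available. `version_ge` is a hypothetical helper, and the two version values are placeholders to be replaced with the versions from your own systems.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder values; substitute the versions from your own systems.
build_cuda="10.2"
runtime_cuda="10.1"

if version_ge "$runtime_cuda" "$build_cuda"; then
    echo "runtime CUDA $runtime_cuda is not older than build CUDA $build_cuda"
else
    echo "runtime CUDA $runtime_cuda is older than build CUDA $build_cuda" >&2
fi
```

The same function can be used for the CuDNN versions. It only compares version strings and does not check driver or hardware compatibility.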

-The following is the base library version matching relationship used by the PaddlePaddle release version for reference:
+The following table lists the CUDA, CuDNN, and TensorRT versions used in the Paddle Serving Docker images, for reference:

 | | CUDA | CuDNN | TensorRT |
 | :----: | :-----: | :----------: | :----: |
@@ -250,24 +290,23 @@ The following is the base library version matching relationship used by the Padd
 | post102 | 10.2 | CuDNN 8.0.5 | 7.1.3 |
 | post11 | 11.0 | CuDNN 8.0.4 | 7.1.3 |

-### How to make the compiler detect the CuDNN library
+### Attachment: How to make the Paddle Serving compilation system detect the CuDNN library

-Download the corresponding CUDNN version from NVIDIA developer official website and decompressing it, add `-DCUDNN_ROOT` to cmake command, to specify the path of CUDNN.
+After downloading the corresponding version of CuDNN from the NVIDIA Developer website and decompressing it locally, add the `-DCUDNN_LIBRARY` parameter to the cmake command to specify the path of the CuDNN library.

-## Compile and install OpenCV
-**Note:** You need to do this only if you need to import the opencv library into your C + + code.
+## Attachment: Compile and install OpenCV library
+**Note:** You only need to do this when you need to include the OpenCV library in your C++ code.

-* First of all, you need to download the source code compiled package in the Linux environment from the OpenCV official website. Taking OpenCV3.4.7 as an example, the download command is as follows.
+* First, download the OpenCV source package for Linux from the OpenCV official website. Taking OpenCV 3.4.7 as an example, the download commands are as follows.

 ```
 wget https://github.com/opencv/opencv/archive/3.4.7.tar.gz
 tar -xf 3.4.7.tar.gz
 ```

-Finally, you can see the folder of `opencv-3.4.7/` in the current directory.
-
-* Compile OpenCV, the OpenCV source path (`root_path`) and installation path (`install_path`) should be set by yourself. Enter the OpenCV source code path and compile it in the following way.
+After extraction, the folder `opencv-3.4.7/` appears in the current directory.
 
+* To compile OpenCV, first set the OpenCV source path (`root_path`) and the installation path (`install_path`). Then enter the OpenCV source directory and compile as follows.

 ```shell
 root_path=your_opencv_root_path
@@ -299,11 +338,10 @@ make -j
 make install
 ```

-Among them, `root_path` is the downloaded OpenCV source code path, and `install_path` is the installation path of OpenCV. After `make install` is completed, the OpenCV header file and library file will be generated in this folder for later source code compilation.
-

+Here, `root_path` is the path of the downloaded OpenCV source code and `install_path` is the installation path of OpenCV. After `make install` completes, the OpenCV header files and library files are generated in the installation folder. They are used to compile code that references the OpenCV library.

-The final file structure under the OpenCV installation path is as follows.
+The final file structure under the installation path is as follows.

 ```
 opencv3/

--- a/doc/Docker_Images_CN.md
+++ b/doc/Docker_Images_CN.md
@@ -8,10 +8,10 @@

 您可以通过两种方式获取镜像。

-1. 通过 TAG 直接从 `registry.baidubce.com ` 或 拉取镜像，具体TAG请参见下文的**镜像说明**章节的表格。
+1. 通过 TAG 直接从 dockerhub 或 `registry.baidubce.com` 拉取镜像，具体TAG请参见下文的**镜像说明**章节的表格。

   ```shell
-   docker pull registry.baidubce.com/paddlepaddle/serving:<TAG> # registry.baidubce.com
+   docker pull paddlepaddle/serving:<TAG> # 如果连接dockerhub网速不佳可以尝试registry.baidubce.com/paddlepaddle/serving:<TAG>
   ```

 2. 基于 Dockerfile 构建镜像
@@ -19,27 +19,25 @@
   建立新目录，复制对应 Dockerfile 内容到该目录下 Dockerfile 文件。执行

   ```shell
-   cd tools
-   docker build -f ${DOCKERFILE} -t <image-name>:<images-tag> .
+   docker build -f tools/${DOCKERFILE} -t <image-name>:<images-tag> .
   ```
   


 ## 镜像说明

-运行时镜像不能用于开发编译。
 若需要基于源代码二次开发编译，请使用后缀为-devel的版本。
-**在TAG列，latest也可以替换成对应的版本号，例如0.5.0/0.4.1等，但需要注意的是，部分开发环境随着某个版本迭代才增加，因此并非所有环境都有对应的版本号可以使用。**
+**在TAG列，0.7.0也可以替换成对应的版本号，例如0.5.0/0.4.1等，但需要注意的是，部分开发环境随着某个版本迭代才增加，因此并非所有环境都有对应的版本号可以使用。**

-**cuda10.1-cudnn7-gcc54环境尚未同步到镜像仓库，如果您需要相关镜像请运行相关dockerfile**

 |                         镜像选择                         |   操作系统    |             TAG              |                          Dockerfile                          |
 | :----------------------------------------------------------: | :-----: | :--------------------------: | :----------------------------------------------------------: |
-|                       CPU development                        | Ubuntu16 |         latest-devel         |        [Dockerfile.devel](../tools/Dockerfile.devel)         |
-|              GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development               | Ubuntu16 | latest-cuda10.1-cudnn7-gcc54-devel (not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
-|              GPU (cuda10.1-cudnn7-tensorRT6) development               | Ubuntu16 | latest-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
-|              GPU (cuda10.2-cudnn8-tensorRT7) development               | Ubuntu16 | latest-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
-|              GPU (cuda11.2-cudnn8-tensorRT7) development               | Ubuntu18 | latest-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
+|                       CPU development                        | Ubuntu16 |         0.7.0-devel         |        [Dockerfile.devel](../tools/Dockerfile.devel)         |
+|              GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development               | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-gcc54-devel (not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
+|              GPU (cuda10.1-cudnn7-tensorRT6) development               | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
+|              GPU (cuda10.2-cudnn7-tensorRT6) development               | Ubuntu16 | 0.7.0-cuda10.2-cudnn7-devel | [Dockerfile.cuda10.2-cudnn7.devel](../tools/Dockerfile.cuda10.2-cudnn7.devel) |
+|              GPU (cuda10.2-cudnn8-tensorRT7) development               | Ubuntu16 | 0.7.0-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
+|              GPU (cuda11.2-cudnn8-tensorRT8) development               | Ubuntu16 | 0.7.0-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |

 **Java镜像：**
 ```
@@ -63,38 +61,24 @@ registry.baidubce.com/paddlepaddle/serving:xpu-x86 # for x86 xpu user

 # （附录）所有镜像列表

-编译镜像：

 开发镜像:

 | Env      | Version | Docker images tag            | OS        | Gcc Version |
 |----------|---------|------------------------------|-----------|-------------|
-|    CPU   | >=0.5.0 | 0.6.2-devel                 | Ubuntu 16 |  8.2.0       |
+|    CPU   | >=0.5.0 | 0.7.0-devel                 | Ubuntu 16 |  8.2.0       |
 |          | <=0.4.0 | 0.4.0-devel                  | CentOS 7  | 4.8.5       |
-| Cuda10.1 | >=0.5.0 | 0.6.2-cuda10.1-cudnn7-devel  | Ubuntu 16 |   8.2.0       |
-|          | <=0.4.0 | 0.6.2-cuda10.1-cudnn7-devel    | CentOS 7  | 4.8.5     |
-| Cuda10.2 | >=0.5.0 | 0.6.2-cuda10.2-cudnn8-devel  | Ubuntu 16 |   8.2.0       |
+| Cuda10.1 | >=0.5.0 | 0.7.0-cuda10.1-cudnn7-devel  | Ubuntu 16 |   8.2.0       |
+|          | <=0.4.0 | 0.4.0-cuda10.1-cudnn7-devel    | CentOS 7  | 4.8.5     |
+| Cuda10.2+Cudnn7 | >=0.5.0 | 0.7.0-cuda10.2-cudnn7-devel  | Ubuntu 16 |   8.2.0       |
 |          | <=0.4.0 | Nan                          | Nan       | Nan         |
-| Cuda11.0 | >=0.5.0 | 0.6.2-cuda11.0-cudnn8-devel | Ubuntu 18 |    8.2.0       |
+| Cuda10.2+Cudnn8 | >=0.5.0 | 0.7.0-cuda10.2-cudnn8-devel  | Ubuntu 16 |   8.2.0       |
+|          | <=0.4.0 | Nan                          | Nan       | Nan         |
+| Cuda11.2 | >=0.5.0 | 0.7.0-cuda11.2-cudnn8-devel | Ubuntu 16 |    8.2.0       |
 |          | <=0.4.0 | Nan                          | Nan       | Nan         |

 运行镜像:

 运行镜像比开发镜像更加轻量化, 运行镜像提供了serving的whl和bin，但为了运行期更小的镜像体积，没有提供诸如cmake这样但开发工具。 如果您想了解有关信息，请检查文档[在Kubernetes上使用Paddle Serving](./Run_On_Kubernetes_CN.md)。

-| ENV                                      | Python Version | Tag                         |
-|------------------------------------------|----------------|-----------------------------|
-| cpu                                      | 3.6            | 0.6.2-py36-runtime          |
-| cpu                                      | 3.7            | 0.6.2-py37-runtime          |
-| cpu                                      | 3.8            | 0.6.2-py38-runtime          |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.6            | 0.6.2-cuda10.1-py36-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.7            | 0.6.2-cuda10.1-py37-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.8            | 0.6.2-cuda10.1-py38-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.6            | 0.6.2-cuda10.2-py36-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.7            | 0.6.2-cuda10.2-py37-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.8            | 0.6.2-cuda10.2-py38-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3   | 3.6            | 0.6.2-cuda11-py36-runtime   |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3   | 3.7            | 0.6.2-cuda11-py37-runtime   |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3   | 3.8            | 0.6.2-cuda11-py38-runtime   |
-
-**注意事项：** 如果您在0.5.0及以上版本需要在一个容器当中同时运行CPU server和GPU server，需要选择Cuda10.1/10.2/11的镜像，因为他们和CPU环境有着相同版本的gcc。
+
--- a/doc/Docker_Images_EN.md
+++ b/doc/Docker_Images_EN.md
@@ -8,10 +8,10 @@ This document maintains a list of docker images provided by Paddle Serving.

 You can get images in two ways:

-1. Pull image directly from `registry.baidubce.com ` through TAG:
+1. Pull the image directly from dockerhub or `registry.baidubce.com` through TAG:

   ```shell
-   docker pull registry.baidubce.com/paddlepaddle/serving:<TAG> # registry.baidubce.com
+   docker pull paddlepaddle/serving:<TAG>  # if the connection to dockerhub is slow, please try registry.baidubce.com
   ```

 2. Building image based on dockerfile
@@ -19,25 +19,28 @@ You can get images in two ways:
   Create a new folder and copy Dockerfile to this folder, and run the following command:

   ```shell
-   docker build -f ${DOCKERFILE} -t <image-name>:<images-tag> .
+   docker build -f tools/${DOCKERFILE} -t <image-name>:<images-tag> .
   ```



 ## Image description

-Runtime images cannot be used for compilation.
 If you want to customize your Serving based on source code, use the version with the suffix - devel.

 **cuda10.1-cudnn7-gcc54 image is not ready, you should run from dockerfile if you need it.**

+If you need to develop and compile based on the source code, please use the version with the suffix -devel.
+**In the TAG column, 0.7.0 can be replaced with another version number, such as 0.5.0 or 0.4.1. Note, however, that some development environments were only added in later versions, so not every environment is available for every version number.**
+
 |                         Description                         |   OS    |             TAG              |                          Dockerfile                          |
 | :----------------------------------------------------------: | :-----: | :--------------------------: | :----------------------------------------------------------: |
-|                       CPU development                        | Ubuntu16 |         latest-devel         |        [Dockerfile.devel](../tools/Dockerfile.devel)         |
-|              GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development               | Ubuntu16 | latest-cuda10.1-cudnn7-gcc54-devel(not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
-|              GPU (cuda10.1-cudnn7-tensorRT6) development               | Ubuntu16 | latest-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
-|              GPU (cuda10.2-cudnn8-tensorRT7) development               | Ubuntu16 | latest-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
-|              GPU (cuda11.2-cudnn8-tensorRT7) development               | Ubuntu18 | latest-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
+|                       CPU development                        | Ubuntu16 |         0.7.0-devel         |        [Dockerfile.devel](../tools/Dockerfile.devel)         |
+|              GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development               | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-gcc54-devel (not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
+|              GPU (cuda10.1-cudnn7-tensorRT6) development               | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
+|              GPU (cuda10.2-cudnn7-tensorRT6) development               | Ubuntu16 | 0.7.0-cuda10.2-cudnn7-devel | [Dockerfile.cuda10.2-cudnn7.devel](../tools/Dockerfile.cuda10.2-cudnn7.devel) |
+|              GPU (cuda10.2-cudnn8-tensorRT7) development               | Ubuntu16 | 0.7.0-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
+|              GPU (cuda11.2-cudnn8-tensorRT8) development               | Ubuntu16 | 0.7.0-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
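
The TAG column in the table above appears to follow the pattern `<version>[-<gpu-env>]-devel`. The sketch below builds an image reference under that assumption. `serving_devel_image` is a hypothetical helper, and the result should still be checked against the table because not every combination is published.

```shell
# Hypothetical helper: build a development image reference from a Serving
# version and an optional GPU environment string (omit it for CPU).
serving_devel_image() {
    version="$1"
    gpu_env="${2:-}"
    if [ -n "$gpu_env" ]; then
        echo "paddlepaddle/serving:${version}-${gpu_env}-devel"
    else
        echo "paddlepaddle/serving:${version}-devel"
    fi
}

serving_devel_image 0.7.0                   # CPU development image
serving_devel_image 0.7.0 cuda10.2-cudnn7   # GPU development image
```

The output can be passed to `docker pull`, for example `docker pull "$(serving_devel_image 0.7.0)"`.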

 **Java Client:**
 ```
@@ -64,34 +67,20 @@ Develop Images:

 | Env      | Version | Docker images tag            | OS        | Gcc Version |
 |----------|---------|------------------------------|-----------|-------------|
-|    CPU   | >=0.5.0 | 0.6.2-devel                 | Ubuntu 16 |  8.2.0       |
+|    CPU   | >=0.5.0 | 0.7.0-devel                 | Ubuntu 16 |  8.2.0       |
 |          | <=0.4.0 | 0.4.0-devel                  | CentOS 7  | 4.8.5       |
-| Cuda10.1 | >=0.5.0 | 0.6.2-cuda10.1-cudnn7-devel  | Ubuntu 16 |   8.2.0       |
-|          | 0.6.2   | 0.6.2-cuda10.1-cudnn7-gcc54-devel(not ready)  | Ubuntu 16 |  5.4.0 |
-|          | <=0.4.0 | 0.6.2-cuda10.1-cudnn7-devel    | CentOS 7  | 4.8.5     |
-| Cuda10.2 | >=0.5.0 | 0.6.2-cuda10.2-cudnn8-devel  | Ubuntu 16 |   8.2.0       |
+| Cuda10.1 | >=0.5.0 | 0.7.0-cuda10.1-cudnn7-devel  | Ubuntu 16 |   8.2.0       |
+|          | <=0.4.0 | 0.4.0-cuda10.1-cudnn7-devel    | CentOS 7  | 4.8.5     |
+| Cuda10.2+Cudnn7 | >=0.5.0 | 0.7.0-cuda10.2-cudnn7-devel  | Ubuntu 16 |   8.2.0       |
+|          | <=0.4.0 | Nan                          | Nan       | Nan         |
+| Cuda10.2+Cudnn8 | >=0.5.0 | 0.7.0-cuda10.2-cudnn8-devel  | Ubuntu 16 |   8.2.0       |
 |          | <=0.4.0 | Nan                          | Nan       | Nan         |
-| Cuda11.0 | >=0.5.0 | 0.6.2-cuda11.0-cudnn8-devel | Ubuntu 18 |    8.2.0       |
+| Cuda11.2 | >=0.5.0 | 0.7.0-cuda11.2-cudnn8-devel | Ubuntu 16 |    8.2.0       |
 |          | <=0.4.0 | Nan                          | Nan       | Nan         |

+
 Running Images:

 Running images are lighter than development images. They include the Serving whl and bin, but omit development tools such as cmake to keep the image small. For details, please see the document [Paddle Serving on Kubernetes](./Run_On_Kubernetes_CN.md).


-| ENV                                      | Python Version | Tag                         |
-|------------------------------------------|----------------|-----------------------------|
-| cpu                                      | 3.6            | 0.6.2-py36-runtime          |
-| cpu                                      | 3.7            | 0.6.2-py37-runtime          |
-| cpu                                      | 3.8            | 0.6.2-py38-runtime          |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.6            | 0.6.2-cuda10.1-py36-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.7            | 0.6.2-cuda10.1-py37-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.8            | 0.6.2-cuda10.1-py38-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.6            | 0.6.2-cuda10.2-py36-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.7            | 0.6.2-cuda10.2-py37-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.8            | 0.6.2-cuda10.2-py38-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3   | 3.6            | 0.6.2-cuda11-py36-runtime   |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3   | 3.7            | 0.6.2-cuda11-py37-runtime   |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3   | 3.8            | 0.6.2-cuda11-py38-runtime   |
-
-**Tips:**  If you want to use CPU server and GPU server (version>=0.5.0) at the same time, you should check the gcc version,  only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
--- a/doc/FAQ_CN.md
+++ b/doc/FAQ_CN.md
@@ -142,7 +142,7 @@ make: *** [all] Error 2

 #### Q：使用过程中出现CXXABI错误。

-这个问题出现的原因是Python使用的gcc版本和Serving所需的gcc版本对不上。对于Docker用户，推荐使用[Docker容器](./Run_In_Docker_CN.md)，由于Docker容器内的Python版本与Serving在发布前都做过适配，这样就不会出现类似的错误。如果是其他开发环境，首先需要确保开发环境中具备GCC 8.2，如果没有gcc 8.2，参考安装方式
+这个问题出现的原因是Python使用的gcc版本和Serving所需的gcc版本对不上。对于Docker用户，推荐使用[Docker容器](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Docker_Images_CN.md)，由于Docker容器内的Python版本与Serving在发布前都做过适配，这样就不会出现类似的错误。如果是其他开发环境，首先需要确保开发环境中具备GCC 8.2，如果没有gcc 8.2，参考安装方式

 ```bash
 wget -q https://paddle-ci.gz.bcebos.com/gcc-8.2.0.tar.xz 
@@ -236,7 +236,7 @@ InvalidArgumentError: Device id must be less than GPU count, but received id is:

 #### Q: python编译的GCC版本与serving的版本不匹配

-**A:**:1)使用[GPU docker](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Run_In_Docker_CN.md#gpunvidia-docker)解决环境问题；2)修改anaconda的虚拟环境下安装的python的gcc版本[改变python的GCC编译环境](https://www.jianshu.com/p/c498b3d86f77) 
+**A:**:1)使用GPU Dockers, [这里是Docker镜像列表](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Docker_Images_CN.md)解决环境问题；2)修改anaconda的虚拟环境下安装的python的gcc版本[改变python的GCC编译环境](https://www.jianshu.com/p/c498b3d86f77) 

 #### Q: paddle-serving是否支持本地离线安装 


--- a/doc/Install_CN.md
+++ b/doc/Install_CN.md
@@ -2,26 +2,56 @@

 (简体中文|[English](./Install_EN.md))

-**强烈建议**您在**Docker内构建**Paddle Serving，请查看[如何在Docker中运行PaddleServing](Run_In_Docker_CN.md)。更多镜像请查看[Docker镜像列表](Docker_Images_CN.md)。
+**强烈建议**您在**Docker内构建**Paddle Serving，更多镜像请查看[Docker镜像列表](Docker_Images_CN.md)。

-**提示**：目前paddlepaddle 2.1版本的默认GPU环境是Cuda 10.2，因此GPU Docker的示例代码以Cuda 10.2为准。镜像和pip安装包也提供了其余GPU环境，用户如果使用其他环境，需要仔细甄别并选择合适的版本。
+**提示-1**：本项目仅支持<mark>**Python3.6/3.7/3.8**</mark>，接下来所有的与Python/Pip相关的操作都需要选择正确的Python版本。

-**提示**：本项目仅支持Python3.6/3.7/3.8，接下来所有的与Python/Pip相关的操作都需要选择正确的Python版本。
+**提示-2**：以下示例中GPU环境均为cuda10.2-cudnn7，如果您使用Python Pipeline来部署，并需要Nvidia TensorRT来优化预测性能，请参考[支持的镜像环境和说明](#4支持的镜像环境和说明)来选择其他版本。

+
+## 1.启动开发镜像
+<mark>**同时支持使用Serving镜像和Paddle镜像，1.1和1.2章节中的操作2选1即可。**</mark>
+### 1.1 Serving开发镜像（CPU/GPU 2选1）
+**CPU：**
 ```
 # 启动 CPU Docker
-docker pull registry.baidubce.com/paddlepaddle/serving:0.6.2-devel
-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.6.2-devel bash
+docker pull paddlepaddle/serving:0.7.0-devel
+docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
 docker exec -it test bash
 git clone https://github.com/PaddlePaddle/Serving
 ```
+**GPU：**
 ```
 # 启动 GPU Docker
-nvidia-docker pull registry.baidubce.com/paddlepaddle/serving:0.6.2-cuda10.2-cudnn8-devel
-nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.6.2-cuda10.2-cudnn8-devel bash
+docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
+nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
 nvidia-docker exec -it test bash
 git clone https://github.com/PaddlePaddle/Serving
 ```
+### 1.2 Paddle开发镜像（CPU/GPU 2选1）
+**CPU：**
+```
+# 启动 CPU Docker
+docker pull paddlepaddle/paddle:2.2.0
+docker run -p 9292:9292 --name test -dit paddlepaddle/paddle:2.2.0 bash
+docker exec -it test bash
+git clone https://github.com/PaddlePaddle/Serving
+
+# Paddle开发镜像需要执行以下脚本增加Serving所需依赖项
+bash Serving/tools/paddle_env_install.sh
+```
+**GPU：**
+```
+# 启动 GPU Docker
+docker pull paddlepaddle/paddle:2.2.0-cuda10.2-cudnn7
+nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/paddle:2.2.0-cuda10.2-cudnn7 bash
+nvidia-docker exec -it test bash
+git clone https://github.com/PaddlePaddle/Serving
+
+# Paddle开发镜像需要执行以下脚本增加Serving所需依赖项
+bash Serving/tools/paddle_env_install.sh
+```
+## 2.安装Paddle Serving相关Python库

 安装所需的pip依赖
 ```
@@ -30,13 +60,13 @@ pip3 install -r python/requirements.txt
 ```

 ```shell
-pip3 install paddle-serving-client==0.6.2
-pip3 install paddle-serving-server==0.6.2 # CPU
-pip3 install paddle-serving-app==0.6.2
-pip3 install paddle-serving-server-gpu==0.6.2.post102 #GPU with CUDA10.2 + TensorRT7
+pip3 install paddle-serving-client==0.7.0
+pip3 install paddle-serving-server==0.7.0 # CPU
+pip3 install paddle-serving-app==0.7.0
+pip3 install paddle-serving-server-gpu==0.7.0.post102 #GPU with CUDA10.2 + TensorRT6
 # 其他GPU环境需要确认环境再选择执行哪一条
-pip3 install paddle-serving-server-gpu==0.6.2.post101 # GPU with CUDA10.1 + TensorRT6
-pip3 install paddle-serving-server-gpu==0.6.2.post11 # GPU with CUDA10.1 + TensorRT7
+pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
+pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
 ```

 您可能需要使用国内镜像源（例如清华源, 在pip命令中添加`-i https://pypi.tuna.tsinghua.edu.cn/simple`）来加速下载。
@@ -47,26 +77,28 @@ paddle-serving-server和paddle-serving-server-gpu安装包支持Centos 6/7, Ubun

 paddle-serving-client和paddle-serving-app安装包支持Linux和Windows，其中paddle-serving-client仅支持python3.6/3.7/3.8。

-**最新的0.6.2的版本，已经不支持Cuda 9.0和Cuda 10.0，Python已不支持2.7和3.5。**
-
-推荐安装2.1.0及以上版本的paddle
-
+## 3.安装Paddle相关Python库
+**当您使用`paddle_serving_client.convert`命令或者`Python Pipeline框架`时才需要安装。**
 ```
 # CPU环境请执行
-pip3 install paddlepaddle==2.1.0
+pip3 install paddlepaddle==2.2.0

 # GPU Cuda10.2环境请执行
-pip3 install paddlepaddle-gpu==2.1.0
+pip3 install paddlepaddle-gpu==2.2.0
 ```
+**注意**： 如果您的Cuda版本不是10.2，请勿直接执行上述命令，需要参考[Paddle-Inference官方文档-下载安装Linux预测库](https://paddleinference.paddlepaddle.org.cn/master/user_guides/download_lib.html#python)选择相应的GPU环境的url链接并进行安装。

-**注意**： 如果您的Cuda版本不是10.2，请勿直接执行上述命令，需要参考[Paddle官方文档-多版本whl包列表](https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-release)
-
-选择相应的GPU环境的url链接并进行安装，例如Cuda 10.1的Python3.6用户，请选择表格当中的`cp36-cp36m`和`cuda10.1-cudnn7-mkl-gcc8.2-avx-trt6.0.1.5`对应的url，复制下来并执行
+例如Cuda 10.1的Python3.6用户，请选择表格当中的`cp36-cp36m`和`linux-cuda10.1-cudnn7.6-trt6-gcc8.2`对应的url，复制下来并执行
 ```
-pip3 install https://paddle-wheel.bj.bcebos.com/with-trt/2.1.0-gpu-cuda10.1-cudnn7-mkl-gcc8.2/paddlepaddle_gpu-2.1.0.post101-cp36-cp36m-linux_x86_64.whl
+pip3 install https://paddle-inference-lib.bj.bcebos.com/2.2.0/python/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda10.1_cudnn7.6.5_trt6.0.1.5/paddlepaddle_gpu-2.2.0.post101-cp36-cp36m-linux_x86_64.whl
 ```
-由于默认的`paddlepaddle-gpu==2.1.0`是Cuda 10.2，并没有联编TensorRT，因此如果需要和在`paddlepaddle-gpu`上使用TensorRT，需要在上述多版本whl包列表当中，找到`cuda10.2-cudnn8.0-trt7.1.3`，下载对应的Python版本
-
-如果是其他环境和Python版本，请在表格中找到对应的链接并用pip安装。
+## 4.支持的镜像环境和说明
+|  环境                         |   Serving开发镜像Tag               |    操作系统      | Paddle开发镜像Tag       |  操作系统            |
+| :--------------------------: | :-------------------------------: | :-------------: | :-------------------: | :----------------: |
+|  CPU                         | 0.7.0-devel                       |  Ubuntu 16.04   | 2.2.0                 | Ubuntu 18.04.       |
+|  Cuda10.1+Cudnn7             | 0.7.0-cuda10.1-cudnn7-devel       |  Ubuntu 16.04   | 无                     | 无                 |
+|  Cuda10.2+Cudnn7             | 0.7.0-cuda10.2-cudnn7-devel       |  Ubuntu 16.04   | 2.2.0-cuda10.2-cudnn7 | Ubuntu 16.04        |
+|  Cuda10.2+Cudnn8             | 0.7.0-cuda10.2-cudnn8-devel       |  Ubuntu 16.04   | 无                    |  无                 |
+|  Cuda11.2+Cudnn8             | 0.7.0-cuda11.2-cudnn8-devel       |  Ubuntu 16.04   | 2.2.0-cuda11.2-cudnn8 | Ubuntu 18.04        | 

 对于**Windows 10 用户**，请参考文档[Windows平台使用Paddle Serving指导](Windows_Tutorial_CN.md)。
--- a/doc/Install_EN.md
+++ b/doc/Install_EN.md
@@ -2,78 +2,106 @@

 ([简体中文](./Install_CN.md)|English)

-We **highly recommend** you to **run Paddle Serving in Docker**, please visit [Run in Docker](Run_In_Docker_EN.md). See the [document](./Docker_Images_EN.md) for more docker images.
+We **strongly recommend** building **Paddle Serving** in Docker. For more images, see the [Docker Image List](Docker_Images_EN.md).

-**Attention:**: Currently, the default GPU environment of paddlepaddle 2.1 is Cuda 10.2, so the sample code of GPU Docker is based on Cuda 10.2. We also provides docker images and whl packages for other GPU environments. If users use other environments, they need to carefully check and select the appropriate version.
+**Tip-1**: This project only supports <mark>**Python3.6/3.7/3.8**</mark>. All subsequent Python/pip operations must use the correct Python version.

-**Attention:** the following so-called 'python' or 'pip' stands for one of Python 3.6/3.7/3.8.
+**Tip-2**: The GPU environments in the following examples are all cuda10.2-cudnn7. If you deploy with Python Pipeline and need Nvidia TensorRT to optimize prediction performance, please refer to [Supported Docker Images and Instruction](#4-supported-docker-images-and-instruction) to choose another version.

+## 1. Start the Docker Container
+<mark>**Both the Serving Dev Image and the Paddle Dev Image are supported. Choose one of the two options described in sections 1.1 and 1.2.**</mark>
+
+### 1.1 Serving Dev Images (choose either CPU or GPU)
+**CPU:**
 ```
-# Run CPU Docker
-docker pull registry.baidubce.com/paddlepaddle/serving:0.6.0-devel
-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.6.0-devel bash
+# Start CPU Docker Container
+docker pull paddlepaddle/serving:0.7.0-devel
+docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-devel bash
 docker exec -it test bash
 git clone https://github.com/PaddlePaddle/Serving
 ```
+**GPU:**
 ```
-# Run GPU Docker
-nvidia-docker pull registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-cudnn8-devel
-nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-cudnn8-devel bash
+# Start GPU Docker Container
+docker pull paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel
+nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/serving:0.7.0-cuda10.2-cudnn7-devel bash
 nvidia-docker exec -it test bash
 git clone https://github.com/PaddlePaddle/Serving
 ```
-install python dependencies
-```
-cd Serving
-pip install -r python/requirements.txt
+### 1.2 Paddle Dev Images (choose either CPU or GPU)
+**CPU:**
 ```
+# Start CPU Docker Container
+docker pull paddlepaddle/paddle:2.2.0
+docker run -p 9292:9292 --name test -dit paddlepaddle/paddle:2.2.0 bash
+docker exec -it test bash
+git clone https://github.com/PaddlePaddle/Serving

-```shell
-pip install paddle-serving-client==0.6.0
-pip install paddle-serving-server==0.6.0 # CPU
-pip install paddle-serving-app==0.6.0
-pip install paddle-serving-server-gpu==0.6.0.post102 #GPU with CUDA10.2 + TensorRT7
-# DO NOT RUN ALL COMMANDS! check your GPU env and select the right one
-pip install paddle-serving-server-gpu==0.6.0.post101 # GPU with CUDA10.1 + TensorRT6
-pip install paddle-serving-server-gpu==0.6.0.post11 # GPU with CUDA10.1 + TensorRT7
+# The Paddle dev image needs to run the following script to install the dependencies required by Serving
+bash Serving/tools/paddle_env_install.sh
+```
+**GPU:**
+```
+# Start GPU Docker
+docker pull paddlepaddle/paddle:2.2.0-cuda10.2-cudnn7
+nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/paddle:2.2.0-cuda10.2-cudnn7 bash
+nvidia-docker exec -it test bash
+git clone https://github.com/PaddlePaddle/Serving

+# The Paddle dev image needs to run the following script to install the dependencies required by Serving
+bash Serving/tools/paddle_env_install.sh
+```

-You may need to use a domestic mirror source (in China, you can use the Tsinghua mirror source, add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to pip command) to speed up the download.
+## 2. Install Paddle Serving related whl Packages

-If you need install modules compiled with develop branch, please download packages from [latest packages list](./Latest_Packages_CN.md) and install with `pip install` command. If you want to compile by yourself, please refer to [How to compile Paddle Serving?](Compile_EN.md)
+Install the required pip dependencies
+```
+cd Serving
+pip3 install -r python/requirements.txt
+```

-Packages of paddle-serving-server and paddle-serving-server-gpu support Centos 6/7, Ubuntu 16/18, Windows 10.
+```shell
+pip3 install paddle-serving-client==0.7.0
+pip3 install paddle-serving-server==0.7.0 # CPU
+pip3 install paddle-serving-app==0.7.0
+pip3 install paddle-serving-server-gpu==0.7.0.post102 #GPU with CUDA10.2 + TensorRT6
+# For other GPU environments, confirm your environment first and run only the matching command
+pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
+pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
+```

-Packages of paddle-serving-client and paddle-serving-app support Linux and Windows, but paddle-serving-client only support python3.6/3.7/3.8.
+If you are in China, you may need to use a Chinese mirror source (such as the Tsinghua source: add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.

-**For latest version, Cuda 9.0 or Cuda 10.0 are no longer supported, Python2.7/3.5 is no longer supported.**
+If you need an installation package compiled from the develop branch, get the download address from the [Latest installation package list](./Latest_Packages_CN.md) and install it with the `pip install` command. If you want to compile it yourself, please refer to the [Paddle Serving Compilation Document](./Compile_EN.md).

-Recommended to install paddle >= 2.1.0
+The paddle-serving-server and paddle-serving-server-gpu installation packages support CentOS 6/7, Ubuntu 16/18 and Windows 10.

+The paddle-serving-client and paddle-serving-app installation packages support Linux and Windows; paddle-serving-client only supports Python 3.6/3.7/3.8.

+## 3. Install Paddle related Python libraries
+**You only need to install it when you use the `paddle_serving_client.convert` command or the `Python Pipeline framework`.**
 ```
-# CPU users, please run
-pip install paddlepaddle==2.1.0
+# CPU environment please execute
+pip3 install paddlepaddle==2.2.0

-# GPU Cuda10.2 please run
-pip install paddlepaddle-gpu==2.1.0 
+# GPU Cuda10.2 environment please execute
+pip3 install paddlepaddle-gpu==2.2.0
 ```
+**Note**: If your Cuda version is not 10.2, do not run the above commands directly. Refer to [Paddle-Inference official document-download and install the Linux prediction library](https://paddleinference.paddlepaddle.org.cn/master/user_guides/download_lib.html#python), select the URL of the corresponding GPU environment, and install it.

-**Note**: If your Cuda version is not 10.2, please do not execute the above commands directly, you need to refer to [Paddle official documentation-multi-version whl package list
-](https://www.paddlepaddle.org.cn/documentation/docs/en/install/Tables_en.html#multi-version-whl-package-list-release)
-
-Select the url link of the corresponding GPU environment and install it. For example, for Python3.6 users of Cuda 10.1, please select `cp36-cp36m` and
-The url corresponding to `cuda10.1-cudnn7-mkl-gcc8.2-avx-trt6.0.1.5`, copy it and run
+For example, for Python3.6 users of Cuda 10.1, please select the URL corresponding to `cp36-cp36m` and `linux-cuda10.1-cudnn7.6-trt6-gcc8.2` in the table, copy it and execute
 ```
-pip install https://paddle-wheel.bj.bcebos.com/with-trt/2.1.0-gpu-cuda10.1-cudnn7-mkl-gcc8.2/paddlepaddle_gpu-2.1.0.post101-cp36-cp36m-linux_x86_64.whl
+pip3 install https://paddle-inference-lib.bj.bcebos.com/2.2.0/python/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda10.1_cudnn7.6.5_trt6.0.1.5/paddlepaddle_gpu-2.2.0.post101-cp36-cp36m-linux_x86_64.whl
 ```
+## 4. Supported Docker Images and Instruction

-the default `paddlepaddle-gpu==2.1.0` is Cuda 10.2 with no TensorRT. If you want to install PaddlePaddle with TensorRT. please also check the documentation-multi-version whl package list and find key word `cuda10.2-cudnn8.0-trt7.1.3`. 
-
-If it is other environment and Python version, please find the corresponding link in the table and install it with pip.
-
-For **Windows Users**, please read the document [Paddle Serving for Windows Users](Windows_Tutorial_EN.md)

-<h2 align="center">Quick Start Example</h2>
+| Environment | Serving Development Image Tag | Operating System | Paddle Development Image Tag | Operating System |
+| :--------------------------: | :-------------------------------: | :-------------: | :-------------------: | :----------------: |
+|  CPU                         | 0.7.0-devel                       |  Ubuntu 16.04   | 2.2.0                 | Ubuntu 18.04        |
+|  Cuda10.1+Cudnn7             | 0.7.0-cuda10.1-cudnn7-devel       |  Ubuntu 16.04   | None                  | None                |
+|  Cuda10.2+Cudnn7             | 0.7.0-cuda10.2-cudnn7-devel       |  Ubuntu 16.04   | 2.2.0-cuda10.2-cudnn7 | Ubuntu 16.04        |
+|  Cuda10.2+Cudnn8             | 0.7.0-cuda10.2-cudnn8-devel       |  Ubuntu 16.04   | None                  | None                |
+|  Cuda11.2+Cudnn8             | 0.7.0-cuda11.2-cudnn8-devel       |  Ubuntu 16.04   | 2.2.0-cuda11.2-cudnn8 | Ubuntu 18.04        |

-This quick start example is mainly for those users who already have a model to deploy, and we also provide a model that can be used for deployment. in case if you want to know how to complete the process from offline training to online service, please refer to the AiStudio tutorial above.
+For **Windows 10 users**, please refer to the document [Paddle Serving Guide for Windows Platform](Windows_Tutorial_EN.md).
--- a/doc/Latest_Packages_CN.md
+++ b/doc/Latest_Packages_CN.md
@@ -10,14 +10,15 @@ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server-0.0.0-py
 ## GPU server
 ### Python 3
 ```
-#cuda10.1 with TensorRT 6, Compile by gcc8.2
+#cuda10.1 Cudnn 7 with TensorRT 6, Compile by gcc8.2
 https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post101-py3-none-any.whl
-#cuda10.2 with TensorRT 7, Compile by gcc8.2
+#cuda10.2 Cudnn 7 with TensorRT 6, Compile by gcc5.4
 https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post102-py3-none-any.whl
-#cuda11.0 with TensorRT 7 (beta), Compile by gcc8.2
-https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post11-py3-none-any.whl
+#cuda10.2 Cudnn 8 with TensorRT 7, Compile by gcc8.2
+https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post1028-py3-none-any.whl
+#cuda11.2 Cudnn 8 with TensorRT 8 (beta), Compile by gcc8.2
+https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post112-py3-none-any.whl
 ```
-**Tips:**  If you want to use CPU server and GPU server at the same time, you should check the gcc version,  only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).

 ## Client

@@ -48,16 +49,16 @@ for kunlun user who uses arm-xpu or x86-xpu can download the wheel packages as f

 for arm kunlun user
 ```
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_server_xpu-0.6.0.post2-cp36-cp36m-linux_aarch64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_client-0.6.0-cp36-cp36m-linux_aarch64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_app-0.6.0-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_client-0.7.0-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_app-0.7.0-cp36-cp36m-linux_aarch64.whl
 ```
 
 for x86 kunlun user
 ``` 
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_server_xpu-0.6.0.post2-cp36-cp36m-linux_x86_64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_client-0.6.0-cp36-cp36m-linux_x86_64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_app-0.6.0-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_client-0.7.0-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_app-0.7.0-cp36-cp36m-linux_x86_64.whl
 ```


@@ -74,10 +75,12 @@ https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-cpu-avx-openblas-0.0.0
 https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-cpu-noavx-openblas-0.0.0.tar.gz
 # Cuda 10.1
 https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-101-0.0.0.tar.gz
-# Cuda 10.2
+# Cuda 10.2 + Cudnn 7
 https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-102-0.0.0.tar.gz
-# Cuda 11
-https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-cuda11-0.0.0.tar.gz
+# Cuda 10.2 + Cudnn 8
+https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-1028-0.0.0.tar.gz
+# Cuda 11.2
+https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-cuda112-0.0.0.tar.gz
 ```

 #### How to setup SERVING_BIN offline?

--- a/doc/Python_Pipeline/Pipeline_Design_CN.md
+++ b/doc/Python_Pipeline/Pipeline_Design_CN.md
@@ -20,7 +20,7 @@ Paddle Serving提供了用户友好的多模型组合服务编程框架，Pipeli
 Server端基于<b>RPC服务层</b>和<b>图执行引擎</b>构建，两者的关系如下图所示。

 <div align=center>
-<img src='images/pipeline_serving-image1.png' height = "250" align="middle"/>
+<img src='../images/pipeline_serving-image1.png' height = "250" align="middle"/>
 </div>

 </n>
@@ -65,7 +65,7 @@ Response中`err_no`和`err_msg`表达处理结果的正确性和错误信息，`
 - 对于 OP 之间需要传输过大数据的情况，可以考虑 RAM DB 外存进行全局存储，通过在 Channel 中传递索引的 Key 来进行数据传输

 <div align=center>
-<img src='images/pipeline_serving-image2.png' height = "300" align="middle"/>
+<img src='../images/pipeline_serving-image2.png' height = "300" align="middle"/>
 </div>


@@ -84,7 +84,7 @@ Response中`err_no`和`err_msg`表达处理结果的正确性和错误信息，`
 - 下图为图执行引擎中 Channel 的设计，采用 input buffer 和 output buffer 进行多 OP 输入或多 OP 输出的数据对齐，中间采用一个 Queue 进行缓冲

 <div align=center>
-<img src='images/pipeline_serving-image3.png' height = "500" align="middle"/>
+<img src='../images/pipeline_serving-image3.png' height = "500" align="middle"/>
 </div>

 #### <b>1.2.3 预测类型的设计</b>
@@ -319,7 +319,7 @@ class ResponseOp(Op):
 所有Pipeline示例在[examples/Pipeline/](../../examples/Pipeline) 目录下，目前有7种类型模型示例：
 - [PaddleClas](../../examples/Pipeline/PaddleClas) 
 - [Detection](../../examples/Pipeline/PaddleDetection)  
- [bert](../../examples/Pipeline/bert)
+- [bert](../../examples/Pipeline/PaddleNLP/bert)
 - [imagenet](../../examples/Pipeline/PaddleClas/imagenet)
 - [imdb_model_ensemble](../../examples/Pipeline/imdb_model_ensemble)
 - [ocr](../../examples/Pipeline/PaddleOCR/ocr)
@@ -328,7 +328,7 @@ class ResponseOp(Op):
 以 imdb_model_ensemble 为例来展示如何使用 Pipeline Serving，相关代码在 `Serving/examples/Pipeline/imdb_model_ensemble` 文件夹下可以找到，例子中的 Server 端结构如下图所示：

 <div align=center>
-<img src='images/pipeline_serving-image4.png' height = "200" align="middle"/>
+<img src='../images/pipeline_serving-image4.png' height = "200" align="middle"/>
 </div>

 ### 3.1 Pipeline部署需要的文件

--- a/doc/Python_Pipeline/Pipeline_Design_EN.md
+++ b/doc/Python_Pipeline/Pipeline_Design_EN.md
@@ -18,7 +18,7 @@ Paddle Serving provides a user-friendly programming framework for multi-model co
 The Server side is built based on <b>RPC Service</b> and <b>graph execution engine</b>. The relationship between them is shown in the following figure.

 <div align=center>
-<img src='images/pipeline_serving-image1.png' height = "250" align="middle"/>
+<img src='../images/pipeline_serving-image1.png' height = "250" align="middle"/>
 </div>

 ### 1.1 RPC Service
@@ -61,7 +61,7 @@ The graph execution engine consists of OPs and Channels, and the connected OPs s
 - For cases where large data needs to be transferred between OPs, consider RAM DB external memory for global storage and data transfer by passing index keys in Channel.

 <div align=center>
-<img src='images/pipeline_serving-image2.png' height = "300" align="middle"/>
+<img src='../images/pipeline_serving-image2.png' height = "300" align="middle"/>
 </div>


@@ -80,7 +80,7 @@ The graph execution engine consists of OPs and Channels, and the connected OPs s
 - The following illustration shows the design of Channel in the graph execution engine, using input buffer and output buffer to align data between multiple OP inputs and multiple OP outputs, with a queue in the middle to buffer.

 <div align=center>
-<img src='images/pipeline_serving-image3.png' height = "500" align="middle"/>
+<img src='../images/pipeline_serving-image3.png' height = "500" align="middle"/>
 </div>


@@ -314,7 +314,7 @@ The default implementation of **pack_response_package** is to convert the dictio
 All examples of pipelines are in [examples/pipeline/](../../examples/Pipeline) directory, There are 7 types of model examples currently:
 - [PaddleClas](../../examples/Pipeline/PaddleClas) 
 - [Detection](../../examples/Pipeline/PaddleDetection)  
- [bert](../../examples/Pipeline/bert)
+- [bert](../../examples/Pipeline/PaddleNLP/bert)
 - [imagenet](../../examples/Pipeline/PaddleClas/imagenet)
 - [imdb_model_ensemble](../../examples/Pipeline/imdb_model_ensemble)
 - [ocr](../../examples/Pipeline/PaddleOCR/ocr)
@@ -323,7 +323,7 @@ All examples of pipelines are in [examples/pipeline/](../../examples/Pipeline) d
 Here, we build a simple imdb model enable example to show how to use Pipeline Serving. The relevant code can be found in the `Serving/examples/Pipeline/imdb_model_ensemble` folder. The Server-side structure in the example is shown in the following figure:

 <div align=center>
-<img src='images/pipeline_serving-image4.png' height = "200" align="middle"/>
+<img src='../images/pipeline_serving-image4.png' height = "200" align="middle"/>
 </div>

 ### 3.1 Files required for pipeline deployment

--- a/doc/Quick_Start_CN.md
+++ b/doc/Quick_Start_CN.md
@@ -8,7 +8,7 @@

 进入到Serving的git目录下，进入到`fit_a_line`例子
 ``` shell
-cd Serving/python/examples/fit_a_line
+cd Serving/examples/C++/fit_a_line
 sh get_data.sh
 ```


--- a/doc/Run_In_Docker_CN.md
+++ b/doc/Run_In_Docker_CN.md
-# 如何在Docker中运行PaddleServing
-
-(简体中文|[English](Run_In_Docker_EN.md))
-
-Docker最大的好处之一就是可移植性，可在多种操作系统和主流的云计算平台部署。使用Paddle Serving Docker镜像可在Linux、Mac和Windows平台部署。
-
-## 环境要求
-
-Docker（GPU版本需要在GPU机器上安装nvidia-docker）
-
-该文档以Python2为例展示如何在Docker中运行Paddle Serving，您也可以通过将`python`更换成`python3`来用Python3运行相关命令。
-
-## CPU版本
-
-### 获取镜像
-
-参考[该文档](Docker_Images_CN.md)获取镜像：
-
-以CPU编译镜像为例
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-devel
-```
-
-### 创建容器并进入
-
-```bash
-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-devel
-docker exec -it test bash
-```
-
-`-p`选项是为了将容器的`9292`端口映射到宿主机的`9292`端口。
-
-### 安装PaddleServing
-
-镜像里自带对应镜像tag版本的`paddle_serving_server`，`paddle_serving_client`，`paddle_serving_app`，如果用户不需要更改版本，可以直接使用，适用于没有外网服务的环境。
-
-如果需要更换版本，请参照首页的指导，下载对应版本的pip包。
-
-## GPU 版本
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-```
-
-### 创建容器并进入
-
-```bash
-nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-nvidia-docker exec -it test bash
-```
-或者
-```bash
-docker run --gpus all -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-docker exec -it test bash
-```
-
-`-p`选项是为了将容器的`9292`端口映射到宿主机的`9292`端口。
-
-### 安装PaddleServing
-
-请参照首页的指导，下载对应版本的pip包。[最新安装包合集](Latest_Packages_CN.md)
-
-## 注意事项
-
- 运行时镜像不能用于开发编译。如果想要从源码编译，请查看[如何编译PaddleServing](Compile_CN.md)。
- 由于Cuda10和Cuda9的环境受限于GCC版本，无法同时运行CPU版本的`paddle_serving_server`，因此如果想要在GPU环境中同时使用CPU版本的`paddle_serving_server`，请选择Cuda10.1，Cuda10.2和Cuda11版本的镜像。
--- a/doc/Run_In_Docker_EN.md
+++ b/doc/Run_In_Docker_EN.md
-# How to run PaddleServing in Docker
-
-([简体中文](Run_In_Docker_CN.md)|English)
-
-One of the biggest benefits of Docker is portability, which can be deployed on multiple operating systems and mainstream cloud computing platforms. The Paddle Serving Docker image can be deployed on Linux, Mac and Windows platforms.
-
-## Requirements
-
-Docker (GPU version requires nvidia-docker to be installed on the GPU machine)
-
-This document takes Python2 as an example to show how to run Paddle Serving in docker. You can also use Python3 to run related commands by replacing `python` with `python3`.
-
-## CPU
-
-### Get docker image
-
-Refer to [this document](Docker_Images_EN.md) for a docker image:
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-devel
-```
-
-
-### Create container
-
-```bash
-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-devel
-docker exec -it test bash
-```
-
-The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
-
-### Install PaddleServing
-
-Please refer to the instructions on the homepage to download the pip package of the corresponding version.
-  
-
-## GPU
-
-The GPU version is basically the same as the CPU version, with only some differences in interface naming (GPU version requires nvidia-docker to be installed on the GPU machine).
-
-### Get docker image
-
-Refer to [this document](Docker_Images_EN.md) for a docker image, the following is an example of an `cuda9.0-cudnn7` image:
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-```
-
-### Create container
-
-```bash
-nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-nvidia-docker exec -it test bash
-```
-
-or
-
-```bash
-docker run --gpus all -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-docker exec -it test bash
-```
-
-The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
-
-### Install PaddleServing
-
-The mirror comes with `paddle_serving_server_gpu`, `paddle_serving_client`, and `paddle_serving_app` corresponding to the mirror tag version. If users don’t need to change the version, they can use it directly, which is suitable for environments without extranet services.
-
-If you need to change the version, please refer to the instructions on the homepage to download the pip package of the corresponding version. [LATEST_PACKAGES](./Latest_Packages_CN.md)
-
-## Precautious
-
- Runtime images cannot be used for compilation. If you want to compile from source, refer to [COMPILE](Compile_EN.md).
- If you use Cuda9 and Cuda10 docker images, you cannot use `paddle_serving_server` CPU version at the same time, due to the limitation of gcc version. If you want to use both in one docker image, please choose images of Cuda10.1, Cuda10.2 and Cuda11.
--- a/doc/Serving_Design_CN.md
+++ b/doc/Serving_Design_CN.md
@@ -55,7 +55,7 @@ Paddle Serving从做顶层设计时考虑到不同团队在工业级场景中会
 > 跨平台运行

 跨平台是不依赖于操作系统，也不依赖硬件环境。一个操作系统下开发的应用，放到另一个操作系统下依然可以运行。因此，设计上既要考虑开发语言、组件是跨平台的，同时也要考虑不同系统上编译器的解释差异。
-Docker 是一个开源的应用容器引擎，让开发者可以打包他们的应用以及依赖包到一个可移植的容器中，然后发布到任何流行的Linux机器或Windows机器上。我们将Paddle Serving框架打包了多种Docker镜像，镜像列表参考《[Docker镜像](./Docker_Images_CN.md)》，根据用户的使用场景选择镜像。为方便用户使用Docker，我们提供了帮助文档《[如何在Docker中运行PaddleServing](./Run_In_Dokcer_CN.md)》。目前，Python webservice模式可在原生系统Linux和Windows双系统上部署运行。《[Windows平台使用Paddle Serving指导](./Windows_Tutorial_CN.md)》
+Docker 是一个开源的应用容器引擎，让开发者可以打包他们的应用以及依赖包到一个可移植的容器中，然后发布到任何流行的Linux机器或Windows机器上。我们将Paddle Serving框架打包了多种Docker镜像，镜像列表参考《[Docker镜像](./Docker_Images_CN.md)》，根据用户的使用场景选择镜像。为方便用户使用Docker，我们提供了帮助文档《[如何在Docker中运行PaddleServing](./Install_CN.md)》。目前，Python webservice模式可在原生系统Linux和Windows双系统上部署运行。《[Windows平台使用Paddle Serving指导](./Windows_Tutorial_CN.md)》

 > 支持多种开发语言SDK

@@ -132,7 +132,7 @@ Paddle Serving采用对称加密算法对模型进行加密，在服务加载模

 ### 3.5 A/B Test

-在对模型进行充分的离线评估后，通常需要进行在线A/B测试，来决定是否大规模上线服务。下图为使用Paddle Serving做A/B测试的基本结构，Client端做好相应的配置后，自动将流量分发给不同的Server，从而完成A/B测试。具体例子请参考《[如何使用Paddle Serving做ABTEST](./C++_Serving/ABTEST_CN.md)》。
+在对模型进行充分的离线评估后，通常需要进行在线A/B测试，来决定是否大规模上线服务。下图为使用Paddle Serving做A/B测试的基本结构，Client端做好相应的配置后，自动将流量分发给不同的Server，从而完成A/B测试。具体例子请参考《[如何使用Paddle Serving做ABTEST](./C++_Serving/ABTest_CN.md)》。

 <p align="center">
    <br>

--- a/doc/Serving_Design_EN.md
+++ b/doc/Serving_Design_EN.md
@@ -53,7 +53,7 @@ Paddle Serving takes into account a series of issues such as different operating

 Cross-platform is not dependent on the operating system, nor on the hardware environment. Applications developed under one operating system can still run under another operating system. Therefore, the design should consider not only the development language and the cross-platform components, but also the interpretation differences of the compilers on different systems.

-Docker is an open source application container engine that allows developers to package their applications and dependencies into a portable container, and then publish it to any popular Linux machine or Windows machine. We have packaged a variety of Docker images for the Paddle Serving framework. Refer to the image list《[Docker Images](Docker_Images_EN.md)》, Select mirrors according to user's usage. We provide Docker usage documentation《[How to run PaddleServing in Docker](Run_In_Docker_EN.md)》.Currently, the Python webservice mode can be deployed and run on the native Linux and Windows dual systems.《[Paddle Serving for Windows Users](Windows_Tutorial_EN.md)》
+Docker is an open source application container engine that allows developers to package their applications and dependencies into a portable container, and then publish it to any popular Linux machine or Windows machine. We have packaged a variety of Docker images for the Paddle Serving framework. Refer to the image list《[Docker Images](Docker_Images_EN.md)》, Select mirrors according to user's usage. We provide Docker usage documentation《[How to run PaddleServing in Docker](Install_EN.md)》.Currently, the Python webservice mode can be deployed and run on the native Linux and Windows dual systems.《[Paddle Serving for Windows Users](Windows_Tutorial_EN.md)》

 > Support multiple development languages client SDKs


--- a/python/CMakeLists.txt
+++ b/python/CMakeLists.txt
@@ -73,7 +73,7 @@ if (SERVER)
      set(VERSION_SUFFIX 101)
    elseif(CUDA_VERSION EQUAL 10.2)
      if(CUDNN_MAJOR_VERSION EQUAL 7)
-        set(VERSION_SUFFIX 1027)
+        set(VERSION_SUFFIX 102)
      elseif(CUDNN_MAJOR_VERSION EQUAL 8)
        set(VERSION_SUFFIX 1028)
      endif()

--- a/python/paddle_serving_server/server.py
+++ b/python/paddle_serving_server/server.py
@@ -429,7 +429,7 @@ class Server(object):
        if device_type == "0":
            device_version = self.get_device_version()
        elif device_type == "1":
-            if version_suffix == "101" or version_suffix == "1027" or version_suffix == "1028" or version_suffix == "112":
+            if version_suffix == "101" or version_suffix == "102" or version_suffix == "1028" or version_suffix == "112":
                device_version = "gpu-" + version_suffix
            else:
                device_version = "gpu-cuda" + version_suffix
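
Review note: the branch in the `server.py` hunk above can be read as a small lookup. The sketch below is a standalone illustration, not the project's code: the helper name `gpu_device_version` and the set `KNOWN_GPU_SUFFIXES` are hypothetical, and only the suffix values (`101`, `102`, `1028`, `112`) and the two string prefixes come from the diff.

```python
# Hypothetical helper mirroring the GPU branch shown in the server.py hunk.
# The suffix values are copied from the diff; the names are illustrative only.
KNOWN_GPU_SUFFIXES = {"101", "102", "1028", "112"}


def gpu_device_version(version_suffix: str) -> str:
    """Return the device-version string for a GPU build suffix."""
    if version_suffix in KNOWN_GPU_SUFFIXES:
        return "gpu-" + version_suffix
    # Any other suffix falls through to the "gpu-cuda" prefix.
    return "gpu-cuda" + version_suffix


if __name__ == "__main__":
    for suffix in ("101", "102", "1028", "112", "1027"):
        print(suffix, "->", gpu_device_version(suffix))
```

Under this reading, a wheel still built with the old `1027` suffix would no longer match the listed suffixes and would fall through to the `gpu-cuda` prefix, which is why the CMake and Python changes need to land together.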

--- a/tools/paddle_env_install.sh
+++ b/tools/paddle_env_install.sh
+unset GREP_OPTIONS
+
+function install_trt(){
+  CUDA_VERSION=$(nvcc --version | egrep -o "V[0-9]+\.[0-9]+" | cut -c2-)
+  if [ "$CUDA_VERSION" == "10.2" ]; then
+    wget https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda10.2-cudnn7.tar.gz --no-check-certificate
+    tar -zxf TensorRT6-cuda10.2-cudnn7.tar.gz -C /usr/local
+    cp -rf /usr/local/TensorRT-6.0.1.8/include/*  /usr/include/ && cp -rf /usr/local/TensorRT-6.0.1.8/lib/* /usr/lib/
+    rm -rf TensorRT6-cuda10.2-cudnn7.tar.gz
+  elif [ "$CUDA_VERSION" == "11.2" ]; then
+    wget https://paddle-ci.gz.bcebos.com/TRT/TensorRT-8.0.3.4.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz --no-check-certificate
+    tar -zxf TensorRT-8.0.3.4.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz -C /usr/local
+    cp -rf /usr/local/TensorRT-8.0.3.4/include/* /usr/include/ && cp -rf /usr/local/TensorRT-8.0.3.4/lib/* /usr/lib/
+    rm -rf TensorRT-8.0.3.4.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz
+  else
+    echo "No Cuda Found, no need to install TensorRT"
+  fi
+}
+
+function env_install()
+{
+    apt install -y libcurl4-openssl-dev libbz2-dev
+    wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && tar xf centos_ssl.tar && rm -rf centos_ssl.tar && mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k && mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k && ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10 && ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10 && ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so && ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
+    rm -rf /usr/local/go && wget -qO- https://paddle-ci.gz.bcebos.com/go1.15.12.linux-amd64.tar.gz | \
+    tar -xz -C /usr/local && \
+    mkdir /root/go && \
+    mkdir /root/go/bin && \
+    mkdir /root/go/src && \
+    echo "GOROOT=/usr/local/go" >> /root/.bashrc && \
+    echo "GOPATH=/root/go" >> /root/.bashrc && \
+    echo "PATH=/usr/local/go/bin:/root/go/bin:$PATH" >> /root/.bashrc
+    install_trt
+}
+
+env_install
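
Review note: the version detection in `install_trt` can be checked offline without `nvcc` installed. The following is a minimal sketch, assuming `nvcc --version` prints a token such as `V10.2.89`; the function names and the `TENSORRT_ARCHIVES` mapping are hypothetical, while the archive file names are copied from the script above.

```python
import re
from typing import Optional

# Archive names copied from the script; the dictionary itself is illustrative.
TENSORRT_ARCHIVES = {
    "10.2": "TensorRT6-cuda10.2-cudnn7.tar.gz",
    "11.2": "TensorRT-8.0.3.4.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz",
}


def parse_cuda_version(nvcc_output: str) -> Optional[str]:
    """Extract 'major.minor' from text containing a token like 'V10.2.89'."""
    match = re.search(r"V(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def tensorrt_archive_for(nvcc_output: str) -> Optional[str]:
    """Return the TensorRT archive name for the detected CUDA version, if any."""
    version = parse_cuda_version(nvcc_output)
    if version is None:
        return None
    return TENSORRT_ARCHIVES.get(version)


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 10.2, V10.2.89"
    print(parse_cuda_version(sample), tensorrt_archive_for(sample))
```

The regular expression escapes the dot so that only a literal `major.minor` pair matches, and an empty or unrecognised version yields `None` instead of an error, which corresponds to the script's final `else` branch.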