diff --git a/core/general-client/README_CN.md b/core/general-client/README_CN.md
index d391ed8612b5296843b7b0dfadf951a699c9dfa5..60e1d0846a91c0b8def2a3908b990426c3516a9a 100755
--- a/core/general-client/README_CN.md
+++ b/core/general-client/README_CN.md
@@ -9,7 +9,7 @@
 以fit_a_line模型为例,服务端启动与常规BRPC-Server端启动命令一样。
 
 ```
-cd ../../python/examples/fit_a_line
+cd ../../examples/C++/fit_a_line
 sh get_data.sh
 python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393
 ```
diff --git a/doc/C++_Serving/ABTest_CN.md b/doc/C++_Serving/ABTest_CN.md
index d3aa73103ba79f1d55d7b6383f4e7a0233e26146..34054d449b020c695e11e4d9e7212c76bd24fe70 100755
--- a/doc/C++_Serving/ABTest_CN.md
+++ b/doc/C++_Serving/ABTest_CN.md
@@ -30,7 +30,7 @@ pip install Shapely
 
 ### 启动Server端
 
-这里采用[Docker方式](../Run_In_Docker_CN.md)启动Server端服务。
+这里采用[Docker方式](../Install_CN.md)启动Server端服务。
 
 首先启动BOW Server,该服务启用`8000`端口:
 
diff --git a/doc/Docker_Images_CN.md b/doc/Docker_Images_CN.md
index 183e046e067962842e868a997a7e12433e81ed8e..1d8015ec8e45c388c6161e417cc332b7bb9f4086 100644
--- a/doc/Docker_Images_CN.md
+++ b/doc/Docker_Images_CN.md
@@ -8,10 +8,10 @@
 
 您可以通过两种方式获取镜像。
 
-1. 通过 TAG 直接从 `registry.baidubce.com ` 或 拉取镜像,具体TAG请参见下文的**镜像说明**章节的表格。
+1. 通过 TAG 直接从 dockerhub 或 `registry.baidubce.com` 拉取镜像,具体TAG请参见下文的**镜像说明**章节的表格。
 
   ```shell
-  docker pull registry.baidubce.com/paddlepaddle/serving:<TAG> # registry.baidubce.com
+  docker pull paddlepaddle/serving:<TAG> # 如果连接dockerhub网速不佳可以尝试registry.baidubce.com/paddlepaddle/serving:<TAG>
   ```
 
 2. 基于 Dockerfile 构建镜像
 
   建立新目录,复制对应 Dockerfile 内容到该目录下 Dockerfile 文件。执行
 
   ```shell
-  cd tools
-  docker build -f ${DOCKERFILE} -t <image_name>:<image_tag> .
+  docker build -f tools/${DOCKERFILE} -t <image_name>:<image_tag> .
   ```
 
 ## 镜像说明
 
-运行时镜像不能用于开发编译。
 若需要基于源代码二次开发编译,请使用后缀为-devel的版本。
-**在TAG列,latest也可以替换成对应的版本号,例如0.5.0/0.4.1等,但需要注意的是,部分开发环境随着某个版本迭代才增加,因此并非所有环境都有对应的版本号可以使用。**
+**在TAG列,0.7.0也可以替换成对应的版本号,例如0.5.0/0.4.1等,但需要注意的是,部分开发环境随着某个版本迭代才增加,因此并非所有环境都有对应的版本号可以使用。**
-**cuda10.1-cudnn7-gcc54环境尚未同步到镜像仓库,如果您需要相关镜像请运行相关dockerfile**
 
 | 镜像选择 | 操作系统 | TAG | Dockerfile |
 | :----------------------------------------------------------: | :-----: | :--------------------------: | :----------------------------------------------------------: |
-| CPU development | Ubuntu16 | latest-devel | [Dockerfile.devel](../tools/Dockerfile.devel) |
-| GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development | Ubuntu16 | latest-cuda10.1-cudnn7-gcc54-devel (not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
-| GPU (cuda10.1-cudnn7-tensorRT6) development | Ubuntu16 | latest-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
-| GPU (cuda10.2-cudnn8-tensorRT7) development | Ubuntu16 | latest-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
-| GPU (cuda11.2-cudnn8-tensorRT7) development | Ubuntu18 | latest-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
+| CPU development | Ubuntu16 | 0.7.0-devel | [Dockerfile.devel](../tools/Dockerfile.devel) |
+| GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-gcc54-devel (not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
+| GPU (cuda10.1-cudnn7-tensorRT6) development | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
+| GPU (cuda10.2-cudnn7-tensorRT6) development | Ubuntu16 | 0.7.0-cuda10.2-cudnn7-devel | [Dockerfile.cuda10.2-cudnn7.devel](../tools/Dockerfile.cuda10.2-cudnn7.devel) |
+| GPU (cuda10.2-cudnn8-tensorRT7) development | Ubuntu16 | 0.7.0-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
+| GPU (cuda11.2-cudnn8-tensorRT8) development | Ubuntu16 | 0.7.0-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
 
 **Java镜像:**
 ```
@@ -63,38 +61,24 @@ registry.baidubce.com/paddlepaddle/serving:xpu-x86 # for x86 xpu user
 
 # (附录)所有镜像列表
 
-编译镜像:
 
 开发镜像:
 
 | Env | Version | Docker images tag | OS | Gcc Version |
 |----------|---------|------------------------------|-----------|-------------|
-| CPU | >=0.5.0 | 0.6.2-devel | Ubuntu 16 | 8.2.0 |
+| CPU | >=0.5.0 | 0.7.0-devel | Ubuntu 16 | 8.2.0 |
 | | <=0.4.0 | 0.4.0-devel | CentOS 7 | 4.8.5 |
-| Cuda10.1 | >=0.5.0 | 0.6.2-cuda10.1-cudnn7-devel | Ubuntu 16 | 8.2.0 |
-| | <=0.4.0 | 0.6.2-cuda10.1-cudnn7-devel | CentOS 7 | 4.8.5 |
-| Cuda10.2 | >=0.5.0 | 0.6.2-cuda10.2-cudnn8-devel | Ubuntu 16 | 8.2.0 |
+| Cuda10.1 | >=0.5.0 | 0.7.0-cuda10.1-cudnn7-devel | Ubuntu 16 | 8.2.0 |
+| | <=0.4.0 | 0.4.0-cuda10.1-cudnn7-devel | CentOS 7 | 4.8.5 |
+| Cuda10.2+Cudnn7 | >=0.5.0 | 0.7.0-cuda10.2-cudnn7-devel | Ubuntu 16 | 8.2.0 |
 | | <=0.4.0 | Nan | Nan | Nan |
-| Cuda11.0 | >=0.5.0 | 0.6.2-cuda11.0-cudnn8-devel | Ubuntu 18 | 8.2.0 |
+| Cuda10.2+Cudnn8 | >=0.5.0 | 0.7.0-cuda10.2-cudnn8-devel | Ubuntu 16 | 8.2.0 |
+| | <=0.4.0 | Nan | Nan | Nan |
+| Cuda11.2 | >=0.5.0 | 0.7.0-cuda11.2-cudnn8-devel | Ubuntu 16 | 8.2.0 |
 | | <=0.4.0 | Nan | Nan | Nan |
 
 运行镜像:
 
 运行镜像比开发镜像更加轻量化, 运行镜像提供了serving的whl和bin,但为了运行期更小的镜像体积,没有提供诸如cmake这样但开发工具。 如果您想了解有关信息,请检查文档[在Kubernetes上使用Paddle Serving](./Run_On_Kubernetes_CN.md)。
 
-| ENV | Python Version | Tag |
-|------------------------------------------|----------------|-----------------------------|
-| cpu | 3.6 | 0.6.2-py36-runtime |
-| cpu | 3.7 | 0.6.2-py37-runtime |
-| cpu | 3.8 | 0.6.2-py38-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.6 | 0.6.2-cuda10.1-py36-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.7 | 0.6.2-cuda10.1-py37-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.8 | 0.6.2-cuda10.1-py38-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.6 | 0.6.2-cuda10.2-py36-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.7 | 0.6.2-cuda10.2-py37-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.8 | 0.6.2-cuda10.2-py38-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3 | 3.6 | 0.6.2-cuda11-py36-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3 | 3.7 | 0.6.2-cuda11-py37-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3 | 3.8 | 0.6.2-cuda11-py38-runtime |
-
-**注意事项:** 如果您在0.5.0及以上版本需要在一个容器当中同时运行CPU server和GPU server,需要选择Cuda10.1/10.2/11的镜像,因为他们和CPU环境有着相同版本的gcc。
+
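The CN table above plus the pull commands reduce to a short pull-and-run flow. A minimal sketch in the spirit of the deleted Run_In_Docker_CN.md walkthrough (quoted later in this changeset), assuming the 0.7.0-devel tag from the table; the container name `test` and the port mapping are illustrative:

```shell
# pull a develop image by TAG from the table above
docker pull registry.baidubce.com/paddlepaddle/serving:0.7.0-devel
# start a container and enter it; name and port mapping are placeholders
docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:0.7.0-devel
docker exec -it test bash
```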
diff --git a/doc/Docker_Images_EN.md b/doc/Docker_Images_EN.md
index a495856afae6ead575390f5ea83345ea6a21bb48..5b5ee3b2ff172bf9716af0be8a9125500276d18c 100644
--- a/doc/Docker_Images_EN.md
+++ b/doc/Docker_Images_EN.md
@@ -8,10 +8,10 @@ This document maintains a list of docker images provided by Paddle Serving.
 
 You can get images in two ways:
 
-1. Pull image directly from `registry.baidubce.com ` through TAG:
+1. Pull image directly from dockerhub or `registry.baidubce.com` through TAG:
 
   ```shell
-  docker pull registry.baidubce.com/paddlepaddle/serving:<TAG> # registry.baidubce.com
+  docker pull paddlepaddle/serving:<TAG> # if the connection to dockerhub is slow, please try registry.baidubce.com
   ```
 
 2. Building image based on dockerfile
@@ -19,25 +19,28 @@ You can get images in two ways:
 
   Create a new folder and copy Dockerfile to this folder, and run the following command:
 
   ```shell
-  docker build -f ${DOCKERFILE} -t <image_name>:<image_tag> .
+  docker build -f tools/${DOCKERFILE} -t <image_name>:<image_tag> .
   ```
 
 ## Image description
 
-Runtime images cannot be used for compilation. If you want to customize your Serving based on source code, use the version with the suffix - devel.
-**cuda10.1-cudnn7-gcc54 image is not ready, you should run from dockerfile if you need it.**
-| Description | OS | TAG | Dockerfile |
+If you need to develop and compile based on the source code, please use the version with the suffix -devel.
+**In the TAG column, 0.7.0 can also be replaced with the corresponding version number, such as 0.5.0/0.4.1, etc. Note that some development environments were only added in a later release, so not every environment has a tag for every version.**
+
+| Description | OS | TAG | Dockerfile |
 | :----------------------------------------------------------: | :-----: | :--------------------------: | :----------------------------------------------------------: |
-| CPU development | Ubuntu16 | latest-devel | [Dockerfile.devel](../tools/Dockerfile.devel) |
-| GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development | Ubuntu16 | latest-cuda10.1-cudnn7-gcc54-devel(not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
-| GPU (cuda10.1-cudnn7-tensorRT6) development | Ubuntu16 | latest-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
-| GPU (cuda10.2-cudnn8-tensorRT7) development | Ubuntu16 | latest-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
-| GPU (cuda11.2-cudnn8-tensorRT7) development | Ubuntu18 | latest-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
+| CPU development | Ubuntu16 | 0.7.0-devel | [Dockerfile.devel](../tools/Dockerfile.devel) |
+| GPU (cuda10.1-cudnn7-tensorRT6-gcc54) development | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-gcc54-devel (not ready) | [Dockerfile.cuda10.1-cudnn7-gcc54.devel](../tools/Dockerfile.cuda10.1-cudnn7-gcc54.devel) |
+| GPU (cuda10.1-cudnn7-tensorRT6) development | Ubuntu16 | 0.7.0-cuda10.1-cudnn7-devel | [Dockerfile.cuda10.1-cudnn7.devel](../tools/Dockerfile.cuda10.1-cudnn7.devel) |
+| GPU (cuda10.2-cudnn7-tensorRT6) development | Ubuntu16 | 0.7.0-cuda10.2-cudnn7-devel | [Dockerfile.cuda10.2-cudnn7.devel](../tools/Dockerfile.cuda10.2-cudnn7.devel) |
+| GPU (cuda10.2-cudnn8-tensorRT7) development | Ubuntu16 | 0.7.0-cuda10.2-cudnn8-devel | [Dockerfile.cuda10.2-cudnn8.devel](../tools/Dockerfile.cuda10.2-cudnn8.devel) |
+| GPU (cuda11.2-cudnn8-tensorRT8) development | Ubuntu16 | 0.7.0-cuda11.2-cudnn8-devel | [Dockerfile.cuda11.2-cudnn8.devel](../tools/Dockerfile.cuda11.2-cudnn8.devel) |
 
 **Java Client:**
 ```
@@ -64,34 +67,20 @@ Develop Images:
 
 | Env | Version | Docker images tag | OS | Gcc Version |
 |----------|---------|------------------------------|-----------|-------------|
-| CPU | >=0.5.0 | 0.6.2-devel | Ubuntu 16 | 8.2.0 |
+| CPU | >=0.5.0 | 0.7.0-devel | Ubuntu 16 | 8.2.0 |
 | | <=0.4.0 | 0.4.0-devel | CentOS 7 | 4.8.5 |
-| Cuda10.1 | >=0.5.0 | 0.6.2-cuda10.1-cudnn7-devel | Ubuntu 16 | 8.2.0 |
-| | 0.6.2 | 0.6.2-cuda10.1-cudnn7-gcc54-devel(not ready) | Ubuntu 16 | 5.4.0 |
-| | <=0.4.0 | 0.6.2-cuda10.1-cudnn7-devel | CentOS 7 | 4.8.5 |
-| Cuda10.2 | >=0.5.0 | 0.6.2-cuda10.2-cudnn8-devel | Ubuntu 16 | 8.2.0 |
+| Cuda10.1 | >=0.5.0 | 0.7.0-cuda10.1-cudnn7-devel | Ubuntu 16 | 8.2.0 |
+| | <=0.4.0 | 0.4.0-cuda10.1-cudnn7-devel | CentOS 7 | 4.8.5 |
+| Cuda10.2+Cudnn7 | >=0.5.0 | 0.7.0-cuda10.2-cudnn7-devel | Ubuntu 16 | 8.2.0 |
+| | <=0.4.0 | Nan | Nan | Nan |
+| Cuda10.2+Cudnn8 | >=0.5.0 | 0.7.0-cuda10.2-cudnn8-devel | Ubuntu 16 | 8.2.0 |
 | | <=0.4.0 | Nan | Nan | Nan |
-| Cuda11.0 | >=0.5.0 | 0.6.2-cuda11.0-cudnn8-devel | Ubuntu 18 | 8.2.0 |
+| Cuda11.2 | >=0.5.0 | 0.7.0-cuda11.2-cudnn8-devel | Ubuntu 16 | 8.2.0 |
 | | <=0.4.0 | Nan | Nan | Nan |
+
 
 Running Images:
 
 Running Images is lighter than Develop Images, and Running Images are made up with serving whl and bin, but without develop tools like cmake because of lower image size. If you want to know about it, plese check the document [Paddle Serving on Kubernetes.](./Run_On_Kubernetes_CN.md).
 
-| ENV | Python Version | Tag |
-|------------------------------------------|----------------|-----------------------------|
-| cpu | 3.6 | 0.6.2-py36-runtime |
-| cpu | 3.7 | 0.6.2-py37-runtime |
-| cpu | 3.8 | 0.6.2-py38-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.6 | 0.6.2-cuda10.1-py36-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.7 | 0.6.2-cuda10.1-py37-runtime |
-| cuda-10.1 + cudnn-7.6.5 + tensorrt-6.0.1 | 3.8 | 0.6.2-cuda10.1-py38-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.6 | 0.6.2-cuda10.2-py36-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.7 | 0.6.2-cuda10.2-py37-runtime |
-| cuda-10.2 + cudnn-8.2.0 + tensorrt-7.1.3 | 3.8 | 0.6.2-cuda10.2-py38-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3 | 3.6 | 0.6.2-cuda11-py36-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3 | 3.7 | 0.6.2-cuda11-py37-runtime |
-| cuda-11 + cudnn-8.0.5 + tensorrt-7.1.3 | 3.8 | 0.6.2-cuda11-py38-runtime |
-
-**Tips:** If you want to use CPU server and GPU server (version>=0.5.0) at the same time, you should check the gcc version, only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
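The `docker build` step documented in both language versions above can be made concrete. A sketch assuming it is run from the repository root, where `tools/` lives; the image name `my-serving` and its tag are placeholders of my own, not names from the docs:

```shell
# build the CPU develop image from its Dockerfile in tools/
docker build -f tools/Dockerfile.devel -t my-serving:0.7.0-devel .
# GPU variants follow the same pattern, e.g. tools/Dockerfile.cuda10.2-cudnn8.devel
```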
diff --git a/doc/FAQ_CN.md b/doc/FAQ_CN.md
index 18712c1cde16ffdef31c0598ff5a53b2a339e027..c1f2359abf5aca4fdf40e9f7fb3abccb6adce62a 100644
--- a/doc/FAQ_CN.md
+++ b/doc/FAQ_CN.md
@@ -142,7 +142,7 @@ make: *** [all] Error 2
 
 #### Q:使用过程中出现CXXABI错误。
 
-这个问题出现的原因是Python使用的gcc版本和Serving所需的gcc版本对不上。对于Docker用户,推荐使用[Docker容器](./Run_In_Docker_CN.md),由于Docker容器内的Python版本与Serving在发布前都做过适配,这样就不会出现类似的错误。如果是其他开发环境,首先需要确保开发环境中具备GCC 8.2,如果没有gcc 8.2,参考安装方式
+这个问题出现的原因是Python使用的gcc版本和Serving所需的gcc版本对不上。对于Docker用户,推荐使用[Docker容器](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Docker_Images_CN.md),由于Docker容器内的Python版本与Serving在发布前都做过适配,这样就不会出现类似的错误。如果是其他开发环境,首先需要确保开发环境中具备GCC 8.2,如果没有gcc 8.2,参考安装方式
 
 ```bash
 wget -q https://paddle-ci.gz.bcebos.com/gcc-8.2.0.tar.xz
@@ -236,7 +236,7 @@ InvalidArgumentError: Device id must be less than GPU count, but received id is:
 
 #### Q: python编译的GCC版本与serving的版本不匹配
 
-**A:**:1)使用[GPU docker](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Run_In_Docker_CN.md#gpunvidia-docker)解决环境问题;2)修改anaconda的虚拟环境下安装的python的gcc版本[改变python的GCC编译环境](https://www.jianshu.com/p/c498b3d86f77)
+**A:** 1)使用GPU Docker镜像([Docker镜像列表](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Docker_Images_CN.md))解决环境问题;2)修改anaconda的虚拟环境下安装的python的gcc版本[改变python的GCC编译环境](https://www.jianshu.com/p/c498b3d86f77)
 
 #### Q: paddle-serving是否支持本地离线安装
 
diff --git a/doc/Install_CN.md b/doc/Install_CN.md
index 9dada97ee0e0b9a70f3d46115821e8dd57f1060f..c48384162fdb944deaae277e67ca0ecfabcc4694 100644
--- a/doc/Install_CN.md
+++ b/doc/Install_CN.md
@@ -2,7 +2,7 @@
 
 (简体中文|[English](./Install_EN.md))
 
-**强烈建议**您在**Docker内构建**Paddle Serving,请查看[如何在Docker中运行PaddleServing](Run_In_Docker_CN.md)。更多镜像请查看[Docker镜像列表](Docker_Images_CN.md)。
+**强烈建议**您在**Docker内构建**Paddle Serving,更多镜像请查看[Docker镜像列表](Docker_Images_CN.md)。
 
 **提示-1**:本项目仅支持**Python3.6/3.7/3.8**,接下来所有的与Python/Pip相关的操作都需要选择正确的Python版本。
 
diff --git a/doc/Install_EN.md b/doc/Install_EN.md
index b9044d163c1772200d101f14fe439389255c51a1..6c75cc52698a23594568766c9a265aae3a2beba0 100644
--- a/doc/Install_EN.md
+++ b/doc/Install_EN.md
@@ -2,7 +2,7 @@
 
 ([简体中文](./Install_CN.md)|English)
 
-**Strongly recommend** you build **Paddle Serving** in Docker, please check [How to run PaddleServing in Docker](Run_In_Docker_CN.md). For more images, please refer to [Docker Image List](Docker_Images_CN.md).
+**Strongly recommend** you build **Paddle Serving** in Docker. For more images, please refer to [Docker Image List](Docker_Images_CN.md).
 
 **Tip-1**: This project only supports **Python3.6/3.7/3.8**, all subsequent operations related to Python/Pip need to select the correct Python version.
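Tip-1 above matters in practice because the wheels must be installed through the same interpreter that will run the server. A hedged sketch, assuming the 0.7.0 release and the CUDA 10.2 + cuDNN 7 build; the exact version and `.post` tag should be taken from the package list in Latest_Packages_CN.md below:

```shell
# pin the interpreter explicitly so pip and the serving runtime agree
python3.7 -m pip install paddle-serving-client==0.7.0 paddle-serving-app==0.7.0
# the .post suffix selects the CUDA/cuDNN build of the GPU server wheel (assumed tag)
python3.7 -m pip install paddle-serving-server-gpu==0.7.0.post102
```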
diff --git a/doc/Latest_Packages_CN.md b/doc/Latest_Packages_CN.md
index 022ae75ab824ed8462f876f8d57b9097720cc18d..924844013b1e445665b94249f84ee8b89335db35 100644
--- a/doc/Latest_Packages_CN.md
+++ b/doc/Latest_Packages_CN.md
@@ -10,14 +10,15 @@ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server-0.0.0-py
 ## GPU server
 ### Python 3
 ```
-#cuda10.1 with TensorRT 6, Compile by gcc8.2
+#cuda10.1 Cudnn 7 with TensorRT 6, Compile by gcc8.2
 https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post101-py3-none-any.whl
-#cuda10.2 with TensorRT 7, Compile by gcc8.2
+#cuda10.2 Cudnn 7 with TensorRT 6, Compile by gcc5.4
 https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post102-py3-none-any.whl
-#cuda11.0 with TensorRT 7 (beta), Compile by gcc8.2
-https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post11-py3-none-any.whl
+#cuda10.2 Cudnn 8 with TensorRT 7, Compile by gcc8.2
+https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post1028-py3-none-any.whl
+#cuda11.2 Cudnn 8 with TensorRT 8 (beta), Compile by gcc8.2
+https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.0.0.post112-py3-none-any.whl
 ```
-**Tips:** If you want to use CPU server and GPU server at the same time, you should check the gcc version, only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
 
 ## Client
 
@@ -48,16 +49,16 @@ for kunlun user who uses arm-xpu or x86-xpu can download the wheel packages as f
 
 for arm kunlun user
 ```
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_server_xpu-0.6.0.post2-cp36-cp36m-linux_aarch64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_client-0.6.0-cp36-cp36m-linux_aarch64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_app-0.6.0-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_client-0.7.0-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_app-0.7.0-cp36-cp36m-linux_aarch64.whl
 ```
 
 for x86 kunlun user
 ```
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_server_xpu-0.6.0.post2-cp36-cp36m-linux_x86_64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_client-0.6.0-cp36-cp36m-linux_x86_64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.6.0/paddle_serving_app-0.6.0-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_client-0.7.0-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_app-0.7.0-cp36-cp36m-linux_x86_64.whl
 ```
 
 
@@ -74,10 +75,12 @@ https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-cpu-avx-openblas-0.0.0
 https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-cpu-noavx-openblas-0.0.0.tar.gz
 # Cuda 10.1
 https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-101-0.0.0.tar.gz
-# Cuda 10.2
+# Cuda 10.2 + Cudnn 7
 https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-102-0.0.0.tar.gz
-# Cuda 11
-https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-cuda11-0.0.0.tar.gz
+# Cuda 10.2 + Cudnn 8
+https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-1028-0.0.0.tar.gz
+# Cuda 11.2
+https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-cuda112-0.0.0.tar.gz
 ```
 
 #### How to setup SERVING_BIN offline?
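For the `SERVING_BIN` heading that closes the section above, the offline setup it refers to is, to my understanding, download one of the binaries listed, unpack it, and point the environment variable at the executable. A sketch assuming the CUDA 10.2 + cuDNN 7 binary and that the tarball unpacks to a directory of the same name:

```shell
wget https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-102-0.0.0.tar.gz
tar xf serving-gpu-102-0.0.0.tar.gz
# assumed layout: the serving executable sits inside the unpacked directory
export SERVING_BIN=${PWD}/serving-gpu-102-0.0.0/serving
```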
diff --git a/doc/Process_data_CN.md b/doc/Process_data_CN.md
index f8bc5b25dc83adbb915cd5ec3fd8a1a8ac3b5943..105956f51fc232bc4d88864dac6a9e922d411d50 100644
--- a/doc/Process_data_CN.md
+++ b/doc/Process_data_CN.md
@@ -10,7 +10,7 @@ pipeline客户端只做很简单的处理,他们把自然输入转化成可以
 
 #### 1)字符串/数字
 
-字符串和数字在这个阶段都以字符串的形式存在。我们以[房价预测](../python/examples/pipeline/simple_web_service)作为例子。房价预测的输入是13个维度的浮点数去描述一个住房的特征。在客户端阶段就可以直接如下所示
+字符串和数字在这个阶段都以字符串的形式存在。我们以[房价预测](../examples/Pipeline/simple_web_service)作为例子。房价预测的输入是13个维度的浮点数去描述一个住房的特征。在客户端阶段就可以直接如下所示
 
 ```
 curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}'
@@ -24,11 +24,11 @@ curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value"
 curl -X POST -k http://localhost:18082/bert/prediction -d '{"key": ["x"], "value": ["hello world"]}'
 ```
 
-当然,复杂的处理也可以把这个curl转换成python语言,详情参见[Bert Pipeline示例](../python/examples/pipeline/bert).
+当然,复杂的处理也可以把这个curl转换成python语言,详情参见[Bert Pipeline示例](../examples/Pipeline/PaddleNLP/bert).
 
 #### 2)图片
 
-图片在Paddle的输入通常需要转换成numpy array,但是在客户端阶段,不需要转换成numpy array,因为那样比较耗费空间,在这个阶段我们用base64 string来传输就可以了,到了服务端的前处理再去解读base64转换成numpy array。详情参见[图像分类pipeline示例](../python/examples/pipeline/PaddleClas/DarkNet53/pipeline_http_client.py),我们也贴出部分代码
+图片在Paddle的输入通常需要转换成numpy array,但是在客户端阶段,不需要转换成numpy array,因为那样比较耗费空间,在这个阶段我们用base64 string来传输就可以了,到了服务端的前处理再去解读base64转换成numpy array。详情参见[图像分类pipeline示例](../examples/Pipeline/PaddleClas/DarkNet53/pipeline_http_client.py),我们也贴出部分代码
 
 ```python
 def cv2_to_base64(image):
@@ -52,7 +52,7 @@ if __name__ == "__main__":
 
 #### 1)字符串/数字
 
-刚才提到的房价预测示例,[服务端程序](../python/examples/pipeline/simple_web_service/web_service.py)在这里。
+刚才提到的房价预测示例,[服务端程序](../examples/Pipeline/simple_web_service/web_service.py)在这里。
 
 ```python
 def init_op(self):
@@ -115,7 +115,7 @@ if __name__ == "__main__":
 
 #### 2)图片处理
 
-图像的前处理阶段,前面提到的图像处理程序,[服务端程序](../python/examples/pipeline/PaddleClas/DarkNet53/resnet50_web_service.py)如下。
+图像的前处理阶段,前面提到的图像处理程序,[服务端程序](../examples/Pipeline/PaddleClas/DarkNet53/resnet50_web_service.py)如下。
 
 ```python
 def init_op(self):
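Process_data_CN.md above ships images as base64 strings and decodes them server-side; the same client step can also be done straight from the shell. A sketch only: the port, endpoint name, image file and `key` value are illustrative and must match the target pipeline's config.yaml:

```shell
# base64-encode an image without line wrapping, then POST it as a plain string
img=$(base64 -w 0 daisy.jpg)
curl -X POST -k http://localhost:18082/imagenet/prediction \
     -d "{\"key\": [\"image\"], \"value\": [\"$img\"]}"
```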
diff --git a/doc/Python_Pipeline/Pipeline_Design_CN.md b/doc/Python_Pipeline/Pipeline_Design_CN.md
index 942e3b1567cb31bc9b727a10b03c5795f22223a8..5bb083b2079dca9391fdc61d62ede9145fd7f46a 100644
--- a/doc/Python_Pipeline/Pipeline_Design_CN.md
+++ b/doc/Python_Pipeline/Pipeline_Design_CN.md
@@ -20,7 +20,7 @@ Paddle Serving提供了用户友好的多模型组合服务编程框架,Pipeli
 Server端基于RPC服务层图执行引擎构建,两者的关系如下图所示。
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
@@ -65,7 +65,7 @@ Response中`err_no`和`err_msg`表达处理结果的正确性和错误信息,`
 
 - 对于 OP 之间需要传输过大数据的情况,可以考虑 RAM DB 外存进行全局存储,通过在 Channel 中传递索引的 Key 来进行数据传输
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
@@ -84,7 +84,7 @@ Response中`err_no`和`err_msg`表达处理结果的正确性和错误信息,`
 
 - 下图为图执行引擎中 Channel 的设计,采用 input buffer 和 output buffer 进行多 OP 输入或多 OP 输出的数据对齐,中间采用一个 Queue 进行缓冲
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
 
 #### 1.2.3 预测类型的设计
 
@@ -319,16 +319,16 @@ class ResponseOp(Op):
 所有Pipeline示例在[examples/Pipeline/](../../examples/Pipeline) 目录下,目前有7种类型模型示例:
 - [PaddleClas](../../examples/Pipeline/PaddleClas)
 - [Detection](../../examples/Pipeline/PaddleDetection)
-- [bert](../../examples/Pipeline/bert)
+- [bert](../../examples/Pipeline/PaddleNLP/bert)
 - [imagenet](../../examples/Pipeline/PaddleClas/imagenet)
 - [imdb_model_ensemble](../../examples/Pipeline/imdb_model_ensemble)
 - [ocr](../../examples/Pipeline/PaddleOCR/ocr)
 - [simple_web_service](../../examples/Pipeline/simple_web_service)
 
-以 imdb_model_ensemble 为例来展示如何使用 Pipeline Serving,相关代码在 `python/examples/pipeline/imdb_model_ensemble` 文件夹下可以找到,例子中的 Server 端结构如下图所示:
+以 imdb_model_ensemble 为例来展示如何使用 Pipeline Serving,相关代码在 `Serving/examples/Pipeline/imdb_model_ensemble` 文件夹下可以找到,例子中的 Server 端结构如下图所示:
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
 
 ### 3.1 Pipeline部署需要的文件
 
@@ -352,13 +352,13 @@
 ### 3.2 获取模型文件
 
 ```shell
-cd python/examples/pipeline/imdb_model_ensemble
+cd Serving/examples/Pipeline/imdb_model_ensemble
 sh get_data.sh
 python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
 python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
 ```
 
-PipelineServing 也支持本地自动启动 PaddleServingService,请参考 `python/examples/pipeline/ocr` 下的例子。
+PipelineServing 也支持本地自动启动 PaddleServingService,请参考 `Serving/examples/Pipeline/PaddleOCR/ocr` 下的例子。
 
 ### 3.3 创建config.yaml
 本示例采用了brpc的client连接类型,还可以选择grpc或local_predictor。
@@ -700,7 +700,7 @@ Pipeline Serving支持低精度推理,CPU、GPU和TensoRT支持的精度类型
 - fp16
 - int8
 
-参考[simple_web_service](../python/examples/pipeline/simple_web_service)示例
+参考[simple_web_service](../../examples/Pipeline/simple_web_service)示例
 
 ***
 ## 5.日志追踪
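Before the English version of the same design document below, it may help to see the CN walkthrough end to end as start-then-probe. A sketch assembled from commands quoted elsewhere in this changeset (`get_data.sh`, `web_service.py`, and the uci curl from Process_data_CN.md); port 18082 is the one those examples use:

```shell
cd Serving/examples/Pipeline/simple_web_service
sh get_data.sh
python3 web_service.py &>log.txt &
# probe the pipeline over HTTP once the service is up
curl -X POST -k http://localhost:18082/uci/prediction \
     -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}'
```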
diff --git a/doc/Python_Pipeline/Pipeline_Design_EN.md b/doc/Python_Pipeline/Pipeline_Design_EN.md
index 069d16974976f7d30f3328b4c6500af2219f0aba..e30d09dc7c1eae64a873c3462137c772a1d006be 100644
--- a/doc/Python_Pipeline/Pipeline_Design_EN.md
+++ b/doc/Python_Pipeline/Pipeline_Design_EN.md
@@ -18,7 +18,7 @@ Paddle Serving provides a user-friendly programming framework for multi-model co
 The Server side is built based on RPC Service and graph execution engine. The relationship between them is shown in the following figure.
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
 
 ### 1.1 RPC Service
 
@@ -61,7 +61,7 @@ The graph execution engine consists of OPs and Channels, and the connected OPs s
 
 - For cases where large data needs to be transferred between OPs, consider RAM DB external memory for global storage and data transfer by passing index keys in Channel.
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
@@ -80,7 +80,7 @@ The graph execution engine consists of OPs and Channels, and the connected OPs s
 
 - The following illustration shows the design of Channel in the graph execution engine, using input buffer and output buffer to align data between multiple OP inputs and multiple OP outputs, with a queue in the middle to buffer.
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
@@ -314,16 +314,16 @@ The default implementation of **pack_response_package** is to convert the dictio
 All examples of pipelines are in [examples/pipeline/](../../examples/Pipeline) directory, There are 7 types of model examples currently:
 - [PaddleClas](../../examples/Pipeline/PaddleClas)
 - [Detection](../../examples/Pipeline/PaddleDetection)
-- [bert](../../examples/Pipeline/bert)
+- [bert](../../examples/Pipeline/PaddleNLP/bert)
 - [imagenet](../../examples/Pipeline/PaddleClas/imagenet)
 - [imdb_model_ensemble](../../examples/Pipeline/imdb_model_ensemble)
 - [ocr](../../examples/Pipeline/PaddleOCR/ocr)
 - [simple_web_service](../../examples/Pipeline/simple_web_service)
 
-Here, we build a simple imdb model enable example to show how to use Pipeline Serving. The relevant code can be found in the `python/examples/pipeline/imdb_model_ensemble` folder. The Server-side structure in the example is shown in the following figure:
+Here, we build a simple imdb model ensemble example to show how to use Pipeline Serving. The relevant code can be found in the `Serving/examples/Pipeline/imdb_model_ensemble` folder. The Server-side structure in the example is shown in the following figure:
 
 <center>
-<img src='...'/>
+<img src='...'/>
 </center>
 
 ### 3.1 Files required for pipeline deployment
 
@@ -348,13 +348,13 @@ Five types of files are needed, of which model files, configuration files, and s
 ### 3.2 Get model files
 
 ```shell
-cd python/examples/pipeline/imdb_model_ensemble
+cd Serving/examples/Pipeline/imdb_model_ensemble
 sh get_data.sh
 python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
 python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
 ```
 
-PipelineServing also supports local automatic startup of PaddleServingService. Please refer to the example `python/examples/pipeline/ocr`.
+PipelineServing also supports local automatic startup of PaddleServingService. Please refer to the example `Serving/examples/Pipeline/PaddleOCR/ocr`.
 
 ### 3.3 Create config.yaml
 
@@ -705,7 +705,7 @@ Pipeline Serving supports low-precision inference. The precision types supported
 - fp16
 - int8
 
-Reference the example [simple_web_service](../python/examples/pipeline/simple_web_service).
+Reference the example [simple_web_service](../../examples/Pipeline/simple_web_service).
 
 ***
 
diff --git a/doc/Run_In_Docker_CN.md b/doc/Run_In_Docker_CN.md
deleted file mode 100644
index 7e4e4ad3dc9652772028a3e28d9453a032b25297..0000000000000000000000000000000000000000
--- a/doc/Run_In_Docker_CN.md
+++ /dev/null
@@ -1,67 +0,0 @@
-# 如何在Docker中运行PaddleServing
-
-(简体中文|[English](Run_In_Docker_EN.md))
-
-Docker最大的好处之一就是可移植性,可在多种操作系统和主流的云计算平台部署。使用Paddle Serving Docker镜像可在Linux、Mac和Windows平台部署。
-
-## 环境要求
-
-Docker(GPU版本需要在GPU机器上安装nvidia-docker)
-
-该文档以Python2为例展示如何在Docker中运行Paddle Serving,您也可以通过将`python`更换成`python3`来用Python3运行相关命令。
-
-## CPU版本
-
-### 获取镜像
-
-参考[该文档](Docker_Images_CN.md)获取镜像:
-
-以CPU编译镜像为例
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-devel
-```
-
-### 创建容器并进入
-
-```bash
-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-devel
-docker exec -it test bash
-```
-
-`-p`选项是为了将容器的`9292`端口映射到宿主机的`9292`端口。
-
-### 安装PaddleServing
-
-镜像里自带对应镜像tag版本的`paddle_serving_server`,`paddle_serving_client`,`paddle_serving_app`,如果用户不需要更改版本,可以直接使用,适用于没有外网服务的环境。
-
-如果需要更换版本,请参照首页的指导,下载对应版本的pip包。
-
-## GPU 版本
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-```
-
-### 创建容器并进入
-
-```bash
-nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-nvidia-docker exec -it test bash
-```
-或者
-```bash
-docker run --gpus all -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-docker exec -it test bash
-```
-
-`-p`选项是为了将容器的`9292`端口映射到宿主机的`9292`端口。
-
-### 安装PaddleServing
-
-请参照首页的指导,下载对应版本的pip包。[最新安装包合集](Latest_Packages_CN.md)
-
-## 注意事项
-
-- 运行时镜像不能用于开发编译。如果想要从源码编译,请查看[如何编译PaddleServing](Compile_CN.md)。
-- 由于Cuda10和Cuda9的环境受限于GCC版本,无法同时运行CPU版本的`paddle_serving_server`,因此如果想要在GPU环境中同时使用CPU版本的`paddle_serving_server`,请选择Cuda10.1,Cuda10.2和Cuda11版本的镜像。
diff --git a/doc/Run_In_Docker_EN.md b/doc/Run_In_Docker_EN.md
deleted file mode 100644
index 44a516cb0b611315ade0440b6cea81632d8e62f6..0000000000000000000000000000000000000000
--- a/doc/Run_In_Docker_EN.md
+++ /dev/null
@@ -1,75 +0,0 @@
-# How to run PaddleServing in Docker
-
-([简体中文](Run_In_Docker_CN.md)|English)
-
-One of the biggest benefits of Docker is portability, which can be deployed on multiple operating systems and mainstream cloud computing platforms. The Paddle Serving Docker image can be deployed on Linux, Mac and Windows platforms.
-
-## Requirements
-
-Docker (GPU version requires nvidia-docker to be installed on the GPU machine)
-
-This document takes Python2 as an example to show how to run Paddle Serving in docker. You can also use Python3 to run related commands by replacing `python` with `python3`.
-
-## CPU
-
-### Get docker image
-
-Refer to [this document](Docker_Images_EN.md) for a docker image:
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-devel
-```
-
-
-### Create container
-
-```bash
-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-devel
-docker exec -it test bash
-```
-
-The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
-
-### Install PaddleServing
-
-Please refer to the instructions on the homepage to download the pip package of the corresponding version.
-
-
-## GPU
-
-The GPU version is basically the same as the CPU version, with only some differences in interface naming (GPU version requires nvidia-docker to be installed on the GPU machine).
-
-### Get docker image
-
-Refer to [this document](Docker_Images_EN.md) for a docker image, the following is an example of an `cuda9.0-cudnn7` image:
-
-```shell
-docker pull registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-```
-
-### Create container
-
-```bash
-nvidia-docker run -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-nvidia-docker exec -it test bash
-```
-
-or
-
-```bash
-docker run --gpus all -p 9292:9292 --name test -dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
-docker exec -it test bash
-```
-
-The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
-
-### Install PaddleServing
-
-The mirror comes with `paddle_serving_server_gpu`, `paddle_serving_client`, and `paddle_serving_app` corresponding to the mirror tag version. If users don't need to change the version, they can use it directly, which is suitable for environments without extranet services.
-
-If you need to change the version, please refer to the instructions on the homepage to download the pip package of the corresponding version. [LATEST_PACKAGES](./Latest_Packages_CN.md)
-
-## Precautious
-
-- Runtime images cannot be used for compilation. If you want to compile from source, refer to [COMPILE](Compile_EN.md).
-- If you use Cuda9 and Cuda10 docker images, you cannot use `paddle_serving_server` CPU version at the same time, due to the limitation of gcc version. If you want to use both in one docker image, please choose images of Cuda10.1, Cuda10.2 and Cuda11.
diff --git a/examples/C++/PaddleRec/criteo_ctr_with_cube/README.md b/examples/C++/PaddleRec/criteo_ctr_with_cube/README.md
index f8b0bfebcadac6ec2bed2e4924c3b23ecc4a79e1..7981b7f3374c43fe4039c52c1da918d88ae53666 100755
--- a/examples/C++/PaddleRec/criteo_ctr_with_cube/README.md
+++ b/examples/C++/PaddleRec/criteo_ctr_with_cube/README.md
@@ -4,7 +4,7 @@
 
 ### Get Sample Dataset
 
-go to directory `python/examples/criteo_ctr_with_cube`
+go to directory `examples/C++/PaddleRec/criteo_ctr_with_cube`
 ```
 sh get_data.sh
 ```
@@ -45,7 +45,7 @@ python3 test_client.py ctr_client_conf/serving_client_conf.prototxt ./raw_data
 
 CPU :Intel(R) Xeon(R) CPU 6148 @ 2.40GHz
 
-Model :[Criteo CTR](https://github.com/PaddlePaddle/Serving/blob/develop/python/examples/criteo_ctr_with_cube/network_conf.py)
+Model :[Criteo CTR](./network_conf.py)
 
 server core/thread num : 4/8
 
diff --git a/examples/C++/PaddleRec/criteo_ctr_with_cube/README_CN.md b/examples/C++/PaddleRec/criteo_ctr_with_cube/README_CN.md
index ba59d39505de3bead675163553951646063225d8..f498d52d54eb7fe447465a219491371be856702a 100644
--- a/examples/C++/PaddleRec/criteo_ctr_with_cube/README_CN.md
+++ b/examples/C++/PaddleRec/criteo_ctr_with_cube/README_CN.md
@@ -2,7 +2,7 @@
 (简体中文|[English](./README.md))
 
 ### 获取样例数据
-进入目录 `python/examples/criteo_ctr_with_cube`
+进入目录 `examples/C++/PaddleRec/criteo_ctr_with_cube`
 ```
 sh get_data.sh
 ```
@@ -43,7 +43,7 @@ python3 test_client.py ctr_client_conf/serving_client_conf.prototxt ./raw_data
 
 设备 :Intel(R) Xeon(R) CPU 6148 @ 2.40GHz
 
-模型 :[Criteo CTR](https://github.com/PaddlePaddle/Serving/blob/develop/python/examples/criteo_ctr_with_cube/network_conf.py)
+模型 :[Criteo CTR](./network_conf.py)
 
 server core/thread num : 4/8
 
diff --git a/examples/util/README.md b/examples/util/README.md
index 49934d1b9a4e204f8b18a46a3a77776c5b31139d..4bf7af78e42780a24251c1c11d69c3e5108cc7b8 100644
--- a/examples/util/README.md
+++ b/examples/util/README.md
@@ -26,6 +26,6 @@ The script converts the time-dot information in the log into a json format and s
 
 Specific operation: Open the chrome browser, enter `chrome://tracing/` in the address bar, jump to the tracing page, click the `load` button, and open the saved trace file to visualize the time information of each stage of the prediction service.
 
-The data visualization output is shown as follow, it uses [bert as service example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert) GPU inference service. The server starts 4 GPU prediction, the client starts 4 `processes`, and the timeline of each stage when the batch size is 1. Among them, `bert_pre` represents the data preprocessing stage of the client, and `client_infer` represents the stage where the client completes sending and receiving prediction requests. `process` represents the process number of the client, and the second line of each process shows the timeline of each op of the server.
+The data visualization output is shown as follow, it uses [bert as service example](../C++/PaddleNLP/bert) GPU inference service. The server starts 4 GPU prediction, the client starts 4 `processes`, and the timeline of each stage when the batch size is 1. Among them, `bert_pre` represents the data preprocessing stage of the client, and `client_infer` represents the stage where the client completes sending and receiving prediction requests. `process` represents the process number of the client, and the second line of each process shows the timeline of each op of the server.
 
 ![timeline](../../doc/images/timeline-example.png)
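For reference, the trace file loaded into chrome://tracing above is produced by the conversion script this README documents; `profile` is the log containing the time-dot records and `trace` is the output file, exactly as shown in the CN version of this README below:

```shell
# convert time-dot records from the serving log into a chrome://tracing JSON file
python3 timeline_trace.py profile trace
```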
diff --git a/examples/util/README_CN.md b/examples/util/README_CN.md
index 1ba45ec7e228d6c0ad94b7c966464932f27e251b..e5a98b1db85fae32eceefd8305a6d6ead803ba01 100644
--- a/examples/util/README_CN.md
+++ b/examples/util/README_CN.md
@@ -26,6 +26,6 @@ python3 timeline_trace.py profile trace
 
 具体操作:打开chrome浏览器,在地址栏输入chrome://tracing/,跳转至tracing页面,点击load按钮,打开保存的trace文件,即可将预测服务的各阶段时间信息可视化。
 
-效果如下图,图中展示了使用[bert示例](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert)的GPU预测服务,server端开启4卡预测,client端启动4进程,batch size为1时的各阶段timeline,其中bert_pre代表client端的数据预处理阶段,client_infer代表client完成预测请求的发送和接收结果的阶段,图中的process代表的是client的进程号,每个进进程的第二行展示的是server各个op的timeline。
+效果如下图,图中展示了使用[bert示例](../C++/PaddleNLP/bert)的GPU预测服务,server端开启4卡预测,client端启动4进程,batch size为1时的各阶段timeline,其中bert_pre代表client端的数据预处理阶段,client_infer代表client完成预测请求的发送和接收结果的阶段,图中的process代表的是client的进程号,每个进程的第二行展示的是server各个op的timeline。
 
 ![timeline](../../doc/images/timeline-example.png)
diff --git a/java/README_CN.md b/java/README_CN.md
index 6c2465ecceaea135795cb75ce87afb6e78b8f90e..2f720898f2f25f3e3db7ce2fb752d309584fe160 100755
--- a/java/README_CN.md
+++ b/java/README_CN.md
@@ -34,7 +34,7 @@ mvn install
 以fit_a_line模型为例,服务端启动与常规BRPC-Server端启动命令一样。
 ```
-cd ../../python/examples/fit_a_line
+cd ../../examples/C++/fit_a_line
 sh get_data.sh
 python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393
 ```
@@ -59,7 +59,7 @@ java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar Paddle
 对于input data type = string类型,以IMDB model ensemble模型为例,服务端启动
 ```
-cd ../../python/examples/pipeline/imdb_model_ensemble
+cd ../examples/Pipeline/imdb_model_ensemble
 sh get_data.sh
 python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
 python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
@@ -84,7 +84,7 @@ java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar Pipeli
 ### 对于input data type = INDArray类型,以Simple Pipeline WebService中的uci_housing_model模型为例,服务端启动
 ```
-cd ../../python/examples/pipeline/simple_web_service
+cd ../examples/Pipeline/simple_web_service
 sh get_data.sh
 python web_service_java.py &>log.txt &
 ```
@@ -102,7 +102,7 @@ java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar Pipeli
 
 2.目前Serving已推出Pipeline模式(原理详见[Pipeline Serving](../doc/Python_Pipeline/Pipeline_Design_CN.md)),面向Java的Pipeline Serving Client已发布。
 
-3.注意PipelineClientExample.java中的ip和port(位于java/examples/src/main/java/[PipelineClientExample.java](./examples/src/main/java/PipelineClientExample.java)),需要与对应Pipeline server的config.yaml文件中配置的ip和port相对应。(以IMDB model ensemble模型为例,位于python/examples/pipeline/imdb_model_ensemble/[config.yaml](../python/examples/pipeline/imdb_model_ensemble/config.yml))
+3.注意PipelineClientExample.java中的ip和port(位于java/examples/src/main/java/[PipelineClientExample.java](./examples/src/main/java/PipelineClientExample.java)),需要与对应Pipeline server的config.yaml文件中配置的ip和port相对应。(以IMDB model ensemble模型为例,位于examples/Pipeline/imdb_model_ensemble/[config.yaml](../examples/Pipeline/imdb_model_ensemble/config.yml))
 
 ### 开发部署指导
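To round out the java/README_CN.md section above, the Java side is build-then-run against the server started earlier. A sketch: `mvn install` and the example jar name come from the README itself, while the `string_imdb_predict` mode argument is an illustrative assumption of mine:

```shell
cd Serving/java
mvn compile
mvn install
# run the pipeline example client; ip:port must match the server's config.yaml
java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar \
     PipelineClientExample string_imdb_predict
```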
diff --git a/python/paddle_serving_app/README.md b/python/paddle_serving_app/README.md
index 0430ec5c0f76c5de84dafcff688cc857531134ee..c3aa3721ee6b53363fdc799cd2b6e360b51cd478 100644
--- a/python/paddle_serving_app/README.md
+++ b/python/paddle_serving_app/README.md
@@ -52,7 +52,7 @@ Preprocessing for Chinese semantic representation task.
 
   - line(str):Text input.
 
-  [example](../examples/bert/bert_client.py)
+  [example](../../examples/C++/PaddleNLP/bert/bert_client.py)
 
 - class LACReader
 
 Preprocessing for Chinese word segmentation task.
 
   - words(str):Original text input.
   - crf_decode(np.array):CRF code predicted by model.
 
-  [example](../examples/lac/lac_http_client.py)
+  [example](../../examples/C++/PaddleNLP/lac/lac_http_client.py)
 
 - class SentaReader
 
   - `process(cols)`
     - cols(str):Word segmentation result.
 
-  [example](../examples/senta/senta_web_service.py)
+  [example](../../examples/C++/PaddleNLP/senta/senta_web_service.py)
 
-- The image preprocessing method is more flexible than the above method, and can be combined by the following multiple classes,[example](../examples/imagenet/resnet50_rpc_client.py)
+- The image preprocessing method is more flexible than the above method, and can be combined by the following multiple classes,[example](../../examples/C++/PaddleClas/imagenet/resnet50_rpc_client.py)
 
 - class Sequentia
 
@@ -144,7 +144,7 @@ This tool is convenient to analyze the proportion of time occupancy in the predi
 
 Load the trace file generated in the previous step through the load button, you can Visualize the time information of each stage of the forecast service.
 
-As shown in next figure, the figure shows the timeline of GPU prediction service using [bert example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert).
+As shown in next figure, the figure shows the timeline of GPU prediction service using [bert example](../../examples/C++/PaddleNLP/bert).
 The server side starts service with 4 GPU cards, the client side starts 4 processes to request, and the batch size is 1.
 In the figure, bert_pre represents the data pre-processing stage of the client, and client_infer represents the stage where the client completes the sending of the prediction request to the receiving result. The process in the figure represents the process number of the client, and the second line of each process shows the timeline of each op of the server.
 
@@ -157,7 +157,7 @@ The inference op of Paddle Serving is implemented based on Paddle inference lib.
 
 Before deploying the prediction service, you may need to check the input and output of the prediction service or check the resource consumption. Therefore, a local prediction tool is built into the paddle_serving_app, which is used in the same way as sending a request to the server through the client.
 
-Taking [fit_a_line prediction service](../examples/fit_a_line) as an example, the following code can be used to run local prediction.
+Taking [fit_a_line prediction service](../../examples/C++/fit_a_line) as an example, the following code can be used to run local prediction.
 
 ```python
 from paddle_serving_app.local_predict import LocalPredictor
diff --git a/python/paddle_serving_app/README_CN.md b/python/paddle_serving_app/README_CN.md
index fc0c5154031bf53ab6c8b38626c7e1d271555bef..d651e39a4073fc2ba1d3977197b3a1a367437afc 100644
--- a/python/paddle_serving_app/README_CN.md
+++ b/python/paddle_serving_app/README_CN.md
@@ -48,7 +48,7 @@ paddle_serving_app针对CV和NLP领域的模型任务,提供了多种常见的
   - `process(line)`
     - line(str):输入文本
 
-  [参考示例](../examples/bert/bert_client.py)
+  [参考示例](../../examples/C++/PaddleNLP/bert/bert_client.py)
 
 - class LACReader 中文分词预处理
 
@@ -60,7 +60,7 @@ paddle_serving_app针对CV和NLP领域的模型任务,提供了多种常见的
     - words(str):原始文本
     - crf_decode(np.array):模型预测结果中的CRF编码
 
-  [参考示例](../examples/lac/lac_http_client.py)
+  [参考示例](../../examples/C++/PaddleNLP/lac/lac_http_client.py)
 
 - class SentaReader
 
   - `process(cols)`
     - cols(str):分词后的文本
 
-  [参考示例](../examples/senta/senta_web_service.py)
+  [参考示例](../../examples/C++/PaddleNLP/senta/senta_web_service.py)
 
-- 图像的预处理方法相比于上述的方法更加灵活多变,可以通过以下的多个类进行组合,[参考示例](../examples/imagenet/resnet50_rpc_client.py)
+- 图像的预处理方法相比于上述的方法更加灵活多变,可以通过以下的多个类进行组合,[参考示例](../../examples/C++/PaddleClas/imagenet/resnet50_rpc_client.py)
 
 - class Sequentia
 
@@ -135,7 +135,7 @@ paddle_serving_app针对CV和NLP领域的模型任务,提供了多种常见的
 
 4. 使用chrome浏览器,打开`chrome://tracing/`网址,通过load按钮加载上一步产生的trace文件,即可将预测服务的各阶段时间信息可视化。
 
-   效果如下图,图中展示了使用[bert示例](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert)的GPU预测服务,server端开启4卡预测,client端启动4进程,batch size为1时的各阶段timeline。
+   效果如下图,图中展示了使用[bert示例](../../examples/C++/PaddleNLP/bert)的GPU预测服务,server端开启4卡预测,client端启动4进程,batch size为1时的各阶段timeline。
 
   其中bert_pre代表client端的数据预处理阶段,client_infer代表client完成预测请求的发送到接收结果的阶段,图中的process代表的是client的进程号,每个进程的第二行展示的是server各个op的timeline。
 
  ![timeline](../../doc/images/timeline-example.png)
 
@@ -144,7 +144,7 @@ paddle_serving_app针对CV和NLP领域的模型任务,提供了多种常见的
 
 Paddle Serving框架的server预测op使用了Paddle 的预测框架,在部署预测服务之前可能需要对预测服务的输入输出进行检验或者查看资源占用等。因此在paddle_serving_app中内置了本地预测工具,使用方式与通过client向服务端发送请求一致。
 
-以[fit_a_line预测服务](../examples/fit_a_line)为例,使用以下代码即可执行本地预测。
+以[fit_a_line预测服务](../../examples/C++/fit_a_line)为例,使用以下代码即可执行本地预测。
 
 ```python
 from paddle_serving_app.local_predict import LocalPredictor