Merge branch 'compile_doc' of https://github.com/bjjwwang/serving into compile_doc

df478f9b · bjjwwang · 23bba8f4 · c0d48e17
12 changed files
--- a/core/general-client/README_CN.md
+++ b/core/general-client/README_CN.md
@@ -9,7 +9,7 @@
 以fit_a_line模型为例，服务端启动与常规BRPC-Server端启动命令一样。

 ```
-cd ../../python/examples/fit_a_line
+cd ../../examples/C++/fit_a_line
 sh get_data.sh
 python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393
 ```

--- a/doc/Latest_Packages_CN.md
+++ b/doc/Latest_Packages_CN.md
@@ -49,16 +49,16 @@ for kunlun user who uses arm-xpu or x86-xpu can download the wheel packages as f

 for arm kunlun user
 ```
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0g/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_aarch64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0g/paddle_serving_client-0.7.0-cp36-cp36m-linux_aarch64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0g/paddle_serving_app-0.7.0-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_client-0.7.0-cp36-cp36m-linux_aarch64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_app-0.7.0-cp36-cp36m-linux_aarch64.whl
 ```
 
 for x86 kunlun user
 ``` 
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0g/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_x86_64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0g/paddle_serving_client-0.7.0-cp36-cp36m-linux_x86_64.whl
-https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0g/paddle_serving_app-0.7.0-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_server_xpu-0.7.0.post2-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_client-0.7.0-cp36-cp36m-linux_x86_64.whl
+https://paddle-serving.bj.bcebos.com/whl/xpu/0.7.0/paddle_serving_app-0.7.0-cp36-cp36m-linux_x86_64.whl
 ```



--- a/doc/Process_data_CN.md
+++ b/doc/Process_data_CN.md
@@ -10,7 +10,7 @@ pipeline客户端只做很简单的处理，他们把自然输入转化成可以

 #### 1）字符串/数字

-字符串和数字在这个阶段都以字符串的形式存在。我们以[房价预测](../python/examples/pipeline/simple_web_service)作为例子。房价预测的输入是13个维度的浮点数去描述一个住房的特征。在客户端阶段就可以直接如下所示
+字符串和数字在这个阶段都以字符串的形式存在。我们以[房价预测](../examples/Pipeline/simple_web_service)作为例子。房价预测的输入是13个维度的浮点数去描述一个住房的特征。在客户端阶段就可以直接如下所示

 ```
 curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value": ["0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332"]}'
@@ -24,11 +24,11 @@ curl -X POST -k http://localhost:18082/uci/prediction -d '{"key": ["x"], "value"
 curl -X POST -k http://localhost:18082/bert/prediction -d '{"key": ["x"], "value": ["hello world"]}'
 ```

-当然，复杂的处理也可以把这个curl转换成python语言，详情参见[Bert Pipeline示例](../python/examples/pipeline/bert). 
+当然，复杂的处理也可以把这个curl转换成python语言，详情参见[Bert Pipeline示例](../examples/Pipeline/PaddleNLP/bert). 

 #### 2）图片

-图片在Paddle的输入通常需要转换成numpy array，但是在客户端阶段，不需要转换成numpy array，因为那样比较耗费空间，在这个阶段我们用base64 string来传输就可以了，到了服务端的前处理再去解读base64转换成numpy array。详情参见[图像分类pipeline示例](../python/examples/pipeline/PaddleClas/DarkNet53/pipeline_http_client.py)，我们也贴出部分代码
+图片在Paddle的输入通常需要转换成numpy array，但是在客户端阶段，不需要转换成numpy array，因为那样比较耗费空间，在这个阶段我们用base64 string来传输就可以了，到了服务端的前处理再去解读base64转换成numpy array。详情参见[图像分类pipeline示例](../examples/Pipeline/PaddleClas/DarkNet53/pipeline_http_client.py)，我们也贴出部分代码

 ```python
 def cv2_to_base64(image):
@@ -52,7 +52,7 @@ if __name__ == "__main__":

 #### 1）字符串/数字

-刚才提到的房价预测示例，[服务端程序](../python/examples/pipeline/simple_web_service/web_service.py)在这里。
+刚才提到的房价预测示例，[服务端程序](../examples/Pipeline/simple_web_service/web_service.py)在这里。

 ```python
    def init_op(self):
@@ -115,7 +115,7 @@ if __name__ == "__main__":

 #### 2）图片处理

-图像的前处理阶段，前面提到的图像处理程序，[服务端程序](../python/examples/pipeline/PaddleClas/DarkNet53/resnet50_web_service.py)如下。
+图像的前处理阶段，前面提到的图像处理程序，[服务端程序](../examples/Pipeline/PaddleClas/DarkNet53/resnet50_web_service.py)如下。

 ```python
    def init_op(self):

--- a/doc/Python_Pipeline/Pipeline_Design_CN.md
+++ b/doc/Python_Pipeline/Pipeline_Design_CN.md
@@ -325,7 +325,7 @@ class ResponseOp(Op):
 - [ocr](../../examples/Pipeline/PaddleOCR/ocr)
 - [simple_web_service](../../examples/Pipeline/simple_web_service)

-以 imdb_model_ensemble 为例来展示如何使用 Pipeline Serving，相关代码在 `python/examples/pipeline/imdb_model_ensemble` 文件夹下可以找到，例子中的 Server 端结构如下图所示：
+以 imdb_model_ensemble 为例来展示如何使用 Pipeline Serving，相关代码在 `Serving/examples/Pipeline/imdb_model_ensemble` 文件夹下可以找到，例子中的 Server 端结构如下图所示：

 <div align=center>
 <img src='images/pipeline_serving-image4.png' height = "200" align="middle"/>
@@ -352,13 +352,13 @@ class ResponseOp(Op):
 ### 3.2 获取模型文件

 ```shell
-cd python/examples/pipeline/imdb_model_ensemble
+cd Serving/examples/Pipeline/imdb_model_ensemble
 sh get_data.sh
 python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
 python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
 ```

-PipelineServing 也支持本地自动启动 PaddleServingService，请参考 `python/examples/pipeline/ocr` 下的例子。
+PipelineServing 也支持本地自动启动 PaddleServingService，请参考 `Serving/examples/Pipeline/PaddleOCR/ocr` 下的例子。

 ### 3.3 创建config.yaml
 本示例采用了brpc的client连接类型，还可以选择grpc或local_predictor。
@@ -700,7 +700,7 @@ Pipeline Serving支持低精度推理，CPU、GPU和TensoRT支持的精度类型
  - fp16
  - int8 

-参考[simple_web_service](../python/examples/pipeline/simple_web_service)示例
+参考[simple_web_service](../../examples/Pipeline/simple_web_service)示例
 ***

 ## 5.日志追踪

--- a/doc/Python_Pipeline/Pipeline_Design_EN.md
+++ b/doc/Python_Pipeline/Pipeline_Design_EN.md
@@ -320,7 +320,7 @@ All examples of pipelines are in [examples/pipeline/](../../examples/Pipeline) d
 - [ocr](../../examples/Pipeline/PaddleOCR/ocr)
 - [simple_web_service](../../examples/Pipeline/simple_web_service)

-Here, we build a simple imdb model enable example to show how to use Pipeline Serving. The relevant code can be found in the `python/examples/pipeline/imdb_model_ensemble` folder. The Server-side structure in the example is shown in the following figure:
+Here, we build a simple imdb model enable example to show how to use Pipeline Serving. The relevant code can be found in the `Serving/examples/Pipeline/imdb_model_ensemble` folder. The Server-side structure in the example is shown in the following figure:

 <div align=center>
 <img src='images/pipeline_serving-image4.png' height = "200" align="middle"/>
@@ -348,13 +348,13 @@ Five types of files are needed, of which model files, configuration files, and s
 ### 3.2 Get model files

 ```shell
-cd python/examples/pipeline/imdb_model_ensemble
+cd Serving/examples/Pipeline/imdb_model_ensemble
 sh get_data.sh
 python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
 python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
 ```

-PipelineServing also supports local automatic startup of PaddleServingService. Please refer to the example `python/examples/pipeline/ocr`.
+PipelineServing also supports local automatic startup of PaddleServingService. Please refer to the example `Serving/examples/Pipeline/PaddleOCR/ocr`.


 ### 3.3 Create config.yaml
@@ -705,7 +705,7 @@ Pipeline Serving supports low-precision inference. The precision types supported
  - fp16
  - int8 

-Reference the example [simple_web_service](../python/examples/pipeline/simple_web_service).
+Reference the example [simple_web_service](../../examples/Pipeline/simple_web_service).

 ***
 

--- a/examples/C++/PaddleRec/criteo_ctr_with_cube/README.md
+++ b/examples/C++/PaddleRec/criteo_ctr_with_cube/README.md
@@ -4,7 +4,7 @@

 ### Get Sample Dataset

-go to directory `python/examples/criteo_ctr_with_cube`
+go to directory `examples/C++/PaddleRec/criteo_ctr_with_cube`
 ```
 sh get_data.sh
 ```
@@ -45,7 +45,7 @@ python3 test_client.py ctr_client_conf/serving_client_conf.prototxt ./raw_data

 CPU ：Intel(R) Xeon(R) CPU 6148 @ 2.40GHz 

-Model ：[Criteo CTR](https://github.com/PaddlePaddle/Serving/blob/develop/python/examples/criteo_ctr_with_cube/network_conf.py)
+Model ：[Criteo CTR](./network_conf.py)

 server core/thread num ： 4/8


--- a/examples/C++/PaddleRec/criteo_ctr_with_cube/README_CN.md
+++ b/examples/C++/PaddleRec/criteo_ctr_with_cube/README_CN.md
@@ -2,7 +2,7 @@
 (简体中文|[English](./README.md))

 ### 获取样例数据
-进入目录 `python/examples/criteo_ctr_with_cube`
+进入目录 `examples/C++/PaddleRec/criteo_ctr_with_cube`
 ```
 sh get_data.sh
 ```
@@ -43,7 +43,7 @@ python3 test_client.py ctr_client_conf/serving_client_conf.prototxt ./raw_data

 设备 ：Intel(R) Xeon(R) CPU 6148 @ 2.40GHz 

-模型 ：[Criteo CTR](https://github.com/PaddlePaddle/Serving/blob/develop/python/examples/criteo_ctr_with_cube/network_conf.py)
+模型 ：[Criteo CTR](./network_conf.py)

 server core/thread num ： 4/8


--- a/examples/util/README.md
+++ b/examples/util/README.md
@@ -26,6 +26,6 @@ The script converts the time-dot information in the log into a json format and s

 Specific operation: Open the chrome browser, enter `chrome://tracing/` in the address bar, jump to the tracing page, click the `load` button, and open the saved trace file to visualize the time information of each stage of the prediction service.

-The data visualization output is shown as follow, it uses [bert as service example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert) GPU inference service. The server starts 4 GPU prediction, the client starts 4 `processes`, and the timeline of each stage when the batch size is 1. Among them, `bert_pre` represents the data preprocessing stage of the client, and `client_infer` represents the stage where the client completes sending and receiving prediction requests. `process` represents the process number of the client, and the second line of each process shows the timeline of each op of the server.
+The data visualization output is shown as follow, it uses [bert as service example](../C++/PaddleNLP/bert) GPU inference service. The server starts 4 GPU prediction, the client starts 4 `processes`, and the timeline of each stage when the batch size is 1. Among them, `bert_pre` represents the data preprocessing stage of the client, and `client_infer` represents the stage where the client completes sending and receiving prediction requests. `process` represents the process number of the client, and the second line of each process shows the timeline of each op of the server.

 ![timeline](../../doc/images/timeline-example.png)
--- a/examples/util/README_CN.md
+++ b/examples/util/README_CN.md
@@ -26,6 +26,6 @@ python3 timeline_trace.py profile trace

 具体操作：打开chrome浏览器，在地址栏输入chrome://tracing/，跳转至tracing页面，点击load按钮，打开保存的trace文件，即可将预测服务的各阶段时间信息可视化。

-效果如下图，图中展示了使用[bert示例](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert)的GPU预测服务，server端开启4卡预测，client端启动4进程，batch size为1时的各阶段timeline，其中bert_pre代表client端的数据预处理阶段，client_infer代表client完成预测请求的发送和接收结果的阶段，图中的process代表的是client的进程号，每个进进程的第二行展示的是server各个op的timeline。
+效果如下图，图中展示了使用[bert示例](../C++/PaddleNLP/bert)的GPU预测服务，server端开启4卡预测，client端启动4进程，batch size为1时的各阶段timeline，其中bert_pre代表client端的数据预处理阶段，client_infer代表client完成预测请求的发送和接收结果的阶段，图中的process代表的是client的进程号，每个进进程的第二行展示的是server各个op的timeline。

 ![timeline](../../doc/images/timeline-example.png)
--- a/java/README_CN.md
+++ b/java/README_CN.md
@@ -34,7 +34,7 @@ mvn install
 以fit_a_line模型为例，服务端启动与常规BRPC-Server端启动命令一样。

 ```
-cd ../../python/examples/fit_a_line
+cd ../../examples/C++/fit_a_line
 sh get_data.sh
 python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393
 ```
@@ -59,7 +59,7 @@ java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar Paddle
 对于input data type = string类型，以IMDB model ensemble模型为例，服务端启动

 ```
-cd ../../python/examples/pipeline/imdb_model_ensemble
+cd ../examples/Pipeline/imdb_model_ensemble
 sh get_data.sh
 python -m paddle_serving_server.serve --model imdb_cnn_model --port 9292 &> cnn.log &
 python -m paddle_serving_server.serve --model imdb_bow_model --port 9393 &> bow.log &
@@ -84,7 +84,7 @@ java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar Pipeli
 ### 对于input data type = INDArray类型，以Simple Pipeline WebService中的uci_housing_model模型为例，服务端启动

 ```
-cd ../../python/examples/pipeline/simple_web_service
+cd ../examples/Pipeline/simple_web_service
 sh get_data.sh
 python web_service_java.py &>log.txt &
 ```
@@ -102,7 +102,7 @@ java -cp paddle-serving-sdk-java-examples-0.0.1-jar-with-dependencies.jar Pipeli

 2.目前Serving已推出Pipeline模式（原理详见[Pipeline Serving](../doc/Python_Pipeline/Pipeline_Design_CN.md)），面向Java的Pipeline Serving Client已发布。

-3.注意PipelineClientExample.java中的ip和port（位于java/examples/src/main/java/[PipelineClientExample.java](./examples/src/main/java/PipelineClientExample.java)），需要与对应Pipeline server的config.yaml文件中配置的ip和port相对应。（以IMDB model ensemble模型为例，位于python/examples/pipeline/imdb_model_ensemble/[config.yaml](../python/examples/pipeline/imdb_model_ensemble/config.yml)）
+3.注意PipelineClientExample.java中的ip和port（位于java/examples/src/main/java/[PipelineClientExample.java](./examples/src/main/java/PipelineClientExample.java)），需要与对应Pipeline server的config.yaml文件中配置的ip和port相对应。（以IMDB model ensemble模型为例，位于python/examples/pipeline/imdb_model_ensemble/[config.yaml](../examples/Pipeline/imdb_model_ensemble/config.yml)）

 ### 开发部署指导


--- a/python/paddle_serving_app/README.md
+++ b/python/paddle_serving_app/README.md
@@ -52,7 +52,7 @@ Preprocessing for Chinese semantic representation task.

    - line（st ）：Text input.

-  [example](../examples/bert/bert_client.py)
+  [example](../../examples/C++/PaddleNLP/bert/bert_client.py)

 - class LACReader 
  
@@ -67,7 +67,7 @@ Preprocessing for Chinese word segmentation task.
    - words（st ）：Original text input.
    - crf_decode（np.array）：CRF code predicted by model.

-  [example](../examples/lac/lac_http_client.py)
+  [example](../../examples/C++/PaddleNLP/lac/lac_http_client.py)

 - class SentaReader

@@ -76,9 +76,9 @@ Preprocessing for Chinese word segmentation task.
  - `process(cols)`
    - cols（st ）：Word segmentation result.

-  [example](../examples/senta/senta_web_service.py)
+  [example](../../examples/C++/PaddleNLP/senta/senta_web_service.py)

- The image preprocessing method is more flexible than the above method, and can be combined by the following multiple classes，[example](../examples/imagenet/resnet50_rpc_client.py)
+- The image preprocessing method is more flexible than the above method, and can be combined by the following multiple classes，[example](../../examples/C++/PaddleClas/imagenet/resnet50_rpc_client.py)

 - class Sequentia

@@ -144,7 +144,7 @@ This tool is convenient to analyze the proportion of time occupancy in the predi
 Load the trace file generated in the previous step through the load button, you can
 Visualize the time information of each stage of the forecast service.

-As shown in next figure, the figure shows the timeline of GPU prediction service using [bert example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert).
+As shown in next figure, the figure shows the timeline of GPU prediction service using [bert example](../../examples/C++/PaddleNLP/bert).
 The server side starts service with 4 GPU cards, the client side starts 4 processes to request, and the batch size is 1.
 In the figure, bert_pre represents the data pre-processing stage of the client, and client_infer represents the stage where the client completes the sending of the prediction request to the receiving result.
 The process in the figure represents the process number of the client, and the second line of each process shows the timeline of each op of the server.
@@ -157,7 +157,7 @@ The inference op of Paddle Serving is implemented based on Paddle inference lib.
 Before deploying the prediction service, you may need to check the input and output of the prediction service or check the resource consumption.
 Therefore, a local prediction tool is built into the paddle_serving_app, which is used in the same way as sending a request to the server through the client.

-Taking [fit_a_line prediction service](../examples/fit_a_line) as an example, the following code can be used to run local prediction.
+Taking [fit_a_line prediction service](../../examples/C++/fit_a_line) as an example, the following code can be used to run local prediction.

 ```python
 from paddle_serving_app.local_predict import LocalPredictor

--- a/python/paddle_serving_app/README_CN.md
+++ b/python/paddle_serving_app/README_CN.md
@@ -48,7 +48,7 @@ paddle_serving_app针对CV和NLP领域的模型任务，提供了多种常见的
  - `process(line)`
    - line（str）：输入文本

-  [参考示例](../examples/bert/bert_client.py)
+  [参考示例](../../examples/C++/PaddleNLP/bert/bert_client.py)

 - class LACReader 中文分词预处理

@@ -60,7 +60,7 @@ paddle_serving_app针对CV和NLP领域的模型任务，提供了多种常见的
    - words（str）：原始文本
    - crf_decode（np.array）：模型预测结果中的CRF编码

-  [参考示例](../examples/lac/lac_http_client.py)
+  [参考示例](../../examples/C++/PaddleNLP/lac/lac_http_client.py)

 - class SentaReader

@@ -69,9 +69,9 @@ paddle_serving_app针对CV和NLP领域的模型任务，提供了多种常见的
  - `process(cols)`
    - cols（str）：分词后的文本

-  [参考示例](../examples/senta/senta_web_service.py)
+  [参考示例](../../examples/C++/PaddleNLP/senta/senta_web_service.py)

- 图像的预处理方法相比于上述的方法更加灵活多变，可以通过以下的多个类进行组合，[参考示例](../examples/imagenet/resnet50_rpc_client.py)
+- 图像的预处理方法相比于上述的方法更加灵活多变，可以通过以下的多个类进行组合，[参考示例](../../examples/C++/PaddleClas/imagenet/resnet50_rpc_client.py)

 - class Sequentia

@@ -135,7 +135,7 @@ paddle_serving_app针对CV和NLP领域的模型任务，提供了多种常见的

 4. 使用chrome浏览器，打开`chrome://tracing/`网址，通过load按钮加载上一步产生的trace文件，即可将预测服务的各阶段时间信息可视化。

-   效果如下图，图中展示了使用[bert示例](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert)的GPU预测服务，server端开启4卡预测，client端启动4进程，batch size为1时的各阶段timeline。
+   效果如下图，图中展示了使用[bert示例](../../examples/C++/PaddleNLP/bert)的GPU预测服务，server端开启4卡预测，client端启动4进程，batch size为1时的各阶段timeline。
 其中bert_pre代表client端的数据预处理阶段，client_infer代表client完成预测请求的发送到接收结果的阶段，图中的process代表的是client的进程号，每个进程的第二行展示的是server各个op的timeline。

   ![timeline](../../doc/images/timeline-example.png)
@@ -144,7 +144,7 @@ paddle_serving_app针对CV和NLP领域的模型任务，提供了多种常见的

 Paddle Serving框架的server预测op使用了Paddle 的预测框架，在部署预测服务之前可能需要对预测服务的输入输出进行检验或者查看资源占用等。因此在paddle_serving_app中内置了本地预测工具，使用方式与通过client向服务端发送请求一致。

-以[fit_a_line预测服务](../examples/fit_a_line)为例，使用以下代码即可执行本地预测。
+以[fit_a_line预测服务](../../examples/C++/fit_a_line)为例，使用以下代码即可执行本地预测。

 ```python
 from paddle_serving_app.local_predict import LocalPredictor