Commit 0a5cea64, authored by barriery

Merge branch 'pipeline-auto-batch' of https://github.com/barrierye/Serving into pipeling-log

@@ -68,7 +68,7 @@ Paddle Serving uses this [Git branching model](http://nvie.com/posts/a-successfu
 1. Build and test
-Users can build Paddle Serving natively on Linux, see the [BUILD steps](doc/INSTALL.md).
+Users can build Paddle Serving natively on Linux, see the [BUILD steps](https://github.com/PaddlePaddle/Serving/blob/develop/doc/COMPILE.md).
 1. Keep pulling
......
@@ -6,7 +6,8 @@
 There are two CTR examples under python/examples: criteo_ctr and criteo_ctr_with_cube. The former saves the entire model during training, including the sparse parameters. The latter cuts out the sparse parameters and saves the model in two parts: the sparse parameters and the dense parameters. In industrial cases the scale of the sparse parameters is very large, on the order of 10^9, so it is not practical to serve large-scale sparse parameter prediction on a single machine. We therefore introduce Cube, an industrial-grade product that Baidu has used for sparse parameter indexing for many years, to provide a distributed sparse parameter service.
-The local mode of Cube is different from distributed Cube, which is designed to be convenient for developers to use in experiments and demos. If there is a demand for distributed sparse parameter service, please continue reading [Distributed Cube User Guide](./Distributed_Cube) after reading this document (still developing).
+The local mode of Cube is different from distributed Cube, which is designed to be convenient for developers to use in experiments and demos.
+<!--If there is a demand for distributed sparse parameter service, please continue reading [Distributed Cube User Guide](./Distributed_Cube) after reading this document (still developing).-->
 This document uses the original model without any compression algorithm. If you need to deploy a quantized model, please read [Quantization Storage on Cube Sparse Parameter Indexing](./CUBE_QUANT.md)
......
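@@ -6,7 +6,7 @@
 There are two CTR examples under python/examples: criteo_ctr and criteo_ctr_with_cube. The former saves the entire model during training, including the sparse parameters. The latter cuts out the sparse parameters and saves the model in two parts: the sparse parameters and the dense parameters. In industrial scenarios the scale of the sparse parameters is very large, on the order of 10^9, so it is not practical to serve large-scale sparse parameter prediction on a single machine. We therefore introduce Cube, an industrial-grade product that Baidu has used for sparse parameter indexing for many years, to provide a distributed sparse parameter service.
-Local Cube is a simplified version of distributed Cube, intended for developers running experiments and demos. If you need a distributed sparse parameter service, continue with the [Cube Sparse Parameter Indexing Service User Guide](分布式Cube) after reading this document (under construction).
+<!--Local Cube is a simplified version of distributed Cube, intended for developers running experiments and demos. If you need a distributed sparse parameter service, continue with the [Cube Sparse Parameter Indexing Service User Guide](分布式Cube) after reading this document (under construction).-->
 The models used in this document are original models that have not been processed by any compression algorithm. If you need to deploy a quantized model, please read the [Cube Sparse Parameter Indexing Quantization Storage User Guide](./CUBE_QUANT_CN.md)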
......
@@ -106,7 +106,7 @@ class FluidFamilyCore {
 ![Prediction service](predict-service.png)
-For the dependencies between OPs and how to build a workflow from OPs, see the relevant sections of [Creating a Prediction Service from Scratch](CREATING.md)
+For the dependencies between OPs and how to build a workflow from OPs, see the relevant sections of [Creating a Prediction Service from Scratch](https://github.com/PaddlePaddle/Serving/blob/develop/doc/deprecated/CREATING.md)
 Server-side instance view
......
@@ -254,6 +254,8 @@ dag:
     client_type: brpc # Use brpc or grpc client. The default is brpc
     retry: 1 # The number of times DAG executor retries after failure. The default value is 1, that is, no retrying
     use_profile: false # Whether to print the log on the server side. The default is false
+    tracer:
+        interval_s: 600 # Monitoring time interval of Tracer (in seconds). Do not start monitoring when the value is less than 1. The default value is 600
 ```
......
@@ -254,6 +254,8 @@ dag:
     client_type: brpc # Use the brpc or grpc client. The default is brpc
     retry: 1 # Number of times the DAG Executor retries after a failure. The default is 1, that is, no retry
     use_profile: false # Whether to print the log on the server side. The default is false
+    tracer:
+        interval_s: 600 # Monitoring interval of the Tracer, in seconds. Monitoring is not started when the value is less than 1. The default is 600
 ```
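The rule attached to `tracer.interval_s` (monitoring starts only when the value is at least 1, with 600 as the default) can be sketched as a small check. This is a minimal illustration that assumes the `dag` section has already been loaded into a Python dict; `tracer_enabled` is a hypothetical helper, not a Paddle Serving function.

```python
# Hypothetical helper for illustration; not part of the Paddle Serving API.
def tracer_enabled(dag_conf: dict) -> bool:
    """Return True when the Tracer should start monitoring.

    Assumes `dag_conf` is the already-parsed `dag` section of the config.
    """
    tracer_conf = dag_conf.get("tracer") or {}
    # 600 is the documented default when interval_s is not set.
    interval_s = tracer_conf.get("interval_s", 600)
    # Values below 1 mean "do not start monitoring".
    return interval_s >= 1


print(tracer_enabled({"tracer": {"interval_s": 600}}))
print(tracer_enabled({"tracer": {"interval_s": 0}}))
print(tracer_enabled({}))
```

Treating a missing `tracer` section as the default interval is an assumption made here for simplicity; the real loader may handle that case differently.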
......
@@ -77,7 +77,7 @@ service ImageClassifyService {
 For details on Serving-side configuration, see [Serving-side configuration](SERVING_CONFIGURE.md)
-The following configuration file chains ReaderOP, ClassifyOP, and WriteJsonOP into a workflow (for concepts such as OP and workflow, see the [design document](DESIGN.md))
+The following configuration file chains ReaderOP, ClassifyOP, and WriteJsonOP into a workflow (for concepts such as OP and workflow, see the [design document](../DESIGN.md))
 - Configuration file example:
......
@@ -26,7 +26,7 @@
 The model network configuration after pruning steps 1) through 5) is as follows:
-![Pruned CTR prediction network](pruned-ctr-network.png)
+![Pruned CTR prediction network](../pruned-ctr-network.png)
 The pruning process is described in detail below:
......
# Docker compilation environment preparation
([简体中文](./DOCKER_CN.md)|English)
## Environment requirements
+ Docker is installed on the development machine.
+ Compiling the GPU version requires nvidia-docker.
## Dockerfile
[CPU Version Dockerfile](../tools/Dockerfile)
[GPU Version Dockerfile](../tools/Dockerfile.gpu)
## Instructions
### Building Docker Image
Create a new directory and copy the contents of the chosen Dockerfile into a file named Dockerfile in that directory.
To build the CPU image, run
```bash
docker build -t serving_compile:cpu .
```
Or, to build the GPU image, run
```bash
docker build -t serving_compile:cuda9 .
```
### Entering the Docker Container
For the CPU version, run
```bash
docker run -it serving_compile:cpu bash
```
For the GPU version, run
```bash
docker run -it --runtime=nvidia serving_compile:cuda9 bash
```
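When the container is launched from a script, the two `docker run` variants above differ only in the runtime flag and the image tag. The following sketch assembles the argument list for either case; `docker_run_command` is a hypothetical helper written for this illustration, and the image tags are the ones used in the build step.

```python
import shlex
from typing import List


def docker_run_command(gpu: bool) -> List[str]:
    """Build the argv for entering the compile container (illustrative helper)."""
    cmd = ["docker", "run", "-it"]
    if gpu:
        # The GPU image needs the nvidia runtime provided by nvidia-docker.
        cmd.append("--runtime=nvidia")
    cmd.append("serving_compile:cuda9" if gpu else "serving_compile:cpu")
    cmd.append("bash")
    return cmd


# Print the commands instead of running them, so the sketch has no side effects.
print(shlex.join(docker_run_command(gpu=False)))
print(shlex.join(docker_run_command(gpu=True)))
```

Passing the list to `subprocess.run` would start the container; printing keeps the example safe to run on a machine without Docker.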
## Environments supported by executables compiled in Docker
The verified environments are listed below:
| System Environment Supported by CPU Docker Compiled Executables |
| -------------------------- |
| CentOS 6 |
| CentOS 7 |
| Ubuntu 16.04 |
| Ubuntu 18.04 |

| System Environment Supported by GPU Docker Compiled Executables |
| ---------------------------------- |
| Centos6_cuda9_cudnn7 |
| Centos7_cuda9_cudnn7 |
| Ubuntu16.04_cuda9_cudnn7 |
| Ubuntu16.04_cuda10_cudnn7 |
**Remarks:**
+ If libcrypto.so.10 and libssl.so.10 cannot be found when you run the pre-compiled version, copy /usr/lib64/libssl.so.10 and /usr/lib64/libcrypto.so.10 from the Docker environment to the directory where the executable is located.
+ The CPU pre-compiled version can only be executed on CPU machines, and the GPU pre-compiled version can only be executed on GPU machines.
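The library-copy remark above can be scripted. The sketch below is one possible way to do it, assuming it runs somewhere both the library directory and the executable directory are visible (for example inside the container with the output directory mounted); `copy_runtime_libs` is a hypothetical helper, and the demo uses temporary directories so that it does not touch /usr/lib64.

```python
import shutil
import tempfile
from pathlib import Path

# Library names taken from the remark above.
LIB_NAMES = ("libssl.so.10", "libcrypto.so.10")


def copy_runtime_libs(lib_dir: Path, exe_dir: Path) -> list:
    """Copy the OpenSSL libraries that exist in lib_dir next to the executable."""
    copied = []
    for name in LIB_NAMES:
        src = lib_dir / name
        if src.is_file():
            shutil.copy2(src, exe_dir / name)  # copy2 also preserves metadata
            copied.append(name)
    return copied


# Demo with throwaway directories standing in for /usr/lib64 and the
# executable directory.
with tempfile.TemporaryDirectory() as tmp:
    fake_lib, fake_exe = Path(tmp) / "lib64", Path(tmp) / "bin"
    fake_lib.mkdir()
    fake_exe.mkdir()
    for lib in LIB_NAMES:
        (fake_lib / lib).write_bytes(b"")
    print(copy_runtime_libs(fake_lib, fake_exe))
```

In real use the call would be something like `copy_runtime_libs(Path("/usr/lib64"), Path("/path/to/executable_dir"))`, where the second path is a placeholder.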
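# Docker compilation environment preparation
(Simplified Chinese|[English](./DOCKER.md))
## Environment requirements
+ Docker is installed on the development machine.
+ Compiling the GPU version requires nvidia-docker.
## Dockerfile
[CPU Version Dockerfile](../tools/Dockerfile)
[GPU Version Dockerfile](../tools/Dockerfile.gpu)
## Instructions
### Building the Docker image
Create a new directory and copy the contents of the Dockerfile into a file named Dockerfile in that directory.
Run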
```bash
docker build -t serving_compile:cpu .
```
Or
```bash
docker build -t serving_compile:cuda9 .
```
### Entering the Docker container
For the CPU version, run
```bash
docker run -it serving_compile:cpu bash
```
For the GPU version, run
```bash
docker run -it --runtime=nvidia serving_compile:cuda9 bash
```
## Environments supported by executables compiled in Docker
The verified environments are listed below:
| System Environment Supported by CPU Docker Compiled Executables |
| -------------------------- |
| CentOS 6 |
| CentOS 7 |
| Ubuntu 16.04 |
| Ubuntu 18.04 |

| System Environment Supported by GPU Docker Compiled Executables |
| ---------------------------------- |
| Centos6_cuda9_cudnn7 |
| Centos7_cuda9_cudnn7 |
| Ubuntu16.04_cuda9_cudnn7 |
| Ubuntu16.04_cuda10_cudnn7 |
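**Remarks:**
+ If libcrypto.so.10 and libssl.so.10 cannot be found when you run the pre-compiled version, copy /usr/lib64/libssl.so.10 and /usr/lib64/libcrypto.so.10 from the Docker environment to the directory where the executable is located.
+ The CPU pre-compiled version can only be executed on CPU machines, and the GPU pre-compiled version can only be executed on GPU machines.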
# Getting Started
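First complete the build by following the [compilation and installation instructions](INSTALL.md).
## Running the example
Note: this example uses the ImageNet image classification model and runs in CPU mode by default (GPU mode is not yet supported in the current version).
Step 1: Start the server: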
```shell
cd /path/to/paddle-serving/output/demo/serving/ && ./bin/serving &
```
By default, logs are written to ./log/ after startup. Tail the log to see the requests received by the serving side:
```shell
tail -f log/serving.INFO
```
Step 2: Start the client:
```shell
cd /path/to/paddle-serving/output/demo/client/image_classification && ./bin/ximage &
```
By default, logs are written to ./log/ after startup. Tail the log to see the classification results:
```shell
tail -f log/ximage.INFO
```
@@ -72,7 +72,7 @@ for i in range(0, len(samples) - BATCH_SIZE, BATCH_SIZE):
 print e.reason
 ```
-For the complete example, see [text_classification.py](../demo-client/python/text_classification.py)
+For the complete example, see [text_classification.py](https://github.com/PaddlePaddle/Serving/blob/develop/tools/cpp_examples/demo-client/python/text_classification.py)
 ## 3. Accessing HTTP Serving from PHP
@@ -128,4 +128,4 @@ for ($i = 0; $i < count($samples) - BATCH_SIZE; $i += BATCH_SIZE) {
 curl_close($ch);
 ```
-For the complete code, see [text_classification.php](../demo-client/php/text_classification.php)
+For the complete code, see [text_classification.php](https://github.com/PaddlePaddle/Serving/blob/develop/tools/cpp_examples/demo-client/php/text_classification.php)
[Design](DESIGN.md)
[Installation](INSTALL.md)
[Getting Started](GETTING_STARTED.md)
[Creating a Prediction Service](CREATING.md)
[Client Configure](CLIENT_CONFIGURE.md)
[Server Side Configuration](SERVING_CONFIGURE.md)
[How to Configure a Clustered Service](CLUSTERING.md)
[Multiple Serving Instances over Single GPU Card](MULTI_SERVING_OVER_SINGLE_GPU_CARD.md)
[Benchmarking](BENCHMARKING.md)
[GPU Benchmarking](GPU_BENCHMARKING.md)
[FAQ](FAQ.md)
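## CTR prediction service with the sparse parameter indexing service
This example demonstrates the `load_model_config` function on the gRPC server side. In this example, the bRPC server and the bRPC client use different configuration files (data from the bRPC client is first passed to Cube, and after Cube processes it, it is passed to the prediction library).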
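The flow described above (sparse feature ids are resolved by Cube first, and only then does the prediction library run on the result) can be pictured with a toy model. Everything in this sketch is an assumption made for illustration: the in-memory dict stands in for Cube, and `lookup_sparse` and `dense_predict` are hypothetical names, not Cube or Paddle Serving APIs.

```python
import math

# Toy stand-in for Cube: maps sparse feature ids to embedding vectors.
cube_store = {101: [0.1, 0.2], 202: [0.3, -0.1], 303: [0.0, 0.5]}


def lookup_sparse(feature_ids, dim=2):
    """Fetch one embedding per feature id; unknown ids fall back to zeros."""
    return [cube_store.get(fid, [0.0] * dim) for fid in feature_ids]


def dense_predict(embeddings, weights, bias=0.0):
    """Stand-in for the dense part: sum-pool the embeddings, then apply a
    linear layer followed by a sigmoid."""
    pooled = [sum(col) for col in zip(*embeddings)]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Step 1: the sparse lookup happens outside the prediction library.
embeddings = lookup_sparse([101, 303])
# Step 2: the dense model only sees the fetched embeddings.
print(dense_predict(embeddings, weights=[1.0, -1.0]))
```

The point of the split is that the large sparse table lives in a separate service, so the prediction process only needs to hold the dense parameters.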
### Get the sample data
```
sh get_data.sh
```
### Download the model and the sparse parameter sequence file
```
wget https://paddle-serving.bj.bcebos.com/unittest/ctr_cube_unittest.tar.gz
tar xf ctr_cube_unittest.tar.gz
mv models/ctr_client_conf ./
mv models/ctr_serving_model_kv ./
mv models/data ./cube/
```
After running these commands, the current directory contains the ctr_serving_model_kv and ctr_client_conf folders.
### Start the sparse parameter indexing service
```
wget https://paddle-serving.bj.bcebos.com/others/cube_app.tar.gz
tar xf cube_app.tar.gz
mv cube_app/cube* ./cube/
sh cube_prepare.sh &
```
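Here, the sparse parameters of the model are stored in the sparse parameter indexing service Cube. For an introduction to Cube, see the [Cube Local Mode User Guide](../../../doc/CUBE_LOCAL_CN.md)
### Start the RPC prediction service with 4 server threads (configurable in test_server.py)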
```
python test_server.py ctr_serving_model_kv ctr_client_conf/serving_client_conf.prototxt
```
### Run prediction
```
python test_client.py ./raw_data
```