@@ -54,8 +54,11 @@ You may need to use a domestic mirror source (in China, you can use the Tsinghua
If you need to install modules compiled from the develop branch, download the packages from the [latest packages list](./doc/LATEST_PACKAGES.md) and install them with the `pip install` command.
Packages of Paddle Serving support CentOS 6/7 and Ubuntu 16/18. Alternatively, you can use the HTTP service without installing the client.
The paddle-serving-server and paddle-serving-server-gpu packages support CentOS 6/7 and Ubuntu 16/18.
The paddle-serving-client and paddle-serving-app packages support Linux and Windows, but paddle-serving-client only supports Python 2.7/3.6/3.7.
We recommend installing paddle >= 1.8.2.
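As a quick pre-install check, the sketch below tests whether a Python version string is one of those listed above for paddle-serving-client. The helper name `client_python_supported` is hypothetical and only illustrates the check. The package names in the commented usage come from the text above.

```bash
# Hypothetical helper: succeeds only for the Python versions that the
# text lists for paddle-serving-client (2.7, 3.6, 3.7).
client_python_supported() {
  case "$1" in
    2.7|3.6|3.7) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (not run here): query the interpreter, then install.
# ver="$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
# client_python_supported "$ver" && pip install paddle-serving-client paddle-serving-app
```

Passing the version as an argument keeps the check independent of which interpreter name (`python`, `python3`) is on the machine.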
<h2 align="center"> Pre-built services with Paddle Serving</h2>
We recommend using Docker for compilation. A Paddle Serving compilation environment is already prepared for you; see [this document](DOCKER_IMAGES.md).
There are two CTR examples under python/examples: criteo_ctr and criteo_ctr_with_cube. The former saves the entire model during training, including the sparse parameters. The latter splits the model and saves it in two parts: the sparse parameters and the dense parameters. In industrial cases the scale of the sparse parameters is very large, on the order of 10^9, so it is not practical to run large-scale sparse parameter prediction on a single machine. We therefore introduced Cube, an industrial-grade product that Baidu has used for many years, to provide distributed sparse parameter services.
The local mode of Cube differs from distributed Cube: it is designed to be convenient for developers to use in experiments and demos.
<!--If there is a demand for distributed sparse parameter service, please continue reading [Distributed Cube User Guide](./Distributed_Cube) after reading this document (still developing).-->
This document uses the original model without any compression algorithm. If you need to deploy a quantized model online, please read [Quantization Storage on Cube Sparse Parameter Indexing](./CUBE_QUANT.md).
@@ -14,7 +14,35 @@ Under the same conditions, the communication time of the HTTP prediction service
Parameters for performance optimization:
The memory/GPU memory optimization option is enabled by default in Paddle Serving. It reduces memory/GPU memory usage and usually does not affect performance. To turn it off, pass --mem_optim_off on the command line.
ir_optim optimizes the computation graph and increases inference speed. It is turned off by default and can be turned on with --ir_optim on the command line.
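For illustration, the sketch below assembles server launch commands with these two switches. Only `--mem_optim_off` and `--ir_optim` come from the text above. The `python -m paddle_serving_server.serve` entry point, the `--model` and `--port` options, and the model directory and port values are assumptions, so check them against your installed version.

```bash
# Assumed entry point and options; the model directory and port are placeholders.
SERVE_CMD="python -m paddle_serving_server.serve --model uci_housing_model --port 9292"

# Turn on computation graph optimization (off by default).
echo "$SERVE_CMD --ir_optim"

# Turn off memory/GPU memory optimization (on by default).
echo "$SERVE_CMD --mem_optim_off"
```

The commands are echoed rather than executed so that the sketch can be dry-run on a machine without Paddle Serving installed; remove `echo` to actually start the service.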
+ Compiling the GPU version requires nvidia-docker.
## Dockerfile
[CPU Version Dockerfile](../tools/Dockerfile)
[GPU Version Dockerfile](../tools/Dockerfile.gpu)
## Instructions
### Building Docker Image
Create a new directory and copy the Dockerfile to this directory.
Run
```bash
docker build -t serving_compile:cpu .
```
Or
```bash
docker build -t serving_compile:cuda9 .
```
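The "create a new directory and copy the Dockerfile" step can be sketched as a small helper. The function name `prepare_build_dir` and the example paths are hypothetical. The sketch assumes the copied file should be named `Dockerfile`, since that is the file name `docker build` looks for in the build context when no `-f` option is given.

```bash
# Hypothetical helper: create a build directory and copy the chosen
# Dockerfile into it under the default name "Dockerfile".
prepare_build_dir() {
  dockerfile="$1"   # e.g. tools/Dockerfile or tools/Dockerfile.gpu
  build_dir="$2"    # e.g. serving_build_gpu
  mkdir -p "$build_dir" && cp "$dockerfile" "$build_dir/Dockerfile"
}

# Example use (not run here; the paths are assumptions):
# prepare_build_dir tools/Dockerfile.gpu serving_build_gpu
# (cd serving_build_gpu && docker build -t serving_compile:cuda9 .)
```

Using a separate, otherwise empty directory keeps the build context small, because `docker build` sends the whole context directory to the Docker daemon.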
### Entering the Docker Container
For the CPU version, run
```bash
docker run -it serving_compile:cpu bash
```
For the GPU version, run
```bash
docker run -it --runtime=nvidia serving_compile:cuda9 bash
```
## List of supported environments compiled by Docker
The list of supported environments is as follows:
| System Environment Supported by CPU Docker Compiled Executables |
| -------------------------- |
| Centos6 |
| Centos7 |
| Ubuntu16.04 |
| Ubuntu18.04 |
| System Environment Supported by GPU Docker Compiled Executables |
| ---------------------------------- |
| Centos6_cuda9_cudnn7 |
| Centos7_cuda9_cudnn7 |
| Ubuntu16.04_cuda9_cudnn7 |
| Ubuntu16.04_cuda10_cudnn7 |
**Remarks:**
+ If libcrypto.so.10 and libssl.so.10 cannot be found when you execute the pre-compiled version, copy /usr/lib64/libssl.so.10 and /usr/lib64/libcrypto.so.10 from the Docker environment to the directory where the executable is located.
+ The CPU pre-compiled version can only be executed on CPU machines, and the GPU pre-compiled version can only be executed on GPU machines.
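The library-copy remark can be sketched as a small helper to run inside the Docker environment. The function name `copy_ssl_libs` and the destination path in the usage comment are hypothetical; the library names and the /usr/lib64 source directory come from the remark itself.

```bash
# Hypothetical helper: copy the two OpenSSL libraries named in the remark
# from a source directory to the directory that holds the executable.
copy_ssl_libs() {
  src_dir="$1"    # /usr/lib64 inside the Docker environment
  dest_dir="$2"   # directory where the executable is located
  for lib in libssl.so.10 libcrypto.so.10; do
    cp "$src_dir/$lib" "$dest_dir/" || return 1
  done
}

# Example use (not run here; the destination is a placeholder):
# copy_ssl_libs /usr/lib64 /path/to/executable/dir
```

The helper stops at the first failed copy and returns a non-zero status, so a missing library is reported instead of being silently skipped.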