Compiling Paddle Serving is divided into the following steps
- Compilation Environment Preparation: According to the needs of the model and operating environment, select the most suitable image
- Download the Serving Code Repo: Download the Serving code library, and perform initialization operations as needed
- Environment Variable Preparation: According to the needs of the running environment, determine the various environment variables of Python. For example, the GPU environment also needs to determine the environment variables such as Cuda, Cudnn, TensorRT and so on.
- Compilation: Compile `paddle-serving-server`, `paddle-serving-client`, `paddle-serving-app` related whl packages
- Install Related Whl Packages: install the three compiled whl packages, and set the SERVING_BIN environment variable
In addition, for some C++ secondary development scenarios, we also provide OPENCV binding solutions.
It is recommended to use Docker for compilation. We have prepared the Paddle Serving compilation environment for you, see [this document](Docker_Images_EN.md).
Docker compilation is recommended. We have prepared the Paddle Serving compilation environment for you and configured the above compilation dependencies. For details, please refer to [this document](DOCKER_IMAGES_CN.md).
## Get Code
We provide five environment development images, namely CPU, Cuda10.1+Cudnn7, Cuda10.2+Cudnn7, Cuda10.2+Cudnn8, Cuda11.2+Cudnn8. We provide a Serving development image to cover the above environment. At the same time, we also support Paddle development mirroring.
``` python
gitclonehttps://github.com/PaddlePaddle/Serving
cdServing&&gitsubmoduleupdate--init--recursive
The Serving image name is **paddlepaddle/serving:${Serving development image Tag}** (If the network is not good, you can visit **registry.baidubce.com/paddlepaddle/serving:${Serving development image Tag}**), The name of the Paddle development image is **paddlepaddle/paddle:${Paddle Development Image Tag}**. In order to prevent users from confusing the two sets of mirroring, we explain the origin of the two sets of mirroring separately.
Serving development mirror is the mirror used to compile and debug prediction services provided by Serving suite in order to support various prediction environments. Paddle development mirror is the mirror used for compilation, development, and training models released by Paddle on the official website. In order to allow Paddle developers to use Serving directly in the same container. For developers who have already used Serving users in the previous version, Serving development image should not be unfamiliar. But for developers who are familiar with the Paddle training framework ecology, they should be more familiar with the existing Paddle development mirrors. In order to adapt to the different habits of all users, we have fully supported both sets of mirrors.
| Environment | Serving Dev Image Tag | OS | Paddle Dev Image Tag | OS |
We first need to pull related images for the environment we need. Under the **Environment** column in the above table, except for the CPU, the rest (Cuda**+Cudnn**) belong to the GPU environment.
You can use Serving Dev Images.
```
docker pull paddlepaddle/serving:${Serving Dev Image Tag}
## PYTHONROOT settings
# For GPU Image
nvidia-docker run --rm -it paddlepaddle/serving:${Serving Dev Image Tag} bash
```shell
# For example, the path of python is /usr/bin/python, you can set PYTHONROOT
export PYTHONROOT=/usr
# For CPU Image
docker run --rm -it paddlepaddle/serving:${Serving Dev Image Tag} bash
```
If you are using a Docker development image, please follow the following to determine the Python version to be compiled, and set the corresponding environment variables
You can also use Paddle Dev Images.
## Download the Serving Code Repo
**Note: If you are using Paddle to develop the image, you need to manually run `bash env_install.sh` after downloading the code base (as shown in the third line of the code box)**
If you use other Python version, please use the right `pip` accordingly.
If you are using a Serving development image, please follow the steps below to determine the Python version that needs to be compiled and set the corresponding environment variables. A total of three environment variables need to be set, namely `PYTHON_INCLUDE_DIR`, `PYTHON_LIBRARIES`, `PYTHON_EXECUTABLE`. Below we take python 3.7 as an example to introduce how to set these three environment variables.
## GOPATH Setting
The default GOPATH is set to `$HOME/go`, you can also set it to other values. **If it is the Docker environment provided by Serving, you do not need to set up.**
1) Set `PYTHON_INCLUDE_DIR`
```shell
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin
Search the directory where Python.h is located
```
find / -name Python.h
```
Usually there will be something like `**/include/python3.7/Python.h`, we only need to take its folder directory, for example, find `/usr/include/python3.7/Python.h`, Then we only need `export PYTHON_INCLUDE_DIR=/usr/include/python3.7/`.
If not found. Explanation 1) The development version of Python is not installed and needs to be re-installed. 2) Insufficient permissions cannot view the relevant system directories.
## Get go packages
2) Set `PYTHON_LIBRARIES`
```shell
go env-wGO111MODULE=on
go env-wGOPROXY=https://goproxy.cn,direct
go get -u github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
go get -u github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
go get -u github.com/golang/protobuf/protoc-gen-go@v1.4.3
go get -u google.golang.org/grpc@v1.33.0
go env-wGO111MODULE=auto
Search for libpython3.7.so
```
find / -name libpython3.7.so
```
Usually there will be something similar to `**/lib/libpython3.7.so` or `**/lib/x86_64-linux-gnu/libpython3.7.so`, we only need to take its folder directory, For example, find `/usr/local/lib/libpython3.7.so`, then we only need `export PYTHON_LIBRARIES=/usr/local/lib`.
If it is not found, it means 1) Statically compiling Python, you need to reinstall the dynamically compiled Python 2) The county is not enough to view the relevant system catalogs.
3) Set `PYTHON_EXECUTABLE`
## Compile Server
View the python3.7 path directly
```
which python3.7
```
If the result is `/usr/local/bin/python3.7`, then directly set `export PYTHON_EXECUTABLE=/usr/local/bin/python3.7`.
### Integrated CPU version paddle inference library
It is very important to set these three environment variables. After the settings are completed, we can perform the following operations (the following is the PYTHON environment of the development image of Paddle Cuda 11.2, if it is another image, please change the corresponding `PYTHON_INCLUDE_DIR`, `PYTHON_LIBRARIES` , `PYTHON_EXECUTABLE`).
``` shell
mkdir server-build-cpu &&cd server-build-cpu
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR\
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES\
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE\
-DSERVER=ON ..
make -j10
```
# The following three environment variables are the environment of Paddle development mirror Cuda11.2, such as other mirrors may need to be modified
you can execute `make install` to put targets under directory `./output`, you need to add`-DCMAKE_INSTALL_PREFIX=./output`to specify output path to cmake command shown above.
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin
### Integrated GPU version paddle inference library
python -m install -r python/requirements.txt
go env -w GO111MODULE=on
go env -w GOPROXY=https://goproxy.cn,direct
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go install google.golang.org/grpc@v1.33.0
go env -w GO111MODULE=auto
```
Compared with CPU environment, GPU environment needs to refer to the following table,
**It should be noted that the following table is used as a reference for non-Docker compilation environment. The Docker compilation environment has been configured with relevant parameters and does not need to be specified in cmake process. **
If you are a GPU user, you need to set additional `CUDA_PATH`, `CUDNN_LIBRARY`, `CUDA_CUDART_LIBRARY` and `TENSORRT_LIBRARY_PATH`.
| CUDA_TOOLKIT_ROOT_DIR | cuda installation path, usually /usr/local/cuda | Required for all environments | No (/usr/local/cuda) |
| CUDNN_LIBRARY | The directory where libcudnn.so.* is located, usually /usr/local/cuda/lib64/ | Required for all environments | No (/usr/local/cuda/lib64/) |
| CUDA_CUDART_LIBRARY | The directory where libcudart.so.* is located, usually /usr/local/cuda/lib64/ | Required for all environments | No (/usr/local/cuda/lib64/) |
| TENSORRT_ROOT | The upper level directory of the directory where libnvinfer.so.* is located, depends on the TensorRT installation directory | Cuda 9.0/10.0 does not need, other needs | No (/usr) |
| TENSORRT_ROOT | The upper level directory of the directory where libnvinfer.so.* is located, depends on the TensorRT installation directory | Required for all environments | No (/usr) |
If not in Docker environment, users can refer to the following execution methods. The specific path is subject to the current environment, and the code is only for reference.TENSORRT_LIBRARY_PATH is related to the TensorRT version and should be set according to the actual situation。For example, in the cuda10.1 environment, the TensorRT version is 6.0 (/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/),In the cuda10.2 and cuda11.0 environment, the TensorRT version is 7.1 (/usr/local/TensorRT-7.1.3.4/targets/x86_64-linux-gnu/).
Execute `make install` to put the target output in the `./output` directory.
We need to compile three targets in total, namely `paddle-serving-server`, `paddle-serving-client`, and `paddle-serving-app`, among which `paddle-serving-server` needs to distinguish between CPU or GPU version. If it is a CPU version, please run,
### Compile C++ Server under the condition of WITH_OPENCV=ON
**Note:** Only when you need to redevelop the paddle serving C + + part, and the new code depends on the OpenCV library, you need to do so.
### Compile paddle-serving-server
First of all , OpenCV library should be installed, if not, please refer to the `Compile and install OpenCV` section later in this article.
```
mkdir build_server
cd build_server
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DSERVER=ON \
-DWITH_GPU=OFF ..
make -j20
cd ..
```
In the compile command, add `DOPENCV_DIR=${OPENCV_DIR}` and `DWITH_OPENCV=ON`,for example:
``` shell
OPENCV_DIR=your_opencv_dir #`your_opencv_dir` is the installation path of OpenCV library。
mkdir server-build-cpu &&cd server-build-cpu
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES\
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE\
-DOPENCV_DIR=${OPENCV_DIR}\
-DWITH_OPENCV=ON \
-DSERVER=ON ..
make -j10
If it is the GPU version, please run,
```
mkdir build_server
cd build_server
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
-DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
-DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
-DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
-DSERVER=ON \
-DWITH_GPU=ON ..
make -j20
cd ..
```
**Note:** After the compilation is successful, you need to set the `SERVING_BIN` path, see the following [Notes](Compile_EN.md#Note).
## Compile Client
### Compile paddle-serving-client and paddle-serving-app
``` shell
mkdir client-build &&cd client-build
Next, we can continue to compile the client and app. The compilation commands for these two packages are common on all platforms, and do not distinguish between CPU and GPU versions.
```
# Compile paddle-serving-client
mkdir build_client
cd build_client
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DCLIENT=ON ..
make -j10
```
execute `make install` to put targets under directory `./output`
Regardless of the client, server or App part, after compiling, install the whl package in `python/dist/` in the temporary directory(`server-build-cpu`, `server-build-gpu`, `client-build`,`app-build`) of the compilation process.
for example:cd server-build-cpu/python/dist && pip install -U xxxxx.whl
Note the last line `export SERVING_BIN` in the previous section. When running the python server, the `SERVING_BIN` environment variable will be checked. If you want to use the binary file compiled by yourself, please set the environment variable to the path of the corresponding binary file. It is `export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`.
Where BUILD_DIR is the absolute path of `build_server`.
You can cd build_server path and execute `export SERVING_BIN=${PWD}/core/general-server/serving`
## Enable WITH_OPENCV option to compile C++ Server
## Note
**Note:** You only need to do this when you need to do secondary development on the Paddle Serving C++ part and the newly added code depends on the OpenCV library.
When running the python server, it will check the `SERVING_BIN` environment variable. If you want to use your own compiled binary file, set the environment variable to the path of the corresponding binary file, usually`export SERVING_BIN=${BUILD_DIR}/core/general-server/serving`.
BUILD_DIR is the absolute path of server build CPU or server build GPU。
for example: cd server-build-cpu && export SERVING_BIN=${PWD}/core/general-server/serving
To compile the Serving C++ Server part, when the WITH_OPENCV option is turned on, the installed OpenCV library is required. If it has not been installed, you can refer to the instructions at the back of this document to compile and install the OpenCV library.
Take the WITH_OPENCV option and compile the CPU version Paddle Inference Library as an example. On the basis of the above compilation command, add the `DOPENCV_DIR=${OPENCV_DIR}` and `DWITH_OPENCV=ON` options.
``` shell
OPENCV_DIR=your_opencv_dir #`your_opencv_dir` is the installation path of the opencv library.
mkdir build_server &&cd build_server
cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES\
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE\
-DOPENCV_DIR=${OPENCV_DIR}\
-DWITH_OPENCV=ON \
-DSERVER=ON ..
make -j10
```
**Note:** After the compilation is successful, you need to set the `SERVING_BIN` path, see the following [Notes](https://github.com/PaddlePaddle/Serving/blob/develop/doc/COMPILE_CN.md#Notes) ).
## Verify
Please use the example under `python/examples` to verify.
Paddle Serving supports prediction on the GPU through the PaddlePaddle inference library. The WITH_GPU option is used to detect basic libraries such as CUDA/CUDNN on the system. If an appropriate version is detected, the GPU Kernel will be compiled when PaddlePaddle is compiled.
Paddle Serving supports prediction on the GPU through the PaddlePaddle prediction library. The WITH_GPU option is used to detect basic libraries such as CUDA/CUDNN on the system. If a suitable version is detected, the GPU version of the OP Kernel will be compiled when the PaddlePaddle is compiled.
To compile the Paddle Serving GPU version on bare metal, you need to install these basic libraries:
-CUDA
-CuDNN
-CUDA
-CuDNN
To compile the TensorRT version, you need to install the TensorRT library.
Note here:
1. The basic library versions such as CUDA/CUDNN installed on the system where Serving is compiled, needs to be compatible with the actual GPU device. For example, the Tesla V100 card requires at least CUDA 9.0. If the version of the basic library such as CUDA used during compilation is too low, the generated GPU code is not compatible with the actual hardware device, which will cause the Serving process to fail to start or serious problems such as coredump.
2. Install the CUDA driver compatible with the actual GPU device on the system running Paddle Serving, and install the basic library compatible with the CUDA/CuDNN version used during compilation. If the version of CUDA/CuDNN installed on the system running Paddle Serving is lower than the version used at compile time, it may cause some cuda function call failures and other problems.
The things to note here are:
1. Compile the basic library versions such as CUDA/CUDNN installed on the system where Serving is located, and need to be compatible with the actual GPU device. For example, Tesla V100 card requires at least CUDA 9.0. If the version of basic libraries such as CUDA used during compilation is too low, the Serving process cannot be started due to the incompatibility between the generated GPU code and the actual hardware device, or serious problems such as coredump may occur.
2. Install the CUDA driver compatible with the actual GPU device on the system running Paddle Serving, and install the basic library compatible with the CUDA/CuDNN version used during compilation. If the version of CUDA/CuDNN installed on the system running Paddle Serving is lower than the version used during compilation, it may cause strange cuda function call failures and other problems.
The following is the base library version matching relationship used by the PaddlePaddle release version for reference:
The following is the matching relationship between PaddleServing mirrored Cuda, Cudnn, and TensorRT for reference:
| | CUDA | CuDNN | TensorRT |
| :----: | :-----: | :----------: | :----: |
| post101 | 10.1 | CuDNN 7.6.5 | 6.0.1 |
| post102 | 10.2 | CuDNN 8.0.5 | 7.1.3 |
| post11 | 11.0 | CuDNN 8.0.4 | 7.1.3 |
| | CUDA | CuDNN | TensorRT |
| :----: | :-----: | :----------: | :----: |
| post101 | 10.1 | CuDNN 7.6.5 | 6.0.1 |
| post102 | 10.2 | CuDNN 8.0.5 | 7.1.3 |
| post11 | 11.0 | CuDNN 8.0.4 | 7.1.3 |
### How to make the compiler detect the CuDNN library
### Attachment: How to make the Paddle Serving compilation system detect the CuDNN library
Download the corresponding CUDNN version from NVIDIA developer official website and decompressing it, add `-DCUDNN_ROOT` to cmake command, to specify the path of CUDNN.
After downloading the corresponding version of CuDNN from the official website of NVIDIA developer and decompressing it locally, add the `-DCUDNN_LIBRARY` parameter to the cmake compilation command and specify the path of the CuDNN library.
## Compile and install OpenCV
**Note:** You need to do this only if you need to import the opencv library into your C + + code.
## Attachment: Compile and install OpenCV library
**Note:** You only need to do this when you need to include the OpenCV library in your C++ code.
* First of all, you need to download the source code compiled package in the Linux environment from the OpenCV official website. Taking OpenCV3.4.7 as an example, the download command is as follows.
* First, you need to download the package compiled from the source code in the Linux environment from the OpenCV official website. Take OpenCV 3.4.7 as an example. The download command is as follows.
Finally, you can see the folder of `opencv-3.4.7/` in the current directory.
* Compile OpenCV, the OpenCV source path (`root_path`) and installation path (`install_path`) should be set by yourself. Enter the OpenCV source code path and compile it in the following way.
Finally, you can see the folder `opencv-3.4.7/` in the current directory.
* Compile OpenCV, set the OpenCV source path (`root_path`) and installation path (`install_path`). Enter the OpenCV source code path and compile in the following way.
```shell
root_path=your_opencv_root_path
...
...
@@ -299,11 +338,10 @@ make -j
make install
```
Among them, `root_path` is the downloaded OpenCV source code path, and `install_path` is the installation path of OpenCV. After `make install` is completed, the OpenCV header file and library file will be generated in this folder for later source code compilation.
Among them, `root_path` is the downloaded OpenCV source path, `install_path` is the installation path of OpenCV, after the completion of `make install`, OpenCV header files and library files will be generated in this folder, which are used to compile the code that references the OpenCV library .
The final file structure under the OpenCV installation path is as follows.
The final file structure under the installation path is as follows.
If you want to customize your Serving based on source code, use the version with the suffix - devel.
**cuda10.1-cudnn7-gcc54 image is not ready, you should run from dockerfile if you need it.**
| Description | OS | TAG | Dockerfile |
If you need to develop and compile based on the source code, please use the version with the suffix -devel.
**In the TAG column, 0.7.0 can also be replaced with the corresponding version number, such as 0.5.0/0.4.1, etc., but it should be noted that some development environments only increase with a certain version iteration, so not all environments All have the corresponding version number can be used.**
Running Images is lighter than Develop Images, and Running Images are made up with serving whl and bin, but without develop tools like cmake because of lower image size. If you want to know about it, plese check the document [Paddle Serving on Kubernetes.](./Run_On_Kubernetes_CN.md).
**Tips:** If you want to use CPU server and GPU server (version>=0.5.0) at the same time, you should check the gcc version, only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
We **highly recommend** you to **run Paddle Serving in Docker**, please visit [Run in Docker](Run_In_Docker_EN.md). See the [document](./Docker_Images_EN.md) for more docker images.
**Strongly recommend** you build **Paddle Serving** in Docker. For more images, please refer to [Docker Image List](Docker_Images_CN.md).
**Attention:**: Currently, the default GPU environment of paddlepaddle 2.1 is Cuda 10.2, so the sample code of GPU Docker is based on Cuda 10.2. We also provides docker images and whl packages for other GPU environments. If users use other environments, they need to carefully check and select the appropriate version.
**Tip-1**: This project only supports <mark>**Python3.6/3.7/3.8**</mark>, all subsequent operations related to Python/Pip need to select the correct Python version.
**Attention:** the following so-called 'python' or 'pip' stands for one of Python 3.6/3.7/3.8.
**Tip-2**: The GPU environments in the following examples are all cuda10.2-cudnn7. If you use Python Pipeline to deploy and need Nvidia TensorRT to optimize prediction performance, please refer to [Supported Mirroring Environment and Instructions](#4.-Supported-Docker-Images-and-Instruction) to choose other versions.
## 1. Start the Docker Container
<mark>**Both Serving Dev Image and Paddle Dev Image are supported at the same time. You can choose 1 from the operation 2 in chapters 1.1 and 1.2.**</mark>
nvidia-docker run -p 9292:9292 --name test -dit paddlepaddle/paddle:2.2.0-cuda10.2-cudnn7 bash
nvidia-docker exec -it test bash
git clone https://github.com/PaddlePaddle/Serving
# Paddle development image needs to execute the following script to increase the dependencies required by Serving
bash Serving/tools/paddle_env_install.sh
```
You may need to use a domestic mirror source(in China, you can use the Tsinghua mirror source, add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to pip command) to speed up the download.
## 2. Install Paddle Serving related whl Packages
If you need install modules compiled with develop branch, please download packages from [latest packages list](./Latest_Packages_CN.md) and install with `pip install` command. If you want to compile by yourself, please refer to [How to compile Paddle Serving?](Compile_EN.md)
Install the required pip dependencies
```
cd Serving
pip3 install -r python/requirements.txt
```
Packages of paddle-serving-server and paddle-serving-server-gpu support Centos 6/7, Ubuntu 16/18, Windows 10.
```shell
pip3 install paddle-serving-client==0.7.0
pip3 install paddle-serving-server==0.7.0 # CPU
pip3 install paddle-serving-app==0.7.0
pip3 install paddle-serving-server-gpu==0.7.0.post102 #GPU with CUDA10.2 + TensorRT6
# Other GPU environments need to confirm the environment before choosing which one to execute
pip3 install paddle-serving-server-gpu==0.7.0.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.7.0.post112 # GPU with CUDA11.2 + TensorRT8
```
Packages of paddle-serving-client and paddle-serving-app support Linux and Windows, but paddle-serving-client only support python3.6/3.7/3.8.
If you are in China, You may need to use a chinese mirror source (such as Tsinghua source, add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.
**For latest version, Cuda 9.0 or Cuda 10.0 are no longer supported, Python2.7/3.5 is no longer supported.**
If you need to use the installation package compiled by the develop branch, please download the download address from [Latest installation package list](./Latest_Packages_CN.md), and use the `pip install` command to install. If you want to compile by yourself, please refer to [Paddle Serving Compilation Document](./Compile_CN.md).
Recommended to install paddle >= 2.1.0
The paddle-serving-server and paddle-serving-server-gpu installation packages support Centos 6/7, Ubuntu 16/18 and Windows 10.
The paddle-serving-client and paddle-serving-app installation packages support Linux and Windows, and paddle-serving-client only supports python3.6/3.7/3.8.
## 3. Install Paddle related Python libraries
**You only need to install it when you use the `paddle_serving_client.convert` command or the `Python Pipeline framework`. **
```
# CPU users, please run
pip install paddlepaddle==2.1.0
# CPU environment please execute
pip3 install paddlepaddle==2.2.0
# GPU Cuda10.2 please run
pip install paddlepaddle-gpu==2.1.0
# GPU Cuda10.2 environment please execute
pip3 install paddlepaddle-gpu==2.2.0
```
**Note**: If your Cuda version is not 10.2, please do not execute the above commands directly, you need to refer to [Paddle-Inference official document-download and install the Linux prediction library](https://paddleinference.paddlepaddle.org.cn/master/user_guides/download_lib.html#python) Select the URL link of the corresponding GPU environment and install it.
**Note**: If your Cuda version is not 10.2, please do not execute the above commands directly, you need to refer to [Paddle official documentation-multi-version whl package list
Select the url link of the corresponding GPU environment and install it. For example, for Python3.6 users of Cuda 10.1, please select `cp36-cp36m` and
The url corresponding to `cuda10.1-cudnn7-mkl-gcc8.2-avx-trt6.0.1.5`, copy it and run
For example, for Python3.6 users of Cuda 10.1, please select the URL corresponding to `cp36-cp36m` and `linux-cuda10.1-cudnn7.6-trt6-gcc8.2` in the table, copy it and execute
the default `paddlepaddle-gpu==2.1.0` is Cuda 10.2 with no TensorRT. If you want to install PaddlePaddle with TensorRT. please also check the documentation-multi-version whl package list and find key word `cuda10.2-cudnn8.0-trt7.1.3`.
If it is other environment and Python version, please find the corresponding link in the table and install it with pip.
For **Windows Users**, please read the document [Paddle Serving for Windows Users](Windows_Tutorial_EN.md)
<h2 align="center">Quick Start Example</h2>
| Environment | Serving Development Image Tag | Operating System | Paddle Development Image Tag | Operating System |
This quick start example is mainly for those users who already have a model to deploy, and we also provide a model that can be used for deployment. in case if you want to know how to complete the process from offline training to online service, please refer to the AiStudio tutorial above.
For **Windows 10 users**, please refer to the document [Paddle Serving Guide for Windows Platform](Windows_Tutorial_CN.md).
**Tips:** If you want to use CPU server and GPU server at the same time, you should check the gcc version, only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
## Client
...
...
@@ -48,16 +49,16 @@ for kunlun user who uses arm-xpu or x86-xpu can download the wheel packages as f
@@ -18,7 +18,7 @@ Paddle Serving provides a user-friendly programming framework for multi-model co
The Server side is built based on <b>RPC Service</b> and <b>graph execution engine</b>. The relationship between them is shown in the following figure.
@@ -61,7 +61,7 @@ The graph execution engine consists of OPs and Channels, and the connected OPs s
- For cases where large data needs to be transferred between OPs, consider RAM DB external memory for global storage and data transfer by passing index keys in Channel.
@@ -80,7 +80,7 @@ The graph execution engine consists of OPs and Channels, and the connected OPs s
- The following illustration shows the design of Channel in the graph execution engine, using input buffer and output buffer to align data between multiple OP inputs and multiple OP outputs, with a queue in the middle to buffer.
@@ -323,7 +323,7 @@ All examples of pipelines are in [examples/pipeline/](../../examples/Pipeline) d
Here, we build a simple imdb model enable example to show how to use Pipeline Serving. The relevant code can be found in the `Serving/examples/Pipeline/imdb_model_ensemble` folder. The Server-side structure in the example is shown in the following figure:
One of the biggest benefits of Docker is portability, which can be deployed on multiple operating systems and mainstream cloud computing platforms. The Paddle Serving Docker image can be deployed on Linux, Mac and Windows platforms.
## Requirements
Docker (GPU version requires nvidia-docker to be installed on the GPU machine)
This document takes Python2 as an example to show how to run Paddle Serving in docker. You can also use Python3 to run related commands by replacing `python` with `python3`.
## CPU
### Get docker image
Refer to [this document](Docker_Images_EN.md) for a docker image:
docker run -p 9292:9292 --nametest-dit registry.baidubce.com/paddlepaddle/serving:latest-devel
docker exec-ittest bash
```
The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
### Install PaddleServing
Please refer to the instructions on the homepage to download the pip package of the corresponding version.
## GPU
The GPU version is basically the same as the CPU version, with only some differences in interface naming (GPU version requires nvidia-docker to be installed on the GPU machine).
### Get docker image
Refer to [this document](Docker_Images_EN.md) for a docker image, the following is an example of an `cuda9.0-cudnn7` image:
nvidia-docker run -p 9292:9292 --nametest-dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
nvidia-docker exec-ittest bash
```
or
```bash
docker run --gpus all -p 9292:9292 --nametest-dit registry.baidubce.com/paddlepaddle/serving:latest-cuda10.2-cudnn8-devel
docker exec-ittest bash
```
The `-p` option is to map the `9292` port of the container to the `9292` port of the host.
### Install PaddleServing
The mirror comes with `paddle_serving_server_gpu`, `paddle_serving_client`, and `paddle_serving_app` corresponding to the mirror tag version. If users don’t need to change the version, they can use it directly, which is suitable for environments without extranet services.
If you need to change the version, please refer to the instructions on the homepage to download the pip package of the corresponding version. [LATEST_PACKAGES](./Latest_Packages_CN.md)
## Precautious
- Runtime images cannot be used for compilation. If you want to compile from source, refer to [COMPILE](Compile_EN.md).
- If you use Cuda9 and Cuda10 docker images, you cannot use `paddle_serving_server` CPU version at the same time, due to the limitation of gcc version. If you want to use both in one docker image, please choose images of Cuda10.1, Cuda10.2 and Cuda11.
@@ -53,7 +53,7 @@ Paddle Serving takes into account a series of issues such as different operating
Cross-platform is not dependent on the operating system, nor on the hardware environment. Applications developed under one operating system can still run under another operating system. Therefore, the design should consider not only the development language and the cross-platform components, but also the interpretation differences of the compilers on different systems.
Docker is an open source application container engine that allows developers to package their applications and dependencies into a portable container, and then publish it to any popular Linux machine or Windows machine. We have packaged a variety of Docker images for the Paddle Serving framework. Refer to the image list《[Docker Images](Docker_Images_EN.md)》, Select mirrors according to user's usage. We provide Docker usage documentation《[How to run PaddleServing in Docker](Run_In_Docker_EN.md)》.Currently, the Python webservice mode can be deployed and run on the native Linux and Windows dual systems.《[Paddle Serving for Windows Users](Windows_Tutorial_EN.md)》
Docker is an open source application container engine that allows developers to package their applications and dependencies into a portable container, and then publish it to any popular Linux machine or Windows machine. We have packaged a variety of Docker images for the Paddle Serving framework. Refer to the image list《[Docker Images](Docker_Images_EN.md)》, Select mirrors according to user's usage. We provide Docker usage documentation《[How to run PaddleServing in Docker](Install_EN.md)》.Currently, the Python webservice mode can be deployed and run on the native Linux and Windows dual systems.《[Paddle Serving for Windows Users](Windows_Tutorial_EN.md)》
> Support multiple development languages client SDKs