We consider deploying deep learning inference service online to be a user-facing application in the future. **The goal of this project**: When you have trained a deep neural net with [Paddle](https://github.com/PaddlePaddle/Paddle), you are also capable to deploy the model online easily. A demo of Paddle Serving is as follows:
- Any model trained by [PaddlePaddle](https://github.com/paddlepaddle/paddle) can be directly used or [Model Conversion Interface](./doc/SAVE_CN.md) for online deployment of Paddle Serving.
<h3align="center">Some Key Features of Paddle Serving</h3>
- Integrate with Paddle training pipeline seamlessly, most paddle models can be deployed **with one line command**.
-**Industrial serving features** supported, such as models management, online loading, online A/B testing etc.
-**Highly concurrent and efficient communication** between clients and servers supported.
-**Multiple programming languages** supported on client side, such as C++, python and Java.
***
- Any model trained by [PaddlePaddle](https://github.com/paddlepaddle/paddle) can be directly used or [Model Conversion Interface](./doc/SAVE.md) for online deployment of Paddle Serving.
- Support [Multi-model Pipeline Deployment](./doc/PIPELINE_SERVING.md), and provide the requirements of the REST interface and RPC interface itself, [Pipeline example](./python/examples/pipeline).
- Support the model zoos from the Paddle ecosystem, such as [PaddleDetection](./python/examples/detection), [PaddleOCR](./python/examples/ocr), [PaddleRec](https://github.com/PaddlePaddle/PaddleRec/tree/master/tools/recserving/movie_recommender).
- Provide a variety of pre-processing and post-processing to facilitate users in training, deployment and other stages of related code, bridging the gap between AI developers and application developers, please refer to
...
...
@@ -197,14 +206,6 @@ the response is
{"result":{"price":[[18.901151657104492]]}}
```
<h2align="center">Some Key Features of Paddle Serving</h2>
- Integrate with Paddle training pipeline seamlessly, most paddle models can be deployed **with one line command**.
-**Industrial serving features** supported, such as models management, online loading, online A/B testing etc.
-**Distributed Key-Value indexing** supported which is especially useful for large scale sparse features as model inputs.
-**Highly concurrent and efficient communication** between clients and servers supported.
-**Multiple programming languages** supported on client side, such as Golang, C++ and python.
**Tips:** If you want to use CPU server and GPU server (version>=0.5.0) at the same time, you should check the gcc version, only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
**Tips:** If you want to use CPU server and GPU server at the same time, you should check the gcc version, only Cuda10.1/10.2/11 can run with CPU server owing to the same gcc version(8.2).
@@ -73,4 +73,5 @@ If you need to change the version, please refer to the instructions on the homep
## Precautious
Runtime images cannot be used for compilation. If you want to compile from source, refer to [COMPILE](COMPILE.md).
- Runtime images cannot be used for compilation. If you want to compile from source, refer to [COMPILE](COMPILE.md).
- If you use Cuda9 and Cuda10 docker images, you cannot use `paddle_serving_server` CPU version at the same time, due to the limitation of gcc version. If you want to use both in one docker image, please choose images of Cuda10.1, Cuda10.2 and Cuda11.