Motivation
Paddle Serving helps deep learning developers deploy an online inference service with little effort. The goal of this project is that once you have trained a deep neural network with Paddle, you already have a model inference service. A demo of serving is as follows:
Key Features
- Integrates seamlessly with the Paddle training pipeline; most Paddle models can be deployed with a one-line command.
- Supports industrial serving features such as model management, online loading, and online A/B testing.
- Supports distributed key-value indexing, which is especially useful when large-scale sparse features are model inputs.
- Highly concurrent and efficient communication between clients and servers.
- Supports multiple programming languages on the client side, such as Golang, C++, and Python.
- Extensible framework design that can support model serving beyond Paddle.
Installation
We highly recommend running Paddle Serving in Docker; for details, please visit Run in Docker.
pip install paddle-serving-client
pip install paddle-serving-server
Quick Start Example
Boston House Price Prediction model
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
Paddle Serving provides both HTTP-based and RPC-based services for users to access.
HTTP service
python -m paddle_serving_server.web_serve --model uci_housing_model --thread 10 --port 9292 --name uci
Argument | Type | Default | Description |
---|---|---|---|
thread | int | 10 | Concurrency of current service |
port | int | 9292 | Exposed port of current service to users |
name | str | "" | Service name, can be used to generate HTTP request url |
model | str | "" | Path of paddle model directory to be served |
curl -H "Content-Type:application/json" -X POST -d '{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
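The same HTTP request can also be issued from Python. The following is a minimal sketch using only the standard library; the URL, the feed name `x`, and the fetch name `price` are taken from the curl command above, while the helper names `build_request` and `predict` are hypothetical and not part of Paddle Serving.

```python
import json
import urllib.request

# URL, feed name "x" and fetch name "price" come from the curl example.
# build_request and predict are hypothetical helper names for illustration.
URL = "http://127.0.0.1:9292/uci/prediction"


def build_request(features, url=URL):
    """Build a POST request equivalent to the curl command."""
    payload = {"x": list(features), "fetch": ["price"]}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=URL, timeout=5.0):
    """Send the request and return the decoded JSON response."""
    request = build_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the HTTP service from the previous step running, calling `predict` with the 13 feature values from the curl example should return the server's JSON response as a Python object.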
RPC service
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
# Access the RPC service through the paddle_serving_client API
from paddle_serving_client import Client

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": data}, fetch=["price"])
print(fetch_map)
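The client example passes a flat list of 13 floats as the feed for `x`. As a minimal sketch, a small guard can check the feature vector before it is handed to `client.predict`. The helper name `build_feed` is hypothetical, and the expected length of 13 is an assumption inferred from the sample data, not a value documented by Paddle Serving.

```python
def build_feed(values, name="x", expected_len=13):
    """Return a feed dict for client.predict after basic checks.

    expected_len=13 is an assumption inferred from the sample input.
    """
    values = [float(v) for v in values]
    if len(values) != expected_len:
        raise ValueError(
            f"expected {expected_len} features, got {len(values)}"
        )
    return {name: values}


data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
feed = build_feed(data)
# With a connected client:
# fetch_map = client.predict(feed=feed, fetch=["price"])
```

Failing early on the client side avoids a round trip to the server for an input that has the wrong shape.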
Pre-built services with Paddle Serving
Chinese Word Segmentation
- Description: Chinese word segmentation HTTP service that can be deployed with a one-line command.
- Download:
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/lac/lac_model_jieba_web.tar.gz
- Host web service:
tar -xzf lac_model_jieba_web.tar.gz
python lac_web_service.py jieba_server_model/ lac_workdir 9292
- Request sample:
curl -H "Content-Type:application/json" -X POST -d '{"words": "我爱北京天安门", "fetch":["crf_decode"]}' http://127.0.0.1:9292/lac/prediction
- Request result:
{"word_seg":"我|爱|北京|天安门"}
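The request result joins the segmented words with `|`. A minimal sketch of turning that response into a Python list is shown here; the key `word_seg` and the separator come from the sample result above, and the helper name `split_segments` is hypothetical.

```python
import json


def split_segments(response_text, key="word_seg", sep="|"):
    """Parse the JSON response and split the segmented string into words."""
    result = json.loads(response_text)
    return result[key].split(sep)


# Sample response copied from the request result above.
sample = '{"word_seg":"我|爱|北京|天安门"}'
words = split_segments(sample)
print(words)
```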
Image Classification
Image To Vector
Chinese Sentence To Vector
Document
New to Paddle Serving
- How to save a servable model?
- An end-to-end tutorial from training to serving
- Write Bert-as-Service in 10 minutes
Developers
- How to config Serving native operators on server side?
- How to develop a new Serving operator
- Golang client
- Compile from source code (Chinese)
About Efficiency
FAQ
Design
Community
Slack
To connect with other users and contributors, join our Slack channel.
Contribution
If you want to contribute code to Paddle Serving, please refer to the Contribution Guidelines.
Feedback
For any feedback or to report a bug, please open a GitHub Issue.