([简体中文](./README_CN.md)|English)

<p align="center">
    <br>
<img src='doc/images/serving_logo.png' width = "600" height = "130">
    <br>
</p>

<p align="center">
    <br>
    <a href="https://travis-ci.com/PaddlePaddle/Serving">
        <img alt="Build Status" src="https://img.shields.io/travis/com/PaddlePaddle/Serving/develop?style=flat-square">
        <img alt="Docs" src="https://img.shields.io/badge/docs-中文文档-brightgreen?style=flat-square">
        <img alt="Release" src="https://img.shields.io/badge/release-0.8.0-blue?style=flat-square">
        <img alt="Python" src="https://img.shields.io/badge/python-3.6/3.7/3.8/3.9-blue?style=flat-square">
        <img alt="License" src="https://img.shields.io/github/license/PaddlePaddle/Serving?color=blue&style=flat-square">
        <img alt="Forks" src="https://img.shields.io/github/forks/PaddlePaddle/Serving?color=yellow&style=flat-square">
        <img alt="Issues" src="https://img.shields.io/github/issues/PaddlePaddle/Serving?color=yellow&style=flat-square">
        <img alt="Contributors" src="https://img.shields.io/github/contributors/PaddlePaddle/Serving?color=orange&style=flat-square">
        <img alt="Community" src="https://img.shields.io/badge/join-Wechat,QQ-orange?style=flat-square">
    </a>
    <br>
</p>

***

The goal of Paddle Serving is to provide high-performance, flexible, and easy-to-use industrial-grade online inference services for machine learning developers and enterprises. Paddle Serving supports multiple protocols such as RESTful, gRPC, and bRPC, and provides inference solutions for a variety of hardware and operating system environments, along with many well-known pre-trained model examples. The core features are as follows:


- Integrates the high-performance server-side inference engine Paddle Inference and the mobile-side engine Paddle Lite. Models from other machine learning platforms (Caffe/TensorFlow/ONNX/PyTorch) can be migrated to Paddle through [X2Paddle](https://github.com/PaddlePaddle/X2Paddle).
- Provides two frameworks: the high-performance C++ Serving and the easy-to-use Python Pipeline. C++ Serving is built on the bRPC network framework to deliver a high-throughput, low-latency inference service, and its performance indicators are ahead of competing products. The Python Pipeline is built on the gRPC/gRPC-Gateway network framework and the Python language to deliver an easy-to-use, high-throughput inference service. To choose between them, see [Technical Selection](doc/Serving_Design_EN.md#21-design-selection).
- Supports multiple [protocols](doc/C++_Serving/Inference_Protocols_CN.md) such as HTTP, gRPC, and bRPC, and provides C++, Python, and Java SDKs.
- Implements a high-performance inference service framework for asynchronous pipelines based on a directed acyclic graph (DAG), with features such as multi-model combination, asynchronous scheduling, concurrent inference, dynamic batching, multi-card multi-stream inference, and request caching.
- Adapts to a variety of commonly used computing hardware, such as x86 (Intel) CPU, ARM CPU, Nvidia GPU, Kunlun XPU, HUAWEI Ascend 310/910, HYGON DCU, and Nvidia Jetson.
- Integrates the Intel MKLDNN and Nvidia TensorRT acceleration libraries, and supports low-precision and quantized inference.
- Provides a secure model deployment solution, including encrypted model deployment, an authentication mechanism, and an HTTPS security gateway, all of which are used in practice.
- Supports cloud deployment and provides a deployment example on a Baidu Intelligent Cloud Kubernetes cluster.
- Provides more than 40 classic pre-trained model deployment examples from suites such as PaddleOCR, PaddleClas, PaddleDetection, PaddleSeg, PaddleNLP, and PaddleRec, with more models being added.
- Supports distributed deployment of large-scale sparse parameter index models, with features such as multiple tables, multiple shards, multiple replicas, and a local high-frequency cache; it can be deployed on a single machine or in the cloud.
- Supports service monitoring and provides Prometheus-based performance statistics and port access.
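
To make the DAG-based multi-model combination mentioned above more concrete, here is a minimal, synchronous sketch in plain Python. It is not Paddle Serving's implementation and uses none of its APIs: `run_dag`, the op names, and the toy "models" are all hypothetical, and the asynchronous scheduling and batching described in the feature list are left out. It only shows how ops can be ordered so that each one runs after the ops it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(ops, deps, request):
    """Run every op once, after all of its predecessors have finished.

    ops:  op name -> callable
    deps: op name -> tuple of predecessor op names (empty for entry ops)
    An entry op receives the request; any other op receives the outputs
    of its predecessors, in the order they are listed in deps.
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        preds = deps.get(name, ())
        inputs = [results[p] for p in preds] if preds else [request]
        results[name] = ops[name](*inputs)
    return results


# Hypothetical two-model pipeline: a "detector" splits the input into crops,
# a "recognizer" processes each crop, and a final op pairs the two outputs.
ops = {
    "detect": lambda image: [image[:3], image[3:]],
    "recognize": lambda crops: [crop.upper() for crop in crops],
    "combine": lambda crops, texts: list(zip(crops, texts)),
}
deps = {
    "detect": (),
    "recognize": ("detect",),
    "combine": ("detect", "recognize"),
}

results = run_dag(ops, deps, "abcdef")
print(results["combine"])
```

In a real pipeline each op would wrap a model call, and independent ops could run concurrently; the topological order is what guarantees that an op never starts before its inputs exist.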


<h2 align="center">Tutorial and Papers</h2>


- AIStudio tutorial (Chinese): [Paddle Serving Service Deployment Framework](https://www.paddlepaddle.org.cn/tutorials/projectdetail/3946013)
- AIStudio OCR practice (Chinese): [Hands-on OCR Service Deployment with Paddle Serving](https://aistudio.baidu.com/aistudio/projectdetail/3630726)
- Video tutorial (Chinese): [Deep Learning Service Deployment: Internet Applications as an Example](https://aistudio.baidu.com/aistudio/course/introduce/19084)
- Edge AI solution (Chinese): [Edge AI Solution Based on Paddle Serving and Baidu Intelligent Edge (BIE)](https://mp.weixin.qq.com/s/j0EVlQXaZ7qmoz9Fv96Yrw)

- Paper: [JiZhi: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu](https://arxiv.org/pdf/2106.01674.pdf)
- Paper: [ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation](https://arxiv.org/pdf/2112.12731.pdf)

<p align="center">
    <img src="doc/images/demo.gif" width="700">
</p>

<h2 align="center">Documentation</h2>


> Set up

This chapter guides you through the installation and deployment steps. We strongly recommend using Docker to deploy Paddle Serving; if you do not use Docker, skip the Docker-related steps. Paddle Serving can be deployed on cloud servers using Kubernetes and runs on many commonly used hardware platforms, such as ARM CPU, Intel CPU, Nvidia GPU, and Kunlun XPU. The latest development kit from the develop branch is compiled daily for developers to use.

- [Install Paddle Serving using Docker](doc/Install_EN.md)
- [Build Paddle Serving from Source with Docker](doc/Compile_EN.md)
- [Deploy Paddle Serving on Kubernetes (Chinese)](doc/Run_On_Kubernetes_CN.md)
- [Deploy Paddle Serving with Security Gateway (Chinese)](doc/Serving_Auth_Docker_CN.md)
- Deploy on more hardware: [ARM CPU, Baidu Kunlun](doc/Run_On_XPU_EN.md), [HUAWEI Ascend](doc/Run_On_NPU_CN.md), [HYGON DCU](doc/Run_On_DCU_CN.md), [Jetson](doc/Run_On_JETSON_CN.md)
- [Docker Images](doc/Docker_Images_EN.md)
- [Download Wheel packages](doc/Latest_Packages_EN.md)

> Use

Using Paddle Serving takes three steps. First, call the model save interface to generate the model parameter configuration file (.prototxt), which is used by both the client and the server. Second, read the configuration and startup parameters and start the service. Third, following the API documents and your use case, write client requests with the SDK and test the inference service.
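
The three steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than an authoritative recipe: it assumes the `paddle_serving_client.convert` and `paddle_serving_server.serve` command-line entry points and the `Client` class behave as described in the Quick Start and Save documents linked below, and the directory names, port, and thread count are placeholder values. The sketch only builds and prints the commands; check the flags against those documents before running them.

```python
import shlex


def convert_command(model_dir, server_dir="serving_server", client_dir="serving_client"):
    """Step 1: command that turns a saved inference model into a servable one.

    Assumed to write the .prototxt configuration files into server_dir and client_dir.
    """
    return ["python3", "-m", "paddle_serving_client.convert",
            "--dirname", model_dir,
            "--serving_server", server_dir,
            "--serving_client", client_dir]


def serve_command(server_dir="serving_server", port=9393, threads=10):
    """Step 2: command that starts the service from the server-side configuration."""
    return ["python3", "-m", "paddle_serving_server.serve",
            "--model", server_dir,
            "--thread", str(threads),
            "--port", str(port)]


def predict(feed, fetch, client_dir="serving_client", endpoint="127.0.0.1:9393"):
    """Step 3: send one request with the Python SDK (needs a running service)."""
    # Imported lazily so this sketch loads without Paddle Serving installed.
    from paddle_serving_client import Client

    client = Client()
    client.load_client_config(f"{client_dir}/serving_client_conf.prototxt")
    client.connect([endpoint])
    return client.predict(feed=feed, fetch=fetch)


# Print the commands for steps 1 and 2 instead of executing them.
print(shlex.join(convert_command("./inference_model")))
print(shlex.join(serve_command()))
```

The `feed` and `fetch` names passed to `predict` must match the variable names recorded in the generated client configuration, which is why the same configuration is needed on both sides.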

- [Quick Start](doc/Quick_Start_EN.md)
- [Save a servable model](doc/Save_EN.md)
- [Description of configuration and startup parameters](doc/Serving_Configure_EN.md)
- [Guide for RESTful/gRPC/bRPC APIs (Chinese)](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
- [Inference on quantized models](doc/Low_Precision_EN.md)
- [Data format of classic models (Chinese)](doc/Process_data_CN.md)
- [Prometheus (Chinese)](doc/Prometheus_CN.md)
- [C++ Serving (Chinese)](doc/C++_Serving/Introduction_CN.md)
  - [Protocols (Chinese)](doc/C++_Serving/Inference_Protocols_CN.md)
  - [Hot loading models](doc/C++_Serving/Hot_Loading_EN.md)
  - [A/B Test](doc/C++_Serving/ABTest_EN.md)
  - [Encryption](doc/C++_Serving/Encryption_EN.md)
  - [Analyze and optimize performance (Chinese)](doc/C++_Serving/Performance_Tuning_CN.md)
  - [Benchmark (Chinese)](doc/C++_Serving/Benchmark_CN.md)
  - [Multiple models in series (Chinese)](doc/C++_Serving/2+_model.md)
  - [Request Cache (Chinese)](doc/C++_Serving/Request_Cache_CN.md)
- [Python Pipeline](doc/Python_Pipeline/Pipeline_Design_EN.md)
  - [Analyze and optimize performance](doc/Python_Pipeline/Performance_Tuning_EN.md)
  - [TensorRT dynamic shape](doc/TensorRT_Dynamic_Shape_EN.md)
  - [Benchmark (Chinese)](doc/Python_Pipeline/Benchmark_CN.md)
- Client SDK
  - [Python SDK (Chinese)](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
  - [Java SDK](doc/Java_SDK_EN.md)
  - [C++ SDK (Chinese)](doc/C++_Serving/Introduction_CN.md#42-多语言多协议Client)
- [Large-scale sparse parameter server](doc/Cube_Local_EN.md)

<br>

> Developers

For Paddle Serving developers, we provide extended documents on topics such as custom operators (OP) and level of detail (LoD) processing.
- [Custom Operators](doc/C++_Serving/OP_EN.md)
- [Processing LoD Data](doc/LOD_EN.md)
- [FAQ (Chinese)](doc/FAQ_CN.md)

<h2 align="center">Model Zoo</h2>


Paddle Serving works closely with the Paddle model suites and implements a large number of service deployment examples, including image classification, object detection, language and text recognition, Chinese part-of-speech tagging, sentiment analysis, and content recommendation, for a total of 46 models.

<p align="center">

| PaddleOCR | PaddleDetection | PaddleClas | PaddleSeg | PaddleRec | PaddleNLP | PaddleVideo |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: |
| 8 | 12 | 14 | 2 | 3 | 6 | 1 |

</p>

For more model examples, see the [Model Zoo](doc/Model_Zoo_EN.md).

<p align="center">
  <img src="https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/imgs_results/PP-OCRv2/PP-OCRv2-pic003.jpg?raw=true" width="345"/> 
  <img src="doc/images/detection.png" width="350">
</p>


<h2 align="center">Community</h2>

To communicate with developers and other users, join the community through the following channels.

### WeChat
- Scan the QR code with WeChat

<p align="center">
  <img src="doc/images/wechat_group_1.jpeg" width="250">
</p>

### QQ
- QQ group (group number: 697765514)

<p align="center">
  <img src="doc/images/qq_group_1.png" width="200">
</p>


> Contribution

If you want to contribute code to Paddle Serving, please refer to the [Contribution Guidelines](doc/Contribute_EN.md).
- Thanks to [@loveululu](https://github.com/loveululu) for providing the Python API of Cube.
- Thanks to [@EtachGu](https://github.com/EtachGu) for updating the Docker run commands.
- Thanks to [@BeyondYourself](https://github.com/BeyondYourself) for complementing the gRPC tutorial, updating the FAQ doc, and fixing the mkdir command.
- Thanks to [@mcl-stone](https://github.com/mcl-stone) for updating the faster_rcnn benchmark.
- Thanks to [@cg82616424](https://github.com/cg82616424) for updating the unet benchmark and fixing a resize comment error.
- Thanks to [@cuicheng01](https://github.com/cuicheng01) for providing 11 PaddleClas models.
- Thanks to [@Jiaqi Liu](https://github.com/LiuChiachi) for supporting prediction with string list input.
- Thanks to [@Bin Lu](https://github.com/Intsigstephon) for adding the pp-shitu example.

> Feedback

To give feedback or report a bug, please open a [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues).

> License

[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)