diff --git a/README.md b/README.md
index 8aa899a3d1db797ea1a38476e5d56c425501f23e..de48b7a9baa457f4d062e43d5fb0c79757a2a68d 100644
--- a/README.md
+++ b/README.md
@@ -18,19 +18,19 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-[](https://travis-ci.com/PaddlePaddle/Serving)
-[](Release)
-[](Issues)
-[](LICENSE)
-[](https://paddleserving.slack.com/archives/CU0PB4K35)
+
+
+
+
+
+
+``` shell
+curl -H "Content-Type:application/json" -X POST -d '{"url": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg", "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
+```
+- **Example of the returned result**:
+``` shell
+{"label":"daisy","prob":0.9341403245925903}
+```
+
+
diff --git a/doc/ABTEST_IN_PADDLE_SERVING_CN.md b/doc/ABTEST_IN_PADDLE_SERVING_CN.md
index d31ddba6f72dfc23fa15defeda23468ab1785e62..e32bf783fcde20bb5dff3d2addaf764838975a81 100644
--- a/doc/ABTEST_IN_PADDLE_SERVING_CN.md
+++ b/doc/ABTEST_IN_PADDLE_SERVING_CN.md
@@ -1,5 +1,7 @@
# 如何使用Paddle Serving做ABTEST
+(简体中文|[English](./ABTEST_IN_PADDLE_SERVING.md))
+
该文档将会用一个基于IMDB数据集的文本分类任务的例子,介绍如何使用Paddle Serving搭建A/B Test框架,例中的Client端、Server端结构如下图所示。
diff --git a/doc/deprecated/BENCHMARKING.md b/doc/BENCHMARKING.md
similarity index 100%
rename from doc/deprecated/BENCHMARKING.md
rename to doc/BENCHMARKING.md
diff --git a/doc/DESIGN.md b/doc/DESIGN.md
index 8686eb7fc585c9df89218bd25262678fb49468d1..5d00d02171dccf07bfdafb9cdd85222a92c20113 100644
--- a/doc/DESIGN.md
+++ b/doc/DESIGN.md
@@ -14,17 +14,17 @@ The result is a complete serving solution.
## 2. Terms explanation
-- baidu-rpc: Baidu's official open source RPC framework, supports multiple common communication protocols, and provides a custom interface experience based on protobuf
-- Variant: Paddle Serving architecture is an abstraction of a minimal prediction cluster, which is characterized by all internal instances (replicas) being completely homogeneous and logically corresponding to a fixed version of a model
-- Endpoint: Multiple Variants form an Endpoint. Logically, Endpoint represents a model, and Variants within the Endpoint represent different versions.
-- OP: PaddlePaddle is used to encapsulate a numerical calculation operator, Paddle Serving is used to represent a basic business operation operator, and the core interface is inference. OP configures its dependent upstream OP to connect multiple OPs into a workflow
-- Channel: An abstraction of all request-level intermediate data of the OP; data exchange between OPs through Channels
-- Bus: manages all channels in a thread, and schedules the access relationship between the two sets of OP and Channel according to the DAG dependency graph between DAGs
-- Stage: Workflow according to the topology diagram described by DAG, a collection of OPs that belong to the same link and can be executed in parallel
-- Node: An Op operator instance composed of an Op operator class combined with parameter configuration, which is also an execution unit in Workflow
-- Workflow: executes the inference interface of each OP in order according to the topology described by DAG
-- DAG/Workflow: consists of several interdependent Nodes. Each Node can obtain the Request object through a specific interface. The node Op obtains the output object of its pre-op through the dependency relationship. The output of the last Node is the Response object by default.
-- Service: encapsulates a pv request, can configure several Workflows, reuse the current PV's Request object with each other, and then execute each in parallel/serial execution, and finally write the Response to the corresponding output slot; a Paddle-serving process Multiple sets of Service interfaces can be configured. The upstream determines the Service interface currently accessed based on the ServiceName.
+- **baidu-rpc**: Baidu's official open-source RPC framework; it supports multiple common communication protocols and provides a protobuf-based custom interface experience
+- **Variant**: Paddle Serving architecture is an abstraction of a minimal prediction cluster, which is characterized by all internal instances (replicas) being completely homogeneous and logically corresponding to a fixed version of a model
+- **Endpoint**: Multiple Variants form an Endpoint. Logically, Endpoint represents a model, and Variants within the Endpoint represent different versions.
+- **OP**: in PaddlePaddle, an OP encapsulates a numerical computation operator; in Paddle Serving, an OP represents a basic business operation operator whose core interface is inference. By configuring its upstream OP dependencies, multiple OPs are chained into a workflow
+- **Channel**: An abstraction of all request-level intermediate data of the OP; data exchange between OPs through Channels
+- **Bus**: manages all Channels in a thread and schedules the access relationships between the set of OPs and the set of Channels according to the DAG dependency graph
+- **Stage**: within the topology described by a Workflow's DAG, a collection of OPs that belong to the same phase and can be executed in parallel
+- **Node**: An OP operator instance composed of an OP operator class combined with parameter configuration, which is also an execution unit in Workflow
+- **Workflow**: executes the inference interface of each OP in order according to the topology described by DAG
+- **DAG/Workflow**: consists of several interdependent Nodes. Each Node can obtain the Request object through a specific interface, and each node's OP obtains the output objects of its upstream OPs through the dependency relationships. The output of the last Node is the Response object by default.
+- **Service**: encapsulates one PV request; several Workflows can be configured, which share the current PV's Request object, execute in parallel or serially, and finally write the Response into the corresponding output slot. A Paddle Serving process can configure multiple Service interfaces, and the upstream selects the Service interface to access based on the ServiceName.
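The Bus/Stage scheduling described above — partitioning a Workflow's DAG of OPs into phases whose members can run in parallel — can be illustrated with a minimal, self-contained Python sketch (this is not Paddle Serving code, and the OP names are hypothetical):

```python
def compute_stages(deps):
    """Group the OPs of a DAG into stages.

    Each stage is the set of OPs whose upstream dependencies were all
    satisfied by earlier stages, so its members can execute in parallel.
    `deps` maps each OP name to the list of upstream OPs it depends on.
    """
    remaining = {op: set(upstream) for op, upstream in deps.items()}
    done, stages = set(), []
    while remaining:
        # OPs whose upstream OPs have all finished form the next stage
        ready = {op for op, upstream in remaining.items() if upstream <= done}
        if not ready:
            raise ValueError("cycle detected: dependency graph is not a DAG")
        stages.append(sorted(ready))
        done |= ready
        for op in ready:
            del remaining[op]
    return stages

# A hypothetical workflow: 'reader' feeds two OPs that can run in
# parallel, whose outputs are merged by 'combine'.
dag = {
    "reader": [],
    "classify": ["reader"],
    "ner": ["reader"],
    "combine": ["classify", "ner"],
}
print(compute_stages(dag))  # → [['reader'], ['classify', 'ner'], ['combine']]
```

Here `classify` and `ner` land in the same stage because neither depends on the other, which is exactly the property the Stage abstraction exploits.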
## 3. Python Interface Design
@@ -38,10 +38,10 @@ Models that can be predicted using the Paddle Inference Library, models saved du
### 3.3 Overall design:
-The user starts the Client and Server through the Python Client. The Python API has a function to check whether the interconnection and the models to be accessed match.
-The Python API calls the pybind corresponding to the client and server functions implemented by Paddle Serving, and the information transmitted through RPC is implemented through RPC.
-The Client Python API currently has two simple functions, load_inference_conf and predict, which are used to perform loading of the model to be predicted and prediction, respectively.
-The Server Python API is mainly responsible for loading the estimation model and generating various configurations required by Paddle Serving, including engines, workflow, resources, etc.
+- The user starts the Client and Server through the Python API, which includes a function to check that the interconnection works and that the model to be accessed matches.
+- The Python API calls the pybind bindings of the corresponding client and server functions implemented by Paddle Serving; the information exchanged between them is transmitted via RPC.
+- The Client Python API currently has two simple functions, load_inference_conf and predict, which load the model to be used for prediction and run prediction, respectively.
+- The Server Python API is mainly responsible for loading the inference model and generating the various configurations required by Paddle Serving, including engines, workflow, resources, etc.
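The two-step Client call pattern named above (load_inference_conf, then predict) can be sketched with a toy stub — this is not the real paddle_serving_client API, only an illustration of the flow and its required ordering, with dummy return values:

```python
class Client:
    """Toy stand-in for the Client Python API described above."""

    def __init__(self):
        self.conf = None

    def load_inference_conf(self, conf_path):
        # The real client parses the saved model configuration here;
        # this stub only records the path to enforce the call order.
        self.conf = conf_path

    def predict(self, feed, fetch):
        if self.conf is None:
            raise RuntimeError("call load_inference_conf() before predict()")
        # The real client sends `feed` over RPC and returns the fetched
        # variables; this stub echoes a dummy score per fetch name.
        return {name: 0.0 for name in fetch}

client = Client()
client.load_inference_conf("serving_client_conf.prototxt")
result = client.predict(feed={"words": [1, 2, 3]}, fetch=["prediction"])
print(result)  # → {'prediction': 0.0}
```

Calling predict before loading a configuration raises an error, mirroring the check-before-access behavior the Python API performs.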
### 3.4 Server Interface
@@ -69,8 +69,8 @@ def save_model(server_model_folder,

**Model Management Framework**: Connects model files of multiple machine learning platforms and provides a unified inference interface
-**Business Scheduling Framework**: Abstracts the calculation logic of various different prediction models, provides a general DAG scheduling framework, and connects different operators through DAG diagrams to complete a prediction service together. This abstract model allows users to conveniently implement their own calculation logic, and at the same time facilitates operator sharing. (Users build their own forecasting services. A large part of their work is to build DAGs and provide operators.)
-**PredictService**: Encapsulation of the externally provided prediction service interface. Define communication fields with the client through protobuf.
+**Business Scheduling Framework**: Abstracts the calculation logic of various different inference models, provides a general DAG scheduling framework, and connects different operators through DAG diagrams to complete a prediction service together. This abstraction allows users to conveniently implement their own calculation logic while facilitating operator sharing. (Users build their own prediction services; a large part of the work is building DAGs and providing operators.)
+**Predict Service**: Encapsulation of the externally provided prediction service interface; the communication fields with the client are defined via protobuf.
### 4.1 Model Management Framework
diff --git a/doc/DESIGN_CN.md b/doc/DESIGN_CN.md
index 2e10013fc46c4b121ffe5c9268e5b531fe7f9992..124e826c4591c89cb14d25153f4c9a3096ea8dfb 100644
--- a/doc/DESIGN_CN.md
+++ b/doc/DESIGN_CN.md
@@ -4,7 +4,7 @@
## 1. 项目背景
-PaddlePaddle是公司开源的机器学习框架,广泛支持各种深度学习模型的定制化开发; Paddle serving是Paddle的在线预测部分,与Paddle模型训练环节无缝衔接,提供机器学习预测云服务。本文将从模型、服务、接入等层面,自底向上描述Paddle Serving设计方案。
+PaddlePaddle是百度开源的机器学习框架,广泛支持各种深度学习模型的定制化开发; Paddle Serving是Paddle的在线预测部分,与Paddle模型训练环节无缝衔接,提供机器学习预测云服务。本文将从模型、服务、接入等层面,自底向上描述Paddle Serving设计方案。
1. 模型是Paddle Serving预测的核心,包括模型数据和推理计算的管理;
2. 预测框架封装模型推理计算,对外提供RPC接口,对接不同上游;
@@ -14,23 +14,23 @@ PaddlePaddle是公司开源的机器学习框架,广泛支持各种深度学
## 2. 名词解释
-- baidu-rpc 百度官方开源RPC框架,支持多种常见通信协议,提供基于protobuf的自定义接口体验
-- Variant Paddle Serving架构对一个最小预测集群的抽象,其特点是内部所有实例(副本)完全同质,逻辑上对应一个model的一个固定版本
-- Endpoint 多个Variant组成一个Endpoint,逻辑上看,Endpoint代表一个model,Endpoint内部的Variant代表不同的版本
-- OP PaddlePaddle用来封装一种数值计算的算子,Paddle Serving用来表示一种基础的业务操作算子,核心接口是inference。OP通过配置其依赖的上游OP,将多个OP串联成一个workflow
-- Channel 一个OP所有请求级中间数据的抽象;OP之间通过Channel进行数据交互
-- Bus 对一个线程中所有channel的管理,以及根据DAG之间的DAG依赖图对OP和Channel两个集合间的访问关系进行调度
-- Stage Workflow按照DAG描述的拓扑图中,属于同一个环节且可并行执行的OP集合
-- Node 由某个Op算子类结合参数配置组成的Op算子实例,也是Workflow中的一个执行单元
-- Workflow 按照DAG描述的拓扑,有序执行每个OP的inference接口
-- DAG/Workflow 由若干个相互依赖的Node组成,每个Node均可通过特定接口获得Request对象,节点Op通过依赖关系获得其前置Op的输出对象,最后一个Node的输出默认就是Response对象
-- Service 对一次pv的请求封装,可配置若干条Workflow,彼此之间复用当前PV的Request对象,然后各自并行/串行执行,最后将Response写入对应的输出slot中;一个Paddle-serving进程可配置多套Service接口,上游根据ServiceName决定当前访问的Service接口。
+- **baidu-rpc**: 百度官方开源RPC框架,支持多种常见通信协议,提供基于protobuf的自定义接口体验
+- **Variant**: Paddle Serving架构对一个最小预测集群的抽象,其特点是内部所有实例(副本)完全同质,逻辑上对应一个model的一个固定版本
+- **Endpoint**: 多个Variant组成一个Endpoint,逻辑上看,Endpoint代表一个model,Endpoint内部的Variant代表不同的版本
+- **OP**: PaddlePaddle用来封装一种数值计算的算子,Paddle Serving用来表示一种基础的业务操作算子,核心接口是inference。OP通过配置其依赖的上游OP,将多个OP串联成一个workflow
+- **Channel**: 一个OP所有请求级中间数据的抽象;OP之间通过Channel进行数据交互
+- **Bus**: 对一个线程中所有channel的管理,以及根据DAG之间的DAG依赖图对OP和Channel两个集合间的访问关系进行调度
+- **Stage**: Workflow按照DAG描述的拓扑图中,属于同一个环节且可并行执行的OP集合
+- **Node**: 由某个OP算子类结合参数配置组成的OP算子实例,也是Workflow中的一个执行单元
+- **Workflow**: 按照DAG描述的拓扑,有序执行每个OP的inference接口
+- **DAG/Workflow**: 由若干个相互依赖的Node组成,每个Node均可通过特定接口获得Request对象,节点OP通过依赖关系获得其前置OP的输出对象,最后一个Node的输出默认就是Response对象
+- **Service**: 对一次PV的请求封装,可配置若干条Workflow,彼此之间复用当前PV的Request对象,然后各自并行/串行执行,最后将Response写入对应的输出slot中;一个Paddle-serving进程可配置多套Service接口,上游根据ServiceName决定当前访问的Service接口。
## 3. Python Interface设计
### 3.1 核心目标:
-一套Paddle Serving的动态库,支持Paddle保存的通用模型的远程预估服务,通过Python Interface调用PaddleServing底层的各种功能。
+完成一整套Paddle Serving的动态库,支持Paddle保存的通用模型的远程预估服务,通过Python Interface调用PaddleServing底层的各种功能。
### 3.2 通用模型:
@@ -38,10 +38,10 @@ PaddlePaddle是公司开源的机器学习框架,广泛支持各种深度学
### 3.3 整体设计:
-用户通过Python Client启动Client和Server,Python API有检查互联和待访问模型是否匹配的功能
-Python API背后调用的是Paddle Serving实现的client和server对应功能的pybind,互传的信息通过RPC实现
-Client Python API当前有两个简单的功能,load_inference_conf和predict,分别用来执行加载待预测的模型和预测
-Server Python API主要负责加载预估模型,以及生成Paddle Serving需要的各种配置,包括engines,workflow,resource等
+- 用户通过Python Client启动Client和Server,Python API有检查互联和待访问模型是否匹配的功能
+- Python API背后调用的是Paddle Serving实现的client和server对应功能的pybind,互传的信息通过RPC实现
+- Client Python API当前有两个简单的功能,load_inference_conf和predict,分别用来执行加载待预测的模型和预测
+- Server Python API主要负责加载预估模型,以及生成Paddle Serving需要的各种配置,包括engines,workflow,resource等
### 3.4 Server Interface
diff --git a/doc/DESIGN_DOC.md b/doc/DESIGN_DOC.md
index 2f8a36ea6686b5add2a7e4e407eabfd14167490d..2e7baaeb885c732bb723979e90edae529e7cbc74 100644
--- a/doc/DESIGN_DOC.md
+++ b/doc/DESIGN_DOC.md
@@ -1,5 +1,7 @@
# Paddle Serving Design Doc
+([简体中文](./DESIGN_DOC_CN.md)|English)
+
## 1. Design Objectives
- Long Term Vision: Online deployment of deep learning models will be a user-facing application in the future. Any AI developer will face the problem of deploying an online service for his or her trained model.
diff --git a/doc/DESIGN_DOC_CN.md b/doc/DESIGN_DOC_CN.md
index 312379cd7543e70095e5a6d8168aab06b79a0525..2a63d56593dc47a5ca69f9c5c324710ee6dc3fc6 100644
--- a/doc/DESIGN_DOC_CN.md
+++ b/doc/DESIGN_DOC_CN.md
@@ -1,5 +1,7 @@
# Paddle Serving设计文档
+(简体中文|[English](./DESIGN_DOC.md))
+
## 1. 整体设计目标
- 长期使命:Paddle Serving是一个PaddlePaddle开源的在线服务框架,长期目标就是围绕着人工智能落地的最后一公里提供越来越专业、可靠、易用的服务。
diff --git a/doc/README.md b/doc/README.md
index 5d529175054fa97c495b2a7581fdcb2fe0e4c394..2d51eba9e2a2902685f9385c83542f32b98e5b4f 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -109,7 +109,7 @@ for data in test_reader():
[Design Doc](DESIGN.md)
-[FAQ](FAQ.md)
+[FAQ](./deprecated/FAQ.md)
### Senior Developer Guildlines
diff --git a/doc/README_CN.md b/doc/README_CN.md
index 82a82622faffe7b3d8ccffea6e2108caa9e5b57c..da5641cad333518ded9fbae4438f05ae20e30ddd 100644
--- a/doc/README_CN.md
+++ b/doc/README_CN.md
@@ -109,7 +109,7 @@ for data in test_reader():
[设计文档](DESIGN_CN.md)
-[FAQ](FAQ.md)
+[FAQ](./deprecated/FAQ.md)
### 资深开发者使用指南
diff --git a/doc/RUN_IN_DOCKER.md b/doc/RUN_IN_DOCKER.md
index 972de2d951e602d025fb5fcb8b3229dcc300f696..708739851b8e3ec5ca8b5e204a68169ec88041b5 100644
--- a/doc/RUN_IN_DOCKER.md
+++ b/doc/RUN_IN_DOCKER.md
@@ -1,5 +1,7 @@
# How to run PaddleServing in Docker
+([简体中文](./RUN_IN_DOCKER_CN.md)|English)
+
## Requirements
Docker (GPU version requires nvidia-docker to be installed on the GPU machine)
diff --git a/doc/RUN_IN_DOCKER_CN.md b/doc/RUN_IN_DOCKER_CN.md
index 17bdd30adbcbecd971904011208fe01d1d08f5ba..9f2abba176ca89f6d03d9602c2fd1e7d4a78980b 100644
--- a/doc/RUN_IN_DOCKER_CN.md
+++ b/doc/RUN_IN_DOCKER_CN.md
@@ -1,5 +1,7 @@
# 如何在Docker中运行PaddleServing
+(简体中文|[English](RUN_IN_DOCKER.md))
+
## 环境要求
Docker(GPU版本需要在GPU机器上安装nvidia-docker)
diff --git a/doc/TRAIN_TO_SERVICE.md b/doc/TRAIN_TO_SERVICE.md
index a5773accae5d135cdfad4c978656a667f442ff8e..4219e66948a9bc3b0ae43e5cda61aad8ae35b3a0 100644
--- a/doc/TRAIN_TO_SERVICE.md
+++ b/doc/TRAIN_TO_SERVICE.md
@@ -1,8 +1,8 @@
-# End-to-end process from training to deployment
+# An End-to-end Tutorial from Training to Inference Service Deployment
([简体中文](./TRAIN_TO_SERVICE_CN.md)|English)
-Paddle Serving is Paddle's high-performance online prediction service framework, which can flexibly support the deployment of most models. In this article, the IMDB review sentiment analysis task is used as an example to show the entire process from model training to deployment of prediction service through 9 steps.
+Paddle Serving is Paddle's high-performance online inference service framework, which can flexibly support the deployment of most models. In this article, the IMDB review sentiment analysis task is used as an example to show the entire process from model training to deployment of inference service through 9 steps.
## Step 1: Prepare the Running Environment
Paddle Serving can be deployed on Linux environments such as Centos and Ubuntu. On other systems or in environments where you do not want to install the serving module, you can still access the server-side prediction service through the http service.