diff --git a/README.md b/README.md
index 8aa899a3d1db797ea1a38476e5d56c425501f23e..de48b7a9baa457f4d062e43d5fb0c79757a2a68d 100644
--- a/README.md
+++ b/README.md
@@ -18,19 +18,19 @@
Motivation
-We consider deploying deep learning inference service online to be a user-facing application in the future. **The goal of this project**: When you have trained a deep neural net with [Paddle](https://github.com/PaddlePaddle/Paddle), you can put the model online without much effort. A demo of serving is as follows:
+We consider deploying deep learning inference service online to be a user-facing application in the future. **The goal of this project**: When you have trained a deep neural net with [Paddle](https://github.com/PaddlePaddle/Paddle), you can also easily deploy the model online. A demo of Paddle Serving is as follows:
Some Key Features
-- Integrate with Paddle training pipeline seemlessly, most paddle models can be deployed **with one line command**.
+- Integrate with the Paddle training pipeline seamlessly; most Paddle models can be deployed **with one line command**.
- **Industrial serving features** supported, such as model management, online loading, online A/B testing, etc.
-- **Distributed Key-Value indexing** supported that is especially useful for large scale sparse features as model inputs.
-- **Highly concurrent and efficient communication** between clients and servers.
-- **Multiple programming languages** supported on client side, such as Golang, C++ and python
-- **Extensible framework design** that can support model serving beyond Paddle.
+- **Distributed Key-Value indexing** supported, which is especially useful for large-scale sparse features as model inputs.
+- **Highly concurrent and efficient communication** supported between clients and servers.
+- **Multiple programming languages** supported on the client side, such as Golang, C++ and Python.
+- **Extensible framework design** that can support model serving beyond Paddle.
Installation
@@ -53,7 +53,7 @@ Paddle Serving provides HTTP and RPC based service for users to access
### HTTP service
-Paddle Serving provides a built-in python module called `paddle_serving_server.serve` that can start a rpc service or a http service with one-line command. If we specify the argument `--name uci`, it means that we will have a HTTP service with a url of `$IP:$PORT/uci/prediction`
+Paddle Serving provides a built-in Python module called `paddle_serving_server.serve` that can start an RPC service or an HTTP service with a one-line command. If we specify the argument `--name uci`, it means that we will have an HTTP service with a URL of `$IP:$PORT/uci/prediction`
``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
```
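Equivalently to the `curl` call shown for this service, the request body can be built and sent from Python. A minimal sketch (an illustration only, not part of the project; it assumes the server started above is listening on `127.0.0.1:9292` and that the `requests` package is installed, so the actual network call is left commented out):

```python
import json

# Request body equivalent to the curl example: 13 input features
# under the alias "x", plus the variable names to fetch.
payload = {"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                 -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332],
           "fetch": ["price"]}
body = json.dumps(payload)

# With the server running, the request could be sent as:
# import requests
# r = requests.post("http://127.0.0.1:9292/uci/prediction",
#                   data=body, headers={"Content-Type": "application/json"})
# print(r.json())
```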
@@ -75,7 +75,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"x": [0.0137, -0.1136, 0.25
### RPC service
-A user can also start a rpc service with `paddle_serving_server.serve`. RPC service is usually faster than HTTP service, although a user needs to do some coding based on Paddle Serving's python client API. Note that we do not specify `--name` here.
+A user can also start an RPC service with `paddle_serving_server.serve`. An RPC service is usually faster than an HTTP service, although the user needs to do some coding based on Paddle Serving's Python client API. Note that we do not specify `--name` here.
``` shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
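For reference, a minimal RPC client for the service above can be sketched with the `paddle_serving_client` API used elsewhere in this repository. The client config path and endpoint below are assumptions for illustration, and the calls that need a running server are commented out:

```python
# Sketch of an RPC client for the UCI housing service started above.
# Assumes paddle-serving-client is installed and the server is on 127.0.0.1:9292.
feed = {"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
              -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}
fetch = ["price"]

# from paddle_serving_client import Client
# client = Client()
# client.load_client_config("uci_housing_client/serving_client_conf.prototxt")  # assumed path
# client.connect(["127.0.0.1:9292"])
# fetch_map = client.predict(feed=feed, fetch=fetch)
# print(fetch_map)
```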
@@ -239,26 +239,26 @@ curl -H "Content-Type:application/json" -X POST -d '{"url": "https://paddle-serv
### New to Paddle Serving
- [How to save a servable model?](doc/SAVE.md)
-- [An end-to-end tutorial from training to serving(Chinese)](doc/TRAIN_TO_SERVICE.md)
-- [Write Bert-as-Service in 10 minutes(Chinese)](doc/BERT_10_MINS.md)
+- [An end-to-end tutorial from training to inference service deployment](doc/TRAIN_TO_SERVICE.md)
+- [Write Bert-as-Service in 10 minutes](doc/BERT_10_MINS.md)
### Developers
- [How to configure Serving native operators on the server side?](doc/SERVER_DAG.md)
-- [How to develop a new Serving operator](doc/NEW_OPERATOR.md)
+- [How to develop a new Serving operator?](doc/NEW_OPERATOR.md)
- [Golang client](doc/IMDB_GO_CLIENT.md)
-- [Compile from source code(Chinese)](doc/COMPILE.md)
+- [Compile from source code](doc/COMPILE.md)
### About Efficiency
-- [How profile serving efficiency?(Chinese)](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/util)
-- [Benchmarks](doc/BENCHMARK.md)
+- [How to profile Paddle Serving latency?(Chinese)](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/util)
+- [CPU Benchmarks(Chinese)](doc/BENCHMARKING.md)
+- [GPU Benchmarks(Chinese)](doc/GPU_BENCHMARKING.md)
### FAQ
- [FAQ(Chinese)](doc/FAQ.md)
### Design
-- [Design Doc(Chinese)](doc/DESIGN_DOC.md)
-- [Design Doc(English)](doc/DESIGN_DOC_EN.md)
+- [Design Doc](doc/DESIGN_DOC.md)
Community
diff --git a/README_CN.md b/README_CN.md
index 8400038f840a9f26a1342d9fcf4bd9729adcb06c..edc7b9f19d9236cbc80f3add08181a6a49359e1a 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -1,18 +1,31 @@
-
-[![Build Status](https://img.shields.io/travis/com/PaddlePaddle/Serving/develop)](https://travis-ci.com/PaddlePaddle/Serving)
-[![Release](https://img.shields.io/badge/Release-0.0.3-yellowgreen)](Release)
-[![Issues](https://img.shields.io/github/issues/PaddlePaddle/Serving)](Issues)
-[![License](https://img.shields.io/github/license/PaddlePaddle/Serving)](LICENSE)
-[![Slack](https://img.shields.io/badge/Join-Slack-green)](https://paddleserving.slack.com/archives/CU0PB4K35)
+
动机
+
+Paddle Serving 旨在帮助深度学习开发者轻易部署在线预测服务。 **本项目目标**: 当用户使用 [Paddle](https://github.com/PaddlePaddle/Paddle) 训练了一个深度神经网络,就同时拥有了该模型的预测服务。
-## 动机
-Paddle Serving 帮助深度学习开发者轻易部署在线预测服务。 **本项目目标**: 只要你使用 [Paddle](https://github.com/PaddlePaddle/Paddle) 训练了一个深度神经网络,你就同时拥有了该模型的预测服务。
-## 核心功能
+核心功能
+
- 与Paddle训练紧密连接,绝大部分Paddle模型可以 **一键部署**.
- 支持 **工业级的服务能力** 例如模型管理,在线加载,在线A/B测试等.
- 支持 **分布式键值对索引** 助力于大规模稀疏特征作为模型输入.
@@ -20,7 +33,7 @@ Paddle Serving 帮助深度学习开发者轻易部署在线预测服务。 **
- 支持 **多种编程语言** 开发客户端,例如Golang,C++和Python.
- **可伸缩框架设计** 可支持不限于Paddle的模型服务.
-## 安装
+安装
强烈建议您在Docker内构建Paddle Serving,请查看[如何在Docker中运行PaddleServing](doc/RUN_IN_DOCKER_CN.md)
@@ -29,17 +42,51 @@ pip install paddle-serving-client
pip install paddle-serving-server
```
-## 快速启动示例
+快速启动示例
+
+波士顿房价预测
``` shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
-Python客户端请求
+Paddle Serving 为用户提供了基于 HTTP 和 RPC 的服务
+
+
+HTTP服务
+
+Paddle Serving提供了一个名为`paddle_serving_server.serve`的内置Python模块,可以使用单行命令启动RPC服务或HTTP服务。如果我们指定参数`--name uci`,则意味着我们将拥有一个HTTP服务,其URL为`$IP:$PORT/uci/prediction`。
+
+``` shell
+python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
+```
+
+
+| Argument | Type | Default | Description |
+|--------------|------|-----------|--------------------------------|
+| `thread` | int | `4` | Concurrency of current service |
+| `port` | int | `9292` | Exposed port of current service to users|
+| `name` | str | `""` | Service name, can be used to generate HTTP request url |
+| `model` | str | `""` | Path of paddle model directory to be served |
+
+我们使用 `curl` 命令来发送HTTP POST请求给刚刚启动的服务。用户也可以调用python库来发送HTTP POST请求,请参考英文文档 [requests](https://requests.readthedocs.io/en/master/)。
+
+
+``` shell
+curl -H "Content-Type:application/json" -X POST -d '{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
+```
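上面的 `curl` 请求也可以用 Python 的 `requests` 库发送。下面是一个最小示意(仅作说明,非项目代码;假设服务已按上文命令在本机 9292 端口启动,真正的发送调用以注释给出):

```python
import json

# 与上面 curl 等价的请求体:别名为 "x" 的 13 维特征,以及要获取的变量名
payload = {"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                 -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332],
           "fetch": ["price"]}
body = json.dumps(payload)

# 服务启动后可用 requests 发送(json= 参数会自动设置
# Content-Type: application/json):
# import requests
# r = requests.post("http://127.0.0.1:9292/uci/prediction", json=payload)
# print(r.json())
```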
+
+RPC服务
+
+用户还可以使用`paddle_serving_server.serve`启动RPC服务。 尽管用户需要基于Paddle Serving的python客户端API进行一些开发,但是RPC服务通常比HTTP服务更快。需要指出的是这里我们没有指定`--name`。
+
+``` shell
+python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
+```
``` python
+# A user can visit the RPC service through the paddle_serving_client API
from paddle_serving_client import Client
client = Client()
@@ -51,24 +98,105 @@ fetch_map = client.predict(feed={"x": data}, fetch=["price"])
print(fetch_map)
```
+在这里,`client.predict`函数具有两个参数。`feed`是带有模型输入变量别名和值的`python dict`。`fetch`是要从服务器返回的预测变量名列表。在该示例中,在训练过程中保存可服务模型时,指定的别名为`"x"`和`"price"`。
+
+Paddle Serving预装的服务
+
+中文分词模型
+
+- **介绍**:
+``` shell
+本示例为中文分词HTTP服务一键部署
+```
+
+- **下载服务包**:
+``` shell
+wget --no-check-certificate https://paddle-serving.bj.bcebos.com/lac/lac_model_jieba_web.tar.gz
+```
+- **启动web服务**:
+``` shell
+tar -xzf lac_model_jieba_web.tar.gz
+python lac_web_service.py jieba_server_model/ lac_workdir 9292
+```
+- **客户端请求示例**:
+``` shell
+curl -H "Content-Type:application/json" -X POST -d '{"words": "我爱北京天安门", "fetch":["word_seg"]}' http://127.0.0.1:9292/lac/prediction
+```
+- **返回结果示例**:
+``` shell
+{"word_seg":"我|爱|北京|天安门"}
+```
+
+图像分类模型
+
+- **介绍**:
+``` shell
+图像分类模型由Imagenet数据集训练而成,该服务会返回一个标签及其概率
+```
+
+- **下载服务包**:
+``` shell
+wget --no-check-certificate https://paddle-serving.bj.bcebos.com/imagenet-example/imagenet_demo.tar.gz
+```
+- **启动web服务**:
+``` shell
+tar -xzf imagenet_demo.tar.gz
+python image_classification_service_demo.py resnet50_serving_model
+```
+- **客户端请求示例**:
+
+
+``` shell
+curl -H "Content-Type:application/json" -X POST -d '{"url": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg", "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
+```
+- **返回结果示例**:
+``` shell
+{"label":"daisy","prob":0.9341403245925903}
+```
+
+
文档
+
+### 新手教程
+- [怎样保存用于Paddle Serving的模型?](doc/SAVE_CN.md)
+- [端到端完成从训练到部署全流程](doc/TRAIN_TO_SERVICE_CN.md)
+- [十分钟构建Bert-As-Service](doc/BERT_10_MINS_CN.md)
+
+### 开发者教程
+- [如何配置Server端的计算图?](doc/SERVER_DAG_CN.md)
+- [如何开发一个新的General Op?](doc/NEW_OPERATOR_CN.md)
+- [如何在Paddle Serving使用Go Client?](doc/IMDB_GO_CLIENT_CN.md)
+- [如何编译PaddleServing?](doc/COMPILE_CN.md)
+
+### 关于Paddle Serving性能
+- [如何测试Paddle Serving性能?](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/util)
+- [CPU版Benchmarks](doc/BENCHMARKING.md)
+- [GPU版Benchmarks](doc/GPU_BENCHMARKING.md)
+
+### FAQ
+- [常见问答](doc/deprecated/FAQ.md)
-## 文档
+### 设计文档
+- [Paddle Serving设计文档](doc/DESIGN_DOC_CN.md)
-[开发文档](doc/DESIGN.md)
+社区
-[如何在服务器端配置本地Op?](doc/SERVER_DAG.md)
+### Slack
-[如何开发一个新的Op?](doc/NEW_OPERATOR.md)
+想要同开发者和其他用户沟通吗?欢迎加入我们的 [Slack channel](https://paddleserving.slack.com/archives/CUBPKHKMJ)
-[Golang 客户端](doc/IMDB_GO_CLIENT.md)
+### 贡献代码
-[从源码编译](doc/COMPILE.md)
+如果您想为Paddle Serving贡献代码,请参考 [Contribution Guidelines](doc/CONTRIBUTE.md)
-[常见问答](doc/FAQ.md)
+### 反馈
-## 加入社区
-如果您想要联系其他用户和开发者,欢迎加入我们的 [Slack channel](https://paddleserving.slack.com/archives/CUBPKHKMJ)
+如有任何反馈或是bug,请在 [GitHub Issue](https://github.com/PaddlePaddle/Serving/issues)提交
-## 如何贡献代码
+### License
-如果您想要贡献代码给Paddle Serving,请参考[Contribution Guidelines](doc/CONTRIBUTE.md)
+[Apache 2.0 License](https://github.com/PaddlePaddle/Serving/blob/develop/LICENSE)
diff --git a/doc/ABTEST_IN_PADDLE_SERVING.md b/doc/ABTEST_IN_PADDLE_SERVING.md
index 17497fecf680c7579319e4f148a6e0af764dcda5..e02acbd8a1a6cfdb296cedf32ad7b7afc63995d7 100644
--- a/doc/ABTEST_IN_PADDLE_SERVING.md
+++ b/doc/ABTEST_IN_PADDLE_SERVING.md
@@ -1,5 +1,7 @@
# ABTEST in Paddle Serving
+([简体中文](./ABTEST_IN_PADDLE_SERVING_CN.md)|English)
+
This document will use an example of a text classification task based on the IMDB dataset to show how to build an A/B Test framework using Paddle Serving. The structure relationship between the client and servers in the example is shown in the figure below.
diff --git a/doc/ABTEST_IN_PADDLE_SERVING_CN.md b/doc/ABTEST_IN_PADDLE_SERVING_CN.md
index d31ddba6f72dfc23fa15defeda23468ab1785e62..e32bf783fcde20bb5dff3d2addaf764838975a81 100644
--- a/doc/ABTEST_IN_PADDLE_SERVING_CN.md
+++ b/doc/ABTEST_IN_PADDLE_SERVING_CN.md
@@ -1,5 +1,7 @@
# 如何使用Paddle Serving做ABTEST
+(简体中文|[English](./ABTEST_IN_PADDLE_SERVING.md))
+
该文档将会用一个基于IMDB数据集的文本分类任务的例子,介绍如何使用Paddle Serving搭建A/B Test框架,例中的Client端、Server端结构如下图所示。
diff --git a/doc/deprecated/BENCHMARKING.md b/doc/BENCHMARKING.md
similarity index 100%
rename from doc/deprecated/BENCHMARKING.md
rename to doc/BENCHMARKING.md
diff --git a/doc/DESIGN.md b/doc/DESIGN.md
index 8686eb7fc585c9df89218bd25262678fb49468d1..5d00d02171dccf07bfdafb9cdd85222a92c20113 100644
--- a/doc/DESIGN.md
+++ b/doc/DESIGN.md
@@ -14,17 +14,17 @@ The result is a complete serving solution.
## 2. Terms explanation
-- baidu-rpc: Baidu's official open source RPC framework, supports multiple common communication protocols, and provides a custom interface experience based on protobuf
-- Variant: Paddle Serving architecture is an abstraction of a minimal prediction cluster, which is characterized by all internal instances (replicas) being completely homogeneous and logically corresponding to a fixed version of a model
-- Endpoint: Multiple Variants form an Endpoint. Logically, Endpoint represents a model, and Variants within the Endpoint represent different versions.
-- OP: PaddlePaddle is used to encapsulate a numerical calculation operator, Paddle Serving is used to represent a basic business operation operator, and the core interface is inference. OP configures its dependent upstream OP to connect multiple OPs into a workflow
-- Channel: An abstraction of all request-level intermediate data of the OP; data exchange between OPs through Channels
-- Bus: manages all channels in a thread, and schedules the access relationship between the two sets of OP and Channel according to the DAG dependency graph between DAGs
-- Stage: Workflow according to the topology diagram described by DAG, a collection of OPs that belong to the same link and can be executed in parallel
-- Node: An Op operator instance composed of an Op operator class combined with parameter configuration, which is also an execution unit in Workflow
-- Workflow: executes the inference interface of each OP in order according to the topology described by DAG
-- DAG/Workflow: consists of several interdependent Nodes. Each Node can obtain the Request object through a specific interface. The node Op obtains the output object of its pre-op through the dependency relationship. The output of the last Node is the Response object by default.
-- Service: encapsulates a pv request, can configure several Workflows, reuse the current PV's Request object with each other, and then execute each in parallel/serial execution, and finally write the Response to the corresponding output slot; a Paddle-serving process Multiple sets of Service interfaces can be configured. The upstream determines the Service interface currently accessed based on the ServiceName.
+- **baidu-rpc**: Baidu's official open source RPC framework, supports multiple common communication protocols, and provides a custom interface experience based on protobuf
+- **Variant**: Paddle Serving architecture is an abstraction of a minimal prediction cluster, which is characterized by all internal instances (replicas) being completely homogeneous and logically corresponding to a fixed version of a model
+- **Endpoint**: Multiple Variants form an Endpoint. Logically, Endpoint represents a model, and Variants within the Endpoint represent different versions.
+- **OP**: In PaddlePaddle, an OP encapsulates a numerical calculation operator; in Paddle Serving, an OP represents a basic business operation operator whose core interface is inference. An OP configures its dependent upstream OPs to connect multiple OPs into a workflow
+- **Channel**: An abstraction of all request-level intermediate data of the OP; data exchange between OPs through Channels
+- **Bus**: manages all Channels in a thread, and schedules the access relationship between the OP set and the Channel set according to the DAG dependency graph
+- **Stage**: in the topology described by a Workflow's DAG, a collection of OPs that belong to the same stage and can be executed in parallel
+- **Node**: An OP operator instance composed of an OP operator class combined with parameter configuration, which is also an execution unit in Workflow
+- **Workflow**: executes the inference interface of each OP in order according to the topology described by DAG
+- **DAG/Workflow**: consists of several interdependent Nodes. Each Node can obtain the Request object through a specific interface. The node Op obtains the output object of its pre-op through the dependency relationship. The output of the last Node is the Response object by default.
+- **Service**: encapsulates a PV request; several Workflows can be configured that reuse the current PV's Request object, execute in parallel/serial, and finally write the Response to the corresponding output slot; a Paddle Serving process can be configured with multiple sets of Service interfaces, and the upstream determines the Service interface currently accessed based on the ServiceName.
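The Workflow/Stage/Channel terms above can be made concrete with a minimal Python sketch (an illustration only, not the actual C++ framework): Nodes run their OP's inference in topological order, each reading upstream outputs from per-node channels, and the last Node's output becomes the Response.

```python
# Minimal illustration of DAG-ordered OP execution (requires Python 3.9+).
from graphlib import TopologicalSorter

def run_workflow(dag, ops, request):
    """dag maps node -> set of upstream nodes; ops maps node -> callable."""
    outputs = {}  # per-node "Channel": request-level intermediate data
    last = None
    for node in TopologicalSorter(dag).static_order():
        upstream = [outputs[u] for u in sorted(dag.get(node, ()))]
        # A Node with no upstream OPs reads the Request object directly.
        outputs[node] = ops[node](request if not upstream else upstream)
        last = node
    return outputs[last]  # the last Node's output is the Response by default

# Example DAG: reader -> (a, b) -> combine; a and b form one parallel Stage.
dag = {"reader": set(), "a": {"reader"}, "b": {"reader"}, "combine": {"a", "b"}}
ops = {
    "reader": lambda req: req * 2,
    "a": lambda ins: ins[0] + 1,
    "b": lambda ins: ins[0] + 2,
    "combine": lambda ins: sum(ins),
}
print(run_workflow(dag, ops, 10))  # → 43
```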
## 3. Python Interface Design
@@ -38,10 +38,10 @@ Models that can be predicted using the Paddle Inference Library, models saved du
### 3.3 Overall design:
-The user starts the Client and Server through the Python Client. The Python API has a function to check whether the interconnection and the models to be accessed match.
-The Python API calls the pybind corresponding to the client and server functions implemented by Paddle Serving, and the information transmitted through RPC is implemented through RPC.
-The Client Python API currently has two simple functions, load_inference_conf and predict, which are used to perform loading of the model to be predicted and prediction, respectively.
-The Server Python API is mainly responsible for loading the estimation model and generating various configurations required by Paddle Serving, including engines, workflow, resources, etc.
+- The user starts the Client and Server through the Python Client. The Python API has a function to check whether the interconnection and the models to be accessed match.
+- The Python API calls the pybind bindings of the client and server functions implemented by Paddle Serving; the information exchanged between them is transmitted via RPC.
+- The Client Python API currently has two simple functions, load_inference_conf and predict, which load the model to be served and run prediction, respectively.
+- The Server Python API is mainly responsible for loading the inference model and generating the various configurations required by Paddle Serving, including engines, workflow, resources, etc.
### 3.4 Server Inferface
@@ -69,8 +69,8 @@ def save_model(server_model_folder,
![Paddle-Serving Overall Architecture](framework.png)
**Model Management Framework**: Connects model files of multiple machine learning platforms and provides a unified inference interface
-**Business Scheduling Framework**: Abstracts the calculation logic of various different prediction models, provides a general DAG scheduling framework, and connects different operators through DAG diagrams to complete a prediction service together. This abstract model allows users to conveniently implement their own calculation logic, and at the same time facilitates operator sharing. (Users build their own forecasting services. A large part of their work is to build DAGs and provide operators.)
-**PredictService**: Encapsulation of the externally provided prediction service interface. Define communication fields with the client through protobuf.
+**Business Scheduling Framework**: Abstracts the calculation logic of various inference models, provides a general DAG scheduling framework, and connects different operators through DAG diagrams to complete a prediction service together. This abstract model allows users to conveniently implement their own calculation logic while facilitating operator sharing. (Users building their own prediction services spend a large part of their work building DAGs and providing operators.)
+**Predict Service**: Encapsulation of the externally provided prediction service interface; communication fields with the client are defined through protobuf.
### 4.1 Model Management Framework
diff --git a/doc/DESIGN_CN.md b/doc/DESIGN_CN.md
index 2e10013fc46c4b121ffe5c9268e5b531fe7f9992..124e826c4591c89cb14d25153f4c9a3096ea8dfb 100644
--- a/doc/DESIGN_CN.md
+++ b/doc/DESIGN_CN.md
@@ -4,7 +4,7 @@
## 1. 项目背景
-PaddlePaddle是公司开源的机器学习框架,广泛支持各种深度学习模型的定制化开发; Paddle serving是Paddle的在线预测部分,与Paddle模型训练环节无缝衔接,提供机器学习预测云服务。本文将从模型、服务、接入等层面,自底向上描述Paddle Serving设计方案。
+PaddlePaddle是百度开源的机器学习框架,广泛支持各种深度学习模型的定制化开发; Paddle Serving是Paddle的在线预测部分,与Paddle模型训练环节无缝衔接,提供机器学习预测云服务。本文将从模型、服务、接入等层面,自底向上描述Paddle Serving设计方案。
1. 模型是Paddle Serving预测的核心,包括模型数据和推理计算的管理;
2. 预测框架封装模型推理计算,对外提供RPC接口,对接不同上游;
@@ -14,23 +14,23 @@ PaddlePaddle是公司开源的机器学习框架,广泛支持各种深度学
## 2. 名词解释
-- baidu-rpc 百度官方开源RPC框架,支持多种常见通信协议,提供基于protobuf的自定义接口体验
-- Variant Paddle Serving架构对一个最小预测集群的抽象,其特点是内部所有实例(副本)完全同质,逻辑上对应一个model的一个固定版本
-- Endpoint 多个Variant组成一个Endpoint,逻辑上看,Endpoint代表一个model,Endpoint内部的Variant代表不同的版本
-- OP PaddlePaddle用来封装一种数值计算的算子,Paddle Serving用来表示一种基础的业务操作算子,核心接口是inference。OP通过配置其依赖的上游OP,将多个OP串联成一个workflow
-- Channel 一个OP所有请求级中间数据的抽象;OP之间通过Channel进行数据交互
-- Bus 对一个线程中所有channel的管理,以及根据DAG之间的DAG依赖图对OP和Channel两个集合间的访问关系进行调度
-- Stage Workflow按照DAG描述的拓扑图中,属于同一个环节且可并行执行的OP集合
-- Node 由某个Op算子类结合参数配置组成的Op算子实例,也是Workflow中的一个执行单元
-- Workflow 按照DAG描述的拓扑,有序执行每个OP的inference接口
-- DAG/Workflow 由若干个相互依赖的Node组成,每个Node均可通过特定接口获得Request对象,节点Op通过依赖关系获得其前置Op的输出对象,最后一个Node的输出默认就是Response对象
-- Service 对一次pv的请求封装,可配置若干条Workflow,彼此之间复用当前PV的Request对象,然后各自并行/串行执行,最后将Response写入对应的输出slot中;一个Paddle-serving进程可配置多套Service接口,上游根据ServiceName决定当前访问的Service接口。
+- **baidu-rpc**: 百度官方开源RPC框架,支持多种常见通信协议,提供基于protobuf的自定义接口体验
+- **Variant**: Paddle Serving架构对一个最小预测集群的抽象,其特点是内部所有实例(副本)完全同质,逻辑上对应一个model的一个固定版本
+- **Endpoint**: 多个Variant组成一个Endpoint,逻辑上看,Endpoint代表一个model,Endpoint内部的Variant代表不同的版本
+- **OP**: PaddlePaddle用来封装一种数值计算的算子,Paddle Serving用来表示一种基础的业务操作算子,核心接口是inference。OP通过配置其依赖的上游OP,将多个OP串联成一个workflow
+- **Channel**: 一个OP所有请求级中间数据的抽象;OP之间通过Channel进行数据交互
+- **Bus**: 对一个线程中所有Channel的管理,以及根据DAG依赖图对OP和Channel两个集合间的访问关系进行调度
+- **Stage**: Workflow按照DAG描述的拓扑图中,属于同一个环节且可并行执行的OP集合
+- **Node**: 由某个OP算子类结合参数配置组成的OP算子实例,也是Workflow中的一个执行单元
+- **Workflow**: 按照DAG描述的拓扑,有序执行每个OP的inference接口
+- **DAG/Workflow**: 由若干个相互依赖的Node组成,每个Node均可通过特定接口获得Request对象,节点OP通过依赖关系获得其前置OP的输出对象,最后一个Node的输出默认就是Response对象
+- **Service**: 对一次PV的请求封装,可配置若干条Workflow,彼此之间复用当前PV的Request对象,然后各自并行/串行执行,最后将Response写入对应的输出slot中;一个Paddle-serving进程可配置多套Service接口,上游根据ServiceName决定当前访问的Service接口。
## 3. Python Interface设计
### 3.1 核心目标:
-一套Paddle Serving的动态库,支持Paddle保存的通用模型的远程预估服务,通过Python Interface调用PaddleServing底层的各种功能。
+完成一整套Paddle Serving的动态库,支持Paddle保存的通用模型的远程预估服务,通过Python Interface调用PaddleServing底层的各种功能。
### 3.2 通用模型:
@@ -38,10 +38,10 @@ PaddlePaddle是公司开源的机器学习框架,广泛支持各种深度学
### 3.3 整体设计:
-用户通过Python Client启动Client和Server,Python API有检查互联和待访问模型是否匹配的功能
-Python API背后调用的是Paddle Serving实现的client和server对应功能的pybind,互传的信息通过RPC实现
-Client Python API当前有两个简单的功能,load_inference_conf和predict,分别用来执行加载待预测的模型和预测
-Server Python API主要负责加载预估模型,以及生成Paddle Serving需要的各种配置,包括engines,workflow,resource等
+- 用户通过Python Client启动Client和Server,Python API有检查互联和待访问模型是否匹配的功能
+- Python API背后调用的是Paddle Serving实现的client和server对应功能的pybind,互传的信息通过RPC实现
+- Client Python API当前有两个简单的功能,load_inference_conf和predict,分别用来执行加载待预测的模型和预测
+- Server Python API主要负责加载预估模型,以及生成Paddle Serving需要的各种配置,包括engines,workflow,resource等
### 3.4 Server Inferface
diff --git a/doc/DESIGN_DOC.md b/doc/DESIGN_DOC.md
index 2f8a36ea6686b5add2a7e4e407eabfd14167490d..2e7baaeb885c732bb723979e90edae529e7cbc74 100644
--- a/doc/DESIGN_DOC.md
+++ b/doc/DESIGN_DOC.md
@@ -1,5 +1,7 @@
# Paddle Serving Design Doc
+([简体中文](./DESIGN_DOC_CN.md)|English)
+
## 1. Design Objectives
- Long Term Vision: Online deployment of deep learning models will be a user-facing application in the future. Any AI developer will face the problem of deploying an online service for his or her trained model.
diff --git a/doc/DESIGN_DOC_CN.md b/doc/DESIGN_DOC_CN.md
index 312379cd7543e70095e5a6d8168aab06b79a0525..2a63d56593dc47a5ca69f9c5c324710ee6dc3fc6 100644
--- a/doc/DESIGN_DOC_CN.md
+++ b/doc/DESIGN_DOC_CN.md
@@ -1,5 +1,7 @@
# Paddle Serving设计文档
+(简体中文|[English](./DESIGN_DOC.md))
+
## 1. 整体设计目标
- 长期使命:Paddle Serving是一个PaddlePaddle开源的在线服务框架,长期目标就是围绕着人工智能落地的最后一公里提供越来越专业、可靠、易用的服务。
diff --git a/doc/README.md b/doc/README.md
index 5d529175054fa97c495b2a7581fdcb2fe0e4c394..2d51eba9e2a2902685f9385c83542f32b98e5b4f 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -109,7 +109,7 @@ for data in test_reader():
[Design Doc](DESIGN.md)
-[FAQ](FAQ.md)
+[FAQ](./deprecated/FAQ.md)
### Senior Developer Guidelines
diff --git a/doc/README_CN.md b/doc/README_CN.md
index 82a82622faffe7b3d8ccffea6e2108caa9e5b57c..da5641cad333518ded9fbae4438f05ae20e30ddd 100644
--- a/doc/README_CN.md
+++ b/doc/README_CN.md
@@ -109,7 +109,7 @@ for data in test_reader():
[设计文档](DESIGN_CN.md)
-[FAQ](FAQ.md)
+[FAQ](./deprecated/FAQ.md)
### 资深开发者使用指南
diff --git a/doc/RUN_IN_DOCKER.md b/doc/RUN_IN_DOCKER.md
index 972de2d951e602d025fb5fcb8b3229dcc300f696..708739851b8e3ec5ca8b5e204a68169ec88041b5 100644
--- a/doc/RUN_IN_DOCKER.md
+++ b/doc/RUN_IN_DOCKER.md
@@ -1,5 +1,7 @@
# How to run PaddleServing in Docker
+([简体中文](./RUN_IN_DOCKER_CN.md)|English)
+
## Requirements
Docker (GPU version requires nvidia-docker to be installed on the GPU machine)
diff --git a/doc/RUN_IN_DOCKER_CN.md b/doc/RUN_IN_DOCKER_CN.md
index 17bdd30adbcbecd971904011208fe01d1d08f5ba..9f2abba176ca89f6d03d9602c2fd1e7d4a78980b 100644
--- a/doc/RUN_IN_DOCKER_CN.md
+++ b/doc/RUN_IN_DOCKER_CN.md
@@ -1,5 +1,7 @@
# 如何在Docker中运行PaddleServing
+(简体中文|[English](RUN_IN_DOCKER.md))
+
## 环境要求
Docker(GPU版本需要在GPU机器上安装nvidia-docker)
diff --git a/doc/TRAIN_TO_SERVICE.md b/doc/TRAIN_TO_SERVICE.md
index a5773accae5d135cdfad4c978656a667f442ff8e..4219e66948a9bc3b0ae43e5cda61aad8ae35b3a0 100644
--- a/doc/TRAIN_TO_SERVICE.md
+++ b/doc/TRAIN_TO_SERVICE.md
@@ -1,8 +1,8 @@
-# End-to-end process from training to deployment
+# An End-to-end Tutorial from Training to Inference Service Deployment
([简体中文](./TRAIN_TO_SERVICE_CN.md)|English)
-Paddle Serving is Paddle's high-performance online prediction service framework, which can flexibly support the deployment of most models. In this article, the IMDB review sentiment analysis task is used as an example to show the entire process from model training to deployment of prediction service through 9 steps.
+Paddle Serving is Paddle's high-performance online inference service framework, which can flexibly support the deployment of most models. In this article, the IMDB review sentiment analysis task is used as an example to show the entire process from model training to deployment of inference service through 9 steps.
## Step1:Prepare for Running Environment
Paddle Serving can be deployed on Linux environments such as Centos and Ubuntu. On other systems or in environments where you do not want to install the serving module, you can still access the server-side prediction service through the http service.