Commit 3cff8d14 authored by: W wangguibao

Documentation

Change-Id: I6a21e9ecdf3ae46df15d92a5f10f5e0564da9ae9
Parent: e2199e78
# Client side configuration
The configuration file format of the Paddle Serving C++ client SDK is defined with protobuf; all definitions are in configure/proto/sdk_configure.proto. To add a configuration field, the corresponding field must first be added to that protobuf file before the Serving SDK can read and parse it.
The main Paddle Serving configuration file is conf/predictors.prototxt. An example follows:
## 1. Sample conf
......
@@ -75,6 +75,8 @@ service ImageClassifyService {
#### 2.2.2 Sample configuration
For details on the serving-side configuration, see [Serving side configuration](SERVING_CONFIGURE.md).
The following configuration file chains ReaderOP, ClassifyOP and WriteJsonOP into one workflow (for concepts such as OP and workflow, see the [design document](DESIGN.md)).
- Sample configuration file:
@@ -209,6 +211,8 @@ target_link_libraries(serving opencv_imgcodecs
|enable_model_toolkit|true|Enable model management|
|enable_protocol_list|baidu_std|List of brpc communication protocols|
|log_dir|./log|Log directory|
|num_threads|number of CPU cores|Number of system threads used by the brpc server|
|max_concurrency| |Maximum number of requests handled concurrently by the brpc server; a value <= 0 means no limit, a value > 0 caps the number of in-flight requests|
Default values can be overridden in serving/conf/gflags.conf, for example:
```
......
@@ -10,3 +10,5 @@
[Getting Started](GETTING_STARTED.md)
[Installation](INSTALL.md)
[Server Side Configuration](SERVING_CONFIGURE.md)
# Serving Side Configuration
Paddle Serving configuration files are protobuf files in plain-text format. Every field in a configuration file must first be defined in the corresponding .proto definition under configure/proto/ before protobuf can read and parse it.
All serving-side configuration is defined in configure/proto/server_configure.proto.
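As an illustration of how a config file maps to its .proto definition, the messages behind service.prototxt might look roughly like the sketch below. This is a guess based on the samples in this document, not the actual contents of server_configure.proto; field names and numbers are assumptions.
```protobuf
// Illustrative sketch only -- the real definitions live in
// configure/proto/server_configure.proto. Field names and numbers here
// are inferred from the service.prototxt sample below.
syntax = "proto2";

message InferService {
  required string name = 1;
  repeated string workflows = 2;
}

message InferServiceConf {
  optional int32 port = 1 [default = 8010];
  repeated InferService services = 2;
}
```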
## 1. service.prototxt
The entry point of the serving-side service configuration is service.prototxt, which configures the list of services mounted on a Paddle Serving instance. Its protobuf format is the `InferServiceConf` type in `configure/proto/server_configure.proto`. (The on-disk file path can be changed with the --inferservice_path and --inferservice_file command-line options.) A sample:
```protobuf
port: 8010
services {
name: "ImageClassifyService"
workflows: "workflow1"
}
```
Where:
port: the listening port on which the local serving instance starts. Defaults to 8010. It can also be specified with the --port=8010 command-line argument.
services: multiple services can be configured. Paddle Serving is designed so that a single serving instance can host several prediction services at the same time, distinguished by service name. For example, the following configures two prediction services:
```protobuf
port: 8010
services {
name: "ImageClassifyService"
workflows: "workflow1"
}
services {
name: "BuiltinEchoService"
workflows: "workflow2"
}
```
service.name: use the service name declared in the serving/proto/xx.proto file. For example, in serving/proto/image_class.proto the service is declared as follows:
```protobuf
service ImageClassifyService {
rpc inference(Request) returns (Response);
rpc debug(Request) returns (Response);
option (pds.options).generate_impl = true;
};
```
so the service name is `ImageClassifyService`.
service.workflows: the list of workflows attached to this service. Multiple workflows can be configured. In this example, one workflow, `workflow1`, is configured for `ImageClassifyService`. The concrete definition of `workflow1` is in workflow.prototxt.
## 2. workflow.prototxt
workflow.prototxt describes each concrete workflow. Its protobuf format is the `Workflow` type in `configure/proto/server_configure.proto`. The on-disk file path can be specified with --workflow_path and --workflow_file. An example:
```protobuf
workflows {
name: "workflow1"
workflow_type: "Sequence"
nodes {
name: "image_reader_op"
type: "ReaderOp"
}
nodes {
name: "image_classify_op"
type: "ClassifyOp"
dependencies {
name: "image_reader_op"
mode: "RO"
}
}
nodes {
name: "write_json_op"
type: "WriteJsonOp"
dependencies {
name: "image_classify_op"
mode: "RO"
}
}
}
workflows {
name: "workflow2"
workflow_type: "Sequence"
nodes {
name: "echo_op"
type: "CommonEchoOp"
}
}
```
The sample above configures two workflows: `workflow1` and `workflow2`. Taking `workflow1` as an example:
name: the workflow name, used to look up the concrete workflow from service.prototxt.
workflow_type: either "Sequence" or "Parallel", indicating whether the OPs represented by the nodes in this workflow may run in parallel. **Currently only the Sequence type is supported; if Parallel is configured, the workflow will not be executed.**
nodes: all nodes chained into the workflow; multiple nodes can be configured. Nodes are chained together through their dependencies.
node.name: arbitrary; it is recommended to choose a name that reflects the OP class executed by this node.
node.type: the class name of the OP executed by this node, matching the name of a concrete OP class under serving/op/.
node.dependencies: the list of upstream nodes this node depends on (a sketch with more than one dependency follows this list).
node.dependencies.name: must match the name of a node within the workflow.
node.dependencies.mode: RO - Read Only, RW - Read Write.
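Because `dependencies` is a list, a node may depend on several upstream nodes at once. A hypothetical fragment is shown below; the `ImageMergeOp` node and class name are made up for illustration and are not part of serving/op/.
```protobuf
nodes {
  name: "image_merge_op"
  type: "ImageMergeOp"   # hypothetical OP class, for illustration only
  dependencies {
    name: "image_reader_op"
    mode: "RO"
  }
  dependencies {
    name: "image_classify_op"
    mode: "RO"
  }
}
```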
## 3. resource.prototxt
The entry point of the serving-side resource configuration is resource.prototxt, which configures model information. Its protobuf format is the `ResourceConf` type in `configure/proto/server_configure.proto`. The on-disk file path can be specified with --resource_path and --resource_file. A sample:
```protobuf
model_manager_path: "./conf"
model_manager_file: "model_toolkit.prototxt"
```
It is mainly used to specify the path and file name of model_toolkit.prototxt.
## 4. model_toolkit.prototxt
model_toolkit.prototxt configures the model information and the inference engines to use. Its protobuf format is the `ModelToolkitConf` type in `configure/proto/server_configure.proto`. The on-disk path of model_toolkit.prototxt cannot be overridden via command-line arguments. A sample:
```protobuf
engines {
name: "image_classification_resnet"
type: "FLUID_CPU_NATIVE_DIR"
reloadable_meta: "./data/model/paddle/fluid_time_file"
reloadable_type: "timestamp_ne"
model_data_path: "./data/model/paddle/fluid/SE_ResNeXt50_32x4d"
runtime_thread_num: 0
batch_infer_size: 0
enable_batch_align: 0
}
```
Where:
name: the model name. InferManager uses this name to find the model and inference engine to use. See the arguments of the InferManager::instance().infer() calls in serving/op/classify_op.h and serving/op/classify_op.cpp, and the sketch at the end of this section.
type: the type of inference engine. The list of currently registered inference engines can be found in inferencer-fluid-cpu/src/fluid_cpu_engine.cpp.
|Inference engine|Meaning|
|--------|----|
|FLUID_CPU_ANALYSIS|Uses the fluid Analysis API; all model parameters are saved in a single file|
|FLUID_CPU_ANALYSIS_DIR|Uses the fluid Analysis API; model parameters are saved as separate files, with the whole model placed in one directory|
|FLUID_CPU_NATIVE|Uses the fluid Native API; all model parameters are saved in a single file|
|FLUID_CPU_NATIVE_DIR|Uses the fluid Native API; model parameters are saved as separate files, with the whole model placed in one directory|
**Difference between the fluid Analysis API and the fluid Native API**
While loading a model, the Analysis API applies a number of optimizations to the model's computation logic, including but not limited to zero-copy tensors and fusing adjacent OPs.
reloadable_meta: the content of this file is currently meaningless; its mtime is used to decide whether the reload time threshold has been exceeded.
reloadable_type: the condition checked to decide whether to reload, one of timestamp_ne/timestamp_gt/md5sum/revision/none (an example of triggering a reload follows the table below).
|reloadable_type|Meaning|
|---------------|----|
|timestamp_ne|The mtime of the file specified by reloadable_meta has changed|
|timestamp_gt|The mtime of the file specified by reloadable_meta is greater than or equal to the mtime recorded at the last check|
|md5sum|Currently unused; if configured, the model is never reloaded|
|revision|Currently unused; if configured, the model is never reloaded|
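For example, with the timestamp_ne setting from the sample above, a model update can be pushed to a running serving instance by replacing the model files and then touching the reloadable_meta file so that its mtime changes. A sketch, using the paths from the sample configuration:
```shell
# After placing the new model files under model_data_path, bump the mtime
# of the reloadable_meta file; the reload thread (which wakes up every
# reload_interval_s seconds) detects the change and reloads the model.
touch ./data/model/paddle/fluid_time_file
```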
model_data_path: path to the model files.
runtime_thread_num: if greater than 0, the bsf multi-threaded scheduling framework is enabled, and multi-threaded prediction is started inside each prediction bthread worker.
batch_infer_size: the batch size of each prediction thread when bsf multi-threaded prediction is enabled.
enable_batch_align:
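At runtime, the engine `name` is the handle an OP uses to ask the inference layer for a prediction. Below is a rough sketch of such a call, modeled on the InferManager::instance().infer() usage referenced above from serving/op/classify_op.cpp; the exact argument list is an assumption and may differ from the real API.
```cpp
// Sketch only: run the engine configured as "image_classification_resnet"
// in model_toolkit.prototxt. `in` and `out` stand for the OP's input and
// output buffers; check classify_op.cpp for the real signature.
if (InferManager::instance().infer(
        "image_classification_resnet", in, out, batch_size) != 0) {
  LOG(ERROR) << "Failed to run inference on image_classification_resnet";
  return -1;
}
```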
## 5. Command-line configuration parameters
The following are the gflags options supported on the serving side, together with their default values.
| name | Default | Meaning |
|------|--------|------|
|workflow_path|./conf|Directory of the workflow configuration|
|workflow_file|workflow.prototxt|File name of the workflow configuration|
|inferservice_path|./conf|Directory of the service configuration|
|inferservice_file|service.prototxt|File name of the service configuration|
|resource_path|./conf|Directory of the resource manager configuration|
|resource_file|resource.prototxt|File name of the resource manager configuration|
|reload_interval_s|10|Interval of the reload thread, in seconds|
|enable_model_toolkit|true|Enable model management|
|enable_protocol_list|baidu_std|List of brpc communication protocols|
|log_dir|./log|Log directory|
|num_threads|number of CPU cores|Number of system threads used by the brpc server|
|max_concurrency| |Maximum number of requests handled concurrently by the brpc server; a value <= 0 means no limit, a value > 0 caps the number of in-flight requests|
Default values can be overridden in serving/conf/gflags.conf, for example:
```
--log_dir=./serving_log/
```
This directs logs to the ./serving_log/ directory.
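The brpc-related flags from the table above can be overridden in the same way; for example (the values here are arbitrary and only meant to show the syntax):
```
--num_threads=8
--max_concurrency=64
```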
### 5.1 gflags.conf
Command-line configuration parameters can be written into a configuration file, whose default path is `conf/gflags.conf`. If `conf/gflags.conf` exists, the serving side will try to parse the gflags commands in it. For example:
```shell
--enable_model_toolkit
--port=8011
```
A different command-line parameter configuration file can be specified with the following command:
```shell
bin/serving --g=true --flagfile=conf/gflags.conf.new
```
@@ -486,154 +486,6 @@ class FluidInferEngine : public DBReloadableInferEngine<FluidFamilyCore> {
  }
};
template <typename TensorrtFamilyCore>
class TensorrtInferEngine : public DBReloadableInferEngine<TensorrtFamilyCore> {
 public:
  TensorrtInferEngine() {}
  ~TensorrtInferEngine() {}
  int infer_impl1(const void* in, void* out, uint32_t batch_size) {
    TensorrtFamilyCore* core =
        DBReloadableInferEngine<TensorrtFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get fluid core in infer_impl()";
      return -1;
    }
    if (!core->Run(in, out, batch_size)) {
      LOG(ERROR) << "Failed run fluid family core";
      return -1;
    }
    return 0;
  }
  int infer_impl2(const BatchTensor& in, BatchTensor& out) {  // NOLINT
    LOG(ERROR) << "Tensortrt donot supports infer_impl2 yet!";
    return -1;
  }
};
template <typename AbacusFamilyCore>
class AbacusInferEngine
    : public CloneDBReloadableInferEngine<AbacusFamilyCore> {
 public:
  AbacusInferEngine() {}
  ~AbacusInferEngine() {}
  int infer_impl1(const void* in, void* out, uint32_t batch_size = -1) {
    LOG(ERROR) << "Abacus dnn engine must use predict interface";
    return -1;
  }
  int infer_impl2(const BatchTensor& in, BatchTensor& out) {  // NOLINT
    LOG(ERROR) << "Abacus dnn engine must use predict interface";
    return -1;
  }
  // Abacus special interface
  int predict(uint32_t ins_num) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in predict()";
      return -1;
    }
    return core->predict(ins_num);
  }
  int set_use_fpga(bool use_fpga) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in predict()";
      return -1;
    }
    return core->set_use_fpga(use_fpga);
  }
  int debug() {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in debug()";
      return -1;
    }
    return core->debug();
  }
  int set_search_id(uint64_t sid) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in set_serach_id()";
      return -1;
    }
    return core->set_search_id(sid);
  }
  int set_hidden_layer_dim(uint32_t dim) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in set_layer_dim()";
      return -1;
    }
    return core->set_hidden_layer_dim(dim);
  }
  int get_input(uint32_t ins_idx, uint32_t* fea_num, void* in) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in get_input()";
      return -1;
    }
    return core->get_input(ins_idx, fea_num, in);
  }
  int get_layer_value(const std::string& name,
                      uint32_t ins_num,
                      uint32_t fea_dim,
                      void* out) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in get_layer_value()";
      return -1;
    }
    return core->get_layer_value(name, ins_num, fea_dim, out);
  }
  void set_position_idx(void* input, uint64_t fea, uint32_t ins_idx) {
    AbacusFamilyCore* core =
        CloneDBReloadableInferEngine<AbacusFamilyCore>::get_core();
    if (!core || !core->get()) {
      LOG(ERROR) << "Failed get abacus core in set_position_idx()";
      return;
    }
    core->set_position_idx(input, fea, ins_idx);
    return;
  }
};
template <typename PaddleV2FamilyCore>
class PaddleV2InferEngine
    : public CloneDBReloadableInferEngine<PaddleV2FamilyCore> {
 public:
  PaddleV2InferEngine() {}
  ~PaddleV2InferEngine() {}
  int infer_impl1(const void* in, void* out, uint32_t batch_size = -1) {
    LOG(ERROR) << "Paddle V2 engine must use predict interface";
    return -1;
  }
  int infer_impl2(const BatchTensor& in, BatchTensor& out) {  // NOLINT
    LOG(ERROR) << "Paddle V2 engine must use predict interface";
    return -1;
  }
};
typedef FactoryPool<InferEngine> StaticInferFactory;
class VersionedInferEngine : public InferEngine {
......
@@ -53,9 +53,6 @@ int EndpointConfigManager::load() {
  }
  uint32_t ep_size = sdk_conf.predictors_size();
#if 1
LOG(INFO) << "ep_size: " << ep_size;
#endif
  for (uint32_t ei = 0; ei < ep_size; ++ei) {
    EndpointInfo ep;
    if (init_one_endpoint(sdk_conf.predictors(ei), ep, default_var) != 0) {
@@ -88,9 +85,6 @@ int EndpointConfigManager::load() {
int EndpointConfigManager::init_one_endpoint(const configure::Predictor& conf,
                                             EndpointInfo& ep,
                                             const VariantInfo& dft_var) {
#if 1
LOG(INFO) << "init_one_endpoint " << conf.name().c_str();
#endif
  try {
    // name
    ep.endpoint_name = conf.name();
@@ -120,9 +114,6 @@ int EndpointConfigManager::init_one_endpoint(const configure::Predictor& conf,
    // varlist
    uint32_t var_size = conf.variants_size();
#if 1
LOG(INFO) << "Variant size: " << var_size;
#endif
    for (uint32_t vi = 0; vi < var_size; ++vi) {
      VariantInfo var;
      if (merge_variant(dft_var, conf.variants(vi), var) != 0) {
@@ -180,9 +171,6 @@ int EndpointConfigManager::init_one_variant(const configure::VariantConf& conf,
  const configure::RpcParameter& params = conf.rpc_parameter();
  PARSE_CONF_ITEM(params, var.parameters.protocol, protocol, -1);
#if 1
LOG(WARNING) << var.parameters.protocol.value.c_str();
#endif
  PARSE_CONF_ITEM(params, var.parameters.compress_type, compress_type, -1);
  PARSE_CONF_ITEM(params, var.parameters.package_size, package_size, -1);
  PARSE_CONF_ITEM(
@@ -213,9 +201,6 @@ int EndpointConfigManager::merge_variant(const VariantInfo& default_var,
                                         VariantInfo& merged_var) {
  merged_var = default_var;
#if 1
LOG(INFO) << "merge_variant " << conf.tag().c_str();
#endif
  return init_one_variant(conf, merged_var);
}
......