Unverified · Commit 74487d0c · Authored by TeslaZhao, committed by GitHub

Merge pull request #1546 from HexToString/update_doc_2_model

Update doc 2 model
......@@ -79,6 +79,7 @@ The first step is to call the model save interface to generate a model parameter
- [Encryption](doc/C++_Serving/Encryption_EN.md)
- [Analyze and optimize performance(Chinese)](doc/C++_Serving/Performance_Tuning_CN.md)
- [Benchmark(Chinese)](doc/C++_Serving/Benchmark_CN.md)
- [Multiple models in series(Chinese)](doc/C++_Serving/2+_model.md)
- [Python Pipeline](doc/Python_Pipeline/Pipeline_Design_EN.md)
- [Analyze and optimize performance](doc/Python_Pipeline/Performance_Tuning_EN.md)
- [Benchmark(Chinese)](doc/Python_Pipeline/Benchmark_CN.md)
......
......@@ -74,6 +74,7 @@ Paddle Serving, backed by the deep learning framework PaddlePaddle, aims to help deep learning developers
- [Encrypted model inference service](doc/C++_Serving/Encryption_CN.md)
- [Performance tuning guide](doc/C++_Serving/Performance_Tuning_CN.md)
- [Benchmarks](doc/C++_Serving/Benchmark_CN.md)
- [Multiple models in series](doc/C++_Serving/2+_model.md)
- [Python Pipeline design](doc/Python_Pipeline/Pipeline_Design_CN.md)
- [Performance tuning guide](doc/Python_Pipeline/Performance_Tuning_CN.md)
- [Benchmarks](doc/Python_Pipeline/Benchmark_CN.md)
......
......@@ -10,115 +10,151 @@
- Disadvantage: requires code changes and a rebuild.
This document describes the second, more efficient method. Its basic steps are:
1. Define a custom OP (i.e., the pre-processing, model inference, and post-processing for a single model)
2. Compile
3. Launch and call the service
# 1. Define a custom OP
An OP defines the pre-processing, model inference, and post-processing for a single model. Defining an OP takes two steps:
1. Define a C++ .h header file
2. Define a C++ .cpp source file

## 1.1 Define the C++ .h header file
Copy the code below and replace `/*CustomClassName*/` with a class name of your own, e.g. `GeneralDetectionOp`.
Place the file under `core/general-server/op/`; the file name is up to you, e.g. `general_detection_op.h`.
``` C++
#pragma once
#include <string>
#include <vector>
#include "core/general-server/general_model_service.pb.h"
#include "core/general-server/op/general_infer_helper.h"
#include "paddle_inference_api.h" // NOLINT
namespace baidu {
namespace paddle_serving {
namespace serving {
class /*CustomClassName*/
    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
 public:
  typedef std::vector<paddle::PaddleTensor> TensorVector;

  DECLARE_OP(/*CustomClassName*/);

  int inference();
};
} // namespace serving
} // namespace paddle_serving
} // namespace baidu
```
## 1.2 Define the C++ .cpp source file
Copy the code below and replace `/*CustomClassName*/` with your own class name, e.g. `GeneralDetectionOp`.
Add your pre-processing and post-processing code at the locations marked by comments in the code below.
Place the file under `core/general-server/op/`; the file name is up to you, e.g. `general_detection_op.cpp`.
``` C++
#include "core/general-server/op/自定义的头文件名"
#include <algorithm>
#include <iostream>
#include <memory>
#include <sstream>
#include "core/predictor/framework/infer.h"
#include "core/predictor/framework/memory.h"
#include "core/predictor/framework/resource.h"
#include "core/util/include/timer.h"
namespace baidu {
namespace paddle_serving {
namespace serving {
using baidu::paddle_serving::Timer;
using baidu::paddle_serving::predictor::MempoolWrapper;
using baidu::paddle_serving::predictor::general_model::Tensor;
using baidu::paddle_serving::predictor::general_model::Response;
using baidu::paddle_serving::predictor::general_model::Request;
using baidu::paddle_serving::predictor::InferManager;
using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
int /*CustomClassName*/::inference() {
  // Get the predecessor OP node.
  const std::vector<std::string> pre_node_names = pre_names();
  if (pre_node_names.size() != 1) {
    LOG(ERROR) << "This op(" << op_name()
               << ") can only have one predecessor op, but received "
               << pre_node_names.size();
    return -1;
  }
  const std::string pre_name = pre_node_names[0];

  // Use the predecessor OP's output as this OP's input.
  GeneralBlob *input_blob = mutable_depend_argument<GeneralBlob>(pre_name);
  if (!input_blob) {
    LOG(ERROR) << "input_blob is nullptr, error";
    return -1;
  }
  TensorVector *in = &input_blob->tensor_vector;
  uint64_t log_id = input_blob->GetLogId();
  int batch_size = input_blob->_batch_size;
  // Initialize this OP's output.
  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
  output_blob->SetLogId(log_id);
  output_blob->_batch_size = batch_size;
  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
  TensorVector *out = &output_blob->tensor_vector;

  /* Add your pre-processing code here; pre-processing directly modifies the
     data in the memory pointed to by TensorVector *in, obtained above. */

  Timer timeline;
  int64_t start = timeline.TimeStampUS();
  timeline.Start();

  // Pass the pre-processed `in` and the initialized `out` to run inference;
  // the model output is written directly into the memory pointed to by `out`.
  // To define an OP that only processes data and calls no model, simply
  // delete the block below.
  if (InferManager::instance().infer(
          engine_name().c_str(), in, out, batch_size)) {
    LOG(ERROR) << "(logid=" << log_id
               << ") Failed do infer in fluid model: " << engine_name().c_str();
    return -1;
  }

  /* Add your post-processing code here; post-processing directly modifies the
     data in the memory pointed to by TensorVector *out, declared above. The
     downstream OP receives this OP's output. */

  int64_t end = timeline.TimeStampUS();
  CopyBlobInfo(input_blob, output_blob);
  AddBlobInfo(output_blob, start);
  AddBlobInfo(output_blob, end);
  return 0;
}
DEFINE_OP(/*CustomClassName*/);

}  // namespace serving
}  // namespace paddle_serving
}  // namespace baidu
```
# 2. Compile
Next, rebuild Serving, point the framework at your freshly built serving binary via the `SERVING_BIN` environment variable, and install the matching Python packages with `pip3 install`. For details, see [How to compile Serving](../Compile_CN.md).
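A minimal sketch of that flow, assuming an in-source build directory named `build` (the paths are illustrative; the compile doc above is authoritative):
```shell
# after rebuilding in ./build, point Serving at the new binary
export SERVING_BIN=$(pwd)/build/core/general-server/serving
# reinstall the freshly built python wheel (path is illustrative)
pip3 install build/python/dist/paddle_serving_server*.whl
```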
# 3. Launch and call the service
## 3.1 Launching the server
With the previous two sections done, one service can run two models in series: pass the model folders' relative paths, in order, after `--model`, and the custom C++ OP class names, in order, after `--op`. The models after `--model` must correspond one-to-one, in order, with the class names after `--op`:
```shell
# one service running multiple models in series
python3 -m paddle_serving_server.serve --model ocr_det_model ocr_rec_model --op GeneralDetectionOp GeneralInferOp --port 9292
# models in series: ocr_det_model is handled by GeneralDetectionOp, ocr_rec_model by GeneralInferOp
```
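Relatedly, the `--op` parsing in `serve.py` (shown later in this diff) also accepts a `:0` suffix on an OP name, meaning "add this node to the DAG but bind no model engine to it". A hypothetical data-only OP could then be chained like this (a sketch inferred from that parsing code, not verified against a release; `MyPreprocessOp` is an invented class name):
```shell
# hypothetical: MyPreprocessOp only transforms data (':0' = no model engine),
# so the single model binds to GeneralInferOp
python3 -m paddle_serving_server.serve --model ocr_rec_model --op MyPreprocessOp:0 GeneralInferOp --port 9292
```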
## 3.2 Calling from the client
The client call must now pass the paths of both models' client-side proto files (or their folders). Taking OCR as an example, using [ocr_cpp_client.py](../../examples/C++/PaddleOCR/ocr/ocr_cpp_client.py):
```shell
# calling one service that runs multiple models in series
python3 ocr_cpp_client.py ocr_det_client ocr_rec_client
# ocr_det_client: relative path of the first model's client-side proto folder
# ocr_rec_client: relative path of the second model's client-side proto folder
```
On the server side, the input data format must match the first model's client-side proto, and the output data format follows the last model's client-side proto. If you are unfamiliar with [proto definitions, see here](./Serving_Configure_CN.md).
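For orientation, the client script boils down to roughly the following sketch. It assumes `Client.load_client_config` accepts a list of config paths (which the two-argument invocation above relies on); the feed/fetch names are placeholders to be read from each model's `serving_client_conf.prototxt`, and image pre-processing is omitted:
```python
import cv2
from paddle_serving_client import Client

client = Client()
# both client-side protos, in the same order the models were started in
client.load_client_config([
    "ocr_det_client/serving_client_conf.prototxt",
    "ocr_rec_client/serving_client_conf.prototxt",
])
client.connect(["127.0.0.1:9292"])

img = cv2.imread("test_img.jpg")  # pre-processing (resize/normalize) omitted
# feed keys follow the first model's proto; fetch names the last model's proto
fetch_map = client.predict(feed={"image": img},
                           fetch=["ctc_greedy_decoder_0.tmp_0"])  # placeholder name
print(fetch_map)
```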
......@@ -30,9 +30,9 @@ from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker
op_maker = serving.OpMaker()
-read_op = op_maker.create('general_reader')
-general_infer_op = op_maker.create('general_infer')
-general_response_op = op_maker.create('general_response')
+read_op = op_maker.create('GeneralReaderOp')
+general_infer_op = op_maker.create('GeneralInferOp')
+general_response_op = op_maker.create('GeneralResponseOp')
op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
......@@ -65,13 +65,13 @@ from paddle_serving_server import OpGraphMaker
from paddle_serving_server import Server
op_maker = OpMaker()
-read_op = op_maker.create('general_reader')
+read_op = op_maker.create('GeneralReaderOp')
cnn_infer_op = op_maker.create(
-    'general_infer', engine_name='cnn', inputs=[read_op])
+    'GeneralInferOp', engine_name='cnn', inputs=[read_op])
bow_infer_op = op_maker.create(
-    'general_infer', engine_name='bow', inputs=[read_op])
+    'GeneralInferOp', engine_name='bow', inputs=[read_op])
response_op = op_maker.create(
-    'general_response', inputs=[cnn_infer_op, bow_infer_op])
+    'GeneralResponseOp', inputs=[cnn_infer_op, bow_infer_op])
op_graph_maker = OpGraphMaker()
op_graph_maker.add_op(read_op)
......@@ -92,10 +92,10 @@ from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker
op_maker = serving.OpMaker()
-read_op = op_maker.create('general_reader')
-dist_kv_op = op_maker.create('general_dist_kv')
-general_infer_op = op_maker.create('general_infer')
-general_response_op = op_maker.create('general_response')
+read_op = op_maker.create('GeneralReaderOp')
+dist_kv_op = op_maker.create('GeneralDistKVInferOp')
+general_infer_op = op_maker.create('GeneralInferOp')
+general_response_op = op_maker.create('GeneralResponseOp')
op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
......
......@@ -29,9 +29,9 @@ from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker
op_maker = serving.OpMaker()
-read_op = op_maker.create('general_reader')
-general_infer_op = op_maker.create('general_infer')
-general_response_op = op_maker.create('general_response')
+read_op = op_maker.create('GeneralReaderOp')
+general_infer_op = op_maker.create('GeneralInferOp')
+general_response_op = op_maker.create('GeneralResponseOp')
op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
......@@ -63,13 +63,13 @@ from paddle_serving_server import OpGraphMaker
from paddle_serving_server import Server
op_maker = OpMaker()
-read_op = op_maker.create('general_reader')
+read_op = op_maker.create('GeneralReaderOp')
cnn_infer_op = op_maker.create(
-    'general_infer', engine_name='cnn', inputs=[read_op])
+    'GeneralInferOp', engine_name='cnn', inputs=[read_op])
bow_infer_op = op_maker.create(
-    'general_infer', engine_name='bow', inputs=[read_op])
+    'GeneralInferOp', engine_name='bow', inputs=[read_op])
response_op = op_maker.create(
-    'general_response', inputs=[cnn_infer_op, bow_infer_op])
+    'GeneralResponseOp', inputs=[cnn_infer_op, bow_infer_op])
op_graph_maker = OpGraphMaker()
op_graph_maker.add_op(read_op)
......@@ -90,10 +90,10 @@ from paddle_serving_server import OpMaker
from paddle_serving_server import OpSeqMaker
op_maker = serving.OpMaker()
-read_op = op_maker.create('general_reader')
-dist_kv_op = op_maker.create('general_dist_kv')
-general_infer_op = op_maker.create('general_infer')
-general_response_op = op_maker.create('general_response')
+read_op = op_maker.create('GeneralReaderOp')
+dist_kv_op = op_maker.create('GeneralDistKVInferOp')
+general_infer_op = op_maker.create('GeneralInferOp')
+general_response_op = op_maker.create('GeneralResponseOp')
op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
......
......@@ -20,9 +20,9 @@ from paddle_serving_server import OpSeqMaker
from paddle_serving_server import Server
op_maker = OpMaker()
-read_op = op_maker.create('general_reader')
-general_dist_kv_infer_op = op_maker.create('general_dist_kv_infer')
-response_op = op_maker.create('general_response')
+read_op = op_maker.create('GeneralReaderOp')
+general_dist_kv_infer_op = op_maker.create('GeneralDistKVInferOp')
+response_op = op_maker.create('GeneralResponseOp')
op_seq_maker = OpSeqMaker()
op_seq_maker.add_op(read_op)
......
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .proto import server_configure_pb2 as server_sdk
import google.protobuf.text_format
import collections
class OpMaker(object):
    def __init__(self):
-       self.op_dict = {
-           "general_infer": "GeneralInferOp",
-           "general_reader": "GeneralReaderOp",
-           "general_response": "GeneralResponseOp",
-           "general_text_reader": "GeneralTextReaderOp",
-           "general_text_response": "GeneralTextResponseOp",
-           "general_single_kv": "GeneralSingleKVOp",
-           "general_dist_kv_infer": "GeneralDistKVInferOp",
-           "general_dist_kv": "GeneralDistKVOp",
-           "general_copy": "GeneralCopyOp",
-           "general_detection": "GeneralDetectionOp",
-       }
+       self.op_list = [
+           "GeneralInferOp",
+           "GeneralReaderOp",
+           "GeneralResponseOp",
+           "GeneralTextReaderOp",
+           "GeneralTextResponseOp",
+           "GeneralSingleKVOp",
+           "GeneralDistKVInferOp",
+           "GeneralDistKVOp",
+           "GeneralCopyOp",
+           "GeneralDetectionOp",
+       ]
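        # With op_list (replacing the old op_dict), create() takes the C++ OP
        # class name directly (e.g. 'GeneralReaderOp'); serve.py can append
        # custom class names passed via --op at runtime.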
        self.node_name_suffix_ = collections.defaultdict(int)

    def create(self, node_type, engine_name=None, inputs=[], outputs=[]):
-       if node_type not in self.op_dict:
+       if node_type not in self.op_list:
            raise Exception("Op type {} is not supported right now".format(
                node_type))
        node = server_sdk.DAGNode()
......@@ -32,7 +46,7 @@ class OpMaker(object):
                                   self.node_name_suffix_[node_type])
        self.node_name_suffix_[node_type] += 1
-       node.type = self.op_dict[node_type]
+       node.type = node_type
        if inputs:
            for dep_node_str in inputs:
                dep_node = server_sdk.DAGNode()
......@@ -47,6 +61,7 @@ class OpMaker(object):
        # overall efficiency.
        return google.protobuf.text_format.MessageToString(node)
class OpSeqMaker(object):
    def __init__(self):
        self.workflow = server_sdk.Workflow()
......@@ -79,6 +94,7 @@ class OpSeqMaker(object):
        workflow_conf.workflows.extend([self.workflow])
        return workflow_conf
# TODO: Currently the SDK only supports "Sequence"; OpGraphMaker is not useful.
# Config should be changed to adapt command-line for list[dict] or list[list[]]
class OpGraphMaker(object):
......
......@@ -142,6 +142,8 @@ def serve_args():
help="Max batch of each op")
parser.add_argument(
"--model", type=str, default="", nargs="+", help="Model for serving")
+parser.add_argument(
+    "--op", type=str, default="", nargs="+", help="Ops for serving")
parser.add_argument(
"--workdir",
type=str,
......@@ -183,7 +185,10 @@ def serve_args():
parser.add_argument(
"--use_xpu", default=False, action="store_true", help="Use XPU")
parser.add_argument(
"--use_ascend_cl", default=False, action="store_true", help="Use Ascend CL")
"--use_ascend_cl",
default=False,
action="store_true",
help="Use Ascend CL")
parser.add_argument(
"--product_name",
type=str,
......@@ -208,6 +213,11 @@ def start_gpu_card_model(gpu_mode, port, args): # pylint: disable=doc-string-mi
if gpu_mode == True:
    device = "gpu"

+import paddle_serving_server as serving
+op_maker = serving.OpMaker()
+op_seq_maker = serving.OpSeqMaker()
+server = serving.Server()

thread_num = args.thread
model = args.model
mem_optim = args.mem_optim_off is False
......@@ -215,38 +225,58 @@ def start_gpu_card_model(gpu_mode, port, args): # pylint: disable=doc-string-mi
use_mkl = args.use_mkl
max_body_size = args.max_body_size
workdir = "{}_{}".format(args.workdir, port)
+dag_list_op = []

if model == "":
    print("You must specify your serving model")
    exit(-1)
for single_model_config in args.model:
    if os.path.isdir(single_model_config):
        pass
    elif os.path.isfile(single_model_config):
        raise ValueError("The input of --model should be a dir not file.")

-import paddle_serving_server as serving
-op_maker = serving.OpMaker()
-op_seq_maker = serving.OpSeqMaker()
-read_op = op_maker.create('general_reader')

+# If custom OPs were passed via --op (e.g. --op GeneralDetectionOp GeneralRecOp),
+# add any unregistered custom OP class names to the op list and the model
+# engine list, and record the order in which they were passed in dag_list_op.
+if args.op != "":
+    for single_op in args.op:
+        temp_str_list = single_op.split(':')
+        if len(temp_str_list) >= 1 and temp_str_list[0] != '':
+            if temp_str_list[0] not in op_maker.op_list:
+                op_maker.op_list.append(temp_str_list[0])
+            if len(temp_str_list) >= 2 and temp_str_list[1] == '0':
+                pass
+            else:
+                server.default_engine_types.append(temp_str_list[0])
+            dag_list_op.append(temp_str_list[0])
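# Reading the branch above: a plain class name (e.g. GeneralDetectionOp) is
# also appended to default_engine_types, so a model engine gets bound to it; a
# name with a ':0' suffix (say, a hypothetical MyDataOp:0) joins the DAG
# without any model engine bound to it.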
+read_op = op_maker.create('GeneralReaderOp')
op_seq_maker.add_op(read_op)
-for idx, single_model in enumerate(model):
-    infer_op_name = "general_infer"
-    # The OCR Det model node currently depends on the third-party OpenCV
-    # library, which is linked in (and GeneralDetectionOp compiled) only when
-    # OCR is used, hence this special case: when the condition below is not
-    # met, the added op defaults to general_infer. Generating the config from
-    # a Python script may be reconsidered later.
-    if len(model) == 2 and idx == 0 and single_model == "ocr_det_model":
-        infer_op_name = "general_detection"
-    else:
-        infer_op_name = "general_infer"
-    general_infer_op = op_maker.create(infer_op_name)
-    op_seq_maker.add_op(general_infer_op)

+# If dag_list_op is non-empty, custom OPs (or a custom DAG chain) were passed
+# via --op; build the DAG chain in the order they were passed.
+if len(dag_list_op) > 0:
+    for single_op in dag_list_op:
+        op_seq_maker.add_op(op_maker.create(single_op))
+# Otherwise, chain the OPs according to --model, as before.
+else:
+    for idx, single_model in enumerate(model):
+        infer_op_name = "GeneralInferOp"
+        # The OCR Det model node currently depends on the third-party OpenCV
+        # library, which is linked in (and GeneralDetectionOp compiled) only
+        # when OCR is used, hence this special case: when the condition below
+        # is not met, the added op defaults to GeneralInferOp. Generating the
+        # config from a Python script may be reconsidered later.
+        if len(model) == 2 and idx == 0 and single_model == "ocr_det_model":
+            infer_op_name = "GeneralDetectionOp"
+        else:
+            infer_op_name = "GeneralInferOp"
+        general_infer_op = op_maker.create(infer_op_name)
+        op_seq_maker.add_op(general_infer_op)

-general_response_op = op_maker.create('general_response')
+general_response_op = op_maker.create('GeneralResponseOp')
op_seq_maker.add_op(general_response_op)
-server = serving.Server()

server.set_op_sequence(op_seq_maker.get_op_sequence())
server.set_num_threads(thread_num)
server.use_mkl(use_mkl)
......
......@@ -49,8 +49,8 @@ class Server(object):
self.workflow_fn:'str'="workflow.prototxt" # Only one for one Service/Workflow
self.resource_fn:'str'="resource.prototxt" # Only one for one Service,model_toolkit_fn and general_model_config_fn is recorded in this file
self.infer_service_fn:'str'="infer_service.prototxt" # Only one for one Service,Service--Workflow
self.model_toolkit_fn:'list'=[] # ["general_infer_0/model_toolkit.prototxt"]The quantity is equal to the InferOp quantity,Engine--OP
self.general_model_config_fn:'list'=[] # ["general_infer_0/general_model.prototxt"]The quantity is equal to the InferOp quantity,Feed and Fetch --OP
self.model_toolkit_fn:'list'=[] # ["GeneralInferOp_0/model_toolkit.prototxt"]The quantity is equal to the InferOp quantity,Engine--OP
self.general_model_config_fn:'list'=[] # ["GeneralInferOp_0/general_model.prototxt"]The quantity is equal to the InferOp quantity,Feed and Fetch --OP
self.subdirectory:'list'=[] # The quantity is equal to the InferOp quantity, and name = node.name = engine.name
self.model_config_paths:'collections.OrderedDict()' # Save the serving_server_conf.prototxt path (feed and fetch information) this is a map for multi-model in a workflow
"""
......@@ -92,6 +92,12 @@ class Server(object):
self.model_config_paths = collections.OrderedDict()
self.product_name = None
self.container_id = None
+self.default_engine_types = [
+    'GeneralInferOp',
+    'GeneralDistKVInferOp',
+    'GeneralDistKVQuantInferOp',
+    'GeneralDetectionOp',
+]
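# OP class names in this list get a model engine bound to them (in --model
# order) when the workflow config is parsed; serve.py appends any names passed
# via --op, so custom OPs can load models without being listed here.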
def get_fetch_list(self, infer_node_idx=-1):
fetch_names = [
......@@ -298,7 +304,7 @@ class Server(object):
fout.write(str(list(self.model_conf.values())[idx]))
for workflow in self.workflow_conf.workflows:
for node in workflow.nodes:
if "dist_kv" in node.name:
if "distkv" in node.name.lower():
self.resource_conf.cube_config_path = workdir
self.resource_conf.cube_config_file = self.cube_config_fn
if cube_conf == None:
......@@ -306,7 +312,7 @@ class Server(object):
"Please set the path of cube.conf while use dist_kv op."
)
shutil.copy(cube_conf, workdir)
if "quant" in node.name:
if "quant" in node.name.lower():
self.resource_conf.cube_quant_bits = 8
self.resource_conf.model_toolkit_path.extend([workdir])
self.resource_conf.model_toolkit_file.extend(
......@@ -343,17 +349,12 @@ class Server(object):
# If there is only one model path, use the default infer_op.
# Because there are several infer_op type, we need to find
# it from workflow_conf.
-default_engine_types = [
-    'GeneralInferOp',
-    'GeneralDistKVInferOp',
-    'GeneralDistKVQuantInferOp',
-    'GeneralDetectionOp',
-]
# now only support single-workflow.
# TODO:support multi-workflow
model_config_paths_list_idx = 0
for node in self.workflow_conf.workflows[0].nodes:
-if node.type in default_engine_types:
+if node.type in self.default_engine_types:
if node.name is None:
raise Exception(
"You have set the engine_name of Op. Please use the form {op: model_path} to configure model path"
......
......@@ -157,19 +157,19 @@ class WebService(object):
op_maker = OpMaker()
op_seq_maker = OpSeqMaker()
-read_op = op_maker.create('general_reader')
+read_op = op_maker.create('GeneralReaderOp')
op_seq_maker.add_op(read_op)
for idx, single_model in enumerate(self.server_config_dir_paths):
infer_op_name = "general_infer"
infer_op_name = "GeneralInferOp"
if len(self.server_config_dir_paths) == 2 and idx == 0:
infer_op_name = "general_detection"
infer_op_name = "GeneralDetectionOp"
else:
infer_op_name = "general_infer"
infer_op_name = "GeneralInferOp"
general_infer_op = op_maker.create(infer_op_name)
op_seq_maker.add_op(general_infer_op)
-general_response_op = op_maker.create('general_response')
+general_response_op = op_maker.create('GeneralResponseOp')
op_seq_maker.add_op(general_response_op)
server.set_op_sequence(op_seq_maker.get_op_sequence())
......