Commit 7a3a98f2 authored by MRXLT

fix conflict

@@ -70,11 +70,13 @@ from paddle_serving.serving_server import Server
op_maker = OpMaker()
read_op = op_maker.create('general_reader')
general_infer_op = op_maker.create('general_infer')
infer_op = op_maker.create('general_infer')
response_op = op_maker.create('general_response')
op_seq_maker = OpSeqMaker()
op_seq_maker.add_op(read_op)
op_seq_maker.add_op(general_infer_op)
op_seq_maker.add_op(infer_op)
op_seq_maker.add_op(response_op)
server = Server()
server.set_op_sequence(op_seq_maker.get_op_sequence())
......
@@ -26,5 +26,5 @@ endif()
if (NOT CLIENT_ONLY)
add_subdirectory(predictor)
add_subdirectory(general-server)
add_subdirectory(util)
endif()
add_subdirectory(util)
if(CLIENT_ONLY)
add_subdirectory(pybind11)
pybind11_add_module(serving_client src/general_model.cpp src/pybind_general_model.cpp)
target_link_libraries(serving_client PRIVATE -Wl,--whole-archive sdk-cpp pybind python -Wl,--no-whole-archive -lpthread -lcrypto -lm -lrt -lssl -ldl -lz)
target_link_libraries(serving_client PRIVATE -Wl,--whole-archive utils sdk-cpp pybind python -Wl,--no-whole-archive -lpthread -lcrypto -lm -lrt -lssl -ldl -lz)
endif()
@@ -31,6 +31,9 @@
using baidu::paddle_serving::sdk_cpp::Predictor;
using baidu::paddle_serving::sdk_cpp::PredictorApi;
DECLARE_bool(profile_client);
DECLARE_bool(profile_server);
// given some input data, pack into pb, and send request
namespace baidu {
namespace paddle_serving {
@@ -45,6 +48,8 @@ class PredictorClient {
PredictorClient() {}
~PredictorClient() {}
void init_gflags(std::vector<std::string> argv);
int init(const std::string& client_conf);
void set_predictor_conf(const std::string& conf_path,
@@ -87,6 +92,7 @@ class PredictorClient {
std::map<std::string, std::string> _fetch_name_to_var_name;
std::vector<std::vector<int>> _shape;
std::vector<int> _type;
std::vector<int64_t> _last_request_ts;
};
} // namespace general_model
......
@@ -17,23 +17,46 @@
#include "core/sdk-cpp/builtin_format.pb.h"
#include "core/sdk-cpp/include/common.h"
#include "core/sdk-cpp/include/predictor_sdk.h"
#include "core/util/include/timer.h"
DEFINE_bool(profile_client, false, "");
DEFINE_bool(profile_server, false, "");
using baidu::paddle_serving::Timer;
using baidu::paddle_serving::predictor::general_model::Request;
using baidu::paddle_serving::predictor::general_model::Response;
using baidu::paddle_serving::predictor::general_model::Tensor;
using baidu::paddle_serving::predictor::general_model::FeedInst;
using baidu::paddle_serving::predictor::general_model::FetchInst;
std::once_flag gflags_init_flag;
namespace baidu {
namespace paddle_serving {
namespace general_model {
using configure::GeneralModelConfig;
void PredictorClient::init_gflags(std::vector<std::string> argv) {
std::call_once(gflags_init_flag, [&]() {
FLAGS_logtostderr = true;
argv.insert(argv.begin(), "dummy");
int argc = argv.size();
char **arr = new char *[argv.size()];
std::string line;
for (size_t i = 0; i < argv.size(); i++) {
arr[i] = &argv[i][0];
line += argv[i];
line += ' ';
}
google::ParseCommandLineFlags(&argc, &arr, true);
VLOG(2) << "Init commandline: " << line;
});
}
int PredictorClient::init(const std::string &conf_file) {
try {
GeneralModelConfig model_config;
if (configure::read_proto_conf(conf_file.c_str(),
&model_config) != 0) {
if (configure::read_proto_conf(conf_file.c_str(), &model_config) != 0) {
LOG(ERROR) << "Failed to load general model config"
<< ", file path: " << conf_file;
return -1;
@@ -51,26 +74,27 @@ int PredictorClient::init(const std::string &conf_file) {
VLOG(2) << "feed alias name: " << model_config.feed_var(i).alias_name()
<< " index: " << i;
std::vector<int> tmp_feed_shape;
VLOG(2) << "feed" << "[" << i << "] shape:";
VLOG(2) << "feed"
<< "[" << i << "] shape:";
for (int j = 0; j < model_config.feed_var(i).shape_size(); ++j) {
tmp_feed_shape.push_back(model_config.feed_var(i).shape(j));
VLOG(2) << "shape[" << j << "]: "
<< model_config.feed_var(i).shape(j);
VLOG(2) << "shape[" << j << "]: " << model_config.feed_var(i).shape(j);
}
_type.push_back(model_config.feed_var(i).feed_type());
VLOG(2) << "feed" << "[" << i << "] feed type: "
<< model_config.feed_var(i).feed_type();
VLOG(2) << "feed"
<< "[" << i
<< "] feed type: " << model_config.feed_var(i).feed_type();
_shape.push_back(tmp_feed_shape);
}
for (int i = 0; i < fetch_var_num; ++i) {
_fetch_name_to_idx[model_config.fetch_var(i).alias_name()] = i;
VLOG(2) << "fetch [" << i << "]" << " alias name: "
<< model_config.fetch_var(i).alias_name();
VLOG(2) << "fetch [" << i << "]"
<< " alias name: " << model_config.fetch_var(i).alias_name();
_fetch_name_to_var_name[model_config.fetch_var(i).alias_name()] =
model_config.fetch_var(i).name();
}
} catch (std::exception& e) {
} catch (std::exception &e) {
LOG(ERROR) << "Failed load general model config" << e.what();
return -1;
}
@@ -88,7 +112,7 @@ int PredictorClient::destroy_predictor() {
_api.destroy();
}
int PredictorClient::create_predictor_by_desc(const std::string & sdk_desc) {
int PredictorClient::create_predictor_by_desc(const std::string &sdk_desc) {
if (_api.create(sdk_desc) != 0) {
LOG(ERROR) << "Predictor Creation Failed";
return -1;
@@ -117,17 +141,22 @@ std::vector<std::vector<float>> PredictorClient::predict(
return fetch_result;
}
Timer timeline;
int64_t preprocess_start = timeline.TimeStampUS();
// we save infer_us at fetch_result[fetch_name.size()]
fetch_result.resize(fetch_name.size() + 1);
_api.thrd_clear();
_predictor = _api.fetch_predictor("general_model");
VLOG(2) << "fetch general model predictor done.";
VLOG(2) << "float feed name size: " << float_feed_name.size();
VLOG(2) << "int feed name size: " << int_feed_name.size();
VLOG(2) << "fetch name size: " << fetch_name.size();
Request req;
for (auto & name : fetch_name) {
for (auto &name : fetch_name) {
req.add_fetch_var_names(name);
}
std::vector<Tensor *> tensor_vec;
@@ -175,16 +204,28 @@ std::vector<std::vector<float>> PredictorClient::predict(
vec_idx++;
}
VLOG(2) << "feed int feed var done.";
int64_t preprocess_end = timeline.TimeStampUS();
// std::map<std::string, std::vector<float> > result;
int64_t client_infer_start = timeline.TimeStampUS();
Response res;
int64_t client_infer_end = 0;
int64_t postprocess_start = 0;
int64_t postprocess_end = 0;
if (FLAGS_profile_client) {
if (FLAGS_profile_server) {
req.set_profile_server(true);
}
}
res.Clear();
if (_predictor->inference(&req, &res) != 0) {
LOG(ERROR) << "failed call predictor with req: " << req.ShortDebugString();
exit(-1);
} else {
client_infer_end = timeline.TimeStampUS();
postprocess_start = client_infer_end;
for (auto &name : fetch_name) {
int idx = _fetch_name_to_idx[name];
int len = res.insts(0).tensor_array(idx).data_size();
@@ -196,8 +237,29 @@ std::vector<std::vector<float>> PredictorClient::predict(
*(const float *)res.insts(0).tensor_array(idx).data(i).c_str();
}
}
fetch_result[fetch_name.size()].resize(1);
fetch_result[fetch_name.size()][0] = res.mean_infer_us();
postprocess_end = timeline.TimeStampUS();
}
if (FLAGS_profile_client) {
std::ostringstream oss;
oss << "PROFILE\t"
<< "prepro_0:" << preprocess_start << " "
<< "prepro_1:" << preprocess_end << " "
<< "client_infer_0:" << client_infer_start << " "
<< "client_infer_1:" << client_infer_end << " ";
if (FLAGS_profile_server) {
int op_num = res.profile_time_size() / 2;
for (int i = 0; i < op_num; ++i) {
oss << "op" << i << "_0:" << res.profile_time(i * 2) << " ";
oss << "op" << i << "_1:" << res.profile_time(i * 2 + 1) << " ";
}
}
oss << "postpro_0:" << postprocess_start << " ";
oss << "postpro_1:" << postprocess_end;
fprintf(stderr, "%s\n", oss.str().c_str());
}
return fetch_result;
@@ -226,7 +288,7 @@ std::vector<std::vector<std::vector<float>>> PredictorClient::batch_predict(
VLOG(2) << "float feed name size: " << float_feed_name.size();
VLOG(2) << "int feed name size: " << int_feed_name.size();
Request req;
for (auto & name : fetch_name) {
for (auto &name : fetch_name) {
req.add_fetch_var_names(name);
}
//
@@ -262,7 +324,8 @@ std::vector<std::vector<std::vector<float>>> PredictorClient::batch_predict(
vec_idx++;
}
VLOG(2) << "batch [" << bi << "] " << "float feed value prepared";
VLOG(2) << "batch [" << bi << "] "
<< "float feed value prepared";
vec_idx = 0;
for (auto &name : int_feed_name) {
@@ -282,7 +345,8 @@ std::vector<std::vector<std::vector<float>>> PredictorClient::batch_predict(
vec_idx++;
}
VLOG(2) << "batch [" << bi << "] " << "itn feed value prepared";
VLOG(2) << "batch [" << bi << "] "
<< "itn feed value prepared";
}
Response res;
@@ -308,10 +372,6 @@ std::vector<std::vector<std::vector<float>>> PredictorClient::batch_predict(
}
}
}
//last index for infer time
fetch_result_batch[batch_size].resize(1);
fetch_result_batch[batch_size][0].resize(1);
fetch_result_batch[batch_size][0][0] = res.mean_infer_us();
}
return fetch_result_batch;
......
@@ -31,6 +31,10 @@ PYBIND11_MODULE(serving_client, m) {
)pddoc";
py::class_<PredictorClient>(m, "PredictorClient", py::buffer_protocol())
.def(py::init())
.def("init_gflags",
[](PredictorClient &self, std::vector<std::string> argv) {
self.init_gflags(argv);
})
.def("init",
[](PredictorClient &self, const std::string &conf) {
return self.init(conf);
......
@@ -14,6 +14,7 @@
#pragma once
#include <string.h>
#include <vector>
#ifdef BCLOUD
#ifdef WITH_GPU
@@ -34,8 +35,8 @@ static const char* GENERAL_MODEL_NAME = "general_model";
struct GeneralBlob {
std::vector<paddle::PaddleTensor> tensor_vector;
double infer_time;
std::vector<std::string> fetch_name_vector;
int64_t time_stamp[20];
int p_size = 0;
int _batch_size;
@@ -50,22 +51,20 @@ struct GeneralBlob {
int SetBatchSize(int batch_size) { _batch_size = batch_size; }
int GetBatchSize() const { return _batch_size; }
/*
int GetBatchSize() const {
if (tensor_vector.size() > 0) {
if (tensor_vector[0].lod.size() == 1) {
return tensor_vector[0].lod[0].size() - 1;
} else {
return tensor_vector[0].shape[0];
}
} else {
return -1;
}
}
*/
std::string ShortDebugString() const { return "Not implemented!"; }
};
static void AddBlobInfo(GeneralBlob* blob, int64_t init_value) {
blob->time_stamp[blob->p_size] = init_value;
blob->p_size++;
}
static void CopyBlobInfo(const GeneralBlob* src, GeneralBlob* tgt) {
memcpy(&(tgt->time_stamp[0]),
&(src->time_stamp[0]),
src->p_size * sizeof(int64_t));
}
} // namespace serving
} // namespace paddle_serving
} // namespace baidu
@@ -51,16 +51,20 @@ int GeneralInferOp::inference() {
output_blob->SetBatchSize(batch_size);
VLOG(2) << "infer batch size: " << batch_size;
// infer
// Timer timeline;
// double infer_time = 0.0;
// timeline.Start();
Timer timeline;
int64_t start = timeline.TimeStampUS();
timeline.Start();
if (InferManager::instance().infer(GENERAL_MODEL_NAME, in, out, batch_size)) {
LOG(ERROR) << "Failed do infer in fluid model: " << GENERAL_MODEL_NAME;
return -1;
}
// timeline.Pause();
// infer_time = timeline.ElapsedUS();
int64_t end = timeline.TimeStampUS();
CopyBlobInfo(input_blob, output_blob);
AddBlobInfo(output_blob, start);
AddBlobInfo(output_blob, end);
return 0;
}
DEFINE_OP(GeneralInferOp);
......
@@ -20,11 +20,13 @@
#include "core/general-server/op/general_infer_helper.h"
#include "core/predictor/framework/infer.h"
#include "core/predictor/framework/memory.h"
#include "core/util/include/timer.h"
namespace baidu {
namespace paddle_serving {
namespace serving {
using baidu::paddle_serving::Timer;
using baidu::paddle_serving::predictor::MempoolWrapper;
using baidu::paddle_serving::predictor::general_model::Tensor;
using baidu::paddle_serving::predictor::general_model::Request;
@@ -86,9 +88,10 @@ int GeneralReaderOp::inference() {
LOG(ERROR) << "Failed get op tls reader object output";
}
Timer timeline;
int64_t start = timeline.TimeStampUS();
int var_num = req->insts(0).tensor_array_size();
VLOG(2) << "var num: " << var_num;
// read config
VLOG(2) << "start to call load general model_conf op";
baidu::paddle_serving::predictor::Resource &resource =
@@ -197,6 +200,12 @@ int GeneralReaderOp::inference() {
}
}
timeline.Pause();
int64_t end = timeline.TimeStampUS();
res->p_size = 0;
AddBlobInfo(res, start);
AddBlobInfo(res, end);
VLOG(2) << "read data from client success";
return 0;
}
......
@@ -51,6 +51,11 @@ int GeneralResponseOp::inference() {
const Request *req = dynamic_cast<const Request *>(get_request_message());
Timer timeline;
// double response_time = 0.0;
// timeline.Start();
int64_t start = timeline.TimeStampUS();
VLOG(2) << "start to call load general model_conf op";
baidu::paddle_serving::predictor::Resource &resource =
baidu::paddle_serving::predictor::Resource::instance();
@@ -69,8 +74,6 @@ int GeneralResponseOp::inference() {
// response inst with only fetch_var_names
Response *res = mutable_data<Response>();
// res->set_mean_infer_us(infer_time);
for (int i = 0; i < batch_size; ++i) {
FetchInst *fetch_inst = res->add_insts();
for (auto &idx : fetch_index) {
@@ -123,6 +126,18 @@ int GeneralResponseOp::inference() {
}
var_idx++;
}
if (req->profile_server()) {
int64_t end = timeline.TimeStampUS();
VLOG(2) << "p size for input blob: " << input_blob->p_size;
for (int i = 0; i < input_blob->p_size; ++i) {
res->add_profile_time(input_blob->time_stamp[i]);
}
// TODO(guru4elephant): find more elegant way to do this
res->add_profile_time(start);
res->add_profile_time(end);
}
return 0;
}
......
@@ -19,11 +19,13 @@
#include "core/general-server/op/general_text_reader_op.h"
#include "core/predictor/framework/infer.h"
#include "core/predictor/framework/memory.h"
#include "core/util/include/timer.h"
namespace baidu {
namespace paddle_serving {
namespace serving {
using baidu::paddle_serving::Timer;
using baidu::paddle_serving::predictor::MempoolWrapper;
using baidu::paddle_serving::predictor::general_model::Tensor;
using baidu::paddle_serving::predictor::general_model::Request;
@@ -54,9 +56,11 @@ int GeneralTextReaderOp::inference() {
return -1;
}
Timer timeline;
int64_t start = timeline.TimeStampUS();
int var_num = req->insts(0).tensor_array_size();
VLOG(2) << "var num: " << var_num;
// read config
VLOG(2) << "start to call load general model_conf op";
baidu::paddle_serving::predictor::Resource &resource =
@@ -157,6 +161,10 @@ int GeneralTextReaderOp::inference() {
}
}
int64_t end = timeline.TimeStampUS();
AddBlobInfo(res, start);
AddBlobInfo(res, end);
VLOG(2) << "read data from client success";
return 0;
}
......
@@ -12,11 +12,11 @@
// See the License for the specific language governing permissions and
// limitations under the License.
#include "core/general-server/op/general_text_response_op.h"
#include <algorithm>
#include <iostream>
#include <memory>
#include <sstream>
#include "core/general-server/op/general_text_response_op.h"
#include "core/predictor/framework/infer.h"
#include "core/predictor/framework/memory.h"
#include "core/predictor/framework/resource.h"
@@ -36,12 +36,10 @@ using baidu::paddle_serving::predictor::InferManager;
using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
int GeneralTextResponseOp::inference() {
const GeneralBlob *input_blob =
get_depend_argument<GeneralBlob>(pre_name());
const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name());
if (!input_blob) {
LOG(ERROR) << "Failed mutable depended argument, op: "
<< pre_name();
LOG(ERROR) << "Failed mutable depended argument, op: " << pre_name();
return -1;
}
@@ -49,10 +47,11 @@ int GeneralTextResponseOp::inference() {
int batch_size = input_blob->GetBatchSize();
VLOG(2) << "infer batch size: " << batch_size;
// infer
const Request *req = dynamic_cast<const Request *>(get_request_message());
Timer timeline;
int64_t start = timeline.TimeStampUS();
VLOG(2) << "start to call load general model_conf op";
baidu::paddle_serving::predictor::Resource &resource =
baidu::paddle_serving::predictor::Resource::instance();
@@ -67,15 +66,13 @@ int GeneralTextResponseOp::inference() {
fetch_index[i] =
model_config->_fetch_alias_name_to_index[req->fetch_var_names(i)];
}
// response inst with only fetch_var_names
Response *res = mutable_data<Response>();
// res->set_mean_infer_us(infer_time);
for (int i = 0; i < batch_size; ++i) {
FetchInst *fetch_inst = res->add_insts();
for (auto & idx : fetch_index) {
for (auto &idx : fetch_index) {
Tensor *tensor = fetch_inst->add_tensor_array();
// currently only response float tensor or lod_tensor
tensor->set_elem_type(1);
@@ -85,8 +82,7 @@
} else {
VLOG(2) << "out[" << idx << "] is tensor";
for (int k = 1; k < in->at(idx).shape.size(); ++k) {
VLOG(2) << "shape[" << k - 1 << "]: "
<< in->at(idx).shape[k];
VLOG(2) << "shape[" << k - 1 << "]: " << in->at(idx).shape[k];
tensor->add_shape(in->at(idx).shape[k]);
}
}
@@ -94,7 +90,7 @@ }
}
int var_idx = 0;
for (auto & idx : fetch_index) {
for (auto &idx : fetch_index) {
float *data_ptr = static_cast<float *>(in->at(idx).data.data());
int cap = 1;
for (int j = 1; j < in->at(idx).shape.size(); ++j) {
@@ -102,8 +98,8 @@ }
}
if (model_config->_is_lod_fetch[idx]) {
for (int j = 0; j < batch_size; ++j) {
for (int k = in->at(idx).lod[0][j];
k < in->at(idx).lod[0][j + 1]; k++) {
for (int k = in->at(idx).lod[0][j]; k < in->at(idx).lod[0][j + 1];
k++) {
res->mutable_insts(j)->mutable_tensor_array(var_idx)->add_float_data(
data_ptr[k]);
}
@@ -118,6 +114,18 @@ }
}
var_idx++;
}
if (req->profile_server()) {
int64_t end = timeline.TimeStampUS();
for (int i = 0; i < input_blob->p_size; ++i) {
res->add_profile_time(input_blob->time_stamp[i]);
}
// TODO(guru4elephant): find more elegant way to do this
res->add_profile_time(start);
res->add_profile_time(end);
}
return 0;
}
DEFINE_OP(GeneralTextResponseOp);
......
@@ -38,11 +38,12 @@ message FetchInst {
message Request {
repeated FeedInst insts = 1;
repeated string fetch_var_names = 2;
optional bool profile_server = 3 [ default = false ];
};
message Response {
repeated FetchInst insts = 1;
optional float mean_infer_us = 2;
repeated int64 profile_time = 2;
};
service GeneralModelService {
......
@@ -147,6 +147,7 @@ int InferService::inference(const google::protobuf::Message* request,
TRACEPRINTF("finish to thread clear");
if (_enable_map_request_to_workflow) {
LOG(INFO) << "enable map request == True";
std::vector<Workflow*>* workflows = _map_request_to_workflow(request);
if (!workflows || workflows->size() == 0) {
LOG(ERROR) << "Failed to map request to workflow";
@@ -169,6 +170,7 @@ }
}
}
} else {
LOG(INFO) << "enable map request == False";
TRACEPRINTF("start to execute one workflow");
size_t fsize = _flows.size();
for (size_t fi = 0; fi < fsize; ++fi) {
@@ -233,6 +235,7 @@ int InferService::_execute_workflow(Workflow* workflow,
TRACEPRINTF("finish to copy from");
workflow_time.stop();
LOG(INFO) << "workflow total time: " << workflow_time.u_elapsed();
PredictorMetric::GetInstance()->update_latency_metric(
WORKFLOW_METRIC_PREFIX + dv->full_name(), workflow_time.u_elapsed());
......
@@ -38,11 +38,12 @@ message FetchInst {
message Request {
repeated FeedInst insts = 1;
repeated string fetch_var_names = 2;
optional bool profile_server = 3 [ default = false ];
};
message Response {
repeated FetchInst insts = 1;
optional float mean_infer_us = 2;
repeated int64 profile_time = 2;
};
service GeneralModelService {
......
include(src/CMakeLists.txt)
add_library(utils ${util_srcs})
@@ -38,6 +38,7 @@ class Timer {
double ElapsedMS();
// return elapsed time in sec
double ElapsedSec();
int64_t TimeStampUS();
private:
struct timeval _start;
......
FILE(GLOB srcs ${CMAKE_CURRENT_LIST_DIR}/*.cc)
LIST(APPEND util_srcs ${srcs})
@@ -54,6 +54,11 @@ double Timer::ElapsedMS() { return _elapsed / 1000.0; }
double Timer::ElapsedSec() { return _elapsed / 1000000.0; }
int64_t Timer::TimeStampUS() {
gettimeofday(&_now, NULL);
return _now.tv_usec;
}
int64_t Timer::Tickus() {
gettimeofday(&_now, NULL);
return (_now.tv_sec - _start.tv_sec) * 1000 * 1000L +
......
# Contribution Guideline

## How to contribute

### Contribute Code

You are welcome to contribute to the Paddle Serving project. We sincerely appreciate your contribution; this document explains our workflow and work style.
If you have improvements for Paddle Serving, please send us pull requests! GitHub provides a [howto](https://help.github.com/articles/using-pull-requests/) on submitting pull requests.
## Workflow
Paddle Serving uses this [Git branching model](http://nvie.com/posts/a-successful-git-branching-model/). The following steps guide usual contributions.
1. Fork
Our development community has been growing fast; it doesn't make sense for everyone to write into the official repo. So, please file pull requests from your fork. To make a fork, just head over to the GitHub page and click the ["Fork" button](https://help.github.com/articles/fork-a-repo/).
1. Clone
To make a copy of your fork on your local computer, please run:
```bash
git clone https://github.com/your-github-account/Serving
cd Serving
```
1. Create the local feature branch
For daily work like adding a new feature or fixing a bug, please open a feature branch before coding:
```bash
git checkout -b my-cool-stuff
```
1. Commit
Before issuing your first `git commit` command, please install [`pre-commit`](http://pre-commit.com/) by running the following commands:
```bash
pip install pre-commit
pre-commit install
```
Our pre-commit configuration requires clang-format 3.8 for auto-formatting C/C++ code and yapf for Python.
Once installed, `pre-commit` checks the style of code and documentation in every commit. We will see something like the following when you run `git commit`:
```shell
$ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
```
NOTE: The `yapf` installed by `pip install pre-commit` differs slightly from the one installed by `conda install -c conda-forge pre-commit`. Paddle developers use `pip install pre-commit`.
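You can also trigger the hooks manually at any time; this is plain `pre-commit` usage, with an illustrative file path:
```bash
pre-commit run --files core/general-client/src/general_model.cpp  # check specific files
pre-commit run -a                                                 # check the whole repo
```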
1. Build and test
Users can build Paddle Serving natively on Linux; see the [BUILD steps](doc/INSTALL.md).
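A rough sketch of the usual out-of-source CMake flow is below; the flags are illustrative (`CLIENT_ONLY` appears in this commit's CMakeLists changes), and doc/INSTALL.md is the authoritative reference:
```bash
mkdir -p build && cd build
cmake -DCLIENT_ONLY=OFF ..   # OFF builds the server side as well
make -j$(nproc)
```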
1. Keep pulling
An experienced Git user pulls from the official repo often, daily or even hourly, so they notice conflicts with others' work early, and smaller conflicts are easier to resolve.
```bash
git remote add upstream https://github.com/PaddlePaddle/Serving
git pull upstream develop
```
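If the pull does surface a conflict, a typical resolution flow looks like the sketch below (the file name is illustrative):
```bash
git pull upstream develop      # reports a conflict in, say, foo.cpp
# edit foo.cpp and resolve the <<<<<<< / ======= / >>>>>>> markers, then:
git add foo.cpp
git commit                     # concludes the merge
```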
1. Push and file a pull request
You can "push" your local work into your forked repo:
```bash
git push origin my-cool-stuff
```
The push allows you to create a pull request, requesting owners of this [official repo](https://github.com/PaddlePaddle/Serving) to pull your change into the official one.
To create a pull request, please follow [these steps](https://help.github.com/articles/creating-a-pull-request/).
If your change fixes an issue, please write ["Fixes <issue-URL>"](https://help.github.com/articles/closing-issues-using-keywords/) in the description section of your pull request. GitHub will close the issue when the owners merge your pull request.
Please remember to specify some reviewers for your pull request. If you don't know who the right ones are, please follow GitHub's recommendation.
1. Delete local and remote branches
To keep your local workspace and your fork clean, you might want to remove merged branches:
```bash
git push origin :my-cool-stuff
git checkout develop
git pull upstream develop
git branch -d my-cool-stuff
```
### Code Review
- Please feel free to ping your reviewers by sending them the URL of your pull request via IM or email. Please do this after your pull request passes the CI.
- Please answer every comment from your reviewers. If you are going to follow a suggestion, please write "Done"; otherwise, please give a reason.
- If you don't want your reviewers to get overwhelmed by email notifications, you can reply to their comments [in a batch](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/).
- Avoid unnecessary commits. Some developers commit often; it is recommended to squash a sequence of small changes into one commit by running `git commit --amend` instead of `git commit`, as sketched below.
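A minimal sketch of that amend-then-update flow, reusing the `my-cool-stuff` branch from the steps above (`--force-with-lease` is a suggested safety flag, not a project requirement):
```bash
git add path/to/file                               # stage the follow-up change
git commit --amend --no-edit                       # fold it into the previous commit
git push --force-with-lease origin my-cool-stuff   # update the pull request
```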
## Coding Standard
### Code Style
Our C/C++ code follows the [Google style guide](http://google.github.io/styleguide/cppguide.html).
Our Python code follows the [PEP8 style guide](https://www.python.org/dev/peps/pep-0008/).
Please install pre-commit, which automatically reformats the changes to C/C++ and Python code whenever we run `git commit`. To check the whole codebase, we can run `pre-commit run -a`, which is what our Travis CI configuration invokes.
### Unit Tests
Please remember to add related unit tests.
- For C/C++ code, please follow the [`google-test` Primer](https://github.com/google/googletest/blob/master/googletest/docs/primer.md).
- For Python code, please use [Python's standard `unittest` package](http://pythontesting.net/framework/unittest/unittest-introduction/).
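For instance, a minimal `google-test` case might look like the following sketch; the `Add` function and test names are hypothetical, not part of Paddle Serving:
```c++
#include <gtest/gtest.h>

// Hypothetical function under test.
int Add(int a, int b) { return a + b; }

TEST(AddTest, HandlesPositiveInput) { EXPECT_EQ(Add(2, 3), 5); }

int main(int argc, char** argv) {
  ::testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();
}
```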
### Writing Logs
We use [glog](https://github.com/google/glog) for logging in our C/C++ code.
We use `LOG()` for general logging:
```c++
LOG(INFO) << "Operator FC is taking " << num_inputs << " inputs.";
```
When we run a Paddle Serving application or test, we can specify a logging level. For example:
```bash
GLOG_minloglevel=1 bin/serving
```
- 0 - INFO
- 1 - WARNING
- 2 - ERROR
- 3 - FATAL (be careful, as a FATAL log will generate a core dump)
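As a quick illustration of how this threshold interacts with glog's severities, here is a sketch (the messages are made up; the binary is assumed to initialize glog):
```c++
#include <glog/logging.h>

int main(int argc, char** argv) {
  google::InitGoogleLogging(argv[0]);
  // With GLOG_minloglevel=1 in the environment, the INFO line is
  // suppressed while WARNING and above are still written.
  LOG(INFO) << "loaded general model config";  // severity 0
  LOG(WARNING) << "feed shape mismatch";       // severity 1
  LOG(ERROR) << "failed to fetch predictor";   // severity 2
  return 0;
}
```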
@@ -17,6 +17,7 @@ from .proto import sdk_configure_pb2 as sdk
from .proto import general_model_config_pb2 as m_config
import google.protobuf.text_format
import time
import sys
int_type = 0
float_type = 1
@@ -87,6 +88,9 @@ class Client(object):
# map feed names to index
self.client_handle_ = PredictorClient()
self.client_handle_.init(path)
read_env_flags = ["profile_client", "profile_server"]
self.client_handle_.init_gflags([sys.argv[0]] +
["--tryfromenv=" + ",".join(read_env_flags)])
self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
self.feed_shapes_ = [var.shape for var in model_conf.feed_var]
@@ -143,9 +147,6 @@ class Client(object):
for i, name in enumerate(fetch_names):
result_map[name] = result[i]
if profile:
result_map["infer_time"] = result[-1][0]
return result_map
def batch_predict(self, feed_batch=[], fetch=[], profile=False):
......