diff --git a/doc/README.md b/doc/README.md
deleted file mode 100644
index 2d51eba9e2a2902685f9385c83542f32b98e5b4f..0000000000000000000000000000000000000000
--- a/doc/README.md
+++ /dev/null
@@ -1,119 +0,0 @@
-# Paddle Serving
-
-([简体中文](./README_CN.md)|English)
-
-Paddle Serving is PaddlePaddle's online prediction serving framework. It helps developers easily deploy remote prediction services that serve deep learning models to mobile and server-side clients. At present, Paddle Serving mainly targets models trained with PaddlePaddle and can be used together with the Paddle training framework to deploy inference services quickly. Paddle Serving is designed around common industrial deployment scenarios for deep learning models; core features include multi-model management, model hot loading, high-concurrency and low-latency responses built on [Baidu-rpc](https://github.com/apache/incubator-brpc), and online model A/B testing. An API that integrates with the Paddle training framework lets users move seamlessly from training to remote deployment, shortening the path from a trained model to a production service.
-
-------------
-
-## Quick Start
-
-The current develop branch of Paddle Serving provides a lightweight Python API for fast prediction and integrates directly with Paddle training. We use the classic Boston house price prediction task as an example to walk through training a model on a single machine and deploying it with Paddle Serving.
-
-#### Install
-
-It is highly recommended that you build Paddle Serving inside Docker; please read [How to run PaddleServing in Docker](RUN_IN_DOCKER.md).
-
-```
-pip install paddle-serving-client
-pip install paddle-serving-server
-```
-
-#### Training Script
-``` python
-import sys
-import paddle
-import paddle.fluid as fluid
-
-train_reader = paddle.batch(paddle.reader.shuffle(
-    paddle.dataset.uci_housing.train(), buf_size=500), batch_size=16)
-
-test_reader = paddle.batch(paddle.reader.shuffle(
-    paddle.dataset.uci_housing.test(), buf_size=500), batch_size=16)
-
-x = fluid.data(name='x', shape=[None, 13], dtype='float32')
-y = fluid.data(name='y', shape=[None, 1], dtype='float32')
-
-y_predict = fluid.layers.fc(input=x, size=1, act=None)
-cost = fluid.layers.square_error_cost(input=y_predict, label=y)
-avg_loss = fluid.layers.mean(cost)
-sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
-sgd_optimizer.minimize(avg_loss)
-
-place = fluid.CPUPlace()
-feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
-exe = fluid.Executor(place)
-exe.run(fluid.default_startup_program())
-
-import paddle_serving_client.io as serving_io
-
-for pass_id in range(30):
-    for data_train in train_reader():
-        avg_loss_value, = exe.run(
-            fluid.default_main_program(),
-            feed=feeder.feed(data_train),
-            fetch_list=[avg_loss])
-
-serving_io.save_model(
-    "serving_server_model", "serving_client_conf",
-    {"x": x}, {"y": y_predict}, fluid.default_main_program())
-```
-
-#### Server Side Code
-``` python
-import sys
-from paddle_serving_server import OpMaker
-from paddle_serving_server import OpSeqMaker
-from paddle_serving_server import Server
-
-op_maker = OpMaker()
-read_op = op_maker.create('general_reader')
-general_infer_op = op_maker.create('general_infer')
-
-op_seq_maker = OpSeqMaker()
-op_seq_maker.add_op(read_op)
-op_seq_maker.add_op(general_infer_op)
-
-server = Server()
-server.set_op_sequence(op_seq_maker.get_op_sequence())
-server.load_model_config(sys.argv[1])
-server.prepare_server(workdir="work_dir1", port=9393, device="cpu")
-server.run_server()
-```
-
-#### Launch the Server
-``` shell
-python test_server.py serving_server_model
-```
-
-#### Client Prediction
-``` python
-from paddle_serving_client import Client
-import paddle
-import sys
-
-client = Client()
-client.load_client_config(sys.argv[1])
-client.connect(["127.0.0.1:9393"])
-
-test_reader = paddle.batch(paddle.reader.shuffle(
-    paddle.dataset.uci_housing.test(), buf_size=500), batch_size=1)
-
-for data in test_reader():
-    fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["y"])
-    print("{} {}".format(fetch_map["y"][0], data[0][1][0]))
-
-```
-
-### Documents
-
-[Design Doc](DESIGN.md)
-
-[FAQ](./deprecated/FAQ.md)
-
-### Senior Developer Guidelines
-
-[Compile Tutorial](COMPILE.md)
-
-## Contribution
-If you want to contribute to Paddle Serving, please refer to [CONTRIBUTE](CONTRIBUTE.md).
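The README above shows how the server script is launched but not how the client script receives its configuration. As a minimal sketch (not part of the original document), assuming the client code is saved as `test_client.py` (a hypothetical file name) and that `serving_io.save_model` wrote the client configuration to `serving_client_conf/serving_client_conf.prototxt`, the client could be started as follows:

``` shell
# Hypothetical invocation; the script name and the exact prototxt path are
# assumptions, not taken from the README above.
python test_client.py serving_client_conf/serving_client_conf.prototxt
```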
diff --git a/doc/README_CN.md b/doc/README_CN.md
deleted file mode 100644
index da5641cad333518ded9fbae4438f05ae20e30ddd..0000000000000000000000000000000000000000
--- a/doc/README_CN.md
+++ /dev/null
@@ -1,119 +0,0 @@
-# Paddle Serving
-
-(简体中文|[English](./README.md))
-
-Paddle Serving is PaddlePaddle's online prediction serving framework. It helps developers easily deploy remote prediction services that serve deep learning models to mobile and server-side clients. At present, Paddle Serving mainly targets models trained with PaddlePaddle and can be used together with the Paddle training framework to deploy inference services quickly. Paddle Serving is designed around common industrial deployment scenarios for deep learning models; core features include multi-model management, model hot loading, high-concurrency and low-latency responses built on [Baidu-rpc](https://github.com/apache/incubator-brpc), and online model A/B testing. An API that integrates with the Paddle training framework lets users move seamlessly from training to remote deployment, shortening the path from a trained model to a production service.
-
-------------
-
-## Quick Start Guide
-
-The current develop branch of Paddle Serving provides a lightweight Python API for fast prediction and integrates directly with Paddle training. We use the classic Boston house price prediction task as an example to walk through training a model on a single machine and deploying it with Paddle Serving.
-
-#### Install
-
-It is highly recommended that you build Paddle Serving inside Docker; please read [How to run PaddleServing in Docker](RUN_IN_DOCKER_CN.md).
-
-```
-pip install paddle-serving-client
-pip install paddle-serving-server
-```
-
-#### Training Script
-``` python
-import sys
-import paddle
-import paddle.fluid as fluid
-
-train_reader = paddle.batch(paddle.reader.shuffle(
-    paddle.dataset.uci_housing.train(), buf_size=500), batch_size=16)
-
-test_reader = paddle.batch(paddle.reader.shuffle(
-    paddle.dataset.uci_housing.test(), buf_size=500), batch_size=16)
-
-x = fluid.data(name='x', shape=[None, 13], dtype='float32')
-y = fluid.data(name='y', shape=[None, 1], dtype='float32')
-
-y_predict = fluid.layers.fc(input=x, size=1, act=None)
-cost = fluid.layers.square_error_cost(input=y_predict, label=y)
-avg_loss = fluid.layers.mean(cost)
-sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
-sgd_optimizer.minimize(avg_loss)
-
-place = fluid.CPUPlace()
-feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
-exe = fluid.Executor(place)
-exe.run(fluid.default_startup_program())
-
-import paddle_serving_client.io as serving_io
-
-for pass_id in range(30):
-    for data_train in train_reader():
-        avg_loss_value, = exe.run(
-            fluid.default_main_program(),
-            feed=feeder.feed(data_train),
-            fetch_list=[avg_loss])
-
-serving_io.save_model(
-    "serving_server_model", "serving_client_conf",
-    {"x": x}, {"y": y_predict}, fluid.default_main_program())
-```
-
-#### Server Side Code
-``` python
-import sys
-from paddle_serving_server import OpMaker
-from paddle_serving_server import OpSeqMaker
-from paddle_serving_server import Server
-
-op_maker = OpMaker()
-read_op = op_maker.create('general_reader')
-general_infer_op = op_maker.create('general_infer')
-
-op_seq_maker = OpSeqMaker()
-op_seq_maker.add_op(read_op)
-op_seq_maker.add_op(general_infer_op)
-
-server = Server()
-server.set_op_sequence(op_seq_maker.get_op_sequence())
-server.load_model_config(sys.argv[1])
-server.prepare_server(workdir="work_dir1", port=9393, device="cpu")
-server.run_server()
-```
-
-#### Launch the Server
-``` shell
-python test_server.py serving_server_model
-```
-
-#### Client Prediction
-``` python
-from paddle_serving_client import Client
-import paddle
-import sys
-
-client = Client()
-client.load_client_config(sys.argv[1])
-client.connect(["127.0.0.1:9393"])
-
-test_reader = paddle.batch(paddle.reader.shuffle(
-    paddle.dataset.uci_housing.test(), buf_size=500), batch_size=1)
-
-for data in test_reader():
-    fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["y"])
-    print("{} {}".format(fetch_map["y"][0], data[0][1][0]))
-
-```
-
-### Documents
-
-[Design Doc](DESIGN_CN.md)
-
-[FAQ](./deprecated/FAQ.md)
-
-### Senior Developer Guidelines
-
-[Compile Guide](COMPILE_CN.md)
-
-## Contribution
-If you want to contribute to Paddle Serving, please refer to the [Contribution Guide](CONTRIBUTE.md).
diff --git a/python/examples/imagenet/benchmark.py b/python/examples/imagenet/benchmark.py
index 5c4c44cc1bd091af6c4d343d2b7f0f436cca2e7e..f4a7b083300be727ba81e880c41791bf36bfd6f7 100644
--- a/python/examples/imagenet/benchmark.py
+++ b/python/examples/imagenet/benchmark.py
@@ -25,36 +25,36 @@ import base64
 from paddle_serving_client import Client
 from paddle_serving_client.utils import MultiThreadRunner
 from paddle_serving_client.utils import benchmark_args
-from paddle_serving_app.reader import Sequential, URL2Image, Resize
+from paddle_serving_app.reader import Sequential, File2Image, Resize
 from paddle_serving_app.reader import CenterCrop, RGB2BGR, Transpose, Div, Normalize

 args = benchmark_args()

 seq_preprocess = Sequential([
-    URL2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
+    File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
     Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
 ])


 def single_func(idx, resource):
     file_list = []
+    turns = 10
     for file_name in os.listdir("./image_data/n01440764"):
         file_list.append(file_name)
     img_list = []
     for i in range(1000):
-        img_list.append(open("./image_data/n01440764/" + file_list[i]).read())
+        img_list.append("./image_data/n01440764/" + file_list[i])
     profile_flags = False
     if "FLAGS_profile_client" in os.environ and os.environ[
             "FLAGS_profile_client"]:
         profile_flags = True
     if args.request == "rpc":
-        reader = ImageReader()
         fetch = ["score"]
         client = Client()
         client.load_client_config(args.model)
         client.connect([resource["endpoint"][idx % len(resource["endpoint"])]])
         start = time.time()
-        for i in range(1000):
+        for i in range(turns):
             if args.batch_size >= 1:
                 feed_batch = []
                 i_start = time.time()
@@ -77,7 +77,7 @@ def single_func(idx, resource):
         server = "http://" + resource["endpoint"][idx % len(resource[
             "endpoint"])] + "/image/prediction"
         start = time.time()
-        for i in range(1000):
+        for i in range(turns):
             if py_version == 2:
                 image = base64.b64encode(
                     open("./image_data/n01440764/" + file_list[i]).read())
@@ -93,8 +93,9 @@ def single_func(idx, resource):

 if __name__ == '__main__':
     multi_thread_runner = MultiThreadRunner()
-    endpoint_list = ["127.0.0.1:9393"]
-    #endpoint_list = endpoint_list + endpoint_list + endpoint_list
+    endpoint_list = [
+        "127.0.0.1:9292", "127.0.0.1:9293", "127.0.0.1:9294", "127.0.0.1:9295"
+    ]
     result = multi_thread_runner.run(single_func, args.thread,
                                      {"endpoint": endpoint_list})
     #result = single_func(0, {"endpoint": endpoint_list})
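The benchmark change above replaces `URL2Image` with `File2Image`, so each request now reads a local image file instead of downloading one. As a minimal sketch of how this preprocessing pipeline can be exercised outside the benchmark loop, assuming a running server, a client configuration at `serving_client_conf/serving_client_conf.prototxt`, a feed variable named `image`, and a sample file name (all illustrative assumptions, not taken from benchmark.py):

``` python
# Sketch only: run the File2Image-based pipeline on one local image and send a
# single RPC request. Config path, endpoint, feed name, and file name are
# illustrative assumptions.
from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential, File2Image, Resize
from paddle_serving_app.reader import CenterCrop, RGB2BGR, Transpose, Div, Normalize

seq = Sequential([
    File2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
    Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
])

client = Client()
client.load_client_config("serving_client_conf/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])

img = seq("./image_data/n01440764/sample.JPEG")  # hypothetical sample image
fetch_map = client.predict(feed={"image": img}, fetch=["score"])
print(fetch_map["score"])
```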
diff --git a/python/examples/imagenet/benchmark.sh b/python/examples/imagenet/benchmark.sh
index 84885908fa89d050b3ca71386fe2a21533ce0809..d7eb89fa9b0b68e5e442d15bdf16f431c91ba94d 100644
--- a/python/examples/imagenet/benchmark.sh
+++ b/python/examples/imagenet/benchmark.sh
@@ -11,7 +11,7 @@ $PYTHONROOT/bin/python benchmark.py --thread 8 --batch_size 1 --model $2/serving
 for thread_num in 4 8 16
 do
-for batch_size in 1 4 16 64 256
+for batch_size in 1 4 16 64
 do
    $PYTHONROOT/bin/python benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
    echo "model name :" $1
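For reference, `benchmark.sh` expects `PYTHONROOT` to point at the Python installation to use, takes the model name as its first argument (used in the report output), and takes as its second argument a directory containing `serving_client_conf.prototxt`. A possible invocation, with illustrative values:

``` shell
# Illustrative values only; adjust PYTHONROOT, the model name, and the
# client-config directory to your environment.
export PYTHONROOT=/usr
bash benchmark.sh ResNet50_vd ResNet50_vd_client_config
```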