Commit b5285396 authored by TeslaZhao

Update examples of pipeline low precision

Parent d9b3d73b
@@ -2,19 +2,24 @@
This document takes the Imagenet service as an example to introduce how to use Pipeline WebService.
## Get model
## 1.Get model
```
wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50_quant.tar.gz
tar zxvf ResNet50_quant.tar.gz
```
## Start server
## 2.Save model var for serving
```
python3 -m paddle_serving_client.convert --dirname ResNet50_quant --serving_server serving_server --serving_client serving_client
```
## 3.Start server
```
python3 resnet50_web_service.py &>log.txt &
```
## RPC test
## 4.Test
```
python3 pipeline_rpc_client.py
python3 pipeline_http_client.py
```
# Imagenet Pipeline WebService
# Low precision ResNet50 Pipeline WebService
This section takes the Imagenet service as an example to introduce how to use Pipeline WebService.
This section takes a low-precision ResNet50 model as an example to introduce how to deploy and use Pipeline WebService.
## Get model
## 1.Get model
```
wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50_quant.tar.gz
tar zxvf ResNet50_quant.tar.gz
```
## Start server
## 2.Save model parameters for serving
```
python3 -m paddle_serving_client.convert --dirname ResNet50_quant --serving_server serving_server --serving_client serving_client
```
## 3.Start server
```
python3 resnet50_web_service.py &>log.txt &
```
## Test
## 4.Test
```
python3 pipeline_rpc_client.py
python3 pipeline_http_client.py
```
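For reference, the test clients above send a JSON body of parallel key/value lists. Below is a minimal sketch of building that payload; the `image` input key, the `imagenet` op name in the URL, and port 18080 are assumptions taken from this example's config and scripts, not guaranteed API details.

```python
import base64
import json

def build_payload(image_bytes):
    # Pipeline services take parallel "key"/"value" lists; image bytes
    # are sent base64-encoded under the op's input key ("image" here).
    img_b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {"key": ["image"], "value": [img_b64]}

body = json.dumps(build_payload(b"<jpeg bytes>"))
# pipeline_http_client.py would POST a body like this to the service,
# e.g. http://127.0.0.1:18080/imagenet/prediction (port from config.yml).
print(json.loads(body)["key"])  # ['image']
```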
#worker_num, the maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building a gRPC server and a DAG
##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the gRPC thread pool of the main thread
worker_num: 1
worker_num: 10
#HTTP port. rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, http_port is not generated automatically
http_port: 18080
@@ -21,7 +21,7 @@ op:
model_config: serving_server/
#Device type. If unset, determined by devices (CPU/GPU): 0=CPU, 1=GPU, 2=TensorRT, 3=ARM CPU, 4=Kunlun XPU
device_type: 1
device_type: 2
#Device IDs. When devices is "" or unset, inference runs on CPU; when devices is "0" or "0,1,2", inference runs on the specified GPU cards
devices: "0" # "0,1"
@@ -30,15 +30,15 @@ op:
client_type: local_predictor
#Fetch result list, using the alias_name of fetch_var in client_config
fetch_list: ["score"]
fetch_list: ["save_infer_model/scale_0.tmp_0"]
#precision: lowering inference precision can speed up inference
#GPU supports: "fp32" (default), "fp16", "int8";
#CPU supports: "fp32" (default), "fp16", "bf16" (mkldnn); "int8" is not supported
precision: "fp32"
precision: "int8"
#Enable TensorRT calibration
use_calib: True
#Enable TensorRT calibration. Quantized models must set use_calib: False; generating int8 offline from a non-quantized model requires use_calib: True
use_calib: False
#开启 ir_optim
ir_optim: True
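Putting the changed fields together, the low-precision GPU path of this example boils down to the following config sketch (illustrative, not the full config.yml; the `imagenet` op name is assumed from this example):

```yaml
worker_num: 10
http_port: 18080
op:
  imagenet:
    local_service_conf:
      model_config: serving_server/
      device_type: 2        # 2 = TensorRT
      devices: "0"
      client_type: local_predictor
      fetch_list: ["save_infer_model/scale_0.tmp_0"]
      precision: "int8"
      use_calib: False      # model is already quantized
      ir_optim: True
```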
@@ -47,7 +47,7 @@ class ImagenetOp(Op):
return {"image": input_imgs}, False, None, ""
def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
score_list = fetch_dict["score"]
score_list = fetch_dict["save_infer_model/scale_0.tmp_0"]
result = {"label": [], "prob": []}
for score in score_list:
score = score.tolist()
......
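The truncated `postprocess` above extracts the top-1 class from each score vector. A self-contained sketch of that logic, with illustrative labels and scores standing in for the real Imagenet label file:

```python
import numpy as np

def top1_result(score_list, label_list):
    # Mirror of the diff's postprocess loop: for each score array,
    # record the highest-probability label and its probability.
    result = {"label": [], "prob": []}
    for score in score_list:
        score = score.tolist()
        max_prob = max(score)
        result["label"].append(label_list[score.index(max_prob)])
        result["prob"].append(max_prob)
    return result

labels = ["tench", "goldfish", "great white shark"]  # illustrative
scores = [np.array([0.05, 0.9, 0.05])]
print(top1_result(scores, labels))
# {'label': ['goldfish'], 'prob': [0.9]}
```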