Commit b5285396 authored by TeslaZhao

Update examples of pipeline low precision

Parent d9b3d73b
@@ -2,19 +2,24 @@
This document takes the ImageNet service as an example to introduce how to use Pipeline WebService.
## 1.Get model
```
wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50_quant.tar.gz
tar zxvf ResNet50_quant.tar.gz
```
## 2.Save model parameters for serving
```
python3 -m paddle_serving_client.convert --dirname ResNet50_quant --serving_server serving_server --serving_client serving_client
```
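The convert command above wraps `paddle_serving_client.io.inference_model_to_serving`; if you would rather script this step, a minimal sketch under the same assumptions (default model/params filenames inside `ResNet50_quant`) looks like this:

```
# Sketch of scripting the conversion step above; assumes the default
# model/params filenames inside ResNet50_quant (pass model_filename /
# params_filename explicitly if your export used combined files).
from paddle_serving_client.io import inference_model_to_serving

inference_model_to_serving(
    dirname="ResNet50_quant",
    serving_server="serving_server",  # server-side model and config
    serving_client="serving_client",  # client-side config
)
```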
## 3.Start server
```
python3 resnet50_web_service.py &>log.txt &
```
## 4.Test
```
python3 pipeline_rpc_client.py
python3 pipeline_http_client.py
```
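For reference, `pipeline_http_client.py` in the ImageNet pipeline examples essentially posts a base64-encoded image to the service's HTTP endpoint. A minimal sketch, where the `/imagenet/prediction` route and the `daisy.jpg` test image are assumptions:

```
# Sketch of a pipeline HTTP call; the /imagenet/prediction route and the
# daisy.jpg test image are assumptions based on the ImageNet examples.
import base64
import json

import requests

url = "http://127.0.0.1:18080/imagenet/prediction"
with open("daisy.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")

# Pipeline HTTP services take parallel key/value lists in the JSON body.
data = {"key": ["image"], "value": [image]}
resp = requests.post(url=url, data=json.dumps(data))
print(resp.json())
```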
# Low precision of ResNet50 Pipeline WebService
This document takes a low-precision ResNet50 model as an example to introduce how to deploy and use Pipeline WebService.
## 1.Get model
```
wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50_quant.tar.gz
tar zxvf ResNet50_quant.tar.gz
```
## 2.Save model parameters for serving
```
python3 -m paddle_serving_client.convert --dirname ResNet50_quant --serving_server serving_server --serving_client serving_client
```
## 3.Start server
```
python3 resnet50_web_service.py &>log.txt &
```
## 4.Test
```
python3 pipeline_rpc_client.py
python3 pipeline_http_client.py
```
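Similarly, `pipeline_rpc_client.py` talks to the service's gRPC port through `PipelineClient`. A minimal sketch, where the port `9993` and the `daisy.jpg` test image are assumptions (use the `rpc_port` set in `config.yml`):

```
# Sketch of a pipeline RPC call; port 9993 and daisy.jpg are assumptions --
# use the rpc_port configured in config.yml for this service.
import base64

from paddle_serving_server.pipeline import PipelineClient

client = PipelineClient()
client.connect(["127.0.0.1:9993"])

with open("daisy.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")

ret = client.predict(feed_dict={"image": image}, fetch=["label", "prob"])
print(ret)
```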
#worker_num, maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own grpcServer and DAG
##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's grpc thread pool
worker_num: 10
#http port; rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, http_port is not generated automatically
http_port: 18080
@@ -21,7 +21,7 @@ op:
model_config: serving_server/
#compute hardware type: when left empty, determined by devices (CPU/GPU); 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
device_type: 2
#compute hardware IDs; when devices is "" or unset, inference runs on CPU; when devices is "0" or "0,1,2", inference runs on the listed GPU cards
devices: "0" # "0,1"
@@ -30,15 +30,15 @@ op:
client_type: local_predictor
#Fetch result list, based on the alias_name of fetch_var in client_config
fetch_list: ["save_infer_model/scale_0.tmp_0"]
#precision, inference precision; lowering the precision can speed up inference
#GPU supports: "fp32"(default), "fp16", "int8"
#CPU supports: "fp32"(default), "fp16", "bf16"(mkldnn); "int8" is not supported
precision: "int8"
#enable TensorRT calibration; quantized models must set use_calib: False, while non-quantized models generating int8 offline need use_calib: True
use_calib: False
#enable ir_optim
ir_optim: True
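Because pairing an already-quantized model with `use_calib: True` is an easy misconfiguration, a small sanity check before startup can catch it. A sketch, where the `config.yml` filename and the `imagenet` op name are assumptions:

```
# Sketch: validate the low-precision knobs before starting the service.
# The config.yml path and the "imagenet" op name are assumptions.
import yaml

with open("config.yml") as f:
    conf = yaml.safe_load(f)

local = conf["op"]["imagenet"]["local_service_conf"]
assert local["device_type"] == 2, "int8 here relies on the TensorRT backend"
assert local["precision"] == "int8"
# The model is quantized offline, so TensorRT calibration must stay off.
assert local["use_calib"] is False
```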
@@ -47,7 +47,7 @@ class ImagenetOp(Op):
        return {"image": input_imgs}, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        score_list = fetch_dict["save_infer_model/scale_0.tmp_0"]
        result = {"label": [], "prob": []}
        for score in score_list:
            score = score.tolist()
...
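The diff cuts the loop off here; in the ImageNet pipeline examples, postprocess typically finishes by taking the top-1 score and mapping it to a class label. A completion sketch under that assumption (the `label_dict` lookup, loaded in the op's `init_op` from an ImageNet label file, is assumed, not shown in this diff):

```
# Sketch: how the truncated postprocess typically completes in the ImageNet
# pipeline examples; self.label_dict (index -> class name) is assumed to be
# loaded in init_op from an ImageNet label file.
from paddle_serving_server.web_service import Op


class ImagenetOp(Op):  # continuation sketch, not the verbatim file
    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        score_list = fetch_dict["save_infer_model/scale_0.tmp_0"]
        result = {"label": [], "prob": []}
        for score in score_list:
            score = score.tolist()
            max_score = max(score)
            # top-1 index -> human-readable label, plus its probability
            result["label"].append(self.label_dict[score.index(max_score)])
            result["prob"].append(max_score)
        result["label"] = str(result["label"])
        result["prob"] = str(result["prob"])
        return result, None, ""
```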