Merge pull request #1766 from TeslaZhao/develop

Update examples of pipeline low precision

eb0eb46c · TeslaZhao · GitHub · b37d6a96 · ce011fdb · eb0eb46c
5 changed files
--- a/examples/Pipeline/LowPrecision/ResNet50_Slim/README.md
+++ b/examples/Pipeline/LowPrecision/ResNet50_Slim/README.md
-# Imagenet Pipeline WebService
+# Low-precision examples of Python Pipeline

-This document takes the Imagenet service as an example to introduce how to use Pipeline WebService.
+Here we take the ResNet50 quantized model as an example to introduce low-precision deployment with Python Pipeline.

-## Get model
+## 1.Get model
 ```
 wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50_quant.tar.gz
 tar zxvf ResNet50_quant.tar.gz
 ```

-## Start server
+## 2.Save model variables for serving
+```
+python3 -m paddle_serving_client.convert --dirname ResNet50_quant --serving_server serving_server --serving_client serving_client
+```

+## 3.Start server
 ```
 python3 resnet50_web_service.py &>log.txt &
 ```

-## RPC test
+## 4.Test
 ```
 python3 pipeline_rpc_client.py
+python3 pipeline_http_client.py
 ```
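The HTTP test step added above can be pictured with a minimal client sketch. This is not the contents of `pipeline_http_client.py`, which this diff does not show. Port 18080 comes from `http_port` in `config.yml` and the `image` key from the `preprocess` return value; the `imagenet/prediction` path, the key/value JSON layout, and the helper names `build_payload` and `send` are assumptions made for illustration.

```python
import base64
import json
import urllib.request

# http_port 18080 is taken from config.yml. The "imagenet" op name and the
# "/prediction" suffix are assumptions; they are not shown in this diff.
URL = "http://127.0.0.1:18080/imagenet/prediction"


def build_payload(image_bytes):
    """Wrap raw image bytes in the key/value JSON layout assumed here."""
    image_b64 = base64.b64encode(image_bytes).decode("utf8")
    return {"key": ["image"], "value": [image_b64]}


def send(image_bytes, url=URL):
    """POST the payload to the running service and return the parsed JSON reply."""
    body = json.dumps(build_payload(image_bytes)).encode("utf8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf8"))
```

With the server from step 3 running, `send(open("some.jpg", "rb").read())` would issue one request; the file name is a placeholder.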
--- a/examples/Pipeline/LowPrecision/ResNet50_Slim/README_CN.md
+++ b/examples/Pipeline/LowPrecision/ResNet50_Slim/README_CN.md
-# Imagenet Pipeline WebService
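+# Python Pipeline low-precision deployment example

-Here we take the Imagenet service as an example to introduce how to use Pipeline WebService.
+Here we take the ResNet50 quantized model as an example to introduce low-precision quantized model deployment with Python Pipeline.

-## Get model
+## 1.Get model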
 ```
 wget https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50_quant.tar.gz
 tar zxvf ResNet50_quant.tar.gz
 ```

-## Start server
+## 2.Save model parameters
+```
+python3 -m paddle_serving_client.convert --dirname ResNet50_quant --serving_server serving_server --serving_client serving_client
+```

+## 3.Start server
 ```
 python3 resnet50_web_service.py &>log.txt &
 ```

-## Test
+## 4.Test
 ```
 python3 pipeline_rpc_client.py
+python3 pipeline_http_client.py
 ```
--- a/examples/Pipeline/LowPrecision/ResNet50_Slim/config.yml
+++ b/examples/Pipeline/LowPrecision/ResNet50_Slim/config.yml
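 #worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG
 ##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's gRPC thread pool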
-worker_num: 1
+worker_num: 10

 #HTTP port. rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, http_port is not generated automatically
 http_port: 18080
@@ -21,7 +21,7 @@ op:
            model_config: serving_server/

            #Computing hardware type: when omitted, determined by devices (CPU/GPU); 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
-            device_type: 1
+            device_type: 2

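            #Computing hardware ID: when devices is "" or omitted, prediction runs on CPU; when devices is "0" or "0,1,2", prediction runs on GPU and the value lists the GPU cards to use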
            devices: "0" # "0,1"
@@ -30,15 +30,15 @@ op:
            client_type: local_predictor

            #Fetch result list, based on the alias_name of fetch_var in client_config
-            fetch_list: ["score"]
+            fetch_list: ["save_infer_model/scale_0.tmp_0"]

            #precision: prediction precision; lowering it can speed up prediction
            #GPU supports: "fp32"(default), "fp16", "int8";
            #CPU supports: "fp32"(default), "fp16", "bf16"(mkldnn); not supported: "int8"
-            precision: "fp32" 
+            precision: "int8" 

-            #Enable TensorRT calibration
-            use_calib: True
+            #Enable TensorRT calibration. Quantized models must set use_calib: False; non-quantized models that generate int8 offline need use_calib: True
+            use_calib: False

            #Enable ir_optim
            ir_optim: True 
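The rules stated in the `config.yml` comments (device codes, per-device precision support, and `use_calib` for quantized models) can be restated as a small consistency check. This is only a sketch of those comments: the function name `check_low_precision` is hypothetical, it is not part of Paddle Serving, and treating `device_type` 1 and 2 both as "GPU" for the precision rule is an assumption.

```python
# Device codes copied from the config.yml comment.
DEVICE_TYPES = {0: "cpu", 1: "gpu", 2: "tensorRT", 3: "arm cpu", 4: "kunlun xpu"}
GPU_PRECISIONS = {"fp32", "fp16", "int8"}
CPU_PRECISIONS = {"fp32", "fp16", "bf16"}


def check_low_precision(conf, quantized_model):
    """Return a list of problems found in a local_service_conf-style dict."""
    problems = []
    device = conf.get("device_type")
    precision = conf.get("precision", "fp32")  # fp32 is the documented default

    # Assumption: gpu (1) and tensorRT (2) share the GPU precision list.
    if device in (1, 2) and precision not in GPU_PRECISIONS:
        problems.append("precision %r is not listed for GPU" % precision)
    if device == 0 and precision not in CPU_PRECISIONS:
        problems.append("precision %r is not listed for CPU" % precision)

    # Per the comment: quantized models set use_calib to False.
    if quantized_model and conf.get("use_calib"):
        problems.append("quantized models should set use_calib: False")
    return problems
```

For the values introduced by this change (`device_type: 2`, `precision: "int8"`, `use_calib: False`), the check reports no problems.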
--- a/examples/Pipeline/LowPrecision/ResNet50_Slim/imagenet.label
+++ b/examples/Pipeline/LowPrecision/ResNet50_Slim/imagenet.label
--- a/examples/Pipeline/LowPrecision/ResNet50_Slim/resnet50_web_service.py
+++ b/examples/Pipeline/LowPrecision/ResNet50_Slim/resnet50_web_service.py
@@ -47,7 +47,7 @@ class ImagenetOp(Op):
        return {"image": input_imgs}, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
-        score_list = fetch_dict["score"]
+        score_list = fetch_dict["save_infer_model/scale_0.tmp_0"]
        result = {"label": [], "prob": []}
        for score in score_list:
            score = score.tolist()
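The hunk ends before the rest of `postprocess` is visible. As an illustration only, and not the code from `resnet50_web_service.py`, a top-1 selection over a fetched score list could look like the sketch below. The function name `top1` and the `labels` argument are hypothetical; in the service the scores would come from `fetch_dict["save_infer_model/scale_0.tmp_0"]`.

```python
def top1(score_list, labels):
    """Pick the highest-scoring label for each sample in score_list."""
    result = {"label": [], "prob": []}
    for score in score_list:
        # Accept array-like objects that expose tolist() as well as plain lists.
        if hasattr(score, "tolist"):
            score = score.tolist()
        best = max(range(len(score)), key=lambda i: score[i])
        result["label"].append(labels[best])
        result["prob"].append(score[best])
    return result
```

Renaming the fetch variable, as this change does, leaves logic of this kind untouched: only the dictionary key used to read the scores changes.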