diff --git a/configs/ppyoloe/README.md b/configs/ppyoloe/README.md
index 1a749d895f2539aa8a87124416cb163c10d1e799..cb065e7dc7760c208b494edc4bd90b51950c48d6 100644
--- a/configs/ppyoloe/README.md
+++ b/configs/ppyoloe/README.md
@@ -130,6 +130,29 @@ CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inferenc
 ```
 
+**Using TensorRT Inference with ONNX** to test speed, run the following commands:
+
+```bash
+# export the inference model with trt=True
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams exclude_nms=True trt=True
+
+# convert to ONNX
+paddle2onnx --model_dir output_inference/ppyoloe_crn_s_300e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_crn_s_300e_coco.onnx
+
+# TensorRT inference with FP16 and batch_size=1
+trtexec --onnx=./ppyoloe_crn_s_300e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16
+
+# TensorRT inference with FP16 and batch_size=32
+trtexec --onnx=./ppyoloe_crn_s_300e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16
+
+# With the commands above, on a T4 GPU with TensorRT 7.2, the PP-YOLOE-s model speed is as follows:
+
+# batch_size=1, 2.80ms, 357fps
+# batch_size=32, 67.69ms, 472fps
+
+```
+
+
 ### Deployment
 
 PP-YOLOE can be deployed by the following approaches:
 
diff --git a/configs/ppyoloe/README_cn.md b/configs/ppyoloe/README_cn.md
index 60138a35af4fbec1fadf25de59e37c115b223fd0..3150639859dfaf65c21ca350447dbe5e7a960d1a 100644
--- a/configs/ppyoloe/README_cn.md
+++ b/configs/ppyoloe/README_cn.md
@@ -132,6 +132,29 @@ CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inferenc
 ```
 
+
+**使用 ONNX 和 TensorRT** 进行测速,执行以下命令:
+
+```bash
+# 导出模型
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams exclude_nms=True trt=True
+
+# 转化成 ONNX 格式
+paddle2onnx --model_dir output_inference/ppyoloe_crn_s_300e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_crn_s_300e_coco.onnx
+
+# 测试速度,半精度,batch_size=1
+trtexec --onnx=./ppyoloe_crn_s_300e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16
+
+# 测试速度,半精度,batch_size=32
+trtexec --onnx=./ppyoloe_crn_s_300e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16
+
+# 使用上述脚本,在 T4 和 TensorRT 7.2 环境下,PP-YOLOE-s 模型速度如下
+# batch_size=1, 2.80ms, 357fps
+# batch_size=32, 67.69ms, 472fps
+```
+
+
+
 ### 部署
 
 PP-YOLOE可以使用以下方式进行部署:
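The `--shapes` flags in the trtexec commands above fix two named inputs: `image` (NCHW float32, 640×640) and `scale_factor` (per-image scale from the original resolution to the network input, shape `Nx2`). The sketch below shows one way to build feeds matching those shapes with plain NumPy; it is an illustration, not the repository's preprocessing code — in particular the `[0, 1]` scaling, the `[scale_y, scale_x]` ordering, and the omission of mean/std normalization are assumptions that should be checked against the exported model's `infer_cfg.yml`.

```python
import numpy as np

def build_inputs(img, target=640):
    """Build inputs matching trtexec's --shapes=image:1x3x640x640,scale_factor:1x2.

    img: HxWx3 uint8 array. Returns a dict of named input tensors.
    Assumption: scale_factor is [scale_y, scale_x]; normalization is omitted.
    """
    h, w = img.shape[:2]
    sy, sx = target / h, target / w
    # nearest-neighbor resize in pure NumPy to keep the sketch dependency-free
    ys = (np.arange(target) / sy).astype(np.int64).clip(0, h - 1)
    xs = (np.arange(target) / sx).astype(np.int64).clip(0, w - 1)
    resized = img[ys][:, xs]
    # HWC uint8 -> CHW float32 in [0, 1]; mean/std normalization is config-dependent
    chw = resized.astype(np.float32).transpose(2, 0, 1) / 255.0
    return {
        "image": chw[None, ...],                           # 1x3x640x640
        "scale_factor": np.array([[sy, sx]], np.float32),  # 1x2
    }

dummy = np.zeros((480, 320, 3), dtype=np.uint8)  # hypothetical 320x480 frame
feeds = build_inputs(dummy)
print(feeds["image"].shape, feeds["scale_factor"].shape)  # -> (1, 3, 640, 640) (1, 2)
```

For batch_size=32, stack 32 such images along axis 0 and pass a `32x2` scale_factor, matching the second trtexec invocation.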
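The fps figures in the benchmark comments follow directly from the per-batch latency reported by trtexec: throughput = batch_size / latency. A back-of-the-envelope check (not a repository script) reproduces both numbers:

```python
def fps(batch_size, latency_ms):
    """Images per second given the average per-batch latency in milliseconds."""
    return batch_size / (latency_ms / 1000.0)

# truncated to whole frames, as in the table above
print(int(fps(1, 2.80)), int(fps(32, 67.69)))  # -> 357 472
```

This is why the batch_size=32 run is slower per batch (67.69ms vs 2.80ms) yet faster per image: the larger batch amortizes per-launch overhead and keeps the GPU busier.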