diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
index 698f1e9849b33d3b43472a75a5b410901c8db5a5..08e1fe9ba0aba4e3ab358be188aeed0212ad08ff 100644
--- a/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -25,29 +25,29 @@ eval:null
 null:null
 ##
 ===========================infer_params===========================
-Global.save_inference_dir:null
+Global.save_inference_dir:./output/
 Global.checkpoints:
-norm_export:null
+norm_export:tools/export_model.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
 quant_export:
 fpgm_export:
 distill_export:null
 export1:null
 export2:null
-inference_dir:null
-infer_model:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_rec_infer
 infer_export:null
-infer_quant:
-inference:
---use_gpu:
---enable_mkldnn:
---cpu_threads:
---rec_batch_num:
---use_tensorrt:
---precision:
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
 --rec_model_dir:
---image_dir:
+--image_dir:./inference/rec_inference
 null:null
---benchmark:
+--benchmark:True
 null:null
 ===========================infer_benchmark_params==========================
 random_infer_input:[{float32,[3,48,320]}]
diff --git a/test_tipc/docs/test_train_fleet_inference_python.md b/test_tipc/docs/test_train_fleet_inference_python.md
index 2da1170554901a6296426c94b1852b3b39031d3f..4479a47da83a951eeed9d7d0e8f9077fc0a9fed4 100644
--- a/test_tipc/docs/test_train_fleet_inference_python.md
+++ b/test_tipc/docs/test_train_fleet_inference_python.md
@@ -11,6 +11,12 @@ The main program for Linux GPU/CPU multi-node multi-GPU training and inference testing is `test_train_inference
 | PP-OCRv3 | ch_PP-OCRv3_rec | Distributed training |
 
 
+- Inference:
+
+| Algorithm | Model name | device_CPU | device_GPU | batchsize |
+| :----: | :----: | :----: | :----: | :----: |
+| PP-OCRv3 | ch_PP-OCRv3_rec | Supported | Supported | 1 |
+
 
 ## 2. Test procedure
 
@@ -56,10 +62,46 @@ bash test_tipc/test_train_inference_python.sh test_tipc/configs/ch_PP-OCRv3_rec
 
 ```bash
 Run successfully with command - ch_PP-OCRv3_rec - python3.7 -m paddle.distributed.launch --ips=192.168.0.1,192.168.0.2 --gpus=0,1 tools/train.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o Global.use_gpu=True Global.save_model_dir=./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/norm_train_gpus_0,1_autocast_fp32_nodes_2 Global.epoch_num=3 Global.auto_cast=fp32 Train.loader.batch_size_per_card=16 !
+ ......
+ Run successfully with command - ch_PP-OCRv3_rec - python3.7 tools/infer/predict_rec.py --rec_image_shape="3,48,320" --use_gpu=False --enable_mkldnn=False --cpu_threads=6 --rec_model_dir=./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/norm_train_gpus_0,1_autocast_fp32_nodes_2/Student --rec_batch_num=1 --image_dir=./inference/rec_inference --benchmark=True --precision=fp32 > ./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/python_infer_cpu_usemkldnn_False_threads_6_precision_fp32_batchsize_1.log 2>&1 !
+```
+
+When the `benchmark` parameter is enabled, the test reports detailed data: runtime environment information (OS version, CUDA version, cuDNN version, driver version), Paddle version information, configuration information (runtime device, number of threads, whether memory optimization is enabled, etc.), model information (model name, precision), data information (batch size, whether the input shape is dynamic, etc.), and performance information (CPU and GPU usage, total run time, preprocessing time, inference time, postprocessing time). The output looks like this:
+
+```
+[2022/06/02 22:53:35] ppocr INFO:
+
+[2022/06/02 22:53:35] ppocr INFO: ---------------------- Env info ----------------------
+[2022/06/02 22:53:35] ppocr INFO: OS_version: Ubuntu 16.04
+[2022/06/02 22:53:35] ppocr INFO: CUDA_version: 10.1.243
+[2022/06/02 22:53:35] ppocr INFO: CUDNN_version: 7.6.5
+[2022/06/02 22:53:35] ppocr INFO: drivier_version: 460.32.03
+[2022/06/02 22:53:35] ppocr INFO: ---------------------- Paddle info ----------------------
+[2022/06/02 22:53:35] ppocr INFO: paddle_version: 2.3.0-rc0
+[2022/06/02 22:53:35] ppocr INFO: paddle_commit: 5d4980c052583fec022812d9c29460aff7cdc18b
+[2022/06/02 22:53:35] ppocr INFO: log_api_version: 1.0
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Conf info -----------------------
+[2022/06/02 22:53:35] ppocr INFO: runtime_device: cpu
+[2022/06/02 22:53:35] ppocr INFO: ir_optim: True
+[2022/06/02 22:53:35] ppocr INFO: enable_memory_optim: True
+[2022/06/02 22:53:35] ppocr INFO: enable_tensorrt: False
+[2022/06/02 22:53:35] ppocr INFO: enable_mkldnn: False
+[2022/06/02 22:53:35] ppocr INFO: cpu_math_library_num_threads: 6
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Model info ----------------------
+[2022/06/02 22:53:35] ppocr INFO: model_name: rec
+[2022/06/02 22:53:35] ppocr INFO: precision: fp32
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Data info -----------------------
+[2022/06/02 22:53:35] ppocr INFO: batch_size: 1
+[2022/06/02 22:53:35] ppocr INFO: input_shape: dynamic
+[2022/06/02 22:53:35] ppocr INFO: data_num: 6
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Perf info -----------------------
+[2022/06/02 22:53:35] ppocr INFO: cpu_rss(MB): 288.957, gpu_rss(MB): None, gpu_util: None%
+[2022/06/02 22:53:35] ppocr INFO: total time spent(s): 0.4824
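+[2022/06/02 22:53:35] ppocr INFO: preprocess_time(ms): 0.1136, inference_time(ms): 79.5877, postprocess_time(ms): 0.6945
 ```
 
 This information can be found in the run log. For the `ch_PP-OCRv3_rec` example above, the log is at `./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/results_python.log`.
 
 If a run fails, the failure log and the corresponding command are also printed to the terminal. You can use that command to analyze why the run failed.
 
-**Note:** In distributed training, the model is saved only on the node where `trainer_id=0`. Therefore, when testing multi-node inference, the other nodes report errors during model export and inference; this is expected.
+**Note:** In distributed training, the model is saved only on the node where `trainer_id=0`. Therefore, the other nodes report errors during model export and inference; this is expected.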
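
Reviewer sketch, not part of the patch: the benchmark log added to the docs in this diff is regular enough to be read by a script, which may help when comparing runs. The snippet below is a minimal sketch that assumes the log keeps the `name(unit): value` layout of the sample. `parse_perf_info` and `METRIC_RE` are hypothetical names introduced here for illustration; they do not exist in PaddleOCR.

```python
import re

# Hypothetical helper, not part of PaddleOCR. It assumes metrics are printed
# as "name(unit): number", as in the sample benchmark log in this diff.
METRIC_RE = re.compile(r"([A-Za-z_ ]+?)\((ms|s|MB)\):\s*([0-9.]+)")


def parse_perf_info(log_text):
    """Collect numeric metrics, e.g. {"inference_time(ms)": 79.5877}."""
    metrics = {}
    for line in log_text.splitlines():
        # Keep only the message that follows the "ppocr INFO:" prefix.
        _, sep, message = line.partition("ppocr INFO:")
        if not sep:
            continue
        # Non-numeric values such as "gpu_rss(MB): None" do not match
        # the pattern and are skipped.
        for name, unit, value in METRIC_RE.findall(message):
            metrics[f"{name.strip()}({unit})"] = float(value)
    return metrics


if __name__ == "__main__":
    sample = (
        "[2022/06/02 22:53:35] ppocr INFO: total time spent(s): 0.4824\n"
        "[2022/06/02 22:53:35] ppocr INFO: preprocess_time(ms): 0.1136, "
        "inference_time(ms): 79.5877, postprocess_time(ms): 0.6945\n"
    )
    for key, value in parse_perf_info(sample).items():
        print(key, value)
```

To use it on a real run, read the text of the file that the `predict_rec.py` command redirects its output to and pass it to `parse_perf_info`. Fields that are not numeric, such as `gpu_rss(MB): None` on a CPU run, are left out of the result.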