Mobilenet fp32 latency regression
Created by: chuanqi129
We found an FP32 latency regression in image classification models, especially MobileNet: latency rose from 3.04ms to 8.46ms (roughly a 2.8x slowdown) on SKX-8180, 1 socket. The regression is caused by PR#14707, specifically the commit below:
```
commit 669191c9
Author: Yihua Xu <yihua.xu@intel.com>
Date:   Mon Dec 3 11:54:00 2018 +0800

    Implement conv3d with mkldnn library (test=develop)
```
We can use the CAPI image classification application to reproduce it:
```bash
FLAGS_use_mkldnn=true FLAGS_paddle_num_threads=28 KMP_AFFINITY=compact,granularity=fine \
taskset -c 0-27 numactl -l ./build/infer_image_classification \
    --infer_model=${infer_model} \
    --batch_size=1 \
    --profile \
    --skip_batch_num=10 \
    --iterations=1000 \
    --use_mkldnn \
    --paddle_num_threads=28 \
    --use_fake_data=true
```
You can also use the python `eval.py` script to reproduce this latency regression, but it shows a lower regression ratio than CAPI due to Python overhead. Please have a look into it. @luotao1 @yihuaxu @jianhang-liu
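For reference, the latency figures above are mean per-iteration inference times after discarding warm-up runs, matching the `--skip_batch_num=10` and `--iterations=1000` flags in the reproduction command. A minimal sketch of that measurement, with `run_once` standing in for a single inference call (illustrative helper, not the benchmark's actual code):

```python
import time

def measure_latency_ms(run_once, iterations=1000, skip_batch_num=10):
    """Run `run_once` `iterations` times, discard the first
    `skip_batch_num` warm-up runs, and return the mean latency in ms."""
    times = []
    for i in range(iterations):
        start = time.perf_counter()
        run_once()  # one forward pass of the model
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= skip_batch_num:  # skip warm-up iterations
            times.append(elapsed_ms)
    return sum(times) / len(times)
```

Comparing this mean before and after commit 669191c9, under the same thread pinning (`taskset`/`KMP_AFFINITY`), isolates the regression from scheduling noise.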