Compare Inference Perf between CPU and MKLDNN [OCR CRNN_CTC model]
Created by: luotao1
Model
Original Model: https://github.com/PaddlePaddle/models/blob/develop/fluid/ocr_recognition/crnn_ctc_model.py
- Since the inference model fuses batch norm into the convolution, we can directly modify https://github.com/PaddlePaddle/models/blob/develop/fluid/ocr_recognition/crnn_ctc_model.py#L14-L28 to profile inference performance; the modified model has the same performance as the one with batch norm fused.
```python
tmp = fluid.layers.conv2d(
    input=tmp,
    num_filters=out_ch[i],
    filter_size=3,
    padding=1,
    param_attr=param if param_0 is None else param_0,
    bias_attr=bias,
    act=act,
    use_cudnn=True)
```
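The fuse-batch-norm step mentioned above folds the BN scale and shift into the convolution weights and bias, so the fused model computes the same function with one op fewer. A minimal NumPy sketch of the folding (function and variable names are illustrative, not Paddle's API):

```python
import numpy as np

def fold_batch_norm(conv_w, conv_b, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta
    into the conv weights/bias.  conv_w: (out_ch, in_ch, kh, kw)."""
    scale = gamma / np.sqrt(var + eps)        # per-output-channel scale
    w = conv_w * scale.reshape(-1, 1, 1, 1)   # scale each output filter
    b = (conv_b - mean) * scale + beta        # fold mean/shift into bias
    return w, b

# Tiny check: fold-then-conv == conv-then-batch-norm.
rng = np.random.RandomState(0)
w = rng.randn(4, 3, 3, 3); b = rng.randn(4)
gamma, beta = rng.randn(4), rng.randn(4)
mean, var = rng.randn(4), rng.rand(4) + 0.5
wf, bf = fold_batch_norm(w, b, gamma, beta, mean, var)
# A single 3x3 input patch makes the conv a dot product, easy to verify:
x = rng.randn(3, 3, 3)
conv = lambda weight, bias: np.array(
    [(weight[o] * x).sum() + bias[o] for o in range(4)])
bn = lambda y: gamma * (y - mean) / np.sqrt(var + 1e-5) + beta
assert np.allclose(conv(wf, bf), bn(conv(w, b)))
```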
- We can add `use_mkldnn=True` directly to obtain an MKLDNN ProgramDesc, like https://github.com/PaddlePaddle/Paddle/compare/develop...tensor-tang:compare. And after #10682 (closed) is solved, we can automatically convert a CPU ProgramDesc into an MKLDNN ProgramDesc.
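Until that automatic conversion lands, the idea can be sketched generically: walk the ops of a program and set `use_mkldnn` on the op types that have MKLDNN kernels. The op representation and supported-op set below are illustrative stand-ins, not Paddle's actual interface:

```python
# Hypothetical in-memory view of a ProgramDesc: a list of ops, each with a
# type and an attribute dict (illustrative only, not Paddle's API).
MKLDNN_SUPPORTED = {"conv2d", "pool2d", "batch_norm", "fc", "relu"}

def to_mkldnn(ops):
    """Return a copy of `ops` with use_mkldnn=True on supported op types."""
    converted = []
    for op in ops:
        op = dict(op, attrs=dict(op["attrs"]))   # copy, don't mutate input
        if op["type"] in MKLDNN_SUPPORTED:
            op["attrs"]["use_mkldnn"] = True
        converted.append(op)
    return converted

cpu_prog = [
    {"type": "conv2d", "attrs": {"use_cudnn": False}},
    {"type": "gru",    "attrs": {}},   # no MKLDNN kernel: left untouched
]
mkldnn_prog = to_mkldnn(cpu_prog)
assert mkldnn_prog[0]["attrs"]["use_mkldnn"] is True
assert "use_mkldnn" not in mkldnn_prog[1]["attrs"]
```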
The final model looks like:
```
4.0K conv2d_0.b_0
4.0K conv2d_0.w_0
4.0K conv2d_1.b_0
12K  conv2d_1.w_0
4.0K conv2d_2.b_0
20K  conv2d_2.w_0
4.0K conv2d_3.b_0
40K  conv2d_3.w_0
4.0K conv2d_4.b_0
76K  conv2d_4.w_0
4.0K conv2d_5.b_0
148K conv2d_5.w_0
4.0K conv2d_6.b_0
292K conv2d_6.w_0
4.0K conv2d_7.b_0
580K conv2d_7.w_0
4.0K fc_0.b_0
904K fc_0.w_0
4.0K fc_1.b_0
904K fc_1.w_0
44K  fc_2.b_0
8.3M fc_2.w_0
8.3M fc_2.w_1
4.0K gru_0.b_0
472K gru_0.w_0
4.0K gru_1.b_0
472K gru_1.w_0
12K  __model__
```
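The conv weight files are consistent with 3x3 float32 filters over a channel progression of 1 → 16 → 16 → 32 → 32 → 64 → 64 → 128 → 128 (an assumption inferred from the sizes in the listing, not read from the model). The on-disk numbers are slightly larger than the raw tensor bytes because `du` rounds up to 4K blocks and each file carries serialization overhead. A quick sanity check:

```python
# Assumed channel progression (inferred from the file sizes above):
channels = [1, 16, 16, 32, 32, 64, 64, 128, 128]
# Raw bytes per conv weight: out_ch * in_ch * 3 * 3 filters * 4 bytes (fp32).
raw_bytes = [channels[i + 1] * channels[i] * 3 * 3 * 4 for i in range(8)]
for i, n in enumerate(raw_bytes):
    print(f"conv2d_{i}.w_0: {n} bytes (~{n / 1024:.0f}K raw)")
```

For example, `conv2d_5.w_0` works out to 64 * 64 * 9 * 4 = 147456 bytes ≈ 144K raw, matching the listed 148K once rounded up to 4K blocks.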
Test
A patch to test the crnn_ctc model on the C++ side: https://github.com/PaddlePaddle/Paddle/compare/develop...luotao1:ocr_test?expand=1
```shell
# build test
cd build
make test ARGS="-R test_crnn_ctc -V"
# run test
cd paddle/fluid/inference/tests/book
./test_crnn_ctc --dirname=DIR_PATH --batch_size=1 --repeat=10
```
Note that this gives the MKLDNN result with multiple threads; for a single thread, please try:
```shell
taskset -c 0 ./test_crnn_ctc --dirname=DIR_PATH --batch_size=1 --repeat=10
```
refer: #10651 (closed)