GoogleNet on MKLDNN (#14346) · Issue · PaddlePaddle / Paddle

GoogleNet on MKLDNN

Created by: luotao1

Model URL: http://paddle-inference-dist.bj.bcebos.com/googlenet.tar.gz

Command: same as test_analyzer_resnet50; only the value of --infer_model changes.

Diff

With MKLDNN enabled, the GoogleNet output differs from the reference output by more than the 1e-3 tolerance:

I1109 13:37:42.628175 50355 helper.h:161] ====== batch_size: 1, repeat: 1, threads: 1, thread id: 0, latency: 268.931ms, fps: 3.71843 ======
/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:70: Failure
The difference between pdata_ref[j] and pdata[j] is 0.015455007553100586, which exceeds 1e-3, where
pdata_ref[j] evaluates to 0.67508238554000854,
pdata[j] evaluates to 0.65962737798690796, and
1e-3 evaluates to 0.001.
/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:70: Failure
The difference between pdata_ref[j] and pdata[j] is 0.029862642288208008, which exceeds 1e-3, where
pdata_ref[j] evaluates to 0.64170312881469727,
pdata[j] evaluates to 0.67156577110290527, and
1e-3 evaluates to 0.001.
/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:70: Failure
The difference between pdata_ref[j] and pdata[j] is 2.03973388671875, which exceeds 1e-3, where
pdata_ref[j] evaluates to 226.21131896972656,
pdata[j] evaluates to 224.17158508300781, and
1e-3 evaluates to 0.001.
/Paddle/paddle/fluid/inference/tests/api/tester_helper.h:70: Failure
The difference between pdata_ref[j] and pdata[j] is 2.217803955078125, which exceeds 1e-3, where
pdata_ref[j] evaluates to 227.33116149902344,
pdata[j] evaluates to 225.11335754394531, and
1e-3 evaluates to 0.001.
[  FAILED  ] Analyzer_resnet50.compare_mkldnn (2188 ms)
[----------] 1 test from Analyzer_resnet50 (2188 ms total)
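
To put the failures above in proportion, here is a minimal sketch that recomputes the absolute and relative error for the four value pairs in the log. The `compare` helper is hypothetical and written only for illustration; it is not the check in tester_helper.h, which, according to the log, compares the absolute difference against 1e-3.

```python
# Reference vs. MKLDNN values copied from the gtest failures in the log.
pairs = [
    (0.67508238554000854, 0.65962737798690796),
    (0.64170312881469727, 0.67156577110290527),
    (226.21131896972656, 224.17158508300781),
    (227.33116149902344, 225.11335754394531),
]

ABS_TOL = 1e-3  # threshold reported by the failing check


def compare(ref, out, abs_tol=ABS_TOL):
    """Return (absolute error, relative error, passes absolute check).

    Hypothetical helper for illustration only.
    """
    abs_err = abs(ref - out)
    rel_err = abs_err / max(abs(ref), 1e-12)  # guard against division by zero
    return abs_err, rel_err, abs_err <= abs_tol


results = [compare(ref, out) for ref, out in pairs]

for (ref, out), (abs_err, rel_err, ok) in zip(pairs, results):
    print(f"ref={ref:.5f} out={out:.5f} abs={abs_err:.5f} rel={rel_err:.2%} ok={ok}")
```

All four pairs fail the absolute check, and the relative errors are roughly 1% to 5%, so the mismatch is not limited to the large-magnitude outputs.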

Performance

The latency with MKLDNN (193 ms) is almost the same as with MKL (206 ms). Machine: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
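
As a quick sanity check on "almost the same", the sketch below turns the two latencies quoted above into a speedup and a percentage saving. The function name `fps_from_latency_ms` is made up for this example; the fps values printed in the logs appear consistent with 1000 / latency at batch size 1, but that relationship is inferred from the numbers, not taken from the helper's source.

```python
def fps_from_latency_ms(latency_ms, batch_size=1):
    """Throughput in samples per second for a given per-batch latency in ms."""
    return batch_size * 1000.0 / latency_ms


mkl_ms = 206.0     # MKL latency quoted in the issue
mkldnn_ms = 193.0  # MKLDNN latency quoted in the issue

speedup = mkl_ms / mkldnn_ms
saved_pct = (mkl_ms - mkldnn_ms) / mkl_ms * 100.0

print(f"MKL:    {fps_from_latency_ms(mkl_ms):.2f} fps")
print(f"MKLDNN: {fps_from_latency_ms(mkldnn_ms):.2f} fps")
print(f"speedup: {speedup:.3f}x, latency saved: {saved_pct:.1f}%")
```

Under these numbers MKLDNN is only about 6 to 7 percent faster.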

Profile on MKL

I1109 11:53:54.818035 51680 helper.h:161] ====== batch_size: 1, repeat: 100, threads: 1, thread id: 0, latency: 206.139ms, fps: 4.8511 ======

------------------------->     Profiling Report     <-------------------------

Place: CPU
Time unit: ms
Sorted by total time in descending order in the same thread

Event                       Calls       Total       Min.        Max.        Ave.        Ratio.
thread0::lrn                200         9851.59     25.3704     77.0484     49.258      0.478162
thread0::conv2d             5700        7756.49     0.09196     25.0828     1.36079     0.376473
thread0::pool2d             1400        1550.66     0.076101    2.85543     1.10761     0.0752636
thread0::elementwise_add    5700        1165.1      0.017176    3.08475     0.204403    0.0565497
thread0::relu               5700        119.171     0.004205    0.685152    0.0209072   0.00578414
thread0::concat             900         116.127     0.031237    0.37544     0.12903     0.0056364
thread0::load_combine       2           37.106      12.8365     24.2695     18.553      0.001801
thread0::fc                 100         5.55069     0.049486    0.13439     0.0555069   0.000269411
thread0::fetch              100         0.715028    0.005518    0.026451    0.00715028  3.4705e-05
thread0::feed               100         0.529316    0.004019    0.014951    0.00529316  2.56912e-05

Profile on MKLDNN (pool2d appears to account for most of the time)

I1109 12:06:08.191417 17514 helper.h:161] ====== batch_size: 1, repeat: 100, threads: 1, thread id: 0, latency: 198.451ms, fps: 5.03903 ======

------------------------->     Profiling Report     <-------------------------

Place: CPU
Time unit: ms
Sorted by total time in descending order in the same thread

Event                    Calls       Total       Min.        Max.        Ave.        Ratio.
thread0::pool2d          1400        11195.9     0.136173    22.0287     7.99705     0.56419
thread0::conv2d          5700        6709.73     0.201956    15.2978     1.17714     0.338121
thread0::concat          900         1025.27     0.747875    3.38313     1.13919     0.051666
thread0::lrn             200         863.445     2.65105     9.83194     4.31723     0.0435113
thread0::load_combine    2           37.6152     12.6916     24.9237     18.8076     0.00189553
thread0::fc              100         10.8304     0.091616    0.84363     0.108304    0.000545772
thread0::fetch           100         0.811231    0.006754    0.027217    0.00811231  4.08801e-05
thread0::feed            100         0.600731    0.004528    0.035933    0.00600731  3.02724e-05
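
To see where the time moves between the two backends, the sketch below compares the per-operator totals from the two profiling reports, normalized to one inference run. The numbers are copied from the reports; load_combine, feed and fetch are left out because they are not part of the compute path. Treating operators that are missing from the MKLDNN report (elementwise_add, relu) as 0 ms is an assumption made for this comparison; the issue does not say why they are absent.

```python
# Total time in ms per operator over 100 repeats, copied from the reports.
mkl = {
    "lrn": 9851.59,
    "conv2d": 7756.49,
    "pool2d": 1550.66,
    "elementwise_add": 1165.1,
    "relu": 119.171,
    "concat": 116.127,
    "fc": 5.55069,
}
mkldnn = {
    "pool2d": 11195.9,
    "conv2d": 6709.73,
    "concat": 1025.27,
    "lrn": 863.445,
    "fc": 10.8304,
}

REPEATS = 100  # "repeat: 100" in both log lines


def per_run_delta(base, other, repeats=REPEATS):
    """Return rows (op, base ms/run, other ms/run, delta), largest slowdown first.

    Operators missing from one report are counted as 0 ms (an assumption).
    """
    rows = []
    for op in set(base) | set(other):
        a = base.get(op, 0.0) / repeats
        b = other.get(op, 0.0) / repeats
        rows.append((op, a, b, b - a))
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows


rows = per_run_delta(mkl, mkldnn)
for op, a, b, delta in rows:
    print(f"{op:16s} MKL {a:8.3f} ms  MKLDNN {b:8.3f} ms  delta {delta:+8.3f} ms")
```

With these figures, pool2d is about 96 ms per run slower on MKLDNN while lrn is about 90 ms per run faster, so the two changes nearly cancel out. That matches the observation that the overall latency is almost unchanged.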
