Created by: jczaja
This PR introduces a C-API MKL-DNN pass that takes advantage of MKL-DNN in-place operations for C-API inference. Currently only in-place softmax is supported.
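For reference, a minimal sketch of how the pass would be exercised from the C++ inference API (assumptions: the in-place pass is part of the MKL-DNN pass set and is picked up automatically once MKL-DNN is enabled; the model path is illustrative):

```cpp
#include <paddle_inference_api.h>

int main() {
  paddle::AnalysisConfig config;
  config.SetModel("./bert_fp32_model");  // hypothetical model directory
  config.SwitchIrOptim(true);            // run IR optimization passes
  config.EnableMKLDNN();                 // enables the MKL-DNN pass set,
                                         // which is assumed to include the in-place pass

  auto predictor = paddle::CreatePaddlePredictor(config);
  // ... feed inputs and call predictor->Run(...) as usual;
  // in-place softmax is applied transparently during graph optimization.
  return 0;
}
```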
A small performance improvement should be visible on BERT fp32 and ERNIE int8 (https://github.com/PaddlePaddle/Paddle/issues/22904#issuecomment-600227776).
Apart from the performance improvement, memory consumption should also be lower, but because PaddlePaddle preallocates memory blocks, I did not observe reduced memory usage in the tested model (BERT).
After this PR, in-place support for elementwise_add and activation ops will follow.