Created by: jczaja
This is the first PR aiming to reduce MKL-DNN memory consumption to a level close to the native CPU Paddle implementation. The relevant discussion takes place in #21493.
The changes introduced are deliberately limited, e.g.:
- only FP32 inference is supported (no INT8 support yet)
- memory savings happen only when the original model's params size is the same as the DNNL weights size

Both limitations can be overcome with further changes in the core of PaddlePaddle.
Model | Peak memory consumption: develop | Peak memory consumption: this PR |
---|---|---|
googlenetv1 | 600 MB | 580 MB |
demark | 622 MB | 600 MB |
mobilenetv1 | 640 MB | 580 MB |
I would appreciate it if @LeoZhao-Intel and @zhangting2020 could share their opinions on these changes. Please tell me whether changes of this kind are conceptually acceptable to you. In short: the original model's param is replaced with DNNL's params.
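To make the idea concrete, below is a minimal sketch of the mechanism, assuming the DNNL 1.x C++ API; the helper name `ReorderWeightsInPlace` and its signature are hypothetical and not part of this PR. When the blocked layout chosen by DNNL occupies the same number of bytes as the plain layout, the reordered weights can be written back over the original param buffer, so only one copy of the weights stays alive.

```cpp
#include <cstring>
#include <vector>

#include "dnnl.hpp"

// Hypothetical helper illustrating the idea behind this PR: reorder FP32
// weights into the DNNL-preferred blocked layout and, when both layouts
// occupy the same number of bytes, overwrite the original param buffer so
// no second copy of the weights needs to be cached.
bool ReorderWeightsInPlace(void* weights_data,
                           const dnnl::memory::desc& plain_md,    // e.g. oihw
                           const dnnl::memory::desc& blocked_md,  // DNNL-chosen
                           dnnl::engine& eng, dnnl::stream& strm) {
  // Savings are only possible when the sizes match (the limitation
  // mentioned above).
  if (plain_md.get_size() != blocked_md.get_size()) return false;

  dnnl::memory src_mem(plain_md, eng, weights_data);

  // Reorder into a temporary buffer, then copy back over the original
  // allocation; from here on the original tensor holds DNNL-format weights.
  std::vector<char> tmp(blocked_md.get_size());
  dnnl::memory dst_mem(blocked_md, eng, tmp.data());

  dnnl::reorder(src_mem, dst_mem).execute(strm, src_mem, dst_mem);
  strm.wait();

  std::memcpy(weights_data, tmp.data(), tmp.size());
  return true;
}
```

With something like this in place, an operator can hand the original tensor's buffer straight to DNNL instead of keeping a separate reordered copy in the weights cache.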
Note: CI can only pass once approval is given.