Malformed graph of Ernie when run with the benchmark application
Created by: Sand3r-
Current behaviour
The error was discovered thanks to level 3 logging enabled via the GLOG_v environment variable. GLOG reported the following:
Some operators use the same variables for reading/writing their inputs and outputs. For example, when the fp32 model is run, one can observe that a scale op as well as a transpose2 op both accept transpose_4.tmp_0 as their input (while according to the original graph they do not).
An excerpt of the GLOG output documenting this:
```
operator.cc:172 CPUPlace Op(scale), inputs:{X[transpose_4.tmp_0:float[1, 12, 128, 64]({})]}, outputs:{Out[scale_12.tmp_0:float[1, 12, 128, 64]({})]}.
(...several ops later...)
operator.cc:172 CPUPlace Op(transpose2), inputs:{X[transpose_4.tmp_0:float[1, 12, 128, 64]({})]}, outputs:{Out[fc_66.tmp_0:float[1, 128, 12, 64]({})], XShape[transpose_47.tmp_1:[0, 1, 12, 128, 64]({})]}.
```
As far as I understand, this is a bug, since variable names should be unique (as long as they are enclosed in the same scope).
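For reference, here is a minimal sketch of how the duplication can be spotted automatically in the GLOG output. It assumes the log was captured with GLOG_v=3 and redirected to a file named glog_run.txt; the file name and the regular expressions are assumptions based on the excerpt above, not part of Paddle itself.

```python
import re
from collections import defaultdict

# Pattern for the "Op(name), inputs:{...}, outputs:{...}" lines shown above.
OP_LINE = re.compile(r"Op\((\w+)\), inputs:\{(.*?)\}, outputs:\{(.*)\}")
# Variable names appear as "[var_name:" inside the input/output lists.
VAR_NAME = re.compile(r"\[([\w.]+):")

readers = defaultdict(list)   # variable name -> ops that read it
writers = defaultdict(list)   # variable name -> ops that write it

with open("glog_run.txt") as log:   # hypothetical file holding the GLOG_v=3 output
    for line in log:
        match = OP_LINE.search(line)
        if not match:
            continue
        op, inputs, outputs = match.groups()
        for var in VAR_NAME.findall(inputs):
            readers[var].append(op)
        for var in VAR_NAME.findall(outputs):
            writers[var].append(op)

# Variables consumed by more than one op; compare against the original graph
# to spot names that should not be shared (e.g. transpose_4.tmp_0 feeding
# both scale and transpose2 in the excerpt above).
for var, ops in sorted(readers.items()):
    if len(ops) > 1:
        print(f"{var}: read by {ops}, written by {writers.get(var, [])}")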
To illustrate the issue further, please see the following figure, which depicts a different model (ernie_quant) that suffers from the same problem:
This is a blocking issue for the INT8 Ernie quantization task, since our quantization system associates scales with variable names. If a variable name repeats in several places, we end up with the same scale being applied where we did not intend it.
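To make the impact on quantization concrete, here is a purely hypothetical illustration (the dict-based lookup and the scale values below are invented for the example, not the actual Paddle quantization code): when scales are keyed by variable name, two distinct tensors that end up sharing a name collide on a single entry.

```python
# Hypothetical sketch: calibration scales keyed by variable name.
calibration_scales = {}

def record_scale(var_name, scale):
    # A second tensor that ends up with the same name silently
    # overwrites the scale recorded for the first one.
    calibration_scales[var_name] = scale

record_scale("transpose_4.tmp_0", 0.021)  # tensor meant for the original transpose2 input
record_scale("transpose_4.tmp_0", 0.187)  # unrelated tensor that received the same name

# Both ops are now quantized with 0.187, although one of them expected 0.021.
print(calibration_scales["transpose_4.tmp_0"])
```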
Reproduction
- Based on commit 8da0cd53
- CPU: including MKLDNN version v0.20
- OS platform: Ubuntu 16.04
- CMake flags:
-DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_GPU=OFF -DON_INFER=ON -DWITH_MKLDNN=ON -DWITH_TESTING=ON -DWITH_PROFILER=ON -DWITH_STYLE_CHECK=OFF -DWITH_INFERENCE_API_TEST=ON
- API information

To reproduce:
- Build Paddle.
- Build the benchmark inference application for Ernie: https://github.com/PaddlePaddle/benchmark/tree/master/Inference/c%2B%2B/ernie
- Run any 4-input Ernie model.
@luotao1 Could you please assign someone to help solve this issue?