Accuracy drop on INT8 Ernie QAT inference
Created by: wojtuss
The commit https://github.com/PaddlePaddle/Paddle/commit/b89dd86fb65fc9095219aa4d37ad52728b8f689a introduced an accuracy drop from 0.7936 to 0.7916 in the INT8 Ernie QAT inference test (run with the qat2_int8_nlp_comparison.py script) on a 6248 CLX machine. In our internal tests the commit makes the FP32-INT8 accuracy difference (0.8020 - 0.7916 = 0.0104) exceed the 0.01 threshold, so the test fails.
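For illustration, a minimal sketch of the kind of threshold check that trips here, using the numbers above; the variable names and structure are hypothetical and not taken from the actual qat2_int8_nlp_comparison.py code:

```python
# Hypothetical sketch of the FP32 vs INT8 accuracy-diff check (not the real script).
fp32_acc = 0.8020   # FP32 baseline accuracy from our internal tests
int8_acc = 0.7916   # INT8 QAT accuracy after the commit (was 0.7936 before)
threshold = 0.01    # maximum allowed FP32-INT8 accuracy difference

diff = fp32_acc - int8_acc  # 0.0104 with the values above
assert diff <= threshold, (
    f"FP32-INT8 accuracy diff {diff:.4f} exceeds threshold {threshold}"
)
```

With the post-commit numbers the assertion fails, since 0.0104 > 0.01; before the commit the diff was 0.8020 - 0.7936 = 0.0084 and the check passed.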
What is your recommendation?