Quant2 updates and fixes (!25313) · Merge request · PaddlePaddle / Paddle

Quant2 updates and fixes !25313

Created by: wojtuss

PR types

Function optimization

PR changes

Others

Describe

This PR adds several improvements to the Quant2 -> INT8 transformation process and to its tests:

process:

extended the list of optimization passes,

improved logging,

fixed conv2d quantization when ResidualData is present,

tests:

INT8 model: accuracy and performance results are reported,

Quant model: accuracy results are reported,

accuracy is compared between INT8 and Quant models,

if an FP32 model is provided via the --fp32_model option, it is optimized and both its accuracy and performance are reported,

Image Classification tests run without the --ops_to_quantize option, enabling quantization of all supported operators,

the NLP test runs with the --ops_to_quantize option for code coverage purposes,

removed the option to save the FP32 model obtained from the Quant model.
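
The test behaviour listed above can be illustrated with a minimal sketch. It is not the code from this PR: only the --fp32_model and --ops_to_quantize option names come from the description, while the helper names (parse_args, ops_to_quantize_set, accuracy_within_threshold), the comma-separated format of the operator list, and the threshold-based comparison are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the two test options named in the PR description."""
    parser = argparse.ArgumentParser()
    # Optional FP32 model; when given, it is also optimized and measured.
    parser.add_argument("--fp32_model", type=str, default="",
                        help="Path to an optional FP32 model.")
    # Assumption: a comma-separated list of operator types.
    # An empty value means "quantize all supported operators".
    parser.add_argument("--ops_to_quantize", type=str, default="",
                        help="Comma-separated operator types to quantize.")
    return parser.parse_args(argv)


def ops_to_quantize_set(arg):
    """Turn the option string into a set; an empty set means no restriction."""
    return {op.strip() for op in arg.split(",") if op.strip()}


def accuracy_within_threshold(quant_acc, int8_acc, threshold):
    """Hypothetical check: the INT8 model may lose at most `threshold`
    accuracy relative to the Quant model."""
    return quant_acc - int8_acc <= threshold
```

With no --ops_to_quantize value the returned set is empty, which a test in the style of the Image Classification tests could treat as "quantize every supported operator"; an NLP-style test would pass an explicit list instead.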

