Created by: wojtuss
Required(必填, multiple choices, two at most)
- PR type(PR 类型) is (C, F):
  A. New features(新功能)
  B. Bug fixes(问题修复)
  C. Function optimization(功能优化)
  D. Performance optimization(性能优化)
  E. Breaking changes(向后不兼容的改变)
  F. Others(其它)
- PR changes(改动点) is (C, D):
  A. OPs(operators)
  B. APIs(接口)
  C. Docs(文档)
  D. Others(其它)
- Use one sentence to describe what this PR does.(简述本次PR的目的和改动)
  Improves the QAT INT8 quantization process.
Optional(选填, If None, please delete it)
- Describe what this PR does in detail. If this PR fixes an issue, please give the issue id.
With this patch, the following updates are made to QAT:
- added the `--op_ids_to_skip` option to exclude operators from quantization by their id numbers,
- added the `--debug` option to enable the generation of `graphviz*.dot` files after each step of the graph transformation,
- improved quantization of operators with a missing output quantization scale,
- improved logging during quantization,
- enabled optimization of the FP32 model,
- fixed quantizing unsigned output data after a `relu` op or post-op,
- updated the QAT documentation.

With the help of the `--op_ids_to_skip` and `--debug` options, the accuracy of the `pvnet_ocr` model after quantization was improved greatly. See the issue https://github.com/PaddlePaddle/Paddle/issues/24461 for details. An illustrative sketch of how such options might be consumed is given below.
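As a minimal sketch only (not the PR's actual code), this shows how options like `--op_ids_to_skip` and `--debug` could be parsed and how a list of skipped operator ids might be applied when choosing which operators to quantize; the helper names `parse_args` and `quantizable_op_ids` are hypothetical:

```python
# Illustrative sketch, not Paddle's implementation: parsing flags similar to
# the --op_ids_to_skip and --debug options added in this PR and filtering
# operator ids that should stay in FP32.
import argparse


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--op_ids_to_skip',
        type=str,
        default='',
        help='Comma-separated ids of operators to exclude from quantization.')
    parser.add_argument(
        '--debug',
        action='store_true',
        help='Dump a graphviz .dot file after each graph transformation step.')
    return parser.parse_args()


def quantizable_op_ids(all_op_ids, op_ids_to_skip):
    # Keep only the operators that were not explicitly excluded by id.
    skip = {int(i) for i in op_ids_to_skip.split(',') if i.strip()}
    return [op_id for op_id in all_op_ids if op_id not in skip]


if __name__ == '__main__':
    args = parse_args()
    # Example: with --op_ids_to_skip=4,7 the operators with ids 4 and 7
    # would be left unquantized.
    print(quantizable_op_ids(range(10), args.op_ids_to_skip))
```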
- If you modified docs, please make sure that both Chinese and English docs were modified and provide a preview screenshot. (文档必填)
- Please write down other information you want to tell reviewers.