Remove the cache in post_traning_quantization, test=develop (!26450) · Merge request · PaddlePaddle / Paddle

Remove the cache in post_traning_quantization, test=develop !26450

Created by: juncaipeng

PR types

Others

PR changes

Others

Describe

To avoid saving cached sample data, get the abs min and abs max values of all quantized tensors in the preparation stage, and then update the histogram of each quantized tensor in the sampling stage.

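Optimize the post-training (offline) quantization method: the first forward pass computes the absolute max and min values of all quantized tensors; the second forward pass updates the statistics of the sampled data into the same histogram; finally, the KL threshold is computed from the histogram statistics.
This avoids caching a large amount of sampled data.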
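
The two-pass idea described above can be sketched roughly as follows. This is a minimal illustration in NumPy, not the code in this PR: the function names `collect_abs_range` and `accumulate_histogram`, the `make_batches` factory, and the bin count of 2048 are all assumptions made for the example, and the final KL threshold computation is left out.

```python
import numpy as np


def collect_abs_range(make_batches):
    """Pass 1 (preparation): track the global abs min / abs max of a tensor.

    `make_batches` is a hypothetical zero-argument callable that returns a
    fresh iterator over the sampled values of one quantized tensor.
    """
    abs_min, abs_max = float("inf"), 0.0
    for batch in make_batches():
        abs_vals = np.abs(batch)
        abs_min = min(abs_min, float(abs_vals.min()))
        abs_max = max(abs_max, float(abs_vals.max()))
    return abs_min, abs_max


def accumulate_histogram(make_batches, abs_min, abs_max, bins=2048):
    """Pass 2 (sampling): fold every batch into one fixed-range histogram.

    Because the range is known from pass 1, each batch can be discarded
    as soon as its counts are added, so nothing is cached.
    """
    hist = np.zeros(bins, dtype=np.int64)
    for batch in make_batches():
        counts, _ = np.histogram(
            np.abs(batch), bins=bins, range=(abs_min, abs_max)
        )
        hist += counts
    return hist


def make_batches():
    # Hypothetical stand-in for running the model on calibration data;
    # the fixed seed makes both passes see the same batches.
    rng = np.random.default_rng(0)
    for _ in range(4):
        yield rng.standard_normal(1000).astype(np.float32)


abs_min, abs_max = collect_abs_range(make_batches)
hist = accumulate_histogram(make_batches, abs_min, abs_max)
```

The trade-off in this sketch is one extra forward pass over the calibration data in exchange for memory that stays constant in the number of batches: only two scalars and one fixed-size histogram are kept per tensor.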
