Created by: wojtuss
We add cpu_quantize_pass, which quantizes conv2d and pool2d operators in a model. After the pass, each quantized operator is preceded by a quantize op and followed by a dequantize op.
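The effect on the graph can be illustrated with a minimal sketch. This is a toy model, not the actual pass: the graph is assumed to be a flat list of op names, and `insert_quant_dequant` and `QUANTIZABLE` are hypothetical names introduced only for illustration. The real pass works on Paddle's IR graph and also has to deal with scales and data types, which are left out here.

```python
# Toy illustration only: a "graph" is a flat list of op type names.
# QUANTIZABLE and insert_quant_dequant are hypothetical, not Paddle APIs.
QUANTIZABLE = {"conv2d", "pool2d"}


def insert_quant_dequant(ops):
    """Wrap every quantizable op with a quantize op before and a dequantize op after."""
    rewritten = []
    for op in ops:
        if op in QUANTIZABLE:
            rewritten.extend(["quantize", op, "dequantize"])
        else:
            rewritten.append(op)
    return rewritten


ops = ["feed", "conv2d", "relu", "pool2d", "fc", "fetch"]
rewritten = insert_quant_dequant(ops)
print(rewritten)
```

In this sketch, ops other than conv2d and pool2d (here relu and fc) are left untouched, so they still see FP32 inputs and outputs.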
This pass works with the C-API quantization core (https://github.com/PaddlePaddle/Paddle/pull/15987) and is the first pass in a series of quantization passes.
After the pass we get the following accuracy:
MobileNet-v1:
• FP32: 70.448%
• INT8v2: 70.172%
ResNet50:
• FP32: 76.054%
• INT8v2: 75.716%
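For a quick view of the cost of quantization, the accuracy drop per model can be computed from the figures above. The snippet below is only arithmetic on the reported numbers; the dictionary layout and the name `accuracy_drop` are assumptions made for illustration.

```python
# Reported accuracy (in percent), copied from the figures above.
accuracy = {
    "MobileNet-v1": {"FP32": 70.448, "INT8v2": 70.172},
    "ResNet50": {"FP32": 76.054, "INT8v2": 75.716},
}

# Absolute drop in percentage points when going from FP32 to INT8v2.
accuracy_drop = {
    model: round(acc["FP32"] - acc["INT8v2"], 3)
    for model, acc in accuracy.items()
}

for model, drop in accuracy_drop.items():
    print(f"{model}: {drop:.3f} percentage points")
```

Both models lose less than 0.5 percentage points relative to FP32.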
There are still a few improvements in progress for this pass.