Created by: wojtuss
This is a request for comments.
There is a need for applying an INT8 quantization strategy to PaddlePaddle models. This patch adds a mechanism for enabling INT8 quantization, similar in approach to the existing MKL-DNN one.
A `use_int8` op attribute and an `int8_placement_pass` in the simplest form (analogous to the `use_mkldnn` attribute and the `mkldnn_placement_pass`) are added. The pass will allow a user to choose which operators should be quantized and use INT8 kernels. We envisage more INT8 optimization passes that would utilize the `use_int8` attribute.
test=develop