AnalysisQuant analyzes the quantization sensitivity of each op in the model.
Args:
model_dir(str): the path of the FP32 model to be quantized; it can also be a '.onnx' model file
model_filename(str, optional): the model file name of the FP32 model
params_filename(str, optional): the parameter file name of the FP32 model
eval_function(function): a user-defined function that returns the metric of the inference program; it can be used to judge the metric of the quantized model. (TODO: optional)
data_loader(Python Generator, Paddle.io.DataLoader, optional): the Generator or DataLoader that provides the calibration data; it returns one batch each time
save_dir(str, optional): the output directory that stores the analysis results
checkpoint_name(str, optional): the name of the checkpoint file that saves the analysis results, so that the analysis can resume after an interruption
num_histogram_plots(int, optional): the number of histogram plots to visualize; the plots are saved as four PDF files in save_dir, covering the best and the worst ops for both weight and activation quantization
quantizable_op_type(list, optional): op types that can be quantized
weight_quantize_type(str): quantization type for weights; supports 'abs_max' and 'channel_wise_abs_max'
activation_quantize_type(str): quantization type for activations; supports 'range_abs_max', 'moving_average_abs_max' and 'abs_max'
is_full_quantize(bool): if True, apply quantization to all supported op types; if False, only apply it to the op types in quantizable_op_type. Default is False.
batch_size(int, optional): the batch size of the DataLoader; default is 10
batch_nums(int, optional): the number of calibration batches; the total number of calibration samples is 'batch_size * batch_nums'
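The data_loader argument can be a plain Python generator that yields one batch of calibration data per iteration, as the description above says. A minimal pure-Python sketch of such a generator; the sample shape and the name calib_data_loader are illustrative assumptions, not part of the real API:

```python
import random

def calib_data_loader(num_samples=100, batch_size=10):
    """Yield one batch of fake calibration samples per iteration.

    In real use each sample would be a preprocessed input tensor;
    here a flat list of 8 floats stands in for one input.
    """
    batch = []
    for _ in range(num_samples):
        batch.append([random.random() for _ in range(8)])
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# 25 samples with batch_size=10 -> batches of 10, 10 and 5 samples
batches = list(calib_data_loader(num_samples=25, batch_size=10))
```

With batch_size=10 and batch_nums=3, the analysis would consume at most the first three batches, i.e. batch_size * batch_nums = 30 calibration samples.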
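The two weight_quantize_type options differ only in the granularity of the quantization scale: 'abs_max' derives one scale from the whole weight tensor, while 'channel_wise_abs_max' derives one scale per output channel. A pure-Python sketch of that scale computation, not PaddleSlim's actual implementation:

```python
def abs_max_scale(weight):
    """Per-tensor scale: the largest absolute value in the whole matrix."""
    return max(abs(v) for row in weight for v in row)

def channel_wise_abs_max_scales(weight):
    """Per-channel scales: one abs-max per output channel (row)."""
    return [max(abs(v) for v in row) for row in weight]

# Toy 2x3 weight matrix: two output channels, three inputs each.
w = [[0.5, -2.0, 1.0],
     [0.1, 0.3, -0.2]]

abs_max_scale(w)                 # -> 2.0 (one scale for the tensor)
channel_wise_abs_max_scales(w)   # -> [2.0, 0.3] (one scale per channel)
```

The per-channel variant keeps the small-magnitude second channel from being crushed by the large first channel, which is why it is usually preferred for weights.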