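- Difference from the PTQ (Post Training Quantization) analysis tool: the QAT analysis tool loads the quantized model produced by quantization-aware training, iterates over all quantized layers, removes the quantization from one layer at a time by loading that layer's parameters from the float model, and runs evaluation to measure the accuracy error. The PTQ analysis tool instead loads the original, not yet quantized model and quantizes its layers one at a time, running evaluation after each one to measure the accuracy error.
AnalysisPTQ analyzes the sensitivity of each op in the model.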
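The two traversal orders described above can be sketched with a toy model. This is only an illustration and does not use the PaddleSlim API: the layer names, the accuracy value, the per-layer penalties, and the helper functions `evaluate`, `ptq_style_analysis`, and `qat_style_analysis` are all hypothetical.

```python
# Toy illustration only: "layers" are just names, and quantizing a layer
# subtracts a fixed, made-up penalty from a made-up float accuracy.
FLOAT_ACCURACY = 0.90
PENALTY = {"conv1": 0.01, "conv2": 0.05, "fc": 0.002}  # hypothetical values


def evaluate(quantized_layers):
    """Return the toy accuracy of a model with the given layers quantized."""
    return FLOAT_ACCURACY - sum(PENALTY[name] for name in quantized_layers)


def ptq_style_analysis(layer_names):
    """Start from the float model and quantize exactly one layer per run."""
    float_metric = evaluate(set())
    return {name: float_metric - evaluate({name}) for name in layer_names}


def qat_style_analysis(layer_names):
    """Start from the fully quantized model and restore one layer to float per run."""
    all_layers = set(layer_names)
    quant_metric = evaluate(all_layers)
    return {name: evaluate(all_layers - {name}) - quant_metric
            for name in layer_names}


ptq_sensitivity = ptq_style_analysis(list(PENALTY))
qat_sensitivity = qat_style_analysis(list(PENALTY))
most_sensitive_ptq = max(ptq_sensitivity, key=ptq_sensitivity.get)
most_sensitive_qat = max(qat_sensitivity, key=qat_sensitivity.get)
print(most_sensitive_ptq, most_sensitive_qat)
```

In this toy setup both orders rank the same layer as most sensitive because the penalties are additive. On a real model the two rankings may differ, since quantization errors of different layers interact.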
save_dir='analysis_results',
quant_config=None):
'''
Analysis analyzes the sensitivity of each op in the model.
Args:
model_dir(str): the path of the fp32 model to be quantized; it can also be a '.onnx' file
model_filename(str, optional): the model file name of the fp32 model
params_filename(str, optional): the parameter file name of the fp32 model
float_model_dir(str, required): the path of the fp32 model; it can also be a '.onnx' file
quant_model_dir(str, optional): the path of the quantized model; if None, the float model will be quantized by PTQ
model_filename(str, optional): the model file name of the fp32 and quantized model
params_filename(str, optional): the parameter file name of the fp32 and quantized model
eval_function(function): a user-defined eval function that returns the metric of the inference program; it is used to judge the metric of the quantized model. (TODO: optional)
data_loader(Python Generator, Paddle.io.DataLoader, optional): the
Generator or Dataloader that provides calibration data; it returns
one batch at a time
save_dir(str, optional): the output dir that stores the analyzed information
resume(bool, optional): if the analysis was interrupted, resume it and load the information that has already been analyzed.
ptq_config(dict, optional): the args used to initialize PostTrainingQuantization
"""
quant_config(dict, optional): the args used to initialize PostTrainingQuantization
AnalysisQAT analyzes the sensitivity of each op in the model.
Args:
quant_model_dir(str): the path of the INT8 model quantized through QAT
float_model_dir(str): the path of the FP32 model that is the base model of quant_model
model_filename(str, optional): the model file name of the model
params_filename(str, optional): the parameter file name of the model
quantizable_op_type(list of str, optional): the types of op that will be analyzed
qat_metric(float, optional): the metric of the quantized model, which is calculated automatically if None
eval_function(function): a user-defined eval function that returns the metric of the inference program; it is used to judge the metric of the quantized model.
data_loader(Python Generator, Paddle.io.DataLoader, optional): the
Generator or Dataloader that provides calibration data; it returns
one batch at a time
save_dir(str, optional): the output dir that stores the analyzed information
resume(bool, optional): if the analysis was interrupted, resume it and load the information that has already been analyzed.