Running additional passes after a few warm-up inference iterations (#15499) · Issue · PaddlePaddle / Paddle

Created by: wojtuss

@luotao1, @Superjomn To optimize a model with INT8 quantization, we would need to first optimize the model using the common passes, then run a few warm-up iterations to gather the data required for quantization, then quantize the model using additional passes, and finally start the main inference iterations. In your opinion, what would be the best way to handle such a scenario (passes -> warm-up -> passes -> inference)?

Our current idea is to add a method like PaddlePredictor::Reconfigure(const ConfigT& config) where the predictor's configuration would be modified and new passes would be run on the same inference program.
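To make the proposed flow concrete, here is a minimal caller-side sketch. Everything in it is a hypothetical stand-in: `SketchConfig`, `SketchPredictor`, `RunScenario` and the pass names are invented for illustration and are not the real Paddle types or passes. Only the shape of `Reconfigure(const ConfigT& config)` follows the proposal above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a predictor configuration (not a real Paddle type).
struct SketchConfig {
  std::vector<std::string> passes;  // names of the passes to apply
};

// Hypothetical stand-in for a predictor that records what was done to it.
class SketchPredictor {
 public:
  explicit SketchPredictor(const SketchConfig& config) { ApplyPasses(config); }

  // Mirrors the proposed Reconfigure(const ConfigT&): the new passes are
  // applied to the same, already-prepared program.
  void Reconfigure(const SketchConfig& config) { ApplyPasses(config); }

  // Stand-in for one inference iteration; it only counts calls.
  void Run() { ++runs_; }

  const std::vector<std::string>& applied_passes() const { return applied_; }
  int runs() const { return runs_; }

 private:
  void ApplyPasses(const SketchConfig& config) {
    for (const auto& name : config.passes) applied_.push_back(name);
  }

  std::vector<std::string> applied_;
  int runs_ = 0;
};

// passes -> warm-up -> passes -> inference
SketchPredictor RunScenario(int warmup_iters, int main_iters) {
  SketchConfig common;
  common.passes.push_back("common_fuse_pass");  // hypothetical pass name
  SketchPredictor predictor(common);

  // Warm-up iterations; a real predictor would gather quantization data here.
  for (int i = 0; i < warmup_iters; ++i) predictor.Run();

  SketchConfig quant;
  quant.passes.push_back("int8_quantize_pass");  // hypothetical pass name
  predictor.Reconfigure(quant);

  // Main inference iterations on the quantized program.
  for (int i = 0; i < main_iters; ++i) predictor.Run();
  return predictor;
}
```

With this shape the caller decides how many warm-up iterations to run and when to switch to the quantized program, which is what would require a new public method on the predictor.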

Our current idea is to handle this inside the AnalysisPredictor::PrepareProgram() method.
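A rough sketch of what handling the whole sequence inside a `PrepareProgram()`-like step could look like is shown below. It assumes the warm-up settings travel with the configuration, which the issue does not state. `WarmupConfig`, `InternalWarmupPredictor` and the logged step names are hypothetical, and the warm-up input data a real implementation would need is omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical config: warm-up settings are part of the configuration.
struct WarmupConfig {
  bool enable_int8 = false;
  int warmup_iters = 0;
};

// Hypothetical predictor that performs the warm-up itself while preparing.
class InternalWarmupPredictor {
 public:
  explicit InternalWarmupPredictor(const WarmupConfig& config)
      : config_(config) {
    PrepareProgram();
  }

  // Stand-in for one main inference iteration.
  void Run() { log_.push_back("inference"); }

  const std::vector<std::string>& log() const { return log_; }

 private:
  // Stand-in for a PrepareProgram-like step that owns the whole sequence:
  // common passes, then warm-up, then quantization passes.
  void PrepareProgram() {
    log_.push_back("common passes");
    if (!config_.enable_int8) return;
    for (int i = 0; i < config_.warmup_iters; ++i) log_.push_back("warm-up");
    log_.push_back("quantization passes");
  }

  WarmupConfig config_;
  std::vector<std::string> log_;
};
```

In this variant the caller only constructs the predictor and calls `Run()`, so the passes -> warm-up -> passes sequence stays hidden from the public interface.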
