fc_op slows on multi-instance inference
Created by: luotao1
When profiling multi-instance inference on pyramid_dnn, we find that fc_op slows down.
Commands:
- one instance:
./paddle/fluid/inference/tests/api/test_analyzer_pyramid_dnn --infer_model=third_party/inference_demo/pyramid_dnn/model/ --infer_data=third_party/inference_demo/pyramid_dnn/data.txt --gtest_filter=Analyzer_Pyramid_DNN.profile --paddle_num_threads=1 --repeat=10000 --zero_copy --warmup --num_threads=1
- two instances:
./paddle/fluid/inference/tests/api/test_analyzer_pyramid_dnn --infer_model=third_party/inference_demo/pyramid_dnn/model/ --infer_data=third_party/inference_demo/pyramid_dnn/data.txt --gtest_filter=Analyzer_Pyramid_DNN.profile --paddle_num_threads=1 --repeat=10000 --zero_copy --warmup --num_threads=2
Result (latency):

| num_threads | fc_op | remove fc_op, i.e. use mul + add |
|---|---|---|
| 1 thread | 0.13118 | 0.141566 |
| 2 threads | 0.291523 | 0.152074 |
The change used to remove fc_op is:
--- a/paddle/fluid/inference/tests/api/analyzer_pyramid_dnn_tester.cc
+++ b/paddle/fluid/inference/tests/api/analyzer_pyramid_dnn_tester.cc
@@ -110,6 +110,7 @@ void SetConfig(AnalysisConfig *cfg) {
   if (FLAGS_zero_copy) {
     cfg->SwitchUseFeedFetchOps(false);
   }
+  cfg->pass_builder()->DeletePass("fc_fuse_pass");
 }
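
For reference, a minimal sketch (not taken from the issue) of applying the same workaround in a standalone application that builds an `AnalysisConfig` directly. The calls mirror the public Paddle inference C++ API; the helper function name and the model directory argument are placeholders for illustration.

```cpp
// Sketch: create a predictor with the fc fuse pass disabled, so the graph
// keeps separate mul + elementwise_add ops instead of a fused fc op.
#include <memory>
#include <string>

#include "paddle/fluid/inference/api/paddle_inference_api.h"

std::unique_ptr<paddle::PaddlePredictor> CreatePredictorWithoutFcFuse(
    const std::string &model_dir) {
  paddle::AnalysisConfig cfg;
  cfg.SetModel(model_dir);           // e.g. the pyramid_dnn model directory
  cfg.SwitchUseFeedFetchOps(false);  // needed when feeding via zero-copy tensors
  // Same line as in the diff above: drop the pass that fuses mul + add into fc.
  cfg.pass_builder()->DeletePass("fc_fuse_pass");
  return paddle::CreatePaddlePredictor(cfg);
}
```

Deleting the pass only changes the graph optimization stage; the saved model itself is untouched.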