[Paddle-TRT] Can't do ERNIE inference with tensorrt in develop branch
Created by: zlsh80826
- PaddlePaddle version: develop
- GPU: CUDA 10.2.89 / cuDNN 7.6
- OS Platform: Ubuntu 16.04
- Python version: 3.7
- C++ version: 7.3.0
- API information

To Reproduce: Run the ERNIE TensorRT example in Paddle-Inference-Demo.

Other info / logs: Hello, we ran into a TensorRT bug on the latest commit when running the example above. (Actually, I think this bug has existed for a long time.) The example runs fine on an older commit (61ec30f0). The bug seems to be caused by an incorrect op conversion; the error log follows.
I0617 21:33:10.236757 7086 analysis_predictor.cc:140] Profiler is deactivated, and no profiling report will be generated.
I0617 21:33:10.243189 7086 analysis_predictor.cc:929] MODEL VERSION: 1.7.2
I0617 21:33:10.243207 7086 analysis_predictor.cc:931] PREDICTOR VERSION: 0.0.0
W0617 21:33:10.243245 7086 analysis_predictor.cc:944] - Version incompatible (1) dropout
W0617 21:33:10.243255 7086 analysis_predictor.cc:944] - Version incompatible (1) elementwise_add
W0617 21:33:10.243261 7086 analysis_predictor.cc:944] - Version incompatible (1) feed
W0617 21:33:10.243268 7086 analysis_predictor.cc:944] - Version incompatible (1) fetch
W0617 21:33:10.243274 7086 analysis_predictor.cc:944] - Version incompatible (3) layer_norm
W0617 21:33:10.243280 7086 analysis_predictor.cc:944] - Version incompatible (1) lookup_table
W0617 21:33:10.243288 7086 analysis_predictor.cc:944] - Version incompatible (2) matmul
W0617 21:33:10.243294 7086 analysis_predictor.cc:944] - Version incompatible (2) mul
W0617 21:33:10.243299 7086 analysis_predictor.cc:944] - Version incompatible (1) relu
W0617 21:33:10.243305 7086 analysis_predictor.cc:944] - Version incompatible (2) reshape2
W0617 21:33:10.243311 7086 analysis_predictor.cc:944] - Version incompatible (1) scale
W0617 21:33:10.243317 7086 analysis_predictor.cc:944] - Version incompatible (2) slice
W0617 21:33:10.243324 7086 analysis_predictor.cc:944] - Version incompatible (1) softmax
W0617 21:33:10.243330 7086 analysis_predictor.cc:944] - Version incompatible (1) stack
W0617 21:33:10.243336 7086 analysis_predictor.cc:944] - Version incompatible (1) tanh
W0617 21:33:10.243342 7086 analysis_predictor.cc:944] - Version incompatible (1) transpose2
W0617 21:33:10.243348 7086 analysis_predictor.cc:196] WARNING: Results may be DIFF! Please use the corresponding version of the model and prediction library, and do not use the develop branch.
I0617 21:33:10.243445 7086 analysis_predictor.cc:450] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [conv_affine_channel_fuse_pass]
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
I0617 21:33:11.094410 7086 graph_pattern_detector.cc:100] --- detected 7 subgraphs
--- Running IR pass [multihead_matmul_fuse_pass_v2]
I0617 21:33:11.144394 7086 graph_pattern_detector.cc:100] --- detected 3 subgraphs
--- Running IR pass [skip_layernorm_fuse_pass]
I0617 21:33:11.157673 7086 graph_pattern_detector.cc:100] --- detected 6 subgraphs
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [fc_fuse_pass]
I0617 21:33:11.159014 7086 graph_pattern_detector.cc:100] --- detected 3 subgraphs
I0617 21:33:11.159682 7086 graph_pattern_detector.cc:100] --- detected 8 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I0617 21:33:11.161391 7086 tensorrt_subgraph_pass.cc:115] --- detect a sub-graph with 19 nodes
W0617 21:33:11.162122 7086 tensorrt_subgraph_pass.cc:285] The Paddle lib links the 7011 version TensorRT, make sure the runtime TensorRT you are using is no less than this version, otherwise, there might be Segfault!
I0617 21:33:11.162168 7086 tensorrt_subgraph_pass.cc:321] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I0617 21:33:12.478950 7086 engine.cc:83] Run Paddle-TRT FP16 mode
I0617 21:33:12.479008 7086 engine.cc:151] Run Paddle-TRT Dynamic Shape mode.
W0617 21:33:16.834189 7086 device_context.cc:265] Please NOTE: device: 2, CUDA Capability: 75, Driver API Version: 11.0, Runtime API Version: 10.2
W0617 21:33:16.834365 7086 device_context.cc:273] device: 2, cuDNN Version: 7.6.
I0617 21:33:37.546492 7086 tensorrt_subgraph_pass.cc:115] --- detect a sub-graph with 4 nodes
I0617 21:33:37.546820 7086 tensorrt_subgraph_pass.cc:321] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::inference::tensorrt::OpConverter::ConvertBlockToTRTEngine(paddle::framework::BlockDesc*, paddle::framework::Scope const&, std::vector<std::string, std::allocator<std::string > > const&, std::unordered_set<std::string, std::hash<std::string >, std::equal_to<std::string >, std::allocator<std::string > > const&, std::vector<std::string, std::allocator<std::string > > const&, paddle::inference::tensorrt::TensorRTEngine*)
3 paddle::inference::analysis::TensorRtSubgraphPass::CreateTensorRTOp(paddle::framework::ir::Node*, paddle::framework::ir::Graph*, std::vector<std::string, std::allocator<std::string > > const&, std::vector<std::string, std::allocator<std::string > >*) const
4 paddle::inference::analysis::TensorRtSubgraphPass::ApplyImpl(paddle::framework::ir::Graph*) const
5 paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph*) const
6 paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete<paddle::framework::ir::Graph> >)
7 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument*)
8 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
9 paddle::AnalysisPredictor::OptimizeInferenceProgram()
10 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
11 paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
12 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
13 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig>(paddle::AnalysisConfig const&)
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: TensorRT's tensor input requires at least 2 dimensions, but input slice_0.tmp_0 has 1 dims.
[Hint: Expected shape.size() > 1UL, but received shape.size():1 <= 1UL:1.] at (/home/rewang/Paddle-dev/paddle/fluid/inference/tensorrt/engine.h:67)