YOLOv3 enhanced model: running cpp_infer.py with the trt_int8 mode fails with "out of memory" on a 1080Ti GPU
Created by: mingmmq
W0514 12:45:04.687070 10049 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.2, Runtime API Version: 10.1
W0514 12:45:04.690155 10049 device_context.cc:245] device: 0, cuDNN Version: 7.6.
I0514 12:45:05.381332 10049 tensorrt_engine_op.h:135] This process is generating calibration table for Paddle TRT int8...
I0514 12:45:05.381502 10171 tensorrt_engine_op.h:310] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
E0514 12:45:05.891810 10171 helper.h:65] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
E0514 12:45:05.892612 10171 helper.h:65] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::inference::tensorrt::TensorRTEngine::FreezeNetwork()
3 paddle::inference::tensorrt::OpConverter::ConvertBlockToTRTEngine(paddle::framework::BlockDesc*, paddle::framework::Scope const&, std::vector<std::string, std::allocator<std::string> > const&, std::unordered_set<std::string, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::string> > const&, std::vector<std::string, std::allocator<std::string> > const&, paddle::inference::tensorrt::TensorRTEngine*)
4 paddle::operators::TensorRTEngineOp::PrepareTRTEngine(paddle::framework::Scope const&, paddle::inference::tensorrt::TensorRTEngine*) const
5 std::thread::_Impl<std::_Bind_simple<paddle::operators::TensorRTEngineOp::RunCalibration(paddle::framework::Scope const&, paddle::platform::Place const&) const::{lambda()#1} ()> >::_M_run()
----------------------
Error Message Summary:
----------------------
Error: build cuda engine failed! at (/home/kik/Github/paddle/paddle/fluid/inference/tensorrt/engine.cc:136)
W0514 12:45:05.903018 10171 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0514 12:45:05.903030 10171 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0514 12:45:05.903038 10171 init.cc:214] The detail failure signal is:
W0514 12:45:05.903044 10171 init.cc:217] *** Aborted at 1589426105 (unix time) try "date -d @1589426105" if you are using GNU date ***
W0514 12:45:05.905047 10171 init.cc:217] PC: @ 0x0 (unknown)
W0514 12:45:05.905108 10171 init.cc:217] *** SIGABRT (@0x3e800002741) received by PID 10049 (TID 0x7fcc2cffd700) from PID 10049; stack trace: ***
W0514 12:45:05.906908 10171 init.cc:217] @ 0x7fcd2cc5b390 (unknown)
W0514 12:45:05.908569 10171 init.cc:217] @ 0x7fcd2c8b5428 gsignal
W0514 12:45:05.910239 10171 init.cc:217] @ 0x7fcd2c8b702a abort
W0514 12:45:05.911355 10171 init.cc:217] @ 0x7fccc397684a __gnu_cxx::__verbose_terminate_handler()
W0514 12:45:05.912174 10171 init.cc:217] @ 0x7fccc3974f47 __cxxabiv1::__terminate()
W0514 12:45:05.913167 10171 init.cc:217] @ 0x7fccc3974f7d std::terminate()
W0514 12:45:05.914073 10171 init.cc:217] @ 0x7fccc397515a __cxa_throw
W0514 12:45:05.917677 10171 init.cc:217] @ 0x7fccb2e91c40 paddle::inference::tensorrt::TensorRTEngine::FreezeNetwork()
W0514 12:45:05.920116 10171 init.cc:217] @ 0x7fccb1dbd4dc paddle::inference::tensorrt::OpConverter::ConvertBlockToTRTEngine()
W0514 12:45:05.922443 10171 init.cc:217] @ 0x7fccb1dbe399 paddle::operators::TensorRTEngineOp::PrepareTRTEngine()
W0514 12:45:05.926564 10171 init.cc:217] @ 0x7fccb1dbf047 _ZNSt6thread5_ImplISt12_Bind_simpleIFZNK6paddle9operators16TensorRTEngineOp14RunCalibrationERKNS2_9framework5ScopeERKNS2_8platform5PlaceEEUlvE_vEEE6_M_runEv
W0514 12:45:05.927413 10171 init.cc:217] @ 0x7fccc3991421 execute_native_thread_routine_compat
W0514 12:45:05.929185 10171 init.cc:217] @ 0x7fcd2cc516ba start_thread
W0514 12:45:05.930933 10171 init.cc:217] @ 0x7fcd2c98741d clone
W0514 12:45:05.932567 10171 init.cc:217] @ 0x0 (unknown)
[1] 10049 abort (core dumped) python tools/cpp_infer.py --model_path --config_path tools/cpp_demo.yml
Both trt_fp32 and trt_fp16 currently run fine; only trt_int8 fails.
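Since the crash happens inside `TensorRTEngine::FreezeNetwork()` while building the int8 calibration engine, one common mitigation (an assumption on my part, not verified on this exact setup) is to shrink the TensorRT workspace size handed to the predictor config, so the engine build competes less with Paddle's own GPU memory pool on the 11 GB 1080Ti. A minimal sketch, assuming the Paddle 1.7/1.8-era `AnalysisConfig` API that `tools/cpp_infer.py` is built on; `build_int8_config` and `WORKSPACE_BYTES` are hypothetical names for illustration:

```python
# Hypothetical sketch: reduce the TensorRT workspace so the int8
# calibration engine can be built on a 1080Ti (11 GB, shared with Paddle).
WORKSPACE_BYTES = 1 << 28  # 256 MB instead of a 1 GB workspace

def build_int8_config(model_dir):
    # Assumption: Paddle 1.7/1.8-era inference API (matches the log's
    # tensorrt_engine_op.h / RunCalibration code paths).
    from paddle.fluid.core import AnalysisConfig

    config = AnalysisConfig(model_dir)
    config.enable_use_gpu(100, 0)  # 100 MB initial GPU pool, device 0
    config.enable_tensorrt_engine(
        workspace_size=WORKSPACE_BYTES,
        max_batch_size=1,
        min_subgraph_size=3,
        precision_mode=AnalysisConfig.Precision.Int8,
        use_static=False,
        use_calib_mode=True,  # generates the calibration table, as in the log
    )
    return config

print(WORKSPACE_BYTES)
```

If a smaller workspace still OOMs, lowering the input resolution or batch size during the calibration pass would be the next thing to try.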