Core dump when releasing a cloned predictor while using the fluid online inference library in a multithreaded environment
Created by: songyiting
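- Version and environment: 1) PaddlePaddle version: fluid 1.4 2) CPU: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz 3) OS: Linux
- Reproduction: internal network environment; reproduces every time the model is updated
- Problem description:
Our inference service allocates one thread_local PaddlePredictor pointer per thread, each pointing to its own cloned PaddlePredictor object. When the model is updated, each thread releases its object and points the pointer at a newly cloned one. The relevant code is: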
class UapTagPredictor {
...
    // Clone of the shared predictor used by this thread
    std::unique_ptr<paddle::PaddlePredictor> _predictor;
};
// When a model update is detected, release the old _predictor and Clone a new one
void UapTagPredictor::reset_predictor() {
    _model = ModelManager::get_instance().get_model();
    _predictor = nullptr;  // releases the old clone
    _predictor = _model->clone_predictor();
    _last_model_time = _model->get_udpate_time();
}
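To make the pattern above easier to reason about without the Paddle library, here is a minimal self-contained sketch of the same per-thread clone-and-replace flow. Everything in it is a stand-in: `FakePredictor`, `FakeModel`, `FakeModelManager`, `WorkerPredictor`, `thread_predictor` and `run_workers` are hypothetical names invented for illustration, not Paddle or service APIs, and the sketch uses plain `std::thread` rather than the bthread-based RPC framework seen in the backtrace. It only mirrors the ownership and ordering of `reset_predictor()`; it does not reproduce the crash.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for paddle::PaddlePredictor (hypothetical, not the Paddle API).
struct FakePredictor {
    explicit FakePredictor(int v) : version(v) { ++live_count(); }
    ~FakePredictor() { --live_count(); }
    // Number of FakePredictor objects currently alive across all threads.
    static std::atomic<int>& live_count() {
        static std::atomic<int> n{0};
        return n;
    }
    int version;
};

// Stand-in for the model object that hands out clones.
class FakeModel {
public:
    explicit FakeModel(int version) : _version(version) {}
    std::unique_ptr<FakePredictor> clone_predictor() const {
        return std::unique_ptr<FakePredictor>(new FakePredictor(_version));
    }
    int get_update_time() const { return _version; }
private:
    int _version;
};

// Stand-in for ModelManager: publishes the current model under a mutex.
class FakeModelManager {
public:
    static FakeModelManager& get_instance() {
        static FakeModelManager m;
        return m;
    }
    std::shared_ptr<FakeModel> get_model() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _model;
    }
    void publish(int version) {
        std::lock_guard<std::mutex> lock(_mutex);
        _model = std::make_shared<FakeModel>(version);
    }
private:
    FakeModelManager() : _model(std::make_shared<FakeModel>(0)) {}
    std::mutex _mutex;
    std::shared_ptr<FakeModel> _model;
};

// Per-thread wrapper mirroring UapTagPredictor::reset_predictor().
class WorkerPredictor {
public:
    // Returns the model version served by this thread's clone.
    int predict() {
        std::shared_ptr<FakeModel> latest = FakeModelManager::get_instance().get_model();
        if (!_predictor || latest->get_update_time() != _last_model_time) {
            _model = latest;
            _predictor = nullptr;  // release the old clone on the owning thread
            _predictor = _model->clone_predictor();
            _last_model_time = _model->get_update_time();
        }
        assert(_predictor);
        return _predictor->version;
    }
private:
    std::shared_ptr<FakeModel> _model;
    std::unique_ptr<FakePredictor> _predictor;
    int _last_model_time = -1;
};

// One WorkerPredictor per thread, created on first use.
WorkerPredictor& thread_predictor() {
    static thread_local WorkerPredictor p;
    return p;
}

// Starts workers, publishes versions 1..num_updates, and returns how many
// FakePredictor objects are still alive after every worker has exited.
int run_workers(int num_threads, int num_updates) {
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([num_updates] {
            while (thread_predictor().predict() != num_updates) {
                std::this_thread::yield();
            }
        });
    }
    for (int v = 1; v <= num_updates; ++v) {
        FakeModelManager::get_instance().publish(v);
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return FakePredictor::live_count().load();
}
```

Calling `run_workers(4, 3)` starts four workers that each re-clone whenever they observe a new model version; once all of them have exited, every clone has been destroyed by the thread that created it.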
The process core dumps when the cloned predictor is released. The core information is as follows:
(gdb) bt
#0 0x00007efedb8035f9 in paddle::memory::detail::MetadataCache::load(paddle::memory::detail::MemoryBlock const*) const () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#1 0x00007efedb802e54 in paddle::memory::detail::MemoryBlock::type(paddle::memory::detail::MetadataCache const&) const () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#2 0x00007efedb8022c9 in paddle::memory::detail::BuddyAllocator::Free(void*) () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#3 0x00007efedb7ff365 in void paddle::memory::legacy::Free<paddle::platform::CPUPlace>(paddle::platform::CPUPlace const&, void*, unsigned long) () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#4 0x00007efedb8000e5 in paddle::memory::allocation::LegacyAllocator::Free(paddle::memory::allocation::Allocation*) () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#5 0x00007efeda9a920a in paddle::AnalysisPredictor::~AnalysisPredictor() () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#6 0x00007efeda9a9321 in paddle::AnalysisPredictor::~AnalysisPredictor() () from /home/work/cpufutureattention_a7ae9c75a852f7dbfd5afd52ffcafce7/bin/../lib/libpaddle_fluid.so
#7 0x000000000048294d in operator() (this=, __ptr=) at /home/opt/gcc-4.8.2.bpkg-r4/gcc-4.8.2.bpkg-r4/include/c++/4.8.2/bits/unique_ptr.h:67
#8 reset (__p=, this=0x20ae08150) at /home/opt/gcc-4.8.2.bpkg-r4/gcc-4.8.2.bpkg-r4/include/c++/4.8.2/bits/unique_ptr.h:262
#9 operator= (this=0x20ae08150) at /home/opt/gcc-4.8.2.bpkg-r4/gcc-4.8.2.bpkg-r4/include/c++/4.8.2/bits/unique_ptr.h:213
#10 cpu::predict::UapTagPredictor::reset_predictor (this=0x20ae080b0) at baidu/cpu/fluid-paddle-predictor/src/predictor/uap_tag_predictor.cpp:25
#11 0x00000000004779fe in get_predictor () at baidu/cpu/fluid-paddle-predictor/src/predictor/predictor_manager.h:31
#12 cpu::predict::PredictionServiceImpl::predict (this=, controller=0x237991c00, request=0x20af1bfc0, response=0x21b998800, done=0x21bfdc7b0) at baidu/cpu/fluid-paddle-predictor/src/business/prediction_service_impl.cpp:37
#13 0x000000000046d7e5 in cpu::predict::PredictionService::CallMethod (this=, method=, controller=, request=, response=, done=) at bc_out/baidu/cpu/fluid-paddle-predictor/predict.pb.cc:3933
#14 0x0000000000515f2d in baidu::rpc::policy::ProcessRpcRequest (msg_base=0x237c30000) at baidu/base/baidu-rpc/src/baidu/rpc/policy/baidu_rpc_protocol.cpp:522
#15 0x000000000059263a in baidu::rpc::ProcessInputMessage (void_arg=) at baidu/base/baidu-rpc/src/baidu/rpc/input_messenger.cpp:134
#16 0x000000000059393f in baidu::rpc::InputMessenger::OnNewMessages (m=0x237cc1a00) at baidu/base/baidu-rpc/src/baidu/rpc/input_messenger.cpp:344
#17 0x00000000004c203d in baidu::rpc::Socket::ProcessEvent (arg=0x237cc1a00) at baidu/base/baidu-rpc/src/baidu/rpc/socket.cpp:1110
#18 0x0000000000652fda in bthread::TaskGroup::task_runner (skip_remained=) at baidu/base/bthread/bthread/task_group.cpp:293
#19 0x000000000064a681 in bthread_make_fcontext ()
Backtrace stopped: Cannot access memory at address 0x7efec4bdd000