Segmentation fault when creating multiple instances across threads for MKLDNN inference
Created by: zhouyongxyz
#701 (closed) resolved the memory leak by setting the MKLDNN cache capacity. I have now found a new problem: creating multiple instances in a multithreaded setup and running inference with them causes a segmentation fault. The stack trace is:

Thread 2 (Thread 0x7fffc81ec700 (LWP 7311)):
#0 0x00007ffff5b9266c in dnnl::primitive_desc_base::query_md(dnnl::query, int) const () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#1 0x00007ffff64f3d2c in paddle::operators::ConvMKLDNNOpKernel<float, float>::ComputeFP32(paddle::framework::ExecutionContext const&) const () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#2 0x00007ffff64f4eee in paddle::operators::ConvMKLDNNOpKernel<float, float>::Compute(paddle::framework::ExecutionContext const&) const () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#3 0x00007ffff64f513f in std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::ConvMKLDNNOpKernel<float, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#4 0x00007ffff7345bd0 in paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#5 0x00007ffff73468ee in paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#6 0x00007ffff733d4d6 in paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&) () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#7 0x00007ffff5917c41 in paddle::framework::NaiveExecutor::Run() () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#8 0x00007ffff56bcf4c in paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int) () from /home/haifan/haifan/zhouyong/ocr/Paddle_1.8.4/build/fluid_inference_install_dir/paddle/lib/libpaddle_fluid.so
#9 0x00005555555e1859 in PaddleOCR::CRNNRecognizer::Run (this=0x555556f13680, boxes=std::vector of length 11, capacity 11 = {...}, img=...)
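My preliminary guess is that this is related to MKLDNN's own multithreading, with something going wrong when memory is reclaimed. The problem does not occur when using multiple processes instead of multiple threads.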
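
To make the threading layout concrete, here is a minimal sketch of the scenario: every thread creates its own instance and runs inference with it. It does not call Paddle at all. `StubPredictor`, `RunWorkers`, and the `serialize` flag are hypothetical names used only for illustration, and the stub would have to be replaced with the real predictor to reproduce the crash. The optional global mutex is just one way to check whether the fault goes away when creation and `Run()` are serialized, which would point towards shared state between instances; it is an assumption, not a confirmed diagnosis.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an inference instance. In the real report this
// would be a Paddle predictor with MKLDNN enabled; the stub only doubles its
// input so that the sketch compiles and runs without Paddle.
struct StubPredictor {
  int Run(int input) const { return input * 2; }
};

// Starts num_threads workers. Each worker creates its own instance and calls
// Run() runs_per_thread times. When serialize is true, instance creation and
// every Run() call are guarded by one shared mutex.
// Returns the number of Run() calls that produced the expected result.
int RunWorkers(int num_threads, int runs_per_thread, bool serialize) {
  std::mutex mu;
  std::atomic<int> completed{0};
  std::vector<std::thread> workers;

  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&, t] {
      std::unique_lock<std::mutex> lock(mu, std::defer_lock);

      if (serialize) lock.lock();
      StubPredictor predictor;  // one instance per thread
      if (serialize) lock.unlock();

      for (int i = 0; i < runs_per_thread; ++i) {
        if (serialize) lock.lock();
        const int out = predictor.Run(t + i);
        if (serialize) lock.unlock();
        if (out == 2 * (t + i)) ++completed;
      }
    });
  }

  for (auto& w : workers) w.join();
  return completed.load();
}
```

Calling `RunWorkers(4, 10, false)` mirrors the failing setup (unsynchronized threads), while `RunWorkers(4, 10, true)` is the serialized variant. If only the first form crashes once the real predictor is substituted, that would be consistent with a race between instances rather than a problem within a single instance.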