4机异步训练se-resnet模型中途有两台trainer出错
Created by: kolinwei
log: *** SIGSEGV (@0x7f1228ff7790) received by PID 12583 (TID 0x7ee2c1fcb700) from PID 687830928; stack trace: *** @ 0x7f0b493f6390 (unknown) @ 0x7f0af646b4c7 paddle::framework::Scope::FindVarLocally() @ 0x7f0af646b8f2 paddle::framework::Scope::FindVarInternal() @ 0x7f0af646b85e paddle::framework::Scope::FindVar() @ 0x7f0af638c940 _ZNSt17_Function_handlerIFSt10unique_ptrIN6paddle8platform13EnforceNotMetESt14default_deleteIS3_EEvESt17reference_wrapperISt12_Bind_simpleIFS8_IZNS1_9framework10ThreadPool18RunAndGetExceptionIZNS1_9operators11distributed10GRPCClient12AsyncSendVarERKSsRKNS2_13DeviceContextERKNSA_5ScopeESH_lEUlvE_EESt6futureIS6_ET_EUlvE_EvEEEE9_M_invokeERKSt9_Any_data @ 0x7f0af614c92a std::_Function_handler<>::_M_invoke() @ 0x7f0af614bf47 std::__future_base::_State_base::_M_do_set() @ 0x7f0b493f3a99 __pthread_once_slow @ 0x7f0af638f524 _ZNSt13__future_base11_Task_stateIZN6paddle9framework10ThreadPool18RunAndGetExceptionIZNS1_9operators11distributed10GRPCClient12AsyncSendVarERKSsRKNS1_8platform13DeviceContextERKNS2_5ScopeES9_lEUlvE_EESt6futureISt10unique_ptrINSA_13EnforceNotMetESt14default_deleteISK_EEET_EUlvE_SaIiEFSN_vEE6_M_runEv @ 0x7f0af646f6a8 paddle::framework::ThreadPool::TaskLoop() @ 0x7f0b0826cc80 (unknown) @ 0x7f0b493ec6ba start_thread @ 0x7f0b4912241d clone @ 0x0 (unknown)