pserver error causes the whole program to core dump
Created by: lexy0093
The pserver.log output is as follows:
get_pserver_program() is deprecated, call get_pserver_programs() to get pserver main and startup in a single call.
E0701 14:16:51.560485246 2278 tcp_server_posix.cc:64] check for SO_REUSEPORT: {"created":"@1561961811.560460389","description":"OS Error","errno":92,"file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":169,"os_error":"Protocol not available","syscall":"setsockopt(SO_REUSEPORT)"}
I0701 14:16:51.563292 2278 grpc_server.cc:430] Server listening on 10.73.182.26:62000 selected port: 62000
The trainer.log output is as follows:
++ python -u train.py paddlecloud
*** Aborted at 1561961935 (unix time) try "date -d @1561961935" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x7f6d2b45e040) received by PID 3264 (TID 0x7f7ba5474700) from PID 725999680; stack trace: ***
@ 0x7f7ba4c8f160 (unknown)
@ 0x7f7b3edd137c mkl_blas_avx_sgemm_pst
@ 0x7f7b3ebc25a6 mkl_blas_avx_xsgemm
@ 0x7f7b3e07f73a gemm_batch_internal32
@ 0x7f7b3dfe4a99 mkl_blas__sgemm_batch
@ 0x7f7b3dfa4eba cblas_sgemm_batch
@ 0x7f7b4733a7b6 paddle::operators::math::Blas<>::BatchedGEMM<>()
@ 0x7f7b4733bb9c paddle::operators::math::Blas<>::MatMul<>()
@ 0x7f7b4733c6db paddle::operators::MatMulGradKernel<>::MatMul()
@ 0x7f7b4733c7ff paddle::operators::MatMulGradKernel<>::CalcInputGrad()
@ 0x7f7b4733cddd paddle::operators::MatMulGradKernel<>::Compute()
@ 0x7f7b4733d0d3 ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform8CPUPlaceELb0ELm0EJNS0_9operators16MatMulGradKernelINS7_16CPUDeviceContextEfEENSA_ISB_dEENSA_ISB_NS7_7float16EEEEEclEPKcSI_iEUlS4_E_E9_M_invokeERKSt9_Any_dataS4
@ 0x7f7b47d66c93 paddle::framework::OperatorWithKernel::RunImpl()
@ 0x7f7b47d657bb paddle::framework::OperatorBase::Run()
@ 0x7f7b46eb7042 paddle::framework::Executor::RunPreparedContext()
@ 0x7f7b46eb8105 paddle::framework::Executor::Run()
@ 0x7f7b46d8770b ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL18pybind11_init_coreERNS_6moduleEEUlRNS2_9framework8ExecutorERKNS6_11ProgramDescEPNS6_5ScopeEibbE90_vIS8_SB_SD_ibbEINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESV
@ 0x7f7b46dc8f6e pybind11::cpp_function::dispatcher()
@ 0x7f7ba4f77010 PyEval_EvalFrameEx
@ 0x7f7ba4f78b80 PyEval_EvalCodeEx
@ 0x7f7ba4f7708e PyEval_EvalFrameEx
@ 0x7f7ba4f78b80 PyEval_EvalCodeEx
@ 0x7f7ba4f7708e PyEval_EvalFrameEx
@ 0x7f7ba4f78b80 PyEval_EvalCodeEx
@ 0x7f7ba4f7708e PyEval_EvalFrameEx
@ 0x7f7ba4f78b80 PyEval_EvalCodeEx
@ 0x7f7ba4f7708e PyEval_EvalFrameEx
@ 0x7f7ba4f78b80 PyEval_EvalCodeEx
@ 0x7f7ba4f78c82 PyEval_EvalCode
@ 0x7f7ba4f9160f run_mod
@ 0x7f7ba4f9267e PyRun_FileExFlags
@ 0x7f7ba4f937d7 PyRun_SimpleFileExFlags
.//paddle/start_trainer.sh: line 112: 3264 Segmentation fault (core dumped) python -u train.py paddlecloud
Log link: http://10.73.182.26:8910/fileview.html?type=logsdir&path=/&instance=0.app-user-20190701141233-16102--mcdnn-fluid_paddlecloud
jobid: job-0bb5d19a441bcfa9