[Paper Reproduction] loss.backward() causes a core dump
Created by: kgkzhiwen
While reproducing Few Shot Vid2Vid, calling loss.backward() on the Generator loss causes a core dump. Could you suggest how to track down the possible cause? The loss value itself looks fine:
`name tmp_3078, dtype: VarType.FP32 shape: [1] lod: {} dim: 1 layout: NCHW dtype: float data: [3.32153]
W0824 22:17:25.338116 10276 init.cc:216] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0824 22:17:25.338174 10276 init.cc:218] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0824 22:17:25.338186 10276 init.cc:221] The detail failure signal is:
W0824 22:17:25.338191 10276 init.cc:224] *** Aborted at 1598278645 (unix time) try "date -d @1598278645" if you are using GNU date ***
W0824 22:17:25.339613 10276 init.cc:224] PC: @ 0x0 (unknown)
W0824 22:17:25.339886 10276 init.cc:224] *** SIGSEGV (@0x68) received by PID 10276 (TID 0x7f22d9270700) from PID 104; stack trace: ***
W0824 22:17:25.341013 10276 init.cc:224] @ 0x7f22d8e5b390 (unknown)
W0824 22:17:25.343962 10276 init.cc:224] @ 0x7f226416b043 paddle::framework::Tensor::Resize()
W0824 22:17:25.345258 10276 init.cc:224] @ 0x7f22620ff072 paddle::operators::CUDNNGridSampleGradOpKernel<>::Compute()
W0824 22:17:25.346336 10276 init.cc:224] @ 0x7f22620ffcc3 ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform9CUDAPlaceELb0ELm0EINS0_9operators27CUDNNGridSampleGradOpKernelIfEENSA_IdEEEEclEPKcSF_iEUlS4_E_E9_M_invokeERKSt9_Any_dataS4
W0824 22:17:25.347174 10276 init.cc:224] @ 0x7f22640b28d1 paddle::imperative::PreparedOpRunImpl<>()
W0824 22:17:25.349371 10276 init.cc:224] @ 0x7f22640b2c06 paddle::imperative::PreparedOp::Run()
W0824 22:17:25.351394 10276 init.cc:224] @ 0x7f2260e31125 paddle::imperative::OpBase::Run()
W0824 22:17:25.353406 10276 init.cc:224] @ 0x7f2260e3b771 paddle::imperative::BasicEngine::Execute()
W0824 22:17:25.353579 10276 init.cc:224] @ 0x7f2260bdea13 ZZN8pybind1112cpp_function10initializeIZN6paddle6pybind14BindImperativeEPNS_6moduleEEUlRNS2_10imperative7VarBaseERKNS6_6detail16BackwardStrategyERKNS6_6TracerEE17_vIS8_SC_SF_EINS_4nameENS_9is_methodENS_7siblingENS_10call_guardIINS_18gil_scoped_releaseEEEEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNES10
W0824 22:17:25.354435 10276 init.cc:224] @ 0x7f2260ad2139 pybind11::cpp_function::dispatcher()
W0824 22:17:25.354734 10276 init.cc:224] @ 0x562aae4c9744 _PyMethodDef_RawFastCallKeywords
W0824 22:17:25.354971 10276 init.cc:224] @ 0x562aae4c9861 _PyCFunction_FastCallKeywords
W0824 22:17:25.355197 10276 init.cc:224] @ 0x562aae5356e8 _PyEval_EvalFrameDefault
W0824 22:17:25.355406 10276 init.cc:224] @ 0x562aae479539 _PyEval_EvalCodeWithName
W0824 22:17:25.355612 10276 init.cc:224] @ 0x562aae47a635 _PyFunction_FastCallDict
W0824 22:17:25.355835 10276 init.cc:224] @ 0x562aae532232 _PyEval_EvalFrameDefault
W0824 22:17:25.356052 10276 init.cc:224] @ 0x562aae47981a _PyEval_EvalCodeWithName
W0824 22:17:25.356261 10276 init.cc:224] @ 0x562aae47a635 _PyFunction_FastCallDict
W0824 22:17:25.356485 10276 init.cc:224] @ 0x562aae532232 _PyEval_EvalFrameDefault
W0824 22:17:25.356690 10276 init.cc:224] @ 0x562aae47981a _PyEval_EvalCodeWithName
W0824 22:17:25.356884 10276 init.cc:224] @ 0x562aae4c8f57 _PyFunction_FastCallKeywords
W0824 22:17:25.357115 10276 init.cc:224] @ 0x562aae530806 _PyEval_EvalFrameDefault
W0824 22:17:25.357319 10276 init.cc:224] @ 0x562aae479539 _PyEval_EvalCodeWithName
W0824 22:17:25.357509 10276 init.cc:224] @ 0x562aae4c8ef5 _PyFunction_FastCallKeywords
W0824 22:17:25.357730 10276 init.cc:224] @ 0x562aae530a93 _PyEval_EvalFrameDefault
W0824 22:17:25.357942 10276 init.cc:224] @ 0x562aae479539 _PyEval_EvalCodeWithName
W0824 22:17:25.358147 10276 init.cc:224] @ 0x562aae4c8ef5 _PyFunction_FastCallKeywords
W0824 22:17:25.358373 10276 init.cc:224] @ 0x562aae530806 _PyEval_EvalFrameDefault
W0824 22:17:25.358566 10276 init.cc:224] @ 0x562aae4c8ccb _PyFunction_FastCallKeywords
W0824 22:17:25.358785 10276 init.cc:224] @ 0x562aae530806 _PyEval_EvalFrameDefault
W0824 22:17:25.358997 10276 init.cc:224] @ 0x562aae479539 _PyEval_EvalCodeWithName
W0824 22:17:25.359216 10276 init.cc:224] @ 0x562aae47a424 PyEval_EvalCodeEx
Segmentation fault (core dumped)`