Crash caused by insufficient GPU memory? The error log is not clear; please help confirm.
Created by: onfireisme
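With the same script, training on a small corpus basically works fine.
With a large corpus, it crashes.
The batch size is 64.
My guess is that the GPU is running out of memory, but the error log does not make that obvious. Could you please confirm?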
I0413 02:31:41.502923 6739 TrainerInternal.cpp:165] Batch=48599 samples=3110336 AvgCost=0.726715 CurrentCost=6.82351 Eval: error=0.0763677 CurrentEval: error=0.46875
* Aborted at 1555093901 (unix time) try "date -d @1555093901" if you are using GNU date *
PC: @ 0x0 (unknown)
* SIGFPE (@0x8d4166) received by PID 6739 (TID 0x7f7805a54700) from PID 9257318; stack trace: *
@ 0x7f7842117a30 (unknown)
@ 0x8d4166 paddle::GpuVectorT<>::getAbsMax()
@ 0x8768c2 ZNSt17_Function_handlerIFvPN6paddle9ParameterEEZNS0_15TrainerInternal13trainOneBatchElRKNS0_9DataBatchEPSt6vectorINS0_8ArgumentESaIS9_EEEUlS2_E_E9_M_invokeERKSt9_Any_dataS2
@ 0x9db033 paddle::Parameter::incUpdate()
@ 0x6d2259 paddle::FullMatrixProjection::backward()
@ 0x69c18a paddle::MixedLayer::backward()
@ 0x682b20 paddle::ParallelThread::computeThread()
@ 0x7f7837f00530 execute_native_thread_routine
@ 0x7f784210ff61 start_thread
@ 0x7f7837690d0d clone
@ 0x0 (unknown)