Pairwise network: training exits with an error after several passes
Created by: HugoLian
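I have a pairwise network model of size 90012864*2, with about 1 GB of training samples. After some number of training passes (the count varies from run to run), training always exits with the same error. What might be causing this?
The error output is: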
I0124 17:57:33.678707 1677 Stat.cpp:140] ======= BarrierStatSet status ======
I0124 17:57:33.678711 1677 Stat.cpp:153] --------------------------------------------------
I0124 17:59:14.659487 1677 Tester.cpp:127] Test samples=500000 cost=0.168849 Eval:
I0124 17:59:14.667881 1677 GradientMachine.cpp:112] Saving parameters to ./output/pass-00013
I0124 17:59:14.668278 1677 Util.cpp:230] copy trainer_config_pairwise.py to ./output/pass-00013
*** Aborted at 1485251963 (unix time) try "date -d @1485251963" if you are using GNU date ***
PC: @ 0x7f057de57e82 PyFloat_AsDouble
*** SIGSEGV (@0x79) received by PID 1677 (TID 0x7f056a0b2700) from PID 121; stack trace: ***
@ 0x7f057e408160 (unknown)
@ 0x7f057de57e82 PyFloat_AsDouble
@ 0x69aa9a paddle::DenseScanner::fill()
@ 0x6a01ad paddle::PyDataProvider2::getNextBatchInternal()
@ 0x6a9fe9 paddle::DoubleBuffer::asyncLoadBatch()
@ 0x7f057db828a0 execute_native_thread_routine
@ 0x7f057e4001c3 start_thread
@ 0x7f057d2f312d __clone
/home/iknow/lianjie/paddle/paddle_internal_release_tools/idl/paddle/output/bin/paddle_local: line 109: 1677 Segmentation fault (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
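The stack trace above ends in `PyFloat_AsDouble`, called from `paddle::DenseScanner::fill()` inside `PyDataProvider2`. One possible reading is that the Python data provider occasionally yields a dense vector containing something that is not a float (for example `None` or a string parsed from a malformed line). That is only a guess from the trace, not a confirmed cause. Below is a minimal sketch of a check that could be run over each dense vector before it is yielded. The function name `find_bad_values` and the `expected_dim` parameter are hypothetical and not part of Paddle.

```python
import math
from numbers import Real


def find_bad_values(vector, expected_dim):
    """Return a list of problems found in one dense feature vector.

    Hypothetical helper: `expected_dim` is the dimension declared for the
    dense input slot in the data provider.
    """
    if vector is None:
        return ["vector is None"]
    try:
        n = len(vector)
    except TypeError:
        return ["vector has no length: %s" % type(vector).__name__]

    problems = []
    if n != expected_dim:
        problems.append("length %d != expected %d" % (n, expected_dim))

    for i, v in enumerate(vector):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(v, bool) or not isinstance(v, Real):
            problems.append("index %d: non-numeric %r" % (i, v))
            continue
        try:
            x = float(v)
        except OverflowError:
            problems.append("index %d: too large for a float %r" % (i, v))
            continue
        if math.isnan(x) or math.isinf(x):
            problems.append("index %d: non-finite %r" % (i, v))
    return problems
```

Calling this in the provider just before each `yield` and logging the offending input line whenever the returned list is non-empty should show whether a bad sample is reaching the scanner. If every sample passes, the cause is probably elsewhere.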