Model training fails with out-of-memory error: Check failed: posix_memalign(&ptr, 32ul, size) == 0 (12 vs. 0)
Created by: peterjamesjohn
I0921 13:59:56.989329 32240 Util.cpp:166] commandline: --use_gpu=False --trainer_count=1
[WARNING 2017-09-21 14:07:54,033 unit_vec.py:76] build DSSM model with config of rank, cnn
[INFO 2017-09-21 14:07:54,033 unit_vec.py:77] vocabulary sizes: [55564254, 55564254]
[INFO 2017-09-21 14:07:54,033 unit_vec.py:207] build rank model
[INFO 2017-09-21 14:07:54,035 unit_vec.py:110] create embedding table [_] which dimention is 2048
[INFO 2017-09-21 14:07:54,036 unit_vec.py:110] create embedding table [_] which dimention is 2048
[INFO 2017-09-21 14:07:54,037 unit_vec.py:110] create embedding table [_] which dimention is 2048
[INFO 2017-09-21 14:07:54,038 unit_vec.py:163] create a sequence_conv_pool which context width is 3
[INFO 2017-09-21 14:07:54,041 unit_vec.py:165] create a sequence_conv_pool which context width is 4
[INFO 2017-09-21 14:07:54,044 unit_vec.py:177] create fc layer [__fc_0_1024] which dimention is 1024
[INFO 2017-09-21 14:07:54,045 unit_vec.py:177] create fc layer [__fc_1_512] which dimention is 512
[INFO 2017-09-21 14:07:54,046 unit_vec.py:177] create fc layer [__fc_2_256] which dimention is 256
[INFO 2017-09-21 14:07:54,047 unit_vec.py:163] create a sequence_conv_pool which context width is 3
[INFO 2017-09-21 14:07:54,049 unit_vec.py:165] create a sequence_conv_pool which context width is 4
[INFO 2017-09-21 14:07:54,051 unit_vec.py:177] create fc layer [__fc_0_1024] which dimention is 1024
[INFO 2017-09-21 14:07:54,052 unit_vec.py:177] create fc layer [__fc_1_512] which dimention is 512
[INFO 2017-09-21 14:07:54,053 unit_vec.py:177] create fc layer [__fc_2_256] which dimention is 256
[INFO 2017-09-21 14:07:54,054 unit_vec.py:163] create a sequence_conv_pool which context width is 3
[INFO 2017-09-21 14:07:54,056 unit_vec.py:165] create a sequence_conv_pool which context width is 4
[INFO 2017-09-21 14:07:54,059 unit_vec.py:177] create fc layer [__fc_0_1024] which dimention is 1024
[INFO 2017-09-21 14:07:54,060 unit_vec.py:177] create fc layer [__fc_1_512] which dimention is 512
[INFO 2017-09-21 14:07:54,061 unit_vec.py:177] create fc layer [__fc_2_256] which dimention is 256
F0921 14:07:54.075613 32240 Allocator.h:51] Check failed: posix_memalign(&ptr, 32ul, size) == 0 (12 vs. 0)
*** Check failure stack trace: ***
@ 0x7f5fcbe92e6d google::LogMessage::Fail()
@ 0x7f5fcbe9691c google::LogMessage::SendToLog()
@ 0x7f5fcbe92993 google::LogMessage::Flush()
@ 0x7f5fcbe97e2e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f5fcbe00d58 paddle::CpuAllocator::alloc()
@ 0x7f5fcbdffc66 paddle::PoolAllocator::alloc()
@ 0x7f5fcbdfeae6 paddle::CpuMemoryHandle::CpuMemoryHandle()
@ 0x7f5fcbdd480e paddle::CpuVectorT<>::CpuVectorT()
@ 0x7f5fcbdd4cca paddle::VectorT<>::create()
@ 0x7f5fcbdd4de9 paddle::VectorT<>::createParallelVector()
@ 0x7f5fcbcdd1b6 paddle::Parameter::enableType()
@ 0x7f5fcbcf66fc paddle::parameterInitNN()
@ 0x7f5fcbcf8df9 paddle::NeuralNetwork::init()
@ 0x7f5fcbcf576b paddle::GradientMachine::create()
@ 0x7f5fcbe6efb8 GradientMachine::createFromPaddleModelPtr()
@ 0x7f5fcbe6f17f GradientMachine::createByConfigProtoStr()
@ 0x7f5fcbb81ada _wrap_GradientMachine_createByConfigProtoStr
@ 0x4a9e33 PyEval_EvalFrameEx
@ 0x4ad70d PyEval_EvalCodeEx
@ 0x4aa88c PyEval_EvalFrameEx
@ 0x4ad70d PyEval_EvalCodeEx
@ 0x4aa88c PyEval_EvalFrameEx
@ 0x4ad70d PyEval_EvalCodeEx
@ 0x51c2c5 function_call
@ 0x4243c3 PyObject_Call
@ 0x427b3d instancemethod_call
@ 0x4243c3 PyObject_Call
@ 0x47989f slot_tp_init
@ 0x47615f type_call
@ 0x4243c3 PyObject_Call
@ 0x4a79f6 PyEval_EvalFrameEx
@ 0x4ad70d PyEval_EvalCodeEx
Aborted (core dumped)
The loaded vocabulary has more than 50 million entries, and the input layer type is paddle.data_type.integer_value_sequence(self.vocab_sizes[0]). With a small vocabulary, training runs normally, but with the 50-million-entry vocabulary it crashes with the error above. Is there any solution? The error is the same whether running on a single machine or under MPI.
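The failing check reports errno 12 (ENOMEM): posix_memalign could not allocate the requested block, so the process genuinely ran out of host memory. A rough back-of-the-envelope calculation from the numbers in the log shows why. This is a sketch assuming float32 parameters and counting only one copy of the weights; PaddlePaddle additionally allocates gradient and optimizer buffers, so the real footprint is even larger:

```python
# Estimate the memory one embedding table needs, using the sizes
# printed in the training log above. float32 (4 bytes) is assumed.
vocab_size = 55_564_254   # from "vocabulary sizes: [55564254, 55564254]"
embedding_dim = 2048      # from "create embedding table ... dimention is 2048"
bytes_per_float = 4       # float32

table_bytes = vocab_size * embedding_dim * bytes_per_float
print(f"one embedding table: {table_bytes / 2**30:.1f} GiB")
# The log shows three such tables being created before the crash,
# so the parameters alone would need several times this amount.
```

A single 55M x 2048 float32 embedding table is already around 424 GiB, far beyond typical host RAM, which is why training only succeeds with the small vocabulary. The usual remedies are pruning the vocabulary (e.g. frequency cutoff or hashing rare tokens into shared buckets) or drastically reducing the embedding dimension.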