F0801 10:40:21.851949 61013 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537)
Created by: know133
Running a job under MPI fails with the following error:

Wed Aug 1 10:40:16 2018[1,68]:+ ./paddle_trainer --num_gradient_servers=80 --trainer_id=68 --pservers=10.73.88.48,10.73.87.27,10.73.87.28,10.73.87.33,10.73.87.35,10.73.87.38,10.73.87.39,10.73.87.40,10.73.87.42,10.73.87.44,10.73.87.47,10.73.87.50,10.73.87.51,10.73.87.53,10.73.87.55,10.73.88.11,10.73.88.13,10.73.88.17,10.73.88.24,10.73.88.25,10.73.88.31,10.73.88.35,10.73.88.37,10.73.88.38,10.73.88.40,10.73.88.41,10.73.88.47,10.73.87.26,10.73.88.50,10.73.88.51,10.73.88.52,10.73.88.53,10.73.88.54,10.73.90.11,10.73.90.12,10.73.90.13,10.73.90.14,10.73.62.52,10.73.90.19,10.73.90.20,10.73.62.51,10.73.62.48,10.73.91.41,10.73.91.43,10.73.91.44,10.73.91.45,10.73.91.46,10.73.91.47,10.73.91.48,10.73.98.15,10.73.98.16,10.73.98.17,10.73.98.22,10.73.98.26,10.73.98.50,10.73.99.33,10.73.99.36,10.73.99.37,10.73.99.39,10.73.99.48,10.75.51.11,10.75.51.12,10.75.51.13,10.75.51.14,10.75.51.16,10.75.51.19,10.75.51.20,10.75.52.38,10.75.54.13,10.75.54.14,10.75.54.39,10.75.54.41,10.75.54.43,10.75.54.46,10.75.54.49,10.75.55.11,10.75.55.12,10.75.55.13,10.75.55.22,10.75.55.36 --rdma_tcp=tcp --nics=xgbe0 --saving_period=1 --port=7164 --ports_num=1 --parallel_thread_num=50 --local=0 --comment=job.742199.instances --dot_period=100000 --log_period=100000 --num_passes=20 --trainer_count=11 --enable_grad_share=0 --enable_grad_sparse_update=5000000 --use_sparse_updater=1 --use_old_updater=1 --show_parameter_stats_period=100000 --ports_num_for_sparse=1 --loadsave_parameters_in_pserver=0 --config=conf/trainer_config.conf --save_dir=./output --python_path=./python-gcc345 --python_bin=python2.7 --use_gpu=0
Wed Aug 1 10:40:21 2018[1,57]:F0801 10:40:21.850185 61014 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537) You may consider increasing the value of --grad_sparse_update_max_sparse_rate
Wed Aug 1 10:40:21 2018[1,57]:*** Check failure stack trace: ***
Wed Aug 1 10:40:21 2018[1,57]: @ 0x8d7788 google::LogMessage::Fail()
Wed Aug 1 10:40:21 2018[1,57]: @ 0x8d76e0 google::LogMessage::SendToLog()
Wed Aug 1 10:40:21 2018[1,57]: @ 0x8d7175 google::LogMessage::Flush()
Wed Aug 1 10:40:21 2018[1,57]:F0801 10:40:21.851949 61013 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537) You may consider increasing the value of --grad_sparse_update_max_sparse_rate
Wed Aug 1 10:40:21 2018[1,57]:*** Check failure stack trace: ***
Wed Aug 1 10:40:21 2018[1,52]:F0801 10:40:21.871891 8383 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537) You may consider increasing the value of --grad_sparse_update_max_sparse_rate
F0801 10:40:21.872252 8387 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537) You may consider increasing the value of --grad_sparse_update_max_sparse_rate
Wed Aug 1 10:40:21 2018[1,52]:*** Check failure stack trace: ***
Wed Aug 1 10:40:21 2018[1,68]:F0801 10:40:21.867554 21665 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537) You may consider increasing the value of --grad_sparse_update_max_sparse_rate
Wed Aug 1 10:40:21 2018[1,52]:F0801 10:40:21.871891 8385 SparseRowMatrix.h:134] Check failed: localIndices_->size() <= heightStore_ (1538 vs. 1537) You may consider increasing the value of --grad_sparse_update_max_sparse_rate
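
Since the same check fails on several ranks and threads, it may help to tally the failures before digging further. The following is only a minimal sketch, assuming the glog line format seen in the log above; `CHECK_RE` and `summarize` are names made up for illustration and are not part of PaddlePaddle.

```python
import re
from collections import Counter

# Matches glog fatal check-failure lines as they appear in the log above.
# The "[1,57]:" MPI rank tag is optional, because some failures in the log
# are fused onto the previous message without their own prefix.
CHECK_RE = re.compile(
    r"(?:\[\d+,(?P<rank>\d+)\]:)?"
    r"F\d{4} [\d:.]+\s+(?P<tid>\d+) (?P<src>[\w.]+:\d+)\] "
    r"Check failed: (?P<expr>.+?) \((?P<lhs>\d+) vs\.\s+(?P<rhs>\d+)\)"
)

def summarize(log_text):
    """Return (failure count per MPI rank, set of (source, lhs, rhs))."""
    per_rank = Counter()
    sites = set()
    for m in CHECK_RE.finditer(log_text):
        # Failures without a rank tag are grouped under "unknown".
        per_rank[m.group("rank") or "unknown"] += 1
        sites.add((m.group("src"), int(m.group("lhs")), int(m.group("rhs"))))
    return per_rank, sites
```

Calling `summarize(open("job.log").read())` (the file name is hypothetical) returns the per-rank counts together with the distinct failing checks, which shows whether every trainer trips on the same `1538 vs. 1537` comparison.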
What causes this error?