ERNIE sequence labeling: training fails at random
Created by: onfireisme
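- Version and environment info: 1) PaddlePaddle version: "Please provide your PaddlePaddle version number, e.g. 1.1 or a CommitID" (template prompt left unfilled) 3) GPU: P40, cuda-9.0, cudnn_v7 4) Standard development machine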
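- Training info: 1) Single machine, single GPU 2) GPU memory is sufficient (ERNIE fails outright when it is not) 3) The training code is copied directly from https://github.com/PaddlePaddle/models/blob/develop/PaddleNLP/lexical_analysis/run_ernie.sh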
- Reproduction info: training fails at random. The training data has already been checked by the ERNIE team, and they found no problems with it.

W0807 22:51:30.941779 33146 device_context.cc:263] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 9.0, Runtime API Version: 9.0
W0807 22:51:30.941838 33146 device_context.cc:271] device: 0, cuDNN Version: 7.0.
Traceback (most recent call last):
  File "run_ernie_sequence_labeling.py", line 386, in <module>
    main(args)
  File "run_ernie_sequence_labeling.py", line 331, in main
    outputs = exe.run(program=train_program, fetch_list=fetch_list)
  File "/home/ssd2/wangyan40/fluid/python/lib/python2.7/site-packages/paddle/fluid/executor.py", line 525, in run
    use_program_cache=use_program_cache)
  File "/home/ssd2/wangyan40/fluid/python/lib/python2.7/site-packages/paddle/fluid/executor.py", line 591, in run
    exe.run(program.desc, scope, 0, True, True)
paddle.fluid.core.EnforceNotMet: Invoke operator linear_chain_crf error.

Python Callstacks:
  File "/home/ssd2/wangyan40/fluid/python/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1317, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/ssd2/wangyan40/fluid/python/lib/python2.7/site-packages/paddle/fluid/layer_helper.py", line 56, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "/home/ssd2/wangyan40/fluid/python/lib/python2.7/site-packages/paddle/fluid/layers/nn.py", line 1186, in linear_chain_crf
    "LogLikelihood": log_likelihood
  File "run_ernie_sequence_labeling.py", line 134, in create_model
    learning_rate=args.crf_learning_rate))
  File "run_ernie_sequence_labeling.py", line 249, in main
    train_ret = create_model(args, embeddings, labels=labels, is_prediction=False)
  File "run_ernie_sequence_labeling.py", line 386, in <module>
    main(args)

C++ Callstacks:
Enforce failed. Expected emission_dims[0] == label_dims[0], but received emission_dims[0]:1399 != label_dims[0]:1433.
The height of Input(Emission) and the height of Input(Label) should be the same. at [/ssd2/liyukun01/paddle-env/repos/Paddle/paddle/fluid/operators/linear_chain_crf_op.cc:170]

PaddlePaddle Call Stacks:
0 0x7f2cdc0fe7cap void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 362
1 0x7f2cdc0feb22p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2 0x7f2cdc30b175p paddle::operators::LinearChainCRFOp::InferShape(paddle::framework::InferShapeContext*) const + 1685
3 0x7f2cdd9d3667p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 631
4 0x7f2cdd9d1064p paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 308
5 0x7f2cdc20db36p paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 246
6 0x7f2cdc20fab1p paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool) + 257
7 0x7f2cdc0e4fc2p
8 0x7f2cdc127d89p
9 0x7f2d3e1b955fp PyEval_EvalFrameEx + 29855
10 0x7f2d3e1bb86dp PyEval_EvalCodeEx + 2061
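
The enforce message says the first dimension of Input(Emission) (1399) differs from that of Input(Label) (1433). If the batch is flattened so that this dimension is the total number of positions across all examples (an assumption, not something the trace confirms), one possible cause is that some examples end up with a different number of tokens than labels after preprocessing. A minimal sketch for narrowing this down in plain Python is below. `find_length_mismatches` and `flattened_heights` are hypothetical helpers, not part of PaddlePaddle or of run_ernie_sequence_labeling.py, and the `(tokens, labels)` batch layout is assumed for illustration.

```python
def find_length_mismatches(batch):
    """Return (index, n_tokens, n_labels) for each example whose lengths differ.

    `batch` is assumed to be a list of (tokens, labels) pairs, one per example.
    """
    return [
        (i, len(tokens), len(labels))
        for i, (tokens, labels) in enumerate(batch)
        if len(tokens) != len(labels)
    ]


def flattened_heights(batch):
    """Return total token and label counts across the batch.

    These totals are what the two first dimensions would be if the batch
    were flattened position by position (assumed layout).
    """
    n_tokens = sum(len(tokens) for tokens, _ in batch)
    n_labels = sum(len(labels) for _, labels in batch)
    return n_tokens, n_labels


# Toy batch: the second example has one label more than it has tokens.
batch = [
    (["a", "b", "c"], ["O", "O", "O"]),
    (["d", "e"], ["B", "I", "O"]),
]

print(find_length_mismatches(batch))
print(flattened_heights(batch))
```

Running a check like this on every batch just before `exe.run` would show whether the failure follows particular examples (for instance, ones where only the tokens or only the labels are truncated) or whether the mismatch arises somewhere else.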