Opened Jun 24, 2019 by saxon_zh (@saxon_zh, Guest)

Prediction errors with large amounts of data

Created by: DominicXWang

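I replaced the text in the chnsenticorp dataset with my own text and then used ernie_encoder.py to export embeddings, but ran into problems:
Small data (under 5,000 rows): runs normally.
Medium data (20,000 to 40,000 rows): sometimes fails, sometimes runs normally.
Large data (over 100,000 rows): always fails.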

The error differs from run to run, but there are two kinds. The first is:

Load pretraining parameters from /home/X/tools/py27/ernie/model/params.
Traceback (most recent call last):
  File "ernie_encoder.py", line 182, in <module>
    main(args)
  File "ernie_encoder.py", line 160, in main
    return_numpy=False)
  File "/home/X/.jumbo/lib/python2.7/lib/python2.7/site-packages/paddle/fluid/executor.py", line 565, in run
    use_program_cache=use_program_cache)
  File "/home/X/.jumbo/lib/python2.7/lib/python2.7/site-packages/paddle/fluid/executor.py", line 642, in run
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core.EnforceNotMet: Invoke operator sequence_unpad error.
Python Callstacks:
  File "/home/X/.jumbo/lib/python2.7/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1654, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/X/.jumbo/lib/python2.7/lib/python2.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "/home/X/.jumbo/lib/python2.7/lib/python2.7/site-packages/paddle/fluid/layers/nn.py", line 4120, in sequence_unpad
    outputs={'Out': out})
  File "ernie_encoder.py", line 75, in create_model
    unpad_enc_out = fluid.layers.sequence_unpad(enc_out, length=seq_lens)
  File "ernie_encoder.py", line 128, in main
    args, pyreader_name='reader', ernie_config=ernie_config)
  File "ernie_encoder.py", line 182, in <module>
    main(args)
C++ Callstacks:
Enforce failed. Expected numel() * SizeOfType(type()) <= memory_size(), but received numel() * SizeOfType(type()):16195584 > memory_size():3735552.
Tensor's dims is out of bound. Call Tensor::mutable_data first to re-allocate memory.
or maybe the required data-type mismatches the data already stored. at [/paddle/paddle/fluid/framework/tensor.cc:28]
PaddlePaddle Call Stacks:
0 0x7f561f86cb68p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 360
1 0x7f561f86ceb7p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 87
2 0x7f56215525cap paddle::framework::Tensor::check_memory_size() const + 394
3 0x7f56211bb903p paddle::operators::math::UnpaddingLoDTensorFunctor<paddle::platform::CUDADeviceContext, float>::operator()(paddle::platform::CUDADeviceContext const&, paddle::framework::LoDTensor const&, paddle::framework::LoDTensor*, int, int, bool, paddle::operators::math::PadLayout) + 611
4 0x7f562069e2a5p paddle::operators::SequenceUnpadOpKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 981
5 0x7f562069e4a3p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::SequenceUnpadOpKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::SequenceUnpadOpKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::SequenceUnpadOpKernel<paddle::platform::CUDADeviceContext, int>, paddle::operators::SequenceUnpadOpKernel<paddle::platform::CUDADeviceContext, long> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
6 0x7f56214fe6f6p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 662
7 0x7f56214fee64p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 292
8 0x7f56214fc78cp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 332
9 0x7f561f9e18bep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
10 0x7f561f9e26ffp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
11 0x7f561f85c35ep
12 0x7f561f89f72ep
13 0x7f56836264e2p PyEval_EvalFrameEx + 29874
14 0x7f56836287fdp PyEval_EvalCodeEx + 2061
15 0x7f5683625982p PyEval_EvalFrameEx + 26962
16 0x7f56836287fdp PyEval_EvalCodeEx + 2061
17 0x7f5683625982p PyEval_EvalFrameEx + 26962
18 0x7f5683625a9dp PyEval_EvalFrameEx + 27245
19 0x7f56836287fdp PyEval_EvalCodeEx + 2061
20 0x7f5683628932p PyEval_EvalCode + 50
21 0x7f5683654882p PyRun_FileExFlags + 146
22 0x7f5683655bf9p PyRun_SimpleFileExFlags + 217
23 0x7f568366bb0dp Py_Main + 3149
24 0x38bfc21b45p __libc_start_main + 245
25 0x400691p

Is this an error from allocating GPU memory for one particular batch? Is there no control mechanism for this in the reader?
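One way to read the two byte counts in the "Enforce failed" message is to convert them into rows of hidden states. The sketch below does that arithmetic. It assumes 4-byte elements (the failing kernel in the stack is the float instantiation of SequenceUnpadOpKernel) and a hidden size of 768, which is an assumption about the ERNIE config in use and is not stated in the issue. The helper name bytes_to_rows is made up for this example.

```python
# Byte counts copied from the "Enforce failed" message in the traceback.
required_bytes = 16195584   # numel() * SizeOfType(type())
allocated_bytes = 3735552   # memory_size()

FLOAT32_BYTES = 4   # the failing kernel is the float instantiation
HIDDEN_SIZE = 768   # assumption: hidden size of the ERNIE config in use


def bytes_to_rows(num_bytes, hidden_size=HIDDEN_SIZE, elem_bytes=FLOAT32_BYTES):
    """Return how many hidden_size-wide rows fit exactly in num_bytes.

    Returns None when num_bytes is not a whole number of rows, which
    would suggest the assumed hidden size or element size is wrong.
    """
    row_bytes = hidden_size * elem_bytes
    if num_bytes % row_bytes != 0:
        return None
    return num_bytes // row_bytes


required_rows = bytes_to_rows(required_bytes)
allocated_rows = bytes_to_rows(allocated_bytes)
print(required_rows, allocated_rows)  # prints: 5272 1216
```

Both counts divide evenly by 768 * 4, which is consistent with the assumption but does not prove it. Under that reading, the operator asked for 5272 token rows while the tensor only had room for 1216, so the mismatch might come from the lengths passed to sequence_unpad rather than from the GPU simply running out of memory. This is an interpretation of the numbers, not a confirmed diagnosis.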

The other error is:

Load pretraining parameters from /home/X/tools/py367gcc48_paddle/ernie/model/params.
*** Aborted at 1561361472 (unix time) try "date -d @1561361472" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGFPE (@0x7ff61b501850) received by PID 14084 (TID 0x7ff67d886740) from PID 458233936; stack trace: ***
@ 0x38c040f130 (unknown)
@ 0x7ff61b501850 paddle::operators::math::UnpaddingLoDTensorFunctor<>::operator()()
@ 0x7ff61a9e42a5 paddle::operators::SequenceUnpadOpKernel<>::Compute()
@ 0x7ff61a9e44a3 ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform9CUDAPlaceELb0ELm0EINS0_9operators21SequenceUnpadOpKernelINS7_17CUDADeviceContextEfEENSA_ISB_dEENSA_ISB_iEENSA_ISB_lEEEEclEPKcSI_iEUlS4_E_E9_M_invokeERKSt9_Any_dataS4
@ 0x7ff61b8446f6 paddle::framework::OperatorWithKernel::RunImpl()
@ 0x7ff61b844e64 paddle::framework::OperatorWithKernel::RunImpl()
@ 0x7ff61b84278c paddle::framework::OperatorBase::Run()
@ 0x7ff619d278be paddle::framework::Executor::RunPreparedContext()
@ 0x7ff619d286ff paddle::framework::Executor::Run()
@ 0x7ff619ba235e ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL18pybind11_init_coreERNS_6moduleEEUlRNS2_9framework8ExecutorERKNS6_11ProgramDescEPNS6_5ScopeEibbRKSt6vectorISsSaISsEEE97_vIS8_SB_SD_ibbSI_EINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNES10
@ 0x7ff619be572e pybind11::cpp_function::dispatcher()
@ 0x7ff67d9ac4e2 PyEval_EvalFrameEx
@ 0x7ff67d9ae7fd PyEval_EvalCodeEx
@ 0x7ff67d9ab982 PyEval_EvalFrameEx
@ 0x7ff67d9ae7fd PyEval_EvalCodeEx
@ 0x7ff67d9ab982 PyEval_EvalFrameEx
@ 0x7ff67d9aba9d PyEval_EvalFrameEx
@ 0x7ff67d9ae7fd PyEval_EvalCodeEx
@ 0x7ff67d9ae932 PyEval_EvalCode
@ 0x7ff67d9da882 PyRun_FileExFlags
@ 0x7ff67d9dbbf9 PyRun_SimpleFileExFlags
@ 0x7ff67d9f1b0d Py_Main
@ 0x38bfc21b45 (unknown)
@ 0x400691 (unknown)
@ 0x0 (unknown)

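I ran the same code in the same environment many times (without shuffling). When it fails, either of the two errors may appear, with no discernible pattern.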
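Since the failures only show up after swapping in custom text, one cheap thing to rule out is malformed input rows. The sketch below scans tab-separated lines and flags rows with an unexpected column count or an empty field. The issue does not identify the cause, so this is only a diagnostic aid. The two-column tab-separated layout is an assumption about the data file, and find_suspect_rows is a hypothetical helper, not part of ERNIE or PaddlePaddle.

```python
def find_suspect_rows(lines, expected_cols=2):
    """Return (row_index, reason) pairs for rows that look malformed.

    Assumes each line is one tab-separated record with expected_cols
    fields; that layout is an assumption, not taken from the issue.
    """
    suspects = []
    for idx, line in enumerate(lines):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != expected_cols:
            # A stray tab inside the text shifts the columns.
            suspects.append((idx, "column count %d" % len(fields)))
        elif any(not field.strip() for field in fields):
            # An empty text field would give a zero-length sequence.
            suspects.append((idx, "empty field"))
    return suspects


sample = [
    "good text\t1\n",
    "\t0\n",                       # empty text field
    "text with\textra tab\t1\n",   # stray tab inside the text
    "ok\t0\n",
]
print(find_suspect_rows(sample))
```

On a real file the same function can be fed an open file object, for example find_suspect_rows(open(path)), where path is whatever data file was passed to ernie_encoder.py. An empty result does not clear the data of blame; it only rules out these two specific problems.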

Reference: paddlepaddle/ERNIE#177