Opened July 29, 2019 by saxon_zh (@saxon_zh, Guest)

After fleet.save_persistables on the first trainer, calling fluid.io.load_persistables directly raises an error.

Created by: MrChengmo

  • Title: After trainer 0 runs fleet.save_persistables, calling fluid.io.load_persistables directly, without closing the process, raises an error.
  • Version and environment:
    1) PaddlePaddle version: v1.5.0
    2) CPU: development machine
    3) GPU: none
    4) System environment: CentOS, Python 2.7
  • Reproduction: trainer 0 first runs:

        if is_first_trainer:
            fleet.save_persistables(executor=exe,
                                    dirname=model_path,
                                    main_program=fluid.default_main_program())
        logger.info("Train Success!")
        fleet.stop_worker()

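It then runs:

        fluid.io.load_persistables(executor=exe,
                                   dirname=model_path,
                                   main_program=fluid.default_main_program())

This call fails and the parameters cannot be loaded. Looking further, the file cannot be opened, as if it did not exist, even though it was in fact saved. I suspect the file was not closed after being written. When the infer part of the code is tested on its own with the same data, there is no problem.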
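One way to narrow this down before calling fluid.io.load_persistables is to check, from the same process, whether each expected parameter file can actually be opened. The following is only a minimal diagnostic sketch using the Python standard library: the helper name `find_unreadable_files` is hypothetical and is not part of PaddlePaddle, and the list of variable names has to be supplied by the caller. Because the failing path in the error is relative, the sketch reports absolute paths so that a changed working directory becomes visible.

```python
import os


def find_unreadable_files(dirname, names):
    """Return (absolute_path, reason) pairs for files that cannot be opened.

    Hypothetical helper for diagnosis only; not a PaddlePaddle API.
    """
    problems = []
    for name in names:
        # Resolve against the current working directory so that a relative
        # dirname is reported as the path the process would really open.
        path = os.path.abspath(os.path.join(dirname, name))
        if not os.path.isfile(path):
            problems.append((path, "missing"))
            continue
        try:
            # Opening and reading one byte confirms the file is readable.
            with open(path, "rb") as f:
                f.read(1)
        except (IOError, OSError) as exc:
            problems.append((path, "unreadable: %s" % exc))
    return problems
```

For example, `find_unreadable_files(model_path, ["fc_4.w_0"])` would return an empty list if the file named in the error can be opened from the current working directory. An empty result would point away from a missing file and toward something else, such as the unclosed-file suspicion above.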

  • Problem description:

        2019-07-29 16:13:04,599 - INFO - Train Success!
        Traceback (most recent call last):
          File "model.py", line 151, in <module>
            runtime_main(params, CTR)
          File "/home/chengmo/workroot/ctr_cloud/dist_continuous_evaluation.py", line 286, in runtime_main
            model.run_infer(params)
          File "model.py", line 118, in run_infer
            main_program=fluid.default_main_program()
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/io.py", line 747, in load_persistables
            filename=filename)
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/io.py", line 611, in load_vars
            filename=filename)
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/io.py", line 648, in load_vars
            executor.run(load_prog)
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/executor.py", line 651, in run
            use_program_cache=use_program_cache)
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/executor.py", line 749, in _run
            exe.run(program.desc, scope, 0, True, True, fetch_var_name)
        paddle.fluid.core_avx.EnforceNotMet: Invoke operator load error.

        Python Callstacks:
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1771, in append_op
            attrs=kwargs.get("attrs", None))
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/io.py", line 633, in load_vars
            'file_path': os.path.join(load_dirname, new_var.name)
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/io.py", line 611, in load_vars
            filename=filename)
          File "/home/chengmo/.jumbo/lib/python2.7/site-packages/paddle/fluid/io.py", line 747, in load_persistables
            filename=filename)
          File "model.py", line 118, in run_infer
            main_program=fluid.default_main_program()
          File "/home/chengmo/workroot/ctr_cloud/dist_continuous_evaluation.py", line 286, in runtime_main
            model.run_infer(params)
          File "model.py", line 151, in <module>
            runtime_main(params, CTR)

        C++ Callstacks:
        Cannot open file output/final_pyReader_train/fc_4.w_0 for load op at [/paddle/paddle/fluid/operators/load_op.h:37]

        PaddlePaddle Call Stacks:
        0 0x7f08c60ba6d0p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
        1 0x7f08c60baa49p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
        2 0x7f08c66faed6p paddle::operators::LoadOpKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 774
        3 0x7f08c66fb143p _ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform8CPUPlaceELb0ELm0EJNS0_9operators12LoadOpKernelINS7_16CPUDeviceContextEfEENSA_ISB_dEENSA_ISB_iEENSA_ISB_aEENSA_ISB_lEEEEclEPKcSJ_iEUlS4_E_E9_M_invokeERKSt9_Any_dataS4_ + 35
        4 0x7f08c7403627p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 375
        5 0x7f08c7403d91p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 529
        6 0x7f08c7401c3bp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 267
        7 0x7f08c623c00ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 206
        8 0x7f08c623f08fp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
        9 0x7f08c60acefdp
        10 0x7f08c60e9ceep
        11 0x7f097f70f3e4p PyEval_EvalFrameEx + 25956
        12 0x7f097f710130p PyEval_EvalCodeEx + 2240
        13 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        14 0x7f097f710130p PyEval_EvalCodeEx + 2240
        15 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        16 0x7f097f710130p PyEval_EvalCodeEx + 2240
        17 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        18 0x7f097f710130p PyEval_EvalCodeEx + 2240
        19 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        20 0x7f097f710130p PyEval_EvalCodeEx + 2240
        21 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        22 0x7f097f710130p PyEval_EvalCodeEx + 2240
        23 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        24 0x7f097f710130p PyEval_EvalCodeEx + 2240
        25 0x7f097f70e4a1p PyEval_EvalFrameEx + 22049
        26 0x7f097f710130p PyEval_EvalCodeEx + 2240
        27 0x7f097f710242p PyEval_EvalCode + 50
        28 0x7f097f72a62cp
        29 0x7f097f72a700p PyRun_FileExFlags + 144
        30 0x7f097f72bc0cp PyRun_SimpleFileExFlags + 220
        31 0x7f097f73d4ccp Py_Main + 3164
        32 0x318ae1ecddp __libc_start_main + 253
        33 0x400669p
Reference: paddlepaddle/Paddle#18880