PaddlePaddle / PARL
Issue #187
Opened December 16, 2019 by saxon_zh (Guest)

Errors occurred when running training scripts in NeurIPS2019-Learn-to-Move-Challenge

Created by: luoruiming

When running sh scripts/train_difficulty1.sh ./low_speed_model in PARL/examples/NeurIPS2019-Learn-to-Move-Challenge, the strange errors shown below occurred. Can anyone help me? Thanks in advance!

(opensim-rl) luo@idserver:~/PARL/examples/NeurIPS2019-Learn-to-Move-Challenge$ sh scripts/train_difficulty1.sh ./low_speed_model
/home/luo/anaconda3/envs/opensim-rl/bin/python
[12-16 23:08:12 MainThread @logger.py:224] Argv: train.py --actor_num 300 --difficulty 1 --penalty_coeff 3.0 --logdir ./output/difficulty1 --restore_model_path ./low_speed_model
/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/opensim/simbody.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
[12-16 23:08:12 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
[12-16 23:08:13 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
W1216 23:08:14.078102  6084 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.0, Runtime API Version: 8.0
W1216 23:08:14.081565  6084 device_context.cc:267] device: 0, cuDNN Version: 7.5.
[12-16 23:08:16 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/compiler.py:239: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
  """)
WARNING:root:
You can try our memory optimize feature to save your memory usage:
     # create a build_strategy variable to set memory optimize option
     build_strategy = compiler.BuildStrategy()
     build_strategy.enable_inplace = True
     build_strategy.memory_optimize = True

     # pass the build_strategy to with_data_parallel API
     compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
         loss_name=loss.name, build_strategy=build_strategy)
  
 !!! Memory optimize is our experimental feature !!!
     some variables may be removed/reused internal to save memory usage, 
     in order to fetch the right value of the fetch_list, please set the 
     persistable property to true for each variable in fetch_list

     # Sample
     conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) 
     # if you need to fetch conv1, then:
     conv1.persistable = True

I1216 23:08:16.079864  6084 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 4. And the Program will be copied 4 copies
I1216 23:08:17.081748  6084 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[12-16 23:08:17 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
(the same DeprecationWarning and memory-optimize warning as above are printed again)
I1216 23:08:17.209542  6084 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 4. And the Program will be copied 4 copies
I1216 23:08:17.324332  6084 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[12-16 23:08:17 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
(the same DeprecationWarning and memory-optimize warning as above are printed again)
share_vars_from is set, scope is ignored.
I1216 23:08:17.525264  6084 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 4. And the Program will be copied 4 copies
I1216 23:08:17.640771  6084 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[12-16 23:08:17 MainThread @train.py:303] restore model from ./low_speed_model
Traceback (most recent call last):
  File "train.py", line 327, in <module>
    learner = Learner(args)
  File "train.py", line 85, in __init__
    self.restore(args.restore_model_path)
  File "train.py", line 304, in restore
    self.agent.restore(model_path)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/parl/core/fluid/agent.py", line 221, in restore
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 699, in load_params
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 611, in load_vars
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 648, in load_vars
    executor.run(load_prog)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/executor.py", line 651, in run
    use_program_cache=use_program_cache)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/executor.py", line 749, in run
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: Invoke operator load_combine error.
Python Callstacks:
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/framework.py", line 1771, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 647, in load_vars
    attrs={'file_path': os.path.join(load_dirname, filename)})
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 611, in load_vars
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 699, in load_params
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/parl/core/fluid/agent.py", line 221, in restore
    filename=filename)
  File "train.py", line 304, in restore
    self.agent.restore(model_path)
  File "train.py", line 85, in __init__
    self.restore(args.restore_model_path)
  File "train.py", line 327, in <module>
    learner = Learner(args)
C++ Callstacks:
tensor version 3393762800 is not supported.
at [/paddle/paddle/fluid/framework/lod_tensor.cc:256]
PaddlePaddle Call Stacks:
0   0x7efdba6c1f10p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
1   0x7efdba6c2289p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2   0x7efdbc38c7d4p paddle::framework::DeserializeFromStream(std::istream&, paddle::framework::LoDTensor*, paddle::platform::DeviceContext const&) + 724
3   0x7efdbb35e480p paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, float>::LoadParamsFromBuffer(paddle::framework::ExecutionContext const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, …> const&, std::istream*, bool, std::vector<std::string> const&) const + 352
4   0x7efdbb35edfep paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 798
5   0x7efdbb35f273p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, float>, …>::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
6   0x7efdbc7411e7p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, …> const&, paddle::framework::RuntimeContext*) const + 375
7   0x7efdbc7415c1p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, …> const&) const + 529
8   0x7efdbc73ebbcp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, …> const&) + 332
9   0x7efdba84cd0ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
10  0x7efdba84fdafp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string> const&, bool) + 143
11  0x7efdba6b359dp
12  0x7efdba6f4826p
13  0x7efe81ea2df2p _PyCFunction_FastCallDict + 258
(frames 14-47: CPython interpreter internals — _PyEval_EvalFrameDefault, _PyFunction_FastCallDict, _PyObject_Call_Prepend, PyObject_Call)
48  0x7efe81f26ee3p PyEval_EvalCodeEx + 99
49  0x7efe81f26f2bp PyEval_EvalCode + 59
50  0x7efe81f596c0p PyRun_FileExFlags + 304
51  0x7efe81f5ac83p PyRun_SimpleFileExFlags + 371
52  0x7efe81f760b5p Py_Main + 3621
53  0x400c1dp main + 365
54  0x7efe80f01830p __libc_start_main + 240
55  0x4009e9p
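The failure happens while load_combine deserializes ./low_speed_model: DeserializeFromStream rejects "tensor version 3393762800", an implausibly large value, which usually means the bytes being read are not a valid serialized tensor at all (e.g. a truncated download, an LFS pointer file, or a file saved by an incompatible Paddle version). As a quick sanity check, here is a minimal sketch — assuming, as the lod_tensor.cc frame suggests, that each serialized LoDTensor begins with a little-endian uint32 version field; the file name used below is purely illustrative:

```python
import struct

def peek_tensor_version(path):
    """Read the first 4 bytes of a saved parameter file as a little-endian
    uint32 -- the field that the failing version check inspects. A valid
    dump starts with a small version number; garbage bytes produce huge
    values like the 3393762800 seen in the error above."""
    with open(path, "rb") as f:
        head = f.read(4)
    if len(head) < 4:
        raise ValueError("file too short to contain a version field")
    return struct.unpack("<I", head)[0]

if __name__ == "__main__":
    # Illustrative self-test: write a fake file whose leading field is
    # version 0, then read it back.
    with open("fake_params.bin", "wb") as f:
        f.write(struct.pack("<I", 0) + b"payload")
    print(peek_tensor_version("fake_params.bin"))  # prints 0
```

If running this against the actual parameter file in ./low_speed_model yields a huge number rather than a small version, the saved model file itself is damaged or incompatible, and re-downloading or re-saving it with the matching Paddle version is the thing to try, not changing the training script.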

Reference: paddlepaddle/PARL#187