Opened Aug 22, 2020 by saxon_zh (@saxon_zh, Guest)

Ran into a problem while following the documentation, please help!!!

Created by: zhaoguoqing12

API is deprecated since 2.0.0 Please use FleetAPI instead. WIKI: https://github.com/PaddlePaddle/Fleet/blob/develop/markdown_doc/transpiler

W0822 16:45:36.462678 28971 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 52, Driver API Version: 10.0, Runtime API Version: 10.0
W0822 16:45:36.466768 28971 device_context.cc:245] device: 0, cuDNN Version: 7.6.
W0822 16:45:37.586989 28971 dynamic_loader.cc:120] Can not find library: libnccl.so. The process maybe hang. Please try to add the lib path to LD_LIBRARY_PATH.
/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py:789: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 153, in <module>
    main(args)
  File "tools/train.py", line 90, in main
    exe.run(startup_prog)
  File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 790, in run
    six.reraise(*sys.exc_info())
  File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 785, in run
    use_program_cache=use_program_cache)
  File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 838, in _run_impl
    use_program_cache=use_program_cache)
  File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 912, in _run_program
    fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::platform::dynload::GetNCCLDsoHandle()
3 void std::__once_call_impl<std::_Bind_simple<paddle::platform::dynload::DynLoad__ncclGetUniqueId::operator()<ncclUniqueId*>(ncclUniqueId*)::{lambda()#1} ()> >()
4 paddle::operators::GenNCCLIdOp::GenerateAndSend(paddle::framework::Scope*, paddle::platform::DeviceContext const&, std::string const&, std::vector<std::string, std::allocator<std::string> > const&) const
5 paddle::operators::GenNCCLIdOp::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
6 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
7 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
8 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool, bool)


Python Call Stacks (More useful to users):

File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
  attrs=kwargs.get("attrs", None))
File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/transpiler/distribute_transpiler.py", line 397, in _transpile_nccl2
  self.config.hierarchical_allreduce_inter_nranks
File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/transpiler/distribute_transpiler.py", line 625, in transpile
  wait_port=self.config.wait_port)
File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/incubate/fleet/collective/__init__.py", line 285, in _transpile
  current_endpoint=current_endpoint)
File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/incubate/fleet/collective/__init__.py", line 358, in _try_to_compile
  self._transpile(startup_program, main_program)
File "/data/data_pc_phone/miniconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/incubate/fleet/collective/__init__.py", line 424, in minimize
  fleet.main_program = self._try_to_compile(startup_program, main_program)
File "/data/PaddleClas/tools/program.py", line 363, in build
  optimizer.minimize(fetchs['loss'][0])
File "tools/train.py", line 75, in main
  config, train_prog, startup_prog, is_train=True)
File "tools/train.py", line 153, in <module>
  main(args)


Error Message Summary:

Error: Failed to find dynamic library: libnccl.so ( libnccl.so: cannot open shared object file: No such file or directory )
Please specify its path correctly using following ways:
Method. set environment variable LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH on Mac OS. For instance, issue command: export LD_LIBRARY_PATH=...
Note: After Mac OS 10.11, using the DYLD_LIBRARY_PATH is impossible unless System Integrity Protection (SIP) is disabled. at (/paddle/paddle/fluid/platform/dynload/dynamic_loader.cc:177)
[operator < gen_nccl_id > error]
2020-08-22 16:45:39,064-ERROR: ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
ERROR 2020-08-22 16:45:39,064 launch.py:284] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
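
The error summary says the loader cannot open libnccl.so and suggests setting LD_LIBRARY_PATH. A quick way to check this outside of training might look like the sketch below. It is only a minimal illustration: `can_load` and `dirs_containing` are hypothetical helper names, not part of PaddlePaddle, and the check assumes a Linux-style search path taken from the LD_LIBRARY_PATH environment variable.

```python
import ctypes
import os


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name, else False."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False


def dirs_containing(lib_name, search_path):
    """List the directories in a search path string that contain lib_name."""
    return [
        d for d in search_path.split(os.pathsep)
        if d and os.path.isfile(os.path.join(d, lib_name))
    ]


if __name__ == "__main__":
    name = "libnccl.so"
    print("loadable by the dynamic loader:", can_load(name))
    print("LD_LIBRARY_PATH directories containing it:",
          dirs_containing(name, os.environ.get("LD_LIBRARY_PATH", "")))
```

If the first check reports False and the second list is empty, NCCL is probably either not installed or installed in a directory that is not on LD_LIBRARY_PATH. The variable has to be exported in the shell before launching the training script, since the loader reads it when the process starts.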

Reference: paddlepaddle/PaddleClas#250