PaddlePaddle / PaddleDetection
Opened Sep 21, 2020 by saxon_zh (@saxon_zh, Guest)

Training error, asking for help!!!

Created by: zgsxwsdxg

Could someone on the technical side take a look and help me locate the problem? Thanks. My training log is as follows:

2020-09-18 18:59:45,566-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000500] in Optimizer will not take effect, and it will only be applied to other Parameters!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2020-09-18 18:59:50,686-INFO: places would be ommited when DataLoader is not iterable
W0918 18:59:50.764096 10283 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0
W0918 18:59:50.876243 10283 device_context.cc:260] device: 0, cuDNN Version: 7.6.
2020-09-18 18:59:56,455-INFO: Downloading ResNet50_vd_ssld_pretrained.tar from https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar

100%|██████████| 92837/92837 [00:14<00:00, 6272.83KB/s]
2020-09-18 19:00:11,565-INFO: Decompressing /home/ubuntu/.cache/paddle/weights/ResNet50_vd_ssld_pretrained.tar...
2020-09-18 19:00:12,866-WARNING: /home/ubuntu/.cache/paddle/weights/ResNet50_vd_ssld_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/io.py:1998: UserWarning: This list is not set, Because of Paramerter not found in program. There are: fc_0.w_0 fc_0.b_0
  format(" ".join(unused_para_list)))
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2020-09-18 19:00:18,819-INFO: places would be ommited when DataLoader is not iterable
W0918 19:00:27.958123 10283 dynamic_loader.cc:167] You may need to install 'nccl2' from NVIDIA official website: https://developer.nvidia.com/nccl/nccl-downloadbefore install PaddlePaddle.
/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 372, in <module>
    main()
  File "tools/train.py", line 245, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1071, in run
    six.reraise(*sys.exc_info())
  File "/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1066, in run
    return_merged=return_merged)
  File "/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1156, in _run_impl
    program._compile(scope, self.place)
  File "/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/compiler.py", line 443, in _compile
    places=self._places)
  File "/home/ubuntu/anaconda3/envs/py3.7-paddle-1.8/lib/python3.7/site-packages/paddle/fluid/compiler.py", line 396, in _compile_data_parallel
    self._exec_strategy, self._build_strategy, self._graph)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0   std::string paddle::platform::GetTraceBackString<std::string>(std::string&&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
2   paddle::platform::dynload::GetNCCLDsoHandle()
3   void std::__once_call_impl<std::_Bind_simple<paddle::platform::dynload::DynLoad__ncclCommInitAll::operator()<ncclComm**, int, int*>(ncclComm**, int, int*)::{lambda()#1} ()> >()
4   paddle::platform::NCCLContextMap::NCCLContextMap(std::vector<paddle::platform::Place, std::allocator<paddle::platform::Place> > const&, ncclUniqueId*, unsigned long, unsigned long)
5   paddle::platform::NCCLCommunicator::InitFlatCtxs(std::vector<paddle::platform::Place, std::allocator<paddle::platform::Place> > const&, std::vector<ncclUniqueId*, std::allocator<ncclUniqueId*> > const&, unsigned long, unsigned long)
6   paddle::framework::ParallelExecutorPrivate::InitNCCLCtxs(paddle::framework::Scope*, paddle::framework::details::BuildStrategy const&)
7   paddle::framework::ParallelExecutorPrivate::InitOrGetNCCLCommunicator(paddle::framework::Scope*, paddle::framework::details::BuildStrategy*)
8   paddle::framework::ParallelExecutor::ParallelExecutor(std::vector<paddle::platform::Place, std::allocator<paddle::platform::Place> > const&, std::vector<std::string, std::allocator<std::string> > const&, std::string const&, paddle::framework::Scope*, std::vector<paddle::framework::Scope*, std::allocator<paddle::framework::Scope*> > const&, paddle::framework::details::ExecutionStrategy const&, paddle::framework::details::BuildStrategy const&, paddle::framework::ir::Graph*)


Error Message Summary:

PreconditionNotMetError: The third-party dynamic library (libnccl.so) that Paddle depends on is not configured correctly. (error code is libnccl.so: cannot open shared object file: No such file or directory)
Suggestions:

  1. Check if the third-party dynamic library (e.g. CUDA, CUDNN) is installed correctly and its version is matched with paddlepaddle you installed.
  2. Configure third-party dynamic library environment variables as follows:
    • Linux: set LD_LIBRARY_PATH by export LD_LIBRARY_PATH=...
    • Windows: set PATH by `set PATH=XXX;
at (/paddle/paddle/fluid/platform/dynload/dynamic_loader.cc:194)
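
The error summary says the loader could not open libnccl.so. A minimal sketch for checking this from the same Python environment is shown here. The helper names `find_shared_library` and `can_load` are hypothetical and are not part of Paddle. The sketch only mirrors what the system dynamic loader and `LD_LIBRARY_PATH` would find, which may differ from every location Paddle itself tries.

```python
import ctypes
import os


def find_shared_library(lib_name, search_dirs):
    """Return the paths under search_dirs that contain a file named lib_name."""
    hits = []
    for directory in search_dirs:
        # Skip empty entries, which appear when LD_LIBRARY_PATH is unset or has stray colons.
        if not directory:
            continue
        candidate = os.path.join(directory, lib_name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


def can_load(lib_name):
    """Ask the system dynamic loader to open lib_name; return True on success."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the shared object cannot be opened, as in the log above.
        return False


if __name__ == "__main__":
    dirs = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    print("LD_LIBRARY_PATH entries:", [d for d in dirs if d])
    print("libnccl.so found in:", find_shared_library("libnccl.so", dirs))
    print("libnccl.so loadable:", can_load("libnccl.so"))
```

If the last line prints `False`, the likely next step is the one the log already suggests: install NCCL 2 and add the directory that contains libnccl.so to `LD_LIBRARY_PATH` before starting training.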
Reference: paddlepaddle/PaddleDetection#1444