move DataLoader._worker_loop to top level !27247

!27247 Merged on Sep 10, 2020. Created by saxon_zh@saxon_zh
  • Overview 0
  • Commits 3
  • Changes 4

Created by: chenwhql

PR types

Bug fixes

PR changes

APIs

Describe

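Move DataLoader._worker_loop to the top level of the module.

Otherwise, starting a multi-process DataLoader inside a process launched by paddle.distributed.spawn raises an error, because _worker_loop cannot be pickled. As the traceback below shows, worker.start() goes through multiprocessing's spawn start method (popen_spawn_posix.py), which pickles the Process object and therefore its target. Pickling a bound method also pickles the instance it is bound to, which is presumably where the unpicklable _thread.lock comes from.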

Error example:

I0910 12:49:50.989151  8382 nccl_context.cc:127] init nccl context nranks: 2 local rank: 0 gpu id: 0
I0910 12:49:50.989272  8383 nccl_context.cc:127] init nccl context nranks: 2 local rank: 1 gpu id: 1
W0910 12:49:51.561908  8382 device_context.cc:320] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0
W0910 12:49:51.562680  8383 device_context.cc:320] Please NOTE: device: 1, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0
W0910 12:49:51.568655  8382 device_context.cc:328] device: 0, cuDNN Version: 7.5.
W0910 12:49:51.569162  8383 device_context.cc:328] device: 1, cuDNN Version: 7.5.
Exception ignored in: <bound method _DataLoaderIterMultiProcess.__del__ of <paddle.fluid.dataloader.dataloader_iter._DataLoaderIterMultiProcess object at 0x7f71cbeb5898>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 719, in __del__
    self._try_shutdown_all()
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 456, in _try_shutdown_all
    if not self._shutdown:
AttributeError: '_DataLoaderIterMultiProcess' object has no attribute '_shutdown'
Exception ignored in: <bound method _DataLoaderIterMultiProcess.__del__ of <paddle.fluid.dataloader.dataloader_iter._DataLoaderIterMultiProcess object at 0x7fcf918958d0>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 719, in __del__
    self._try_shutdown_all()
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 456, in _try_shutdown_all
    if not self._shutdown:
AttributeError: '_DataLoaderIterMultiProcess' object has no attribute '_shutdown'
Traceback (most recent call last):
  File "spawn_dataloader.py", line 72, in <module>
    dist.spawn(train, nprocs=2)
  File "/usr/local/lib/python3.5/dist-packages/paddle/distributed/spawn.py", line 409, in spawn
    while not context.join():
  File "/usr/local/lib/python3.5/dist-packages/paddle/distributed/spawn.py", line 210, in join
    self._throw_exception(error_index)
  File "/usr/local/lib/python3.5/dist-packages/paddle/distributed/spawn.py", line 228, in _throw_exception
    raise Exception(msg)
Exception: 

----------------------------------------------
Process 0 terminated with the following error:
----------------------------------------------

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/paddle/distributed/spawn.py", line 159, in _func_wrapper
    result = func(*args)
  File "/work/scripts/spawn/spawn_dataloader.py", line 58, in train
    for batch_id, (image, label) in enumerate(loader()):
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/reader.py", line 406, in __call__
    return self.__iter__()
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/reader.py", line 403, in __iter__
    return _DataLoaderIterMultiProcess(self)
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 381, in __init__
    self._init_workers()
  File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 413, in _init_workers
    worker.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 274, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 33, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 48, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.lock objects
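
The last frames of the traceback above fail in reduction.dump(process_obj, fp), that is, while pickling the worker process and its target. The following is a minimal sketch of why a top-level function avoids this, not Paddle's actual code: LoaderIter, worker_loop, and is_picklable are hypothetical names, and the lock stands in for whatever unpicklable state the real iterator holds.

```python
import pickle
import threading


class LoaderIter:
    """Hypothetical stand-in for a multi-process DataLoader iterator."""

    def __init__(self, dataset):
        self.dataset = dataset
        # Iterator-side state that cannot cross a process boundary.
        self._lock = threading.Lock()

    def _worker_loop(self, worker_id):
        # Bound-method form: pickling this target also pickles `self`,
        # including the lock above.
        return [x * 2 for x in self.dataset]


def worker_loop(dataset, worker_id):
    # Top-level form: the function is pickled by its qualified name,
    # and only the explicitly passed arguments are serialized.
    return [x * 2 for x in dataset]


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


it = LoaderIter([1, 2, 3])
# (target, args) pairs, as they would be handed to multiprocessing.Process.
bound_target = (it._worker_loop, (0,))
top_level_target = (worker_loop, (it.dataset, 0))

print(is_picklable(bound_target))      # False: the lock on `it` is dragged in
print(is_picklable(top_level_target))  # True
```

The design point is that the top-level function receives only the data it needs as arguments, so nothing from the parent-side iterator object has to be serialized when the worker starts.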
Reference: paddlepaddle/Paddle!27247
Source branch: github/fork/chenwhql/dataloader/move_worker_loop_to_top_level