PaddlePaddle / PaddleDetection · Issue #446

Opened April 6, 2020 by saxon_zh (Guest)

Sufficient GPU memory, but an out of memory error is reported

Created by: yinggo

I am training on my own dataset with cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms.yml. GPU memory is sufficient, but the run fails with an out of memory error. How can I resolve this?

```bash
python3 -u tools/train.py \
    -c configs/dcn/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms.yml \
    -o pretrain_weights=models/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms \
    --use_tb=True --tb_log_dir=tb_log_caltech/scalar --eval
```

P.S. batch_size is already set to 1. I also tried the multi-process launch `python -m paddle.distributed.launch --selected_gpus 0,1,2,3 tools/train.py ...` and hit the same problem. I really don't know what else to try; any advice would be appreciated.
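For reference, this is the kind of launch setup I would try next: pin the visible GPUs explicitly and tune Paddle's allocator flags before running the same command. The flag values below are only guesses on my side, not a confirmed fix; `CUDA_VISIBLE_DEVICES`, `FLAGS_eager_delete_tensor_gb` and `FLAGS_fraction_of_gpu_memory_to_use` are the standard CUDA / PaddlePaddle environment variables.

```bash
# Sketch only: pin the GPUs and tune Paddle's allocator before the same run.
# The exact values are guesses, not a verified fix.
export CUDA_VISIBLE_DEVICES=0,1,2,3              # limit which GPUs Paddle can see
export FLAGS_eager_delete_tensor_gb=0.0          # free temporary tensors eagerly (the log shows this is already 0)
export FLAGS_fraction_of_gpu_memory_to_use=0.9   # fraction of each GPU's memory Paddle may reserve up front

python3 -u tools/train.py \
    -c configs/dcn/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms.yml \
    -o pretrain_weights=models/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms \
    --use_tb=True --tb_log_dir=tb_log_caltech/scalar --eval
```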

Training log:

```
2020-04-06 17:36:49,707-INFO: 6707 samples in file dataset/coco/annotations/instances_val2007.json
2020-04-06 17:36:49,712-INFO: places would be ommited when DataLoader is not iterable
W0406 17:36:50.419684 27808 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.0, Runtime API Version: 10.0
W0406 17:36:50.422271 27808 device_context.cc:245] device: 0, cuDNN Version: 7.6.
2020-04-06 17:36:51,523-INFO: Loading parameters from models/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms...
2020-04-06 17:36:51,524-WARNING: models/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-04-06 17:36:51,524-WARNING: models/cascade_rcnn_cbr200_vd_fpn_dcnv2_nonlocal_softnms.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
loading annotations into memory...
Done (t=0.20s)
creating index...
index created!
2020-04-06 17:36:53,468-WARNING: Found an invalid bbox in annotations: im_id: 5387, area: -10.0 x1: 348, y1: 176, x2: 348, y2: 196.
2020-04-06 17:36:53,481-WARNING: Found an invalid bbox in annotations: im_id: 5765, area: -10.0 x1: 71, y1: 174, x2: 71, y2: 197.
2020-04-06 17:36:53,686-INFO: 15649 samples in file dataset/coco/annotations/instances_train2007.json
2020-04-06 17:36:53,699-INFO: places would be ommited when DataLoader is not iterable
I0406 17:36:54.286912 27808 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 4 cards are used, so 4 programs are executed in parallel.
W0406 17:37:00.890586 27808 fuse_all_reduce_op_pass.cc:74] Find all_reduce operators: 730. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 604.
I0406 17:37:01.025799 27808 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1
I0406 17:37:28.992170 27808 parallel_executor.cc:307] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I0406 17:37:29.978662 27808 parallel_executor.cc:375] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0
W0406 17:37:42.174067   508 operator.cc:181] deformable_conv raises an exception paddle::memory::allocation::BadAlloc,
```


C++ Call Stacks (More useful to developers):

```
0   std::string paddle::platform::GetTraceBackString<std::string>(std::string&&, char const*, int)
1   paddle::memory::allocation::CUDAAllocator::AllocateImpl(unsigned long)
2   paddle::memory::allocation::AlignedAllocator::AllocateImpl(unsigned long)
3   paddle::memory::allocation::AutoGrowthBestFitAllocator::AllocateImpl(unsigned long)
4   paddle::memory::allocation::Allocator::Allocate(unsigned long)
5   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
6   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
7   paddle::memory::Alloc(paddle::platform::Place const&, unsigned long)
8   paddle::memory::Alloc(paddle::platform::DeviceContext const&, unsigned long)
9   paddle::framework::Tensor paddle::framework::ExecutionContext::AllocateTmpTensor<float, paddle::platform::CUDADeviceContext>(paddle::framework::DDim const&, paddle::platform::CUDADeviceContext const&) const
10  paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
11  std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
12  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
13  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
14  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
15  paddle::framework::details::ComputationOpHandle::RunImpl()
16  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
17  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue > const&, unsigned long*)
18  std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
19  std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
20  ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
```


Error Message Summary:

ResourceExhaustedError:

Out of memory error on GPU 2. Cannot allocate 66.797119MB memory on GPU 2, available memory is only 31.062500MB.

Please check whether there is any other process using GPU 2.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.

at (/paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:69)

F0406 17:37:42.174147   508 exception_holder.h:37] std::exception caught,


C++ Call Stacks (More useful to developers):

```
0   std::string paddle::platform::GetTraceBackString<std::string>(std::string&&, char const*, int)
1   paddle::memory::allocation::CUDAAllocator::AllocateImpl(unsigned long)
2   paddle::memory::allocation::AlignedAllocator::AllocateImpl(unsigned long)
3   paddle::memory::allocation::AutoGrowthBestFitAllocator::AllocateImpl(unsigned long)
4   paddle::memory::allocation::Allocator::Allocate(unsigned long)
5   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
6   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
7   paddle::memory::Alloc(paddle::platform::Place const&, unsigned long)
8   paddle::memory::Alloc(paddle::platform::DeviceContext const&, unsigned long)
9   paddle::framework::Tensor paddle::framework::ExecutionContext::AllocateTmpTensor<float, paddle::platform::CUDADeviceContext>(paddle::framework::DDim const&, paddle::platform::CUDADeviceContext const&) const
10  paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
11  std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
12  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
13  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
14  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
15  paddle::framework::details::ComputationOpHandle::RunImpl()
16  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
17  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue > const&, unsigned long*)
18  std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
19  std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
20  ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
```


Error Message Summary:

ResourceExhaustedError:

Out of memory error on GPU 2. Cannot allocate 66.797119MB memory on GPU 2, available memory is only 31.062500MB.

Please check whether there is any other process using GPU 2.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.

at (/paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:69)

*** Check failure stack trace: ***

```
    @     0x7fdb53276c2d  google::LogMessage::Fail()
    @     0x7fdb5327a6dc  google::LogMessage::SendToLog()
    @     0x7fdb53276753  google::LogMessage::Flush()
    @     0x7fdb5327bbee  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fdb558509b8  paddle::framework::details::ExceptionHolder::Catch()
    @     0x7fdb558fc68e  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync()
    @     0x7fdb558fb29f  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp()
    @     0x7fdb558fb564  _ZNSt17_Function_handlerIFvvESt17reference_wrapperISt12_Bind_simpleIFS1_ISt5_BindIFZN6paddle9framework7details28FastThreadedSSAGraphExecutor10RunOpAsyncEPSt13unordered_mapIPNS6_12OpHandleBaseESt6atomicIiESt4hashISA_ESt8equal_toISA_ESaISt4pairIKSA_SC_EEESA_RKSt10shared_ptrINS5_13BlockingQueueImEEEEUlvE_vEEEvEEEE9_M_invokeERKSt9_Any_data
    @     0x7fdb532cf983  std::_Function_handler<>::_M_invoke()
    @     0x7fdb5305dc37  std::__future_base::_State_base::_M_do_set()
    @     0x7fdb8dfe5a99  __pthread_once_slow
    @     0x7fdb558f6a52  _ZNSt13__future_base11_Task_stateISt5_BindIFZN6paddle9framework7details28FastThreadedSSAGraphExecutor10RunOpAsyncEPSt13unordered_mapIPNS4_12OpHandleBaseESt6atomicIiESt4hashIS8_ESt8equal_toIS8_ESaISt4pairIKS8_SA_EEES8_RKSt10shared_ptrINS3_13BlockingQueueImEEEEUlvE_vEESaIiEFvvEE6_M_runEv
    @     0x7fdb5305fe64  _ZZN10ThreadPoolC1EmENKUlvE_clEv
    @     0x7fdb7f27d3e7  execute_native_thread_routine_compat
    @     0x7fdb8dfde6ba  start_thread
    @     0x7fdb8dd1441d  clone
    @              (nil)  (unknown)
Aborted (core dumped)
```

Output of `nvidia-smi`:

```
Mon Apr  6 17:51:08 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 00000000:05:00.0  On |                  N/A |
| 26%   46C    P8    19W / 250W |    380MiB / 12188MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN X (Pascal)    Off  | 00000000:06:00.0 Off |                  N/A |
| 26%   46C    P8    18W / 250W |      2MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  TITAN X (Pascal)    Off  | 00000000:09:00.0 Off |                  N/A |
| 25%   44C    P8    19W / 250W |      2MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  TITAN X (Pascal)    Off  | 00000000:0A:00.0 Off |                  N/A |
| 23%   40C    P8    17W / 250W |      2MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1523      G   /usr/lib/xorg/Xorg                           215MiB |
|    0      3783      G   /opt/teamviewer/tv_bin/TeamViewer             41MiB |
|    0     12214      G   /usr/bin/nvidia-settings                       1MiB |
|    0     20941      G   compiz                                       109MiB |
|    0     26423      G   .../local/MATLAB/R2018a/bin/glnxa64/MATLAB     6MiB |
+-----------------------------------------------------------------------------+
```
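The snapshot above is from 17:51, after the training process had already aborted at 17:37, so it does not show memory usage at the moment of the failure. To see whether usage on GPU 2 really spikes before the OOM, I would watch per-GPU memory while the job runs, using the standard `nvidia-smi` query options:

```bash
# Poll per-GPU memory once per second while training is running.
nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv -l 1
```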
