Wrong error message when running out of GPU memory
Created by: cjld
Here is a minimal example to illustrate the issue:
```python
import os
import numpy as np
import paddle
import paddle.fluid as fluid

sp = fluid.Program()
tp = fluid.Program()
image_shape = [1025, 2049]
with fluid.program_guard(tp, sp):
    img = fluid.layers.data(name='img', shape=[50, 0, 0], dtype='float32')
    for i in range(1000):
        img = img + 1.0
    #img = fluid.layers.resize_bilinear(img, image_shape)
exe = fluid.Executor(fluid.CUDAPlace(0))
exe.run(sp)
result = exe.run(tp,
                 feed={'img': np.zeros((1, 50, 1025, 2049), dtype=np.float32)},
                 fetch_list=[img])
print result[0].shape
```
Output:

```
---------------------------------------------------------------------------
EnforceNotMet                             Traceback (most recent call last)
<ipython-input-4-a6475bdc6bdf> in <module>()
      1 import numpy as np
----> 2 result = exe.run(tp, feed={'img':np.zeros((1,50,1025,2049), dtype=np.float32)}, fetch_list=[img])
      3 print result[0].shape

/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.pyc in run(self, program, feed, fetch_list, feed_var_name, fetch_var_name, scope, return_numpy, use_program_cache)
    441
    442         self._feed_data(program, feed, feed_var_name, scope)
--> 443         self.executor.run(program.desc, scope, 0, True, True)
    444         outs = self._fetch_data(fetch_list, fetch_var_name, scope)
    445         if return_numpy:

EnforceNotMet: enforce allocating <= available failed, 11164145418 > 860487424
 at [/paddle/paddle/fluid/platform/gpu_info.cc:119]
PaddlePaddle Call Stacks:
0       0x7f46d9cda2f6p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 486
1       0x7f46dab525bep paddle::platform::GpuMaxChunkSize() + 766
2       0x7f46daa8268cp void* paddle::memory::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace, unsigned long) + 444
```
It seems that `paddle::platform::GpuMaxChunkSize()` should not be called again when the GPU runs out of memory; it should only be called once, at startup. Because the allocation path calls it again after the device memory is nearly exhausted, the enforce inside it (`allocating <= available`) is what fails, and the user sees this internal check instead of a proper out-of-memory error message.