Training on Win10: GPU memory is insufficient under automatic allocation, yet the reported available memory is even smaller (#23867) · Issue · PaddlePaddle / Paddle

Created by: Keen-King

1) PaddlePaddle version: 1.7.0
3) GPU: NVIDIA GeForce GTX 1050 Ti; CUDA: 10.0; cuDNN: cudnn-10.0-windows10-x64-v7.6.2.24
4) System environment: Windows 10 Pro, Python 3

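Training information: 1) single machine, single GPU
Reproduction information: if this is an error, please give the reproduction environment and steps
Problem description: please describe your problem in detail and include the error message, logs, and a reproducible code snippet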

I am running a cat-vs-dog classification example and trying to train on the GPU, but it reports that GPU memory is insufficient.

D:\python\lib\site-packages\paddle\fluid\executor.py:782: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "1.1-猫十二分类-建造模型.py", line 200, in <module>
    fetch_list=[avg_cost, acc])                    #fetch均方误差和准确率
  File "D:\python\lib\site-packages\paddle\fluid\executor.py", line 783, in run
    six.reraise(*sys.exc_info())
  File "D:\python\lib\site-packages\six.py", line 703, in reraise
    raise value
  File "D:\python\lib\site-packages\paddle\fluid\executor.py", line 778, in run
    use_program_cache=use_program_cache)
  File "D:\python\lib\site-packages\paddle\fluid\executor.py", line 831, in _run_impl
    use_program_cache=use_program_cache)
  File "D:\python\lib\site-packages\paddle\fluid\executor.py", line 905, in _run_program
    fetch_var_name)
RuntimeError: 

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
Windows not support stack backtrace yet.

----------------------
Error Message Summary:
----------------------
ResourceExhaustedError: 

Out of memory error on GPU 0. Cannot allocate 975.156494MB memory on GPU 0, available memory is only 692.362499MB.

Please check whether there is any other process using GPU 0.
1. If yes, please stop them, or start PaddlePaddle on another GPU.
2. If no, please try one of the following suggestions:
   1) Decrease the batch size of your model.
   2) FLAGS_fraction_of_gpu_memory_to_use is 0.50 now, please set it to a higher value but less than 1.0.
      The command is `export FLAGS_fraction_of_gpu_memory_to_use=xxx`.

 at (D:\1.7.0\paddle\paddle\fluid\memory\detail\system_allocator.cc:151)

W0414 22:07:52.919553  1340 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0
W0414 22:07:53.216182  1340 device_context.cc:245] device: 0, cuDNN Version: 7.6.
W0414 22:08:20.092104  1340 operator.cc:181] relu raises an exception struct paddle::memory::allocation::BadAlloc, 

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
Windows not support stack backtrace yet.

----------------------
Error Message Summary:
----------------------
ResourceExhaustedError: 

Out of memory error on GPU 0. Cannot allocate 975.156494MB memory on GPU 0, available memory is only 692.362499MB.

Please check whether there is any other process using GPU 0.
1. If yes, please stop them, or start PaddlePaddle on another GPU.
2. If no, please try one of the following suggestions:
   1) Decrease the batch size of your model.
   2) FLAGS_fraction_of_gpu_memory_to_use is 0.50 now, please set it to a higher value but less than 1.0.
      The command is `export FLAGS_fraction_of_gpu_memory_to_use=xxx`.

 at (D:\1.7.0\paddle\paddle\fluid\memory\detail\system_allocator.cc:151)

I tried entering the statement export FLAGS_fraction_of_gpu_memory_to_use=0.9 on the command line.

This had no effect.
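A note on the attempt above: `export` is a Unix shell built-in, so it is not available in the Windows `cmd.exe` prompt (the equivalent there is `set FLAGS_fraction_of_gpu_memory_to_use=0.9`, typed in the same window that then launches Python). A shell-independent alternative is to start the training script from Python with the variable placed in the child process's environment. The following is only a minimal sketch: `run_with_flag` is a hypothetical helper, not part of PaddlePaddle, and the child command here is a stand-in that merely echoes the variable.

```python
import os
import subprocess
import sys

FLAG = "FLAGS_fraction_of_gpu_memory_to_use"


def run_with_flag(args, fraction="0.9", capture=False):
    """Run a command with the flag set in the child's environment (hypothetical helper)."""
    env = dict(os.environ)  # copy, so the parent environment is left untouched
    env[FLAG] = fraction
    return subprocess.run(args, env=env, capture_output=capture, text=True)


# Stand-in child process that only reports whether it can see the variable.
# A real call would pass [sys.executable, "<your training script>.py"] instead.
check = run_with_flag(
    [sys.executable, "-c", "import os; print(os.environ.get('%s'))" % FLAG],
    capture=True,
)
print(check.stdout.strip())
```

If the stand-in child prints the value that was passed in, the variable is reaching the process. Whether PaddlePaddle then honours it is a separate question that this sketch does not answer.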

I then added os.environ["FLAGS_fraction_of_gpu_memory_to_use"]="0.9" to the script being run.

That did not help either.
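One thing worth checking with the `os.environ` approach is ordering. As far as I understand, PaddlePaddle 1.x picks up `FLAGS_*` environment variables when `paddle.fluid` is first imported, so an assignment placed after the import may be silently ignored. Treat this as an assumption to verify, not as confirmed behaviour of 1.7.0. A minimal sketch of the ordering, with the import guarded so that the snippet also runs on a machine without Paddle installed:

```python
import os

# Set the flag first, so it is already present when the framework initialises.
os.environ["FLAGS_fraction_of_gpu_memory_to_use"] = "0.9"

try:
    import paddle.fluid as fluid  # imported only after the flag has been set
    paddle_available = True
except ImportError:
    fluid = None
    paddle_available = False

print("flag seen by this process:",
      os.environ.get("FLAGS_fraction_of_gpu_memory_to_use"))
print("paddle importable:", paddle_available)
```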

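Suspecting that the Windows environment was the cause, I then set it under Environment Variables > System variables.

Running the script still produces the error.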

After that I changed batch_size from 128 to 64.

The result is that the reported available memory has dropped correspondingly?

Out of memory error on GPU 0. Cannot allocate 65.051514MB memory on GPU 0, available memory is only 24.174999MB.
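To compare the two failures, it can help to look at how far short each allocation fell. The figures below are copied from the two error messages in this post. The interpretation is an assumption on my part: the "available memory" value appears to be what is free at the moment of the failed request, not the card's total, so a smaller number may simply mean the run got further and had already allocated more before it failed.

```python
# (requested MB, available MB), copied from the two error messages in this post
reports = {
    128: (975.156494, 692.362499),  # batch_size = 128
    64: (65.051514, 24.174999),     # batch_size = 64
}


def shortfall_mb(requested, available):
    """How many MB the failed allocation was short by."""
    return requested - available


for batch_size, (requested, available) in reports.items():
    print("batch_size=%d: short by %.2f MB"
          % (batch_size, shortfall_mb(requested, available)))
```

By this measure the gap shrinks from about 282.79 MB to about 40.88 MB when the batch size is halved, which suggests that a somewhat smaller batch size might fit.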

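I would also like to know how I can dedicate all 4 GB of my GPU memory to this job, rather than having it allocated automatically.

Any help would be appreciated.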
