Cannot run CPU training with the GPU 2.0-beta build (#1502) · Issue · PaddlePaddle / PaddleDetection

Created by: luotao1

CPU training works with the CPU 2.0-beta build, but it felt slow, so I wanted to switch to GPU.
After installing the GPU 2.0-beta build, GPU training fails with "CUDA driver version is insufficient for CUDA runtime version":

$ python tools/train.py -c configs/yolov3_mobilenet_v1_roadsign.yml --eval -o use_gpu=1
W0924 22:58:24.747603  9368 init.cc:157] Compiled with WITH_GPU, but no GPU found in runtime.
Traceback (most recent call last):
  File "tools/train.py", line 372, in <module>
    main()
  File "tools/train.py", line 85, in main
    devices_num = fluid.core.get_cuda_device_count()
paddle.fluid.core_avx.EnforceNotMet:

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::platform::GetCUDADeviceCount()
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2   std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
3   paddle::platform::GetCurrentTraceBackString()

----------------------
Error Message Summary:
----------------------
ExternalError:  Cuda error(35), CUDA driver version is insufficient for CUDA runtime version.
  [Advise: This indicates that the installed NVIDIA CUDA driver is older than the CUDA runtime library. This is not a supported configuration.Users should install an updated NVIDIA display driver to allow the application to run.] (at /paddle/paddle/fluid/platform/gpu_info.cc:68)
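
To make the error summary above concrete: CUDA reports both the driver's supported CUDA level and the runtime version as integers encoded as `1000 * major + 10 * minor`, and error 35 corresponds to the driver value being lower than the runtime value. The sketch below only illustrates that comparison. The helper names `decode_cuda_version` and `driver_supports_runtime` are hypothetical, and the version numbers in the usage lines are made up, since this issue does not state which versions are installed.

```python
def decode_cuda_version(encoded):
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


def driver_supports_runtime(driver_version, runtime_version):
    """Return True when the driver's CUDA level is at least the runtime's.

    A driver_version of 0 means that no driver is installed at all.
    Both arguments use the same integer encoding as decode_cuda_version.
    """
    return driver_version != 0 and driver_version >= runtime_version


# Hypothetical values, for illustration only (not taken from this issue).
example_driver = 9000     # driver supports up to CUDA 9.0
example_runtime = 10000   # package built against CUDA 10.0
example_ok = driver_supports_runtime(example_driver, example_runtime)
```

With these made-up values `example_ok` is false, which is the situation the error message describes: the driver has to be upgraded, or a package built against an older CUDA runtime has to be installed.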

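Since the error indicates that the GPU driver is not installed correctly, I tried switching back to CPU training, but it still fails: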

$ python tools/train.py -c configs/yolov3_mobilenet_v1_roadsign.yml --eval -o use_gpu=0
W0924 22:58:47.446882 30162 init.cc:157] Compiled with WITH_GPU, but no GPU found in runtime.
2020-09-24 22:58:47,718-WARNING: config YOLOv3Loss.batch_size is deprecated, training batch size should be set by TrainReader.batch_size
2020-09-24 22:58:48,839-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000500] in Optimizer will not take effect, and it will only be applied to other Parameters!
2020-09-24 22:58:49,292-WARNING: config YOLOv3Loss.batch_size is deprecated, training batch size should be set by TrainReader.batch_size
2020-09-24 22:58:50,776-INFO: places would be ommited when DataLoader is not iterable
Traceback (most recent call last):
  File "tools/train.py", line 372, in <module>
    main()
  File "tools/train.py", line 180, in main
    exe.run(startup_prog)
  File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1101, in run
    six.reraise(*sys.exc_info())
  File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1099, in run
    return_merged=return_merged)
  File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1223, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1308, in _run_program
    fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:

  Compile Traceback (most recent call last):
    File "tools/train.py", line 372, in <module>
      main()
    File "tools/train.py", line 123, in main
      optimizer.minimize(loss)
    File "<decorator-gen-92>", line 2, in minimize
      None
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/dygraph/base.py", line 239, in __impl__
      return func(*args, **kwargs)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 947, in minimize
      loss, startup_program=startup_program, params_grads=params_grads)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 861, in apply_optimize
      optimize_ops = self.apply_gradients(params_grads)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 835, in apply_gradients
      optimize_ops = self._create_optimization_pass(params_grads)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 660, in _create_optimization_pass
      [p[0] for p in parameters_and_grads if p[0].trainable])
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 1142, in _create_accumulators
      self._add_accumulator(self._velocity_acc_str, p)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 573, in _add_accumulator
      var, initializer=Constant(value=float(fill_value)))
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/layer_helper_base.py", line 448, in set_variable_initializer
      initializer=initializer)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/framework.py", line 2719, in create_var
      kwargs['initializer'](var, self)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/initializer.py", line 149, in __call__
      stop_gradient=True)
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/framework.py", line 2949, in _prepend_op
      attrs=kwargs.get("attrs", None))
    File "/home/luotao/.jumbo/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1977, in __init__
      for frame in traceback.extract_stack():

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool, bool)
1   paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
2   paddle::framework::Executor::RunPartialPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, long, long, bool, bool, bool)
3   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
5   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
6   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::FillConstantKernel<float>, paddle::operators::FillConstantKernel<double>, paddle::operators::FillConstantKernel<long>, paddle::operators::FillConstantKernel<int>, paddle::operators::FillConstantKernel<bool>, paddle::operators::FillConstantKernel<paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
7   paddle::operators::FillConstantKernel<float>::Compute(paddle::framework::ExecutionContext const&) const
8   paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)
9   paddle::memory::AllocShared(paddle::platform::Place const&, unsigned long)
10  paddle::memory::allocation::AllocatorFacade::Instance()
11  paddle::memory::allocation::AllocatorFacade::AllocatorFacade()
12  paddle::memory::allocation::AllocatorFacadePrivate::AllocatorFacadePrivate()
13  paddle::platform::GetCUDADeviceCount()
14  paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
15  std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
16  paddle::platform::GetCurrentTraceBackString()

----------------------
Error Message Summary:
----------------------
ExternalError:  Cuda error(35), CUDA driver version is insufficient for CUDA runtime version.
  [Advise: This indicates that the installed NVIDIA CUDA driver is older than the CUDA runtime library. This is not a supported configuration.Users should install an updated NVIDIA display driver to allow the application to run.] (at /paddle/paddle/fluid/platform/gpu_info.cc:68)
  [operator < fill_constant > error]
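
The frame order in the C++ traceback above suggests why `use_gpu=0` does not help: the `fill_constant` kernel runs on `CPUPlace`, but its first allocation constructs `AllocatorFacade`, whose constructor calls `GetCUDADeviceCount()`. The following is a toy Python model of that call chain, inferred only from the traceback. It is not Paddle's actual code, and every name in it (`DriverError`, `alloc`, the `driver_ok` flag) is a hypothetical stand-in.

```python
class DriverError(RuntimeError):
    """Stands in for the EnforceNotMet raised inside GetCUDADeviceCount."""


def get_cuda_device_count(driver_ok):
    # Hypothetical stand-in: fails when the driver is too old, as in the log.
    if not driver_ok:
        raise DriverError("Cuda error(35)")
    return 1


class AllocatorFacade(object):
    """Toy singleton whose constructor enumerates GPUs for every caller."""

    _instance = None

    def __init__(self, driver_ok):
        # The device count is queried here even if only CPU memory is needed.
        self.gpu_count = get_cuda_device_count(driver_ok)

    @classmethod
    def instance(cls, driver_ok):
        if cls._instance is None:
            cls._instance = cls(driver_ok)
        return cls._instance


def alloc(place, size, driver_ok):
    """Allocate `size` bytes; `place` is ignored until the facade exists."""
    AllocatorFacade.instance(driver_ok)
    return bytearray(size)
```

In this model `alloc("cpu", 4, driver_ok=False)` raises `DriverError` even though only CPU memory was requested, which mirrors the failure reported here with a GPU build on a machine whose driver is too old.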
