PaddleCheckError: CUBLAS raised during training
Created by: pfan8
---------------------------------------------------------------------------
EnforceNotMet                             Traceback (most recent call last)
<ipython-input-11-c202560c70bc> in <module>
61 avg_loss = fluid.layers.mean(loss)
62 print('avg_loss:{}'.format(avg_loss.numpy()))
---> 63 avg_loss.backward()
64 adam.minimize(avg_loss)
65 model.clear_gradients()
</opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/decorator.py:decorator-gen-133> in backward(self, backward_strategy)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py in __impl__(func, *args, **kwargs)
23 def __impl__(func, *args, **kwargs):
24 wrapped_func = decorator_func(func)
---> 25 return wrapped_func(*args, **kwargs)
26
27 return __impl__
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py in __impl__(*args, **kwargs)
205 assert in_dygraph_mode(
206 ), "We Only support %s in Dygraph mode, please use fluid.dygraph.guard() as context to run it in Dygraph Mode" % func.__name__
--> 207 return func(*args, **kwargs)
208
209 return __impl__
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py in backward(self, backward_strategy)
889 backward_strategy.sort_sum_gradient = False
890
--> 891 self._ivar._run_backward(backward_strategy, _dygraph_tracer())
892 else:
893 raise ValueError(
EnforceNotMet: 0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 void paddle::operators::math::CUBlas<float>::AXPY<cublasContext*, int, float*, float const*, int, float*, int>(cublasContext*, int, float*, float const*, int, float*, int)
3 paddle::imperative::TensorAdd(paddle::framework::Variable const&, paddle::framework::Variable*)
4 paddle::imperative::EagerGradientAccumulator::Add(std::shared_ptr<paddle::imperative::VarBase>, unsigned long)
5 paddle::imperative::BasicEngine::SumGradient(paddle::imperative::OpBase*, std::shared_ptr<paddle::imperative::VarBase>, paddle::imperative::VarBase*)
6 paddle::imperative::BasicEngine::Execute()
PaddleCheckError: CUBLAS: execution failed, at [/paddle/paddle/fluid/operators/math/blas_impl.cu.h:39]
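This error appears after training for a few dozen batches. The loss values I print are all normal numbers, so I don't know where the problem is. The model is a seq2seq model, based on https://github.com/PaddlePaddle/models/blob/develop/dygraph/sentiment/main.py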
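Since CUDA errors are reported asynchronously, the failing `AXPY` call in the trace may not be the operation that actually went wrong. One thing that could be ruled out is a batch containing token ids outside the embedding vocabulary, which is a common cause of this kind of failure in seq2seq models. This is only an assumption; nothing in the trace confirms it. Below is a minimal sketch of a per-batch sanity check using only NumPy. The names `check_batch`, `token_ids` and `vocab_size` are hypothetical and not part of the Paddle API.

```python
import numpy as np


def check_batch(token_ids, vocab_size, loss_value=None):
    """Return a list of problems found in one batch (empty list if none).

    token_ids  : array-like of integer ids fed to the embedding layer
    vocab_size : number of rows in the embedding table (assumed known)
    loss_value : optional scalar or array, e.g. the result of avg_loss.numpy()
    """
    problems = []
    ids = np.asarray(token_ids)

    if not np.issubdtype(ids.dtype, np.integer):
        # Embedding lookups expect integer ids.
        problems.append("token ids have non-integer dtype {}".format(ids.dtype))
    elif ids.size > 0:
        lo, hi = int(ids.min()), int(ids.max())
        if lo < 0 or hi >= vocab_size:
            # An out-of-range id reads past the embedding table on the GPU.
            problems.append(
                "token id range [{}, {}] is outside [0, {})".format(lo, hi, vocab_size)
            )

    if loss_value is not None and not np.all(np.isfinite(loss_value)):
        problems.append("loss contains NaN or Inf")

    return problems
```

It could be called on the raw NumPy batch right before `avg_loss.backward()`, stopping at the first batch for which the returned list is non-empty. Running the script with the environment variable `CUDA_LAUNCH_BLOCKING=1` may also help, since it makes kernel launches synchronous so that the reported location is closer to the real failure.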