Why does the same code produce correct output in a CPU environment but raise an error in a GPU environment?
Created by: 131250208
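The following test code runs correctly in the CPU environment on AI Studio and outputs [64, 100, 128], but it raises an error after switching to the GPU environment.
Code: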
from paddle import fluid
import numpy as np
param_attr = fluid.ParamAttr(initializer=fluid.initializer.UniformInitializer(low=-0.5, high=0.5))
hidden_size = 128
batch_size = 64
seq_len = 100
attn_fc = fluid.dygraph.nn.Linear(hidden_size,
                                  hidden_size,
                                  param_attr=param_attr,
                                  bias_attr=False)
with fluid.dygraph.guard():
    test_sample = np.random.rand(batch_size, seq_len, hidden_size) - .5
    test_sample = fluid.dygraph.to_variable(test_sample.astype(np.float32))
attn_fc(test_sample).shape  # this line raises the error in the GPU environment
Error message:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-a153b84de2d5> in <module>
12 test_sample = np.random.rand(batch_size, seq_len, hidden_size) - .5
13 test_sample = fluid.dygraph.to_variable(test_sample.astype(np.float32))
---> 14 attn_fc(test_sample).shape
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
302 self._built = True
303
--> 304 outputs = self.forward(*inputs, **kwargs)
305 return outputs
306
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/nn.py in forward(self, input)
941 tmp = self._helper.create_variable_for_type_inference(self._dtype)
942 self._helper.append_op(
--> 943 type="matmul", inputs=inputs, outputs={"Out": tmp}, attrs=attrs)
944 if self.bias:
945 pre_activation = self._helper.create_variable_for_type_inference(
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layer_object_helper.py in append_op(self, type, inputs, outputs, attrs, stop_gradient)
50 outputs=outputs,
51 attrs=attrs,
---> 52 stop_gradient=stop_gradient)
53
54 def _multiple_input(self, inputs_in):
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py in append_op(self, *args, **kwargs)
2523 inputs=kwargs.get("inputs", None),
2524 outputs=kwargs.get("outputs", None),
-> 2525 attrs=kwargs.get("attrs", None))
2526
2527 self.ops.append(op)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py in __init__(***failed resolving arguments***)
1833 "or one of [str, bytes, Variable] in python3."
1834 "but received : %s" %
-> 1835 (in_proto.name, type, arg))
1836 self.desc.set_input(in_proto.name, in_arg_names)
1837 else:
TypeError: __str__ returned non-string (type NoneType)
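A note on reading this traceback: the final frame is the `"... %s" % (in_proto.name, type, arg)` formatting of an error message, so the `TypeError` shown is most likely raised while that message was being built, and it hides the message that was about to be reported. The minimal sketch below reproduces this masking effect in plain Python. `BrokenVariable`, `build_error_message`, and `describe_failure` are hypothetical names used only for illustration; they are not part of Paddle, and this is not a claim about how Paddle's `Variable` is implemented.

```python
class BrokenVariable:
    """Hypothetical stand-in for an object whose __str__ returns None."""

    def __str__(self):
        # Bug for illustration: __str__ must return a str, not None.
        return None


def build_error_message(arg):
    # Same shape as the formatting seen in the last traceback frame.
    return "but received : %s" % (arg,)


def describe_failure(arg):
    # Formatting the message itself fails, so the intended message is lost
    # and only the TypeError from str() conversion is visible.
    try:
        return build_error_message(arg)
    except TypeError as exc:
        return "masked by: %s" % exc


message = describe_failure(BrokenVariable())
print(message)
```

If this reading is right, the `__str__` failure is a secondary symptom, and the input check that was trying to report the problem with the `matmul` inputs is the place to look.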
If attn_fc(test_sample).shape is moved inside the with fluid.dygraph.guard(): block, the kernel crashes and restarts. The error output in Jupyter Notebook is:
terminate called after throwing an instance of 'pybind11::error_already_set'
what(): TypeError: __repr__ returned non-string (type NoneType)
W0330 19:47:11.587920 26504 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0330 19:47:11.587961 26504 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0330 19:47:11.587975 26504 init.cc:214] The detail failure signal is:
W0330 19:47:11.587991 26504 init.cc:217] *** Aborted at 1585568831 (unix time) try "date -d @1585568831" if you are using GNU date ***
W0330 19:47:11.590921 26504 init.cc:217] PC: @ 0x0 (unknown)
W0330 19:47:11.591012 26504 init.cc:217] *** SIGABRT (@0x3ec00006788) received by PID 26504 (TID 0x7ff630731700) from PID 26504; stack trace: ***
W0330 19:47:11.593632 26504 init.cc:217] @ 0x7ff630319390 (unknown)
W0330 19:47:11.596221 26504 init.cc:217] @ 0x7ff62ff73428 gsignal
W0330 19:47:11.598780 26504 init.cc:217] @ 0x7ff62ff7502a abort
W0330 19:47:11.609251 26504 init.cc:217] @ 0x7ff62d21184a __gnu_cxx::__verbose_terminate_handler()
W0330 19:47:11.625243 26504 init.cc:217] @ 0x7ff62d20ff47 __cxxabiv1::__terminate()
W0330 19:47:11.627378 26504 init.cc:217] @ 0x7ff62d20ff7d std::terminate()
W0330 19:47:11.629420 26504 init.cc:217] @ 0x7ff62d21015a __cxa_throw
W0330 19:47:11.630650 26504 init.cc:217] @ 0x7ff581ae46c8 pybind11::cpp_function::dispatcher()
W0330 19:47:11.631027 26504 init.cc:217] @ 0x55c5d5fd8c54 _PyCFunction_FastCallDict
W0330 19:47:11.634313 26504 init.cc:217] @ 0x55c5d6060abc call_function
W0330 19:47:11.634702 26504 init.cc:217] @ 0x55c5d608375a _PyEval_EvalFrameDefault
W0330 19:47:11.635062 26504 init.cc:217] @ 0x55c5d605b2db _PyFunction_FastCallDict
W0330 19:47:11.635416 26504 init.cc:217] @ 0x55c5d5fd901f _PyObject_FastCallDict
W0330 19:47:11.635766 26504 init.cc:217] @ 0x55c5d5fddaa3 _PyObject_Call_Prepend
W0330 19:47:11.636147 26504 init.cc:217] @ 0x55c5d5fd8a5e PyObject_Call
W0330 19:47:11.636529 26504 init.cc:217] @ 0x55c5d6084e37 _PyEval_EvalFrameDefault
W0330 19:47:11.636781 26504 init.cc:217] @ 0x55c5d6059e66 _PyEval_EvalCodeWithName
W0330 19:47:11.637133 26504 init.cc:217] @ 0x55c5d605b37e _PyFunction_FastCallDict
W0330 19:47:11.651616 26504 init.cc:217] @ 0x55c5d5fd901f _PyObject_FastCallDict
W0330 19:47:11.651965 26504 init.cc:217] @ 0x55c5d5fddaa3 _PyObject_Call_Prepend
W0330 19:47:11.652350 26504 init.cc:217] @ 0x55c5d5fd8a5e PyObject_Call
W0330 19:47:11.652570 26504 init.cc:217] @ 0x55c5d6032371 slot_tp_call
W0330 19:47:11.652917 26504 init.cc:217] @ 0x55c5d5fd8e3b _PyObject_FastCallDict
W0330 19:47:11.653172 26504 init.cc:217] @ 0x55c5d6060c0e call_function
W0330 19:47:11.653553 26504 init.cc:217] @ 0x55c5d608375a _PyEval_EvalFrameDefault
W0330 19:47:11.653921 26504 init.cc:217] @ 0x55c5d605b9b9 PyEval_EvalCodeEx
W0330 19:47:11.654268 26504 init.cc:217] @ 0x55c5d605c75c PyEval_EvalCode
W0330 19:47:11.654549 26504 init.cc:217] @ 0x55c5d6081167 builtin_exec
W0330 19:47:11.654906 26504 init.cc:217] @ 0x55c5d5fd8b91 _PyCFunction_FastCallDict
W0330 19:47:11.655164 26504 init.cc:217] @ 0x55c5d6060abc call_function
W0330 19:47:11.655566 26504 init.cc:217] @ 0x55c5d608375a _PyEval_EvalFrameDefault
W0330 19:47:11.655912 26504 init.cc:217] @ 0x55c5d6063be6 _PyGen_Send
[I 19:47:14.035 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel bcda5f6f-18a7-4a3e-9554-b22f12e740bc restarted