Error when calling scaled_dot_product_attention
Created by: imwujue
Calling code:

```python
def cross_attention_layer(query, title):
    pad_value = fluid.layers.assign(
        input=numpy.array([0.0], dtype=numpy.float32))
    query_new = layers.sequence_pad(query, pad_value=pad_value)
    title_new = layers.sequence_pad(title, pad_value=pad_value)
    layers.Print(query, message="query")
    layers.Print(title, message="title")
    layers.Print(query_new[0], message='query_pad')
    layers.Print(title_new[0], message='title_pad')
    print(query_new[0], title_new[0])
    context = nets.scaled_dot_product_attention(
        queries=title_new[0], keys=query_new[0], values=query_new[0])
    layers.Print(query_new[0], message='query_new')
    return context
```
Error message:
C++ Call Stacks (More useful to developers):
```
0  std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1  paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2  paddle::operators::MatMulOp::InferShape(paddle::framework::InferShapeContext*) const
3  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
4  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
5  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
6  paddle::framework::HogwildWorker::TrainFiles()
```
Python Call Stacks (More useful to users):
```
File "/home/work/python_dx/lib/python2.7/site-packages/paddle/fluid/framework.py", line 2500, in append_op
    attrs=kwargs.get("attrs", None))
File "/home/work/python_dx/lib/python2.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/work/python_dx/lib/python2.7/site-packages/paddle/fluid/layers/nn.py", line 4560, in matmul
    'alpha': float(alpha),
File "/home/work/python_dx/lib/python2.7/site-packages/paddle/fluid/nets.py", line 547, in scaled_dot_product_attention
    ctx_multiheads = layers.matmul(weights, v)
File "/home/work/baidu/personal-code/wujue/pyramid_rnn_multi_attention_bigram_site/pyramid_layers.py", line 166, in cross_attention_layer
    context = nets.scaled_dot_product_attention(queries=title_new[0], keys=query_new[0], values=query_new[0])
File "/home/work/baidu/personal-code/wujue/pyramid_rnn_multi_attention_bigram_site/pyramid_train_net.py", line 111, in simnet
    cross_attention = layers.cross_attention_layer(gru_encoder)
File "/home/work/baidu/personal-code/wujue/pyramid_rnn_multi_attention_bigram_site/pyramid_train_net.py", line 153, in train
    qps = self.simnet(q_data=q_data, u_data=p_u_data, is_train=1)
File "pyramid_train_pe.py", line 47, in train
    avg_cost, pos_score, neg_score, pnum, nnum, train_pn = pyramid.train(input_data)
File "pyramid_train_pe.py", line 220, in train
    (pyramid_config, args)
```
Error Message Summary:
Error: ShapeError: The batch size of the two matrices should be equal, or at least one is zero. But received X's shape: [2016, 1, 1], Y's shape: [12, 12, 256]. at (/home/baidu/cm_develop/Paddle/paddle/fluid/operators/matmul_op.cc:331) [operator < matmul > error]
Printed values of query and title, and of query and title after padding:

```
597125298 query  The place is:CPUPlace
Tensor[concat_0.tmp_0]
    shape: [72,256,]
    dtype: f
    LoD: [[ 0,6,12,22,27,32,37,42,48,52,55,60,72, ]]
    data: -0.000221436,-0.000367491,0.000374207,-7.22531e-05,0.000665179,-0.000853978,0.000652466,0.000449938,-7.64808e-05,0.00143405,0.000343646,-0.000226622,-4.05259e-06,-0.000125455,0.000420233,-0.00011094,-0.00045795,-0.00189013,0.000222249,-0.000519661,

1597125298 title  The place is:CPUPlace
Tensor[concat_2.tmp_0]
    shape: [87,256,]
    dtype: f
    LoD: [[ 0,14,21,34,42,47,52,55,60,66,70,76,87, ]]
    data: -0.000221436,-0.000367491,0.000374207,-7.22531e-05,0.000665179,-0.000853978,0.000652466,0.000449938,-7.64808e-05,0.00143405,0.000343646,-0.000226622,-4.05259e-06,-0.000125455,0.000420233,-0.00011094,-0.00045795,-0.00189013,0.000222249,-0.000519661,

1597125298 query_pad  The place is:CPUPlace
Tensor[sequence_pad_0.tmp_0]
    shape: [12,12,256,]
    dtype: f
    data: -0.000221436,-0.000367491,0.000374207,-7.22531e-05,0.000665179,-0.000853978,0.000652466,0.000449938,-7.64808e-05,0.00143405,0.000343646,-0.000226622,-4.05259e-06,-0.000125455,0.000420233,-0.00011094,-0.00045795,-0.00189013,0.000222249,-0.000519661,

1597125298 title_pad  The place is:CPUPlace
Tensor[sequence_pad_1.tmp_0]
    shape: [12,14,256,]
    dtype: f
    data: -0.000221436,-0.000367491,0.000374207,-7.22531e-05,0.000665179,-0.000853978,0.000652466,0.000449938,-7.64808e-05,0.00143405,0.000343646,-0.000226622,-4.05259e-06,-0.000125455,0.000420233,-0.00011094,-0.00045795,-0.00189013,0.000222249,-0.000519661,
```
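For reference, the padded shapes printed above (queries `[12, 14, 256]`, keys/values `[12, 12, 256]`) are mutually compatible for plain single-head scaled dot-product attention, so the inputs themselves look fine. Below is a minimal NumPy sketch of the expected shape math (an illustration, not Paddle's actual implementation); note that `12 * 14 * 12 = 2016`, which matches the `X's shape: [2016, 1, 1]` in the error, suggesting the mismatch comes from an internal reshape inside `nets.scaled_dot_product_attention` before `layers.matmul(weights, v)`:

```python
import numpy as np

# Shapes taken from the Print output above: batch 12, query length 14,
# key/value length 12, hidden size 256 (single-head attention assumed).
q = np.zeros((12, 14, 256))  # title_new[0] after sequence_pad
k = np.zeros((12, 12, 256))  # query_new[0] after sequence_pad
v = np.zeros((12, 12, 256))

# Batched Q @ K^T, scaled by sqrt(d): (12, 14, 256) @ (12, 256, 12)
scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (12, 14, 12)

# Softmax over the key axis.
e = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = e / e.sum(axis=-1, keepdims=True)                # (12, 14, 12)

# Batched weights @ V is well-defined: (12, 14, 12) @ (12, 12, 256)
context = weights @ v                                      # (12, 14, 256)

assert scores.shape == (12, 14, 12)
assert context.shape == (12, 14, 256)

# The flattened attention-weight tensor has 12 * 14 * 12 = 2016 elements,
# which matches the X shape [2016, 1, 1] reported by the matmul error.
print(12 * 14 * 12)  # 2016
```

If the shape math holds but the op still fails, the internal reshape (or a non-default `num_heads`) is the likely culprit, so comparing against this reference may help isolate where the `[2016, 1, 1]` tensor appears.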