Paddle 1.5: lstm output size is wrong
Created by: eshaoliu
- Title: lstm output size is wrong
- Version / environment info: 1) PaddlePaddle version: 1.5.1
- Model info: lstm (https://www.paddlepaddle.org.cn/documentation/docs/zh/1.5/api_cn/layers_cn/nn_cn.html), ernie
- Reproduction info:
- Problem description:

Code snippet:

```python
def seq_from_ernie(
    args,
    src_ids,
    position_ids,
    sentence_ids,
    task_ids,
    input_mask,
    config,
    use_fp16,
):
    """cls_from_ernie"""
    ernie = ErnieModel(
        src_ids=src_ids,
        position_ids=position_ids,
        sentence_ids=sentence_ids,
        task_ids=task_ids,
        input_mask=input_mask,
        config=config,
        use_fp16=use_fp16,
    )
    seq_feats = ernie.get_sequence_output()
    '''
    seq_feats = fluid.layers.dropout(
        x=seq_feats,
        dropout_prob=0.1,
        dropout_implementation="upscale_in_train",
    )
    '''
    return seq_feats
```
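For context on the shapes involved, here is a comment-only sketch of the fluid.layers.lstm contract as the linked 1.5 API page describes it (paraphrased from memory of the docs, not taken from this issue; worth double-checking against the page itself):

```python
# ernie.get_sequence_output() -> [batch_size, max_seq_len, 768] (batch-major);
# after the transpose further down it appears in the logs as [128, -1, 768].
#
# fluid.layers.lstm(input, init_h, init_c, max_len, hidden_size, num_layers,
#                   dropout_prob=0.0, is_bidirec=False, ...)
# expects time-major input and, per the docs, uses/returns:
#   input:    [seq_len, batch_size, input_dim]
#   init_h/c: [num_layers, batch_size, hidden_size]
#             ([num_layers * 2, batch_size, hidden_size] when is_bidirec=True)
#   rnn_out:  [seq_len, batch_size, hidden_size]
#             ([seq_len, batch_size, hidden_size * 2] when is_bidirec=True)
```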
```python
if is_classify:
    pyreader = fluid.layers.py_reader(
        capacity=50,
        shapes=[
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            [-1, args.max_seq_len, 1],
            #[-1, args.max_seq_len, 1], [-1, args.max_seq_len, 1],
            #[-1, args.max_seq_len, 1], [-1, args.max_seq_len, 1],
            #[-1, args.max_seq_len, 1],
            [-1, 1],
            [-1, 1],
            [-1, 1],
        ],
        dtypes=[
            'int64', 'int64', 'int64', 'int64', 'float32',
            'int64', 'int64', 'int64', 'int64', 'float32',
            #'int64', 'int64', 'int64', 'int64', 'float32',
            'int64', 'int64', 'int64',
        ],
        lod_levels=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        name=task_name + "_" + pyreader_name,
        use_double_buffer=True,
    )
    (
        src_ids_1, sent_ids_1, pos_ids_1, task_ids_1, input_mask_1,
        src_ids_2, sent_ids_2, pos_ids_2, task_ids_2, input_mask_2,
        #src_ids_3, sent_ids_3, pos_ids_3, task_ids_3, input_mask_3,
        labels, types, qids,
    ) = fluid.layers.read_file(pyreader)
    seq_feats_query = seq_from_ernie(
        args,
        src_ids=src_ids_1,
        position_ids=pos_ids_1,
        sentence_ids=sent_ids_1,
        task_ids=task_ids_1,
        input_mask=input_mask_1,
        config=ernie_config,
        use_fp16=args.use_fp16,
    )
    seq_feats_left = seq_from_ernie(
        args,
        src_ids=src_ids_2,
        position_ids=pos_ids_2,
        sentence_ids=sent_ids_2,
        task_ids=task_ids_2,
        input_mask=input_mask_2,
        config=ernie_config,
        use_fp16=args.use_fp16,
    )

    seq_feats_query = fluid.layers.transpose(seq_feats_query, [1, 0, 2])
    seq_feats_left = fluid.layers.transpose(seq_feats_left, [1, 0, 2])
    print('seq_feats_query', seq_feats_query)

    num_layers, batch_size, hidden_size, max_len = 1, 32, 64, 128
    query_init_h = fluid.layers.fill_constant(
        [num_layers, batch_size, hidden_size], 'float32', 0.0
    )
    query_init_c = fluid.layers.fill_constant(
        [num_layers, batch_size, hidden_size], 'float32', 0.0
    )
    query_rnn_out, query_last_h, query_last_c = fluid.layers.lstm(
        seq_feats_query, query_init_h, query_init_c, max_len,
        hidden_size, num_layers, is_bidirec=True
    )
    print('query_rnn_out', query_rnn_out)
    print('seq_feats_left', seq_feats_left)

    left_init_h = fluid.layers.fill_constant(
        [num_layers, batch_size, hidden_size], 'float32', 0.0
    )
    left_init_c = fluid.layers.fill_constant(
        [num_layers, batch_size, hidden_size], 'float32', 0.0
    )
    left_rnn_out, left_last_h, left_last_c = fluid.layers.lstm(
        seq_feats_left, left_init_h, left_init_c, max_len,
        hidden_size, num_layers, is_bidirec=True
    )
    #query_rnn_out = fluid.layers.transpose(query_rnn_out, [1, 0, 2])
    #left_rnn_out = fluid.layers.transpose(left_rnn_out, [1, 0, 2])
    print('left_rnn_out', left_rnn_out)
```

To walk through the whole flow: ernie's get_sequence_output first yields a vector (768-dim) for every position in the sequence, and the sequence is then passed through the lstm. By rights, the last dimension of the lstm output should be the configured hidden_size * 2 (bidirectional, so 64 * 2 = 128), but that is not what happens.

Input data (the third dim is the original dimension, 768):

```
seq_feats_query name: "transpose_96.tmp_0"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: 128
      dims: -1
      dims: 768
    }
  }
}
persistable: false
```

Output after the lstm (the third dim is not the expected hidden_size * 2 = 128 but is still 768):

```
query_rnn_out name: "cudnn_lstm_0.tmp_0"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: 128
      dims: -1
      dims: 768
    }
  }
}
persistable: false
```

This does not match what the documentation describes.
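The dims printed above come from compile-time shape inference on the Variable rather than from an actual tensor, so it is worth checking whether the runtime output is also 768-wide or whether only the inferred shape is off. Below is a minimal, self-contained sketch (no ERNIE; random input; the dimensions mirror the snippet above, and the init_h/init_c shapes follow the documented num_layers * 2 rule for the bidirectional case):

```python
import numpy as np
import paddle.fluid as fluid

# Same dims as the repro above: [seq_len, batch, input_dim] time-major input.
seq_len, batch, input_dim, hidden_size, num_layers = 128, 32, 768, 64, 1
x = fluid.layers.data(name='x', shape=[seq_len, batch, input_dim],
                      dtype='float32', append_batch_size=False)
# Per the docs, bidirectional initial states use a leading dim of num_layers * 2.
init_h = fluid.layers.fill_constant([num_layers * 2, batch, hidden_size],
                                    'float32', 0.0)
init_c = fluid.layers.fill_constant([num_layers * 2, batch, hidden_size],
                                    'float32', 0.0)
rnn_out, last_h, last_c = fluid.layers.lstm(
    x, init_h, init_c, seq_len, hidden_size, num_layers, is_bidirec=True
)
print('inferred shape:', rnn_out.shape)  # what the logs above are printing

place = fluid.CUDAPlace(0)  # fluid.layers.lstm is cuDNN-based, GPU only
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
out, = exe.run(
    feed={'x': np.random.rand(seq_len, batch, input_dim).astype('float32')},
    fetch_list=[rnn_out],
)
print('runtime shape:', out.shape)  # docs say (128, 32, 128) when is_bidirec=True
```

If the fetched array already comes out as (128, 32, 128), only the inferred shape on the Variable is wrong; if it is still 768-wide, the op itself misbehaves.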
The second pair behaves the same way: the input is 768-dim, and the output hidden size is still 768:

```
seq_feats_left name: "transpose_97.tmp_0"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: 128
      dims: -1
      dims: 768
    }
  }
}
persistable: false

left_rnn_out name: "cudnn_lstm_1.tmp_0"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: 128
      dims: -1
      dims: 768
    }
  }
}
persistable: false
```
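Separately, one deviation from the documented contract is worth ruling out while this is investigated: the repro builds query_init_h / query_init_c (and the left_* counterparts) with a leading dim of num_layers while passing is_bidirec=True, whereas the linked docs state that the initial states of a bidirectional lstm should have a leading dim of num_layers * 2. This may well be unrelated to the dims printed above (those come from shape inference), but it is easy to check. A sketch of the adjusted initialization, reusing the variable names from the snippet above:

```python
# Assumption (from the linked fluid.layers.lstm docs, not from this issue):
# bidirectional initial states need shape [num_layers * 2, batch_size, hidden_size].
direc_num = 2  # because is_bidirec=True
query_init_h = fluid.layers.fill_constant(
    [num_layers * direc_num, batch_size, hidden_size], 'float32', 0.0
)
query_init_c = fluid.layers.fill_constant(
    [num_layers * direc_num, batch_size, hidden_size], 'float32', 0.0
)
```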