How to pass a 2-dimensional sequence to LSTM?
Created by: ganji15
I found that the given examples process a 1-dimensional sequence with an RNN by passing the input sequence to an embedding layer, which transforms it into a 2-dimensional sequence (a word-vector matrix, I guess). However, sometimes the input sequence is already 2-dimensional, such as (time_step, 1-dimensional feature_vector) <=> [[f, f, f, f], [f, f, f, f], ..., [f, f, f, f]]. So how can I feed the 2-dimensional sequence directly into an RNN/LSTM?
I tried the following method, but it failed.
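To make the layout concrete, here is a small sketch of the kind of sample I mean: a list of time steps, each a fixed-length float vector, paired with an integer label sequence. The helper names `make_dense_sequence` and `check_dense_sequence` are hypothetical and only for illustration; they are not part of Paddle.

```python
def make_dense_sequence(num_steps, dim):
    """Build a toy sequence: num_steps time steps, each a dim-length float vector."""
    return [[float(t * dim + k) for k in range(dim)] for t in range(num_steps)]


def check_dense_sequence(seq, dim):
    """Return True if every time step is a vector of exactly `dim` values."""
    return all(len(step) == dim for step in seq)


# One sample: 3 time steps of 4 features each, plus an integer label sequence
# whose length does not have to match the number of time steps.
seq = make_dense_sequence(num_steps=3, dim=4)
label = [2, 0]
```

With `dim=4`, this is the shape I would expect to pair with `dense_vector_sequence(4)`.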
DataProvider.py
import cPickle
import random

from paddle.trainer.PyDataProvider2 import *


def hook(settings, input_dim, num_class, is_train, **kwargs):
    # Slot 0: sequence of dense float vectors; slot 1: sequence of integer labels.
    settings.input_types = [
        dense_vector_sequence(int(input_dim)),
        integer_value_sequence(int(num_class))]
    settings.is_train = is_train


@provider(init_hook=hook)
def processData(settings, file_name):
    seqs, labels = cPickle.load(open(file_name, 'rb'))
    indexs = list(range(len(labels)))
    if settings.is_train:
        random.shuffle(indexs)
    for i in indexs:
        seq = seqs[i]  # sequence of fixed-length 1-dimensional feature vectors
        label = labels[i]  # variable-length integer sequence
        yield seq, label
Error Information
I0919 00:57:43.780038 592 Util.cpp:138] commandline: /usr/local/bin/../opt/paddle/bin/paddle_trainer --config=OcrRecognition.py --dot_period=10 --log_period=10 --test_all_data_in_one_period=1 --use_gpu=1 --gpu_id=0 --trainer_count=1 --num_passes=100 --save_dir=./model
I0919 00:57:44.160080 592 Util.cpp:113] Calling runInitFunctions
I0919 00:57:44.160356 592 Util.cpp:126] Call runInitFunctions done.
[WARNING 2016-09-19 00:57:44,193 default_decorators.py:40] please use keyword arguments in paddle config.
[WARNING 2016-09-19 00:57:44,194 default_decorators.py:40] please use keyword arguments in paddle config.
[INFO 2016-09-19 00:57:44,195 networks.py:1122] The input order is [ocr_seq, label]
[INFO 2016-09-19 00:57:44,195 networks.py:1129] The output order is [ctc]
I0919 00:57:44.197438 592 Trainer.cpp:169] trainer mode: Normal
I0919 00:57:44.198976 592 PyDataProvider2.cpp:219] loading dataprovider dataprovider::processData
I0919 00:57:44.201690 592 PyDataProvider2.cpp:219] loading dataprovider dataprovider::processData
I0919 00:57:44.201730 592 GradientMachine.cpp:134] Initing parameters..
I0919 00:57:44.205186 592 GradientMachine.cpp:141] Init parameters done.
*** Aborted at 1474217866 (unix time) try "date -d @1474217866" if you are using GNU date ***
PC: @ 0x7fd4325a5767 (unknown)
*** SIGSEGV (@0x10) received by PID 592 (TID 0x7fd433536840) from PID 16; stack trace: ***
    @ 0x7fd432e28330 (unknown)
    @ 0x7fd4325a5767 (unknown)
    @ 0x7fd43256c444 (unknown)
    @ 0x7fd432644370 (unknown)
    @ 0x7fd4325cf193 (unknown)
    @ 0x7fd43261b3b7 (unknown)
    @ 0x7fd4107d6da4 array_str
    @ 0x7fd4325d258a (unknown)
    @ 0x7fd4325d277a (unknown)
    @ 0x7fd4108ab3dc gentype_repr
    @ 0x7fd432624da0 (unknown)
    @ 0x82abf9 paddle::py::repr()
    @ 0x569eb1 paddle::IndexScanner::fill()
    @ 0x56a2c1 paddle::SequenceScanner::fill()
    @ 0x56d3fc paddle::PyDataProvider2::getNextBatchInternal()
    @ 0x563982 paddle::DataProvider::getNextBatch()
    @ 0x69b437 paddle::Trainer::trainOnePass()
    @ 0x69ecc7 paddle::Trainer::train()
    @ 0x53bf73 main
    @ 0x7fd43144cf45 (unknown)
    @ 0x5475b5 (unknown)
    @ 0x0 (unknown)
/usr/local/bin/paddle: line 81: 592 Segmentation fault (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
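One thing I notice in the trace: the crash goes through paddle::IndexScanner::fill(), then paddle::py::repr(), then gentype_repr and array_str, which look like NumPy routines. My guess (an assumption, not confirmed) is that the integer label slot is receiving NumPy objects rather than plain Python ints. Below is a minimal sketch of a conversion I could apply before `yield`. The helper `to_plain_sample` is hypothetical and not part of Paddle; it only uses built-in `float()` and `int()`, which also accept NumPy scalars.

```python
def to_plain_sample(seq, label):
    """Convert one sample to plain Python types.

    seq:   iterable of time steps, each an iterable of numbers
    label: iterable of integer-like class indices
    Returns (list of lists of float, list of int).
    """
    plain_seq = [[float(x) for x in step] for step in seq]
    plain_label = [int(y) for y in label]
    return plain_seq, plain_label
```

In the provider above this would be used as `yield to_plain_sample(seqs[i], labels[i])`. Whether this removes the segmentation fault is untested.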