No output for more than half an hour after "Init parameters done."
Created by: LoganZhou
Yesterday I tried feeding multi-dimensional features into the network for training, with one data_layer input per feature dimension, but I got an error at the feature-fusion step (see this issue for details). Today I changed the approach and fused the features in the dataprovider instead: the multi-dimensional features are converted to one-hot vectors with sklearn and fed into the network as two lists. Now training gets stuck at "Init parameters done." with no output for more than half an hour and no error message of any kind. The entire output is:
Paddle release a new version 0.9.0, you can get the install package in http://www.paddlepaddle.org
I0423 16:22:37.310995 1770 Util.cpp:154] commandline: /home/admin/Paddle-CPU-bin/Paddle-CPU/bin/../opt/paddle/bin/paddle_trainer --config=my_trainer_config.py --save_dir=./output/combined_feature --trainer_count=12 --log_period=1000 --dot_period=10 --num_passes=10 --use_gpu=false --show_parameter_stats_period=3000
[WARNING 2017-04-23 16:22:37,554 networks.py:1438] `outputs` routine try to calculate network's inputs and outputs order. It might not work well.Please see follow log carefully.
[INFO 2017-04-23 16:22:37,556 networks.py:1466] The input order is [id_and_time, train_spd, label_5min, label_10min, label_15min, label_20min, label_25min, label_30min, label_35min, label_40min, label_45min, label_50min, label_55min, label_60min, label_65min, label_70min, label_75min, label_80min, label_85min, label_90min, label_95min, label_100min, label_105min, label_110min, label_115min, label_120min]
[INFO 2017-04-23 16:22:37,556 networks.py:1472] The output order is [cost_5min, cost_10min, cost_15min, cost_20min, cost_25min, cost_30min, cost_35min, cost_40min, cost_45min, cost_50min, cost_55min, cost_60min, cost_65min, cost_70min, cost_75min, cost_80min, cost_85min, cost_90min, cost_95min, cost_100min, cost_105min, cost_110min, cost_115min, cost_120min]
I0423 16:22:37.564220 1770 Trainer.cpp:175] trainer mode: Normal
I0423 16:22:37.618906 1770 PyDataProvider2.cpp:243] loading dataprovider my_dataprovider::process
I0423 16:22:37.816644 1770 PyDataProvider2.cpp:243] loading dataprovider my_dataprovider::process
I0423 16:22:37.817775 1770 GradientMachine.cpp:135] Initing parameters..
I0423 16:22:37.833380 1770 GradientMachine.cpp:142] Init parameters done.
I also printed the dataprovider's output, and there is nothing wrong with it:
{'label_105min': 0, 'label_115min': 0, 'label_75min': 0, 'label_5min': 0, 'label_120min': 0,
'label_95min': 0, 'label_25min': 0, 'label_40min': 0, 'label_10min': 0, 'label_85min': 0,
'label_60min': 0, 'label_90min': 0, 'label_110min': 0, 'label_15min': 0, 'label_80min': 0,
'label_50min': 0, 'label_30min': 0, 'label_35min': 0, 'label_70min': 0, 'label_65min': 0, 'label_55min': 0,
'id_and_time': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
'train_spd': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
'label_100min': 0, 'label_20min': 0, 'label_45min': 0}
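Roughly, the dataprovider looks like this (a simplified sketch, not the full script: parse_line() and the NODE_NUM/SLOT_NUM/TERM_NUM constants are placeholders for the real feature extraction, and the one-hot step uses the old sklearn OneHotEncoder n_values API):

# my_dataprovider.py -- simplified sketch, not the full script
from paddle.trainer.PyDataProvider2 import *
from sklearn.preprocessing import OneHotEncoder

NODE_NUM = 350   # placeholder: number of node ids
SLOT_NUM = 24    # placeholder: number of time slots
TERM_NUM = 24    # placeholder: must match the network config
ID_AND_TIME_LEN = NODE_NUM + SLOT_NUM

FORECASTING_NUM = 24  # label_5min ... label_120min, in 5-minute steps

# one-hot encoder over the (node id, time slot) pair
# (old sklearn API; newer versions use categories= instead of n_values=)
encoder = OneHotEncoder(n_values=[NODE_NUM, SLOT_NUM])
encoder.fit([[0, 0]])

# one slot per data_layer; the names must match the network config exactly
input_types = {
    'id_and_time': integer_value_sequence(ID_AND_TIME_LEN + 1),
    'train_spd': integer_value_sequence(TERM_NUM),
}
for i in xrange(FORECASTING_NUM):
    input_types['label_%dmin' % ((i + 1) * 5)] = integer_value(4)

@provider(input_types=input_types)
def process(settings, filename):
    with open(filename) as f:
        for line in f:
            # parse_line() stands in for the real feature extraction
            node_id, time_slot, spd_seq, labels = parse_line(line)
            onehot = encoder.transform([[node_id, time_slot]]).toarray()[0]
            sample = {
                'id_and_time': [int(v) for v in onehot],  # the 0/1 list above
                'train_spd': spd_seq,
            }
            for i in xrange(FORECASTING_NUM):
                sample['label_%dmin' % ((i + 1) * 5)] = labels[i]
            yield sample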
I also tried both the CPU build and the GPU build on two separate nodes, but the result is the same. The modified network configuration is as follows:
output_label = []

# id-and-time input (integer_value_sequence)
id_time = data_layer(name='id_and_time', size=ID_AND_TIME_LEN + 1)
id_time_emb = embedding_layer(input=id_time, size=emb_size)
id_time_fc = fc_layer(input=id_time_emb, size=emb_size)

# train_spd input (integer_value_sequence)
train_spd = data_layer(name='train_spd', size=TERM_NUM)
train_spd_emb = embedding_layer(input=train_spd, size=24)
train_spd_fc = fc_layer(input=train_spd_emb, size=24)

# fuse the two feature streams
node_combined_feature = fc_layer(
    input=[id_time_fc, train_spd_fc],
    size=256,
    act=TanhActivation())

# LSTM network
lstm = simple_lstm(
    input=node_combined_feature,
    size=128,
    lstm_cell_attr=ExtraAttr(drop_rate=0.25))
lstm_max = pooling_layer(input=lstm, pooling_type=MaxPooling())

# one 4-class softmax head per forecasting horizon
for i in xrange(FORECASTING_NUM):
    score = fc_layer(input=lstm_max, size=4, act=SoftmaxActivation())
    if is_predict:
        maxid = maxid_layer(input=score)
        output_label.append(maxid)
    else:
        # Multi-task training.
        label = data_layer(name='label_%dmin' % ((i + 1) * 5), size=4)
        cls = classification_cost(
            input=score, name="cost_%dmin" % ((i + 1) * 5), label=label)
        output_label.append(cls)

outputs(output_label)
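For completeness, the rest of my_trainer_config.py just wires in the dataprovider and optimizer in the usual way; roughly like this (the batch size and optimizer settings shown here are placeholders, not my exact values):

# top of my_trainer_config.py -- hookup sketch
from paddle.trainer_config_helpers import *

# this produces the two "loading dataprovider my_dataprovider::process"
# lines in the log (one for train, one for test)
define_py_data_sources2(
    train_list='data/train.list',
    test_list='data/test.list',
    module='my_dataprovider',
    obj='process')

settings(
    batch_size=128,
    learning_rate=1e-3,
    learning_method=AdamOptimizer())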