训练网络最后一层为负采样nce,预测替换成矩阵映射,保存和load模型的时候出现问题
Created by: ShadowSkyLiu
训练网络最后一层为负采样nce,预测替换成input * nce_w^T + nce_b,保存和load模型的时候出现问题。
-
训练保存模型参数指定filename,load_persistables之后维度不对 训练保存: with open(model_name, "wb") as f: f.write(main_program.desc.serialize_to_string()) fluid.io.save_persistables(exe, save_dirname, main_program, filename=save_path) 预测加载: print("nce_w shape before", fluid.global_scope().find_var("nce_w").get_tensor().shape()) fluid.io.load_persistables(exe, save_dirname, main_program, filename=params_filename) print("nce_w shape after", fluid.global_scope().find_var("nce_w").get_tensor().shape()) ... 打印结果: nce_w shape before [21391L, 31L] nce_w shape after [1L]
-
训练保存模型参数不指定filename,显示加载之后数据维度没有问题 训练保存: with open(model_name, "wb") as f: f.write(main_program.desc.serialize_to_string()) fluid.io.save_persistables(exe, save_dirname, main_program) #, filename=save_path) 预测加载: print("nce_w shape before", fluid.global_scope().find_var("nce_w").get_tensor().shape()) fluid.io.load_persistables(exe, save_dirname, main_program) #, filename=params_filename) print("nce_w shape after", fluid.global_scope().find_var("nce_w").get_tensor().shape()) ... 打印结果: nce_w shape before [21391L, 31L] nce_w shape after [21391L, 31L]
ps:附加网络最后一层定义:
def net(...):
...
if is_train:
cost = fluid.layers.nce(
input=fc,
label=next_id,
num_total_classes=nid_dict_size,
param_attr=fluid.ParamAttr(name="nce_w"),
bias_attr=fluid.ParamAttr(name="nce_b"),
num_neg_samples=5)
return cost, fc
else:
w_var = fluid.layers.create_global_var(
shape=[nid_dict_size, 31], value=1.0,
dtype='float32', persistable=True, name='nce_w')
b_var = fluid.layers.create_global_var(
shape=[nid_dict_size, 1], value=1.0, dtype='float32', persistable=True, name='nce_b')
mul_var = fluid.layers.matmul(fc, w_var, transpose_y=True)
sum_var = fluid.layers.sums(mul_var, b_var)
prediction_layer = fluid.layers.softmax(sum_var)
return prediction_layer, fc