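When fine-tuning the pretrained ERNIE model with extra features, saving a model checkpoint fails with an error saying the variable is not in the block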
Created by: MiloHang
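I am using the pretrained ERNIE model for text classification and need to add my own extra features. I have subclassed and extended my own Reader, Task, and Dataset, and training starts normally, but an error is raised when a checkpoint is saved during training.
Steps to reproduce
1. I added my own feature to the input variable list; the dataset and reader have been extended accordingly.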
feed_list = [
    inputs["input_ids"].name,
    inputs["position_ids"].name,
    inputs["segment_ids"].name,
    inputs["input_mask"].name,
    "concept_ids"
]
2. I customized the Task's network-building event.
def _build_net(self):
    ...
    concept_ids = fluid.data(name='concept_ids',
                             shape=[-1, self.max_concept_sequence_len, 1],
                             dtype='int64')
    ...
3. Start fine-tuning.
run_states = cls_task.finetune()
4. Console output
[2020-03-26 22:57:38,500] [ INFO] - PaddleHub finetune start
[2020-03-26 22:57:38,925] [ INFO] - Saving model checkpoint to ernie_tity_single_intent_cls\step_1
2020-03-26 22:57:39,450-WARNING: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
Traceback (most recent call last):
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-18576bfe9166>", line 1, in <module>
runfile('D:/dev/projects/local/python/research/user-intent-classification/models/conco_ernie/single_intent/single_intent_conco_ernie_tiny.py', wdir='D:/dev/projects/local/python/research/user-intent-classification/models/conco_ernie/single_intent')
File "D:\Application\PyCharm 2019.3.2\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "D:\Application\PyCharm 2019.3.2\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:/dev/projects/local/python/research/user-intent-classification/models/conco_ernie/single_intent/single_intent_conco_ernie_tiny.py", line 118, in <module>
run_states = cls_task.finetune()
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\paddlehub\finetune\task\base_task.py", line 784, in finetune
self.save_checkpoint()
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\paddlehub\finetune\task\base_task.py", line 719, in save_checkpoint
self.save_inference_model(dirname=model_saved_dir)
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\paddlehub\finetune\task\base_task.py", line 760, in save_inference_model
params_filename=params_filename)
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\paddle\fluid\io.py", line 1189, in save_inference_model
prepend_feed_ops(main_program, feeded_var_names)
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\paddle\fluid\io.py", line 988, in prepend_feed_ops
out = global_block.var(name)
File "D:\dev\Anaconda\envs\paddle\lib\site-packages\paddle\fluid\framework.py", line 2263, in var
raise ValueError("var %s not in this block" % name)
ValueError: var concept_ids not in this block
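After tracing the call stack, I found that the prepend_feed_ops function (in io.py, according to the traceback) looks up each feed variable in the global block, and the var lookup in framework.py raises the error because concept_ids does not exist there. Does data defined with fluid.data inside the _build_net event not end up in the global block? If so, where should I define it?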
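One way to narrow this down might be to compare the feed list against the variable names of the block that is about to be saved, before the save is attempted. The sketch below is plain Python and does not call Paddle; the helper name missing_feed_vars and the sample name lists are hypothetical stand-ins for illustration. It only reproduces the membership check that the traceback shows failing, and it does not establish why the variable is missing.

```python
from typing import Iterable, List


def missing_feed_vars(feed_names: Iterable[str],
                      block_var_names: Iterable[str]) -> List[str]:
    """Return the feed names that are absent from the block, in feed order."""
    known = set(block_var_names)
    return [name for name in feed_names if name not in known]


# Hypothetical variable names standing in for a program's global block.
block_vars = ["input_ids", "position_ids", "segment_ids", "input_mask"]

# Feed names mirroring step 1, written out as plain strings.
feed_list = ["input_ids", "position_ids", "segment_ids", "input_mask",
             "concept_ids"]

print(missing_feed_vars(feed_list, block_vars))  # prints ['concept_ids']

# Assumption (fluid 1.x API, not executed here): with Paddle itself, the names
# could be taken from the program that is being saved, for example
#   block_vars = main_program.global_block().vars.keys()
```

If such a check reports concept_ids as missing only for the program used when saving, and not for the training program, that would suggest the variable was created in a different program than the one passed to save_inference_model. This is a guess to verify, not a confirmed diagnosis.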