When an entity (e.g. '哈尔滨', Harbin) is masked, is the MASK mechanism the same for every character of the entity?
Created by: lexmen318
Judging from the following code in batching.py, it does not seem to be quite the same:

    prob = prob_mask[prob_index + index]
    base_prob = 1.0
    if index == beg:
        base_prob = 0.15
    if base_prob * 0.2 < prob <= base_prob:
        mask_label.append(sent[index])
        sent[index] = MASK
        mask_flag = True
        mask_pos.append(sent_index * max_len + index)
    elif base_prob * 0.1 < prob <= base_prob * 0.2:
        mask_label.append(sent[index])
        sent[index] = replace_ids[prob_index + index]
        mask_flag = True
        mask_pos.append(sent_index * max_len + index)
    else:
        mask_label.append(sent[index])
        mask_pos.append(sent_index * max_len + index)
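To make the difference concrete, here is a rough sketch that works out how wide each branch's `prob` interval is for the first character (`base_prob = 0.15`) versus the remaining characters (`base_prob = 1.0`). It assumes `prob` is drawn uniformly from [0, 1], which the excerpt does not show, and the helper name `branch_widths` is made up for illustration. It reflects only the lines quoted above; any check elsewhere in batching.py would change the figures.

```python
def branch_widths(base_prob):
    """Interval width of `prob` for each branch in the quoted excerpt.

    Assumption: prob is uniform on [0, 1], so a width equals a probability.
    """
    # base_prob * 0.2 < prob <= base_prob  ->  replaced by MASK
    mask = base_prob - base_prob * 0.2
    # base_prob * 0.1 < prob <= base_prob * 0.2  ->  replaced by a token from replace_ids
    replace = base_prob * 0.2 - base_prob * 0.1
    # everything else reaches the else branch: token left as-is, label still recorded
    keep = 1.0 - mask - replace
    return {"mask": mask, "replace": replace, "keep": keep}


first_char = branch_widths(0.15)   # index == beg
other_chars = branch_widths(1.0)   # every later character of the entity

for name, widths in (("first", first_char), ("other", other_chars)):
    print(name, {k: round(v, 3) for k, v in widths.items()})
```

Under that assumption, the first character is replaced by MASK for about 12% of `prob` values, while each later character is replaced for about 80%, which is the asymmetry the question is about.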