When training KT-NET, running `cd reading_comprehension && sh ./run_record_twomemory.sh` fails with an error
Created by: Soulmate303
09/25/2019 14:17:39 - INFO - utils.args - ----------- Configuration Arguments -----------
09/25/2019 14:17:39 - INFO - utils.args - batch_size: 6
09/25/2019 14:17:39 - INFO - utils.args - bert_config_path: cased_L-24_H-1024_A-16/bert_config.json
09/25/2019 14:17:39 - INFO - utils.args - checkpoints: output/
09/25/2019 14:17:39 - INFO - utils.args - concept_embedding_path: ../retrieve_concepts/KB_embeddings/wn_concept2vec.txt
09/25/2019 14:17:39 - INFO - utils.args - dev_retrieved_nell_concept_path: ../retrieve_concepts/retrieve_nell/output_record/dev.retrieved_nell_concepts.data
09/25/2019 14:17:39 - INFO - utils.args - do_lower_case: False
09/25/2019 14:17:39 - INFO - utils.args - do_predict: True
09/25/2019 14:17:39 - INFO - utils.args - do_train: True
09/25/2019 14:17:39 - INFO - utils.args - do_val: False
09/25/2019 14:17:39 - INFO - utils.args - doc_stride: 128
09/25/2019 14:17:39 - INFO - utils.args - ema_decay: 0.9999
09/25/2019 14:17:39 - INFO - utils.args - epoch: 4
09/25/2019 14:17:39 - INFO - utils.args - freeze: False
09/25/2019 14:17:39 - INFO - utils.args - in_tokens: False
09/25/2019 14:17:39 - INFO - utils.args - init_checkpoint: None
09/25/2019 14:17:39 - INFO - utils.args - init_pretraining_params: cased_L-24_H-1024_A-16/params
09/25/2019 14:17:39 - INFO - utils.args - learning_rate: 3e-05
09/25/2019 14:17:39 - INFO - utils.args - loss_scaling: 1.0
09/25/2019 14:17:39 - INFO - utils.args - lr_scheduler: linear_warmup_decay
09/25/2019 14:17:39 - INFO - utils.args - max_answer_length: 30
09/25/2019 14:17:39 - INFO - utils.args - max_query_length: 64
09/25/2019 14:17:39 - INFO - utils.args - max_seq_len: 384
09/25/2019 14:17:39 - INFO - utils.args - n_best_size: 20
09/25/2019 14:17:39 - INFO - utils.args - null_score_diff_threshold: 0.0
09/25/2019 14:17:39 - INFO - utils.args - num_iteration_per_drop_scope: 1
09/25/2019 14:17:39 - INFO - utils.args - predict_file: ../data//ReCoRD/dev.json
09/25/2019 14:17:39 - INFO - utils.args - random_seed: 45
09/25/2019 14:17:39 - INFO - utils.args - retrieved_synset_path: ../retrieve_concepts/retrieve_wordnet/output_record/retrived_synsets.data
09/25/2019 14:17:39 - INFO - utils.args - save_steps: 4000
09/25/2019 14:17:39 - INFO - utils.args - skip_steps: 10
09/25/2019 14:17:39 - INFO - utils.args - train_file: ../data//ReCoRD/train.json
09/25/2019 14:17:39 - INFO - utils.args - train_retrieved_nell_concept_path: ../retrieve_concepts/retrieve_nell/output_record/train.retrieved_nell_concepts.data
09/25/2019 14:17:39 - INFO - utils.args - use_cuda: True
09/25/2019 14:17:39 - INFO - utils.args - use_ema: True
09/25/2019 14:17:39 - INFO - utils.args - use_fast_executor: False
09/25/2019 14:17:39 - INFO - utils.args - use_fp16: False
09/25/2019 14:17:39 - INFO - utils.args - use_nell: False
09/25/2019 14:17:39 - INFO - utils.args - use_wordnet: True
09/25/2019 14:17:39 - INFO - utils.args - validation_steps: 1000
09/25/2019 14:17:39 - INFO - utils.args - verbose: False
09/25/2019 14:17:39 - INFO - utils.args - version_2_with_negative: False
09/25/2019 14:17:39 - INFO - utils.args - vocab_path: cased_L-24_H-1024_A-16/vocab.txt
09/25/2019 14:17:39 - INFO - utils.args - warmup_proportion: 0.1
09/25/2019 14:17:39 - INFO - utils.args - weight_decay: 0.01
09/25/2019 14:17:39 - INFO - utils.args - ------------------------------------------------
09/25/2019 14:17:39 - INFO - model.bert - attention_probs_dropout_prob: 0.1
09/25/2019 14:17:39 - INFO - model.bert - directionality: bidi
09/25/2019 14:17:39 - INFO - model.bert - hidden_act: gelu
09/25/2019 14:17:39 - INFO - model.bert - hidden_dropout_prob: 0.1
09/25/2019 14:17:39 - INFO - model.bert - hidden_size: 1024
09/25/2019 14:17:39 - INFO - model.bert - initializer_range: 0.02
09/25/2019 14:17:39 - INFO - model.bert - intermediate_size: 4096
09/25/2019 14:17:39 - INFO - model.bert - max_position_embeddings: 512
09/25/2019 14:17:39 - INFO - model.bert - num_attention_heads: 16
09/25/2019 14:17:39 - INFO - model.bert - num_hidden_layers: 24
09/25/2019 14:17:39 - INFO - model.bert - pooler_fc_size: 768
09/25/2019 14:17:39 - INFO - model.bert - pooler_num_attention_heads: 12
09/25/2019 14:17:39 - INFO - model.bert - pooler_num_fc_layers: 3
09/25/2019 14:17:39 - INFO - model.bert - pooler_size_per_head: 128
09/25/2019 14:17:39 - INFO - model.bert - pooler_type: first_token_transform
09/25/2019 14:17:39 - INFO - model.bert - type_vocab_size: 2
09/25/2019 14:17:39 - INFO - model.bert - vocab_size: 28996
09/25/2019 14:17:39 - INFO - model.bert - ------------------------------------------------
Traceback (most recent call last):
  File "src/run_record.py", line 594, in <module>
    train(args)
  File "src/run_record.py", line 332, in train
    max_query_length=args.max_query_length)
  File "/home/KTNET/ACL2019-KTNET/reading_comprehension/src/reader/record.py", line 538, in __init__
    vocab_file=vocab_path, do_lower_case=do_lower_case)
  File "/home/KTNET/ACL2019-KTNET/reading_comprehension/src/tokenization.py", line 113, in __init__
    self.vocab = load_vocab(vocab_file)
  File "/home/KTNET/ACL2019-KTNET/reading_comprehension/src/tokenization.py", line 73, in load_vocab
    for num, line in enumerate(fin):
  File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1323: ordinal not in range(128)
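The traceback points at `load_vocab` in `tokenization.py`: `vocab.txt` contains non-ASCII bytes (0xc2 starts a UTF-8 multi-byte sequence), but the file is being opened with the process's default codec, which falls back to ASCII when the shell locale is unset. A likely fix is to open the vocab file with an explicit `encoding="utf-8"`. The sketch below borrows the `load_vocab` name from the traceback; the exact body of the repo's function may differ, so treat this as an illustration of the encoding fix rather than the project's actual code:

```python
import collections

def load_vocab(vocab_file):
    """Load a BERT vocab file into an OrderedDict mapping token -> id.

    Passing encoding="utf-8" explicitly avoids the UnicodeDecodeError
    seen when the locale's preferred encoding is ASCII (e.g. LANG and
    LC_ALL unset in the shell that launches the training script).
    """
    vocab = collections.OrderedDict()
    with open(vocab_file, "r", encoding="utf-8") as fin:
        for num, line in enumerate(fin):
            vocab[line.strip()] = num
    return vocab
```

Alternatively, leaving the source untouched and exporting a UTF-8 locale before running the script (e.g. `export LC_ALL=en_US.UTF-8`, or `PYTHONIOENCODING=utf-8` for stream I/O) should have the same effect, since Python 3's default file encoding comes from the locale.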