------------------------- main configuration -------------------------
target_tag            1, 0
vocab_path            pretrain_model/ernie/vocab.txt
do_lower_case         True
learning_rate         3e-05
backbone              ernie
batch_size            4
max_seq_len           512
backbone_config_path  pretrain_model/ernie/ernie_config.json
mix_ratio             0.5, 0.5
warmup_proportion     0.1
save_path             output_model/secondrun
num_epochs            0.5
print_every_n_steps   1
weight_decay          0.1
optimizer             adam
task_instance         mrqa, match4mrqa

------------------------- mrqa configuration -------------------------
pred_file                  data/mrqa/dev.json
verbose                    False
max_query_len              64
pred_output_path           mrqa_output
train_file                 data/mrqa/train.json
null_score_diff_threshold  0.0
reader                     mrc
doc_stride                 128
max_answer_len             30
n_best_size                20
paradigm                   mrc

---------------------- match4mrqa configuration ----------------------
train_file  data/match4mrqa/train.tsv
paradigm    match
reader      match

mrqa: set as target task.
match4mrqa: set as aux task.
mrqa: mix_ratio is set to 0.5
match4mrqa: mix_ratio is set to 0.5

----------------------- backbone configuration -----------------------
vocab_size                    30522
target_tag                    1, 0
num_hidden_layers             24
sent_type_vocab_size          4
warmup_proportion             0.1
num_epochs                    0.5
print_every_n_steps           1
weight_decay                  0.1
optimizer                     adam
hidden_dropout_prob           0.1
do_lower_case                 True
max_position_embeddings       512
max_seq_len                   512
mix_ratio                     0.5, 0.5
save_path                     output_model/secondrun
attention_probs_dropout_prob  0.1
hidden_size                   1024
num_attention_heads           16
vocab_path                    pretrain_model/ernie/vocab.txt
learning_rate                 3e-05
backbone                      ernie
batch_size                    4
backbone_config_path          pretrain_model/ernie/ernie_config.json
task_instance                 mrqa, match4mrqa
task_type_vocab_size          16
initializer_range             0.02
hidden_act                    gelu

initialing for training...
mrqa: set as main task
mrqa: preparing data...
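
To compare runs or check a setting such as mix_ratio programmatically, a dump like the one above can be read back into a dictionary. The following is only a minimal sketch, not part of the training tool: parse_config_dump is a hypothetical helper, and it assumes one "key value" pair per line, dashed "<name> configuration" headers, and a blank line closing each section. Status messages outside a section are ignored. The sample string is a shortened excerpt with values taken from the log.

```python
import re

# Assumed header shape: dashes, a section name, the word "configuration", dashes.
HEADER = re.compile(r"^-+\s*(.+?) configuration\s*-+$")
# Assumed key shape: a plain identifier such as "mix_ratio".
KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_config_dump(text):
    """Group 'key value' lines under the section header that precedes them."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line closes the current section
            continue
        match = HEADER.match(line)
        if match:
            current = sections.setdefault(match.group(1), {})
            continue
        if current is None:
            continue  # status message outside any section
        key, _, value = line.partition(" ")
        if KEY.match(key):
            current[key] = value.strip()  # values are kept as raw strings
    return sections


sample = """\
------------------------- main configuration -------------------------
learning_rate  3e-05
mix_ratio      0.5, 0.5
task_instance  mrqa, match4mrqa

mrqa: set as target task.
"""

config = parse_config_dump(sample)
print(config["main"]["mix_ratio"])
```

Values are left as strings on purpose, since entries such as "0.5, 0.5" and "mrqa, match4mrqa" are lists whose element types depend on the key.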