BERT fails to run in both fp32 and fp16
Created by: ccmeteorljh
paddle commit-id: 691ced87c087d3b25c2069e96c74c17a36ff2de2
----------- Configuration Arguments -----------
batch_size: 32
bert_config_path: /home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/chinese_L-12_H-768_A-12/bert_config.json
checkpoints: /home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/save
data_dir: /home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/data
decr_every_n_nan_or_inf: 2
decr_ratio: 0.8
do_lower_case: True
do_test: True
do_train: True
do_val: True
enable_ce: False
epoch: 2
in_tokens: False
incr_every_n_steps: 1000
incr_ratio: 2.0
init_checkpoint: None
init_loss_scaling: 4294967296
init_pretraining_params: /home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/chinese_L-12_H-768_A-12/params
learning_rate: 5e-05
lr_scheduler: linear_warmup_decay
max_seq_len: 128
num_iteration_per_drop_scope: 1
random_seed: 1
save_steps: 1000
shuffle: True
skip_steps: 100
task_name: XNLI
use_cuda: True
use_dynamic_loss_scaling: True
use_fast_executor: False
use_fp16: False
validation_steps: 1000
verbose: False
vocab_path: /home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/chinese_L-12_H-768_A-12/vocab.txt
warmup_proportion: 0.1
weight_decay: 0.01
------------------------------------------------
attention_probs_dropout_prob: 0.1
directionality: bidi
hidden_act: gelu
hidden_dropout_prob: 0.1
hidden_size: 768
initializer_range: 0.02
intermediate_size: 3072
max_position_embeddings: 512
num_attention_heads: 12
num_hidden_layers: 12
pooler_fc_size: 768
pooler_num_attention_heads: 12
pooler_num_fc_layers: 3
pooler_size_per_head: 128
pooler_type: first_token_transform
type_vocab_size: 2
vocab_size: 21128
------------------------------------------------
Device count: 1
Num train examples: 392702
Max train steps: 24543
Num warmup steps: 2454
W1124 22:17:56.750891 20557 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W1124 22:17:56.756631 20557 device_context.cc:244] device: 0, cuDNN Version: 7.4.
Load pretraining parameters from /home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/chinese_L-12_H-768_A-12/params.
I1124 22:18:00.329705 20557 parallel_executor.cc:423] The Program will be executed on CUDA using ParallelExecutor, 1 cards are used, so 1 programs are executed in parallel.
I1124 22:18:00.433110 20557 build_strategy.cc:364] SeqOnlyAllReduceOps:0, num_trainers:1
I1124 22:18:00.579540 20557 parallel_executor.cc:287] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I1124 22:18:00.645557 20557 parallel_executor.cc:370] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0
/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py:773: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "run_classifier.py", line 428, in <module>
main(args)
File "run_classifier.py", line 331, in main
outputs = exe.run(train_compiled_program, fetch_list=fetch_list)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py", line 774, in run
six.reraise(*sys.exc_info())
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py", line 769, in run
use_program_cache=use_program_cache)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py", line 828, in _run_impl
return_numpy=return_numpy)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py", line 668, in _run_parallel
tensors = exe.run(fetch_var_names)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::ReadOp::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
3 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
4 paddle::framework::details::ComputationOpHandle::RunImpl()
5 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
6 paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue<unsigned long> > const&, unsigned long*)
7 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
8 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
9 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/framework.py", line 2479, in append_op
attrs=kwargs.get("attrs", None))
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/reader.py", line 424, in _init_non_iterable
outputs={'Out': self._feed_list})
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/reader.py", line 331, in __init__
self._init_non_iterable()
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/reader.py", line 258, in from_generator
return_list)
File "/home/crim/benchmark/models/PaddleNLP/PaddleLARK/BERT/model/classifier.py", line 45, in create_model
feed_list=inputs, capacity=50, iterable=False)
File "run_classifier.py", line 211, in main
num_labels=num_labels)
File "run_classifier.py", line 428, in <module>
main(args)
----------------------
Error Message Summary:
----------------------
Error: The feeded Variable input_mask should have dimensions = 3, shape = [-1, 128, 1], but received feeded shape [32, 125, 1]
[Hint: Expected DimensionIsCompatibleWith(shapes[i], in_dims) == true, but received DimensionIsCompatibleWith(shapes[i], in_dims):0 != true:1.] at (/paddle/paddle/fluid/operators/reader/read_op.cc:133)
[operator < read > error]
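
The summary above says the reader was declared with a fixed sequence dimension of 128 (`max_seq_len`), but the batch that reached it had only 125 positions. That is the pattern one would expect if batches are padded to the longest sample in the batch rather than to `max_seq_len`; this is an assumption drawn from the shapes in the error, not something confirmed against the PaddleLARK code. As a rough sketch of what fixed-length padding would look like, with `pad_to_fixed_len` and `shape_of` being hypothetical helpers written for illustration (not part of Paddle or PaddleLARK):

```python
def pad_to_fixed_len(token_id_lists, max_seq_len=128, pad_id=0):
    """Hypothetical helper: pad (or truncate) every sample to max_seq_len.

    Returns nested lists shaped [batch, max_seq_len, 1] for both the token ids
    and the input mask, matching the [-1, 128, 1] shape named in the error.
    """
    token_ids, input_mask = [], []
    for ids in token_id_lists:
        ids = list(ids)[:max_seq_len]          # truncate overly long samples
        n_pad = max_seq_len - len(ids)         # positions still to fill
        token_ids.append([[t] for t in ids] + [[pad_id] for _ in range(n_pad)])
        input_mask.append([[1.0] for _ in ids] + [[0.0] for _ in range(n_pad)])
    return token_ids, input_mask


def shape_of(nested):
    """Hypothetical helper: shape of a rectangular 3-level nested list."""
    return (len(nested), len(nested[0]), len(nested[0][0]))


# Two samples whose longest member has 125 tokens, as in the failing batch.
batch = [list(range(1, 126)), list(range(1, 61))]
token_ids, input_mask = pad_to_fixed_len(batch, max_seq_len=128)
print(shape_of(token_ids), shape_of(input_mask))
```

With padding done this way, the second dimension no longer depends on the longest sample in the batch. Whether the right fix is on the data side (as sketched) or in how the feed variables are declared is left open here.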