Created by: yihuaxu
Based on performance profiling of the BERT model, optimized the GELU operator to speed up data processing.
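For readers unfamiliar with the operator, the exact (erf-based) form of GELU is `0.5 * x * (1 + erf(x / sqrt(2)))`. The snippet below is a minimal scalar reference sketch of that formula only. It is not the optimized kernel from this PR, and the names `gelu_scalar` and `gelu_reference` are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Exact GELU for one value: 0.5 * x * (1 + erf(x / sqrt(2))).
// Hypothetical helper name; not taken from the Paddle source tree.
inline float gelu_scalar(float x) {
  const float kInvSqrt2 = 0.70710678118654752440f;  // 1 / sqrt(2)
  return 0.5f * x * (1.0f + std::erf(x * kInvSqrt2));
}

// Unoptimized element-wise loop over a buffer of n floats.
// An optimized operator would typically replace this loop with a
// vectorized erf routine; that part is an assumption, not shown here.
void gelu_reference(const float* in, float* out, std::size_t n) {
  assert(n == 0 || (in != nullptr && out != nullptr));
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = gelu_scalar(in[i]);
  }
}
```

A loop like this is useful mainly as a correctness baseline: the output of an optimized kernel can be compared against it element by element within a small tolerance.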
Platform: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Model Path: third_party/inference_demo/bert_emb128/model
Batch Size: 1
Command: ./paddle/fluid/inference/tests/api/test_analyzer_bert --infer_model=third_party/inference_demo/bert_emb128/model/ --infer_data=third_party/inference_demo/bert_emb128/data.txt --gtest_filter=Analyzer_bert.profile --paddle_num_threads=1 --repeat=1 --batch_size=1 --test_all_data --profile
Data Source: third_party/inference_demo/bert_emb128/data.txt
Associated with: PR#15770, PR#15871. Fix the memcpy@GLIBC_2.14 link issue of the mklml library.
The following compares performance across the different scenarios.