Optimize CRF Decoding with AVX/AVX2/AVX512F instruction (!12767) · Merge request · PaddlePaddle / Paddle

Optimize CRF Decoding with AVX/AVX2/AVX512F instruction !12767

Created by: yihuaxu

Based on the current performance of CRF decoding, this change reimplements the decoding computation with AVX/AVX2/AVX512F intrinsic functions to speed up data processing.
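To illustrate the kind of optimization described above, here is a minimal sketch of one Viterbi step (the core of CRF decoding) vectorized over 8 tags at a time with AVX. It is not the code from this merge request: the function names, the row-major `trans` layout, and the runtime dispatch through `__builtin_cpu_supports` (GCC/Clang only) are assumptions made for the example, and a scalar version is included as the reference and fallback.

```cpp
#include <cstddef>
#include <vector>
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define SKETCH_HAS_AVX_PATH 1
#endif

// Hypothetical layout: trans[i * tags + j] is the score of moving from tag i to tag j.
// Computes alpha_out[j] = max_i(alpha[i] + trans[i][j]) + emit[j] and track[j] = argmax_i.
static void viterbi_column_scalar(const float* alpha, const float* trans,
                                  const float* emit, std::size_t tags,
                                  std::size_t j, float* alpha_out, int* track) {
  float best = alpha[0] + trans[j];
  int arg = 0;
  for (std::size_t i = 1; i < tags; ++i) {
    float s = alpha[i] + trans[i * tags + j];
    if (s > best) {  // strict '>' keeps the first index on ties
      best = s;
      arg = static_cast<int>(i);
    }
  }
  alpha_out[j] = best + emit[j];
  track[j] = arg;
}

// Scalar reference implementation of one decoding step.
void viterbi_step_scalar(const float* alpha, const float* trans,
                         const float* emit, std::size_t tags,
                         float* alpha_out, int* track) {
  for (std::size_t j = 0; j < tags; ++j)
    viterbi_column_scalar(alpha, trans, emit, tags, j, alpha_out, track);
}

#ifdef SKETCH_HAS_AVX_PATH
// AVX version: each 256-bit lane holds the running max for one target tag j.
__attribute__((target("avx")))
static void viterbi_step_avx(const float* alpha, const float* trans,
                             const float* emit, std::size_t tags,
                             float* alpha_out, int* track) {
  std::size_t j = 0;
  for (; j + 8 <= tags; j += 8) {
    __m256 best = _mm256_add_ps(_mm256_set1_ps(alpha[0]), _mm256_loadu_ps(trans + j));
    __m256 arg = _mm256_setzero_ps();  // bit pattern of integer index 0
    for (std::size_t i = 1; i < tags; ++i) {
      __m256 s = _mm256_add_ps(_mm256_set1_ps(alpha[i]),
                               _mm256_loadu_ps(trans + i * tags + j));
      __m256 gt = _mm256_cmp_ps(s, best, _CMP_GT_OQ);  // lanes where s > best
      best = _mm256_blendv_ps(best, s, gt);
      // Carry the integer index through the float blend as raw bits.
      __m256 idx = _mm256_castsi256_ps(_mm256_set1_epi32(static_cast<int>(i)));
      arg = _mm256_blendv_ps(arg, idx, gt);
    }
    _mm256_storeu_ps(alpha_out + j, _mm256_add_ps(best, _mm256_loadu_ps(emit + j)));
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(track + j), _mm256_castps_si256(arg));
  }
  for (; j < tags; ++j)  // remaining tags that do not fill a full vector
    viterbi_column_scalar(alpha, trans, emit, tags, j, alpha_out, track);
}
#endif

// Picks the AVX path at runtime when the CPU supports it, else the scalar path.
void viterbi_step(const float* alpha, const float* trans, const float* emit,
                  std::size_t tags, float* alpha_out, int* track) {
#ifdef SKETCH_HAS_AVX_PATH
  if (__builtin_cpu_supports("avx")) {
    viterbi_step_avx(alpha, trans, emit, tags, alpha_out, track);
    return;
  }
#endif
  viterbi_step_scalar(alpha, trans, emit, tags, alpha_out, track);
}
```

The sketch vectorizes across target tags rather than source tags, so the running maximum and its argmax stay in registers and no horizontal reduction is needed. An AVX512F variant would follow the same pattern with 16 lanes and mask registers.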

Platform: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz / Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Model Path: models/fluid/chinese_ner
Batch Size: 6, 1, 12
Command: python infer.py --device CPU --profile
Data Source: the original data provided with the Chinese_NER model on GitHub.

The following compares performance across the different scenarios.
