Unverified commit aa768240, authored by Jiaqi Liu, committed by GitHub

Add ERNIE-3.0 for model center (#5577)

* add ernie-3.0

* update ERNIE 3.0
Parent: dd2dc855
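## 1. Inference Benchmark
### 1.1 Hardware and Software Environment
1. GPU: T4, CUDA 11.2, cuDNN 8.2
2. CPU: Intel(R) Xeon(R) Gold 6271C CPU
3. PaddlePaddle version: 2.3
4. PaddleNLP version: 2.3
5. Performance is reported in QPS. To measure QPS, the batch size is fixed at 32, the total running time total_time is recorded, and QPS = total_samples / total_time.
6. Accuracy metrics: Accuracy for text classification, F1-Score for sequence labeling, and EM (Exact Match) for reading comprehension.
### 1.2 Datasets
Datasets: CLUE TNEWS (text classification), MSRA_NER (sequence labeling), CLUE CMRC2018 (reading comprehension)
### 1.3 Metrics
##### CPU Performance
The test environment and setup are as described above. For the CPU tests, the number of threads is set to 12.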
| | TNEWS Performance | TNEWS Accuracy | MSRA_NER Performance | MSRA_NER F1 Score | CMRC2018 Performance | CMRC2018 EM |
| -------------------------- | ------------ | ------------ | ------------- | ------------- | ------------- | ------------- |
| ERNIE 3.0-Medium+FP32 | 311.95(1.0x) | 57.45 | 90.91(1.0x) | 93.04 | 33.74(1.0x) | 66.95 |
| ERNIE 3.0-Medium+INT8 | 600.35(1.9x) | 56.57(-0.88) | 141.00(1.6x) | 92.64(-0.40) | 56.51(1.7x) | 66.23(-0.72) |
| ERNIE 3.0-Medium+prune+FP32 | 408.65(1.3x) | 57.31(-0.14) | 122.13(1.3x) | 93.27(+0.23) | 48.47(1.4x) | 65.55(-1.40) |
| ERNIE 3.0-Medium+prune+INT8 | 704.42(2.3x) | 56.69(-0.76) | 215.58(2.4x) | 92.39(-0.65) | 75.23(2.2x) | 63.47(-3.48) |
After the same compression process, all three tasks (classification, sequence labeling, reading comprehension) reach a speedup of about 2.3x.
##### GPU Performance
| | TNEWS Performance | TNEWS Accuracy | MSRA_NER Performance | MSRA_NER F1 Score | CMRC2018 Performance | CMRC2018 EM |
| -------------------------- | ------------- | ------------ | ------------- | ------------- | ------------- | ------------- |
| ERNIE 3.0-Medium+FP32 | 1123.85(1.0x) | 57.45 | 366.75(1.0x) | 93.04 | 146.84(1.0x) | 66.95 |
| ERNIE 3.0-Medium+FP16 | 2672.41(2.4x) | 57.45(0.00) | 840.11(2.3x) | 93.05(+0.01) | 303.43(2.1x) | 66.95(0.00) |
| ERNIE 3.0-Medium+INT8 | 3226.26(2.9x) | 56.99(-0.46) | 889.33(2.4x) | 92.70(-0.34) | 348.84(2.4x) | 66.32(-0.63) |
| ERNIE 3.0-Medium+prune+FP32 | 1424.01(1.3x) | 57.31(-0.14) | 454.27(1.2x) | 93.27(+0.23) | 183.77(1.3x) | 65.92(-1.03) |
| ERNIE 3.0-Medium+prune+FP16 | 3577.62(3.2x) | 57.27(-0.18) | 1138.77(3.1x) | 93.27(+0.23) | 445.71(3.0x) | 65.89(-1.06) |
| ERNIE 3.0-Medium+prune+INT8 | 3635.48(3.2x) | 57.26(-0.19) | 1105.26(3.0x) | 93.20(+0.16) | 444.27(3.0x) | 66.17(-0.78) |
After pruning plus quantization, all three tasks (classification, sequence labeling, reading comprehension) reach a speedup of about 3x, and the average accuracy loss across all tasks stays within 0.5 (0.46).
## 2. Usage Notes
1. https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-3.0/README.md#%E6%80%A7%E8%83%BD%E6%B5%8B%E8%AF%95
## 1. Inference Benchmark
### 1.1 Environment
1. GPU: T4, CUDA 11.2, cuDNN 8.2
2. CPU: Intel(R) Xeon(R) Gold 6271C CPU
3. PaddlePaddle version: 2.3
4. PaddleNLP version: 2.3
5. Performance is reported in QPS. To measure QPS, the batch size is fixed at 32, the total running time total_time is recorded, and QPS = total_samples / total_time.
6. Accuracy metrics: Accuracy for sequence classification, F1-Score for token classification, and EM (Exact Match) for question answering.
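The QPS definition in item 5 can be sketched as a small timing loop. This is only an illustration under stated assumptions, not the benchmark script used for the tables: `measure_qps` and `predict_batch` are hypothetical names, and `predict_batch` stands in for whatever inference call is being timed.

```python
import time
from typing import Any, Callable, Sequence


def measure_qps(
    predict_batch: Callable[[Sequence[Any]], Any],  # hypothetical inference call
    samples: Sequence[Any],
    batch_size: int = 32,  # the benchmark fixes the batch size at 32
) -> float:
    """Return QPS = total_samples / total_time for one pass over `samples`."""
    start = time.perf_counter()
    for i in range(0, len(samples), batch_size):
        # The last batch may be smaller than batch_size.
        predict_batch(samples[i:i + batch_size])
    total_time = time.perf_counter() - start
    return len(samples) / total_time
```

In practice a few warm-up batches are usually run before timing so that one-off initialization cost does not distort total_time; whether the original measurements did so is not stated here.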
### 1.2 Datasets
Datasets: CLUE TNEWS (sequence classification), MSRA_NER (token classification), CLUE CMRC2018 (question answering)
### 1.3 Benchmark
##### CPU Performance
The test environment and instructions are as above. When testing the CPU performance, the number of threads is set to 12.
| | TNEWS Performance | TNEWS Accuracy | MSRA_NER Performance | MSRA_NER F1 Score | CMRC2018 Performance | CMRC2018 EM |
| -------------------------- | ------------ | ------------ | ------------- | ------------- | ------------- | ------------- |
| ERNIE 3.0-Medium+FP32 | 311.95(1.0x) | 57.45 | 90.91(1.0x) | 93.04 | 33.74(1.0x) | 66.95 |
| ERNIE 3.0-Medium+INT8 | 600.35(1.9x) | 56.57(-0.88) | 141.00(1.6x) | 92.64(-0.40) | 56.51(1.7x) | 66.23(-0.72) |
| ERNIE 3.0-Medium+prune+FP32 | 408.65(1.3x) | 57.31(-0.14) | 122.13(1.3x) | 93.27(+0.23) | 48.47(1.4x) | 65.55(-1.40) |
| ERNIE 3.0-Medium+prune+INT8 | 704.42(2.3x) | 56.69(-0.76) | 215.58(2.4x) | 92.39(-0.65) | 75.23(2.2x) | 63.47(-3.48) |
After the same compression process (pruning plus INT8 quantization), the three tasks reach a speedup of about 2.3x over the FP32 baseline.
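The speedup annotations in the CPU table can be reproduced from the raw QPS values. The sketch below uses the FP32 and prune+INT8 numbers from the table; the helper names `speedup` and `annotate` are illustrative and not part of any benchmark tooling.

```python
from statistics import mean

# CPU throughput in QPS, copied from the table: (FP32 baseline, prune+INT8).
CPU_QPS = {
    "TNEWS": (311.95, 704.42),
    "MSRA_NER": (90.91, 215.58),
    "CMRC2018": (33.74, 75.23),
}


def speedup(baseline_qps: float, compressed_qps: float) -> float:
    """Speedup ratio of the compressed model relative to the baseline."""
    return compressed_qps / baseline_qps


def annotate(compressed_qps: float, baseline_qps: float) -> str:
    """Format a table cell such as '704.42(2.3x)'."""
    return f"{compressed_qps:.2f}({speedup(baseline_qps, compressed_qps):.1f}x)"


ratios = {task: speedup(base, comp) for task, (base, comp) in CPU_QPS.items()}
average_speedup = mean(ratios.values())  # rounds to 2.3 at one decimal place
```

The per-task ratios round to 2.3x, 2.4x, and 2.2x, which is where the "about 2.3x" summary comes from.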
##### GPU Performance
| | TNEWS Performance | TNEWS Accuracy | MSRA_NER Performance | MSRA_NER F1 Score | CMRC2018 Performance | CMRC2018 EM |
| -------------------------- | ------------- | ------------ | ------------- | ------------- | ------------- | ------------- |
| ERNIE 3.0-Medium+FP32 | 1123.85(1.0x) | 57.45 | 366.75(1.0x) | 93.04 | 146.84(1.0x) | 66.95 |
| ERNIE 3.0-Medium+FP16 | 2672.41(2.4x) | 57.45(0.00) | 840.11(2.3x) | 93.05(+0.01) | 303.43(2.1x) | 66.95(0.00) |
| ERNIE 3.0-Medium+INT8 | 3226.26(2.9x) | 56.99(-0.46) | 889.33(2.4x) | 92.70(-0.34) | 348.84(2.4x) | 66.32(-0.63) |
| ERNIE 3.0-Medium+prune+FP32 | 1424.01(1.3x) | 57.31(-0.14) | 454.27(1.2x) | 93.27(+0.23) | 183.77(1.3x) | 65.92(-1.03) |
| ERNIE 3.0-Medium+prune+FP16 | 3577.62(3.2x) | 57.27(-0.18) | 1138.77(3.1x) | 93.27(+0.23) | 445.71(3.0x) | 65.89(-1.06) |
| ERNIE 3.0-Medium+prune+INT8 | 3635.48(3.2x) | 57.26(-0.19) | 1105.26(3.0x) | 93.20(+0.16) | 444.27(3.0x) | 66.17(-0.78) |
After pruning and quantization, all three tasks run about 3x faster than the FP32 baseline, and the average accuracy loss stays within 0.5 (0.46).
## 2. Reference
1. https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-3.0/README.md#%E6%80%A7%E8%83%BD%E6%B5%8B%E8%AF%95
# Download
| Model | Architecture | Parameters | Model size | Download |
|-----------------|---------------------------------|---------|---------|---------|
|ERNIE 3.0-Base | 12-layer, 768-hidden, 12-heads | 117.9M |452.4MB|[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_base_zh.pdparams)|
|ERNIE 3.0-Medium | 6-layer, 768-hidden, 12-heads | 75.4M |312.5MB|[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_medium.pdparams)|
|ERNIE 3.0-Mini | 6-layer, 384-hidden, 12-heads | 26.9M |109.0MB|[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_mini_zh.pdparams)|
|ERNIE 3.0-Micro | 4-layer, 384-hidden, 12-heads | 23.4M |95.48MB|[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_micro_zh.pdparams)|
|ERNIE 3.0-Nano | 4-layer, 312-hidden, 12-heads | 17.9M |72.4MB|[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_nano_zh.pdparams)|
# Download
| Model | Architecture | Parameters | Model size | Download |
|-----------------|--------------------------------------|----------------------|------------------|---------------------------------------------------|
|ERNIE 3.0-Base | 12-layer, 768-hidden, 12-heads | 117.9 M | 452.4MB |[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_base_zh.pdparams)|
|ERNIE 3.0-Medium | 6-layer, 768-hidden, 12-heads | 75.4 M | 312.5MB |[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_medium.pdparams)|
|ERNIE 3.0-Mini | 6-layer, 384-hidden, 12-heads | 26.9 M | 109.0MB |[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_mini_zh.pdparams)|
|ERNIE 3.0-Micro | 4-layer, 384-hidden, 12-heads | 23.4 M | 95.48MB |[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_micro_zh.pdparams)|
|ERNIE 3.0-Nano | 4-layer, 312-hidden, 12-heads | 17.9 M | 72.4MB |[Pretrained Model](https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_nano_zh.pdparams)|
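The checkpoints listed above are plain files served over HTTPS, so they can be fetched with the Python standard library. The following is a minimal sketch rather than an official download tool: `filename_from_url` and `download` are hypothetical helper names, and the destination directory is an assumption.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Take the last path component of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def download(url: str, dest_dir: str = ".") -> str:
    """Download `url` into `dest_dir` unless the file is already there."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, filename_from_url(url))
    if not os.path.exists(dest):
        # Stream the response to disk instead of loading it all into memory.
        with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
            shutil.copyfileobj(resp, out)
    return dest
```

For example, `download("https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_nano_zh.pdparams", "checkpoints")` would save the Nano weights under a `checkpoints` directory. The sketch does not resume interrupted downloads, so a partially written file should be deleted before retrying.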
---
Model_Info:
  name: "ERNIE 3.0"
  description: "ERNIE 3.0 轻量级模型"
  description_en: "ERNIE Tiny"
  icon: "@后续UE统一设计之后,会存到bos上某个位置"
  from_repo: "PaddleNLP"
Task:
- tag_en: "Natural Language Processing"
  tag: "自然语言处理"
  sub_tag_en: "Pretrained Model"
  sub_tag: "预训练模型"
Example:
- title: "【快速上手ERNIE 3.0】中文情感分析实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/3955163"
- title: "【快速上手ERNIE 3.0】法律文本多标签分类实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/3996601"
- title: "【快速上手ERNIE 3.0】中文语义匹配实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/3986803"
- title: "【快速上手ERNIE 3.0】MSRA序列标注实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/3989073"
- title: "【快速上手ERNIE 3.0】机器阅读理解实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/2017189"
- title: "【快速上手ERNIE 3.0】对话意图识别实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/2017202?contributionType=1"
Datasets: ""
Pulisher: "Baidu"
License: "apache.2.0"
Paper:
- title: "ERNIE-Tiny: A Progressive Distillation Framework for Pretrained Transformer Compression"
  url: "https://arxiv.org/abs/2106.02241"
IfTraining: 0
IfOnlineDemo: 1