diff --git a/demo/ofa/bert/README.md b/demo/ofa/bert/README.md
index 27f0eeea004a2d72fd0202a68a3bee27fdf9196c..7ac94833786e2b3307e588a867574c6521839df5 100644
--- a/demo/ofa/bert/README.md
+++ b/demo/ofa/bert/README.md
@@ -8,14 +8,14 @@ The BERT-base model is a general-purpose semantic representation model with strong transfer ability, but
 
 | Task  | Metric                       | Baseline          | Result with PaddleSlim |
 |:-----:|:----------------------------:|:-----------------:|:----------------------:|
-| SST-2 | Accuracy                     | 0.93005           | 0.931193               |
-| QNLI  | Accuracy                     | 0.91781           | 0.920740               |
-| CoLA  | Matthews corr                | 0.59557           | 0.601244               |
-| MRPC  | F1/Accuracy                  | 0.91667/0.88235   | 0.91740/0.88480        |
-| STS-B | Pearson/Spearman corr        | 0.88847/0.88350   | 0.89271/0.88958        |
-| QQP   | Accuracy/F1                  | 0.90581/0.87347   | 0.90994/0.87947        |
-| MNLI  | Matched acc/Mismatched acc   | 0.84422/0.84825   | 0.84687/0.85242        |
-| RTE   | Accuracy                     | 0.711191          | 0.718412               |
+| SST-2 | Accuracy                     | 0.93005           | [0.931193]()           |
+| QNLI  | Accuracy                     | 0.91781           | [0.920740]()           |
+| CoLA  | Matthews corr                | 0.59557           | [0.601244]()           |
+| MRPC  | F1/Accuracy                  | 0.91667/0.88235   | [0.91740/0.88480]()    |
+| STS-B | Pearson/Spearman corr        | 0.88847/0.88350   | [0.89271/0.88958]()    |
+| QQP   | Accuracy/F1                  | 0.90581/0.87347   | [0.90994/0.87947]()    |
+| MNLI  | Matched acc/Mismatched acc   | 0.84422/0.84825   | [0.84687/0.85242]()    |
+| RTE   | Accuracy                     | 0.711191          | [0.718412]()           |
 
 <p align="center">
 <strong>Table 1-1: Accuracy comparison on the GLUE datasets</strong>
@@ -184,4 +184,41 @@ python -u ./run_glue_ofa.py --model_type bert \
 After compression training, the results on the dev set are listed in the "Result with PaddleSlim" column of Table 1-1, and the latency is shown in Table 1-2.
 
 ## 3. OFA API Overview
-TODO
+For an introduction to the OFA API, see the [API](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/ofa_api.rst) documentation.
+
+# Compressing TinyBERT (L=4, D=312) with this code
+The downstream task models are converted from the official TinyBERT repo.
+
+## 1. Compression results
+
+| Task  | Metric                       | TinyBERT(L=4, D=312) | Result with OFA        |
+|:-----:|:----------------------------:|:--------------------:|:----------------------:|
+| SST-2 | Accuracy                     | [0.9234]()           | [0.9220]()             |
+| QNLI  | Accuracy                     | [0.8746]()           | [0.8720]()             |
+| CoLA  | Matthews corr                | [0.4961]()           | [0.5048]()             |
+| MRPC  | F1/Accuracy                  | [0.8998/0.8554]()    | [0.9003/0.8578]()      |
+| STS-B | Pearson/Spearman corr        | [0.8635/0.8631]()    | [0.8717/0.8706]()      |
+| QQP   | Accuracy/F1                  | [0.9047/0.8751]()    | [0.9034/0.8733]()      |
+| MNLI  | Matched acc/Mismatched acc   | [0.8256/0.8294]()    | [0.8211/0.8261]()      |
+| RTE   | Accuracy                     | [0.6534]()           | [0.6787]()             |
+
+## 2. Launch command
+
+The GLUE/QQP task is used as an example.
+
+```shell
+export CUDA_VISIBLE_DEVICES=3
+export TASK_NAME='QQP'
+
+python -u ./run_glue_ofa.py --model_type bert \
+          --model_name_or_path ${PATH_OF_QQP} \
+          --task_name $TASK_NAME --max_seq_length 128 \
+          --batch_size 32 \
+          --learning_rate 2e-5 \
+          --num_train_epochs 6 \
+          --logging_steps 10 \
+          --save_steps 500 \
+          --output_dir ./tmp/$TASK_NAME/ \
+          --n_gpu 1 \
+          --width_mult_list 1.0 0.8333333333333334 0.6666666666666666 0.5
+```
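
Note on the `--width_mult_list` values in the launch command: the diff does not explain where `0.8333333333333334` and `0.6666666666666666` come from. One plausible reading, offered here only as an assumption, is that they are fractions of a 12-head attention layer (10/12 and 8/12), chosen so that each sub-network keeps a whole number of heads. The head count and the `kept_heads` list below are illustrative and are not taken from `run_glue_ofa.py`. A minimal sketch of that arithmetic:

```python
# Assumption: 12 attention heads per layer; the diff does not state this.
NUM_HEADS = 12

# Hypothetical head counts to keep in each sub-network, widest first.
kept_heads = [12, 10, 8, 6]

# Each width multiplier is the fraction of heads that is kept.
width_mult_list = [h / NUM_HEADS for h in kept_heads]

# Print in the space-separated form accepted by --width_mult_list.
print(" ".join(str(w) for w in width_mult_list))
```

Under that assumption the printed values match the ones passed on the command line, which suggests writing the multipliers as exact fractions rather than rounded decimals when adapting the command to a different head count.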