Unverified · Commit 26368f23 · Author: LiuChiachi · Committer: GitHub

Update paddlenlp.metrics README (#5010)

* Update paddlenlp.metrics README

* Add info for Mcc

* delete DuReader metric

* update task name, add API col
Parent e9820280
# paddlenlp.metrics
paddlenlp currently provides the following evaluation metrics:

| Metric | Description | API |
| --- | :--- | --- |
| Perplexity | Perplexity, commonly used to measure the quality of language models; also applicable to machine translation, text generation, and similar tasks. | paddlenlp.metrics.Perplexity |
| BLEU (bilingual evaluation understudy) | A commonly used evaluation metric for machine translation. | paddlenlp.metrics.BLEU |
| Rouge-L (Recall-Oriented Understudy for Gisting Evaluation) | A metric for evaluating automatic summarization and machine translation. | paddlenlp.metrics.RougeL |
| AccuracyAndF1 | Accuracy and F1-score; can be used for the MRPC and QQP tasks in GLUE. | paddlenlp.metrics.AccuracyAndF1 |
| PearsonAndSpearman | Pearson and Spearman correlation coefficients; can be used for the STS-B task in GLUE. | paddlenlp.metrics.PearsonAndSpearman |
| Mcc (Matthews correlation coefficient) | Matthews correlation coefficient, a measure of binary classification performance; can be used for the CoLA task in GLUE. | paddlenlp.metrics.Mcc |
| ChunkEvaluator | Computes precision, recall, and F1-score for chunk detection; commonly used in sequence labeling tasks such as named entity recognition (NER). | paddlenlp.metrics.ChunkEvaluator |
| Squad | Evaluation metric for SQuAD and DuReader-robust. | paddlenlp.metrics.compute_predictions, paddlenlp.metrics.squad_evaluate |

## Perplexity
Perplexity is calculated using cross entropy. It supports both padded and
unpadded data.
If data is not padded, users should provide `seq_len` for `Metric`
initialization. If data is padded, your label should contain `seq_mask`,
which indicates the actual length of samples.
This Perplexity requires that the output of your network is prediction,
label and sequence length (optional). If the Perplexity here doesn't meet
your needs, you could override the `compute` or `update` method for
calculating Perplexity.
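
To make the link between cross entropy and perplexity concrete, here is a minimal sketch in plain Python. It is not the paddlenlp implementation: the function name `perplexity` and the nested-list input layout are assumptions chosen for illustration. The optional `seq_mask` argument mirrors the padded-data case, where masked positions are excluded from the average.

```python
import math


def perplexity(probs, labels, seq_mask=None):
    """Illustrative perplexity: exp of the mean token-level cross entropy.

    probs:    per sample, per step, a probability distribution over the vocab
    labels:   per sample, per step, the target token id
    seq_mask: optional, per sample, per step, 1 for real tokens and 0 for padding
    (This layout is a hypothetical one, not the paddlenlp API.)
    """
    total_nll = 0.0
    total_tokens = 0
    for b, (seq_probs, seq_labels) in enumerate(zip(probs, labels)):
        for t, (dist, target) in enumerate(zip(seq_probs, seq_labels)):
            # Skip padded positions so they do not affect the average.
            if seq_mask is not None and not seq_mask[b][t]:
                continue
            total_nll += -math.log(dist[target])
            total_tokens += 1
    return math.exp(total_nll / total_tokens)


# A uniform distribution over 4 tokens gives a perplexity of about 4.
uniform = [[[0.25] * 4, [0.25] * 4]]
print(perplexity(uniform, [[0, 3]]))
```

Averaging over real tokens only is the reason the sequence length or mask is needed: without it, padded steps would be counted as predictions.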
## BLEU
BLEU (bilingual evaluation understudy) is an algorithm for evaluating the
quality of text which has been machine-translated from one natural language
to another. This metric uses a modified form of precision to compare a
candidate translation against multiple reference translations.
BLEU can be used either as a `paddle.metric.Metric` class or as an ordinary
class.
When BLEU is used as a `paddle.metric.Metric` class, a function is needed
that transforms the network output into a candidate string and the label
into a list of reference strings. By default, `_default_trans_func` is
provided, which obtains the target sequence ids by taking the
highest-probability token at each step. In this case, the user must provide
`vocab`. Note that the BLEU computed here differs from the BLEU calculated
at prediction time; it is intended only for observation during training and
evaluation.
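
The two ideas in this section, picking the most probable token at each step and scoring a candidate with modified (clipped) n-gram precision, can be sketched in plain Python. This is an illustration only, not the paddlenlp code: `greedy_tokens` and `modified_precision` are hypothetical helper names, `vocab` is assumed to be a plain list indexed by token id, and the brevity penalty and geometric mean of full BLEU are left out.

```python
from collections import Counter


def greedy_tokens(step_probs, vocab):
    """Map each step's highest-probability id to its token (hypothetical helper)."""
    return [vocab[max(range(len(dist)), key=dist.__getitem__)] for dist in step_probs]


def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against multiple references."""
    cand_counts = ngram_counts(candidate, n)
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it occurs in a reference.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
print(modified_precision(candidate, references, 1))  # 2 of 7 unigrams survive clipping
```

Clipping is what makes the precision "modified": a candidate that repeats a common word cannot earn more credit than the word's maximum count in any one reference.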
## Rouge
### rouge-l
## dureader
## chunk
## squad