diff --git a/PaddleNLP/paddlenlp/metrics/README.md b/PaddleNLP/paddlenlp/metrics/README.md new file mode 100644 index 0000000000000000000000000000000000000000..2495e82f774f63244e202abcf916a4f33aae6bf3 --- /dev/null +++ b/PaddleNLP/paddlenlp/metrics/README.md @@ -0,0 +1,39 @@ +# paddlenlp.metrics + +## Perplexity +Perplexity is calculated using cross entropy. It supports both padding data +and no padding data. + +If data is not padded, users should provide `seq_len` for `Metric` +initialization. If data is padded, your label should contain `seq_mask`, +which indicates the actual length of samples. + +This Perplexity requires that the output of your network is prediction, +label and sequence length (opitonal). If the Perplexity here doesn't meet +your needs, you could override the `compute` or `update` method for +caculating Perplexity. + +## BLEU +BLEU (bilingual evaluation understudy) is an algorithm for evaluating the +quality of text which has been machine-translated from one natural language +to another. This metric uses a modified form of precision to compare a +candidate translation against multiple reference translations. + +BLEU could be used as `paddle.metrics.Metric` class, or an ordinary +class. + +When BLEU is used as `paddle.metrics.Metric` class. A function is +needed that transforms the network output to reference string list, and +transforms the label to candidate string. By default, a default function +`_default_trans_func` is provided, which gets target sequence id by +calculating the maximum probability of each step. In this case, user must +provide `vocab`. It should be noted that the BLEU here is different from +the BLEU calculated in prediction, and it is only for observation during +training and evaluation. + +## Rouge +### rouge-l + +## dureader +## chunk +## squad