This project is an implementation of [XLNet](https://github.com/zihangdai/xlnet) on Paddle Fluid. It currently supports fine-tuning on all downstream tasks, including natural language inference, question answering (SQuAD), etc.
There are many differences between XLNet and [BERT](../BERT). XLNet takes advantage of a novel model, [Transformer-XL](https://arxiv.org/abs/1901.02860), as the backbone of its language representation, and uses permutation language modeling as its training objective. XLNet also involves much more data in the pre-training stage. As a result, XLNet achieved state-of-the-art results on several NLP tasks.
For more details, please refer to the research paper
[XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237)
## Installation
This project requires Paddle Fluid version 1.6.0 or later; please follow the [installation guide](https://www.paddlepaddle.org.cn/start) to install it.
## Pre-trained models
Two pre-trained models converted from the official release are available:
Each compressed package contains one subdirectory and two files:
- `params`: a directory containing all converted parameters, one file per parameter.
- `spiece.model`: a [Sentence Piece](https://github.com/google/sentencepiece) model used for (de)tokenization.
- `xlnet_config.json`: a config file specifying the hyperparameters of the model.
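After extracting a package, the config file can be inspected with a few lines of Python as a quick sanity check. This is a minimal sketch: the hyperparameter names shown (`n_layer`, `d_model`, etc.) follow the official XLNet release and are an assumption here; verify them against your own `xlnet_config.json`.

```python
import json

def load_xlnet_config(path):
    """Load the model hyperparameters from xlnet_config.json."""
    with open(path) as f:
        return json.load(f)

# Illustrative (truncated) XLNet-Large hyperparameters; the exact key
# names are an assumption -- check the downloaded config file.
config = {"n_layer": 24, "d_model": 1024, "n_head": 16, "d_head": 64}
print("layers:", config["n_layer"])       # layers: 24
print("hidden size:", config["d_model"])  # hidden size: 1024
```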
## Fine-tuning with XLNet
We provide scripts for fine-tuning XLNet on NLP tasks with multiple GPUs. Their correctness has been verified: all experiments on V100 GPUs achieve the same performance as officially reported (mainly obtained on TPUs). In the following, we assume that the two pre-trained models have been downloaded and extracted.
### Text regression/classification
Fine-tuning for regression/classification can be performed via the script `run_classifier.py`, which contains examples for standard single-document classification, single-document regression, and document-pair classification. The two examples below, one for regression and one for classification, proceed as follows.
- Download the [GLUE data](https://gluebenchmark.com/tasks) by running [this script](https://gist.github.com/W4ngatang/60c2bdb54d156a41194446737ce03e2e) and unpack it to some directory `$GLUE_DIR`.
- **Note**: You may encounter the error `ImportError: No module named request` when running the script under Python 2.x, because the Python 2 `urllib` module has no `request` submodule. This can be resolved by replacing all occurrences of `urllib.request` with `urllib`, or by switching to a Python 3.x environment.
- Perform fine-tuning on 4 V100 GPUs with XLNet-Large
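As an alternative to editing the download script, the `urllib` incompatibility mentioned in the note above can be sidestepped with a version-agnostic import; a minimal sketch:

```python
# Works on both Python 2 and 3: urllib.request exists only on Python 3,
# so fall back to the Python 2 location on ImportError.
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve          # Python 2

print(callable(urlretrieve))  # True
```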