# PaddlePALM

PaddlePALM (PArallel Learning from Multi-tasks) is a flexible, general-purpose, and easy-to-use framework for large-scale NLP pretraining and multi-task learning. PALM is a high-level framework for **quickly** developing **high-performance** NLP models. With PALM, a typical supervised-learning or pretraining NLP task takes 8 steps, multi-task learning over prepared tasks takes 6 steps, and adapting your code to large-scale training/inference (multiple GPUs and multiple computation nodes) takes no extra steps.

PaddlePALM also provides state-of-the-art general-purpose architectures (BERT, ERNIE, RoBERTa, ...) as built-in model backbones. The model backbone, dataset reader, and task output layers are decoupled, so you can replace any component with another candidate with only minor changes to your code. In addition, PaddlePALM supports custom development of any component (e.g., backbone, task head, reader, and optimizer), which gives developers the flexibility to adapt to complicated NLP scenarios.

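The following presents some success cases, experimental results for each backbone (BERT, ERNIE, RoBERTa) on several public datasets, and some successful multi-task learning examples.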

<table>
  <tbody>
    <tr>
      <th><strong>Dataset</strong>
        <br></th>
      <th colspan="3"><center><strong>chnsenticorp</strong></center></th>
      <th colspan="3"><center><strong>Quora Question Pairs matching</strong></center></th>
      <th colspan="3"><strong>MSRA-NER<br>(SIGHAN2006)</strong></th>
      <th colspan="2"><strong>CMRC2018</strong></th>
    </tr>
    <tr>
      <td rowspan="2">
        <p>
          <strong>Metric</strong>
          <br></p>
      </td>
      <td colspan="1">
        <center><strong>precision</strong></center>
        <br></td>
      <td colspan="1">
        <strong>recall</strong>
        <br></td>
      <td colspan="1">
        <strong>f1-score</strong>
        <strong></strong>
        <br></td>
      <td colspan="1">
        <center><strong>precision</strong></center>
        <br></td>
      <td colspan="1">
        <strong>recall</strong>
        <br></td>
      <td colspan="1">
        <strong>f1-score</strong>
        <strong></strong>
        <br></td>
      <td colspan="1">
        <center><strong>precision</strong></center>
        <br></td>
      <td colspan="1">
        <strong>recall</strong>
        <br></td>
      <td colspan="1">
        <strong>f1-score</strong>
        <strong></strong>
        <br></td>
      <td colspan="1">
        <strong>em</strong>
        <br></td>
      <td colspan="1">
        <strong>f1-score</strong>
        <br></td>
    </tr>
    <tr>
      <td colspan="3" width="">
        <strong>test</strong>
        <br></td>
      <td colspan="3" width="">
        <strong>test</strong>
        <br></td>
      <td colspan="3" width="">
        <strong>test</strong>
        <br></td>
      <td colspan="2" width="">
        <strong>dev</strong>
        <br></td>
    </tr>
    <tr>
      <td><strong>ERNIE Base</strong></td>
      <td>95.7</td>
      <td>95.0</td>
      <td>95.7</td>
      <td>85.8</td>
      <td>82.4</td>
      <td>81.5</td>
      <td>94.9</td>
      <td>94.5</td>
      <td>94.7</td>
      <td>96.3</td>
      <td>84.0</td>
    </tr>

  </tbody>
</table>


## Package Overview

| module | description |
| - | - |
| **paddlepalm** | an open source NLP pretraining and multi-task learning framework, built on PaddlePaddle. |
| **paddlepalm.reader** | a collection of elastic task-specific dataset readers. |
| **paddlepalm.backbone** | a collection of classic NLP representation models, e.g., BERT, ERNIE, RoBERTa. |
| **paddlepalm.head** | a collection of task-specific output layers. |
| **paddlepalm.lr_sched** | a collection of learning rate schedulers. |
| **paddlepalm.optimizer** | a collection of optimizers. |
| **paddlepalm.downloader** | a download module for pretrained models with their config and vocab files. |
| **paddlepalm.Trainer** | the core unit for starting a single-task training/prediction session. A Trainer builds the computation graph, manages the training and evaluation process, and handles model/checkpoint saving and pretrained-model/checkpoint loading. |
| **paddlepalm.MultiHeadTrainer** | the core unit for starting a multi-task training/prediction session. A MultiHeadTrainer is built from several Trainers. Beyond what it inherits from Trainer, it additionally handles model backbone reuse across tasks, trainer sampling for multi-task learning, and multi-head inference for efficient evaluation and prediction. |


## Installation

PaddlePALM supports both Python 2 and Python 3, Linux and Windows, CPU and GPU. The preferred way to install PaddlePALM is via `pip`. Run the following command in your shell:

```bash
pip install paddlepalm
```

### Installing via source

```shell
git clone https://github.com/PaddlePaddle/PALM.git
cd PALM && python setup.py install
```

### Library Dependencies
- Python >= 2.7
- CUDA >= 9.0
- cuDNN >= 7.0
- PaddlePaddle >= 1.6.3 (see the [installation guide](http://www.paddlepaddle.org/#quick-start) for how to install it)
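
As a quick sanity check of the requirements above, the following sketch reports whether the interpreter meets the minimum Python version and whether the `paddle` and `paddlepalm` modules can be found. The helper name `check_environment` is hypothetical and not part of PaddlePALM. It relies on `importlib.util`, so it runs on Python 3 only, and it does not verify package versions or the CUDA/cuDNN installation.

```python
import importlib.util
import sys


def check_environment(min_python=(2, 7), modules=("paddle", "paddlepalm")):
    """Return a dict describing whether basic requirements look satisfied.

    Hypothetical helper: it only checks that each module can be located,
    not that its version meets the minimum listed in this README.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(check_environment())
```

A `False` entry for `paddle` or `paddlepalm` suggests the corresponding package is not installed in the current environment.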


### Downloading pretrained models
We provide many pretrained models for initializing model backbone parameters. Training a large NLP model (e.g., a 12-layer transformer) from pretrained parameters is in practice much more effective than training from randomly initialized parameters. To list all available pretrained models and download one, run the following code in a Python interpreter (start it by entering `python` in your shell):

```python
>>> from paddlepalm import downloader
>>> downloader.ls('pretrain')
Available pretrain items:
  => roberta-cn-base
  => roberta-cn-large
  => bert-cn-base
  => bert-cn-large
  => bert-en-uncased-base
  => bert-en-uncased-large
  => bert-en-cased-base
  => bert-en-cased-large
  => ernie-en-uncased-base
  => ernie-en-uncased-large
  ...

>>> downloader.download('pretrain', 'bert-en-uncased-base', './pretrain_models')
...
```
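
If a script may run more than once, it can be convenient to skip the download when the target directory already has content. The sketch below wraps the `downloader.download` call shown above. `ensure_pretrained` is a hypothetical helper, and treating a non-empty directory as "already downloaded" is an assumption made for illustration, not something the library checks.

```python
import os


def ensure_pretrained(name, save_dir, download=None):
    """Download pretrained item `name` into `save_dir` unless it has content.

    Returns True if a download was triggered, False if it was skipped.
    """
    # Assumption: a non-empty target directory means the model is already there.
    if os.path.isdir(save_dir) and os.listdir(save_dir):
        return False
    if download is None:
        # Imported lazily so this helper can be defined without paddlepalm installed.
        from paddlepalm import downloader
        download = downloader.download
    # Same argument order as in the interpreter session above.
    download('pretrain', name, save_dir)
    return True
```

Because the check only looks at whether the directory is empty, this sketch works best with one directory per model, e.g. `ensure_pretrained('bert-en-uncased-base', './pretrain_models/bert-en-uncased-base')`.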


## Usage

A typical NLP training task takes 8 steps:

1. Use `paddlepalm.reader` to create a *reader* for dataset loading and input feature generation, then call the `reader.load_data` method to load your training data.
2. Use `paddlepalm.backbone` to create a model *backbone* that extracts text features (e.g., contextual word embeddings, sentence embeddings).
3. Register your *reader* with your *backbone* through the `reader.register_with` method. After this step, your reader can yield the input features used by the backbone.
4. Use `paddlepalm.head` to create a task output *head*. This head provides the task loss for training and the prediction results for model inference.
5. Create a task *trainer* with `paddlepalm.Trainer`, then build the forward graph with the backbone and task head (created in steps 2 and 4) through `trainer.build_forward`.
6. Use `paddlepalm.optimizer` (and `paddlepalm.lr_sched` if necessary) to create an *optimizer*, then build the backward graph through `trainer.build_backward`.
7. Fit the prepared reader and data (from step 1) to the trainer with the `trainer.fit_reader` method.
8. Load a pretrained model with `trainer.load_pretrain`, load a checkpoint with `trainer.load_ckpt`, or do neither to train from scratch, then start training with `trainer.train`.
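
The order of steps 3 to 8 can be sketched as a small helper that receives components already created in steps 1, 2, 4 and 5. Only the method names come from the steps above. The helper `run_single_task`, the `make_optimizer` callback, the argument lists passed to each method, and the idea that `build_forward` returns a loss used to create the optimizer are assumptions for illustration, so check them against the demos before relying on them.

```python
def run_single_task(reader, backbone, head, trainer, make_optimizer,
                    pretrain_path=None, ckpt_path=None):
    """Wire already-constructed components together (steps 3 to 8).

    Hypothetical helper: argument lists of the PaddlePALM methods are assumed.
    """
    # Step 3: let the reader yield the input features the backbone needs.
    reader.register_with(backbone)

    # Step 5: build the forward graph from backbone and head.
    # Assumption: the call returns the loss needed to create the optimizer.
    loss = trainer.build_forward(backbone, head)

    # Step 6: create the optimizer via a caller-supplied factory,
    # then build the backward graph.
    optimizer = make_optimizer(loss)
    trainer.build_backward(optimizer)

    # Step 7: attach the reader (with data loaded in step 1) to the trainer.
    trainer.fit_reader(reader)

    # Step 8: optionally restore parameters, then train.
    if pretrain_path is not None:
        trainer.load_pretrain(pretrain_path)
    elif ckpt_path is not None:
        trainer.load_ckpt(ckpt_path)
    trainer.train()
    return trainer
```

Leaving both `pretrain_path` and `ckpt_path` as `None` corresponds to training from scratch.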

For more implementation details, see the following demos: [Sentiment Classification](), [Quora Question Pairs matching](), [Tagging](), [SQuAD machine Reading Comprehension]().

To save models/checkpoints during training, call the `trainer.set_saver` method. For more implementation details, see [this]().

To run prediction/evaluation after a training stage, create another reader, backbone, and head instance with `phase='predict'` (repeat steps 1~4 above). Then run prediction with the trainer's `predict` method (there is no need to create another trainer). For more implementation details, see [this]().

To run with multi-task learning mode:

1. Create the components (i.e., reader, backbone, and head) for each task by following steps 1~5 above.
2. Create empty trainers (each trainer corresponds to one task) and pass them to create a `MultiHeadTrainer`.
3. Build the multi-task forward graph with the `multi_head_trainer.build_forward` method.
4. Use `paddlepalm.optimizer` (and `paddlepalm.lr_sched` if necessary) to create an *optimizer*, then build the backward graph through `multi_head_trainer.build_backward`.
5. Fit all prepared readers and data to the multi_head_trainer with the `multi_head_trainer.fit_readers` method.
6. Randomly initialize model parameters with `multi_head_trainer.random_init_params` (and load pretrained parameters with `multi_head_trainer.load_pretrain` if needed), then start training with `multi_head_trainer.train`.
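
Steps 3 to 6 of the multi-task flow can be sketched the same way, starting from a `MultiHeadTrainer` that was already created in step 2. Only the method names are taken from the list above. The helper `run_multi_task`, the `make_optimizer` callback, and every argument list (including calling `build_forward` without arguments and passing the readers as one list) are assumptions for illustration.

```python
def run_multi_task(multi_head_trainer, readers, make_optimizer,
                   pretrain_path=None):
    """Drive an existing MultiHeadTrainer through steps 3 to 6.

    Hypothetical helper: argument lists of the PaddlePALM methods are assumed.
    """
    # Step 3: build the multi-task forward graph.
    # Assumption: the call returns the loss needed to create the optimizer.
    loss = multi_head_trainer.build_forward()

    # Step 4: create the optimizer, then build the backward graph.
    optimizer = make_optimizer(loss)
    multi_head_trainer.build_backward(optimizer)

    # Step 5: attach all task readers (assumed to be passed as one list).
    multi_head_trainer.fit_readers(readers)

    # Step 6: random initialization first, then optional pretrained parameters.
    multi_head_trainer.random_init_params()
    if pretrain_path is not None:
        multi_head_trainer.load_pretrain(pretrain_path)
    multi_head_trainer.train()
    return multi_head_trainer
```

Random initialization runs before `load_pretrain` so that parameters missing from the pretrained model (for example, task heads) still receive initial values, which matches the order given in step 6.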

The save/load and predict operations of a multi_head_trainer are the same as those of a trainer.

More implementation details of running multi-task learning with multi_head_trainer can be found [here]().


## License

This tutorial is contributed by [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) and licensed under the [Apache-2.0 license](https://github.com/PaddlePaddle/models/blob/develop/LICENSE).
