# Abstract
DuReader is an end-to-end neural network model for machine reading comprehension style question answering, which aims to answer questions from given passages. We first match the question and passages with a bidirectional attention flow network to obtain the question-aware passage representation. Then we employ pointer networks to locate the positions of answers in the passages. Our experimental evaluations show that the DuReader model achieves state-of-the-art results on the DuReader dataset.
# Dataset
DuReader is a new large-scale, real-world, human-sourced MRC dataset in Chinese. DuReader focuses on real-world open-domain question answering. The advantages of DuReader over existing datasets are summarized as follows:
 - Real questions
 - Real articles
 - Real answers
 - Real application scenarios
 - Rich annotations

# Network
DuReader is inspired by three classic reading comprehension models ([BiDAF](https://arxiv.org/abs/1611.01603), [Match-LSTM](https://arxiv.org/abs/1608.07905), [R-NET](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf)).

The DuReader model is a hierarchical multi-stage process consisting of five layers:

- **Word Embedding Layer** maps each word to a vector using a pre-trained word embedding model.
- **Encoding Layer** extracts contextual information for each position in the question and passages with a bi-directional LSTM network.
- **Attention Flow Layer** couples the query and context vectors and produces a set of query-aware feature vectors for each word in the context. Please refer to [BiDAF](https://arxiv.org/abs/1611.01603) for more details.
- **Fusion Layer** employs two layers of bi-directional LSTM to capture the interaction among context words independent of the query.
- **Decode Layer** employs an answer pointer network with attention pooling of the question to locate the positions of answers in the passages. Please refer to [Match-LSTM](https://arxiv.org/abs/1608.07905) and [R-NET](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf) for more details.
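
To make the Attention Flow Layer concrete, here is a minimal NumPy sketch of BiDAF-style bi-directional attention. It uses a plain dot-product similarity instead of BiDAF's trainable trilinear function, so the shapes and flow are illustrative assumptions, not the repo's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U):
    """Bi-directional attention between context H (T x d) and query U (J x d).

    Returns the query-aware context representation G (T x 4d), following BiDAF.
    The dot-product similarity is a simplification of BiDAF's trainable one.
    """
    S = H @ U.T                                    # (T, J) similarity matrix
    a = softmax(S, axis=1)                         # context-to-query weights
    U_tilde = a @ U                                # (T, d) attended query per context word
    b = softmax(S.max(axis=1))                     # (T,) query-to-context weights
    h_tilde = (b[:, None] * H).sum(axis=0)         # (d,) attended context vector
    H_tilde = np.tile(h_tilde, (H.shape[0], 1))    # (T, d) broadcast over positions
    # Concatenate original, attended, and elementwise-interaction features
    return np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)

T, J, d = 5, 3, 8
G = bidaf_attention(np.random.randn(T, d), np.random.randn(J, d))
print(G.shape)  # (5, 32)
```

In the real model the inputs are the LSTM encodings from the Encoding Layer, and G feeds the Fusion Layer.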

## How to Run
### Download the Dataset
To download the DuReader dataset, run:
```
cd data && bash download.sh
```
For more details about DuReader dataset please refer to [DuReader Dataset Homepage](https://ai.baidu.com//broad/subordinate?dataset=dureader).

### Download Thirdparty Dependencies
We use Bleu and Rouge as evaluation metrics. The calculation of these metrics relies on the scoring scripts under [coco-caption](https://github.com/tylin/coco-caption). To download them, run:

```
cd utils && bash download_thirdparty.sh
```
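
The core quantity behind Bleu is clipped n-gram precision. The following standalone Python sketch (not the coco-caption code itself) illustrates the idea:

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision, the building block of Bleu.

    Counts how many candidate n-grams appear in the reference,
    clipping each n-gram's count at its reference count.
    """
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    clipped = sum(min(c, ref[g]) for g, c in Counter(cand).items())
    return clipped / max(len(cand), 1)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(ngram_precision(cand, ref, 1))  # 5/6, about 0.833
```

Bleu-4 combines precisions for n = 1..4 with a brevity penalty; the coco-caption scripts handle that full computation.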
#### Environment Requirements
For now we have only tested on PaddlePaddle v1.0. To install PaddlePaddle and for more details, see the [PaddlePaddle Homepage](http://paddlepaddle.org).

#### Preparation
Before training the model, we have to make sure that the data is ready. The preparation step checks the data files, makes the necessary directories, and extracts a vocabulary for later use. You can run the following command to do this:

```
sh run.sh --prepare
```
You can specify the files for train/dev/test by setting the `trainset`/`devset`/`testset` options.
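
Conceptually, the vocabulary-extraction part of the prepare step works like the sketch below; the `min_count` threshold and the `<pad>`/`<unk>` tokens are illustrative assumptions, not the repo's exact settings:

```python
from collections import Counter

def build_vocab(tokenized_texts, min_count=2):
    """Map each sufficiently frequent word to an integer id.

    Reserved ids 0 and 1 for padding and unknown words are
    assumptions of this sketch.
    """
    counts = Counter(w for text in tokenized_texts for w in text)
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, count in counts.most_common():
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab

texts = [["what", "is", "dureader"], ["dureader", "is", "a", "dataset"]]
vocab = build_vocab(texts, min_count=2)
print(vocab)  # only "is" and "dureader" occur twice, so only they enter the vocab
```

At training time, words missing from the vocabulary are mapped to the unknown-word id.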
#### Training
To train the model, you can also set hyper-parameters such as the learning rate by using `--learning_rate NUM`. For example, to train the model for 10 passes, you can run:

```
sh run.sh --train --pass_num 10
```

The training process includes an evaluation on the dev set after each training epoch. By default, the model with the best Bleu-4 score on the dev set will be saved.

#### Evaluation
To conduct a single evaluation on the dev set with a model that has already been trained, you can run the following command:

```
sh run.sh --evaluate  --load_dir models/1
```

#### Prediction
You can also predict answers for the samples in a specified file using the following command:

```
sh run.sh --predict --load_dir models/1 --testset ../data/demo/devset/search.dev.json
```

By default, the results are saved in the `../data/results/` folder. You can change this by specifying `--result_dir DIR_PATH`.