<html>
<head>
  <script type="text/x-mathjax-config">
  MathJax.Hub.Config({
    extensions: ["tex2jax.js", "TeX/AMSsymbols.js", "TeX/AMSmath.js"],
    jax: ["input/TeX", "output/HTML-CSS"],
    tex2jax: {
      inlineMath: [ ['$','$'] ],
      displayMath: [ ['$$','$$'] ],
      processEscapes: true
    },
    "HTML-CSS": { availableFonts: ["TeX"] }
  });
  </script>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js" async></script>
  <script type="text/javascript" src="../.tools/theme/marked.js">
  </script>
  <link href="http://cdn.bootcss.com/highlight.js/9.9.0/styles/darcula.min.css" rel="stylesheet">
  <script src="http://cdn.bootcss.com/highlight.js/9.9.0/highlight.min.js"></script>
  <link href="http://cdn.bootcss.com/bootstrap/4.0.0-alpha.6/css/bootstrap.min.css" rel="stylesheet">
  <link href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" rel="stylesheet">
  <link href="../.tools/theme/github-markdown.css" rel='stylesheet'>
</head>
<style type="text/css" >
.markdown-body {
    box-sizing: border-box;
    min-width: 200px;
    max-width: 980px;
    margin: 0 auto;
    padding: 45px;
}
</style>


<body>

<div id="context" class="container-fluid markdown-body">
</div>

<!-- This block will be replaced by each markdown file content. Please do not change lines below.-->
<div id="markdown" style='display:none'>
# Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering

This model implements the work in the following paper:

Peng Li, Wei Li, Zhengyan He, Xuguang Wang, Ying Cao, Jie Zhou, and Wei Xu. Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering. [arXiv:1607.06275](https://arxiv.org/abs/1607.06275).

If you use the dataset/code in your research, please cite the above paper:

```text
@article{li:2016:arxiv,
  author  = {Li, Peng and Li, Wei and He, Zhengyan and Wang, Xuguang and Cao, Ying and Zhou, Jie and Xu, Wei},
  title   = {Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering},
  journal = {arXiv:1607.06275v2},
  year    = {2016},
  url     = {https://arxiv.org/abs/1607.06275v2},
}
```


## Installation

1. Install PaddlePaddle v0.10.5 using one of the following commands. Note that v0.10.0 is not supported.
    ```bash
    # either one is OK
    # CPU
    pip install paddlepaddle
    # GPU
    pip install paddlepaddle-gpu
    ```
2. Download the [WebQA](http://idl.baidu.com/WebQA.html) dataset by running
   ```bash
   cd data && ./download.sh && cd ..
   ```

## Hyperparameters

All the hyperparameters are defined in `config.py`. The default values are aligned with the paper.

## Training

Training can be launched using the following command:

```bash
PYTHONPATH=data/evaluation:$PYTHONPATH python train.py 2>&1 | tee train.log
```
## Validation and Test

WebQA provides two versions of the validation and test sets. Automatic validation and testing can be launched with

```bash
PYTHONPATH=data/evaluation:$PYTHONPATH python val_and_test.py models [ann|ir]
```

where

* `models`: the directory where model files are stored. You can use `models` if `config.py` is not changed.
* `ann`: using the validation and test sets with annotated evidence.
* `ir`: using the validation and test sets with retrieved evidence.

Note that validation and testing can run simultaneously with training. `val_and_test.py` handles the synchronization-related problems.

Intermediate results are stored in the directory `tmp`. You can delete them safely after validation and test.

The results should be comparable with those shown in Table 3 in the paper.

## Inferring using a Trained Model

Infer using a trained model by running:
```bash
PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
  MODEL_FILE \
  INPUT_DATA \
  OUTPUT_FILE \
  2>&1 | tee infer.log
```

where

* `MODEL_FILE`: a trained model produced by `train.py`.
* `INPUT_DATA`: input data in the same format as the validation/test sets of the WebQA dataset.
* `OUTPUT_FILE`: results in the format specified in the WebQA dataset for the evaluation scripts.

## Pre-trained Models

We provide two pre-trained models, one for the validation and test sets with annotated evidence and one for those with retrieved evidence. Each model was selected by its performance on the corresponding version of the validation set, which is consistent with the paper.

The models can be downloaded with
```bash
cd pre-trained-models && ./download-models.sh && cd ..
```

The evaluation result on the test set with annotated evidence can be reproduced with

```bash
PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
  pre-trained-models/params_pass_00010.tar.gz \
  data/data/test.ann.json.gz \
  test.ann.output.txt.gz

PYTHONPATH=data/evaluation:$PYTHONPATH \
  python data/evaluation/evaluate-tagging-result.py \
  test.ann.output.txt.gz \
  data/data/test.ann.json.gz \
  --fuzzy --schema BIO2
# The result should be
# chunk_f1=0.739091 chunk_precision=0.686119 chunk_recall=0.800926 true_chunks=3024 result_chunks=3530 correct_chunks=2422
```

The evaluation result on the test set with retrieved evidence can be reproduced with

```bash
PYTHONPATH=data/evaluation:$PYTHONPATH python infer.py \
  pre-trained-models/params_pass_00021.tar.gz \
  data/data/test.ir.json.gz \
  test.ir.output.txt.gz

PYTHONPATH=data/evaluation:$PYTHONPATH \
  python data/evaluation/evaluate-voting-result.py \
  test.ir.output.txt.gz \
  data/data/test.ir.json.gz \
  --fuzzy --schema BIO2
# The result should be
# chunk_f1=0.749358 chunk_precision=0.727868 chunk_recall=0.772156 true_chunks=3024 result_chunks=3208 correct_chunks=2335
```
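
The result lines printed above report three chunk counts together with the derived scores. As a rough sanity check, the scores can be recomputed from the counts. The sketch below assumes the standard definitions (precision = correct / result, recall = correct / true, F1 = harmonic mean of the two); the numbers quoted above are consistent with these definitions, but the function names `chunk_metrics` and `parse_result_line` are hypothetical helpers and not part of the evaluation scripts.

```python
def parse_result_line(line):
    """Parse a 'key=value key=value ...' result line into a dict of floats."""
    return {key: float(value)
            for key, value in (item.split("=") for item in line.split())}


def chunk_metrics(true_chunks, result_chunks, correct_chunks):
    """Return (precision, recall, f1) under the standard definitions (an assumption)."""
    precision = correct_chunks / float(result_chunks) if result_chunks else 0.0
    recall = correct_chunks / float(true_chunks) if true_chunks else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Result line copied from the annotated-evidence run above.
reported = parse_result_line(
    "chunk_f1=0.739091 chunk_precision=0.686119 chunk_recall=0.800926 "
    "true_chunks=3024 result_chunks=3530 correct_chunks=2422")

precision, recall, f1 = chunk_metrics(
    reported["true_chunks"], reported["result_chunks"], reported["correct_chunks"])
print("precision=%.6f recall=%.6f f1=%.6f" % (precision, recall, f1))
# prints: precision=0.686119 recall=0.800926 f1=0.739091
```

If the recomputed values differ from the reported ones by more than rounding error, the output file and the reference file passed to the evaluation script probably do not belong to the same version of the test set.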

</div>
<!-- You can change the lines below now. -->

<script type="text/javascript">
marked.setOptions({
  renderer: new marked.Renderer(),
  gfm: true,
  breaks: false,
  smartypants: true,
  highlight: function(code, lang) {
    code = code.replace(/&amp;/g, "&")
    code = code.replace(/&gt;/g, ">")
    code = code.replace(/&lt;/g, "<")
    code = code.replace(/&nbsp;/g, " ")
    return hljs.highlightAuto(code, [lang]).value;
  }
});
document.getElementById("context").innerHTML = marked(
        document.getElementById("markdown").innerHTML)
</script>
</body>