Merge pull request #71 from WeiyueSu/erniesage

example data format

05276913 · Weiyue Su · GitHub · e18c14c7 · cbf4a1a3 · 05276913

Showing 9 additions and 0 deletions

examples/erniesage/README.en.md +4 -0

examples/erniesage/README.md +5 -0

--- a/examples/erniesage/README.en.md
+++ b/examples/erniesage/README.en.md
@@ -32,6 +32,10 @@ Thanks to the flexibility and usability of PGL, **ERNIESage** can be quickly imp
 - pgl>=1.1

 ## Dataformat
+The example data ```data.txt``` uses part of NLPCC2016-DBQA; each line has the format "query \t answer".
+```text
+NLPCC2016-DBQA is a sub-task of the NLPCC-ICCPOL 2016 Shared Task, hosted by NLPCC (Natural Language Processing and Chinese Computing). The task is to select documents from a set of candidates to answer the questions. [url: http://tcci.ccf.org.cn/conference/2016/dldoc/evagline2.pdf]
+```
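The "query \t answer" layout described above can be read with a few lines of Python. This is a minimal sketch, not the loader that ERNIESage itself uses: the function names `parse_pairs` and `load_pairs` are hypothetical, and it assumes UTF-8 text with exactly one tab per non-blank line.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (query, answer) tuples from tab-separated lines.

    Assumption: each non-blank line holds exactly two fields separated
    by a single tab. Anything else is reported as an error.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated fields, got {len(parts)}"
            )
        yield parts[0], parts[1]


def load_pairs(path: str, encoding: str = "utf-8") -> List[Tuple[str, str]]:
    """Read a whole file (for example data.txt) into a list of pairs."""
    with open(path, encoding=encoding) as f:
        return list(parse_pairs(f))
```

Calling `load_pairs("data.txt")` would return a list of `(query, answer)` tuples. Raising on malformed lines is a deliberate choice in this sketch, so that a stray tab inside an answer is noticed rather than silently truncated.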

 ## How to run


--- a/examples/erniesage/README.md
+++ b/examples/erniesage/README.md
@@ -32,11 +32,16 @@
 - pgl>=1.1

 ## Dataformat
+The example data ```data.txt``` uses part of the NLPCC2016-DBQA data; each line has the format "query \t answer".
+```text
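+NLPCC2016-DBQA is an evaluation task held in 2016 by NLPCC, the International Conference on Natural Language Processing and Chinese Computing. Its goal is to find suitable documents among the candidates to answer the questions. [link: http://tcci.ccf.org.cn/conference/2016/dldoc/evagline2.pdf]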
+```

 ## How to run

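 We use [PaddlePaddle Fleet](https://github.com/PaddlePaddle/Fleet) as our distributed training framework. The ```config/*.yaml``` files contain several configurations for training ERNIESage; the ERNIE model ```ckpt_path``` and the vocabulary ```ernie_vocab_file``` can be downloaded from [ERNIE](https://github.com/PaddlePaddle/ERNIE).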

+
 ```sh
 # ERNIESage in distributed GPU mode or single-machine mode
 sh local_run.sh config/erniesage_v2_gpu.yaml