update README

d5c3e3c6 · qiuxuezhong · e99b1ffe · d5c3e3c6

Showing 1 changed file with 1 addition and 1 deletion

fluid/machine_reading_comprehesion/DuReader/README.md +1 -1

--- a/fluid/machine_reading_comprehesion/DuReader/README.md
+++ b/fluid/machine_reading_comprehesion/DuReader/README.md
@@ -35,7 +35,7 @@ cd utils && bash download_thirdparty.sh
 ```
 ### Preprocess the Data
-After the dataset is downloaded, there is still some work to do to run the baseline systems. DuReader dataset offers rich amount of documents for every user question, the documents are too long for popular RC models to cope with. In our baseline models, we preprocess the train set and development set data by selecting the paragraph that is most related to the answer string, while for inferring(no available golden answer), we select the paragraph that is most related to the question string. The preprocessing strategy is implemented in `utils/preprocess.py`. To preprocess the raw data, you should first segment 'question', 'title', 'paragraphs' and then store the segemented result into 'segmented_question', 'segmented_title', 'segmented_paragraphs' like the downloaded preprocessed data, then run:
+After the dataset is downloaded, there is still some work to do to run DuReader. DuReader dataset offers rich amount of documents for every user question, the documents are too long for popular RC models to cope with. In our model, we preprocess the train set and development set data by selecting the paragraph that is most related to the answer string, while for inferring(no available golden answer), we select the paragraph that is most related to the question string. The preprocessing strategy is implemented in `utils/preprocess.py`. To preprocess the raw data, you should first segment 'question', 'title', 'paragraphs' and then store the segemented result into 'segmented_question', 'segmented_title', 'segmented_paragraphs' like the downloaded preprocessed data, then run:
 ```
 cat data/raw/trainset/search.train.json | python utils/preprocess.py > data/preprocessed/trainset/search.train.json
 ```
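
The paragraph-selection step described in the changed README text can be illustrated with a minimal sketch. It assumes token-level recall as the "most related" score; the metric actually used in `utils/preprocess.py` is not shown in this diff, and the helper names `token_recall` and `most_related_paragraph` are hypothetical. Only the `segmented_paragraphs` field name comes from the README.

```python
from collections import Counter


def token_recall(candidate, reference):
    """Fraction of reference tokens also found in candidate (multiset overlap).

    Assumed relatedness score; not confirmed to match utils/preprocess.py.
    """
    if not reference:
        return 0.0
    overlap = Counter(candidate) & Counter(reference)
    return sum(overlap.values()) / len(reference)


def most_related_paragraph(segmented_paragraphs, target_tokens):
    """Return the index of the paragraph scoring highest against target_tokens.

    Returns -1 when the paragraph list is empty.
    """
    best_idx, best_score = -1, -1.0
    for idx, para in enumerate(segmented_paragraphs):
        score = token_recall(para, target_tokens)
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx


# Toy document that is already segmented into tokens.
doc = {
    "segmented_paragraphs": [
        ["the", "weather", "is", "nice"],
        ["paddle", "is", "a", "deep", "learning", "framework"],
    ]
}
answer = ["deep", "learning", "framework"]    # available for train/dev data
question = ["how", "is", "the", "weather"]    # the only signal at inference

# Train/dev: match paragraphs against the answer string.
train_idx = most_related_paragraph(doc["segmented_paragraphs"], answer)
# Inference: no golden answer, so match against the question instead.
infer_idx = most_related_paragraph(doc["segmented_paragraphs"], question)
```

The same selection function serves both cases; only the target token list changes, which mirrors the train versus inference split the README describes.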