Unverified commit 54ef8cbc, authored by zhang wenhui, committed by GitHub

Merge pull request #1973 from frankwhzhang/fix_bug

fix Readme
@@ -26,18 +26,24 @@
 ```bash
 wget http://www.statmt.org/lm-benchmark/1-billion-word-language-modeling-benchmark-r13output.tar.gz
+tar xzvf 1-billion-word-language-modeling-benchmark-r13output.tar.gz
+mv 1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/ data/
 ```
 The commands for downloading from the backup data address are as follows:
 ```bash
 wget https://paddlerec.bj.bcebos.com/word2vec/1-billion-word-language-modeling-benchmark-r13output.tar
+tar xvf 1-billion-word-language-modeling-benchmark-r13output.tar
+mv 1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/ data/
 ```
 For quick verification, we also provide the classic text8 sample dataset, which contains 17 million words. Download it as follows:
 ```bash
 wget https://paddlerec.bj.bcebos.com/word2vec/text.tar
+tar xvf text.tar
+mv text data/
 ```
......
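For readers who prefer to script the extract-and-move step from the README hunk instead of running `tar` and `mv` by hand, here is a minimal Python sketch. The helper name `extract_and_move` and its parameters are hypothetical and not part of this commit. Unlike `mv`, it refuses to proceed when the destination already exists, which is an assumption about the desired behavior.

```python
import shutil
import tarfile
from pathlib import Path


def extract_and_move(archive, inner_dir, dest="data"):
    """Unpack `archive` next to itself, then move `inner_dir` to `dest`.

    Hypothetical helper: mirrors the README's `tar x...` + `mv` pair.
    """
    archive = Path(archive)
    work_dir = archive.parent

    # "r:*" lets tarfile detect compression, so .tar and .tar.gz both work.
    with tarfile.open(archive, "r:*") as tar:
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(work_dir, **kwargs)

    source = work_dir / inner_dir
    if not source.is_dir():
        raise FileNotFoundError(f"{source} not found after extraction")

    target = Path(dest)
    if target.exists():
        # Assumption: never merge into or overwrite an existing data directory.
        raise FileExistsError(f"{target} already exists")

    shutil.move(str(source), str(target))
    return target
```

With the text8 sample archive already downloaded, the call would look like `extract_and_move("text.tar", "text")`, which leaves the corpus in `data/` as the README expects.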
@@ -199,7 +199,7 @@ def GetFileList(data_path):
 def train(args):
-    if not os.path.isdir(args.model_output_dir) and args.train_id == 0:
+    if not os.path.isdir(args.model_output_dir) and args.trainer_id == 0:
         os.mkdir(args.model_output_dir)
     filelist = GetFileList(args.train_data_dir)
......
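A note on why the one-word rename matters, assuming `args` comes from `argparse`. Reading an attribute that the parser never defined raises `AttributeError`. Because `and` short-circuits, the old `args.train_id` was only evaluated when the output directory did not exist yet, so the failure would show up on a fresh run only. The sketch below illustrates the guarded directory creation. The parser, its defaults, and the helper `ensure_output_dir` are hypothetical stand-ins, not the script's actual definitions.

```python
import argparse
import os


def parse_args(argv=None):
    # Hypothetical subset of the training flags; defaults are assumptions.
    parser = argparse.ArgumentParser(description="directory-guard sketch")
    parser.add_argument("--model_output_dir", type=str, default="models")
    parser.add_argument("--trainer_id", type=int, default=0)
    return parser.parse_args(argv)


def ensure_output_dir(args):
    """Create the model directory on trainer 0 only.

    Returns True if this call created the directory, False otherwise.
    """
    if args.trainer_id == 0 and not os.path.isdir(args.model_output_dir):
        # makedirs also creates missing parent directories, and exist_ok
        # tolerates the directory appearing between the check and the call.
        os.makedirs(args.model_output_dir, exist_ok=True)
        return True
    return False
```

Using `os.makedirs(..., exist_ok=True)` instead of `os.mkdir` is a design choice of this sketch, not something the commit changes.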