PaddlePaddle / models
Commit 266b8eeb
Authored on Oct 13, 2017 by peterzhang2029

refine notation

Parent: e89af968

Showing 5 changed files with 62 additions and 66 deletions (+62 -66)
nested_sequence/text_classification/README.md (+18 -17)
nested_sequence/text_classification/index.html (+18 -17)
nested_sequence/text_classification/infer.py (+10 -14)
nested_sequence/text_classification/network_conf.py (+0 -2)
nested_sequence/text_classification/train.py (+16 -16)
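nested_sequence/text_classification/README.md @ 266b8eeb
@@ -76,7 +76,7 @@ python train.py
 ### Inference
 After training finishes, the model is saved in the specified directory (`models` by default). Run the following in a terminal:
 ```bash
-python infer.py
+python infer.py --model_path 'models/params_pass_00000.tar.gz'
 ```
 By default, the inference script loads the model saved after one training pass and evaluates it on the `imdb test set`.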
@@ -139,20 +139,21 @@ def train_reader(data_dir, word_dict):
 The `train.py` training script accepts the following parameters:
 ```
 Options:
-  --train_data_dir TEXT   path of training dataset (default: None). if this
+  --train_data_dir TEXT   The path of training dataset (default: None). If this
                           parameter is not set, imdb dataset will be used.
-  --test_data_dir TEXT    path of testing dataset (default: None). if this
+  --test_data_dir TEXT    The path of testing dataset (default: None). If this
                           parameter is not set, imdb dataset will be used.
-  --word_dict_path TEXT   path of word dictionary (default: None).if this
-                          parameter is not set, imdb dataset will be used.if
+  --word_dict_path TEXT   The path of word dictionary (default: None). If this
+                          parameter is not set, imdb dataset will be used. If
                           this parameter is set, but the file does not exist,
                           word dictionay will be built from the training data
                           automatically.
-  --class_num INTEGER     class number (default: 2).
-  --batch_size INTEGER    the number of training examples in one batch
+  --class_num INTEGER     The class number (default: 2).
+  --batch_size INTEGER    The number of training examples in one batch
                           (default: 32).
-  --num_passes INTEGER    number of passes to train (default: 10).
-  --model_save_dir TEXT   path to save the trained models (default: 'models').
+  --num_passes INTEGER    The number of passes to train (default: 10).
+  --model_save_dir TEXT   The path to save the trained models (default:
+                          'models').
   --help                  Show this message and exit.
 ```
@@ -170,20 +171,20 @@ python train.py --train_data_dir 'data/train_data' --test_data_dir 'data/test_da
 ```
 Options:
-  --data_path TEXT        path of data for inference (default: None). if this
-                          parameter is not set, imdb test dataset will be used.
-  --model_path TEXT       path of saved model. (default:
-                          'models/params_pass_00000.tar.gz')
-  --word_dict_path TEXT   path of word dictionary (default: None).if this
+  --data_path TEXT        The path of data for inference (default: None). If
+                          this parameter is not set, imdb test dataset will be
+                          used.
+  --model_path TEXT       The path of saved model. [required]
+  --word_dict_path TEXT   The path of word dictionary (default: None). If this
                           parameter is not set, imdb dataset will be used.
-  --class_num INTEGER     class number (default: 2).
-  --batch_size INTEGER    the number of examples in one batch (default: 32).
+  --class_num INTEGER     The class number (default: 2).
+  --batch_size INTEGER    The number of examples in one batch (default: 32).
   --help                  Show this message and exit.
 ```
 2. Using the sample data in the `data` directory as an example, run the following in a terminal:
 ```bash
-python infer.py --data_path 'data/infer.txt' --word_dict_path 'dict.txt'
+python infer.py --data_path 'data/infer.txt' --word_dict_path 'dict.txt' --model_path 'models/params_pass_00000.tar.gz'
 ```
 This runs inference on the sample data.
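After this change the README commands always pass `--model_path` explicitly. As a rough illustration only, the sketch below builds such a path for a given pass index. The helper names `model_path_for_pass` and `infer_command` are hypothetical, and the zero-padded `params_pass_%05d.tar.gz` pattern is an assumption inferred from the single example file name `models/params_pass_00000.tar.gz`; the real naming is whatever `train.py` writes.

```python
import posixpath


def model_path_for_pass(model_save_dir, pass_id):
    """Return the assumed checkpoint path for a given training pass.

    Assumption: checkpoints follow the pattern seen in the README example,
    'params_pass_00000.tar.gz', i.e. a five-digit zero-padded pass index.
    posixpath is used because the README writes paths with forward slashes.
    """
    if pass_id < 0:
        raise ValueError("pass_id must be non-negative")
    return posixpath.join(model_save_dir, "params_pass_%05d.tar.gz" % pass_id)


def infer_command(model_save_dir="models", pass_id=0):
    """Build the `python infer.py` command line in the form shown in the README."""
    return "python infer.py --model_path '%s'" % model_path_for_pass(
        model_save_dir, pass_id)
```

With the defaults, `infer_command()` reproduces the first README command; a different pass index only changes the numeric part of the file name, under the naming assumption stated above.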
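nested_sequence/text_classification/index.html @ 266b8eeb
@@ -118,7 +118,7 @@ python train.py
 ### Inference
 After training finishes, the model is saved in the specified directory (`models` by default). Run the following in a terminal:
 ```bash
-python infer.py
+python infer.py --model_path 'models/params_pass_00000.tar.gz'
 ```
 By default, the inference script loads the model saved after one training pass and evaluates it on the `imdb test set`.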
@@ -181,20 +181,21 @@ def train_reader(data_dir, word_dict):
 The `train.py` training script accepts the following parameters:
 ```
 Options:
-  --train_data_dir TEXT   path of training dataset (default: None). if this
+  --train_data_dir TEXT   The path of training dataset (default: None). If this
                           parameter is not set, imdb dataset will be used.
-  --test_data_dir TEXT    path of testing dataset (default: None). if this
+  --test_data_dir TEXT    The path of testing dataset (default: None). If this
                           parameter is not set, imdb dataset will be used.
-  --word_dict_path TEXT   path of word dictionary (default: None).if this
-                          parameter is not set, imdb dataset will be used.if
+  --word_dict_path TEXT   The path of word dictionary (default: None). If this
+                          parameter is not set, imdb dataset will be used. If
                           this parameter is set, but the file does not exist,
                           word dictionay will be built from the training data
                           automatically.
-  --class_num INTEGER     class number (default: 2).
-  --batch_size INTEGER    the number of training examples in one batch
+  --class_num INTEGER     The class number (default: 2).
+  --batch_size INTEGER    The number of training examples in one batch
                           (default: 32).
-  --num_passes INTEGER    number of passes to train (default: 10).
-  --model_save_dir TEXT   path to save the trained models (default: 'models').
+  --num_passes INTEGER    The number of passes to train (default: 10).
+  --model_save_dir TEXT   The path to save the trained models (default:
+                          'models').
   --help                  Show this message and exit.
 ```
@@ -212,20 +213,20 @@ python train.py --train_data_dir 'data/train_data' --test_data_dir 'data/test_da
 ```
 Options:
-  --data_path TEXT        path of data for inference (default: None). if this
-                          parameter is not set, imdb test dataset will be used.
-  --model_path TEXT       path of saved model. (default:
-                          'models/params_pass_00000.tar.gz')
-  --word_dict_path TEXT   path of word dictionary (default: None).if this
+  --data_path TEXT        The path of data for inference (default: None). If
+                          this parameter is not set, imdb test dataset will be
+                          used.
+  --model_path TEXT       The path of saved model. [required]
+  --word_dict_path TEXT   The path of word dictionary (default: None). If this
                           parameter is not set, imdb dataset will be used.
-  --class_num INTEGER     class number (default: 2).
-  --batch_size INTEGER    the number of examples in one batch (default: 32).
+  --class_num INTEGER     The class number (default: 2).
+  --batch_size INTEGER    The number of examples in one batch (default: 32).
   --help                  Show this message and exit.
 ```
 2. Using the sample data in the `data` directory as an example, run the following in a terminal:
 ```bash
-python infer.py --data_path 'data/infer.txt' --word_dict_path 'dict.txt'
+python infer.py --data_path 'data/infer.txt' --word_dict_path 'dict.txt' --model_path 'models/params_pass_00000.tar.gz'
 ```
 This runs inference on the sample data.
nested_sequence/text_classification/infer.py @ 266b8eeb
@@ -14,28 +14,24 @@ from utils import logger, load_dict
 @click.option(
     "--data_path",
     default=None,
-    help=("path of data for inference (default: None). "
-          "if this parameter is not set, "
+    help=("The path of data for inference (default: None). "
+          "If this parameter is not set, "
           "imdb test dataset will be used."))
 @click.option(
-    "--model_path",
-    type=str,
-    default='models/params_pass_00000.tar.gz',
-    help=("path of saved model. "
-          "(default: 'models/params_pass_00000.tar.gz')"))
+    "--model_path", type=str, required=True, help="The path of saved model.")
 @click.option(
     "--word_dict_path",
     type=str,
     default=None,
-    help=("path of word dictionary (default: None)."
-          "if this parameter is not set, imdb dataset will be used."))
+    help=("The path of word dictionary (default: None). "
+          "If this parameter is not set, imdb dataset will be used."))
 @click.option(
-    "--class_num", type=int, default=2, help="class number (default: 2).")
+    "--class_num", type=int, default=2, help="The class number (default: 2).")
 @click.option(
     "--batch_size",
     type=int,
     default=32,
-    help="the number of examples in one batch (default: 32).")
+    help="The number of examples in one batch (default: 32).")
 def infer(data_path, model_path, word_dict_path, batch_size, class_num):
     def _infer_a_batch(inferer, test_batch, ids_2_word):
         probs = inferer.infer(input=test_batch, field=["value"])
@@ -49,8 +45,8 @@ def infer(data_path, model_path, word_dict_path, batch_size, class_num):
                    " ".join(["{:0.4f}".format(p) for p in prob]), word_text))
-    assert os.path.exists(model_path), "the trained model does not exist."
-    logger.info("begin to predict...")
+    assert os.path.exists(model_path), "The trained model does not exist."
+    logger.info("Begin to predict...")
     use_default_data = (data_path is None)
     if use_default_data:
@@ -61,7 +57,7 @@ def infer(data_path, model_path, word_dict_path, batch_size, class_num):
         class_num = 2
     else:
         assert os.path.exists(
-            word_dict_path), "the word dictionary file does not exist"
+            word_dict_path), "The word dictionary file does not exist"
         word_dict = load_dict(word_dict_path)
         word_reverse_dict = dict((value, key)
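The one behavioural change in `infer.py` is that `--model_path` loses its default and becomes required. A minimal sketch of that behaviour follows. It uses the standard-library `argparse` as a stand-in for `click` so that it runs without third-party packages; it is not the script's actual code, and the function names `build_parser` and `parse_or_none` are hypothetical.

```python
import argparse


def build_parser():
    """Mirror the infer.py options from the diff, with argparse standing in for click."""
    parser = argparse.ArgumentParser(prog="infer.py")
    parser.add_argument("--data_path", default=None)
    # After this commit there is no default: the caller must pass the model path.
    parser.add_argument("--model_path", type=str, required=True,
                        help="The path of saved model.")
    parser.add_argument("--word_dict_path", type=str, default=None)
    parser.add_argument("--class_num", type=int, default=2)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser


def parse_or_none(argv):
    """Return the parsed options, or None when argparse rejects the command line."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage message and exits when a required option is missing.
        return None
```

Calling `parse_or_none([])` returns `None`, which corresponds to the old bare `python infer.py` invocation no longer being accepted, while passing `--model_path` leaves the other options at their documented defaults.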
nested_sequence/text_classification/network_conf.py @ 266b8eeb
@@ -7,8 +7,6 @@ def cnn_cov_group(group_input, hidden_size):
     conv4 = paddle.networks.sequence_conv_pool(
         input=group_input, context_len=4, hidden_size=hidden_size)
-    #output_group = paddle.layer.concat(input=[conv3, conv4])
     output_group = paddle.layer.fc(input=[conv3, conv4],
                                    size=hidden_size,
nested_sequence/text_classification/train.py @ 266b8eeb
@@ -14,42 +14,42 @@ from utils import build_dict, load_dict, logger
 @click.option(
     "--train_data_dir",
     default=None,
-    help=("path of training dataset (default: None). "
-          "if this parameter is not set, "
+    help=("The path of training dataset (default: None). "
+          "If this parameter is not set, "
           "imdb dataset will be used."))
 @click.option(
     "--test_data_dir",
     default=None,
-    help=("path of testing dataset (default: None). "
-          "if this parameter is not set, "
+    help=("The path of testing dataset (default: None). "
+          "If this parameter is not set, "
           "imdb dataset will be used."))
 @click.option(
     "--word_dict_path",
     type=str,
     default=None,
-    help=("path of word dictionary (default: None)."
-          "if this parameter is not set, imdb dataset will be used."
-          "if this parameter is set, but the file does not exist, "
+    help=("The path of word dictionary (default: None). "
+          "If this parameter is not set, imdb dataset will be used. "
+          "If this parameter is set, but the file does not exist, "
           "word dictionay will be built from "
           "the training data automatically."))
 @click.option(
-    "--class_num", type=int, default=2, help="class number (default: 2).")
+    "--class_num", type=int, default=2, help="The class number (default: 2).")
 @click.option(
     "--batch_size",
     type=int,
     default=32,
-    help=("the number of training examples in one batch "
+    help=("The number of training examples in one batch "
           "(default: 32)."))
 @click.option(
     "--num_passes",
     type=int,
     default=10,
-    help="number of passes to train (default: 10).")
+    help="The number of passes to train (default: 10).")
 @click.option(
     "--model_save_dir",
     type=str,
     default="models",
-    help="path to save the trained models (default: 'models').")
+    help="The path to save the trained models (default: 'models').")
 def train(train_data_dir, test_data_dir, word_dict_path, class_num,
           model_save_dir, batch_size, num_passes):
     """
@@ -70,7 +70,7 @@ def train(train_data_dir, test_data_dir, word_dict_path, class_num,
     :type num_pass: int
     """
     if train_data_dir is not None:
-        assert word_dict_path, ("the parameter train_data_dir, word_dict_path "
+        assert word_dict_path, ("The parameter train_data_dir, word_dict_path "
                                 "should be set at the same time.")
     if not os.path.exists(model_save_dir):
@@ -81,7 +81,7 @@ def train(train_data_dir, test_data_dir, word_dict_path, class_num,
     if use_default_data:
         logger.info(("No training data are porivided, "
                      "use imdb to train the model."))
-        logger.info("please wait to build the word dictionary ...")
+        logger.info("Please wait to build the word dictionary ...")
         word_dict = reader.imdb_word_dict()
@@ -94,7 +94,7 @@ def train(train_data_dir, test_data_dir, word_dict_path, class_num,
         class_num = 2
     else:
         if word_dict_path is None or not os.path.exists(word_dict_path):
-            logger.info(("word dictionary is not given, the dictionary "
+            logger.info(("Word dictionary is not given, the dictionary "
                          "is automatically built from the training data."))
             # build the word dictionary to map the original string-typed
@@ -107,7 +107,7 @@ def train(train_data_dir, test_data_dir, word_dict_path, class_num,
             word_dict = load_dict(word_dict_path)
         class_num = class_num
-        logger.info("class number is : %d." % class_num)
+        logger.info("Class number is : %d." % class_num)
     train_reader = paddle.batch(
         paddle.reader.shuffle(
@@ -129,7 +129,7 @@ def train(train_data_dir, test_data_dir, word_dict_path, class_num,
     emb_size = 28
     hidden_size = 128
-    logger.info("length of word dictionary is : %d." % (dict_dim))
+    logger.info("Length of word dictionary is : %d." % (dict_dim))
     paddle.init(use_gpu=True, trainer_count=4)
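The `train.py` hunks above only touch message text, but they pass through the branches that decide where the word dictionary comes from. The sketch below restates that decision as a small pure function so the three outcomes are easy to see. The function name `choose_word_dict_source` and the injectable `exists` parameter are hypothetical, and treating `use_default_data` as `train_data_dir is None` is an assumption, since the diff does not show that assignment in `train.py`.

```python
import os


def choose_word_dict_source(train_data_dir, word_dict_path,
                            exists=os.path.exists):
    """Return 'imdb', 'build', or 'load', following the branches visible in the diff.

    Assumption: default data is used exactly when train_data_dir is None.
    """
    if train_data_dir is not None:
        # Same check as the assert in train(): both options must be set together.
        assert word_dict_path, ("The parameter train_data_dir, word_dict_path "
                                "should be set at the same time.")
    if train_data_dir is None:
        return "imdb"   # default data: the dictionary comes from the imdb reader
    if word_dict_path is None or not exists(word_dict_path):
        return "build"  # the dictionary is built from the training data
    return "load"       # the dictionary file exists and is loaded
```

Passing a stub for `exists` keeps the sketch testable without touching the file system; the real script calls `os.path.exists` directly.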