Commit aae80b4d
Authored Sep 14, 2020 by LiuChiachi; committed by GitHub on Sep 14, 2020

add text/datasets Chinese doc (#2598)

Parent: 2eac1e60
Showing 8 changed files with 266 additions and 276 deletions (+266, -276)
doc/paddle/api/paddle/text/datasets/conll05/Conll05st_cn.rst (+41, -43)
doc/paddle/api/paddle/text/datasets/imdb/Imdb_cn.rst (+28, -29)
doc/paddle/api/paddle/text/datasets/imikolov/Imikolov_cn.rst (+32, -33)
doc/paddle/api/paddle/text/datasets/movie_reviews/MovieReviews_cn.rst (+27, -28)
doc/paddle/api/paddle/text/datasets/movielens/Movielens_cn.rst (+32, -33)
doc/paddle/api/paddle/text/datasets/uci_housing/UCIHousing_cn.rst (+28, -29)
doc/paddle/api/paddle/text/datasets/wmt14/WMT14_cn.rst (+32, -34)
doc/paddle/api/paddle/text/datasets/wmt16/WMT16_cn.rst (+46, -47)
doc/paddle/api/paddle/text/datasets/conll05/Conll05st_cn.rst @ aae80b4d
@@ -6,59 +6,57 @@ Conll05st

.. py:class:: paddle.text.datasets.Conll05st()

Implementation of the `Conll05st <https://www.cs.upc.edu/~srlconll/soft.html>`_ test dataset.

.. note::
    Only the public Conll05st test dataset can be downloaded automatically.

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- word_dict_file (str) - Path to the word dictionary file. Can be None if :attr:`download` is True. Default: None.
- verb_dict_file (str) - Path to the verb dictionary file. Can be None if :attr:`download` is True. Default: None.
- target_dict_file (str) - Path to the target dictionary file. Can be None if :attr:`download` is True. Default: None.
- emb_file (str) - Path to the embedding dictionary file, used only by :code:`get_embedding`. Can be None if :attr:`download` is True. Default: None.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file`, :attr:`word_dict_file`, :attr:`verb_dict_file`, or :attr:`target_dict_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the Conll05st dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import Conll05st

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, pred_idx, mark, label):
            return paddle.sum(pred_idx), paddle.sum(mark), paddle.sum(label)

    paddle.disable_static()

    conll05st = Conll05st()

    for i in range(10):
        pred_idx, mark, label = conll05st[i][-3:]
        pred_idx = paddle.to_tensor(pred_idx)
        mark = paddle.to_tensor(mark)
        label = paddle.to_tensor(label)

        model = SimpleNet()
        pred_idx, mark, label = model(pred_idx, mark, label)
        print(pred_idx.numpy(), mark.numpy(), label.numpy())
\ No newline at end of file
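The parameter descriptions above say that each file path may be None only when `download` is True. The rule can be sketched in plain Python as below. This is an illustration only: `resolve_path` and `fetch` are hypothetical names, not part of the Paddle API, and the real class may validate its arguments differently.

```python
def resolve_path(path, download, fetch):
    """Return a usable local path under the documented rule.

    path: user-supplied file path, or None.
    download: whether automatic download is allowed.
    fetch: hypothetical zero-argument callable that downloads the
        file and returns its local path.
    """
    if path is not None:
        return path  # an explicit path always takes precedence
    if not download:
        # No path and no permission to download: nothing to load.
        raise ValueError("path is None, so download must be True")
    return fetch()


# An explicit path is returned unchanged; a missing one triggers fetch().
print(resolve_path("conll05st.tar", False, lambda: "unused"))
print(resolve_path(None, True, lambda: "cache/conll05st.tar"))
```

Passing the downloader as a callable keeps the sketch testable without network access.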
doc/paddle/api/paddle/text/datasets/imdb/Imdb_cn.rst @ aae80b4d
@@ -6,46 +6,45 @@ Imdb

.. py:class:: paddle.text.datasets.Imdb()

Implementation of the `IMDB <https://www.imdb.com/interfaces/>`_ dataset.

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- mode (str) - 'train' or 'test'. Default: 'train'.
- cutoff (int) - Cutoff number for building the word dictionary. Default: 150.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the IMDB dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import Imdb

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, doc, label):
            return paddle.sum(doc), label

    paddle.disable_static()

    imdb = Imdb(mode='train')

    for i in range(10):
        doc, label = imdb[i]
        doc = paddle.to_tensor(doc)
        label = paddle.to_tensor(label)

        model = SimpleNet()
        image, label = model(doc, label)
        print(doc.numpy().shape, label.numpy().shape)
\ No newline at end of file
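The `cutoff` parameter controls which words enter the word dictionary. A minimal sketch of one plausible reading follows, assuming a word is kept only when its count is strictly greater than `cutoff` and that an `<unk>` entry receives the last id. `build_word_dict` is a hypothetical helper; the page does not spell out the exact rule the dataset class applies.

```python
from collections import Counter


def build_word_dict(docs, cutoff):
    """Map each sufficiently frequent word to an integer id.

    Assumption: a word is kept only if its count is strictly greater
    than `cutoff`; '<unk>' is appended as the last id.
    """
    counts = Counter(word for doc in docs for word in doc)
    kept = [(w, c) for w, c in counts.items() if c > cutoff]
    # Most frequent first; ties broken alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    word_idx = {w: i for i, (w, _) in enumerate(kept)}
    word_idx["<unk>"] = len(word_idx)
    return word_idx


docs = [["good", "film"], ["good", "plot"], ["good", "film", "bad"]]
print(build_word_dict(docs, cutoff=1))
# {'good': 0, 'film': 1, '<unk>': 2}
```

A larger `cutoff` yields a smaller dictionary, so more words map to `<unk>`.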
doc/paddle/api/paddle/text/datasets/imikolov/Imikolov_cn.rst @ aae80b4d
@@ -6,48 +6,47 @@ Imikolov

.. py:class:: paddle.text.datasets.Imikolov()

Implementation of the imikolov dataset.

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- data_type (str) - 'NGRAM' or 'SEQ'. Default: 'NGRAM'.
- window_size (int) - Sliding window size for 'NGRAM' data. Default: -1.
- mode (str) - 'train' or 'test'. Default: 'train'.
- min_word_freq (int) - Minimum word frequency for building the word dictionary. Default: 50.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the imikolov dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import Imikolov

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, src, trg):
            return paddle.sum(src), paddle.sum(trg)

    paddle.disable_static()

    imikolov = Imikolov(mode='train', data_type='SEQ', window_size=2)

    for i in range(10):
        src, trg = imikolov[i]
        src = paddle.to_tensor(src)
        trg = paddle.to_tensor(trg)

        model = SimpleNet()
        src, trg = model(src, trg)
        print(src.numpy().shape, trg.numpy().shape)
\ No newline at end of file
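The two `data_type` values imply different sample shapes. The sketch below assumes that 'NGRAM' yields fixed-size sliding windows of `window_size` ids and that 'SEQ' yields a source/target pair offset by one token. Both helper functions and the boundary ids are hypothetical; how the dataset class inserts sentence markers is not stated on this page.

```python
def ngram_samples(ids, window_size):
    """Yield every window of `window_size` consecutive ids."""
    for end in range(window_size, len(ids) + 1):
        yield tuple(ids[end - window_size:end])


def seq_sample(ids, start_id, end_id):
    """Return (src, trg): the same sentence shifted by one position."""
    src = [start_id] + ids
    trg = ids + [end_id]
    return src, trg


print(list(ngram_samples([1, 2, 3, 4], 2)))
# [(1, 2), (2, 3), (3, 4)]
print(seq_sample([1, 2, 3], start_id=0, end_id=9))
# ([0, 1, 2, 3], [1, 2, 3, 9])
```

In the 'SEQ' form each `trg` position holds the token that follows the matching `src` position, which is the usual setup for next-word prediction.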
doc/paddle/api/paddle/text/datasets/movie_reviews/MovieReviews_cn.rst @ aae80b4d
@@ -6,45 +6,44 @@ MovieReviews

.. py:class:: paddle.text.datasets.MovieReviews()

Implementation of the `NLTK movie reviews <http://www.nltk.org/nltk_data/>`_ dataset.

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- mode (str) - 'train' or 'test'. Default: 'train'.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the NLTK movie reviews dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import MovieReviews

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, word, category):
            return paddle.sum(word), category

    paddle.disable_static()

    movie_reviews = MovieReviews(mode='train')

    for i in range(10):
        word_list, category = movie_reviews[i]
        word_list = paddle.to_tensor(word_list)
        category = paddle.to_tensor(category)

        model = SimpleNet()
        word_list, category = model(word_list, category)
        print(word_list.numpy().shape, category.numpy())
\ No newline at end of file
doc/paddle/api/paddle/text/datasets/movielens/Movielens_cn.rst @ aae80b4d
@@ -6,48 +6,47 @@ Movielens

.. py:class:: paddle.text.datasets.Movielens()

Implementation of the `Movielens 1-M <https://grouplens.org/datasets/movielens/1m/>`_ dataset.

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- mode (str) - 'train' or 'test'. Default: 'train'.
- test_ratio (float) - Fraction of samples assigned to the test set. Default: 0.1.
- rand_seed (int) - Random seed. Default: 0.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the Movielens 1-M dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import Movielens

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, category, title, rating):
            return paddle.sum(category), paddle.sum(title), paddle.sum(rating)

    paddle.disable_static()

    movielens = Movielens(mode='train')

    for i in range(10):
        category, title, rating = movielens[i][-3:]
        category = paddle.to_tensor(category)
        title = paddle.to_tensor(title)
        rating = paddle.to_tensor(rating)

        model = SimpleNet()
        category, title, rating = model(category, title, rating)
        print(category.numpy().shape, title.numpy().shape, rating.numpy().shape)
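The `test_ratio` and `rand_seed` parameters together describe a reproducible train/test split. A minimal sketch of such a split follows, assuming each sample is assigned to the test set independently with probability `test_ratio`. `split_samples` is a hypothetical helper, and the dataset class may draw its random numbers differently.

```python
import random


def split_samples(samples, mode, test_ratio=0.1, rand_seed=0):
    """Return the 'train' or 'test' part of a seeded random split."""
    rng = random.Random(rand_seed)  # same seed -> same assignment
    want_test = mode == "test"
    # One random draw per sample decides which side it lands on.
    return [s for s in samples if (rng.random() < test_ratio) == want_test]


data = list(range(100))
train = split_samples(data, "train")
test = split_samples(data, "test")
# The two modes partition the data: no sample is lost or duplicated.
print(len(train) + len(test) == len(data))
```

Because both calls replay the same seeded sequence, the 'train' and 'test' results are exact complements of each other.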
doc/paddle/api/paddle/text/datasets/uci_housing/UCIHousing_cn.rst @ aae80b4d
@@ -6,46 +6,45 @@ UCIHousing

.. py:class:: paddle.text.datasets.UCIHousing()

Implementation of the `UCI housing <https://archive.ics.uci.edu/ml/datasets/Housing>`_ dataset.

Parameters
::::::::::

- data_file (str) - Path to the data file. Can be None if :attr:`download` is True. Default: None.
- mode (str) - 'train' or 'test'. Default: 'train'.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the UCI housing dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import UCIHousing

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, feature, target):
            return paddle.sum(feature), target

    paddle.disable_static()

    uci_housing = UCIHousing(mode='train')

    for i in range(10):
        feature, target = uci_housing[i]
        feature = paddle.to_tensor(feature)
        target = paddle.to_tensor(target)

        model = SimpleNet()
        feature, target = model(feature, target)
        print(feature.numpy().shape, target.numpy())
\ No newline at end of file
doc/paddle/api/paddle/text/datasets/wmt14/WMT14_cn.rst @ aae80b4d
@@ -6,50 +6,48 @@ WMT14

.. py:class:: paddle.text.datasets.WMT14()

Implementation of the `WMT14 <http://www.statmt.org/wmt14/>`_ test dataset.

The original WMT14 dataset is too large, so a small subset is provided here. This class downloads the dataset from
http://paddlepaddle.bj.bcebos.com/demo/wmt_shrinked_data/wmt14.tgz

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- mode (str) - 'train', 'test' or 'gen'. Default: 'train'.
- dict_size (int) - Word dictionary size. Default: -1.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the WMT14 dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import WMT14

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, src_ids, trg_ids, trg_ids_next):
            return paddle.sum(src_ids), paddle.sum(trg_ids), paddle.sum(trg_ids_next)

    paddle.disable_static()

    wmt14 = WMT14(mode='train', dict_size=50)

    for i in range(10):
        src_ids, trg_ids, trg_ids_next = wmt14[i]
        src_ids = paddle.to_tensor(src_ids)
        trg_ids = paddle.to_tensor(trg_ids)
        trg_ids_next = paddle.to_tensor(trg_ids_next)

        model = SimpleNet()
        src_ids, trg_ids, trg_ids_next = model(src_ids, trg_ids, trg_ids_next)
        print(src_ids.numpy(), trg_ids.numpy(), trg_ids_next.numpy())
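Each WMT14 sample is a triple `(src_ids, trg_ids, trg_ids_next)`, and `dict_size` bounds the vocabulary. The sketch below shows one common way such a triple is built. It rests on assumptions: the marker tokens `<s>`, `<e>` and `<unk>`, the helpers `to_dict` and `make_sample`, and the choice to count the markers toward `dict_size` are all illustrative and are not confirmed by this page.

```python
START, END, UNK = "<s>", "<e>", "<unk>"


def to_dict(words_by_freq, dict_size):
    """Keep the markers plus the most frequent words, dict_size entries total.

    Assumes dict_size >= 3 so that all three markers fit.
    """
    vocab = [START, END, UNK] + list(words_by_freq)
    return {w: i for i, w in enumerate(vocab[:dict_size])}


def make_sample(src_words, trg_words, src_dict, trg_dict):
    """Convert a sentence pair into (src_ids, trg_ids, trg_ids_next)."""
    src_ids = [src_dict.get(w, src_dict[UNK])
               for w in [START] + src_words + [END]]
    trg = [trg_dict.get(w, trg_dict[UNK]) for w in trg_words]
    trg_ids = [trg_dict[START]] + trg      # decoder input
    trg_ids_next = trg + [trg_dict[END]]   # decoder target, shifted by one
    return src_ids, trg_ids, trg_ids_next


d = to_dict(["a", "b", "c"], dict_size=4)  # only "a" survives the size limit
print(make_sample(["a", "b"], ["a", "c"], d, d))
# ([0, 3, 2, 1], [0, 3, 2], [3, 2, 1])
```

Words outside the truncated dictionary fall back to the `<unk>` id, which is why a small `dict_size` such as 50 still produces valid samples.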
doc/paddle/api/paddle/text/datasets/wmt16/WMT16_cn.rst @ aae80b4d
@@ -6,65 +6,64 @@ WMT16

.. py:class:: paddle.text.datasets.WMT16()

Implementation of the `WMT16 <http://www.statmt.org/wmt16/>`_ test dataset.

ACL2016 Multimodal Machine Translation. For more details, see this website:
http://www.statmt.org/wmt16/multimodal-task.html#task1

If you use this dataset in your task, please cite the following paper:
Multi30K: Multilingual English-German Image Descriptions.

.. code-block:: text

    @article{elliott-EtAl:2016:VL16,
        author    = {{Elliott}, D. and {Frank}, S. and {Sima"an}, K. and {Specia}, L.},
        title     = {Multi30K: Multilingual English-German Image Descriptions},
        booktitle = {Proceedings of the 6th Workshop on Vision and Language},
        year      = {2016},
        pages     = {70--74},
        year      = 2016
    }

Parameters
::::::::::

- data_file (str) - Path to the data tar file. Can be None if :attr:`download` is True. Default: None.
- mode (str) - 'train', 'test' or 'val'. Default: 'train'.
- src_dict_size (int) - Word dictionary size for the source language. Default: -1.
- trg_dict_size (int) - Word dictionary size for the target language. Default: -1.
- lang (str) - Source language, 'en' or 'de'. Default: 'en'.
- download (bool) - Whether to download the dataset automatically if :attr:`data_file` is not set. Default: True.

Returns
:::::::::

``Dataset``, an instance of the WMT16 dataset.

Code Example
::::::::::::

.. code-block:: python

    import paddle
    from paddle.text.datasets import WMT16

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()

        def forward(self, src_ids, trg_ids, trg_ids_next):
            return paddle.sum(src_ids), paddle.sum(trg_ids), paddle.sum(trg_ids_next)

    paddle.disable_static()

    wmt16 = WMT16(mode='train', src_dict_size=50, trg_dict_size=50)

    for i in range(10):
        src_ids, trg_ids, trg_ids_next = wmt16[i]
        src_ids = paddle.to_tensor(src_ids)
        trg_ids = paddle.to_tensor(trg_ids)
        trg_ids_next = paddle.to_tensor(trg_ids_next)

        model = SimpleNet()
        src_ids, trg_ids, trg_ids_next = model(src_ids, trg_ids, trg_ids_next)
        print(src_ids.numpy(), trg_ids.numpy(), trg_ids_next.numpy())
\ No newline at end of file
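The `lang` parameter names the source language, 'en' or 'de'. A small sketch of how that choice could determine the translation direction follows, assuming the target language is simply the other member of the English-German pair. `language_pair` and `order_columns` are hypothetical helpers, not Paddle functions.

```python
def language_pair(lang="en"):
    """Return (source, target) language codes for a given source language."""
    if lang not in ("en", "de"):
        raise ValueError("lang must be 'en' or 'de'")
    # Assumption: the target is whichever of the two is not the source.
    return lang, ("de" if lang == "en" else "en")


def order_columns(en_sentence, de_sentence, lang="en"):
    """Arrange an aligned sentence pair as (source, target)."""
    src_lang, _ = language_pair(lang)
    if src_lang == "en":
        return en_sentence, de_sentence
    return de_sentence, en_sentence


print(language_pair("de"))
# ('de', 'en')
print(order_columns("a dog", "ein Hund", lang="de"))
# ('ein Hund', 'a dog')
```

Under this reading, `src_dict_size` would apply to the language named by `lang` and `trg_dict_size` to the other one.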