机器未来 / Paddle (forked from PaddlePaddle / Paddle)
Commit d55cfc60
Authored on June 22, 2018 by Tao Luo
Committed by GitHub on June 22, 2018

Merge pull request #11640 from wojtuss/wojtuss/cycle-cifar-flowers

added cycling the cifar and flowers datasets

Parents: dda24f18, 50e750a2
Showing 2 changed files with 52 additions and 25 deletions (+52 -25)

python/paddle/v2/dataset/cifar.py (+18 -9)
python/paddle/v2/dataset/flowers.py (+34 -16)
python/paddle/v2/dataset/cifar.py @ d55cfc60
@@ -43,7 +43,7 @@ CIFAR100_URL = URL_PREFIX + 'cifar-100-python.tar.gz'
 CIFAR100_MD5 = 'eb9058c3a382ffc7106e4002c42a8d85'


-def reader_creator(filename, sub_name):
+def reader_creator(filename, sub_name, cycle=False):
     def read_batch(batch):
         data = batch['data']
         labels = batch.get('labels', batch.get('fine_labels', None))
@@ -56,10 +56,13 @@ def reader_creator(filename, sub_name):
             names = (each_item.name for each_item in f
                      if sub_name in each_item.name)

+            while True:
                 for name in names:
                     batch = cPickle.load(f.extractfile(name))
                     for item in read_batch(batch):
                         yield item
+                if not cycle:
+                    break

     return reader
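The loop added in this hunk can be sketched outside of Paddle as follows. This is a minimal Python 3 illustration under stated assumptions, not the project's code: `make_reader` and the in-memory `batches` list are hypothetical stand-ins for `reader_creator` and the tar archive. One point worth checking against the real file: `names` in the hunk appears to be a generator expression built once, before `while True`, and a generator expression can only be consumed once, so a second pass over it would yield nothing. The sketch avoids that by obtaining a fresh iterator on every pass.

```python
import itertools


def make_reader(batches, cycle=False):
    """Return a reader that yields every item of every batch.

    `batches` is a hypothetical in-memory stand-in for the tar members.
    """

    def reader():
        while True:
            # Ask for a fresh iterator on each pass; a generator that was
            # created once outside this loop would be empty the second time.
            for batch in iter(batches):
                for item in batch:
                    yield item
            if not cycle:
                break

    return reader


batches = [[1, 2], [3]]
once = list(make_reader(batches)())
# A cycling reader never stops on its own, so the caller bounds it.
looped = list(itertools.islice(make_reader(batches, cycle=True)(), 7))
print(once)
print(looped)
```

With `cycle=False` the reader behaves as before and stops after one pass; with `cycle=True` it repeats the data until the consumer stops pulling.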
@@ -94,34 +97,40 @@ def test100():
         'test')


-def train10():
+def train10(cycle=False):
     """
     CIFAR-10 training set creator.

     It returns a reader creator, each sample in the reader is image pixels in
     [0, 1] and label in [0, 9].

+    :param cycle: whether to cycle through the dataset
+    :type cycle: bool
     :return: Training reader creator
     :rtype: callable
     """
     return reader_creator(
         paddle.v2.dataset.common.download(CIFAR10_URL, 'cifar', CIFAR10_MD5),
-        'data_batch')
+        'data_batch',
+        cycle=cycle)


-def test10():
+def test10(cycle=False):
     """
     CIFAR-10 test set creator.

     It returns a reader creator, each sample in the reader is image pixels in
     [0, 1] and label in [0, 9].

+    :param cycle: whether to cycle through the dataset
+    :type cycle: bool
     :return: Test reader creator.
     :rtype: callable
     """
     return reader_creator(
         paddle.v2.dataset.common.download(CIFAR10_URL, 'cifar', CIFAR10_MD5),
-        'test_batch')
+        'test_batch',
+        cycle=cycle)


 def fetch():
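From the caller's side, the new keyword changes how a reader has to be consumed. The sketch below is only an illustration in Python 3: `fake_reader_creator`, `fake_train10`, `take`, and the two sample tuples are hypothetical and do not touch the real `paddle.v2.dataset.cifar` module or download anything. It shows that the default keeps the one-pass behaviour, and that a cycling reader needs an explicit bound such as `itertools.islice`.

```python
import itertools


def fake_reader_creator(samples, cycle=False):
    # Hypothetical stand-in for cifar.reader_creator; `samples` replaces
    # the downloaded archive.
    def reader():
        while True:
            for sample in samples:
                yield sample
            if not cycle:
                break

    return reader


def fake_train10(cycle=False):
    # Same shape as train10(cycle=False): forward the flag by keyword.
    # Made-up data: pixel values in [0, 1], labels in [0, 9].
    samples = [([0.0, 0.5, 1.0], 3), ([0.2, 0.4, 0.6], 7)]
    return fake_reader_creator(samples, cycle=cycle)


def take(reader, n):
    # Pull at most n samples; required when the reader cycles forever.
    return list(itertools.islice(reader(), n))


labels_one_pass = [label for _, label in fake_train10()()]
labels_five = [label for _, label in take(fake_train10(cycle=True), 5)]
print(labels_one_pass)
print(labels_five)
```

Because `cycle` defaults to `False`, existing callers that iterate a reader to exhaustion are unaffected.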
python/paddle/v2/dataset/flowers.py @ d55cfc60
@@ -76,7 +76,8 @@ def reader_creator(data_file,
                    dataset_name,
                    mapper,
                    buffered_size=1024,
-                   use_xmap=True):
+                   use_xmap=True,
+                   cycle=False):
     '''
     1. read images from tar file and
         merge images into batch files in 102flowers.tgz_batch/
@@ -96,6 +97,8 @@ def reader_creator(data_file,
     :type mapper: callable
     :param buffered_size: the size of buffer used to process images
     :type buffered_size: int
+    :param cycle: whether to cycle through the dataset
+    :type cycle: bool
     :return: data reader
     :rtype: callable
     '''
@@ -108,6 +111,7 @@ def reader_creator(data_file,
     file_list = batch_images_from_tar(data_file, dataset_name, img2label)

     def reader():
+        while True:
             for file in open(file_list):
                 file = file.strip()
                 batch = None
@@ -117,6 +121,8 @@ def reader_creator(data_file,
                 labels = batch['label']
                 for sample, label in itertools.izip(data, batch['label']):
                     yield sample, int(label) - 1
+            if not cycle:
+                break

     if use_xmap:
         cpu_num = int(os.environ.get('CPU_NUM', cpu_count()))
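In the flowers reader the list of batch files is re-opened inside `while True`, so every pass starts from the first line again. A minimal Python 3 sketch of that arrangement follows; `make_file_list_reader`, the temporary file, and the `batch_0`/`batch_1` names are assumptions made for illustration, and the real reader unpickles each listed batch file instead of yielding its name.

```python
import itertools
import os
import tempfile


def make_file_list_reader(file_list, cycle=False):
    # Hypothetical reader: yields the stripped lines of `file_list`.
    def reader():
        while True:
            # Opening the list inside the loop gives a fresh line iterator
            # on every pass, which is what lets cycling restart cleanly.
            with open(file_list) as listing:
                for line in listing:
                    yield line.strip()
            if not cycle:
                break

    return reader


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_list.txt")
    with open(path, "w") as out:
        out.write("batch_0\nbatch_1\n")

    single = list(make_file_list_reader(path)())
    repeated = list(
        itertools.islice(make_file_list_reader(path, cycle=True)(), 5))

print(single)
print(repeated)
```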
@@ -125,7 +131,7 @@ def reader_creator(data_file,
     return map_readers(mapper, reader)


-def train(mapper=train_mapper, buffered_size=1024, use_xmap=True):
+def train(mapper=train_mapper, buffered_size=1024, use_xmap=True, cycle=False):
     '''
     Create flowers training set reader.
     It returns a reader, each sample in the reader is
@@ -138,17 +144,23 @@ def train(mapper=train_mapper, buffered_size=1024, use_xmap=True):
     :type mapper: callable
     :param buffered_size: the size of buffer used to process images
     :type buffered_size: int
+    :param cycle: whether to cycle through the dataset
+    :type cycle: bool
     :return: train data reader
     :rtype: callable
     '''
     return reader_creator(
         download(DATA_URL, 'flowers', DATA_MD5),
         download(LABEL_URL, 'flowers', LABEL_MD5),
-        download(SETID_URL, 'flowers', SETID_MD5), TRAIN_FLAG, mapper,
-        buffered_size, use_xmap)
+        download(SETID_URL, 'flowers', SETID_MD5),
+        TRAIN_FLAG,
+        mapper,
+        buffered_size,
+        use_xmap,
+        cycle=cycle)


-def test(mapper=test_mapper, buffered_size=1024, use_xmap=True):
+def test(mapper=test_mapper, buffered_size=1024, use_xmap=True, cycle=False):
     '''
     Create flowers test set reader.
     It returns a reader, each sample in the reader is
@@ -161,14 +173,20 @@ def test(mapper=test_mapper, buffered_size=1024, use_xmap=True):
     :type mapper: callable
     :param buffered_size: the size of buffer used to process images
     :type buffered_size: int
+    :param cycle: whether to cycle through the dataset
+    :type cycle: bool
     :return: test data reader
     :rtype: callable
     '''
     return reader_creator(
         download(DATA_URL, 'flowers', DATA_MD5),
         download(LABEL_URL, 'flowers', LABEL_MD5),
-        download(SETID_URL, 'flowers', SETID_MD5), TEST_FLAG, mapper,
-        buffered_size, use_xmap)
+        download(SETID_URL, 'flowers', SETID_MD5),
+        TEST_FLAG,
+        mapper,
+        buffered_size,
+        use_xmap,
+        cycle=cycle)


 def valid(mapper=test_mapper, buffered_size=1024, use_xmap=True):