magicwindyyd / mindspore (forked from MindSpore / mindspore; in sync with the fork source)
Commit cb2814b4
Authored May 11, 2020 by jiangzhiwen
flat_map first commit
Parent: 6b4a6e55
Showing 6 changed files with 127 additions and 0 deletions (+127 -0)
mindspore/dataset/engine/datasets.py (+44 -0)
tests/ut/data/dataset/test_flat_map/image_index.txt (+2 -0)
tests/ut/data/dataset/test_flat_map/images.txt (+3 -0)
tests/ut/data/dataset/test_flat_map/images1.txt (+3 -0)
tests/ut/data/dataset/test_flat_map/images2.txt (+3 -0)
tests/ut/python/dataset/test_flat_map.py (+72 -0)
mindspore/dataset/engine/datasets.py @ cb2814b4
...
@@ -268,6 +268,50 @@ class Dataset:
        """
        return ShuffleDataset(self, buffer_size)

    def flat_map(self, func):
        """
        Maps `func` to each row in dataset and flatten the result.

        The specified `func` is a function that must take one 'Ndarray' as input
        and return a 'Dataset'.

        Args:
            func (function): A function that must take one 'Ndarray' as an argument and
                return a 'Dataset'.

        Returns:
            Dataset, applied by the function.

        Examples:
            >>> import mindspore.dataset as ds
            >>> import mindspore.dataset.transforms.nlp.utils as nlp
            >>> # declare a function which returns a Dataset object
            >>> def flat_map_func(x):
            >>>     data_dir = nlp.as_text(x[0])
            >>>     d = ds.ImageFolderDatasetV2(data_dir)
            >>>     return d
            >>> # data is a Dataset object
            >>> data = ds.TextFileDataset(DATA_FILE)
            >>> data = data.flat_map(flat_map_func)

        Raises:
            TypeError: If `func` is not a function.
            TypeError: If `func` doesn't return a Dataset.
        """
        dataset = None
        if not hasattr(func, '__call__'):
            raise TypeError("func must be a function.")

        for row_data in self:
            if dataset is None:
                dataset = func(row_data)
            else:
                dataset += func(row_data)

        if not isinstance(dataset, Dataset):
            raise TypeError("flat_map must return a Dataset object.")
        return dataset

    @check_map
    def map(self, input_columns=None, operations=None, output_columns=None, columns_order=None,
            num_parallel_workers=None, python_multiprocessing=False):
...
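Review note: the map-then-flatten behaviour that the `flat_map` docstring describes can be illustrated without MindSpore. The sketch below is a stand-in, not the committed implementation: `flat_map_sketch` is a hypothetical name, and plain Python lists take the place of `Dataset` objects (list `+` stands in for the `+=` concatenation in the diff). It also deviates from the commit in two ways, both labelled in the comments: it checks each per-row result instead of checking once after the loop, and it returns an empty list for empty input.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def flat_map_sketch(rows: Iterable[T], func: Callable[[T], List[U]]) -> List[U]:
    """Apply func to every row and concatenate the per-row results in order.

    Hypothetical stand-in: lists play the role of Dataset objects.
    """
    if not callable(func):
        raise TypeError("func must be a function.")

    result: Optional[List[U]] = None
    for row in rows:
        part = func(row)
        # Deviation from the commit: validate every per-row result,
        # not only the accumulated value after the loop.
        if not isinstance(part, list):
            raise TypeError("func must return a list.")
        # The first result seeds the accumulator; later ones are appended.
        result = part if result is None else result + part

    # Deviation from the commit: empty input yields an empty list.
    return [] if result is None else result


print(flat_map_sketch(["ab", "cd"], list))  # ['a', 'b', 'c', 'd']
```

As in the diff, the input is consumed eagerly: every row is visited before anything is returned.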
tests/ut/data/dataset/test_flat_map/image_index.txt @ cb2814b4
0 → 100644 (new file)
../data/dataset/test_flat_map/images1.txt
../data/dataset/test_flat_map/images2.txt
\ No newline at end of file
tests/ut/data/dataset/test_flat_map/images.txt @ cb2814b4
0 → 100644 (new file)
../data/dataset/testPK/data
../data/dataset/testImageNetData/train
../data/dataset/testImageNetData2/train
\ No newline at end of file
tests/ut/data/dataset/test_flat_map/images1.txt @ cb2814b4
0 → 100644 (new file)
../data/dataset/testPK/data
../data/dataset/testImageNetData/train
../data/dataset/testImageNetData2/train
\ No newline at end of file
tests/ut/data/dataset/test_flat_map/images2.txt @ cb2814b4
0 → 100644 (new file)
../data/dataset/testPK/data
../data/dataset/testImageNetData/train
../data/dataset/testImageNetData2/train
\ No newline at end of file
tests/ut/python/dataset/test_flat_map.py @ cb2814b4
0 → 100644 (new file)
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import numpy as np
import mindspore.dataset as ds

DATA_FILE = "../data/dataset/test_flat_map/images1.txt"
INDEX_FILE = "../data/dataset/test_flat_map/image_index.txt"


def test_flat_map_1():
    '''
    DATA_FILE records the path of image folders, load the images from them.
    '''
    import mindspore.dataset.transforms.nlp.utils as nlp

    def flat_map_func(x):
        data_dir = nlp.as_text(x[0])
        d = ds.ImageFolderDatasetV2(data_dir)
        return d

    data = ds.TextFileDataset(DATA_FILE)
    data = data.flat_map(flat_map_func)

    count = 0
    for d in data:
        assert isinstance(d[0], np.ndarray)
        count += 1
    assert count == 52


def test_flat_map_2():
    '''
    Flatten 3D structure data
    '''
    import mindspore.dataset.transforms.nlp.utils as nlp

    def flat_map_func_1(x):
        data_dir = nlp.as_text(x[0])
        d = ds.ImageFolderDatasetV2(data_dir)
        return d

    def flat_map_func_2(x):
        text_file = nlp.as_text(x[0])
        d = ds.TextFileDataset(text_file)
        d = d.flat_map(flat_map_func_1)
        return d

    data = ds.TextFileDataset(INDEX_FILE)
    data = data.flat_map(flat_map_func_2)

    count = 0
    for d in data:
        assert isinstance(d[0], np.ndarray)
        count += 1
    assert count == 104


if __name__ == "__main__":
    test_flat_map_1()
    test_flat_map_2()
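Review note: the second test nests one `flat_map` inside another (index file → list files → image folders) and expects twice the row count of the first test (104 versus 52), because the index names two list files with identical contents. The sketch below mirrors that structure in plain Python so the doubling can be checked without the fixtures. Everything in it is an assumption for illustration: the dictionaries, file names, and folder sizes are made up, and `itertools.chain.from_iterable` is a lazy stand-in for the eager `Dataset` concatenation in the commit.

```python
from itertools import chain

# Hypothetical stand-ins for the fixture files; names and sizes are made up.
FOLDERS = {
    "folder_a": ["a0.jpg", "a1.jpg"],
    "folder_b": ["b0.jpg"],
    "folder_c": ["c0.jpg", "c1.jpg", "c2.jpg"],
}
LIST_FILES = {
    "list1.txt": ["folder_a", "folder_b", "folder_c"],
    "list2.txt": ["folder_a", "folder_b", "folder_c"],
}
INDEX = ["list1.txt", "list2.txt"]


def flat_map(rows, func):
    """Lazily apply func to each row and chain the resulting iterables."""
    return chain.from_iterable(func(row) for row in rows)


def load_folder(folder_name):
    """Innermost level: one folder name yields its image names."""
    return FOLDERS[folder_name]


def load_list(list_name):
    """Middle level: one list file yields the images of all its folders."""
    return flat_map(LIST_FILES[list_name], load_folder)


# One level of flattening, analogous to test_flat_map_1.
single_level = list(flat_map(LIST_FILES["list1.txt"], load_folder))
# Two levels of flattening, analogous to test_flat_map_2.
two_level = list(flat_map(INDEX, load_list))

print(len(single_level), len(two_level))  # 6 12
```

With these made-up sizes the counts are 6 and 12; the same two-to-one ratio is what the assertions `count == 52` and `count == 104` encode for the real fixtures.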