Commit 9889e82e
Authored on Nov 14, 2022 by Megvii Engine Team
docs(imperative): add example of dataset
GitOrigin-RevId: 70f2a513cfb5c6e48b30e0e56bf469f3ac99a86a
Parent: 7d83a9ad
Showing 2 changed files with 54 additions and 0 deletions (+54, -0)
imperative/python/megengine/data/dataloader.py (+4, -0)
imperative/python/megengine/data/dataset/meta_dataset.py (+50, -0)
imperative/python/megengine/data/dataloader.py
...
@@ -68,6 +68,10 @@ class DataLoader:
batch from workers. Default: 0
preload: whether to enable the preloading strategy of the dataloader.
When enabled, the dataloader will preload one batch to the device memory to speed up the whole training process.
parallel_stream: whether to split the workload across all workers when the dataset is a StreamDataset and num_workers > 0.
When enabled, each worker will collect data from a different part of the dataset in order to speed up the whole loading process.
See :ref:`streamdataset-example` for more details.
.. admonition:: The effect of enabling preload
:class: warning
...
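The docstring change above introduces the new parallel_stream flag alongside the existing preload option but does not itself show them in use. As a minimal sketch (not part of this commit; the array shapes, batch size, and sample count are made-up illustration values), enabling preload on a map-style dataset could look like:

    import numpy as np

    from megengine.data.dataset import ArrayDataset
    from megengine.data.sampler import SequentialSampler
    from megengine.data.dataloader import DataLoader

    # Toy data: 100 single-channel 32x32 images with integer labels (illustration only).
    images = np.random.randint(0, 255, size=(100, 1, 32, 32), dtype=np.uint8)
    labels = np.random.randint(0, 10, size=(100,), dtype=int)

    dataset = ArrayDataset(images, labels)
    sampler = SequentialSampler(dataset, batch_size=8)

    # preload=True asks the dataloader to stage one batch in device memory ahead of
    # time, as described in the docstring above.
    dataloader = DataLoader(dataset, sampler=sampler, preload=True)

    for step, data in enumerate(dataloader):
        print(step, data[0].shape, data[1].shape)

The parallel_stream flag, by contrast, only applies to stream datasets; the StreamDataset example added in meta_dataset.py below shows it together with get_worker_info.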
imperative/python/megengine/data/dataset/meta_dataset.py
...
@@ -58,6 +58,33 @@ class StreamDataset(Dataset):
r"""An abstract class for stream data.
The __iter__ method is additionally needed.
Examples:

.. code-block:: python

    from megengine.data.dataset import StreamDataset
    from megengine.data.dataloader import DataLoader, get_worker_info
    from megengine.data.sampler import StreamSampler

    class MyStream(StreamDataset):
        def __init__(self):
            self.data = [iter([1, 2, 3]), iter([4, 5, 6]), iter([7, 8, 9])]

        def __iter__(self):
            worker_info = get_worker_info()
            data_iter = self.data[worker_info.idx]
            while True:
                yield next(data_iter)

    dataloader = DataLoader(
        dataset=MyStream(),
        sampler=StreamSampler(batch_size=2),
        num_workers=3,
        parallel_stream=True,
    )

    for step, data in enumerate(dataloader):
        print(data)
"""
@abstractmethod
...
@@ -80,6 +107,29 @@ class ArrayDataset(Dataset):
One or more numpy arrays are needed to initialize the dataset,
and the dimensions representing the sample number are expected to be the same.
Examples:

.. code-block:: python

    import numpy as np

    from megengine.data.dataset import ArrayDataset
    from megengine.data.dataloader import DataLoader
    from megengine.data.sampler import SequentialSampler

    sample_num = 100  # number of samples in the toy dataset
    rand_data = np.random.randint(0, 255, size=(sample_num, 1, 32, 32), dtype=np.uint8)
    label = np.random.randint(0, 10, size=(sample_num,), dtype=int)
    dataset = ArrayDataset(rand_data, label)
    seque_sampler = SequentialSampler(dataset, batch_size=2)

    dataloader = DataLoader(
        dataset,
        sampler=seque_sampler,
        num_workers=3,
    )

    for step, data in enumerate(dataloader):
        print(data)
"""
def __init__(self, *arrays):
...