PaddlePaddle / Paddle
Commit bf09dcb3 (unverified)
Authored by Kaipeng Deng on March 25, 2021
Committed by GitHub on March 25, 2021
add GPU tensor notice & update default_collate_fn/default_convert_fn. test=develop (#31763)
Parent: 27f2d8df
Showing 2 changed files with 42 additions and 11 deletions
python/paddle/fluid/dataloader/collate.py  +36 -11
python/paddle/fluid/reader.py  +6 -0
python/paddle/fluid/dataloader/collate.py
@@ -27,24 +27,31 @@ except:
 def default_collate_fn(batch):
     """
     Default batch collating function for :code:`paddle.io.DataLoader`,
-    batch should be a list of samples, and each sample should be a list
-    of fields as follows:
+    get input data as a list of sample datas, each element in list
+    if the data of a sample, and sample data should composed of list,
+    dictionary, string, number, numpy array and paddle.Tensor, this
+    function will parse input data recursively and stack number,
+    numpy array and paddle.Tensor datas as batch datas. e.g. for
+    following input data:
 
-    [[filed1, filed2, ...], [filed1, filed2, ...], ...]
+    [{'image': np.array(shape=[3, 224, 224]), 'label': 1},
+     {'image': np.array(shape=[3, 224, 224]), 'label': 3},
+     {'image': np.array(shape=[3, 224, 224]), 'label': 4},
+     {'image': np.array(shape=[3, 224, 224]), 'label': 5},]
 
-    This default collate function zipped each filed together and stack
-    each filed as the batch field as follows:
+
+    This default collate function zipped each number and numpy array
+    field together and stack each field as the batch field as follows:
 
-    [batch_filed1, batch_filed2, ...]
+    {'image': np.array(shape=[4, 3, 224, 224]), 'label': np.array([1, 3, 4, 5])}
 
+
     Args:
-        batch(list of list of numpy array|paddle.Tensor): the batch data, each fields
-              should be a numpy array, each sample should be a list of
-              fileds, and batch should be a list of sample.
+        batch(list of sample data): batch should be a list of sample data.
 
     Returns:
-        a list of numpy array|Paddle.Tensor: collated batch of input batch data,
-            fields data type as same as fields in each sample.
+        Batched data: batched each number, numpy array and paddle.Tensor
+                      in input data.
     """
     sample = batch[0]
     if isinstance(sample, np.ndarray):
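The recursive behaviour described in the new docstring can be illustrated with a small NumPy-only sketch. This is not Paddle's implementation: `collate_sketch` is a hypothetical name, `paddle.Tensor` handling is left out so the snippet runs without Paddle installed, and the treatment of strings and nested lists is an assumption based on the docstring wording.

```python
import numbers
from collections.abc import Mapping, Sequence

import numpy as np


def collate_sketch(batch):
    """Recursively stack numbers and numpy arrays across samples (illustrative only)."""
    sample = batch[0]
    if isinstance(sample, np.ndarray):
        # Stack per-sample arrays along a new leading batch axis.
        return np.stack(batch, axis=0)
    if isinstance(sample, numbers.Number):
        # Plain Python numbers become a 1-D array of length len(batch).
        return np.array(batch)
    if isinstance(sample, (str, bytes)):
        # Assumption: strings are left as a plain list rather than stacked.
        return batch
    if isinstance(sample, Mapping):
        # Collate each key separately, preserving the dictionary structure.
        return {key: collate_sketch([s[key] for s in batch]) for key in sample}
    if isinstance(sample, Sequence):
        # Zip corresponding fields together, then collate each field.
        return [collate_sketch(list(fields)) for fields in zip(*batch)]
    raise TypeError("unsupported sample type: {}".format(type(sample)))


samples = [
    {'image': np.zeros((3, 224, 224), dtype='float32'), 'label': label}
    for label in (1, 3, 4, 5)
]
batched = collate_sketch(samples)
print(batched['image'].shape)  # (4, 3, 224, 224)
print(batched['label'])        # [1 3 4 5]
```

The output mirrors the example in the docstring: four `[3, 224, 224]` images become one `[4, 3, 224, 224]` array, and the four labels become one array.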
@@ -75,6 +82,24 @@ def default_collate_fn(batch):
 
 
 def default_convert_fn(batch):
+    """
+    Default batch converting function for :code:`paddle.io.DataLoader`.
+    get input data as a list of sample datas, each element in list
+    if the data of a sample, and sample data should composed of list,
+    dictionary, string, number, numpy array and paddle.Tensor.
+
+    .. note::
+        This function is default :attr:`collate_fn` in **Distable
+        automatic batching** mode, for **Distable automatic batching**
+        mode, please ses :attr:`paddle.io.DataLoader`
+
+    Args:
+        batch(list of sample data): batch should be a list of sample data.
+
+    Returns:
+        Batched data: batched each number, numpy array and paddle.Tensor
+                      in input data.
+    """
     if isinstance(batch, (paddle.Tensor, np.ndarray)):
         return batch
     elif isinstance(batch, (str, bytes)):
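For contrast with the collating function, here is a rough sketch of what a pass-through converter for the disabled-batching mode might look like. Only the first two branches (arrays/tensors and strings/bytes) are visible in the hunk above. The `Mapping` and `Sequence` branches below are assumptions, `convert_sketch` is a hypothetical name, and `paddle.Tensor` is omitted so the snippet needs only NumPy.

```python
from collections.abc import Mapping, Sequence

import numpy as np


def convert_sketch(data):
    """Walk a sample recursively without stacking anything (illustrative only)."""
    if isinstance(data, np.ndarray):
        # Arrays are returned unchanged, as in the visible first branch.
        return data
    if isinstance(data, (str, bytes)):
        # Strings and bytes are returned unchanged, as in the visible second branch.
        return data
    if isinstance(data, Mapping):
        # Assumption: dictionaries are rebuilt with converted values.
        return {key: convert_sketch(value) for key, value in data.items()}
    if isinstance(data, Sequence):
        # Assumption: lists and tuples are rebuilt as lists of converted items.
        return [convert_sketch(item) for item in data]
    # Anything else (for example a plain number) is passed through.
    return data


sample = {'tokens': np.arange(5), 'text': 'hello', 'meta': (1, 2)}
converted = convert_sketch(sample)
```

Unlike the collating sketch, no batch axis is added here: the array in `converted['tokens']` is the same object that went in.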
python/paddle/fluid/reader.py
@@ -165,6 +165,12 @@ class DataLoader(object):
 
     For :code:`batch_sampler` please see :code:`paddle.io.BatchSampler`
 
+    .. note::
+        GPU tensor operation is not supported in subprocess currently,
+        please don't use GPU tensor operations in pipeline which will
+        be performed in subprocess, such as dataset transforms, collte_fn,
+        etc. Numpy array and CPU tensor operation is supported.
+
     **Disable automatic batching**
 
     In certain cases such as some NLP tasks, instead of automatic batching,
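One way to follow the note added above is to write a custom `collate_fn` that touches only NumPy arrays, so nothing in the worker subprocess needs a GPU. The sketch below is an illustration under assumptions: `pad_collate` is a hypothetical helper, each sample is assumed to be a `(token_ids, label)` pair, and the `DataLoader` call is shown only as a comment so the snippet runs without Paddle.

```python
import numpy as np


def pad_collate(batch, pad_value=0):
    """Pad variable-length 1-D sequences to a common length using NumPy only."""
    lengths = np.array([len(seq) for seq, _ in batch], dtype='int64')
    # Allocate a [batch, max_len] array filled with the padding value.
    padded = np.full((len(batch), int(lengths.max())), pad_value, dtype='int64')
    for row, (seq, _) in enumerate(batch):
        padded[row, :len(seq)] = seq
    labels = np.array([label for _, label in batch], dtype='int64')
    return padded, lengths, labels


batch = [(np.array([5, 6, 7]), 0), (np.array([8]), 1)]
tokens, lengths, labels = pad_collate(batch)
print(tokens)

# Hypothetical usage with a user-defined dataset:
# loader = paddle.io.DataLoader(dataset, batch_size=2, num_workers=2,
#                               collate_fn=pad_collate)
```

Because the function returns plain NumPy arrays, any GPU-side work can be deferred until the loader has yielded the batch in the main process.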