Commit bf32d1e0, authored by Megvii Engine Team

docs(dataloader): update dataloader docstring

GitOrigin-RevId: 3e94a4bdf489f9eda7578e1b9a6b0b46bf427313
Parent 4a32cc49
@@ -41,18 +41,28 @@ def raise_timeout_error():
 class DataLoader:
     r"""Provides a convenient way to iterate on a given dataset.
+    The process is as follows:
-    DataLoader combines a dataset with
+    .. mermaid::
+       :align: center
+       flowchart LR
+          Dataset.__len__ -- Sampler --> Indices
+          batch_size -- Sampler --> Indices
+          Indices -- Dataset.__getitem__ --> Samples
+          Samples -- Transform + Collator --> mini-batch
+    DataLoader combines a :class:`~.Dataset` with
     :class:`~.Sampler`, :class:`~.Transform` and :class:`~.Collator`,
     make it flexible to get minibatch continually from a dataset.
+    See :ref:`data-guide` for more details.
     Args:
         dataset: dataset from which to load the minibatch.
         sampler: defines the strategy to sample data from the dataset.
+            If ``None``, it will sequentially sample from the dataset one by one.
         transform: defined the transforming strategy for a sampled batch.
+            Default: None
         collator: defined the merging strategy for a transformed batch.
+            Default: None
         num_workers: the number of sub-process to load, transform and collate
             the batch. ``0`` means using single-process. Default: 0
         timeout: if positive, means the timeout value(second) for collecting a
@@ -63,14 +73,17 @@ class DataLoader:
             ``True`` means one batch is divided into :attr:`num_workers` pieces, and
             the workers will process these pieces parallelly. ``False`` means
             different sub-process will process different batch. Default: False
-        preload: whether to enable the preloading strategy of the dataloader. When enabling, the dataloader will preload one batch to the device memory to speed up the whole training process.
-            All values in the map, list, and tuple will be converted to :class:`~.Tensor` by preloading, and you will get :class:`~.Tensor` instead of the original Numpy array or Python number.
-    .. note::
-        By enabling preload, tensors' host2device copy and device kernel execution will be overlapped, which will improve the training speed at the cost of higher device memory usage (due to one more batch data on device memory).
-        This feature saves more time when your NN training time is short or your machine's host PCIe bandwidth for each device is low.
+        preload: whether to enable the preloading strategy of the dataloader.
+            When enabling, the dataloader will preload one batch to the device memory to speed up the whole training process.
+    .. admonition:: The effect of enabling preload
+       :class: warning
+       * All elements in :class:`map`, :class:`list`, and :class:`tuple` will be converted to :class:`~.Tensor` by preloading,
+         and you will get :class:`~.Tensor` instead of the original Numpy array or Python built-in data structure.
+       * Tensors' host2device copy and device kernel execution will be overlapped,
+         which will improve the training speed at the cost of **higher device memory usage** (due to one more batch data on device memory).
+         This feature saves more time when your NN training time is short or your machine's host PCIe bandwidth for each device is low.
     """
     __initialized = False
......
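The flowchart added to the docstring (Sampler produces indices, `Dataset.__getitem__` turns indices into samples, Transform and Collator turn samples into a mini-batch) can be sketched in plain Python. This is only an illustration of the data flow, not MegEngine's implementation: `sequential_batches`, `iterate_minibatches`, `SquaresDataset`, and `columns` are hypothetical names, and applying the transform to each sample individually is an assumption made for the sketch.

```python
# Pure-Python sketch of the flow in the docstring's flowchart.
# Every name below is a hypothetical stand-in, not a MegEngine API.
from typing import Iterator, List


def sequential_batches(dataset_len: int, batch_size: int) -> Iterator[List[int]]:
    """Sampler step: Dataset.__len__ and batch_size -> batches of indices."""
    for start in range(0, dataset_len, batch_size):
        yield list(range(start, min(start + batch_size, dataset_len)))


def iterate_minibatches(dataset, batch_size, transform=None, collate=None):
    """Yield mini-batches by chaining sampler, __getitem__, transform, collate."""
    transform = transform or (lambda sample: sample)  # identity if omitted
    collate = collate or list                         # plain list if omitted
    for indices in sequential_batches(len(dataset), batch_size):
        samples = [dataset[i] for i in indices]       # Indices -> Samples
        samples = [transform(s) for s in samples]     # Transform (per sample, assumed)
        yield collate(samples)                        # Collator -> mini-batch


class SquaresDataset:
    """Toy map-style dataset: item i is the pair (i, i * i)."""

    def __init__(self, n: int):
        self.n = n

    def __len__(self) -> int:
        return self.n

    def __getitem__(self, i: int):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i, i * i


def columns(samples):
    """Merge [(x0, y0), (x1, y1), ...] into ([x0, x1, ...], [y0, y1, ...])."""
    xs, ys = zip(*samples)
    return list(xs), list(ys)


batches = list(
    iterate_minibatches(
        SquaresDataset(5),
        batch_size=2,
        transform=lambda s: (s[0], s[1] + 1),  # shift the label by one
        collate=columns,
    )
)
print(batches)
```

With five items and a batch size of two, the last batch holds a single sample, which mirrors the sequential, one-by-one sampling the docstring describes for `sampler=None`.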