- en: DataLoader2 prefs: - PREF_H1 type: TYPE_NORMAL - en: Original text: [https://pytorch.org/data/beta/dataloader2.html](https://pytorch.org/data/beta/dataloader2.html) prefs: - PREF_BQ type: TYPE_NORMAL - en: A new, lightweight [`DataLoader2`](#torchdata.dataloader2.DataLoader2 "torchdata.dataloader2.DataLoader2") is introduced to decouple the overloaded data-manipulation functionalities from `torch.utils.data.DataLoader` into `DataPipe` operations. In addition, certain features, such as snapshotting and switching backend services to perform high-performance operations, can only be achieved with [`DataLoader2`](#torchdata.dataloader2.DataLoader2 "torchdata.dataloader2.DataLoader2"). prefs: [] type: TYPE_NORMAL - en: DataLoader2[](#id1 "Permalink to this heading") prefs: - PREF_H2 type: TYPE_NORMAL - en: '[PRE0]' prefs: [] type: TYPE_PRE - en: '`DataLoader2` is used to optimize and execute the given `DataPipe` graph based on `ReadingService` and `Adapter` functions, with support for' prefs: [] type: TYPE_NORMAL - en: Dynamic sharding for multiprocess and distributed data loading prefs: - PREF_UL type: TYPE_NORMAL - en: Multiple backend `ReadingServices` prefs: - PREF_UL type: TYPE_NORMAL - en: '`DataPipe` graph in-place modification, such as shuffle control, memory pinning, etc.' prefs: - PREF_UL type: TYPE_NORMAL - en: Snapshotting the state of the data-preprocessing pipeline (WIP) prefs: - PREF_UL type: TYPE_NORMAL - en: 'Parameters:' prefs: [] type: TYPE_NORMAL - en: '**datapipe** (`IterDataPipe` or `MapDataPipe`) – `DataPipe` from which to load the data. A deepcopy of this datapipe will be made during initialization, allowing the input to be re-used in a different `DataLoader2` without sharing states. `None` can only be passed as input if `load_state_dict` is called right after the creation of the DataLoader.' prefs: - PREF_UL type: TYPE_NORMAL - en: '**datapipe_adapter_fn** (`Iterable[Adapter]` or `Adapter`, optional) – `Adapter` function(s) that will be applied to the DataPipe (default: `None`).' 
prefs: - PREF_UL type: TYPE_NORMAL - en: '**reading_service** ([*ReadingServiceInterface*](reading_service.html#torchdata.dataloader2.ReadingServiceInterface "torchdata.dataloader2.ReadingServiceInterface")*,* *optional*) – defines how `DataLoader2` should execute operations over the `DataPipe`, e.g. multiprocessing/distributed (default: `None`). A deepcopy of this will be created during initialization, allowing the ReadingService to be re-used in a different `DataLoader2` without sharing states.' prefs: - PREF_UL type: TYPE_NORMAL - en: Note prefs: [] type: TYPE_NORMAL - en: When a `MapDataPipe` is passed into `DataLoader2`, in order to iterate through the data, `DataLoader2` will attempt to create an iterator via `iter(datapipe)`. If the object's indices are not zero-based, this may fail. Consider using `.shuffle()` (which converts `MapDataPipe` to `IterDataPipe`) or `datapipe.to_iter_datapipe(custom_indices)`. prefs: [] type: TYPE_NORMAL - en: '[PRE1]' prefs: [] type: TYPE_PRE - en: Return a singleton iterator from the `DataPipe` graph adapted by `ReadingService`. The `DataPipe` graph will be restored if a serialized state was provided to construct `DataLoader2`, and `initialize_iteration` and `finalize_iterator` will be invoked at the beginning and end of the iteration, respectively. prefs: [] type: TYPE_NORMAL - en: '[PRE2]' prefs: [] type: TYPE_PRE - en: Create a new `DataLoader2` with the `DataPipe` graph and `ReadingService` restored from the serialized state. prefs: [] type: TYPE_NORMAL - en: '[PRE3]' prefs: [] type: TYPE_PRE - en: For an existing `DataLoader2`, load the serialized state to restore the `DataPipe` graph and reset the internal state of its `ReadingService`. prefs: [] type: TYPE_NORMAL - en: '[PRE4]' prefs: [] type: TYPE_PRE - en: Set the random seed for `DataLoader2` to control determinism. 
prefs: [] type: TYPE_NORMAL - en: 'Parameters:' prefs: [] type: TYPE_NORMAL - en: '**seed** – Random uint64 seed' prefs: [] type: TYPE_NORMAL - en: '[PRE5]' prefs: [] type: TYPE_PRE - en: Shuts down `ReadingService` and cleans up the iterator. prefs: [] type: TYPE_NORMAL - en: '[PRE6]' prefs: [] type: TYPE_PRE - en: 'Return a dictionary representing the state of the data-processing pipeline, with keys:' prefs: [] type: TYPE_NORMAL - en: '`serialized_datapipe`: Serialized `DataPipe` before `ReadingService` adaptation.' prefs: - PREF_UL type: TYPE_NORMAL - en: '`reading_service_state`: The state of `ReadingService` and the adapted `DataPipe`.' prefs: - PREF_UL type: TYPE_NORMAL - en: 'Note: [`DataLoader2`](#torchdata.dataloader2.DataLoader2 "torchdata.dataloader2.DataLoader2") doesn’t support `torch.utils.data.Dataset` or `torch.utils.data.IterableDataset`. Please wrap each of them with the corresponding `DataPipe` below:' prefs: [] type: TYPE_NORMAL - en: '[`torchdata.datapipes.map.SequenceWrapper`](generated/torchdata.datapipes.map.SequenceWrapper.html#torchdata.datapipes.map.SequenceWrapper "torchdata.datapipes.map.SequenceWrapper"): `torch.utils.data.Dataset`' prefs: - PREF_UL type: TYPE_NORMAL - en: '[`torchdata.datapipes.iter.IterableWrapper`](generated/torchdata.datapipes.iter.IterableWrapper.html#torchdata.datapipes.iter.IterableWrapper "torchdata.datapipes.iter.IterableWrapper"): `torch.utils.data.IterableDataset`' prefs: - PREF_UL type: TYPE_NORMAL - en: ReadingService[](#readingservice "Permalink to this heading") prefs: - PREF_H2 type: TYPE_NORMAL - en: '`ReadingService` specifies the execution backend for the data-processing graph. 
These are the types of `ReadingServices` provided in TorchData:' prefs: [] type: TYPE_NORMAL - en: '| [`DistributedReadingService`](generated/torchdata.dataloader2.DistributedReadingService.html#torchdata.dataloader2.DistributedReadingService "torchdata.dataloader2.DistributedReadingService") | `DistributedReadingService` handles distributed sharding on the graph of `DataPipe` and guarantees randomness by sharing the same seed across the distributed processes. |' prefs: [] type: TYPE_TB - en: '| [`InProcessReadingService`](generated/torchdata.dataloader2.InProcessReadingService.html#torchdata.dataloader2.InProcessReadingService "torchdata.dataloader2.InProcessReadingService") | Default ReadingService that serves the `DataPipe` graph in the main process and applies graph settings, such as determinism control, to the graph. |' prefs: [] type: TYPE_TB - en: '| [`MultiProcessingReadingService`](generated/torchdata.dataloader2.MultiProcessingReadingService.html#torchdata.dataloader2.MultiProcessingReadingService "torchdata.dataloader2.MultiProcessingReadingService") | Spawns multiple worker processes to load data from the `DataPipe` graph. |' prefs: [] type: TYPE_TB - en: '| [`SequentialReadingService`](generated/torchdata.dataloader2.SequentialReadingService.html#torchdata.dataloader2.SequentialReadingService "torchdata.dataloader2.SequentialReadingService") | |' prefs: [] type: TYPE_TB - en: Each `ReadingService` takes the `DataPipe` graph and rewrites it to achieve features such as dynamic sharding, shared random seeds, and snapshotting for multiprocess/distributed execution. For more details about these features, please refer to [the documentation](reading_service.html). prefs: [] type: TYPE_NORMAL - en: Adapter[](#adapter "Permalink to this heading") prefs: - PREF_H2 type: TYPE_NORMAL - en: '`Adapter` is used to configure, modify and extend the `DataPipe` graph in [`DataLoader2`](#torchdata.dataloader2.DataLoader2 "torchdata.dataloader2.DataLoader2"). 
It allows in-place modification of, or replacement of, the pre-assembled `DataPipe` graph provided by PyTorch domains. For example, `Shuffle(False)` can be provided to [`DataLoader2`](#torchdata.dataloader2.DataLoader2 "torchdata.dataloader2.DataLoader2"), which would disable any `shuffle` operations in the `DataPipe` graph.' prefs: [] type: TYPE_NORMAL - en: '[PRE7]' prefs: [] type: TYPE_PRE - en: Adapter base class that follows the Python Callable protocol. prefs: [] type: TYPE_NORMAL - en: '[PRE8]' prefs: [] type: TYPE_PRE - en: Callable function that either runs an in-place modification of the `DataPipe` graph, or returns a new `DataPipe` graph. prefs: [] type: TYPE_NORMAL - en: 'Parameters:' prefs: [] type: TYPE_NORMAL - en: '**datapipe** – `DataPipe` that needs to be adapted.' prefs: [] type: TYPE_NORMAL - en: 'Returns:' prefs: [] type: TYPE_NORMAL - en: Adapted `DataPipe` or new `DataPipe`. prefs: [] type: TYPE_NORMAL - en: 'Here is the list of [`Adapter`](#torchdata.dataloader2.adapter.Adapter "torchdata.dataloader2.adapter.Adapter")s provided by TorchData in `torchdata.dataloader2.adapter`:' prefs: [] type: TYPE_NORMAL - en: '| [`Shuffle`](generated/torchdata.dataloader2.adapter.Shuffle.html#torchdata.dataloader2.adapter.Shuffle "torchdata.dataloader2.adapter.Shuffle") | The Shuffle adapter allows control over all existing Shuffler (`shuffle`) DataPipes in the graph. |' prefs: [] type: TYPE_TB - en: '| [`CacheTimeout`](generated/torchdata.dataloader2.adapter.CacheTimeout.html#torchdata.dataloader2.adapter.CacheTimeout "torchdata.dataloader2.adapter.CacheTimeout") | The CacheTimeout adapter allows control over the timeouts of all existing EndOnDiskCacheHolder (`end_caching`) DataPipes in the graph. |' prefs: [] type: TYPE_TB - en: 'We will provide more `Adapters` to cover additional data-processing options:' prefs: [] type: TYPE_NORMAL - en: '`PinMemory`: Attach a `DataPipe` at the end of the data-processing graph that converts output data to `torch.Tensor` in pinned memory.' 
prefs: - PREF_UL type: TYPE_NORMAL - en: '`FullSync`: Attach a `DataPipe` to make sure the data-processing graph is synchronized across distributed processes to prevent hanging.' prefs: - PREF_UL type: TYPE_NORMAL - en: '`ShardingPolicy`: Modify the sharding policy if a `sharding_filter` is present in the `DataPipe` graph.' prefs: - PREF_UL type: TYPE_NORMAL - en: '`PrefetchPolicy`, `InvalidateCache`, etc.' prefs: - PREF_UL type: TYPE_NORMAL - en: If you have feature requests about the `Adapters` you’d like to see provided, please open a GitHub issue. For specific needs, `DataLoader2` also accepts any custom `Adapter` as long as it inherits from the `Adapter` class. prefs: [] type: TYPE_NORMAL