- en: torchaudio.datasets
  id: totrans-0
  prefs:
  - PREF_H1
  type: TYPE_NORMAL
  zh: torchaudio.datasets
- en: 原文：[https://pytorch.org/audio/stable/datasets.html](https://pytorch.org/audio/stable/datasets.html)
  id: totrans-1
  prefs:
  - PREF_BQ
  type: TYPE_NORMAL
  zh: 原文：[https://pytorch.org/audio/stable/datasets.html](https://pytorch.org/audio/stable/datasets.html)
- en: All datasets are subclasses of [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset
    "(in PyTorch v2.1)") and have `__getitem__` and `__len__` methods implemented.
  id: totrans-2
  prefs: []
  type: TYPE_NORMAL
  zh: 所有数据集都是 [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset
    "(在 PyTorch v2.1 中)") 的子类，并实现了 `__getitem__` 和 `__len__` 方法。
- en: 'Hence, they can all be passed to a [`torch.utils.data.DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
    "(in PyTorch v2.1)") which can load multiple samples parallelly using [`torch.multiprocessing`](https://pytorch.org/docs/stable/multiprocessing.html#module-torch.multiprocessing
    "(in PyTorch v2.1)") workers. For example:'
  id: totrans-3
  prefs: []
  type: TYPE_NORMAL
  zh: 因此，它们都可以传递给 [`torch.utils.data.DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
    "(在 PyTorch v2.1 中)")，该加载器可以使用 [`torch.multiprocessing`](https://pytorch.org/docs/stable/multiprocessing.html#module-torch.multiprocessing
    "(在 PyTorch v2.1 中)") 工作器并行加载多个样本。例如：
- en: '[PRE0]'
  id: totrans-4
  prefs: []
  type: TYPE_PRE
  zh: '[PRE0]'
- en: '| [`CMUARCTIC`](generated/torchaudio.datasets.CMUARCTIC.html#torchaudio.datasets.CMUARCTIC
    "torchaudio.datasets.CMUARCTIC") | *CMU ARCTIC* [[Kominek *et al.*, 2003](references.html#id36
    "John Kominek, Alan W Black, and Ver Ver. Cmu arctic databases for speech synthesis.
    Technical Report, 2003.")] dataset. |'
  id: totrans-5
  prefs: []
  type: TYPE_TB
  zh: '| [`CMUARCTIC`](generated/torchaudio.datasets.CMUARCTIC.html#torchaudio.datasets.CMUARCTIC
    "torchaudio.datasets.CMUARCTIC") | *CMU ARCTIC* [[Kominek *et al.*, 2003](references.html#id36
    "John Kominek, Alan W Black, and Ver Ver. Cmu arctic databases for speech synthesis.
    Technical Report, 2003.")] 数据集。|'
- en: '| [`CMUDict`](generated/torchaudio.datasets.CMUDict.html#torchaudio.datasets.CMUDict
    "torchaudio.datasets.CMUDict") | *CMU Pronouncing Dictionary* [[Weide, 1998](references.html#id45
    "R.L. Weide. The carnegie mellon pronuncing dictionary. 1998\. URL: http://www.speech.cs.cmu.edu/cgi-bin/cmudict.")]
    (CMUDict) dataset. |'
  id: totrans-6
  prefs: []
  type: TYPE_TB
  zh: '| [`CMUDict`](generated/torchaudio.datasets.CMUDict.html#torchaudio.datasets.CMUDict
    "torchaudio.datasets.CMUDict") | *CMU Pronouncing Dictionary* [[Weide, 1998](references.html#id45
    "R.L. Weide. The carnegie mellon pronuncing dictionary. 1998\. URL: http://www.speech.cs.cmu.edu/cgi-bin/cmudict.")]
    (CMUDict) 数据集。|'
- en: '| [`COMMONVOICE`](generated/torchaudio.datasets.COMMONVOICE.html#torchaudio.datasets.COMMONVOICE
    "torchaudio.datasets.COMMONVOICE") | *CommonVoice* [[Ardila *et al.*, 2020](references.html#id10
    "Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler,
    Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, and Gregor Weber.
    Common voice: a massively-multilingual speech corpus. 2020\. arXiv:1912.06670.")]
    dataset. |'
  id: totrans-7
  prefs: []
  type: TYPE_TB
  zh: '| [`COMMONVOICE`](generated/torchaudio.datasets.COMMONVOICE.html#torchaudio.datasets.COMMONVOICE
    "torchaudio.datasets.COMMONVOICE") | *CommonVoice* [[Ardila *et al.*, 2020](references.html#id10
    "Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler,
    Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, and Gregor Weber.
    Common voice: a massively-multilingual speech corpus. 2020\. arXiv:1912.06670.")]
    数据集。|'
- en: '| [`DR_VCTK`](generated/torchaudio.datasets.DR_VCTK.html#torchaudio.datasets.DR_VCTK
    "torchaudio.datasets.DR_VCTK") | *Device Recorded VCTK (Small subset version)*
    [[Sarfjoo and Yamagishi, 2018](references.html#id42 "Seyyed Saeed Sarfjoo and
    Junichi Yamagishi. Device recorded vctk (small subset version). 2018.")] dataset.
    |'
  id: totrans-8
  prefs: []
  type: TYPE_TB
  zh: '| [`DR_VCTK`](generated/torchaudio.datasets.DR_VCTK.html#torchaudio.datasets.DR_VCTK
    "torchaudio.datasets.DR_VCTK") | *Device Recorded VCTK (Small subset version)*
    [[Sarfjoo and Yamagishi, 2018](references.html#id42 "Seyyed Saeed Sarfjoo and
    Junichi Yamagishi. Device recorded vctk (small subset version). 2018.")] 数据集。|'
- en: '| [`FluentSpeechCommands`](generated/torchaudio.datasets.FluentSpeechCommands.html#torchaudio.datasets.FluentSpeechCommands
    "torchaudio.datasets.FluentSpeechCommands") | *Fluent Speech Commands* [[Lugosch
    *et al.*, 2019](references.html#id48 "Loren Lugosch, Mirco Ravanelli, Patrick
    Ignoto, Vikrant Singh Tomar, and Yoshua Bengio. Speech model pre-training for
    end-to-end spoken language understanding. In Gernot Kubin and Zdravko Kacic, editors,
    Proc. of Interspeech, 814–818\. 2019.")] dataset |'
  id: totrans-9
  prefs: []
  type: TYPE_TB
  zh: '| [`FluentSpeechCommands`](generated/torchaudio.datasets.FluentSpeechCommands.html#torchaudio.datasets.FluentSpeechCommands
    "torchaudio.datasets.FluentSpeechCommands") | *Fluent Speech Commands* [[Lugosch
    *et al.*, 2019](references.html#id48 "Loren Lugosch, Mirco Ravanelli, Patrick
    Ignoto, Vikrant Singh Tomar, and Yoshua Bengio. Speech model pre-training for
    end-to-end spoken language understanding. In Gernot Kubin and Zdravko Kacic, editors,
    Proc. of Interspeech, 814–818\. 2019.")] 数据集|'
- en: '| [`GTZAN`](generated/torchaudio.datasets.GTZAN.html#torchaudio.datasets.GTZAN
    "torchaudio.datasets.GTZAN") | *GTZAN* [[Tzanetakis *et al.*, 2001](references.html#id43
    "George Tzanetakis, Georg Essl, and Perry Cook. Automatic musical genre classification
    of audio signals. 2001\. URL: http://ismir2001.ismir.net/pdf/tzanetakis.pdf.")]
    dataset. |'
  id: totrans-10
  prefs: []
  type: TYPE_TB
  zh: '| [`GTZAN`](generated/torchaudio.datasets.GTZAN.html#torchaudio.datasets.GTZAN
    "torchaudio.datasets.GTZAN") | *GTZAN* [[Tzanetakis *et al.*, 2001](references.html#id43
    "George Tzanetakis, Georg Essl, and Perry Cook. Automatic musical genre classification
    of audio signals. 2001\. URL: http://ismir2001.ismir.net/pdf/tzanetakis.pdf.")]
    数据集。|'
- en: '| [`IEMOCAP`](generated/torchaudio.datasets.IEMOCAP.html#torchaudio.datasets.IEMOCAP
    "torchaudio.datasets.IEMOCAP") | *IEMOCAP* [[Busso *et al.*, 2008](references.html#id52
    "Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower Provost,
    Samuel Kim, Jeannette Chang, Sungbok Lee, and Shrikanth Narayanan. Iemocap: interactive
    emotional dyadic motion capture database. Language Resources and Evaluation, 42:335-359,
    12 2008\. doi:10.1007/s10579-008-9076-6.")] dataset. |'
  id: totrans-11
  prefs: []
  type: TYPE_TB
  zh: '| [`IEMOCAP`](generated/torchaudio.datasets.IEMOCAP.html#torchaudio.datasets.IEMOCAP
    "torchaudio.datasets.IEMOCAP") | *IEMOCAP* [[Busso *et al.*, 2008](references.html#id52
    "Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower Provost,
    Samuel Kim, Jeannette Chang, Sungbok Lee, and Shrikanth Narayanan. Iemocap: interactive
    emotional dyadic motion capture database. Language Resources and Evaluation, 42:335-359,
    12 2008\. doi:10.1007/s10579-008-9076-6.")] 数据集。|'
- en: '| [`LibriMix`](generated/torchaudio.datasets.LibriMix.html#torchaudio.datasets.LibriMix
    "torchaudio.datasets.LibriMix") | *LibriMix* [[Cosentino *et al.*, 2020](references.html#id37
    "Joris Cosentino, Manuel Pariente, Samuele Cornell, Antoine Deleforge, and Emmanuel
    Vincent. Librimix: an open-source dataset for generalizable speech separation.
    2020\. arXiv:2005.11262.")] dataset. |'
  id: totrans-12
  prefs: []
  type: TYPE_TB
  zh: '| [`LibriMix`](generated/torchaudio.datasets.LibriMix.html#torchaudio.datasets.LibriMix
    "torchaudio.datasets.LibriMix") | *LibriMix* [[Cosentino *et al.*, 2020](references.html#id37
    "Joris Cosentino, Manuel Pariente, Samuele Cornell, Antoine Deleforge, and Emmanuel
    Vincent. Librimix: an open-source dataset for generalizable speech separation.
    2020\. arXiv:2005.11262.")] 数据集。|'
- en: '| [`LIBRISPEECH`](generated/torchaudio.datasets.LIBRISPEECH.html#torchaudio.datasets.LIBRISPEECH
    "torchaudio.datasets.LIBRISPEECH") | *LibriSpeech* [[Panayotov *et al.*, 2015](references.html#id13
    "Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. Librispeech:
    an asr corpus based on public domain audio books. In 2015 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), volume, 5206-5210\. 2015\.
    doi:10.1109/ICASSP.2015.7178964.")] dataset. |'
  id: totrans-13
  prefs: []
  type: TYPE_TB
  zh: '| [`LIBRISPEECH`](generated/torchaudio.datasets.LIBRISPEECH.html#torchaudio.datasets.LIBRISPEECH
    "torchaudio.datasets.LIBRISPEECH") | *LibriSpeech* [[Panayotov *et al.*, 2015](references.html#id13
    "Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. Librispeech:
    an asr corpus based on public domain audio books. In 2015 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), volume, 5206-5210\. 2015\.
    doi:10.1109/ICASSP.2015.7178964.")] 数据集。|'
- en: '| [`LibriLightLimited`](generated/torchaudio.datasets.LibriLightLimited.html#torchaudio.datasets.LibriLightLimited
    "torchaudio.datasets.LibriLightLimited") | Subset of Libri-light [[Kahn *et al.*,
    2020](references.html#id12 "J. Kahn, M. Rivière, W. Zheng, E. Kharitonov, Q. Xu,
    P. E. Mazaré, J. Karadayi, V. Liptchinsky, R. Collobert, C. Fuegen, T. Likhomanenko,
    G. Synnaeve, A. Joulin, A. Mohamed, and E. Dupoux. Libri-light: a benchmark for
    asr with limited or no supervision. In ICASSP 2020 - 2020 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), 7669-7673\. 2020\. \url https://github.com/facebookresearch/libri-light.")]
    dataset, which was used in HuBERT [[Hsu *et al.*, 2021](references.html#id16 "Wei-Ning
    Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov,
    and Abdelrahman Mohamed. Hubert: self-supervised speech representation learning
    by masked prediction of hidden units. 2021\. arXiv:2106.07447.")] for supervised
    fine-tuning. |'
  id: totrans-14
  prefs: []
  type: TYPE_TB
  zh: '| [`LibriLightLimited`](generated/torchaudio.datasets.LibriLightLimited.html#torchaudio.datasets.LibriLightLimited
    "torchaudio.datasets.LibriLightLimited") | Libri-light的子集 [[Kahn *et al.*, 2020](references.html#id12
    "J. Kahn, M. Rivière, W. Zheng, E. Kharitonov, Q. Xu, P. E. Mazaré, J. Karadayi,
    V. Liptchinsky, R. Collobert, C. Fuegen, T. Likhomanenko, G. Synnaeve, A. Joulin,
    A. Mohamed, and E. Dupoux. Libri-light: a benchmark for asr with limited or no
    supervision. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics,
    Speech and Signal Processing (ICASSP), 7669-7673\. 2020\. \url https://github.com/facebookresearch/libri-light.")]
    数据集，被用于HuBERT [[Hsu *et al.*, 2021](references.html#id16 "Wei-Ning Hsu, Benjamin
    Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, and Abdelrahman
    Mohamed. Hubert: self-supervised speech representation learning by masked prediction
    of hidden units. 2021\. arXiv:2106.07447.")] 进行监督微调。|'
- en: '| [`LIBRITTS`](generated/torchaudio.datasets.LIBRITTS.html#torchaudio.datasets.LIBRITTS
    "torchaudio.datasets.LIBRITTS") | *LibriTTS* [[Zen *et al.*, 2019](references.html#id38
    "Heiga Zen, Viet-Trung Dang, Robert A. J. Clark, Yu Zhang, Ron J. Weiss, Ye Jia,
    Z. Chen, and Yonghui Wu. Libritts: a corpus derived from librispeech for text-to-speech.
    ArXiv, 2019.")] dataset. |'
  id: totrans-15
  prefs: []
  type: TYPE_TB
  zh: '| [`LIBRITTS`](generated/torchaudio.datasets.LIBRITTS.html#torchaudio.datasets.LIBRITTS
    "torchaudio.datasets.LIBRITTS") | *LibriTTS* [[Zen *et al.*, 2019](references.html#id38
    "Heiga Zen, Viet-Trung Dang, Robert A. J. Clark, Yu Zhang, Ron J. Weiss, Ye Jia,
    Z. Chen, and Yonghui Wu. Libritts: a corpus derived from librispeech for text-to-speech.
    ArXiv, 2019.")] 数据集。|'
- en: '| [`LJSPEECH`](generated/torchaudio.datasets.LJSPEECH.html#torchaudio.datasets.LJSPEECH
    "torchaudio.datasets.LJSPEECH") | *LJSpeech-1.1* [[Ito and Johnson, 2017](references.html#id7
    "Keith Ito and Linda Johnson. The lj speech dataset. \url https://keithito.com/LJ-Speech-Dataset/,
    2017.")] dataset. |'
  id: totrans-16
  prefs: []
  type: TYPE_TB
  zh: '| [`LJSPEECH`](generated/torchaudio.datasets.LJSPEECH.html#torchaudio.datasets.LJSPEECH
    "torchaudio.datasets.LJSPEECH") | *LJSpeech-1.1* [[Ito and Johnson, 2017](references.html#id7
    "Keith Ito and Linda Johnson. The lj speech dataset. \url https://keithito.com/LJ-Speech-Dataset/,
    2017.")] 数据集。|'
- en: '| [`MUSDB_HQ`](generated/torchaudio.datasets.MUSDB_HQ.html#torchaudio.datasets.MUSDB_HQ
    "torchaudio.datasets.MUSDB_HQ") | *MUSDB_HQ* [[Rafii *et al.*, 2019](references.html#id47
    "Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis,
    and Rachel Bittner. MUSDB18-HQ - an uncompressed version of musdb18\. December
    2019\. URL: https://doi.org/10.5281/zenodo.3338373, doi:10.5281/zenodo.3338373.")]
    dataset. |'
  id: totrans-17
  prefs: []
  type: TYPE_TB
  zh: '| [`MUSDB_HQ`](generated/torchaudio.datasets.MUSDB_HQ.html#torchaudio.datasets.MUSDB_HQ
    "torchaudio.datasets.MUSDB_HQ") | *MUSDB_HQ* [[Rafii *et al.*, 2019](references.html#id47
    "Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis,
    and Rachel Bittner. MUSDB18-HQ - an uncompressed version of musdb18\. December
    2019\. URL: https://doi.org/10.5281/zenodo.3338373, doi:10.5281/zenodo.3338373.")]
    数据集。|'
- en: '| [`QUESST14`](generated/torchaudio.datasets.QUESST14.html#torchaudio.datasets.QUESST14
    "torchaudio.datasets.QUESST14") | *QUESST14* [[Miro *et al.*, 2015](references.html#id44
    "Xavier Anguera Miro, Luis Javier Rodriguez-Fuentes, Andi Buzo, Florian Metze,
    Igor Szoke, and Mikel Peñagarikano. Quesst2014: evaluating query-by-example speech
    search in a zero-resource setting with real-life queries. 2015 IEEE International
    Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5833-5837,
    2015.")] dataset. |'
  id: totrans-18
  prefs: []
  type: TYPE_TB
  zh: '| [`QUESST14`](generated/torchaudio.datasets.QUESST14.html#torchaudio.datasets.QUESST14
    "torchaudio.datasets.QUESST14") | *QUESST14* [[Miro *et al.*, 2015](references.html#id44
    "Xavier Anguera Miro, Luis Javier Rodriguez-Fuentes, Andi Buzo, Florian Metze,
    Igor Szoke, and Mikel Peñagarikano. Quesst2014: evaluating query-by-example speech
    search in a zero-resource setting with real-life queries. 2015 IEEE International
    Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5833-5837,
    2015.")] 数据集。|'
- en: '| [`Snips`](generated/torchaudio.datasets.Snips.html#torchaudio.datasets.Snips
    "torchaudio.datasets.Snips") | *Snips* [[Coucke *et al.*, 2018](references.html#id53
    "Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David
    Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut
    Lavril, and others. Snips voice platform: an embedded spoken language understanding
    system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190,
    2018.")] dataset. |'
  id: totrans-19
  prefs: []
  type: TYPE_TB
  zh: '| [`Snips`](generated/torchaudio.datasets.Snips.html#torchaudio.datasets.Snips
    "torchaudio.datasets.Snips") | *Snips* [[Coucke *et al.*, 2018](references.html#id53
    "Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David
    Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut
    Lavril, and others. Snips voice platform: an embedded spoken language understanding
    system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190,
    2018.")] 数据集。|'
- en: '| [`SPEECHCOMMANDS`](generated/torchaudio.datasets.SPEECHCOMMANDS.html#torchaudio.datasets.SPEECHCOMMANDS
    "torchaudio.datasets.SPEECHCOMMANDS") | *Speech Commands* [[Warden, 2018](references.html#id39
    "P. Warden. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition.
    ArXiv e-prints, April 2018\. URL: https://arxiv.org/abs/1804.03209, arXiv:1804.03209.")]
    dataset. |'
  id: totrans-20
  prefs: []
  type: TYPE_TB
  zh: '| [`SPEECHCOMMANDS`](generated/torchaudio.datasets.SPEECHCOMMANDS.html#torchaudio.datasets.SPEECHCOMMANDS
    "torchaudio.datasets.SPEECHCOMMANDS") | *Speech Commands* [[Warden, 2018](references.html#id39
    "P. Warden. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition.
    ArXiv e-prints, April 2018\. URL: https://arxiv.org/abs/1804.03209, arXiv:1804.03209.")]
    数据集。 |'
- en: '| [`TEDLIUM`](generated/torchaudio.datasets.TEDLIUM.html#torchaudio.datasets.TEDLIUM
    "torchaudio.datasets.TEDLIUM") | *Tedlium* [[Rousseau *et al.*, 2012](references.html#id40
    "Anthony Rousseau, Paul Deléglise, and Yannick Estève. Ted-lium: an automatic
    speech recognition dedicated corpus. In Conference on Language Resources and Evaluation
    (LREC), 125–129\. 2012.")] dataset (releases 1,2 and 3). |'
  id: totrans-21
  prefs: []
  type: TYPE_TB
  zh: '| [`TEDLIUM`](generated/torchaudio.datasets.TEDLIUM.html#torchaudio.datasets.TEDLIUM
    "torchaudio.datasets.TEDLIUM") | *Tedlium* [[Rousseau *et al.*, 2012](references.html#id40
    "Anthony Rousseau, Paul Deléglise, and Yannick Estève. Ted-lium: an automatic
    speech recognition dedicated corpus. In Conference on Language Resources and Evaluation
    (LREC), 125–129\. 2012.")] 数据集（版本1、2和3）。 |'
- en: '| [`VCTK_092`](generated/torchaudio.datasets.VCTK_092.html#torchaudio.datasets.VCTK_092
    "torchaudio.datasets.VCTK_092") | *VCTK 0.92* [[Yamagishi *et al.*, 2019](references.html#id41
    "Junichi Yamagishi, Christophe Veaux, and Kirsten MacDonald. CSTR VCTK Corpus:
    english multi-speaker corpus for CSTR voice cloning toolkit (version 0.92). 2019\.
    doi:10.7488/ds/2645.")] dataset |'
  id: totrans-22
  prefs: []
  type: TYPE_TB
  zh: '| [`VCTK_092`](generated/torchaudio.datasets.VCTK_092.html#torchaudio.datasets.VCTK_092
    "torchaudio.datasets.VCTK_092") | *VCTK 0.92* [[Yamagishi *et al.*, 2019](references.html#id41
    "Junichi Yamagishi, Christophe Veaux, and Kirsten MacDonald. CSTR VCTK Corpus:
    english multi-speaker corpus for CSTR voice cloning toolkit (version 0.92). 2019\.
    doi:10.7488/ds/2645.")] 数据集 |'
- en: '| [`VoxCeleb1Identification`](generated/torchaudio.datasets.VoxCeleb1Identification.html#torchaudio.datasets.VoxCeleb1Identification
    "torchaudio.datasets.VoxCeleb1Identification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] dataset for speaker identification task. |'
  id: totrans-23
  prefs: []
  type: TYPE_TB
  zh: '| [`VoxCeleb1Identification`](generated/torchaudio.datasets.VoxCeleb1Identification.html#torchaudio.datasets.VoxCeleb1Identification
    "torchaudio.datasets.VoxCeleb1Identification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] 用于说话人识别任务的数据集。 |'
- en: '| [`VoxCeleb1Verification`](generated/torchaudio.datasets.VoxCeleb1Verification.html#torchaudio.datasets.VoxCeleb1Verification
    "torchaudio.datasets.VoxCeleb1Verification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] dataset for speaker verification task. |'
  id: totrans-24
  prefs: []
  type: TYPE_TB
  zh: '| [`VoxCeleb1Verification`](generated/torchaudio.datasets.VoxCeleb1Verification.html#torchaudio.datasets.VoxCeleb1Verification
    "torchaudio.datasets.VoxCeleb1Verification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] 用于说话人验证任务的数据集。 |'
- en: '| [`YESNO`](generated/torchaudio.datasets.YESNO.html#torchaudio.datasets.YESNO
    "torchaudio.datasets.YESNO") | *YesNo* [[*YesNo*, n.d.](references.html#id46 "Yesno.
    URL: http://www.openslr.org/1/.")] dataset. |'
  id: totrans-25
  prefs: []
  type: TYPE_TB
  zh: '| [`YESNO`](generated/torchaudio.datasets.YESNO.html#torchaudio.datasets.YESNO
    "torchaudio.datasets.YESNO") | *YesNo* [[*YesNo*, n.d.](references.html#id46 "Yesno.
    URL: http://www.openslr.org/1/.")] 数据集。 |'