- en: torchaudio.datasets
  id: totrans-0
  prefs:
  - PREF_H1
  type: TYPE_NORMAL
  zh: torchaudio.datasets
- en: 'Original: [https://pytorch.org/audio/stable/datasets.html](https://pytorch.org/audio/stable/datasets.html)'
  id: totrans-1
  prefs:
  - PREF_BQ
  type: TYPE_NORMAL
  zh: 原文:[https://pytorch.org/audio/stable/datasets.html](https://pytorch.org/audio/stable/datasets.html)
- en: All datasets are subclasses of [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset
    "(in PyTorch v2.1)") and have `__getitem__` and `__len__` methods implemented.
  id: totrans-2
  prefs: []
  type: TYPE_NORMAL
  zh: 所有数据集都是 [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset
    "(在 PyTorch v2.1 中)") 的子类,并实现了 `__getitem__` 和 `__len__` 方法。
- en: 'Hence, they can all be passed to a [`torch.utils.data.DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
    "(in PyTorch v2.1)") which can load multiple samples in parallel using [`torch.multiprocessing`](https://pytorch.org/docs/stable/multiprocessing.html#module-torch.multiprocessing
    "(in PyTorch v2.1)") workers. For example:'
  id: totrans-3
  prefs: []
  type: TYPE_NORMAL
  zh: 因此,它们都可以传递给 [`torch.utils.data.DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
    "(在 PyTorch v2.1 中)"),该加载器可以使用 [`torch.multiprocessing`](https://pytorch.org/docs/stable/multiprocessing.html#module-torch.multiprocessing
    "(在 PyTorch v2.1 中)") 工作器并行加载多个样本。例如:
- en: '[PRE0]'
  id: totrans-4
  prefs: []
  type: TYPE_PRE
  zh: '[PRE0]'
- en: '| [`CMUARCTIC`](generated/torchaudio.datasets.CMUARCTIC.html#torchaudio.datasets.CMUARCTIC
    "torchaudio.datasets.CMUARCTIC") | *CMU ARCTIC* [[Kominek *et al.*, 2003](references.html#id36
    "John Kominek, Alan W Black, and Ver Ver. Cmu arctic databases for speech synthesis.
    Technical Report, 2003.")] dataset. |'
  id: totrans-5
  prefs: []
  type: TYPE_TB
  zh: '| [`CMUARCTIC`](generated/torchaudio.datasets.CMUARCTIC.html#torchaudio.datasets.CMUARCTIC
    "torchaudio.datasets.CMUARCTIC") | *CMU ARCTIC* [[Kominek *et al.*, 2003](references.html#id36
    "John Kominek, Alan W Black, and Ver Ver. Cmu arctic databases for speech synthesis.
    Technical Report, 2003.")] 数据集。|'
- en: '| [`CMUDict`](generated/torchaudio.datasets.CMUDict.html#torchaudio.datasets.CMUDict
    "torchaudio.datasets.CMUDict") | *CMU Pronouncing Dictionary* [[Weide, 1998](references.html#id45
    "R.L. Weide. The carnegie mellon pronuncing dictionary. 1998\. URL: http://www.speech.cs.cmu.edu/cgi-bin/cmudict.")]
    (CMUDict) dataset. |'
  id: totrans-6
  prefs: []
  type: TYPE_TB
  zh: '| [`CMUDict`](generated/torchaudio.datasets.CMUDict.html#torchaudio.datasets.CMUDict
    "torchaudio.datasets.CMUDict") | *CMU Pronouncing Dictionary* [[Weide, 1998](references.html#id45
    "R.L. Weide. The carnegie mellon pronuncing dictionary. 1998\. URL: http://www.speech.cs.cmu.edu/cgi-bin/cmudict.")]
    (CMUDict) 数据集。|'
- en: '| [`COMMONVOICE`](generated/torchaudio.datasets.COMMONVOICE.html#torchaudio.datasets.COMMONVOICE
    "torchaudio.datasets.COMMONVOICE") | *CommonVoice* [[Ardila *et al.*, 2020](references.html#id10
    "Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler,
    Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, and Gregor Weber.
    Common voice: a massively-multilingual speech corpus. 2020\. arXiv:1912.06670.")]
    dataset. |'
  id: totrans-7
  prefs: []
  type: TYPE_TB
  zh: '| [`COMMONVOICE`](generated/torchaudio.datasets.COMMONVOICE.html#torchaudio.datasets.COMMONVOICE
    "torchaudio.datasets.COMMONVOICE") | *CommonVoice* [[Ardila *et al.*, 2020](references.html#id10
    "Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler,
    Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, and Gregor Weber.
    Common voice: a massively-multilingual speech corpus. 2020\. arXiv:1912.06670.")]
    数据集。|'
- en: '| [`DR_VCTK`](generated/torchaudio.datasets.DR_VCTK.html#torchaudio.datasets.DR_VCTK
    "torchaudio.datasets.DR_VCTK") | *Device Recorded VCTK (Small subset version)*
    [[Sarfjoo and Yamagishi, 2018](references.html#id42 "Seyyed Saeed Sarfjoo and
    Junichi Yamagishi. Device recorded vctk (small subset version). 2018.")] dataset.
    |'
  id: totrans-8
  prefs: []
  type: TYPE_TB
  zh: '| [`DR_VCTK`](generated/torchaudio.datasets.DR_VCTK.html#torchaudio.datasets.DR_VCTK
    "torchaudio.datasets.DR_VCTK") | *Device Recorded VCTK (Small subset version)*
    [[Sarfjoo and Yamagishi, 2018](references.html#id42 "Seyyed Saeed Sarfjoo and
    Junichi Yamagishi. Device recorded vctk (small subset version). 2018.")] 数据集。|'
- en: '| [`FluentSpeechCommands`](generated/torchaudio.datasets.FluentSpeechCommands.html#torchaudio.datasets.FluentSpeechCommands
    "torchaudio.datasets.FluentSpeechCommands") | *Fluent Speech Commands* [[Lugosch
    *et al.*, 2019](references.html#id48 "Loren Lugosch, Mirco Ravanelli, Patrick
    Ignoto, Vikrant Singh Tomar, and Yoshua Bengio. Speech model pre-training for
    end-to-end spoken language understanding. In Gernot Kubin and Zdravko Kacic, editors,
    Proc. of Interspeech, 814–818\. 2019.")] dataset. |'
  id: totrans-9
  prefs: []
  type: TYPE_TB
  zh: '| [`FluentSpeechCommands`](generated/torchaudio.datasets.FluentSpeechCommands.html#torchaudio.datasets.FluentSpeechCommands
    "torchaudio.datasets.FluentSpeechCommands") | *Fluent Speech Commands* [[Lugosch
    *et al.*, 2019](references.html#id48 "Loren Lugosch, Mirco Ravanelli, Patrick
    Ignoto, Vikrant Singh Tomar, and Yoshua Bengio. Speech model pre-training for
    end-to-end spoken language understanding. In Gernot Kubin and Zdravko Kacic, editors,
    Proc. of Interspeech, 814–818\. 2019.")] 数据集。|'
- en: '| [`GTZAN`](generated/torchaudio.datasets.GTZAN.html#torchaudio.datasets.GTZAN
    "torchaudio.datasets.GTZAN") | *GTZAN* [[Tzanetakis *et al.*, 2001](references.html#id43
    "George Tzanetakis, Georg Essl, and Perry Cook. Automatic musical genre classification
    of audio signals. 2001\. URL: http://ismir2001.ismir.net/pdf/tzanetakis.pdf.")]
    dataset. |'
  id: totrans-10
  prefs: []
  type: TYPE_TB
  zh: '| [`GTZAN`](generated/torchaudio.datasets.GTZAN.html#torchaudio.datasets.GTZAN
    "torchaudio.datasets.GTZAN") | *GTZAN* [[Tzanetakis *et al.*, 2001](references.html#id43
    "George Tzanetakis, Georg Essl, and Perry Cook. Automatic musical genre classification
    of audio signals. 2001\. URL: http://ismir2001.ismir.net/pdf/tzanetakis.pdf.")]
    数据集。|'
- en: '| [`IEMOCAP`](generated/torchaudio.datasets.IEMOCAP.html#torchaudio.datasets.IEMOCAP
    "torchaudio.datasets.IEMOCAP") | *IEMOCAP* [[Busso *et al.*, 2008](references.html#id52
    "Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower Provost,
    Samuel Kim, Jeannette Chang, Sungbok Lee, and Shrikanth Narayanan. Iemocap: interactive
    emotional dyadic motion capture database. Language Resources and Evaluation, 42:335-359,
    12 2008\. doi:10.1007/s10579-008-9076-6.")] dataset. |'
  id: totrans-11
  prefs: []
  type: TYPE_TB
  zh: '| [`IEMOCAP`](generated/torchaudio.datasets.IEMOCAP.html#torchaudio.datasets.IEMOCAP
    "torchaudio.datasets.IEMOCAP") | *IEMOCAP* [[Busso *et al.*, 2008](references.html#id52
    "Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower Provost,
    Samuel Kim, Jeannette Chang, Sungbok Lee, and Shrikanth Narayanan. Iemocap: interactive
    emotional dyadic motion capture database. Language Resources and Evaluation, 42:335-359,
    12 2008\. doi:10.1007/s10579-008-9076-6.")] 数据集。|'
- en: '| [`LibriMix`](generated/torchaudio.datasets.LibriMix.html#torchaudio.datasets.LibriMix
    "torchaudio.datasets.LibriMix") | *LibriMix* [[Cosentino *et al.*, 2020](references.html#id37
    "Joris Cosentino, Manuel Pariente, Samuele Cornell, Antoine Deleforge, and Emmanuel
    Vincent. Librimix: an open-source dataset for generalizable speech separation.
    2020\. arXiv:2005.11262.")] dataset. |'
  id: totrans-12
  prefs: []
  type: TYPE_TB
  zh: '| [`LibriMix`](generated/torchaudio.datasets.LibriMix.html#torchaudio.datasets.LibriMix
    "torchaudio.datasets.LibriMix") | *LibriMix* [[Cosentino *et al.*, 2020](references.html#id37
    "Joris Cosentino, Manuel Pariente, Samuele Cornell, Antoine Deleforge, and Emmanuel
    Vincent. Librimix: an open-source dataset for generalizable speech separation.
    2020\. arXiv:2005.11262.")] 数据集。|'
- en: '| [`LIBRISPEECH`](generated/torchaudio.datasets.LIBRISPEECH.html#torchaudio.datasets.LIBRISPEECH
    "torchaudio.datasets.LIBRISPEECH") | *LibriSpeech* [[Panayotov *et al.*, 2015](references.html#id13
    "Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. Librispeech:
    an asr corpus based on public domain audio books. In 2015 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), volume, 5206-5210\. 2015\.
    doi:10.1109/ICASSP.2015.7178964.")] dataset. |'
  id: totrans-13
  prefs: []
  type: TYPE_TB
  zh: '| [`LIBRISPEECH`](generated/torchaudio.datasets.LIBRISPEECH.html#torchaudio.datasets.LIBRISPEECH
    "torchaudio.datasets.LIBRISPEECH") | *LibriSpeech* [[Panayotov *et al.*, 2015](references.html#id13
    "Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. Librispeech:
    an asr corpus based on public domain audio books. In 2015 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), volume, 5206-5210\. 2015\.
    doi:10.1109/ICASSP.2015.7178964.")] 数据集。|'
- en: '| [`LibriLightLimited`](generated/torchaudio.datasets.LibriLightLimited.html#torchaudio.datasets.LibriLightLimited
    "torchaudio.datasets.LibriLightLimited") | Subset of Libri-light [[Kahn *et al.*,
    2020](references.html#id12 "J. Kahn, M. Rivière, W. Zheng, E. Kharitonov, Q. Xu,
    P. E. Mazaré, J. Karadayi, V. Liptchinsky, R. Collobert, C. Fuegen, T. Likhomanenko,
    G. Synnaeve, A. Joulin, A. Mohamed, and E. Dupoux. Libri-light: a benchmark for
    asr with limited or no supervision. In ICASSP 2020 - 2020 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), 7669-7673\. 2020\. \url https://github.com/facebookresearch/libri-light.")]
    dataset, which was used in HuBERT [[Hsu *et al.*, 2021](references.html#id16 "Wei-Ning
    Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov,
    and Abdelrahman Mohamed. Hubert: self-supervised speech representation learning
    by masked prediction of hidden units. 2021\. arXiv:2106.07447.")] for supervised
    fine-tuning. |'
  id: totrans-14
  prefs: []
  type: TYPE_TB
  zh: '| [`LibriLightLimited`](generated/torchaudio.datasets.LibriLightLimited.html#torchaudio.datasets.LibriLightLimited
    "torchaudio.datasets.LibriLightLimited") | Libri-light的子集 [[Kahn *et al.*, 2020](references.html#id12
    "J. Kahn, M. Rivière, W. Zheng, E. Kharitonov, Q. Xu, P. E. Mazaré, J. Karadayi,
    V. Liptchinsky, R. Collobert, C. Fuegen, T. Likhomanenko, G. Synnaeve, A. Joulin,
    A. Mohamed, and E. Dupoux. Libri-light: a benchmark for asr with limited or no
    supervision. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics,
    Speech and Signal Processing (ICASSP), 7669-7673\. 2020\. \url https://github.com/facebookresearch/libri-light.")]
    数据集,被用于HuBERT [[Hsu *et al.*, 2021](references.html#id16 "Wei-Ning Hsu, Benjamin
    Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, and Abdelrahman
    Mohamed. Hubert: self-supervised speech representation learning by masked prediction
    of hidden units. 2021\. arXiv:2106.07447.")] 进行监督微调。|'
- en: '| [`LIBRITTS`](generated/torchaudio.datasets.LIBRITTS.html#torchaudio.datasets.LIBRITTS
    "torchaudio.datasets.LIBRITTS") | *LibriTTS* [[Zen *et al.*, 2019](references.html#id38
    "Heiga Zen, Viet-Trung Dang, Robert A. J. Clark, Yu Zhang, Ron J. Weiss, Ye Jia,
    Z. Chen, and Yonghui Wu. Libritts: a corpus derived from librispeech for text-to-speech.
    ArXiv, 2019.")] dataset. |'
  id: totrans-15
  prefs: []
  type: TYPE_TB
  zh: '| [`LIBRITTS`](generated/torchaudio.datasets.LIBRITTS.html#torchaudio.datasets.LIBRITTS
    "torchaudio.datasets.LIBRITTS") | *LibriTTS* [[Zen *et al.*, 2019](references.html#id38
    "Heiga Zen, Viet-Trung Dang, Robert A. J. Clark, Yu Zhang, Ron J. Weiss, Ye Jia,
    Z. Chen, and Yonghui Wu. Libritts: a corpus derived from librispeech for text-to-speech.
    ArXiv, 2019.")] 数据集。|'
- en: '| [`LJSPEECH`](generated/torchaudio.datasets.LJSPEECH.html#torchaudio.datasets.LJSPEECH
    "torchaudio.datasets.LJSPEECH") | *LJSpeech-1.1* [[Ito and Johnson, 2017](references.html#id7
    "Keith Ito and Linda Johnson. The lj speech dataset. \url https://keithito.com/LJ-Speech-Dataset/,
    2017.")] dataset. |'
  id: totrans-16
  prefs: []
  type: TYPE_TB
  zh: '| [`LJSPEECH`](generated/torchaudio.datasets.LJSPEECH.html#torchaudio.datasets.LJSPEECH
    "torchaudio.datasets.LJSPEECH") | *LJSpeech-1.1* [[Ito and Johnson, 2017](references.html#id7
    "Keith Ito and Linda Johnson. The lj speech dataset. \url https://keithito.com/LJ-Speech-Dataset/,
    2017.")] 数据集。|'
- en: '| [`MUSDB_HQ`](generated/torchaudio.datasets.MUSDB_HQ.html#torchaudio.datasets.MUSDB_HQ
    "torchaudio.datasets.MUSDB_HQ") | *MUSDB_HQ* [[Rafii *et al.*, 2019](references.html#id47
    "Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis,
    and Rachel Bittner. MUSDB18-HQ - an uncompressed version of musdb18\. December
    2019\. URL: https://doi.org/10.5281/zenodo.3338373, doi:10.5281/zenodo.3338373.")]
    dataset. |'
  id: totrans-17
  prefs: []
  type: TYPE_TB
  zh: '| [`MUSDB_HQ`](generated/torchaudio.datasets.MUSDB_HQ.html#torchaudio.datasets.MUSDB_HQ
    "torchaudio.datasets.MUSDB_HQ") | *MUSDB_HQ* [[Rafii *et al.*, 2019](references.html#id47
    "Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis,
    and Rachel Bittner. MUSDB18-HQ - an uncompressed version of musdb18\. December
    2019\. URL: https://doi.org/10.5281/zenodo.3338373, doi:10.5281/zenodo.3338373.")]
    数据集。|'
- en: '| [`QUESST14`](generated/torchaudio.datasets.QUESST14.html#torchaudio.datasets.QUESST14
    "torchaudio.datasets.QUESST14") | *QUESST14* [[Miro *et al.*, 2015](references.html#id44
    "Xavier Anguera Miro, Luis Javier Rodriguez-Fuentes, Andi Buzo, Florian Metze,
    Igor Szoke, and Mikel Peñagarikano. Quesst2014: evaluating query-by-example speech
    search in a zero-resource setting with real-life queries. 2015 IEEE International
    Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5833-5837,
    2015.")] dataset. |'
  id: totrans-18
  prefs: []
  type: TYPE_TB
  zh: '| [`QUESST14`](generated/torchaudio.datasets.QUESST14.html#torchaudio.datasets.QUESST14
    "torchaudio.datasets.QUESST14") | *QUESST14* [[Miro *et al.*, 2015](references.html#id44
    "Xavier Anguera Miro, Luis Javier Rodriguez-Fuentes, Andi Buzo, Florian Metze,
    Igor Szoke, and Mikel Peñagarikano. Quesst2014: evaluating query-by-example speech
    search in a zero-resource setting with real-life queries. 2015 IEEE International
    Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5833-5837,
    2015.")] 数据集。|'
- en: '| [`Snips`](generated/torchaudio.datasets.Snips.html#torchaudio.datasets.Snips
    "torchaudio.datasets.Snips") | *Snips* [[Coucke *et al.*, 2018](references.html#id53
    "Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David
    Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut
    Lavril, and others. Snips voice platform: an embedded spoken language understanding
    system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190,
    2018.")] dataset. |'
  id: totrans-19
  prefs: []
  type: TYPE_TB
  zh: '| [`Snips`](generated/torchaudio.datasets.Snips.html#torchaudio.datasets.Snips
    "torchaudio.datasets.Snips") | *Snips* [[Coucke *et al.*, 2018](references.html#id53
    "Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David
    Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut
    Lavril, and others. Snips voice platform: an embedded spoken language understanding
    system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190,
    2018.")] 数据集。|'
- en: '| [`SPEECHCOMMANDS`](generated/torchaudio.datasets.SPEECHCOMMANDS.html#torchaudio.datasets.SPEECHCOMMANDS
    "torchaudio.datasets.SPEECHCOMMANDS") | *Speech Commands* [[Warden, 2018](references.html#id39
    "P. Warden. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition.
    ArXiv e-prints, April 2018\. URL: https://arxiv.org/abs/1804.03209, arXiv:1804.03209.")]
    dataset. |'
  id: totrans-20
  prefs: []
  type: TYPE_TB
  zh: '| [`SPEECHCOMMANDS`](generated/torchaudio.datasets.SPEECHCOMMANDS.html#torchaudio.datasets.SPEECHCOMMANDS
    "torchaudio.datasets.SPEECHCOMMANDS") | *Speech Commands* [[Warden, 2018](references.html#id39
    "P. Warden. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition.
    ArXiv e-prints, April 2018\. URL: https://arxiv.org/abs/1804.03209, arXiv:1804.03209.")]
    数据集。 |'
- en: '| [`TEDLIUM`](generated/torchaudio.datasets.TEDLIUM.html#torchaudio.datasets.TEDLIUM
    "torchaudio.datasets.TEDLIUM") | *Tedlium* [[Rousseau *et al.*, 2012](references.html#id40
    "Anthony Rousseau, Paul Deléglise, and Yannick Estève. Ted-lium: an automatic
    speech recognition dedicated corpus. In Conference on Language Resources and Evaluation
    (LREC), 125–129\. 2012.")] dataset (releases 1,2 and 3). |'
  id: totrans-21
  prefs: []
  type: TYPE_TB
  zh: '| [`TEDLIUM`](generated/torchaudio.datasets.TEDLIUM.html#torchaudio.datasets.TEDLIUM
    "torchaudio.datasets.TEDLIUM") | *Tedlium* [[Rousseau *et al.*, 2012](references.html#id40
    "Anthony Rousseau, Paul Deléglise, and Yannick Estève. Ted-lium: an automatic
    speech recognition dedicated corpus. In Conference on Language Resources and Evaluation
    (LREC), 125–129\. 2012.")] 数据集(版本1、2和3)。 |'
- en: '| [`VCTK_092`](generated/torchaudio.datasets.VCTK_092.html#torchaudio.datasets.VCTK_092
    "torchaudio.datasets.VCTK_092") | *VCTK 0.92* [[Yamagishi *et al.*, 2019](references.html#id41
    "Junichi Yamagishi, Christophe Veaux, and Kirsten MacDonald. CSTR VCTK Corpus:
    english multi-speaker corpus for CSTR voice cloning toolkit (version 0.92). 2019\.
    doi:10.7488/ds/2645.")] dataset. |'
  id: totrans-22
  prefs: []
  type: TYPE_TB
  zh: '| [`VCTK_092`](generated/torchaudio.datasets.VCTK_092.html#torchaudio.datasets.VCTK_092
    "torchaudio.datasets.VCTK_092") | *VCTK 0.92* [[Yamagishi *et al.*, 2019](references.html#id41
    "Junichi Yamagishi, Christophe Veaux, and Kirsten MacDonald. CSTR VCTK Corpus:
    english multi-speaker corpus for CSTR voice cloning toolkit (version 0.92). 2019\.
    doi:10.7488/ds/2645.")] 数据集。|'
- en: '| [`VoxCeleb1Identification`](generated/torchaudio.datasets.VoxCeleb1Identification.html#torchaudio.datasets.VoxCeleb1Identification
    "torchaudio.datasets.VoxCeleb1Identification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] dataset for speaker identification task. |'
  id: totrans-23
  prefs: []
  type: TYPE_TB
  zh: '| [`VoxCeleb1Identification`](generated/torchaudio.datasets.VoxCeleb1Identification.html#torchaudio.datasets.VoxCeleb1Identification
    "torchaudio.datasets.VoxCeleb1Identification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] 用于说话人识别任务的数据集。 |'
- en: '| [`VoxCeleb1Verification`](generated/torchaudio.datasets.VoxCeleb1Verification.html#torchaudio.datasets.VoxCeleb1Verification
    "torchaudio.datasets.VoxCeleb1Verification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] dataset for speaker verification task. |'
  id: totrans-24
  prefs: []
  type: TYPE_TB
  zh: '| [`VoxCeleb1Verification`](generated/torchaudio.datasets.VoxCeleb1Verification.html#torchaudio.datasets.VoxCeleb1Verification
    "torchaudio.datasets.VoxCeleb1Verification") | *VoxCeleb1* [[Nagrani *et al.*,
    2017](references.html#id49 "Arsha Nagrani, Joon Son Chung, and Andrew Zisserman.
    Voxceleb: a large-scale speaker identification dataset. arXiv preprint arXiv:1706.08612,
    2017.")] 用于说话人验证任务的数据集。 |'
- en: '| [`YESNO`](generated/torchaudio.datasets.YESNO.html#torchaudio.datasets.YESNO
    "torchaudio.datasets.YESNO") | *YesNo* [[*YesNo*, n.d.](references.html#id46 "Yesno.
    URL: http://www.openslr.org/1/.")] dataset. |'
  id: totrans-25
  prefs: []
  type: TYPE_TB
  zh: '| [`YESNO`](generated/torchaudio.datasets.YESNO.html#torchaudio.datasets.YESNO
    "torchaudio.datasets.YESNO") | *YesNo* [[*YesNo*, n.d.](references.html#id46 "Yesno.
    URL: http://www.openslr.org/1/.")] 数据集。 |'