- en: torchaudio.models
  id: totrans-0
  prefs:
  - PREF_H1
  type: TYPE_NORMAL
  zh: torchaudio.models
- en: 原文:[https://pytorch.org/audio/stable/models.html](https://pytorch.org/audio/stable/models.html)
  id: totrans-1
  prefs:
  - PREF_BQ
  type: TYPE_NORMAL
  zh: 原文:[https://pytorch.org/audio/stable/models.html](https://pytorch.org/audio/stable/models.html)
- en: The `torchaudio.models` subpackage contains definitions of models for addressing
    common audio tasks.
  id: totrans-2
  prefs: []
  type: TYPE_NORMAL
  zh: '`torchaudio.models`子包包含用于处理常见音频任务的模型定义。'
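- en: 'For example (a minimal sketch; `Conformer` and `Wav2Vec2Model` are two of
    the classes listed below), the definitions can be imported and inspected directly:'
  prefs: []
  type: TYPE_NORMAL
  zh: '例如(一个最小示例;`Conformer`和`Wav2Vec2Model`是下文列出的两个类),这些模型定义可以直接导入和查看:'
- en: |-
    import torchaudio.models

    # Model definitions for common audio tasks are exposed as classes.
    print(torchaudio.models.Conformer)
    print(torchaudio.models.Wav2Vec2Model)
  prefs: []
  type: TYPE_PRE
  zh: |-
    import torchaudio.models

    # 常见音频任务的模型定义以类的形式提供。
    print(torchaudio.models.Conformer)
    print(torchaudio.models.Wav2Vec2Model)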
- en: Note
  id: totrans-3
  prefs: []
  type: TYPE_NORMAL
  zh: 注意
- en: For models with pre-trained parameters, please refer to [`torchaudio.pipelines`](pipelines.html#module-torchaudio.pipelines
    "torchaudio.pipelines") module.
  id: totrans-4
  prefs: []
  type: TYPE_NORMAL
  zh: 对于具有预训练参数的模型,请参考[`torchaudio.pipelines`](pipelines.html#module-torchaudio.pipelines
    "torchaudio.pipelines")模块。
- en: Model definitions are responsible for constructing computation graphs and executing
    them.
  id: totrans-5
  prefs: []
  type: TYPE_NORMAL
  zh: 模型定义负责构建计算图并执行它们。
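- en: 'A minimal sketch of this two-step pattern, assuming a `Wav2Letter` model with
    its default waveform front-end: instantiating the class builds the graph, and
    calling the instance executes it:'
  prefs: []
  type: TYPE_NORMAL
  zh: '这一两步模式的最小示例,假设使用默认波形前端的`Wav2Letter`模型:实例化类即构建计算图,调用实例即执行计算图:'
- en: |-
    import torch
    from torchaudio.models import Wav2Letter

    # Construction builds the computation graph...
    model = Wav2Letter(num_classes=40, input_type="waveform", num_features=1)

    # ...and the forward call executes it on a (batch, num_features, time) input.
    waveform = torch.randn(1, 1, 16000)
    log_probs = model(waveform)  # (batch, num_classes, frames)
  prefs: []
  type: TYPE_PRE
  zh: |-
    import torch
    from torchaudio.models import Wav2Letter

    # 构造即构建计算图……
    model = Wav2Letter(num_classes=40, input_type="waveform", num_features=1)

    # ……前向调用则在(batch, num_features, time)输入上执行计算图。
    waveform = torch.randn(1, 1, 16000)
    log_probs = model(waveform)  # (batch, num_classes, frames)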
- en: Some models have complex structure and variations. For such models, factory
    functions are provided.
  id: totrans-6
  prefs: []
  type: TYPE_NORMAL
  zh: 一些模型结构复杂且存在多种变体。对于这类模型,提供了工厂函数。
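- en: 'For instance (a hedged sketch; `emformer_rnnt_base` is one of the provided
    factory functions, and the vocabulary size below is an arbitrary placeholder),
    a complete Emformer RNN-T model is assembled from a single argument:'
  prefs: []
  type: TYPE_NORMAL
  zh: '例如(一个示意性示例;`emformer_rnnt_base`是提供的工厂函数之一,下面的词表大小只是任意占位值),一个完整的Emformer
    RNN-T模型只需一个参数即可组装:'
- en: |-
    from torchaudio.models import emformer_rnnt_base

    # The factory hides the many constructor arguments of the underlying modules.
    rnnt = emformer_rnnt_base(num_symbols=4097)  # e.g. 4096 tokens + blank
  prefs: []
  type: TYPE_PRE
  zh: |-
    from torchaudio.models import emformer_rnnt_base

    # 工厂函数隐藏了底层模块的大量构造参数。
    rnnt = emformer_rnnt_base(num_symbols=4097)  # 例如4096个词元 + blank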
- en: '| [`Conformer`](generated/torchaudio.models.Conformer.html#torchaudio.models.Conformer
    "torchaudio.models.Conformer") | Conformer architecture introduced in *Conformer:
    Convolution-augmented Transformer for Speech Recognition* [[Gulati *et al.*, 2020](references.html#id21
    "Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu,
    Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, and Ruoming Pang. Conformer:
    convolution-augmented transformer for speech recognition. 2020\. arXiv:2005.08100.")].
    |'
  id: totrans-7
  prefs: []
  type: TYPE_TB
  zh: '| [`Conformer`](generated/torchaudio.models.Conformer.html#torchaudio.models.Conformer
    "torchaudio.models.Conformer") | 在*Conformer: Convolution-augmented Transformer
    for Speech Recognition* [[Gulati *et al.*, 2020](references.html#id21 "Anmol Gulati,
    James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo
    Wang, Zhengdong Zhang, Yonghui Wu, and Ruoming Pang. Conformer: convolution-augmented
    transformer for speech recognition. 2020\. arXiv:2005.08100.")]中提出的Conformer架构。 |'
- en: '| [`ConvTasNet`](generated/torchaudio.models.ConvTasNet.html#torchaudio.models.ConvTasNet
    "torchaudio.models.ConvTasNet") | Conv-TasNet architecture introduced in *Conv-TasNet:
    Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation* [[Luo
    and Mesgarani, 2019](references.html#id22 "Yi Luo and Nima Mesgarani. Conv-tasnet:
    surpassing ideal time–frequency magnitude masking for speech separation. IEEE/ACM
    Transactions on Audio, Speech, and Language Processing, 27(8):1256–1266, Aug 2019\.
    URL: http://dx.doi.org/10.1109/TASLP.2019.2915167, doi:10.1109/taslp.2019.2915167.")].
    |'
  id: totrans-8
  prefs: []
  type: TYPE_TB
  zh: '| [`ConvTasNet`](generated/torchaudio.models.ConvTasNet.html#torchaudio.models.ConvTasNet
    "torchaudio.models.ConvTasNet") | 在*Conv-TasNet: Surpassing Ideal Time–Frequency
    Magnitude Masking for Speech Separation* [[Luo and Mesgarani, 2019](references.html#id22
    "Yi Luo and Nima Mesgarani. Conv-tasnet: surpassing ideal time–frequency magnitude
    masking for speech separation. IEEE/ACM Transactions on Audio, Speech, and Language
    Processing, 27(8):1256–1266, Aug 2019\. URL: http://dx.doi.org/10.1109/TASLP.2019.2915167,
    doi:10.1109/taslp.2019.2915167.")]中提出的Conv-TasNet架构。 |'
- en: '| [`DeepSpeech`](generated/torchaudio.models.DeepSpeech.html#torchaudio.models.DeepSpeech
    "torchaudio.models.DeepSpeech") | DeepSpeech architecture introduced in *Deep
    Speech: Scaling up end-to-end speech recognition* [[Hannun *et al.*, 2014](references.html#id17
    "Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen,
    Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng.
    Deep speech: scaling up end-to-end speech recognition. 2014\. arXiv:1412.5567.")].
    |'
  id: totrans-9
  prefs: []
  type: TYPE_TB
  zh: '| [`DeepSpeech`](generated/torchaudio.models.DeepSpeech.html#torchaudio.models.DeepSpeech
    "torchaudio.models.DeepSpeech") | 在*Deep Speech: Scaling up end-to-end speech
    recognition* [[Hannun *et al.*, 2014](references.html#id17 "Awni Hannun, Carl
    Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev
    Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng. Deep speech: scaling
    up end-to-end speech recognition. 2014\. arXiv:1412.5567.")]中提出的DeepSpeech架构。 |'
- en: '| [`Emformer`](generated/torchaudio.models.Emformer.html#torchaudio.models.Emformer
    "torchaudio.models.Emformer") | Emformer architecture introduced in *Emformer:
    Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech
    Recognition* [[Shi *et al.*, 2021](references.html#id30 "Yangyang Shi, Yongqiang
    Wang, Chunyang Wu, Ching-Feng Yeh, Julian Chan, Frank Zhang, Duc Le, and Mike
    Seltzer. Emformer: efficient memory transformer based acoustic model for low latency
    streaming speech recognition. In ICASSP 2021 - 2021 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP), 6783-6787\. 2021.")]. |'
  id: totrans-10
  prefs: []
  type: TYPE_TB
  zh: '| [`Emformer`](generated/torchaudio.models.Emformer.html#torchaudio.models.Emformer
    "torchaudio.models.Emformer") | 在*Emformer: Efficient Memory Transformer Based
    Acoustic Model for Low Latency Streaming Speech Recognition* [[Shi *et al.*, 2021](references.html#id30
    "Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Chan, Frank
    Zhang, Duc Le, and Mike Seltzer. Emformer: efficient memory transformer based
    acoustic model for low latency streaming speech recognition. In ICASSP 2021 -
    2021 IEEE International Conference on Acoustics, Speech and Signal Processing
    (ICASSP), 6783-6787\. 2021.")]中提出的Emformer架构。 |'
- en: '| [`HDemucs`](generated/torchaudio.models.HDemucs.html#torchaudio.models.HDemucs
    "torchaudio.models.HDemucs") | Hybrid Demucs model from *Hybrid Spectrogram and
    Waveform Source Separation* [[Défossez, 2021](references.html#id50 "Alexandre
    Défossez. Hybrid spectrogram and waveform source separation. In Proceedings of
    the ISMIR 2021 Workshop on Music Source Separation. 2021.")]. |'
  id: totrans-11
  prefs: []
  type: TYPE_TB
  zh: '| [`HDemucs`](generated/torchaudio.models.HDemucs.html#torchaudio.models.HDemucs
    "torchaudio.models.HDemucs") | 来自*Hybrid Spectrogram and Waveform Source Separation*的混合Demucs模型[[Défossez,
    2021](references.html#id50 "Alexandre Défossez. Hybrid spectrogram and waveform
    source separation. In Proceedings of the ISMIR 2021 Workshop on Music Source Separation.
    2021.")]. |'
- en: '| [`HuBERTPretrainModel`](generated/torchaudio.models.HuBERTPretrainModel.html#torchaudio.models.HuBERTPretrainModel
    "torchaudio.models.HuBERTPretrainModel") | HuBERT model used for pretraining in
    *HuBERT* [[Hsu *et al.*, 2021](references.html#id16 "Wei-Ning Hsu, Benjamin Bolte,
    Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, and Abdelrahman Mohamed.
    Hubert: self-supervised speech representation learning by masked prediction of
    hidden units. 2021\. arXiv:2106.07447.")]. |'
  id: totrans-12
  prefs: []
  type: TYPE_TB
  zh: '| [`HuBERTPretrainModel`](generated/torchaudio.models.HuBERTPretrainModel.html#torchaudio.models.HuBERTPretrainModel
    "torchaudio.models.HuBERTPretrainModel") | 用于*HuBERT* [[Hsu *et al.*, 2021](references.html#id16
    "Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov,
    and Abdelrahman Mohamed. Hubert: self-supervised speech representation learning
    by masked prediction of hidden units. 2021\. arXiv:2106.07447.")]中预训练的HuBERT模型。 |'
- en: '| [`RNNT`](generated/torchaudio.models.RNNT.html#torchaudio.models.RNNT "torchaudio.models.RNNT")
    | Recurrent neural network transducer (RNN-T) model. |'
  id: totrans-13
  prefs: []
  type: TYPE_TB
  zh: '| [`RNNT`](generated/torchaudio.models.RNNT.html#torchaudio.models.RNNT "torchaudio.models.RNNT")
    | 循环神经网络转录器(RNN-T)模型。 |'
- en: '| [`RNNTBeamSearch`](generated/torchaudio.models.RNNTBeamSearch.html#torchaudio.models.RNNTBeamSearch
    "torchaudio.models.RNNTBeamSearch") | Beam search decoder for RNN-T model. |'
  id: totrans-14
  prefs: []
  type: TYPE_TB
  zh: '| [`RNNTBeamSearch`](generated/torchaudio.models.RNNTBeamSearch.html#torchaudio.models.RNNTBeamSearch
    "torchaudio.models.RNNTBeamSearch") | RNN-T模型的束搜索解码器。 |'
- en: '| [`SquimObjective`](generated/torchaudio.models.SquimObjective.html#torchaudio.models.SquimObjective
    "torchaudio.models.SquimObjective") | Speech Quality and Intelligibility Measures
    (SQUIM) model that predicts **objective** metric scores for speech enhancement
    (e.g., STOI, PESQ, and SI-SDR). |'
  id: totrans-15
  prefs: []
  type: TYPE_TB
  zh: '| [`SquimObjective`](generated/torchaudio.models.SquimObjective.html#torchaudio.models.SquimObjective
    "torchaudio.models.SquimObjective") | 预测语音增强的**客观**度量分数(例如,STOI、PESQ和SI-SDR)的语音质量和可懂度测量(SQUIM)模型。
    |'
- en: '| [`SquimSubjective`](generated/torchaudio.models.SquimSubjective.html#torchaudio.models.SquimSubjective
    "torchaudio.models.SquimSubjective") | Speech Quality and Intelligibility Measures
    (SQUIM) model that predicts **subjective** metric scores for speech enhancement
    (e.g., Mean Opinion Score (MOS)). |'
  id: totrans-16
  prefs: []
  type: TYPE_TB
  zh: '| [`SquimSubjective`](generated/torchaudio.models.SquimSubjective.html#torchaudio.models.SquimSubjective
    "torchaudio.models.SquimSubjective") | 预测语音增强的**主观**度量分数(例如,平均意见分数(MOS))的语音质量和可懂度测量(SQUIM)模型。
    |'
- en: '| [`Tacotron2`](generated/torchaudio.models.Tacotron2.html#torchaudio.models.Tacotron2
    "torchaudio.models.Tacotron2") | Tacotron2 model from *Natural TTS Synthesis by
    Conditioning WaveNet on Mel Spectrogram Predictions* [[Shen *et al.*, 2018](references.html#id27
    "Jonathan Shen, Ruoming Pang, Ron J Weiss, Mike Schuster, Navdeep Jaitly, Zongheng
    Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, Rj Skerrv-Ryan, and others. Natural
    tts synthesis by conditioning wavenet on mel spectrogram predictions. In 2018
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
    4779–4783\. IEEE, 2018.")] based on the implementation from [Nvidia Deep Learning
    Examples](https://github.com/NVIDIA/DeepLearningExamples/). |'
  id: totrans-17
  prefs: []
  type: TYPE_TB
  zh: '| [`Tacotron2`](generated/torchaudio.models.Tacotron2.html#torchaudio.models.Tacotron2
    "torchaudio.models.Tacotron2") | 来自*Natural TTS Synthesis by Conditioning WaveNet
    on Mel Spectrogram Predictions* [[Shen *et al.*, 2018](references.html#id27 "Jonathan
    Shen, Ruoming Pang, Ron J Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang,
    Zhifeng Chen, Yu Zhang, Yuxuan Wang, Rj Skerrv-Ryan, and others. Natural tts synthesis
    by conditioning wavenet on mel spectrogram predictions. In 2018 IEEE International
    Conference on Acoustics, Speech and Signal Processing (ICASSP), 4779–4783\. IEEE,
    2018.")]的Tacotron2模型,基于[Nvidia Deep Learning Examples](https://github.com/NVIDIA/DeepLearningExamples/)的实现。 |'
- en: '| [`Wav2Letter`](generated/torchaudio.models.Wav2Letter.html#torchaudio.models.Wav2Letter
    "torchaudio.models.Wav2Letter") | Wav2Letter model architecture from *Wav2Letter:
    an End-to-End ConvNet-based Speech Recognition System* [[Collobert *et al.*, 2016](references.html#id19
    "Ronan Collobert, Christian Puhrsch, and Gabriel Synnaeve. Wav2letter: an end-to-end
    convnet-based speech recognition system. 2016\. arXiv:1609.03193.")]. |'
  id: totrans-18
  prefs: []
  type: TYPE_TB
  zh: '| [`Wav2Letter`](generated/torchaudio.models.Wav2Letter.html#torchaudio.models.Wav2Letter
    "torchaudio.models.Wav2Letter") | 来自*Wav2Letter: an End-to-End ConvNet-based
    Speech Recognition System* [[Collobert *et al.*, 2016](references.html#id19 "Ronan
    Collobert, Christian Puhrsch, and Gabriel Synnaeve. Wav2letter: an end-to-end
    convnet-based speech recognition system. 2016\. arXiv:1609.03193.")]的Wav2Letter模型架构。 |'
- en: '| [`Wav2Vec2Model`](generated/torchaudio.models.Wav2Vec2Model.html#torchaudio.models.Wav2Vec2Model
    "torchaudio.models.Wav2Vec2Model") | Acoustic model used in *wav2vec 2.0* [[Baevski
    *et al.*, 2020](references.html#id15 "Alexei Baevski, Henry Zhou, Abdelrahman
    Mohamed, and Michael Auli. Wav2vec 2.0: a framework for self-supervised learning
    of speech representations. 2020\. arXiv:2006.11477.")]. |'
  id: totrans-19
  prefs: []
  type: TYPE_TB
  zh: '| [`Wav2Vec2Model`](generated/torchaudio.models.Wav2Vec2Model.html#torchaudio.models.Wav2Vec2Model
    "torchaudio.models.Wav2Vec2Model") | *wav2vec 2.0* [[Baevski *et al.*, 2020](references.html#id15
    "Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, and Michael Auli. Wav2vec 2.0:
    a framework for self-supervised learning of speech representations. 2020\. arXiv:2006.11477.")]中使用的声学模型。 |'
- en: '| [`WaveRNN`](generated/torchaudio.models.WaveRNN.html#torchaudio.models.WaveRNN
    "torchaudio.models.WaveRNN") | WaveRNN model from *Efficient Neural Audio Synthesis*
    [[Kalchbrenner *et al.*, 2018](references.html#id3 "Nal Kalchbrenner, Erich Elsen,
    Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg,
    Aäron van den Oord, Sander Dieleman, and Koray Kavukcuoglu. Efficient neural audio
    synthesis. CoRR, 2018\. URL: http://arxiv.org/abs/1802.08435, arXiv:1802.08435.")]
    based on the implementation from [fatchord/WaveRNN](https://github.com/fatchord/WaveRNN).
    |'
  id: totrans-20
  prefs: []
  type: TYPE_TB
  zh: '| [`WaveRNN`](generated/torchaudio.models.WaveRNN.html#torchaudio.models.WaveRNN
    "torchaudio.models.WaveRNN") | 来自*Efficient Neural Audio Synthesis* [[Kalchbrenner
    *et al.*, 2018](references.html#id3 "Nal Kalchbrenner, Erich Elsen, Karen Simonyan,
    Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aäron van den
    Oord, Sander Dieleman, and Koray Kavukcuoglu. Efficient neural audio synthesis.
    CoRR, 2018\. URL: http://arxiv.org/abs/1802.08435, arXiv:1802.08435.")]的WaveRNN模型,基于[fatchord/WaveRNN](https://github.com/fatchord/WaveRNN)的实现。 |'
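- en: 'As a closing sketch (assuming the `conv_tasnet_base` factory that accompanies
    the `ConvTasNet` class above), a model from the table can be instantiated and
    run on dummy audio:'
  prefs: []
  type: TYPE_NORMAL
  zh: '作为结尾的示例(假设使用与上表`ConvTasNet`类配套的`conv_tasnet_base`工厂函数),可以实例化表中的模型并在随机音频上运行:'
- en: |-
    import torch
    from torchaudio.models import conv_tasnet_base

    # Untrained ConvTasNet for 2-speaker separation, built via its factory.
    model = conv_tasnet_base(num_sources=2)
    mixture = torch.randn(1, 1, 16000)  # (batch, channel=1, frames)
    sources = model(mixture)            # (batch, num_sources, frames)
  prefs: []
  type: TYPE_PRE
  zh: |-
    import torch
    from torchaudio.models import conv_tasnet_base

    # 通过工厂函数构建的未训练ConvTasNet,用于两说话人分离。
    model = conv_tasnet_base(num_sources=2)
    mixture = torch.randn(1, 1, 16000)  # (batch, channel=1, frames)
    sources = model(mixture)            # (batch, num_sources, frames)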