- en: torchaudio.prototype.models
  id: totrans-0
  prefs:
  - PREF_H1
  type: TYPE_NORMAL
  zh: torchaudio.prototype.models
- en: 'Original: [https://pytorch.org/audio/stable/prototype.models.html](https://pytorch.org/audio/stable/prototype.models.html)'
  id: totrans-1
  prefs:
  - PREF_BQ
  type: TYPE_NORMAL
  zh: 原文:[https://pytorch.org/audio/stable/prototype.models.html](https://pytorch.org/audio/stable/prototype.models.html)
- en: The `torchaudio.prototype.models` subpackage contains definitions of models
    for addressing common audio tasks.
  id: totrans-2
  prefs: []
  type: TYPE_NORMAL
  zh: '`torchaudio.prototype.models`子包含有用于处理常见音频任务的模型定义。'
- en: Note
  id: totrans-3
  prefs: []
  type: TYPE_NORMAL
  zh: 注意
- en: For models with pre-trained parameters, please refer to the [`torchaudio.prototype.pipelines`](prototype.pipelines.html#module-torchaudio.prototype.pipelines
    "torchaudio.prototype.pipelines") module.
  id: totrans-4
  prefs: []
  type: TYPE_NORMAL
  zh: 对于具有预训练参数的模型,请参考[`torchaudio.prototype.pipelines`](prototype.pipelines.html#module-torchaudio.prototype.pipelines
    "torchaudio.prototype.pipelines")模块。
- en: Model definitions are responsible for constructing computation graphs and executing
    them.
  id: totrans-5
  prefs: []
  type: TYPE_NORMAL
  zh: 模型定义负责构建计算图并执行它们。
- en: Some models have complex structure and variations. For such models, factory
    functions are provided.
  id: totrans-6
  prefs: []
  type: TYPE_NORMAL
  zh: 一些模型具有复杂的结构和变体。对于这样的模型,提供了工厂函数。
- en: '| [`ConformerWav2Vec2PretrainModel`](generated/torchaudio.prototype.models.ConformerWav2Vec2PretrainModel.html#torchaudio.prototype.models.ConformerWav2Vec2PretrainModel
    "torchaudio.prototype.models.ConformerWav2Vec2PretrainModel") | Conformer Wav2Vec2
    pre-train model for training from scratch. |'
  id: totrans-7
  prefs: []
  type: TYPE_TB
  zh: '| [`ConformerWav2Vec2PretrainModel`](generated/torchaudio.prototype.models.ConformerWav2Vec2PretrainModel.html#torchaudio.prototype.models.ConformerWav2Vec2PretrainModel
    "torchaudio.prototype.models.ConformerWav2Vec2PretrainModel") | Conformer Wav2Vec2预训练模型,用于从头开始训练。
    |'
- en: '| [`ConvEmformer`](generated/torchaudio.prototype.models.ConvEmformer.html#torchaudio.prototype.models.ConvEmformer
    "torchaudio.prototype.models.ConvEmformer") | Implements the convolution-augmented
    streaming transformer architecture introduced in *Streaming Transformer Transducer
    based Speech Recognition Using Non-Causal Convolution* [[Shi *et al.*, 2022](references.html#id31
    "Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang,
    Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, and Mike Seltzer.
    Streaming transformer transducer based speech recognition using non-causal convolution.
    In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal
    Processing (ICASSP), volume, 8277-8281\. 2022\. doi:10.1109/ICASSP43922.2022.9747706.")].
    |'
  id: totrans-8
  prefs: []
  type: TYPE_TB
  zh: '| [`ConvEmformer`](generated/torchaudio.prototype.models.ConvEmformer.html#torchaudio.prototype.models.ConvEmformer
    "torchaudio.prototype.models.ConvEmformer") | 实现了*Streaming Transformer Transducer
    based Speech Recognition Using Non-Causal Convolution*中引入的卷积增强流式 Transformer 架构[[Shi等人,2022](references.html#id31
    "Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang,
    Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, and Mike Seltzer.
    Streaming transformer transducer based speech recognition using non-causal convolution.
    In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal
    Processing (ICASSP), volume, 8277-8281\. 2022\. doi:10.1109/ICASSP43922.2022.9747706.")].
    |'
- en: '| [`HiFiGANVocoder`](generated/torchaudio.prototype.models.HiFiGANVocoder.html#torchaudio.prototype.models.HiFiGANVocoder
    "torchaudio.prototype.models.HiFiGANVocoder") | Generator part of *HiFi GAN* [[Kong
    *et al.*, 2020](references.html#id57 "Jungil Kong, Jaehyeon Kim, and Jaekyoung
    Bae. Hifi-gan: generative adversarial networks for efficient and high fidelity
    speech synthesis. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H.
    Lin, editors, Advances in Neural Information Processing Systems, volume 33, 17022–17033\.
    Curran Associates, Inc., 2020\. URL: https://proceedings.neurips.cc/paper/2020/file/c5d736809766d46260d816d8dbc9eb44-Paper.pdf.")].
    |'
  id: totrans-9
  prefs: []
  type: TYPE_TB
  zh: '| [`HiFiGANVocoder`](generated/torchaudio.prototype.models.HiFiGANVocoder.html#torchaudio.prototype.models.HiFiGANVocoder
    "torchaudio.prototype.models.HiFiGANVocoder") | *HiFi GAN*的生成器部分[[Kong等人,2020](references.html#id57
    "Jungil Kong, Jaehyeon Kim, and Jaekyoung Bae. Hifi-gan: generative adversarial
    networks for efficient and high fidelity speech synthesis. In H. Larochelle, M.
    Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information
    Processing Systems, volume 33, 17022–17033\. Curran Associates, Inc., 2020\. URL:
    https://proceedings.neurips.cc/paper/2020/file/c5d736809766d46260d816d8dbc9eb44-Paper.pdf.")].
    |'
- en: Prototype Factory Functions of Beta Models[](#prototype-factory-functions-of-beta-models
    "Permalink to this heading")
  id: totrans-10
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
  zh: Beta模型的原型工厂函数[](#prototype-factory-functions-of-beta-models "Permalink to this
    heading")
- en: Some model definitions are in beta, but there are new factory functions that
    are still in prototype. Please check the “Prototype Factory Functions” section for
    each model.
  id: totrans-11
  prefs: []
  type: TYPE_NORMAL
  zh: 一些模型定义处于测试阶段,但仍有新的工厂函数处于原型阶段。请查看每个模型中的“Prototype Factory Functions”部分。
- en: '| [`Wav2Vec2Model`](generated/torchaudio.models.Wav2Vec2Model.html#torchaudio.models.Wav2Vec2Model
    "torchaudio.models.Wav2Vec2Model") | Acoustic model used in *wav2vec 2.0* [[Baevski
    *et al.*, 2020](references.html#id15 "Alexei Baevski, Henry Zhou, Abdelrahman
    Mohamed, and Michael Auli. Wav2vec 2.0: a framework for self-supervised learning
    of speech representations. 2020\. arXiv:2006.11477.")]. |'
  id: totrans-12
  prefs: []
  type: TYPE_TB
  zh: '| [`Wav2Vec2Model`](generated/torchaudio.models.Wav2Vec2Model.html#torchaudio.models.Wav2Vec2Model
    "torchaudio.models.Wav2Vec2Model") | *wav2vec 2.0*中使用的声学模型[[Baevski等人,2020](references.html#id15
    "Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, and Michael Auli. Wav2vec 2.0:
    a framework for self-supervised learning of speech representations. 2020\. arXiv:2006.11477.")].
    |'
- en: '| [`RNNT`](generated/torchaudio.models.RNNT.html#torchaudio.models.RNNT "torchaudio.models.RNNT")
    | Recurrent neural network transducer (RNN-T) model. |'
  id: totrans-13
  prefs: []
  type: TYPE_TB
  zh: '| [`RNNT`](generated/torchaudio.models.RNNT.html#torchaudio.models.RNNT "torchaudio.models.RNNT")
    | 循环神经网络转录器(RNN-T)模型。 |'