未验证 提交 d983f8d8 编写于 作者: J Jackwaterveg 提交者: GitHub

Create modle_arcitecture.md

上级 9c098379
# Model Arcitecture
The implemented arcitecure of Deepspeech2 online model is based on [Deepspeech2 model](https://arxiv.org/pdf/1512.02595.pdf) with some changes.
The figure of arcitecture is shown in ![image](../image/ds2onlineModel.png).
The model is mainly composed of 2D convolution subsampling layer and single direction rnn layers. To illustrate the model implementation in detail, 5 parts is introduced.
1. Feature Extraction.
2. 2D Convolution subsampling layer.
3. RNN layer with only forward direction.
4. Softmax Layer.
5. CTC Decoder.
# Feature Extraction
Three methods of feature extraction is implemented, which are linear, fbank and mfcc.
For a single utterance $x^i$ sampled from the training set $S$,
$ S= {(x^1,y^1),(x^2,y^2),...,(x^m,y^m)}$, where $y^i$ is the label correspodding to the ${x^i}
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册