# Train FastSpeech2 with the CSMSC Dataset
This example contains code used to train a [FastSpeech2](https://arxiv.org/abs/2006.04558) model with the [Chinese Standard Mandarin Speech Corpus](https://www.data-baker.com/open_source.html) dataset.
```bash
./local/preprocess.sh ${conf_path}
```
When it finishes, a `dump` folder will be created in the current directory. The structure of the dump folder is listed below.
```text
dump
...
...
├── raw
└── speech_stats.npy
```
The dataset is split into three parts, namely `train`, `dev`, and `test`, each of which contains a `norm` and a `raw` subfolder. The `raw` folder contains the speech, pitch, and energy features of each utterance, while the `norm` folder contains the normalized ones. The statistics used to normalize the features are computed from the training set and stored in `dump/train/*_stats.npy`.
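The sketch below illustrates how such statistics are typically applied. It is only an assumption-laden example, not the recipe's actual code: it assumes each `*_stats.npy` file stores the per-dimension mean of the training set in its first row and the standard deviation in its second row.

```python
import numpy as np

# Hypothetical sketch: load normalization statistics produced by preprocessing.
# Assumption: row 0 holds the per-dimension mean, row 1 the standard deviation.
stats = np.load("dump/train/speech_stats.npy")
mean, std = stats[0], stats[1]

def normalize(feature: np.ndarray) -> np.ndarray:
    """Z-score normalize a (frames, dims) feature array."""
    return (feature - mean) / std

def denormalize(feature: np.ndarray) -> np.ndarray:
    """Invert the normalization, e.g. before passing features to a vocoder."""
    return feature * std + mean
```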