Prepare internal speech recognition dataset for Mandarin. (#2232) · Issue · PaddlePaddle / Paddle

Prepare internal speech recognition dataset for Mandarin.

Created by: xinghai-sun

Prepare one or more internal Mandarin speech recognition datasets for our internal benchmark.
If Mandarin requires any specific data preprocessing, add it to the audio data provider.
Prepare a reliable baseline and its evaluation details. Ideally, we should run the baseline training environment ourselves.
This requires cooperation with Baidu's speech department.
Refer to the DS2 design doc and update it when necessary.
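
As a starting point for the preprocessing and evaluation items above, here is a minimal sketch of what Mandarin-specific transcript handling might look like. It assumes character-level tokenization and character error rate (CER) as the metric, which is a common choice for Mandarin but is not specified in this issue. The function names `normalize_transcript`, `edit_distance`, and `cer` are hypothetical and are not part of the existing audio data provider.

```python
import unicodedata


def normalize_transcript(text):
    """Return the transcript as a list of characters.

    Assumption: transcripts are compared character by character, with
    whitespace and punctuation removed. NFKC folds full-width forms
    (e.g. the full-width comma) into their ASCII equivalents first.
    """
    text = unicodedata.normalize("NFKC", text)
    return [
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    ]


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    ref = normalize_transcript(reference)
    hyp = normalize_transcript(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `cer("今天天气很好。", "今天天汽很好")` counts one substituted character against a six-character reference. Whether punctuation should be stripped, and how digits or embedded English words are handled, would need to be agreed on with the baseline's evaluation details.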