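| Input Audio | Recognition Result |
|---|---|
| (audio clip) | I knocked at the door on the ancient side of the building. |
| (audio clip) | 我认为跑步最重要的就是给我带来了身体健康。 (I think the most important thing about running is that it has brought me good health.) |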
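
| Input Text | Synthetic Audio |
|---|---|
| Life was like a box of chocolates, you never know what you're gonna get. | (audio clip) |
| 早上好,今天是2020/10/29,最低温度是-3°C。 (Good morning, today is 2020/10/29, and the lowest temperature is -3°C.) | (audio clip) |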

| Speech-To-Text Module Type | Dataset | Model Type | Link |
|---|---|---|---|
| Acoustic Model | Aishell | DeepSpeech2 RNN + Conv based Models | deepspeech2-aishell |
| Acoustic Model | Aishell | Transformer based Attention Models | u2.transformer.conformer-aishell |
| Acoustic Model | Librispeech | Transformer based Attention Models | deepspeech2-librispeech / transformer.conformer.u2-librispeech / transformer.conformer.u2-kaldi-librispeech |
| Alignment | THCHS30 | MFA | mfa-thchs30 |
| Language Model | | Ngram Language Model | kenlm |
| Language Model | TIMIT | Unified Streaming & Non-streaming Two-pass | u2-timit |

| Text-To-Speech Module Type | Model Type | Dataset | Link |
|---|---|---|---|
| Text Frontend | | | tn / g2p |
| Acoustic Model | Tacotron2 | LJSpeech | tacotron2-ljspeech |
| Acoustic Model | TransformerTTS | LJSpeech | transformer-ljspeech |
| Acoustic Model | SpeedySpeech | CSMSC | speedyspeech-csmsc |
| Acoustic Model | FastSpeech2 | AISHELL-3 / VCTK / LJSpeech / CSMSC | fastspeech2-aishell3 / fastspeech2-vctk / fastspeech2-ljspeech / fastspeech2-csmsc |
| Vocoder | WaveFlow | LJSpeech | waveflow-ljspeech |
| Vocoder | Parallel WaveGAN | LJSpeech / VCTK / CSMSC | PWGAN-ljspeech / PWGAN-vctk / PWGAN-csmsc |
| Voice Cloning | GE2E | AISHELL-3, etc. | ge2e |
| Voice Cloning | GE2E + Tacotron2 | AISHELL-3 | ge2e-tactron2-aishell3 |