diff --git a/README.md b/README.md
index 9a049df8d47de543b9dcb78f04ff7983ceb5388a..76d29bacb1cf19fddff0737b124ac3d978839501 100644
--- a/README.md
+++ b/README.md
@@ -1,18 +1,12 @@
-English | [简体中文](README_ch.md)
-
-# PaddleSpeech
-
-
-
-
+
| Input Audio | +Recognition Result | +
|---|---|
|
+
+ + |
+ I knocked at the door on the ancient side of the building. | +
|
+
+ + |
+ 我认为跑步最重要的就是给我带来了身体健康。 | +
| Synthetic Audio | +|
|---|---|
| Life was like a box of chocolates, you never know what you're gonna get. | +
+
+ + |
+
| 早上好,今天是2020/10/29,最低温度是-3°C。 | +
+
+ + |
+
-The contents of this README is as follow:
-- [Alternative Installation](#alternative-installation)
-- [Quick Start](#quick-start)
-- [Models List](#models-list)
-- [Tutorials](#tutorials)
-- [FAQ and Contributing](#faq-and-contributing)
-- [License](#license)
-- [Acknowledgement](#acknowledgement)
@@ -141,76 +179,61 @@ The current hyperlinks redirect to [Previous Parakeet](https://github.com/Paddle
+| ASR Module Type | +Speech-To-Text Module Type | Dataset | Model Type | Link |
|---|---|---|---|---|
| Acoustic Model | -Aishell | -2 Conv + 5 LSTM layers with only forward direction | -- Ds2 Online Aishell Model - | -|
| 2 Conv + 3 bidirectional GRU layers | +Acoustic Model | +Aishell | +DeepSpeech2 RNN + Conv based Models | - Ds2 Offline Aishell Model + deepspeech2-aishell |
| Encoder:Conformer, Decoder:Transformer, Decoding method: Attention + CTC | +Transformer based Attention Models | - Conformer Offline Aishell Model - | -||
| Encoder:Conformer, Decoder:Transformer, Decoding method: Attention | -- Conformer Librispeech Model + u2.transformer.conformer-aishell | |||
| Librispeech | -Encoder:Conformer, Decoder:Transformer, Decoding method: Attention | -Conformer Librispeech Model | -||
| Encoder:Transformer, Decoder:Transformer, Decoding method: Attention | +Librispeech | +Transformer based Attention Models | - Transformer Librispeech Model + deepspeech2-librispeech / transformer.conformer.u2-librispeech / transformer.conformer.u2-kaldi-librispeech | -|
| Language Model | -CommonCrawl(en.00) | -English Language Model | -- English Language Model | |
| Baidu Internal Corpus | -Mandarin Language Model Small | +|||
| Alignment | +THCHS30 | +MFA | ++ mfa-thchs30 + | +|
| Language Model | +Ngram Language Model | - Mandarin Language Model Small + kenlm | ||
| Mandarin Language Model Large | +TIMIT | +Unified Streaming & Non-streaming Two-pass | - Mandarin Language Model Large + u2-timit | |
| TTS Module Type | -Model Type | -Dataset | -Link | + Text-To-Speech Module Type |
+ Model Type | + |
+ |
Text Frontend | - chinese-fronted + tn / g2p |
|---|---|---|---|---|---|---|---|
| Acoustic Model | +Acoustic Model | Tacotron2 | LJSpeech | - tacotron2-vctk + tacotron2-ljspeech | |||
| FastSpeech2 | -AISHELL-3 | -- fastspeech2-aishell3 - | -|||||
| VCTK | -fastspeech2-vctk | -||||||
| LJSpeech | -fastspeech2-ljspeech | -||||||
| CSMSC | +FastSpeech2 | +AISHELL-3 / VCTK / LJSpeech / CSMSC | - fastspeech2-csmsc + fastspeech2-aishell3 / fastspeech2-vctk / fastspeech2-ljspeech / fastspeech2-csmsc | ||||
| Vocoder | +Vocoder | WaveFlow | LJSpeech | @@ -272,22 +281,10 @@ PaddleSpeech TTS mainly contains three modules: *Text Frontend*, *Acoustic Model | |||
| Parallel WaveGAN | -LJSpeech | -- PWGAN-ljspeech - | -|||||
| VCTK | -- PWGAN-vctk - | -||||||
| CSMSC | +Parallel WaveGAN | +LJSpeech / VCTK / CSMSC | - PWGAN-csmsc + PWGAN-ljspeech / PWGAN-vctk / PWGAN-csmsc | ||||