diff --git a/docs/source/asr/deepspeech_architecture.md b/docs/source/asr/deepspeech_architecture.md
index 5a6ca8867a94d7f138b7aa88b10d61bfd6403dce..be9471d93a673f64b0cc030884df31fa15c9f49b 100644
--- a/docs/source/asr/deepspeech_architecture.md
+++ b/docs/source/asr/deepspeech_architecture.md
@@ -14,10 +14,11 @@ In addition, the training process and the testing process are also introduced.
 The architecture of the model is shown in Fig.1.

-
+
 Fig.1 The Architecture of the deepspeech2 online model

+
 ### Data Preparation
 #### Vocabulary
 For English data, the vocabulary dictionary is composed of 26 English characters with " ' ", space, \<blank\>, \<unk\> and \<eos\>. The \<blank\> represents the blank label in CTC, the \<unk\> represents the unknown character and the \<eos\> represents the start and the end characters. For Mandarin, the vocabulary dictionary is composed of Chinese characters collected from the training set, and three additional characters are added. The added characters are \<blank\>, \<unk\> and \<eos\>. For both English and Mandarin data, we set the default indices \<blank\>=0, \<unk\>=1 and \<eos\>=last index.
@@ -130,7 +131,7 @@ By using the command above, the training process can be started. There are 5 stages
 Using the command below, you can test the deepspeech2 online model.
 ```
 bash run.sh --stage 3 --stop_stage 5 --model_type online --conf_path conf/deepspeech2_online.yaml
-```
+ ```
 The detailed commands are:
 ```
 conf_path=conf/deepspeech2_online.yaml
@@ -152,7 +153,7 @@ if [ ${stage} -le 5 ] && [ ${stop_stage} -ge 5 ]; then
 # test export ckpt avg_n
 CUDA_VISIBLE_DEVICES=0 ./local/test_export.sh ${conf_path} exp/${ckpt}/checkpoints/${avg_ckpt}.jit ${model_type}|| exit -1
 fi
- ```
+```
 After the training process, we use stages 3, 4 and 5 for the testing process. Stage 3 tests the model generated in stage 2 and reports the CER on the test set. Stage 4 transforms the model from a dynamic graph to a static graph using the "paddle.jit" library. Stage 5 tests the model in static graph mode.
@@ -161,12 +162,13 @@ The deepspeech2 offline model is similar to the deepspeech2 online model. The
 The architecture of the model is shown in Fig.2.

-
+
 Fig.2 The Architecture of the deepspeech2 offline model

+
 For data preparation and the decoder, the deepspeech2 offline model is the same as the deepspeech2 online model. The code of the encoder and decoder for the deepspeech2 offline model is in:
@@ -182,7 +184,7 @@ For training and testing, the "model_type" and the "conf_path" must be set.
 # Training offline
 cd examples/aishell/s0
 bash run.sh --stage 0 --stop_stage 2 --model_type offline --conf_path conf/deepspeech2.yaml
-```
+ ```
 ```
 # Testing offline
 cd examples/aishell/s0
diff --git a/docs/source/tts/README.md b/docs/source/tts/README.md
index 87cac76eafface29718d6dc8d23cfe5a3ffa4c32..18283cb276b955b9fcfc52c31717e5539345d5be 100644
--- a/docs/source/tts/README.md
+++ b/docs/source/tts/README.md
@@ -2,10 +2,11 @@
 Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models.
-
+
-## News
+
+## News
 - Oct-12-2021, Refactor examples code.
 - Oct-12-2021, Parallel WaveGAN with LJSpeech. Check [examples/GANVocoder/parallelwave_gan/ljspeech](./examples/GANVocoder/parallelwave_gan/ljspeech).
 - Oct-12-2021, FastSpeech2/FastPitch with LJSpeech. Check [examples/fastspeech2/ljspeech](./examples/fastspeech2/ljspeech).
diff --git a/examples/aishell/README.md b/examples/aishell/README.md
index 5e5c5ca90429eba094fd8a18eadd001c2263369d..82ef91da96f47e99e5589695f8de9776279aece8 100644
--- a/examples/aishell/README.md
+++ b/examples/aishell/README.md
@@ -5,7 +5,8 @@
 ## Data
-| Data Subset | Duration in Seconds |
-| data/manifest.train | 1.23 ~ 14.53125 |
-| data/manifest.dev | 1.645 ~ 12.533 |
-| data/manifest.test | 1.859125 ~ 14.6999375 |
+| Data Subset         | Duration in Seconds   |
+| ------------------- | --------------------- |
+| data/manifest.train | 1.23 ~ 14.53125       |
+| data/manifest.dev   | 1.645 ~ 12.533        |
+| data/manifest.test  | 1.859125 ~ 14.6999375 |
diff --git a/examples/librispeech/README.md b/examples/librispeech/README.md
index 57f506a49778d438552ac167ba76495111bfb103..724590952ba8e82803877bdffb42957060b56522 100644
--- a/examples/librispeech/README.md
+++ b/examples/librispeech/README.md
@@ -1,9 +1,8 @@
 # ASR
-* s0 is for deepspeech2 offline
-* s1 is for transformer/conformer/U2
-* s2 is for transformer/conformer/U2 w/ kaldi feat
-need install Kaldi
+* s0 is for deepspeech2 offline
+* s1 is for transformer/conformer/U2
+* s2 is for transformer/conformer/U2 w/ kaldi feat, need to install Kaldi
 ## Data
 | Data Subset | Duration in Seconds |
diff --git a/examples/librispeech/s0/README.md b/examples/librispeech/s0/README.md
index 11bcf5f65df1ceb169096968ec9fa9ead23b58ff..77f92a2b7625fbb618b70025ad09d94ac1cc992d 100644
--- a/examples/librispeech/s0/README.md
+++ b/examples/librispeech/s0/README.md
@@ -1,14 +1,6 @@
 # LibriSpeech
-## Data
-| Data Subset | Duration in Seconds |
-| --- | --- |
-| data/manifest.train | 0.83s ~ 29.735s |
-| data/manifest.dev | 1.065 ~ 35.155s |
-| data/manifest.test-clean | 1.285s ~ 34.955s |
-
 ## Deepspeech2
-
 | Model | Params | release | Config | Test set | Loss | WER |
 | --- | --- | --- | --- | --- | --- | --- |
 | DeepSpeech2 | 42.96M | 2.2.0 | conf/deepspeech2.yaml + spec_aug | test-clean | 14.49190807 | 0.067283 |
diff --git a/examples/other/punctuation_restoration/README.md b/examples/other/punctuation_restoration/README.md
index 42ae0db3a49ef1e16edcad5ad9ae7ed02079adec..6393d8f5bab71f3385c2384331671226e88a837b 100644
--- a/examples/other/punctuation_restoration/README.md
+++ b/examples/other/punctuation_restoration/README.md
@@ -1,3 +1,4 @@
 # Punctuation Restoration
-Please using [PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask] to do this task.
+Please use [PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask) to do this task.
+
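The vocabulary convention described in the deepspeech_architecture.md changes above (\<blank\>=0, \<unk\>=1, \<eos\>=last index) can be sketched as follows. This is a minimal illustration of the index layout only; `build_vocab` is a hypothetical helper, not a PaddleSpeech API.

```python
# Sketch of the default special-token layout described above:
# <blank> (CTC blank) at index 0, <unk> at index 1, <eos> at the last index.
# build_vocab is an illustrative helper, not part of the PaddleSpeech codebase.

def build_vocab(characters):
    """Map tokens to indices with the default special-token positions."""
    tokens = ["<blank>", "<unk>"] + sorted(characters) + ["<eos>"]
    return {tok: idx for idx, tok in enumerate(tokens)}

vocab = build_vocab({"a", "b", "c"})
assert vocab["<blank>"] == 0
assert vocab["<unk>"] == 1
assert vocab["<eos>"] == len(vocab) - 1  # <eos> takes the last index
```

The same layout applies to both the English character set and the Mandarin character set collected from the training data.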