diff --git a/deep_speech_2/README.md b/deep_speech_2/README.md
index d58d0c51fc5088770d2d3091797d9dfdc749be26..8b0da1ae2bb83fe12669654afac9d65248ae0c0a 100644
--- a/deep_speech_2/README.md
+++ b/deep_speech_2/README.md
@@ -434,7 +434,7 @@ python deploy/demo_client.py --help
 Language  | Model Name | Training Data | Training Hours
 :-----------: | :------------: | :----------: |  -------:
 English  | [LibriSpeech Model](http://cloud.dlnel.org/filepub/?uuid=17404caf-cf19-492f-9707-1fad07c19aae) | [LibriSpeech Dataset](http://www.openslr.org/12/) | 960 h
-English  | [Internal English Model](to-be-added) | Baidu English Dataset | 8000 h
+English  | [Internal English Model](to-be-added) | Baidu English Dataset | 8628 h
 Mandarin | [Aishell Model](http://cloud.dlnel.org/filepub/?uuid=6c83b9d8-3255-4adf-9726-0fe0be3d0274) | [Aishell Dataset](http://www.openslr.org/33/) | 151 h
 Mandarin | [Internal Mandarin Model](to-be-added) | Baidu Mandarin Dataset | 2917 h
 
@@ -442,30 +442,21 @@ Mandarin | [Internal Mandarin Model](to-be-added) | Baidu Mandarin Dataset | 291
 
 Language Model | Training Data | Token-based | Size | Filter Configuraiton
 :-------------:| :------------:| :-----: | -----: | -----------------:
-[English LM (Median)](http://paddlepaddle.bj.bcebos.com/model_zoo/speech/common_crawl_00.prune01111.trie.klm) | To Be Added | Word-based | 8.3 GB | To Be Added
-[English LM (Big)](to-be-added) | To Be Added | Word-based | X.X GB | To Be Added
-[Mandarin LM (Median)](http://cloud.dlnel.org/filepub/?uuid=d21861e4-4ed6-45bb-ad8e-ae417a43195e) | To Be Added | Character-based | 2.8 GB | To Be Added
-[Mandarin LM (Big)](to-be-added) | To Be Added | Character-based | X.X GB | To Be Added
+[English LM](http://paddlepaddle.bj.bcebos.com/model_zoo/speech/common_crawl_00.prune01111.trie.klm) | To Be Added | Word-based | 8.3 GB | To Be Added
+[Mandarin LM](http://cloud.dlnel.org/filepub/?uuid=d21861e4-4ed6-45bb-ad8e-ae417a43195e) | To Be Added | Character-based | 2.8 GB | To Be Added
 
 ## Experiments and Benchmarks
 
 #### English Model Evaluation (Word Error Rate)
 
 Test Set                | LibriSpeech Model | Internal English Model
-:---------------------:  | :---------------:  | :-------------------:
-LibriSpeech-Test-Clean   |   7.9             |   X.X
-LibriSpeech-Test-Other   |   X.X             |   X.X
+:---------------------:  | ---------------:  | -------------------:
+LibriSpeech-Test-Clean   |   7.96            |   X.X
+LibriSpeech-Test-Other   |   23.87           |   X.X
 VoxForge-Test            |   X.X             |   X.X
 Baidu-English-Test       |   X.X             |   X.X
 
-#### English Model Evaluation (Character Error Rate)
-
-Test Set                | LibriSpeech Model | Internal English Model
-:---------------------:  | :---------------:  | :-------------------:
-LibriSpeech-Test-Clean   |   X.X             |   X.X
-LibriSpeech-Test-Other   |   X.X             |   X.X
-VoxForge-Test            |   X.X             |   X.X
-Baidu-English-Test       |   X.X             |   X.X
+(Beam size=2000)
 
 #### Mandarin Model Evaluation (Character Error Rate)
 
@@ -476,7 +467,7 @@ Baidu-Mandarin-Test | X.X | X.X
 
 #### Acceleration with Multi-GPUs
 
-We compare the training time with 1, 2, 4, 8, 16 Tesla K40m GPUs (with a subset of LibriSpeech samples whose audio durations are between 6.0 and 7.0 seconds).  And it shows that a **near-linear** acceleration with multiple GPUs has been achieved. In the following figure, the time (in seconds) used for training is plotted on the blue bars.
+We compare the training time with 1, 2, 4, 8, 16 Tesla K40m GPUs (with a subset of LibriSpeech samples whose audio durations are between 6.0 and 7.0 seconds).  And it shows that a **near-linear** acceleration with multiple GPUs has been achieved. In the following figure, the time (in seconds) cost for training is printed on the blue bars.
diff --git a/deep_speech_2/models/librispeech/download_model.sh b/deep_speech_2/models/librispeech/download_model.sh
index 336502de87d77459063d1eaec8060a22a040b469..7c46c09915137cdb124a9abeb6910730c43d0f89 100644
--- a/deep_speech_2/models/librispeech/download_model.sh
+++ b/deep_speech_2/models/librispeech/download_model.sh
@@ -2,8 +2,8 @@
 
 source ../../utils/utility.sh
 
-URL='http://cloud.dlnel.org/filepub/?uuid=17404caf-cf19-492f-9707-1fad07c19aae'
-MD5=ea5024a457a91179472f6dfee60e053d
+URL='http://cloud.dlnel.org/filepub/?uuid=8e3cf742-2ff3-41ce-a49d-f6158cc06a23'
+MD5=2ef08f8b608a7c555592161fc14d81a6
 TARGET=./librispeech_model.tar.gz