From 02e7586394dd31d37ac43e409aed8657f8eaa426 Mon Sep 17 00:00:00 2001
From: Hui Zhang
Date: Fri, 6 May 2022 13:02:56 +0000
Subject: [PATCH] update readme

---
 README.md                                 | 35 +++++++------
 README_cn.md                              | 63 +++++++++++------------
 paddlespeech/cli/asr/pretrained_models.py |  2 +-
 3 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/README.md b/README.md
index 9791b895..d32131c0 100644
--- a/README.md
+++ b/README.md
@@ -151,14 +151,24 @@ For more synthesized audios, please refer to [PaddleSpeech Text-to-Speech sample
 ### Features
 Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and deployment process. To be more specific, this toolkit features at:
-- 📦 **Ease of Use**: low barriers to install, and [CLI](#quick-start) is available to quick-start your journey.
+- 📦 **Ease of Use**: low barriers to install, [CLI](#quick-start), [Server](#quick-start-server), and [Streaming Server](#quick-start-streaming-server) is available to quick-start your journey.
 - 🏆 **Align to the State-of-the-Art**: we provide high-speed and ultra-lightweight models, and also cutting-edge technology.
+- 🏆 **Streaming ASR and TTS System**: we provide production ready streaming asr and streaming tts system.
 - 💯 **Rule-based Chinese frontend**: our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt Chinese context.
-- **Varieties of Functions that Vitalize both Industrial and Academia**:
-  - 🛎️ *Implementation of critical audio tasks*: this toolkit contains audio functions like Audio Classification, Speech Translation, Automatic Speech Recognition, Text-to-Speech Synthesis, etc.
+- 📦 **Varieties of Functions that Vitalize both Industrial and Academia**:
+  - 🛎️ *Implementation of critical audio tasks*: this toolkit contains audio functions like Automatic Speech Recognition, Text-to-Speech Synthesis, Speaker Verfication, KeyWord Spotting, Audio Classification, and Speech Translation, etc.
   - 🔬 *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model list](#model-list) for more details.
   - 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV).
+### Recent Update
+- 👏🏻 2022.05.06: `Streaming ASR` with `Punctuation Restoration` and `Token Timestamp`.
+- 👏🏻 2022.05.06: `Server` is available for `Speaker Verification`, and `Punctuation Restoration`.
+- 👏🏻 2022.04.28: `Streaming Server` is available for `Automatic Speech Recognition` and `Text-to-Speech`.
+- 👏🏻 2022.03.28: `Server` is available for `Audio Classification`, `Automatic Speech Recognition` and `Text-to-Speech`.
+- 👏🏻 2022.03.28: `CLI` is available for `Speaker Verification`.
+- 🤗 2021.12.14: [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) Demos on Hugging Face Spaces are available!
+- 👏🏻 2021.12.10: `CLI` is available for `Audio Classification`, `Automatic Speech Recognition`, `Speech Translation (English to Chinese)` and `Text-to-Speech`.
+
 ### 🔥 Hot Activities
-- 👏🏻 2022.04.28: PaddleSpeech Streaming Server 上线! 覆盖了语音识别和语音合成。
-- 👏🏻 2022.03.28: PaddleSpeech Server 上线! 覆盖了声音分类、语音识别、以及语音合成。
-- 👏🏻 2022.03.28: PaddleSpeech CLI 上线声纹验证。
-- 🤗 2021.12.14: Our PaddleSpeech [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) Demos on Hugging Face Spaces are available!
-- 👏🏻 2021.12.10: PaddleSpeech CLI 上线!覆盖了声音分类、语音识别、语音翻译(英译中)以及语音合成。
+- 👏🏻 2022.05.06: PaddleSpeech Streaming Server 上线! 覆盖了语音识别(标点恢复、时间戳),和语音合成。
+- 👏🏻 2022.05.06: PaddleSpeech Server 上线! 覆盖了声音分类、语音识别、语音合成、声纹识别,标点恢复。
+- 👏🏻 2022.03.28: PaddleSpeech CLI 覆盖声音分类、语音识别、语音翻译(英译中)、语音合成,声纹验证。
+- 🤗 2021.12.14: PaddleSpeech [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) Demos on Hugging Face Spaces are available!
+### 🔥 热门活动
-### 特性
+- 2021.12.21~12.24
-本项目采用了易用、高效、灵活以及可扩展的实现,旨在为工业应用、学术研究提供更好的支持,实现的功能包含训练、推断以及测试模块,以及部署过程,主要包括
-- 📦 **易用性**: 安装门槛低,可使用 [CLI](#quick-start) 快速开始。
-- 🏆 **对标 SoTA**: 提供了高速、轻量级模型,且借鉴了最前沿的技术。
-- 💯 **基于规则的中文前端**: 我们的前端包含文本正则化和字音转换(G2P)。此外,我们使用自定义语言规则来适应中文语境。
-- **多种工业界以及学术界主流功能支持**:
-  - 🛎️ 典型音频任务: 本工具包提供了音频任务如音频分类、语音翻译、自动语音识别、文本转语音、语音合成等任务的实现。
-  - 🔬 主流模型及数据集: 本工具包实现了参与整条语音任务流水线的各个模块,并且采用了主流数据集如 LibriSpeech、LJSpeech、AIShell、CSMSC,详情请见 [模型列表](#model-list)。
-  - 🧩 级联模型应用: 作为传统语音任务的扩展,我们结合了自然语言处理、计算机视觉等任务,实现更接近实际需求的产业级应用。
+  4 日直播课: 深度解读 PaddleSpeech 语音技术!
+
+  **直播回放与课件资料: https://aistudio.baidu.com/aistudio/education/group/info/25130**
 ### 技术交流群
@@ -328,8 +327,8 @@ PaddleSpeech 的 **语音转文本** 包含语音识别声学模型、语音识
 语音转文本模块类型
 数据集
- 模型种类
- 链接
+ 模型类型
+ 脚本
@@ -402,9 +401,9 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
 语音合成模块类型
- 模型种类
+ 模型类型
 数据集
- 链接
+ 脚本
@@ -520,8 +519,8 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
 任务
 数据集
- 模型种类
- 链接
+ 模型类型
+ 脚本
@@ -544,10 +543,10 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
-
-
-
-
+
+
+
+
@@ -571,8 +570,8 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
-
-
+
+
diff --git a/paddlespeech/cli/asr/pretrained_models.py b/paddlespeech/cli/asr/pretrained_models.py
index 7f198ad6..0f521884 100644
--- a/paddlespeech/cli/asr/pretrained_models.py
+++ b/paddlespeech/cli/asr/pretrained_models.py
@@ -27,7 +27,7 @@ pretrained_models = {
         'ckpt_path':
         'exp/conformer/checkpoints/wenetspeech',
     },
-    "conformer_online_wenetspeech-zh-16k": {
+    "conformer_online_wenetspeech-zh-16k": {
         'url':
         'https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz',
         'md5':
-- 
GitLab