DeepSpeech on PaddlePaddle
DeepSpeech on PaddlePaddle is an open-source implementation of an end-to-end Automatic Speech Recognition (ASR) engine built on the PaddlePaddle platform. Our vision is to empower both industrial applications and academic research on speech recognition through an easy-to-use, efficient, and scalable implementation that covers training, inference and testing modules, and demo deployment.
Models
Setup
- python>=3.7
- paddlepaddle>=2.0.0
- Run the setup script for the remaining dependencies
git clone https://github.com/PaddlePaddle/DeepSpeech.git
cd DeepSpeech
pushd tools; make; popd
source tools/venv/bin/activate
bash setup.sh
- Activate the virtual environment before running any experiment.
source tools/venv/bin/activate
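After activating the virtual environment, it can help to confirm that the interpreter and the PaddlePaddle package satisfy the minimums listed above (python>=3.7, paddlepaddle>=2.0.0). The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `installed_paddle_version` are hypothetical, and the script assumes the installed package exposes its version string as `paddle.__version__`.

```python
import sys


def parse_version(text):
    """Turn a version string such as '2.0.0' or '2.1.0rc1' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    so suffixes like 'rc1' or '.post101' are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    a, b = parse_version(found), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '2.0' compares equal to '2.0.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_paddle_version():
    """Return the paddle version string, or None if paddle cannot be imported.

    Assumption: the package exposes its version as `paddle.__version__`.
    """
    try:
        import paddle
    except ImportError:
        return None
    return getattr(paddle, "__version__", None)


if __name__ == "__main__":
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    status = "ok" if meets_minimum(python_version, "3.7") else "too old"
    print("python", python_version, status)

    paddle_version = installed_paddle_version()
    if paddle_version is None:
        print("paddlepaddle not found; is the virtual environment activated?")
    else:
        status = "ok" if meets_minimum(paddle_version, "2.0.0") else "too old"
        print("paddlepaddle", paddle_version, status)
```

Save the script under any name and run it with the interpreter from the activated environment. The comparison is deliberately simple and ignores pre-release ordering, so treat the result as a quick sanity check rather than a full dependency resolution.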
Getting Started
Please see Getting Started and the tiny example.
More Information
- Install
- Getting Started
- Data Preparation
- Data Augmentation
- Ngram LM
- Server Demo
- Benchmark
- Released Model
- FAQ
Questions and Help
You are welcome to submit questions and bug reports through GitHub Issues, and to contribute to this project.
License
DeepSpeech is provided under the Apache-2.0 License.
Project Overview
An easy-to-use speech toolkit that includes self-supervised learning models, state-of-the-art and streaming ASR with punctuation, streaming TTS with a text frontend, a speaker verification system, end-to-end speech translation, and keyword spotting. It won the NAACL 2022 Best Demo Award.
Source Project URL