# Parakeet

Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on the Paddle Fluid dynamic graph and supports many influential TTS models proposed by [Baidu Research](http://research.baidu.com) and other academic institutions.

<div align="center">
  <img src="images/logo.png" width=450 /> <br>
</div>

### Setup

Make sure the library `libsndfile1` is installed, e.g., on Ubuntu:

```bash
sudo apt-get install libsndfile1
```
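
To confirm that the system loader can see the library after installation, a small check along the following lines may help. It is only a sketch: `find_libsndfile` is a hypothetical helper name, and it relies on Python's standard `ctypes.util.find_library`, not on anything provided by Parakeet.

```python
import ctypes.util


def find_libsndfile():
    """Return the shared library name for libsndfile, or None if it is not found.

    Hypothetical helper for illustration; it is not part of Parakeet.
    """
    return ctypes.util.find_library("sndfile")


if __name__ == "__main__":
    library = find_libsndfile()
    if library is None:
        print("libsndfile not found; install it before continuing")
    else:
        print(f"libsndfile found: {library}")
```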

### Install PaddlePaddle

See [install](https://www.paddlepaddle.org.cn/install/quick) for more details. This repo requires a PaddlePaddle version above 1.7.
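
The version requirement can be checked from Python before installing Parakeet. The sketch below is illustrative only: `parse_version` and `meets_minimum` are hypothetical helper names, the check treats "above 1.7" as "1.7.0 or newer" (an assumption), and it ignores pre-release suffixes such as `rc1`. It compares numeric tuples rather than strings so that, for example, `1.10.0` is correctly ranked above `1.7.0`.

```python
import re


def parse_version(version):
    """Convert a version string such as '1.8.4' or '1.7.0rc1' to a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component that has no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="1.7"):
    """Return True if `installed` is at least `minimum` (assumed meaning of 'above 1.7')."""
    have = parse_version(installed)
    need = parse_version(minimum)
    # Pad with zeros so that (1, 7) and (1, 7, 0) compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    try:
        import paddle
    except ImportError:
        print("PaddlePaddle is not installed")
    else:
        print(paddle.__version__, "ok" if meets_minimum(paddle.__version__) else "too old")
```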

### Install Parakeet

```bash
# git clone this repo first
cd Parakeet
pip install -e .
```

### Install CMUdict for nltk

CMUdict from nltk is used to transform text into phonemes.
```python
import nltk
nltk.download("punkt")
nltk.download("cmudict")
```
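
To give a rough idea of what the text-to-phoneme step involves, here is a minimal sketch of a dictionary lookup. It is not Parakeet's actual frontend: `text_to_phonemes` and `TOY_PRONUNCIATIONS` are hypothetical names, the entries are hand-written samples, and whitespace splitting stands in for a real tokenizer. The dictionary has the shape returned by `nltk.corpus.cmudict.dict()`, which maps a lowercase word to a list of pronunciations, each a list of ARPAbet symbols.

```python
# Hand-written sample entries (an assumption for illustration), shaped like
# the mapping returned by nltk.corpus.cmudict.dict().
TOY_PRONUNCIATIONS = {
    "hello": [["HH", "AH0", "L", "OW1"], ["HH", "EH0", "L", "OW1"]],
    "world": [["W", "ER1", "L", "D"]],
}


def text_to_phonemes(text, pronunciations):
    """Map each word to its first listed pronunciation; keep unknown words unchanged."""
    phonemes = []
    for token in text.lower().split():
        word = token.strip(".,!?;:\"'")  # drop surrounding punctuation
        if not word:
            continue
        entries = pronunciations.get(word)
        if entries:
            phonemes.extend(entries[0])
        else:
            phonemes.append(word)
    return phonemes


if __name__ == "__main__":
    print(text_to_phonemes("Hello, world!", TOY_PRONUNCIATIONS))
```

Once the `cmudict` corpus has been downloaded, passing `nltk.corpus.cmudict.dict()` as the second argument should work the same way.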


## Related Research

- [Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning](https://arxiv.org/abs/1710.07654)
- [Neural Speech Synthesis with Transformer Network](https://arxiv.org/abs/1809.08895)
- [FastSpeech: Fast, Robust and Controllable Text to Speech](https://arxiv.org/abs/1905.09263)
- [WaveFlow: A Compact Flow-based Model for Raw Audio](https://arxiv.org/abs/1912.01219)

## Examples

- [Train a DeepVoice3 model with the LJSpeech dataset](./examples/deepvoice3)
- [Train a TransformerTTS model with the LJSpeech dataset](./examples/transformer_tts)
- [Train a FastSpeech model with the LJSpeech dataset](./examples/fastspeech)
- [Train a WaveFlow model with the LJSpeech dataset](./examples/waveflow)