@@ -28,7 +28,7 @@ You can choose to install via pypi or clone the repository and install manually.
pip install -e .
```
-### cmudict
+### Download cmudict for nltk
You also need to download cmudict for nltk, because text is converted into phonemes with `cmudict`.
```python
@@ -37,7 +37,7 @@ nltk.download("punkt")
nltk.download("cmudict")
```
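To illustrate what dictionary-based phoneme conversion looks like, here is a minimal sketch. It uses a toy dictionary shaped like the one returned by `nltk.corpus.cmudict.dict()` (lowercase word mapped to a list of pronunciations, each a list of ARPAbet symbols). The helper `text_to_phonemes`, the choice of the first pronunciation, and the character fallback for unknown words are assumptions made for illustration, not necessarily what this project does.

```python
import re

# Toy stand-in for nltk.corpus.cmudict.dict():
# lowercase word -> list of pronunciations (lists of ARPAbet symbols).
TOY_CMUDICT = {
    "hello": [["HH", "AH0", "L", "OW1"], ["HH", "EH0", "L", "OW1"]],
    "world": [["W", "ER1", "L", "D"]],
}


def text_to_phonemes(text, pronunciations):
    """Replace each known word with its first listed pronunciation."""
    output = []
    for word in re.findall(r"[a-z']+", text.lower()):
        entries = pronunciations.get(word)
        if entries:
            output.extend(entries[0])
        else:
            # Assumed fallback: keep unknown words as plain characters.
            output.extend(word)
    return output


print(text_to_phonemes("Hello, world!", TOY_CMUDICT))
```

Once the download above has completed, the toy dictionary can be swapped for `nltk.corpus.cmudict.dict()`.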
-## dataset
+## Dataset
We experiment with the LJSpeech dataset. Download and unzip [LJSpeech](https://keithito.com/LJ-Speech-Dataset/).
@@ -48,20 +48,22 @@ tar xjvf LJSpeech-1.1.tar.bz2
## Model Architecture

The model consists of an encoder, a decoder, and a converter (plus a speaker embedding for multispeaker models). The encoder and decoder together form the seq2seq part of the model, and the converter forms the postnet part.
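The composition described above can be sketched as plain function calls. This is only an illustration of how the parts chain together: the `synthesize` function, its signature, and the toy stand-in stages are hypothetical and do not reflect the project's actual classes or tensor shapes.

```python
def synthesize(text_ids, encoder, decoder, converter,
               speaker_embedding=None, speaker_id=None):
    """Chain encoder -> decoder (seq2seq part) -> converter (postnet part)."""
    # The speaker embedding is only present for multispeaker models.
    speaker = None
    if speaker_embedding is not None:
        speaker = speaker_embedding(speaker_id)
    encoded = encoder(text_ids, speaker)
    decoded = decoder(encoded, speaker)
    return converter(decoded, speaker)


# Toy stand-ins (hypothetical) so the sketch runs without any framework.
def toy_encoder(text_ids, speaker):
    offset = 0.0 if speaker is None else speaker
    return [float(i) + offset for i in text_ids]


def toy_decoder(encoded, speaker):
    return [x + 1.0 for x in encoded]


def toy_converter(decoded, speaker):
    return [x * 2.0 for x in decoded]


print(synthesize([1, 2, 3], toy_encoder, toy_decoder, toy_converter))
```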
## Project Structure
```text
├── data.py          data_processing
├── ljspeech.yaml    (example) configuration file
├── sentences.txt    sample sentences
├── synthesis.py     script to synthesize waveform from text
├── train.py         script to train a model
└── utils.py         utility functions
```
-## train
+## Train
Train the model using `train.py`, following the usage displayed by `python train.py --help`.
@@ -100,7 +102,7 @@ optional arguments:
5. `--device` is the device (GPU id) to use for training. `-1` means CPU.
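As a rough illustration of the `--device` convention, the sketch below parses an integer device id and maps `-1` to CPU. The parser, the default value of `-1`, and the `describe_device` helper are assumptions for illustration only; check `python train.py --help` for the real options.

```python
import argparse


def build_parser():
    # Illustrative parser; the default of -1 is an assumption.
    parser = argparse.ArgumentParser(description="toy training CLI")
    parser.add_argument("--device", type=int, default=-1,
                        help="GPU id to use; -1 means CPU")
    return parser


def describe_device(device_id):
    """Map the integer device id to a human-readable device name."""
    return "cpu" if device_id == -1 else f"gpu:{device_id}"


args = build_parser().parse_args(["--device", "0"])
print(describe_device(args.device))
```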
-## synthesis
+## Synthesis
```text
usage: synthesis.py [-h] [-c CONFIG] [-g DEVICE] checkpoint text output_path