Paddle implementation of Deep Voice 3 in dynamic graph, a convolutional-network-based text-to-speech synthesis model. The implementation is based on [Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning](https://arxiv.org/abs/1710.07654).
...
@@ -22,7 +22,7 @@ The model consists of an encoder, a decoder and a converter (and a speaker embed
## Project Structure
```text
├── data.py          data_processing
├── ljspeech.yaml    (example) configuration file
├── sentences.txt    sample sentences
├── synthesis.py     script to synthesize waveform from text
...
@@ -50,7 +50,7 @@ optional arguments:
                        The directory to save result.
  -g DEVICE, --device DEVICE
                        device to use
```
1. `--config` is the configuration file to use. The provided `ljspeech.yaml` can be used directly, and you can also change values in the configuration file to train the model with a different config.
2. `--data` is the path of the LJSpeech dataset, i.e. the folder extracted from the downloaded archive (the folder which contains `metadata.txt`). See the example invocation below.
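
For illustration, here is a possible training invocation with the options above (a sketch only: the entry-point name `train.py` and the dataset/output paths are assumptions not shown in this excerpt):

```bash
# hypothetical example; the script name and paths are placeholders
python train.py --config=ljspeech.yaml --data=./LJSpeech-1.1 --output=./experiment -g 0
```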
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
"""
In most cases, we have a non-stream dataset, which means we can randomly access it with __getitem__ and get the length of the dataset with __len__.
...
@@ -6,10 +19,10 @@ This suffices for a sampler. We implemente sampler as iterable of valid indices.
So the sampler is only responsible for generating valid indices.
"""
"""
import numpy as np
import random
class Sampler(object):
    def __init__(self, data_source):
        pass
...
@@ -23,7 +36,7 @@ class Sampler(object):
class SequentialSampler(Sampler):
    def __init__(self, data_source):
        self.data_source = data_source

    def __iter__(self):
        return iter(range(len(self.data_source)))
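
# Illustrative sketch (not part of the original module): a sampler only yields
# valid indices, so any object with __len__ can serve as its data_source; the
# toy list below is a hypothetical dataset used purely for demonstration.
def _demo_sequential_sampler():
    toy_dataset = ["a", "b", "c", "d"]  # supports __getitem__ and __len__
    sampler = SequentialSampler(toy_dataset)
    assert list(sampler) == [0, 1, 2, 3]
    # consumers use the generated indices to fetch records from the dataset
    assert [toy_dataset[i] for i in sampler] == ["a", "b", "c", "d"]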
...
@@ -42,12 +55,14 @@ class RandomSampler(Sampler):
"replacement={}".format(self.replacement))
"replacement={}".format(self.replacement))
ifself._num_samplesisnotNoneandnotreplacement:
ifself._num_samplesisnotNoneandnotreplacement:
raiseValueError("With replacement=False, num_samples should not be specified, "
raiseValueError(
"since a random permutation will be performed.")
"With replacement=False, num_samples should not be specified, "
...
@@ -14,9 +14,4 @@ One of the reasons we choose to load data lazily (only load metadata before hand
For deep learning practice, we typically batch examples, so the dataset should come with a method to batch examples. Assume a record is implemented as a tuple with several items. When an item is a fixed-size array, batching is trivial: `np.stack` suffices. But for arrays with dynamic sizes, padding is needed, so we decided to implement a batching method for each item; batching a whole record can then be composed from these methods. For a dataset, a `_batch_examples` method should be implemented, but in most cases you can choose one from `batching.py`.
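
As an illustration of the padding-based batching described above, here is a minimal sketch of a per-item batching method (the function name, zero padding, and the returned lengths are assumptions; the actual helpers live in `batching.py`):

```python
import numpy as np

def batch_1d_examples(arrays, pad_value=0):
    """Batch variable-length 1-D arrays by padding them to the longest length."""
    lengths = [len(a) for a in arrays]
    max_len = max(lengths)
    padded = [
        np.pad(a, (0, max_len - len(a)), mode="constant", constant_values=pad_value)
        for a in arrays
    ]
    # fixed-size items could be stacked directly with np.stack; padding makes
    # dynamically sized items stackable as well
    return np.stack(padded), np.array(lengths)

# usage: batch, lengths = batch_1d_examples([np.arange(3), np.arange(5)])
```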