@@ -42,7 +42,7 @@ Audio samples generated from ground-truth spectrograms with a vocoder.
<tr>
<td >Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition</td>
@@ -81,7 +81,7 @@ Audio samples generated from ground-truth spectrograms with a vocoder.
<tr>
<td>For although the Chinese took impressions from wood blocks engraved in relief for centuries before the woodcutters of the Netherlands, by a similar process</td>
@@ -119,7 +119,7 @@ Audio samples generated from ground-truth spectrograms with a vocoder.
<tr>
<td>the invention of movable metal letters in the middle of the fifteenth century may justly be considered as the invention of the art of printing.</td>
@@ -1736,3 +1736,141 @@ We use ``FastSpeech2`` + ``ParallelWaveGAN`` here.
<br>
Finetune FastSpeech2 for CSMSC
--------------------------------------
Finetuning demos of `tts_finetune/tts3 <https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/tts_finetune/tts3>`_ for the CSMSC dataset.
When finetuning for CSMSC, we judged the audio quality to rank as follows: ``Freeze encoder`` > ``Non Frozen`` > ``Freeze encoder && duration_predictor``.
.. raw:: html
<div class="table">
CSMSC reference audio (fastspeech2_csmsc + hifigan_aishell3 in CLI): 欢迎使用飞桨语音套件。