The convergence comparison between Fluid and Houyi
Created by: kuke
To verify correctness, we take about 1/13 of the training data and train the same model in both frameworks on a single GPU with the Adam optimizer. The baseline is the model implemented in Houyi, an internal framework developed by the Speech team.
Settings:
batch_size: 32
device: GPU
hidden_dim: 1024
learning_rate: 0.00016
minimum_batch_size: 1
parallel: False
proj_dim: 512
stacked_num: 5
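For reproducibility, the settings listed above can be kept in one immutable configuration object. The following is a minimal sketch only: the class name `TrainingConfig` and the constant `CONFIG` are hypothetical and are not part of Fluid, Houyi, or DeepASR. Only the field names and values come from the list above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the settings listed above (not a DeepASR API)."""
    batch_size: int = 32
    device: str = "GPU"
    hidden_dim: int = 1024
    learning_rate: float = 0.00016
    minimum_batch_size: int = 1
    parallel: bool = False  # single-GPU run, as described in the text
    proj_dim: int = 512
    stacked_num: int = 5


# One shared instance, so both training runs read identical values.
CONFIG = TrainingConfig()

if __name__ == "__main__":
    # Print the settings in the same "key: value" form used in this note.
    for key, value in asdict(CONFIG).items():
        print(f"{key}: {value}")
```

Freezing the dataclass prevents the settings from being modified by accident between the two runs being compared.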
The comparison shows that the two learning curves match each other very well, which verifies the convergence of DeepASR on part of the training data.
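One way to make "match very well" quantitative is to measure the largest relative gap between the two curves at matching steps. The sketch below is an illustration under assumptions, not the procedure used for this comparison: the function names `max_relative_gap` and `curves_match` and the 5% default tolerance are hypothetical, and both curves are assumed to be logged at the same training steps.

```python
from typing import Sequence


def max_relative_gap(curve_a: Sequence[float], curve_b: Sequence[float]) -> float:
    """Return the largest relative difference between two curves.

    Both curves are assumed to hold one value (e.g. loss) per logged step,
    sampled at the same steps in both frameworks.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must be sampled at the same steps")
    if len(curve_a) == 0:
        raise ValueError("curves must not be empty")
    gaps = []
    for a, b in zip(curve_a, curve_b):
        # Normalize by the larger magnitude; the floor avoids division by zero.
        scale = max(abs(a), abs(b), 1e-12)
        gaps.append(abs(a - b) / scale)
    return max(gaps)


def curves_match(curve_a: Sequence[float], curve_b: Sequence[float],
                 tolerance: float = 0.05) -> bool:
    """True if the curves never differ by more than `tolerance` (assumed 5%)."""
    return max_relative_gap(curve_a, curve_b) <= tolerance
```

In use, `curves_match(fluid_losses, houyi_losses)` would take the per-step losses logged by each framework. Both list names are placeholders for values read from the training logs.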