Current status of model training
Created by: kuke
We are training the DeepASR model on the full training dataset (2000 hours). After 12 epochs of training, the model has already converged well.
Settings:
batch_size: 128
device: GPU
hidden_dim: 1024
learning_rate: 0.00016
minimum_batch_size: 1
proj_dim: 512
stacked_num: 5
optimizer: Adam
Env: 4 P40 GPUs, about 15h per epoch.
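The settings above can be collected into a single configuration object for reproducibility. The sketch below is illustrative only; the dictionary keys mirror the names listed above, but the `estimate_total_hours` helper is hypothetical and not part of the actual DeepASR code.

```python
# Hypothetical sketch: the training configuration reported above,
# gathered into one dict. Key names copy the settings list verbatim.
config = {
    "batch_size": 128,
    "device": "GPU",
    "hidden_dim": 1024,
    "learning_rate": 0.00016,
    "minimum_batch_size": 1,
    "proj_dim": 512,
    "stacked_num": 5,
    "optimizer": "Adam",
}

def estimate_total_hours(num_epochs, hours_per_epoch=15):
    """Rough wall-clock estimate: ~15h per epoch on 4 P40 GPUs."""
    return num_epochs * hours_per_epoch

# The 12 epochs reported above at ~15h each:
print(estimate_total_hours(12))  # 180 hours
```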
Once the decoder is ready, we will continue to fine-tune this model to match the benchmark accuracy.