Created by: lcy-seso
When using parameters trained with the NCE cost for prediction, the activation of the last hidden layer should be softmax.
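A minimal sketch of what this means at prediction time, in plain Python rather than any particular framework's API. The names `nce_weights`, `nce_bias`, `hidden`, and `predict` are hypothetical, and the numbers are made up for illustration; they only stand in for parameters learned with the NCE cost. The point is that the learned weights are reused to compute one score per class, and softmax then turns those scores into a normalized distribution.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(hidden, weights, bias):
    # One logit per class: dot(weights[k], hidden) + bias[k].
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weights, bias)
    ]
    # Softmax activation on the last layer, as recommended above.
    return softmax(logits)


# Hypothetical values standing in for parameters trained with the NCE cost.
nce_weights = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
nce_bias = [0.0, 0.1, -0.2]
hidden = [1.0, 2.0]

probs = predict(hidden, nce_weights, nce_bias)
predicted_class = probs.index(max(probs))
```

Here `probs` sums to 1 across the three classes, so it can be read as a class distribution, which the raw scores alone would not guarantee.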