From the evaluation results, the best pass for the softmax regression model is pass-00013, with a classification accuracy of 90.01%, while the last pass, pass-00099, reaches 89.3%. Fig. 7 also shows that the best accuracy does not necessarily appear in the last pass. One explanation is that the model may already have reached a local optimum during training and then oscillates around it in the following passes, or moves to a worse local optimum.
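Since the best accuracy need not come from the last pass, it can help to pick the pass explicitly rather than default to the final one. The following is a minimal sketch, not part of the original scripts: the function `best_pass` and the accuracy values are hypothetical and only illustrate the selection step.

```python
def best_pass(accuracies):
    """Return (pass_id, accuracy) for the pass with the highest accuracy.

    `accuracies[i]` is the evaluation accuracy after pass i.
    Ties are resolved in favor of the earliest pass.
    """
    if not accuracies:
        raise ValueError("accuracies must not be empty")
    best_id = max(range(len(accuracies)), key=lambda i: accuracies[i])
    return best_id, accuracies[best_id]


# Hypothetical per-pass accuracies (illustrative values, not measured results).
accs = [0.85, 0.89, 0.90, 0.88, 0.89]
best_id, best_acc = best_pass(accs)
print("best pass: pass-%05d, accuracy: %.2f%%" % (best_id, best_acc * 100))
```

The zero-padded `pass-%05d` name only mirrors the naming seen in the results above; how the saved parameters are actually laid out on disk is not assumed here.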
### Training Results of Multi-Layer Perceptron
<p align="center">
...
...
@@ -698,6 +714,22 @@ The classification accuracy is 94.95%
From the evaluation results, the final training accuracy is 94.95%, a significant improvement over the softmax regression model. The reason is that softmax regression is a simple model that cannot fit complex data, whereas a multi-layer perceptron with hidden layers has stronger fitting capacity.
### Training Results of Convolutional Neural Network
<p align="center">
...
...
@@ -714,8 +746,26 @@ The classification accuracy is 99.20%
From the evaluation results, the best accuracy of the convolutional neural network is 99.20%. This shows that, for image problems, a convolutional neural network recognizes digits better than a fully connected network, which is likely related to the local connectivity and parameter sharing of convolutional layers. In addition, Fig. 9 shows that the convolutional neural network already performs well in the early passes, which indicates that it converges quickly.
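The effect of parameter sharing can be made concrete by counting parameters. The sketch below compares one fully connected layer with one convolutional layer on a 28x28 grayscale MNIST image. The layer sizes (128 hidden units, 20 filters of size 5x5) are assumptions chosen for illustration and are not necessarily the configuration used in this tutorial.

```python
def fc_params(n_in, n_out):
    """Number of weights plus biases in a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(in_channels, out_channels, kernel_size):
    """Number of weights plus biases in a 2-D convolution with a square kernel.

    The count is independent of the image size, because the same kernel
    is reused at every spatial position (parameter sharing).
    """
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


# Assumed layer sizes, for illustration only.
fc = fc_params(28 * 28, 128)   # every pixel connects to every hidden unit
conv = conv_params(1, 20, 5)   # 20 filters, each seeing a 5x5 local patch
print("fully connected:", fc)
print("convolutional:  ", conv)
```

Under these assumptions the fully connected layer needs 100480 parameters and the convolutional layer 520, which shows how local connections and shared kernels keep the model small.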
## Applying the Model
### Prediction Commands and Results
The script `predict.py` runs prediction with a trained model, for example with softmax regression:
...
...
@@ -741,6 +791,31 @@ Actual Number: 0
The result shows that the classifier assigns a probability close to 100% to the digit 0 for the third image, which agrees with the actual label.
### Prediction Commands and Results
The script `predict.py` makes predictions with a trained model. For example, with softmax regression:
- `-m` sets the model parameters; here the best trained model is used for prediction
Follow the instructions to input an image ID for prediction. The classifier outputs the probability of each digit, the predicted digit with the highest probability, and the ground-truth label.
From the result, the classifier recognizes the digit on the third image as 0 with a probability close to 100%, which is consistent with the ground truth.
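The three outputs described above (a probability per digit, the digit with the highest probability, and a comparison with the ground-truth label) can be sketched in plain Python. This is an illustration only and not the code of `predict.py`; the functions `softmax` and `predict` and the raw scores are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return (predicted_digit, probabilities) for one image."""
    probs = softmax(scores)
    label = max(range(len(probs)), key=lambda i: probs[i])
    return label, probs


# Hypothetical raw scores for digits 0-9 of a single image.
scores = [9.0, 0.1, 0.3, 0.2, 0.1, 0.5, 0.4, 0.1, 0.2, 0.3]
actual = 0  # hypothetical ground-truth label

label, probs = predict(scores)
for digit, p in enumerate(probs):
    print("digit %d: %.4f" % (digit, p))
print("Predicted Number:", label)
print("Actual Number:", actual)
```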