diff --git a/docs/en/advanced_tutorials/distillation/distillation_en.md b/docs/en/advanced_tutorials/distillation/distillation_en.md
index cb74896b8f7959931a629da5691bf7de84fb3cc7..ff29f19e6194cb5f37569947297508d20aecf24e 100644
--- a/docs/en/advanced_tutorials/distillation/distillation_en.md
+++ b/docs/en/advanced_tutorials/distillation/distillation_en.md
@@ -7,7 +7,7 @@ In recent years, deep neural networks have been proven to be an extremely effect
 
 With enough training data, increasing parameters of the neural network by building a reasonabe network can significantly the model performance. But this increases the model complexity, which takes too much computation cost in real scenarios.
 
-Parameter redundancy exists in deep neural networks. There are several methods to compress the model suck as pruning ,quantization, knowledge distillation, etc. Knowledge distillation refers to using the teacher model to guide the student model to learn specific tasks, ensuring that the small model has a relatively large effect improvement with the computation cost unchanged, and even obtains similar accuracy with the large model [1]. Combining some of the existing distillation methods [2,3], PaddleClas provides a simple semi-supervised label knowledge distillation solution (SSLD). Top-1 Accuarcy on ImageNet1k dataset has an improvement of more than 3% based on ResNet_vd and MobileNet series, which can be shown as below.
+Parameter redundancy exists in deep neural networks. There are several methods to compress the model, such as pruning, quantization, knowledge distillation, etc. Knowledge distillation refers to using the teacher model to guide the student model to learn specific tasks, ensuring that the small model achieves a significant accuracy improvement with the computation cost unchanged, and even obtains accuracy similar to that of the large model [1]. Combining some of the existing distillation methods [2,3], PaddleClas provides a simple semi-supervised label knowledge distillation solution (SSLD). The Top-1 accuracy on the ImageNet1k dataset improves by more than 3% for the ResNet_vd and MobileNet series, as shown below.
 
 ![](../../../images/distillation/distillation_perform_s.jpg)
 