From 526496e1bed3882a834c73dae37f400ede97738e Mon Sep 17 00:00:00 2001
From: littletomatodonkey <2120160898@bit.edu.cn>
Date: Mon, 18 Jan 2021 11:04:39 +0800
Subject: [PATCH] Update distillation_en.md

---
 docs/en/advanced_tutorials/distillation/distillation_en.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/en/advanced_tutorials/distillation/distillation_en.md b/docs/en/advanced_tutorials/distillation/distillation_en.md
index cb74896b..ff29f19e 100644
--- a/docs/en/advanced_tutorials/distillation/distillation_en.md
+++ b/docs/en/advanced_tutorials/distillation/distillation_en.md
@@ -7,7 +7,7 @@ In recent years, deep neural networks have been proven to be an extremely effect
 
 With enough training data, increasing parameters of the neural network by building a reasonabe network can significantly the model performance. But this increases the model complexity, which takes too much computation cost in real scenarios.
 
-Parameter redundancy exists in deep neural networks. There are several methods to compress the model suck as pruning ,quantization, knowledge distillation, etc. Knowledge distillation refers to using the teacher model to guide the student model to learn specific tasks, ensuring that the small model has a relatively large effect improvement with the computation cost unchanged, and even obtains similar accuracy with the large model [1]. Combining some of the existing distillation methods [2,3], PaddleClas provides a simple semi-supervised label knowledge distillation solution (SSLD). Top-1 Accuarcy on ImageNet1k dataset has an improvement of more than 3% based on ResNet_vd and MobileNet series, which can be shown as below.
+Parameter redundancy exists in deep neural networks. There are several methods to compress the model such as pruning ,quantization, knowledge distillation, etc. Knowledge distillation refers to using the teacher model to guide the student model to learn specific tasks, ensuring that the small model has a relatively large effect improvement with the computation cost unchanged, and even obtains similar accuracy with the large model [1]. Combining some of the existing distillation methods [2,3], PaddleClas provides a simple semi-supervised label knowledge distillation solution (SSLD). Top-1 Accuarcy on ImageNet1k dataset has an improvement of more than 3% based on ResNet_vd and MobileNet series, which can be shown as below.
 
 ![](../../../images/distillation/distillation_perform_s.jpg)
 
--
GitLab
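
The paragraph touched by this patch describes knowledge distillation only in general terms (a teacher model guiding a student model). As a rough illustration, and not the exact SSLD objective used by PaddleClas, here is a minimal NumPy sketch of a temperature-scaled soft-label distillation loss; the function names `softmax` and `soft_label_kd_loss` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

def soft_label_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL divergence between teacher and student soft labels, averaged over the batch."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = (p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12))).sum(axis=-1)
    return kl.mean()

# Toy usage: a batch of 4 samples over 10 classes with random logits.
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 10))
student_logits = rng.normal(size=(4, 10))
print(soft_label_kd_loss(student_logits, teacher_logits, temperature=2.0))
```

In a real training loop the teacher's logits would come from a frozen, pretrained large model and the student would be optimized against this soft-label loss, possibly combined with the usual cross-entropy on ground-truth labels.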