<!DOCTYPE html>
<html lang="en">
<head>
    <meta http-equiv="content-type" content="text/html;charset=utf-8"/>
    <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
    <meta name="description" content=""/>

    <meta name="twitter:card" content="summary"/>
    <meta name="twitter:image:src" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
    <meta name="twitter:title" content="Distilling the Knowledge in a Neural Network"/>
    <meta name="twitter:description" content=""/>
    <meta name="twitter:site" content="@labmlai"/>
    <meta name="twitter:creator" content="@labmlai"/>

    <meta property="og:url" content="https://nn.labml.ai/distillation/readme.html"/>
    <meta property="og:title" content="Distilling the Knowledge in a Neural Network"/>
    <meta property="og:image" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
    <meta property="og:site_name" content="Distilling the Knowledge in a Neural Network"/>
    <meta property="og:type" content="object"/>
    <meta property="og:description" content=""/>

    <title>Distilling the Knowledge in a Neural Network</title>
    <link rel="shortcut icon" href="/icon.png"/>
    <link rel="stylesheet" href="../pylit.css?v=1">
    <link rel="canonical" href="https://nn.labml.ai/distillation/readme.html"/>
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">

    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH"></script>
    <script>
        window.dataLayer = window.dataLayer || [];

        function gtag() {
            dataLayer.push(arguments);
        }

        gtag('js', new Date());

        gtag('config', 'G-4V3HC8HBLH');
    </script>
</head>
<body>
<div id='container'>
    <div id="background"></div>
    <div class='section'>
        <div class='docs'>
            <p>
                <a class="parent" href="/">home</a>
                <a class="parent" href="index.html">distillation</a>
            </p>
            <p>
                <a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations" target="_blank">
                    <img alt="Github"
                         src="https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social"
                         style="max-width:100%;"/></a>
                <a href="https://twitter.com/labmlai" rel="nofollow" target="_blank">
                    <img alt="Twitter"
                         src="https://img.shields.io/twitter/follow/labmlai?style=social"
                         style="max-width:100%;"/></a>
            </p>
            <p>
                <a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/distillation/readme.md" target="_blank">
                    View code on Github</a>
            </p>
        </div>
    </div>
    <div class='section' id='section-0'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-0'>#</a>
            </div>
            <h1><a href="https://nn.labml.ai/distillation/index.html">Distilling the Knowledge in a Neural Network</a></h1>
<p>This is a <a href="https://pytorch.org">PyTorch</a> implementation/tutorial of the paper <a href="https://papers.labml.ai/paper/1503.02531">Distilling the Knowledge in a Neural Network</a>.</p>
<p>Distillation is a way of training a small network using the knowledge in a trained larger network; that is, it distills the knowledge of the large network into the small one.</p>
<p>A large model with regularization, or an ensemble of models (using dropout), generalizes better than a small model trained directly on the data and labels. However, a small model can be trained to generalize better with the help of a large model. Smaller models are preferable in production because they are faster and need less compute and memory.</p>
<p>The output probabilities of a trained model carry more information than the labels because the model assigns non-zero probabilities to incorrect classes as well. These probabilities indicate that a sample has some chance of belonging to certain other classes. For instance, when classifying digits, given an image of the digit <em>7</em>, a model that generalizes well will give a high probability to 7 and a small but non-zero probability to 2, while assigning almost zero probability to the other digits. Distillation uses this information to train a small model better.</p>
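<p>The idea of learning from soft probabilities can be sketched in a few lines. This is a minimal illustration in plain Python rather than the PyTorch implementation itself. The teacher logits, the function names, and the <code>temperature</code> and <code>soft_weight</code> parameters are assumptions made for this example. The temperature-softened softmax follows the general approach of the paper, and the sketch leaves out details such as loss scaling.</p>

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross entropy between a target distribution and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, soft_weight=0.5):
    """Weighted sum of a soft-target loss and a hard-label loss (illustrative only)."""
    # Soft targets: the teacher's softened probabilities over all classes
    soft_targets = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_loss = cross_entropy(soft_targets, student_soft)

    # Hard target: the usual negative log-likelihood of the true label
    hard_loss = -math.log(softmax(student_logits)[label])

    return soft_weight * soft_loss + (1.0 - soft_weight) * hard_loss


# Hypothetical teacher logits for an image of the digit 7 (classes 0 to 9):
# high for 7, noticeably raised for 2, near zero elsewhere.
teacher_logits = [0.0, 0.5, 4.0, 0.0, 0.0, 0.0, 0.0, 9.0, 0.0, 0.5]

teacher_probs = softmax(teacher_logits)                   # nearly one-hot on 7
soft_targets = softmax(teacher_logits, temperature=4.0)   # 2 becomes clearly visible

# A student that matches the teacher gets a lower loss than an uninformed one
loss_matching = distillation_loss(teacher_logits, teacher_logits, label=7)
loss_uniform = distillation_loss([0.0] * 10, teacher_logits, label=7)
```

<p>With these made-up logits, raising the temperature increases the probability the teacher assigns to 2 relative to the unsoftened output, so the student sees the similarity between 7 and 2 instead of only the hard label.</p>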
        </div>
        <div class='code'>
            
        </div>
    </div>
    <div class='footer'>
        <a href="https://papers.labml.ai">Trending Research Papers</a>
        <a href="https://labml.ai">labml.ai</a>
    </div>
</div>
<script src="../interactive.js?v=1"></script>
<script>
    function handleImages() {
        var images = document.querySelectorAll('p>img')

        for (var i = 0; i < images.length; ++i) {
            handleImage(images[i])
        }
    }

    function handleImage(img) {
        img.parentElement.style.textAlign = 'center'

        var modal = document.createElement('div')
        modal.id = 'modal'

        var modalContent = document.createElement('div')
        modal.appendChild(modalContent)

        var modalImage = document.createElement('img')
        modalContent.appendChild(modalImage)

        var span = document.createElement('span')
        span.classList.add('close')
        span.textContent = 'x'
        modal.appendChild(span)

        img.onclick = function () {
            console.log('clicked')
            document.body.appendChild(modal)
            modalImage.src = img.src
        }

        span.onclick = function () {
            document.body.removeChild(modal)
        }
    }

    handleImages()
</script>
</body>
</html>