<span style="background-color:#FFFDFA;">Opt Slim Pruner: <a href="https://arxiv.org/pdf/1708.06519.pdf" target="_blank"><span style="font-size:14px;background-color:#FFFFFF;">Ye Y, You G, Fwu J K, et al. Channel Pruning via Optimal Thresholding[J]. 2020.</span></a><br/>
Quantization Aware Training: <a href="https://arxiv.org/abs/1806.08342" target="_blank"><span style="font-size:14px;background-color:#FFFFFF;">Krishnamoorthi R. Quantizing deep convolutional networks for efficient inference: A whitepaper[J]. 2018.</span></a>
</li>
<li>
Post Training <span>Quantization </span><a href="http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf" target="_blank">principles</a>
</li>
<li>
Embedding <span>Quantization: <a href="https://arxiv.org/pdf/1603.01025.pdf" target="_blank"><span style="font-size:14px;background-color:#FFFFFF;">Miyashita D, Lee E H, Murmann B. Convolutional Neural Networks using Logarithmic Data Representation[J]. 2016.</span></a></span>
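The core idea of the logarithmic representation in Miyashita et al. is to round each weight to the nearest power of two, so multiplications become bit shifts. A minimal sketch (the `fsr` and `bits` knobs here are illustrative, not the paper's exact parameterization):

```python
import numpy as np

def log_quantize(w, fsr=0, bits=4):
    """Round each nonzero weight to the nearest power of two.

    Sketch of logarithmic quantization (Miyashita et al., 2016):
    exponents are rounded in log2 space and clipped to a range set by
    `fsr` (full-scale exponent) and `bits` (exponent bit-width).
    """
    sign = np.sign(w)
    # round log2 of the magnitude; small epsilon guards log2(0)
    exp = np.round(np.log2(np.abs(w) + 1e-32))
    # clip to the representable exponent range
    exp = np.clip(exp, fsr - 2 ** bits + 1, fsr)
    return sign * 2.0 ** exp

w = np.array([0.3, -0.12, 0.9])
print(log_quantize(w))  # each entry snaps to ±2^k
```

Because every quantized value is a signed power of two, a dot product with these weights reduces to shift-and-add, which is the efficiency argument of the paper.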