diff --git a/docs/zh_cn/papers/2022.md b/docs/zh_cn/papers/2022.md new file mode 100644 index 0000000000000000000000000000000000000000..53f2d360067bc5c0ea3325419b5b4a8e401eb278 --- /dev/null +++ b/docs/zh_cn/papers/2022.md @@ -0,0 +1,503 @@ +# 2022 Compression-Related Papers + +Contents: +- [CVPR](#cvpr) +- [NeurIPS](#neurips) +- [ECCV](#eccv) +- [IJCAI](#ijcai) +- [ICML](#icml) +- [AAAI](#aaai) +- [ICLR](#iclr) +- [ACL](#acl) +- [ACM MM](#acm-mm) + + + + +## CVPR + +### Knowledge Distillation +1. Decoupled Knowledge Distillation +2. Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation +3. Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability +4. Focal and Global Knowledge Distillation for Detectors +### Pruning +1. CHEX: CHannel EXploration for CNN Model Compression +2. Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs +### Quantization +1. It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher +2. Implicit Feature Decoupling with Depthwise Quantization +3. IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization +### NAS +1. Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training? +2. Training-free Transformer Architecture Search +3. Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning +4. β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search + +## NeurIPS + +### Knowledge Distillation + +1. Hilbert Distillation for Cross-Dimensionality Networks +2. Decomposed Knowledge Distillation for Class-incremental Semantic Segmentation +3. Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation +4. Preservation of the Global Knowledge by Not-True Distillation in Federated Learning +5. TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation +6. Weakly Supervised Knowledge Distillation for Whole Slide Image Classification +7. PKD: General Distillation Framework for Object Detectors via Pearson Correlation Coefficient +8. SeqPATE: Differentially Private Text Generation via Knowledge Distillation +9. Dataset Distillation using Neural Feature Regression +10. Toward Understanding Privileged Features Distillation in Learning-to-Rank +11. What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective +12. Fairness without Demographics through Knowledge Distillation +13. Improving Policy Learning via Language Dynamics Distillation +14. Offline Multi-Agent Reinforcement Learning with Knowledge Distillation +15. Structural Knowledge Distillation for Object Detection +16. Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation +17. Shadow Knowledge Distillation: Bridging Offline and Online Knowledge Transfer +18. Knowledge Distillation Improves Graph Structure Augmentation for Graph Neural Networks +19. Knowledge Distillation: Bad Models Can Be Good Role Models +20. Teach Less, Learn More: On the Undistillable Classes in Knowledge Distillation +21. Weighted Distillation with Unlabeled Examples +22. Geometric Distillation for Graph Networks +23. Efficient Dataset Distillation using Random Feature Approximation +24. Knowledge Distillation from A Stronger Teacher +25. Respecting Transfer Gap in Knowledge Distillation +26. 
Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding +27. Functional Ensemble Distillation +28. Decomposing NeRF for Editing via Feature Field Distillation +29. Towards Efficient 3D Object Detection with Knowledge Distillation +30. PerfectDou: Dominating DouDizhu with Perfect Information Distillation +31. Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation +32. Improved Feature Distillation via Projector Ensemble +33. Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation +34. Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts + +### Quantization + +1. Redistribution of Weights and Activations for AdderNet Quantization +2. Towards Efficient Post-training Quantization of Pre-trained Language Models +3. ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers +4. Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning +5. FP8 Quantization: The Power of the Exponent +6. ClimbQ: Class Imbalanced Quantization Enabling Robustness on Efficient Inferences +7. Prompt Certified Machine Unlearning with Randomized Gradient Smoothing and Quantization +8. Leveraging Inter-Layer Dependency for Post-Training Quantization +9. Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques +10. Entropy-Driven Mixed-Precision Quantization for Deep Network Design +11. GPT3.int8(): 8-bit Matrix Multiplication for Transformers at Scale + +### Pruning + +1. Spatial Pruned Sparse Convolution for 3D Object Detection +2. Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks +3. Data-Efficient Structured Pruning via Submodular Optimization +4. Robust Binary Models by Pruning Randomly-initialized Networks +5. On Neural Network Pruning's Effect on Generalization +6. Advancing Model Pruning via Bi-level Optimization +7. Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning +8. Sparse Probabilistic Circuits via Pruning and Growing +9. Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions +10. Structural Pruning via Latency-Saliency Knapsack +11. SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization +12. A Fast Post-Training Pruning Framework for Transformers +13. Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm +14. Pruning has a disparate impact on model accuracy + + +## ECCV + +### Knowledge Distillation + +1. Monitored Distillation for Positive Congruent Depth Completion +2. Resolution-Free Point Cloud Sampling Network with Data Distillation +3. CMD: Self-Supervised 3D Action Representation Learning with Cross-Modal Mutual Distillation +4. Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation +5. Prediction-Guided Distillation for Dense Object Detection +6. HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors +7. Multi-faceted Distillation of Base-Novel Commonality for Few-Shot Object Detection +8. Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection +9. Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations +10. GLAMD: Global and Local Attention Mask Distillation for Object Detectors +11. Knowledge Condensation Distillation +12. Masked Generative Distillation +13. 
Efficient One Pass Self-Distillation with Zipf's Label Smoothing +14. IDa-Det: An Information Discrepancy-Aware Distillation for 1-Bit Detectors +15. Switchable Online Knowledge Distillation +16. Deep Ensemble Learning by Diverse Knowledge Distillation for Fine-Grained Object Classification +17. Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition +18. CoupleFace: Relation Matters for Face Recognition Distillation +19. Deep Hash Distillation for Image Retrieval +20. TinyViT: Fast Pretraining Distillation for Small Vision Transformers +21. Black-Box Few-Shot Knowledge Distillation +22. CXR Segmentation by AdaIN-Based Domain Adaptation and Knowledge Distillation +23. Dynamic Metric Learning with Cross-Level Concept Distillation +24. MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition +25. Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition +26. A Fast Knowledge Distillation Framework for Visual Recognition +27. Cross-Domain Ensemble Distillation for Domain Generalization +28. Self-Regulated Feature Learning via Teacher-Free Feature Distillation +29. Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting +30. Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving +31. Multi-Granularity Distillation Scheme towards Lightweight Semi-Supervised Semantic Segmentation +32. FedX: Unsupervised Federated Learning with Cross Knowledge Distillation +33. Improving Self-Supervised Lightweight Model Learning via Hard-Aware Metric Distillation +34. KD-MVS: Knowledge Distillation Based Self-Supervised Learning for Multi-View Stereo +35. DistPro: Searching a Fast Knowledge Distillation Process via Meta Optimization +36. Personalized Education: Blind Knowledge Distillation +37. Delta Distillation for Efficient Video Processing +38. Drive\&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-Modal Distillation +39. LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection +40. Circumventing Outliers of AutoAugment with Knowledge Distillation +41. Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification +42. Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer +43. HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing +44. Intra-class Feature Variation Distillation for Semantic Segmentation +45. Knowledge Distillation Meets Self-Supervision +46. Discriminability Distillation in Group Representation Learning +47. Robust Re-Identification by Multiple Views Knowledge Distillation +48. Local Correlation Consistency for Knowledge Distillation +49. AMLN: Adversarial-based Mutual Learning Network for Online Knowledge Distillation +50. Defocus Blur Detection via Depth Distillation +51. MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation +52. Knowledge Transfer via Dense Cross-Layer Mutual-Distillation +53. Matching Guided Distillation +54. Differentiable Feature Aggregation Search for Knowledge Distillation +55. Online Ensemble Model Compression using Knowledge Distillation +56. Prime-Aware Adaptive Distillation +57. PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning +58. StyleGAN2 Distillation for Feed-forward Image Manipulation +59. Exclusivity-Consistency Regularized Knowledge Distillation for Face Recognition +60. Feature Normalized Knowledge Distillation for Image Classification +61. 
DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild +62. Weight Decay Scheduling and Knowledge Distillation for Active Learning +63. Semantic Relation Preserving Knowledge Distillation for Image-to-Image Translation +64. Domain Adaptation Through Task Distillation +65. Interpretable Foreground Object Search As Knowledge Distillation +66. Improving Knowledge Distillation via Category Structure +67. Improving Face Recognition from Hard Samples via Distribution Distillation Loss + +### Quantization + +1. PoseGPT: Quantization-Based 3D Human Motion Generation and Forecasting +2. CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution +3. Fine-Grained Data Distribution Alignment for Post-Training Quantization +4. Patch Similarity Aware Data-Free Quantization for Vision Transformers +5. Symmetry Regularization and Saturating Nonlinearity for Robust Quantization +6. Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance +7. Non-uniform Step Size Quantization for Accurate Post-Training Quantization +8. Towards Accurate Network Quantization with Equivalent Smooth Regularizer +9. Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization +10. BASQ: Branch-Wise Activation-Clipping Search Quantization for Sub-4-Bit Neural Networks +11. RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization +12. PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization +13. Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach +14. Privacy-Preserving Action Recognition via Motion Difference Quantization +15. Auto-Regressive Image Synthesis with Integrated Quantization +16. Synergistic Self-Supervised and Quantization Learning +17. Post-Training Piecewise Linear Quantization for Deep Neural Networks +18. Quantization Guided JPEG Artifact Correction +19. Deep Transferring Quantization +20. Search What You Want: Barrier Penalty NAS for Mixed Precision Quantization +21. Generative Low-bitwidth Data Free Quantization +22. Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes +23. Task-Aware Quantization Network for JPEG Image Compression +24. HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs +25. Differentiable Joint Pruning and Quantization for Hardware Efficiency + + +### Pruning + +1. Disentangled Differentiable Network Pruning +2. Multi-Granularity Pruning for Model Acceleration on Mobile Devices +3. Ensemble Knowledge Guided Sub-network Search and Fine-Tuning for Filter Pruning +4. SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning +5. Soft Masking for Cost-Constrained Channel Pruning +6. SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning +7. Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning +8. FairGRAPE: Fairness-Aware GRAdient Pruning mEthod for Face Attribute Classification +9. CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution +10. Filter Pruning via Feature Discrimination in Deep Neural Networks +11. Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps +12. Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning +13. EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning +14. 
DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation +15. DHP: Differentiable Meta Pruning via HyperNetworks +16. Meta-Learning with Network Pruning +17. Accelerating CNN Training by Pruning Activation Gradients +18. DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search +19. Differentiable Joint Pruning and Quantization for Hardware Efficiency +20. PPT: Token-Pruned Pose Transformer for Monocular and Multi-View Human Pose Estimation +21. Prune Your Model before Distill It + +## IJCAI + +### Knowledge Distillation + +1. Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation +2. Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation +3. Mutual Distillation Learning Network for Trajectory-User Linking +4. Continual Federated Learning Based on Knowledge Distillation +5. Data-Free Adversarial Knowledge Distillation for Graph Neural Networks +6. Pseudo-spherical Knowledge Distillation +7. Learning from Students: Online Contrastive Distillation Network for General Continual Learning +8. Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt + + +### Quantization + +1. FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer +2. RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization +3. MultiQuant: Training Once for Multi-bit Quantization of Neural Networks + +### Pruning + +1. Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph +2. FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server +3. On the Channel Pruning using Graph Convolution Network for Convolutional Neural Network Acceleration +4. Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization +5. Model-Based Offline Planning with Trajectory Pruning +6. Neural Subgraph Explorer: Reducing Noisy Information via Target-oriented Syntax Graph Pruning +7. Neural Network Pruning by Cooperative Coevolution +8. Recent Advances on Neural Network Pruning at Initialization + +## ICML + +### Knowledge Distillation +1. Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing? +2. Spatial-Channel Token Distillation for Vision MLPs +3. Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation + +### Quantization + +1. Continuous Control with Action Quantization from Demonstrations +2. SDQ: Stochastic Differentiable Quantization with Mixed Precision +3. Overcoming Oscillations in Quantization-Aware Training +4. Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training +5. Correlated Quantization for Distributed Mean Estimation and Optimization +6. SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization +7. Accurate Quantization of Measures via Interacting Particle-based Optimization + +### Pruning + +1. Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness +2. SPDY: Accurate Pruning with Speedup Guarantees +3. Sparse Double Descent: Where Network Pruning Aggravates Overfitting +4. PAC-Net: A Model Pruning Approach to Inductive Transfer Learning +5. Neural Network Pruning Denoises the Features and Makes Local Connectivity Emerge in Visual Tasks +6. 
Winning the Lottery Ahead of Time: Efficient Early Network Pruning +7. Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning +8. The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks +9. PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance + +### Others +1. DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks + +## AAAI + +### Knowledge Distillation + +1. Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-Guided Feature Imitation +2. Knowledge Distillation via Constrained Variational Inference +3. Cosine Model Watermarking Against Ensemble Distillation +4. Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay +5. TrustAL: Trustworthy Active Learning Using Knowledge Distillation +6. Class-Wise Adaptive Self Distillation for Federated Learning on Non-IID Data (Student Abstract) +7. Multi-Scale Distillation from Multiple Graph Neural Networks +8. Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes +9. Improving Neural Cross-Lingual Abstractive Summarization via Employing Optimal Transport Distance for Knowledge Distillation +10. EasySED: Trusted Sound Event Detection with Self-Distillation +11. LGD: Label-Guided Self-Distillation for Object Detection +12. On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals +13. Content-Variant Reference Image Quality Assessment via Knowledge Distillation +14. Pose-Invariant Face Recognition via Adaptive Angular Distillation +15. UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation +16. Preserving Privacy in Federated Learning with Ensemble Cross-Domain Knowledge Distillation +17. Up to 100× Faster Data-Free Knowledge Distillation +18. Safe Distillation Box +19. Cross-Task Knowledge Distillation in Multi-Task Recommendation +20. Boosting Contrastive Learning with Relation Knowledge Distillation +21. Feature Distillation Interaction Weighting Network for Lightweight Image Super-Resolution +22. Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation +23. ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images +24. Adversarial Data Augmentation for Task-Specific Knowledge Distillation of Pre-Trained Transformers +25. An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation + +### Quantization + +1. Anisotropic Additive Quantization for Fast Inner Product Search +2. Contrastive Quantization with Code Memory for Unsupervised Image Retrieval + +### Pruning + +1. Span-Based Semantic Role Labeling with Argument Pruning and Second-Order Inference +2. Width & Depth Pruning for Vision Transformers +3. Gradient and Magnitude Based Pruning for Sparse Deep Neural Networks +4. Prior Gradient Mask Guided Pruning-Aware Fine-Tuning +5. From Dense to Sparse: Contrastive Pruning for Better Pre-Trained Language Model Compression +6. Prune and Tune Ensembles: Low-Cost Ensemble Learning with Sparse Independent Subnetworks + +### Others + +1. BATUDE: Budget-Aware Neural Network Compression Based on Tucker Decomposition +2. Convolutional Neural Network Compression Through Generalized Kronecker Product Decomposition + +## ICLR + +### Knowledge Distillation + +1. 
Churn Reduction via Distillation +2. Progressive Distillation for Fast Sampling of Diffusion Models +3. Online Hyperparameter Meta-Learning with Hypergradient Distillation +4. Improving Non-Autoregressive Translation Models Without Distillation +5. Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations +6. Feature Kernel Distillation +7. Graph-less Neural Networks: Teaching Old MLPs New Tricks Via Distillation +8. Reliable Adversarial Distillation with Unreliable Teachers +9. Towards Model Agnostic Federated Learning Using Knowledge Distillation +10. Unified Visual Transformer Compression +11. Bag of Instances Aggregation Boosts Self-supervised Distillation +12. Data Efficient Language-Supervised Zero-Shot Recognition with Optimal Transport Distillation +13. Object Dynamics Distillation for Scene Decomposition and Representation +14. Open-vocabulary Object Detection via Vision and Language Knowledge Distillation +15. Prototypical Contrastive Predictive Coding +16. Better Supervisory Signals by Observing Learning Paths +17. Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data +18. 
Image BERT Pre-training with Online Tokenizer +19. Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery +20. Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods +21. Reinforcement Learning in Presence of Discrete Markovian Context Evolution +22. Exploring extreme parameter compression for pre-trained language models +23. Out-of-distribution Generalization in the Presence of Nuisance-Induced Spurious Correlations +24. Learning Efficient Image Super-Resolution Networks via Structure-Regularized Pruning +25. BiBERT: Accurate Fully Binarized BERT + + +### Quantization + +1. F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization +2. 8-bit Optimizers via Block-wise Quantization +3. QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization +4. Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks +5. SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation +6. Toward Efficient Low-Precision Training: Data Format Optimization and Hysteresis Quantization +7. Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes +8. Information Bottleneck: Exact Analysis of (Quantized) Neural Networks +9. 
EXACT: Scalable Graph Neural Networks Training via Extreme Activation Compression +10. VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning +11. Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph +12. Transformer-based Transform Coding +13. Graph-less Neural Networks: Teaching Old MLPs New Tricks Via Distillation + + +### Pruning + +1. On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning +2. SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning +3. Possibility Before Utility: Learning And Using Hierarchical Affordances +4. Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining +5. Effective Model Sparsification by Scheduled Grow-and-Prune Methods +6. Revisit Kernel Pruning with Lottery Regulated Grouped Convolutions +7. The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training +8. Learning Efficient Image Super-Resolution Networks via Structure-Regularized Pruning +9. Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression +10. No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models +11. Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients
+12. An Operator Theoretic View On Pruning Deep Neural Networks +13. Unified Visual Transformer Compression +14. Signing the Supermask: Keep, Hide, Invert +15. Plant 'n' Seek: Can You Find the Winning Ticket? +16. Proving the Lottery Ticket Hypothesis for Convolutional Neural Networks +17. On the Existence of Universal Lottery Tickets +18. Training Structured Neural Networks Through Manifold Identification and Variance Reduction +19. PF-GNN: Differentiable particle filtering based approximation of universal graph representations +20. Graph-less Neural Networks: Teaching Old MLPs New Tricks Via Distillation +21. 
How many degrees of freedom do we need to train deep networks: a loss landscape perspective +22. Dual Lottery Ticket Hypothesis +23. Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently + +### Others + +1. DKM: Differentiable k-Means Clustering Layer for Neural Network Compression +2. Exploring extreme parameter compression for pre-trained language models +3. Unified Visual Transformer Compression +4. Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression +5. Lossy Compression with Distribution Shift as Entropy Constrained Optimal Transport +6. Memory Replay with Data Compression for Continual Learning +7. Entroformer: A Transformer-based Entropy Model for Learned Image Compression +8. Language model compression with weighted low-rank factorization +9. On Distributed Adaptive Optimization with Gradient Compression +10. Distribution Compression in Near-Linear Time +11. Towards Empirical Sandwich Bounds on the Rate-Distortion Function +12. Information Bottleneck: Exact Analysis of (Quantized) Neural Networks +13. Prototypical Contrastive Predictive Coding +14. Transformer-based Transform Coding +15. 
Neural Network Approximation based on Hausdorff distance of Tropical Zonotopes +16. Autoregressive Diffusion Models +17. Learning Efficient Image Super-Resolution Networks via Structure-Regularized Pruning +18. Fast Generic Interaction Detection for Model Interpretability and Compression +19. EXACT: Scalable Graph Neural Networks Training via Extreme Activation Compression +20. Generalized Kernel Thinning +21. VC dimension of partially quantized neural networks in the overparametrized regime +22. BiBERT: Accurate Fully Binarized BERT +23. Permutation Compressors for Provably Faster Distributed Nonconvex Optimization +24. Revisit Kernel Pruning with Lottery Regulated Grouped Convolutions + +## ACL + +### Knowledge Distillation + +1. Attention Temperature Matters in Abstractive Summarization Distillation. 127-141 +2. Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm. 190-200 +3. Multi-Granularity Structural Knowledge Distillation for Language Model Compression. 1001-1011 +4. Domain Knowledge Transferring for Pre-trained Language Model via Calibrated Activation Boundary Distillation. 1658-1669 +5. Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations. 4865-4877 +6. BERT Learns to Teach: Knowledge Distillation with Meta Learning. 7037-7049 +7. DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization. 203-211 + +### Quantization + +1. Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking. 695-707 +2. Compression of Generative Pre-trained Language Models via Quantization. 4821-4836 + +### Pruning + +1. Structured Pruning Learns Compact and Accurate Models. 1513-1528 +2. Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency. 1852-1865 + + +### Others + +1. Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning. 1267-1280 + +## ACM MM + +### Knowledge Distillation + +1. Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval. 
601-609 +2. Patch-based Knowledge Distillation for Lifelong Person Re-Identification. 696-707 +3. D2Animator: Dual Distillation of StyleGAN For High-Resolution Face Animation. 1769-1778 +4. Learning an Inference-accelerated Network from a Pre-trained Model with Frequency-enhanced Feature Distillation. 1847-1856 +5. Mix-DANN and Dynamic-Modal-Distillation for Video Domain Adaptation. 3224-3233 +6. Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation. 3829-3837 +7. Pay Attention to Your Positive Pairs: Positive Pair Aware Contrastive Knowledge Distillation. 5862-5870 +8. Progressive Cross-modal Knowledge Distillation for Human Action Recognition. 5903-5912 +9. Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection. 6278-6287 + +### Quantization + +1. Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach. 2899-2908 +2. Towards Accurate Post-Training Quantization for Vision Transformer. 5380-5388 + +### Pruning + +1. Bayesian based Re-parameterization for DNN Model Pruning. 1367-1375 +2. SoftSkip: Empowering Multi-Modal Dynamic Pruning for Single-Stage Referring Comprehension. 3608-3616 +3. Rethinking the Mechanism of the Pattern Pruning and the Circle Importance Hypothesis. 4899-4908