1. 15 Apr, 2022 2 commits
    • Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
      Committed by limingshu
      * change cudnn helper for auto-tune
      
      * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
      
      * Fix the bug in calculating and printing current step cache hit rate.
      
      * Improve the autotune cache and fix unittest.
      
      * Change the key from AlgorithmType to int64_t.
      
      * Fix unittest for cpu-only env.
      
      * change ChooseAlgoByWorkspace for heuristic mode
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • Fix batch norm memory issue (#41717) · 42abcc08
      Committed by hong
      * try to fix batch norm memory issue
      
      * fix batch norm memory alloc bug
      
      * polish some code
  2. 14 Apr, 2022 9 commits
  3. 13 Apr, 2022 10 commits
  4. 12 Apr, 2022 13 commits
  5. 11 Apr, 2022 4 commits
  6. 10 Apr, 2022 1 commit
  7. 09 Apr, 2022 1 commit