    Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
    Committed by limingshu
    * Change cuDNN helper for auto-tune.
    
    * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
    
    * Fix the bug in calculating and printing current step cache hit rate.
    
    * Improve the autotune cache and fix unittest.
    
    * Change the key from AlgorithmType to int64_t.
    
    * Fix unittest for cpu-only env.
    
    * Change ChooseAlgoByWorkspace for heuristic mode.

    Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
CMakeLists.txt 1.3 KB