1. 15 Apr, 2022 2 commits
    • Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
      limingshu committed
      * change cudnn helper for auto-tune
      
      * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
      
      * Fix the bug in calculating and printing current step cache hit rate.
      
      * Improve the autotune cache and fix unittest.
      
      * Change the key from AlgorithmType to int64_t.
      
      * Fix unittest for cpu-only env.
      
      * change ChooseAlgoByWorkspace for heuristic mode
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • fix batch norm memory issue (#41717) · 42abcc08
      hong committed
      * try to fix batch norm memory issue
      
      * fix batch norm memory alloc bug
      
      * polish some code
  2. 14 Apr, 2022 3 commits
  3. 13 Apr, 2022 2 commits
  4. 12 Apr, 2022 8 commits
  5. 11 Apr, 2022 3 commits
  6. 10 Apr, 2022 1 commit
  7. 09 Apr, 2022 2 commits
  8. 08 Apr, 2022 1 commit
  9. 07 Apr, 2022 8 commits
  10. 06 Apr, 2022 4 commits
  11. 05 Apr, 2022 4 commits
    • Fix bug of data transform in inference executor (#41349) · 91212104
      zyfncg committed
      * fix bug of data transform in inference executor
      
      * fix bug
    • [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul (#41387) · d8a10977
      Zhanlue Yang committed
      * [Refactor] refactored eager_gen.py PR #2
      
      * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes
      
      * Fixed minor issue
      
      * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition
      
      * Fixed issues
      
      * Supported higher-order grad node generation
      
      * [DoubleGrad PR #4] Supported higher-order GradNode generation
      
      * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation
      
      * Fixed yaml typo
      
      * Fixed yaml typo
      
      * fixed minor issues
      
      * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()
      
      * Fixed minor issue
      
      * Fixed CI-Inference issue
      
      * Fixed CI-inference issues
      
      * [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run
      
      * Fixed minor issues
      
      * Fixed issue with backward graph construction logic
      
      * Fixed implementation issues with backward graph reconstruction
      
      * Fixed unittest issue
      
      * Fixed issues
      
      * [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul
      
      * Fixed issues with phi kernel
      
      * Added triple grad test case
      
      * Fixed minor issue
    • add new format of quantization (#41041) · b72a7ebb
      Guanghua Yu committed
    • Implement AutoTuneStatus class for Kernel Auto Tune (#41218) · b0f8000e
      Zhang Ting committed
      * switch autotune
      
      * implement AutoTuneCache
      
      * implement AutoTuneCache class
      
      * add pybind api
      
      * add dygraph test
      
      * support static mode and eager mode and improve unittests
      
      * rename the SwitchAutoTune Class and improve tests
      
      * improve AutoTuneStatus and reduce the cost of tests
  12. 04 Apr, 2022 2 commits