1. 05 Jul, 2021 4 commits
  2. 04 Jul, 2021 1 commit
  3. 02 Jul, 2021 3 commits
  4. 01 Jul, 2021 3 commits
  5. 30 Jun, 2021 3 commits
  6. 29 Jun, 2021 2 commits
  7. 28 Jun, 2021 2 commits
  8. 25 Jun, 2021 2 commits
  9. 24 Jun, 2021 4 commits
  10. 23 Jun, 2021 9 commits
  11. 22 Jun, 2021 4 commits
  12. 21 Jun, 2021 3 commits
    • Add AXPY oneDNN handler (#33632) · 773aabc7
      Committed by lidanqing
      * Add oneDNN AXPY handler.
      
      * Add fallback for small tensors.
      
      * Fix ifdefs
      
      * Remove unnecessary namespace prefixes and add missing headers.
      
      * Guard handler_axpy with proper ifdefs.
      
      * Compilation of this function is possible only when Paddle is built
      with neither CUDA nor HIP.
      
      * Move AXPY handler code to separate files.
      
      * Use oneDNN AXPY handler in SGD op.
      
      * Use axpy handler only when Paddle is built with oneDNN.
      
      * Add test for SUM BF16 with big rows.
      
      * Fix SFINAE rules for elementwise_add_to.
      
      * Add test case for SGD with big rows.
      
      * update
      
      * update
      Co-authored-by: Adam Osewski <adam.osewski@intel.com>
    • [NPU] optimize mul op, use BatchMatMul to realize (#33616) · f91dfe15
      Committed by pangyoki
      * use BatchMatMul
      
      * replace TensorCopy with ShareDataWith
      
      * remove check fp16 grad
      
      * fix format
      
      * add grad_check
      
      * fix grad check
    • Combine amp and qat (#33484) · f88af205
      Committed by cc
      * Combine amp and qat
      * add unit test