1. 02 Feb, 2023 1 commit
  2. 01 Feb, 2023 1 commit
  3. 31 Jan, 2023 2 commits
  4. 30 Jan, 2023 1 commit
  5. 19 Jan, 2023 1 commit
    • [KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9
      jameszhang committed
      * [KUNLUN] add op: maxpool_with_index
      
      * use DeviceContext::Alloc() instead of DenseTensor::mutable_data()
      
      * fix file format
      
      * solve clip unittest failure
      
      * minor fix
      
      * Revert "solve clip unittest failure" since the issue is fixed
      in #49535
      
      This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.
      
      * align with xdnn on the definition of mask in max_pool_with_index
      
      * minor
  6. 18 Jan, 2023 4 commits
    • Handle repetitive code in oneDNN activation fuse passes (#49824) · a1b2e1e2
      Sławomir Siwek committed
      * extract fuse pass logic to header file
      
      * adjust namespaces
      
      * Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h
      
      update date
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      
      * add inline remove static
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
    • [PHI] remove bitwise and, or, xor (#49916) · 9056cc8b
      RuohengMa committed
      * add reduce_sum_int64 and reduce_sum_int8 xpu kernels
      
      * [PHI] add clip grad kernel with support type float32 and int32
      
      * [PHI unittest] add clip_grad unit test
      
      * adapt code to clang-format
      
      * update xpu api output with clip_grad api
      
      * remove int8 support of reduce_sum xpu kernel since it cannot pass unit tests
      
      * adapt license date, add code for XPUDataType conversion
      
      * add int8 support of reduce_sum
      
      * add reduce_sum unit tests for dtype int64, int8, and add more test cases
      
      * update license date
      
      * remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel
      
      * change license date
    • [XPU] add logical_not op. (#49911) · 60d1199a
      houj04 committed
    • use default XPU stream for computing (#49806) · f6b23d6d
      jameszhang committed
      * revert to use default XPU stream for computing
      
      XPUContext now has a null stream by default. If you want to use a separate stream
      (e.g. in async collective communication), you should create a dedicated XPUContext
      and invoke its XPUContext::CreateStream().
      
      * minor
  7. 16 Jan, 2023 1 commit
  8. 13 Jan, 2023 5 commits
  9. 12 Jan, 2023 3 commits
  10. 10 Jan, 2023 2 commits
  11. 09 Jan, 2023 2 commits
  12. 06 Jan, 2023 3 commits
  13. 03 Jan, 2023 1 commit
  14. 27 Dec, 2022 1 commit
  15. 26 Dec, 2022 1 commit
    • fix dlrm qps problem (#49171) · c8f76337
      ykkk2333 committed
      * migrate shaple sgd, split, sign xpu kernels to phi, test=kunlun
      
      * fix dlrm throughput problem, test=kunlun
  16. 23 Dec, 2022 2 commits
  17. 22 Dec, 2022 1 commit
  18. 20 Dec, 2022 1 commit
  19. 19 Dec, 2022 2 commits
  20. 17 Dec, 2022 1 commit
  21. 16 Dec, 2022 1 commit
  22. 15 Dec, 2022 1 commit
  23. 14 Dec, 2022 1 commit
  24. 12 Dec, 2022 1 commit
    • Optimization of Eigh op with ssyevj_batched runtime api (#48560) · 16e364d3
      傅剑寒 committed
      * fix codestyle
      
      * add double, complex<float>, and complex<double> dtype support for syevj_batched
      
      * fix use_syevj flag for precision loss when input dtype of syevj_batch is complex128 in some cases
      
      * optimize eigh in different cases
      
      * fix missing ; bug
      
      * fix use_syevj bug
      
      * fix use_cusolver_syevj_batched flag