1. 03 Feb 2023 (1 commit)
    • Replace matmul(v2) with fused_matmul during oneDNN fuse passes (#49515) · 5cfe1645
      Sławomir Siwek committed
      * replace matmul with matmul_v2 in fuse passes
      
      * Remove fusion logic from matmul
      
      * removing fusion methods
      
      * add proper name
      
      * adjust namespaces
      
      * clean attrs in python tests
      
      * delete checkpoint and restore matmul version
      
      * remove unused code
      
      * matmul and reshape/transpose fuses migrated
      
      * split MatmulOneDNN headers
      
      * fuse activation and eltwise_add
      
      * add fuse_activation
      
      * matmul_transpose_reshape/reshape_transpose_matmul
      
      * matmul + elementwise_add (fused)
      
      * temporary activation modification
      
      * merge newest develop
      
      * remove dependency on other PR
      
      * revert pbtxt
      
      * remove placeholders from matmul_v2
      
      * add description in OPMaker
      
      * remove matmul_v2_op.h and all dependencies
      
      * remove dims changing in base op
      
      * add possibility to fuse already fused_matmul
      
      * restart broken CI
      
      * Empty-Commit
      
      * revert matmul_utils.h
      
      * codestyle
      
      * adjust imports
      
      * add pbtxt file
      
      * 100% matmul unit tests coverage
      
      * trigger CI with minimal changes to develop
      
      * adjust changes to develop
      
      * add fused_matmul op
      
      * inherit base ops
      
      * add "v2"
      
      * move OPMaker
      
      * Gradually add fused_matmul files
      
      * second batch of fused_matmul changes
      
      * split infershapes of matmul_v2 and fused_matmul
      
      * inherit fused_matmul from matmul_v2
      
      * Update paddle/phi/backends/onednn/onednn_reuse.h
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      
      * Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      
      ---------
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
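The long bullet list above boils down to one graph rewrite: a matmul followed by elementwise_add and/or an activation is collapsed into a single fused_matmul node. A minimal sketch of the numerical contract such a fuse pass must preserve (plain C++, not Paddle code; the function names are illustrative):

```cpp
#include <algorithm>
#include <array>
#include <cassert>

using Mat2 = std::array<std::array<float, 2>, 2>;

// Plain 2x2 matmul, standing in for the unfused matmul(v2) op.
inline Mat2 Matmul(const Mat2& a, const Mat2& b) {
    Mat2 c{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Stand-in for fused_matmul with a residual add and ReLU fused in:
// matmul + elementwise_add + activation computed as one op.
inline Mat2 FusedMatmulAddRelu(const Mat2& a, const Mat2& b, const Mat2& res) {
    Mat2 c = Matmul(a, b);
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            c[i][j] = std::max(0.0f, c[i][j] + res[i][j]);
    return c;
}
```

The pass is only legal when the fused op's output is bit-identical to running the three original ops in sequence, which is exactly what the "100% matmul unit tests coverage" bullet above is guarding.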
  2. 02 Feb 2023 (1 commit)
  3. 01 Feb 2023 (1 commit)
  4. 31 Jan 2023 (2 commits)
  5. 30 Jan 2023 (1 commit)
  6. 19 Jan 2023 (1 commit)
    • [KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9
      jameszhang committed
      * [KUNLUN] add op: maxpool_with_index
      
      * use DeviceContext::Alloc() instead of DenseTensor::mutable_data()
      
      * fix file format
      
      * solve clip unittest failure
      
      * minor fix
      
      * Revert "solve clip unittest failure" since the issue is fixed in #49535
      
      This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.
      
      * align with xdnn on the definition of mask in max_pool_with_index
      
      * minor
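For context, max_pool_with_index returns, alongside each pooled value, the flat index of the window's maximum; that index tensor is the "mask" whose definition was aligned with xdnn above. A toy 1-D sketch of the semantics (not the KUNLUN kernel; names and the non-overlapping-window simplification are mine):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy 1-D max pooling with index output: for each non-overlapping window,
// emit the max value and the flat input index where that max was found.
std::pair<std::vector<float>, std::vector<int>>
MaxPoolWithIndex(const std::vector<float>& x, int window) {
    std::vector<float> out;
    std::vector<int> idx;
    for (std::size_t start = 0; start + window <= x.size(); start += window) {
        int best = static_cast<int>(start);
        for (int k = 1; k < window; ++k)
            if (x[start + k] > x[best]) best = static_cast<int>(start) + k;
        out.push_back(x[best]);   // pooled value
        idx.push_back(best);      // the "mask": where the max came from
    }
    return {out, idx};
}
```

The saved indices are what make the corresponding backward pass cheap: the gradient is routed only to the recorded positions.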
  7. 18 Jan 2023 (4 commits)
    • Handle repetitive code in oneDNN activation fuse passes (#49824) · a1b2e1e2
      Sławomir Siwek committed
      * extract fuse pass logic to header file
      
      * adjust namespaces
      
      * Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h
      
      update date
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      
      * add inline, remove static
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
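The "add inline, remove static" bullet reflects a standard C++ rule: a helper moved from a .cc file into a shared header must switch from `static` (a separate private copy per translation unit) to `inline` (one mergeable definition across all translation units that include the header). A generic sketch with a made-up helper, not the actual pass code:

```cpp
#include <cassert>
#include <string>

// activation_fuse_helpers.h (hypothetical): because this definition now lives
// in a header included by several fuse passes, it is declared `inline` so all
// translation units share one definition under the one-definition rule,
// instead of `static`, which would stamp out a private copy in each .cc file.
inline std::string FusedOpName(const std::string& base,
                               const std::string& activation) {
    return activation.empty() ? base : base + "_" + activation;
}
```

With `static` the code would still compile, but each including file would carry its own duplicate function, bloating the binary and defeating the point of deduplicating the pass logic.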
    • [PHI] remove bitwise and, or, xor (#49916) · 9056cc8b
      RuohengMa committed
      * add reduce_sum_int64 and reduce_sum_int8 xpu kernels
      
      * [PHI] add clip grad kernel with support type float32 and int32
      
      * [PHI unittest] add clip_grad unit test
      
      * adapt code to clang-format
      
      * update xpu api output with clip_grad api
      
      * remove int8 support of reduce_sum xpu kernel since it can not pass unit tests
      
      * update license date, add code for XPUDataType conversion
      
      * add int8 support of reduce_sum
      
      * add reduce_sum unit tests for dtype int64, int8, and add more test cases
      
      * update license date
      
      * remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel
      
      * change license date
      9056cc8b
    • [XPU] add logical_not op. (#49911) · 60d1199a
      houj04 committed
    • use default XPU stream for computing (#49806) · f6b23d6d
      jameszhang committed
      * revert to use default XPU stream for computing
      
      XPUContext now has a null stream by default. If you want to use a separate
      stream (e.g. in async collective communication), you should create a
      dedicated XPUContext and invoke its XPUContext::CreateStream()
      
      * minor
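The contract described in that commit message, that a context starts with a null stream (meaning "run on the default XPU stream") and only gets a dedicated stream after an explicit CreateStream() call, can be mocked as follows. This is a behavioral sketch only, not the real XPUContext class:

```cpp
#include <cassert>

// Mock of the stream-ownership behavior described above. A null stream_
// means work goes to the default XPU stream; a non-null stream_ models a
// dedicated stream created on request (e.g. for async collective comms).
class MockXPUContext {
 public:
    void* stream() const { return stream_; }      // null => default stream
    void CreateStream() { stream_ = &storage_; }  // stand-in for real creation
 private:
    int storage_ = 0;
    void* stream_ = nullptr;
};
```

The design choice mirrors the commit's rationale: computing kernels share the default stream unless a caller opts in to a separate one, which avoids accidental cross-stream synchronization bugs.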
  8. 16 Jan 2023 (1 commit)
  9. 13 Jan 2023 (5 commits)
  10. 12 Jan 2023 (3 commits)
  11. 10 Jan 2023 (2 commits)
  12. 09 Jan 2023 (2 commits)
  13. 06 Jan 2023 (3 commits)
  14. 03 Jan 2023 (1 commit)
  15. 27 Dec 2022 (1 commit)
  16. 26 Dec 2022 (1 commit)
    • fix dlrm qps problem (#49171) · c8f76337
      ykkk2333 committed
      * migrate shape, sgd, split, sign xpu kernels to phi, test=kunlun
      
      * fix dlrm throughput problem, test=kunlun
  17. 23 Dec 2022 (2 commits)
  18. 22 Dec 2022 (1 commit)
  19. 20 Dec 2022 (1 commit)
  20. 19 Dec 2022 (2 commits)
  21. 17 Dec 2022 (1 commit)
  22. 16 Dec 2022 (1 commit)
  23. 15 Dec 2022 (1 commit)
  24. 14 Dec 2022 (1 commit)