1. 18 January 2023, 3 commits
    • [PHI] remove bitwise and, or, xor (#49916) · 9056cc8b
      Committed by RuohengMa
      * add reduce_sum_int64 and reduce_sum_int8 xpu kernels
      
      * [PHI] add clip grad kernel with supported types float32 and int32
      
      * [PHI unittest] add clip_grad unit test
      
      * adapt code to clang-format
      
      * update xpu api output with clip_grad api
      
      * remove int8 support of reduce_sum xpu kernel since it cannot pass unit tests
      
      * adapt license date, add code for XPUDataType conversion
      
      * add int8 support of reduce_sum
      
      * add reduce_sum unit tests for dtype int64, int8, and add more test cases
      
      * update license date
      
      * remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel
      
      * change license date
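      To make the dtype work listed above concrete, here is a minimal sketch of how type coverage is declared when a PHI kernel is registered for the XPU backend. This follows PHI's common PD_REGISTER_KERNEL pattern; the kernel names (clip_grad, sum_raw), the phi::ClipGradKernel / phi::SumRawKernel functors, and the include path are illustrative assumptions, not this commit's actual diff.

        // Sketch only: the trailing type list is what decides which dtypes the
        // XPU backend kernel accepts.
        #include "paddle/phi/core/kernel_registry.h"

        // clip_grad on XPU limited to the types the commit mentions (float32, int32).
        PD_REGISTER_KERNEL(clip_grad, XPU, ALL_LAYOUT, phi::ClipGradKernel, float, int) {}

        // reduce_sum gains int64 / int8 coverage by extending the registered type list.
        PD_REGISTER_KERNEL(sum_raw, XPU, ALL_LAYOUT, phi::SumRawKernel, float, int64_t, int8_t) {}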
    • [XPU] add logical_not op. (#49911) · 60d1199a
      Committed by houj04
    • use default XPU stream for computing (#49806) · f6b23d6d
      Committed by jameszhang
      * revert to use default XPU stream for computing
      
      XPUContext now has a null stream by default. If you want to use a separate stream
      (e.g. in async collective communication), you should create a dedicated XPUContext
      and invoke its XPUContext::CreateStream().
      
      * minor
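      Based only on what the commit message above states (a null default stream, and XPUContext::CreateStream() for a dedicated one), a rough usage sketch might look like the following; the header path, constructor argument, and surrounding function are assumptions for illustration.

        #include "paddle/phi/backends/xpu/xpu_context.h"

        void RunAsyncCollective() {
          // A freshly constructed XPUContext computes on the default (null) XPU stream.
          phi::XPUContext comm_ctx(phi::XPUPlace(0));
          // Work that must not share the default compute stream, e.g. async
          // collective communication, gets its own stream on a dedicated context.
          comm_ctx.CreateStream();
          // ... launch communication kernels through comm_ctx ...
        }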
  2. 16 January 2023, 1 commit
  3. 13 January 2023, 5 commits
  4. 12 January 2023, 3 commits
  5. 10 January 2023, 2 commits
  6. 09 January 2023, 2 commits
  7. 06 January 2023, 3 commits
  8. 03 January 2023, 1 commit
  9. 27 December 2022, 1 commit
  10. 26 December 2022, 1 commit
    • fix dlrm qps problem (#49171) · c8f76337
      Committed by ykkk2333
      * migrate shape, sgd, split, sign xpu kernels to phi, test=kunlun
      
      * fix dlrm throughput problem, test=kunlun
  11. 23 December 2022, 2 commits
  12. 22 December 2022, 1 commit
  13. 20 December 2022, 1 commit
  14. 19 December 2022, 2 commits
  15. 17 December 2022, 1 commit
  16. 16 December 2022, 1 commit
  17. 15 December 2022, 1 commit
  18. 14 December 2022, 1 commit
  19. 12 December 2022, 1 commit
    • Optimization of Eigh op with ssyevj_batched runtime api (#48560) · 16e364d3
      Committed by 傅剑寒
      * fix codestyle
      
      * add double, complex<float>, and complex<double> dtype support for syevj_batched
      
      * fix use_syevj flag to avoid precision loss when the input dtype of syevj_batched is complex128 in some cases
      
      * optimize eigh in different cases
      
      * fix missing ; bug
      
      * fix use_syevj bug
      
      * fix use_cusolver_syevj_batched flag
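      The flag fixes above amount to a dtype-aware choice between cuSOLVER's batched Jacobi eigensolver and the non-batched path. The following is a hypothetical sketch of that decision only; the function name, variable names, and the size threshold are invented for illustration and are not Paddle's actual code.

        // Hypothetical dispatch sketch: prefer the batched Jacobi solver for many
        // small problems, but avoid it where the commit reports precision loss
        // (complex128 inputs in some cases).
        bool UseCusolverSyevjBatched(bool is_complex128, int batch_size, int n) {
          bool use_cusolver_syevj_batched = (batch_size > 1 && n <= 32);  // threshold is illustrative
          if (is_complex128) {
            use_cusolver_syevj_batched = false;  // fall back to the non-batched path
          }
          return use_cusolver_syevj_batched;
        }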
  20. 09 December 2022, 3 commits
  21. 08 December 2022, 3 commits
  22. 07 December 2022, 1 commit