1. 11 Apr, 2023 1 commit
    • Cherry pick for fix of operator precision. (#52705) · d1e8b1e2
      Committed by Yiqun Liu
      * Fix scale kernel for low precision, cherry pick #50998.
      
      * Fix the FP16 precision problem of add_n. (#50129)
      
      * Change squared_l2_norm to reuse ReduceKernel and register fp16 and bf16 kernels; cherry-picked from #48315.
      
      * Cherry-pick the fix of MPTypeTrait in KP, which is implemented in #50993.
      
      * Cherry-pick the multi-precision support of AdamW for bf16, #48041.
      
      * Fix compiling error.
      
      * Cherry-pick the fix of CubTensorReduceImpl for bfloat16 in #50993.
      
      * Fix unittest.
      
      ---------
      Co-authored-by: liuruyan <44316842+liuruyan@users.noreply.github.com>
  2. 05 Aug, 2022 1 commit
    • move fft kernels to phi (#44714) · 153f1138
      Committed by Feiyu Chan
      * move fft kernels to phi, done with cufft, pocketfft, mkl_cdft, hipfft
      * make stft_op use fft from phi/kernels/funcs, clean code
  3. 21 Jun, 2022 1 commit
  4. 05 Jun, 2022 1 commit
  5. 12 Mar, 2022 1 commit
  6. 01 Mar, 2022 1 commit
    • [bf16] add bf16 kernel: scale gather sum (#39683) · 6d26b332
      Committed by zhangbo9674
      * add scale gather sum
      
      * refine CUDA_ATOMIC_WRAPPER ADD for bf16
      
      * add gather unittest
      
      * solve conflict
      
      * add scale unittest
      
      * add sum unittest
      
      * solve conflict
      
      * refine gather unittest
      
      * refine unittest
  7. 22 Feb, 2022 1 commit
  8. 20 Feb, 2022 2 commits
  9. 11 Feb, 2022 1 commit
  10. 08 Feb, 2022 1 commit
  11. 28 Jan, 2022 1 commit
  12. 27 Jan, 2022 1 commit
  13. 24 Jan, 2022 1 commit
    • [Refactoring Tensor PR #5] replace storage with pten allocation (#39085) · a56e16a7
      Committed by 石晓伟
      * updates callers, test=develop
      
      * updates tensor, test=develop
      
      * fixes errors, test=develop
      
      * remove some dtypes, test=develop
      
      * fix errors in the base storage modification, test=develop
      
      * fixes a bug, test=develop
      
      * fixes the bugs in push the whole, test=develop
      
      * updates, test=develop
      
      * update
      
      * update, test=develop
      
      * fixes the mac-py3 CI, test=develop
      
      * remove the storage impl, test=develop
      
      * updates some codes, test=develop
      
      * update, test=develop
      
      * updates pten allocation, test=develop
  14. 20 Jan, 2022 1 commit
  15. 18 Jan, 2022 2 commits
  16. 15 Jan, 2022 1 commit
  17. 13 Jan, 2022 2 commits
    • [PTen] Rename kernel register macro (#38861) · 158bf13f
      Committed by Chen Weihang
      * rename register macro
      
      * fix error changing
      
      * fix format error
    • [pten] Remove pten/include dir files (#38878) · 7e0292ea
      Committed by chentianyu03
      * move dot_dev api into dot_kernel.h
      
      * add infermeta header
      
      * modify to dot kernel in dot_op.h
      
      * move conj dev api into complex_kernel.h
      
      * move sign dev api into sign_kernel.h
      
      * move scale dev api into kernel.h and remove infermeta.h
      
      * rm paddle/pten/include/math.h
      
      * rm paddle/pten/include/math.h
      
      * rm include dir
      
      * rm paddle/pten/include/math.h
      
      * fix conflict with develop branch
      
      * rm devContext in conj_op.h
      
      * add the missing complex_kernel header
  18. 12 Jan, 2022 1 commit
  19. 04 Jan, 2022 1 commit
  20. 21 Dec, 2021 2 commits
  21. 20 Dec, 2021 1 commit