1. 20 Jun, 2023 1 commit
  2. 06 Apr, 2023 1 commit
    • Move fused_attention op to phi [migrate forward GPU OpKernel] (#51743) · a7ec8958
      Committed by Sonder
      * add kernel functions
      
      * update kernel functions
      
      * update func parameters' name
      
      * create codes for gpu device
      
      * adjust file locations
      
      * fix include error
      
      * remove dependent files to phi/
      
      * restore fused_attention_op.cu
      
      * fix dependence errors
      
      * fix dependence errors
      
      * fix include error
      
      * fix all dependence errors[build success]
      
      * remove useless include
      
      * recover useless include
      
      * use phi::ToNCCLDataType
      
      * fix namespace
      
      * update new register code
      
      * fix error in fused_gemm_epilogue_utils
      
      * fix error in FusedAttentionKernel parm
      
      * finish fused_attention register code[build success]
      
      * add paddle::optional
      
      * add sig file
      
      * fix build error
      
      * fix a include error
      
      * update CMakeLists
      
      * fix parameter sequence
      
      * add include file
      
      * update #if before include
      
      * fix grammar error
      
      * update codes for DropoutParam
      
      * remove const cast
      
      * trans some fluid api to phi api
      
      * add #if
      
      * update test code
      
      * update test codes
      
      * recover test codes
      
      * trans fused_attention to fluid
      
      * move #endif to end
      
      * move #endif
      
      * delete useless files
      
      * use fused attention utils and recover random seed
      
      * remove fluid include in phi
  3. 06 Mar, 2023 1 commit
    • [phi decoupling] decouple dependency to device_context in phi (Part 1) (#50865) · a1006b2b
      Committed by Huang Jiyi
      * move DeviceContextPool to phi
      
      * add EmplaceExternalContextFunc
      
      * update namespace
      
      * update cmake
      
      * fix bugs and create context_pool_impl.h
      
      * replace platform::is_xxx_place
      
      * fix bugs
      
      * update generator
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix enforce usage
      
      * Revert "fix enforce usage"
      
      This reverts commit 5f521f08a69713cee506e64a00ec6d9fba709e27.
      
      * fix bugs
      
      * rm XPUDeviceContext and CustomDeviceContext
      
      * fix bugs
      
      * fix context init bug
      
      * fix bugs after merge
      
      * fix bugs
      
      * fix name
      
      * fix mutable_data
      
      * update and fix bugs
      
      * fix bugs
      
      * update
      
      * fix bugs
      
      * fix name
      
      * fix bugs
      
      * merge
      
      * fix bugs
      
      * create context_pool in phi/backends
      
      * create context_pool in phi/backends
      
      * fix bugs
      
      * fix xpu bugs
      
      * fix rocm bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix xpu bugs
      
      * update
      
      * update
      
      * fix bugs
      
      * fix bugs
  4. 02 Mar, 2023 1 commit
  5. 03 Feb, 2023 1 commit
  6. 02 Dec, 2022 1 commit
    • Split common funcs from reduction and structure modification (#46970) · ef575d6a
      Committed by Bo Zhang
      * profile reduce kernel for fp16 and reduceHigherdim
      
      * use reinterpret_cast
      
      * fix for CI on ROCm
      
      * add Macro for ROCm
      
      * ROCm CI config
      
      * ROCm CI config
      
      * unit test repair
      
      * pull
      
      * add common_funcs.h
      
      * reduceType
      
      * Update reduce_function.h
      
      * not higher
      
      * rename
  7. 21 Nov, 2022 1 commit
  8. 18 Nov, 2022 1 commit
  9. 10 Nov, 2022 1 commit
  10. 31 Oct, 2022 1 commit
  11. 20 Sep, 2022 1 commit
  12. 23 Aug, 2022 1 commit
  13. 07 Jun, 2022 1 commit
  14. 05 Jun, 2022 1 commit
  15. 09 May, 2022 1 commit
  16. 18 Apr, 2022 1 commit
  17. 14 Apr, 2022 1 commit
  18. 12 Apr, 2022 1 commit
    • [KP] Add Logical/compare/bitwise registry & UT (#40802) · 3749198e
      Committed by Lijunhui
      * init commit no push
      
      * collect compile errors
      
      * bitwise UT
      
      * fix compile problem
      
      * cancel comments
      
      * restore miss deletion
      
      * fix compilation
      
      * fix UT
      
      * NO stash in multiple branch at the same times
      
      * fix error
      
      * combine .cu from gpu and kps
      
      * replace gpu by kps
      
      * fix by Chen-weihang
      
      * Revert "Fix kps compile error in Junhui logic compare bitwise"
      
      * fix backend test
      
      * rm comments
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
  19. 03 Apr, 2022 1 commit
    • add maximum limit for grid of index_select (#41127) · af8d2482
      Committed by FlyingQianMM
      * limit grid dim for index select
      
      * mv LimitGridDim into gpu_launch_config.h
      
      * fix conflicts
      
      * fix conflicts
      
      * fix code style
      
      * set block to 256
      
      * fix grid setting
      
      * set dtype of block_dim to unsigned int
  20. 02 Apr, 2022 1 commit
  21. 25 Mar, 2022 1 commit
  22. 24 Mar, 2022 1 commit
  23. 17 Mar, 2022 1 commit
  24. 08 Mar, 2022 1 commit
  25. 07 Mar, 2022 1 commit
    • [Phi] Remove storage deps of empty (#40136) · b46e49de
      Committed by Chen Weihang
      * remove storage deps of empty
      
      * remove invalid empty method
      
      * remove error empty using
      
      * fix test_sparse_utils_dev_api
      
      * revert some sparse change
      
      * add memset for conv grad
      
      * resolve conflict
      
      * resolve conflict
      
      * resolve conflict
  26. 04 Mar, 2022 1 commit
    • [phi]move reduce gpu impl funcs into pten/kernels/funcs (#39990) · e2e2d531
      Committed by chentianyu03
      * move reduce gpu impl funcs into pten/kernels/funcs
      
      * change reduce header name and namespace
      
      * fix spell word error
      
      * change mutable_data to dev_ctx.Alloc
      
      * modify place to devcontext
      
      * format code style
      
      * fix build error
      
      * fix build error
      
      * fix conflict
  27. 03 Mar, 2022 1 commit
  28. 20 Feb, 2022 2 commits
  29. 19 Feb, 2022 1 commit
    • [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Committed by Aurelius84
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile successfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix test file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
  30. 17 Feb, 2022 1 commit
  31. 11 Feb, 2022 1 commit
  32. 09 Feb, 2022 1 commit
  33. 08 Feb, 2022 1 commit
  34. 06 Feb, 2022 1 commit
  35. 29 Jan, 2022 1 commit
    • [PTen] Tidy pten core headers (#39188) · dd990981
      Committed by Chen Weihang
      * open header for custom kernel
      
      * add core utils
      
      * tidy core code
      
      * tidy header
      
      * tidy include
      
      * tidy namespace
      
      * resolve conflict
      
      * fix unittest and coverage
      
      * remove platform using
      
      * resolve conflict
      
      * resolve conflict
      
      * fix digamma namespace error
      
      * fix xpu full kernel error
      
      * fix xpu full kernel error
      
      * polish details
      
      * add place for lib storage
  36. 26 Jan, 2022 1 commit
  37. 25 Jan, 2022 3 commits