1. 14 Mar, 2023 1 commit
  2. 13 Mar, 2023 1 commit
  3. 10 Mar, 2023 2 commits
  4. 09 Mar, 2023 3 commits
  5. 07 Mar, 2023 1 commit
  6. 06 Mar, 2023 1 commit
    • [phi decoupling] decouple dependency to device_context in phi (Part 1) (#50865) · a1006b2b
      Huang Jiyi committed
      * move DeviceContextPool to phi
      
      * add EmplaceExternalContextFunc
      
      * update namespace
      
      * update cmake
      
      * fix bugs and create context_pool_impl.h
      
      * replace platform::is_xxx_place
      
      * fix bugs
      
      * update generator
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix enforce usage
      
      * Revert "fix enforce usage"
      
      This reverts commit 5f521f08a69713cee506e64a00ec6d9fba709e27.
      
      * fix bugs
      
      * rm XPUDeviceContext and CustomDeviceContext
      
      * fix bugs
      
      * fix context init bug
      
      * fix bugs after merge
      
      * fix bugs
      
      * fix name
      
      * fix mutable_data
      
      * update and fix bugs
      
      * fix bugs
      
      * update
      
      * fix bugs
      
      * fix name
      
      * fix bugs
      
      * merge
      
      * fix bugs
      
      * create context_pool in phi/backends
      
      * create context_pool in phi/backends
      
      * fix bugs
      
      * fix xpu bugs
      
      * fix rocm bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix xpu bugs
      
      * update
      
      * update
      
      * fix bugs
      
      * fix bugs
  7. 03 Mar, 2023 2 commits
  8. 02 Mar, 2023 2 commits
  9. 01 Mar, 2023 1 commit
    • [XPU] Add kernels for VITDET (#50992) · 798b527c
      duanyanhui committed
      * add support of int64 add for xpu
      
      * add transpose support for int64
      
      * add randperm kernel
      
      * fix randperm
      
      * add distribute_fpn_proposal kernel
      
      * fix comment
      
      * add reduce_sum_int32
  10. 28 Feb, 2023 1 commit
  11. 27 Feb, 2023 2 commits
  12. 26 Feb, 2023 1 commit
  13. 22 Feb, 2023 1 commit
  14. 21 Feb, 2023 2 commits
  15. 20 Feb, 2023 1 commit
  16. 17 Feb, 2023 2 commits
  17. 16 Feb, 2023 2 commits
    • [Phi decouple] move layer_norm_kernel.cu.h to phi (#50506) · 8910bb4a
      Huang Jiyi committed
      * move layer_norm_kernel.cu.h to phi
      
      * fix bugs
      
      * fix namespace
      
      * fix bugs
      
      * fix CI-Windows
      
      * replace mutable_data
      
      * fix bugs
      
      * fix bugs
    • [phi decoupling] remove variable.h in phi (#50407) · 905cefd4
      Huang Jiyi committed
      * move variable_utils from phi_api_utils to fluid
      
      * fix comment
      
      * update include
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * update
      
      * update
      
      * fix CI-Windows-OpenBLAS
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * update include
      
      * move variable_utils to phi_utils
      
      * fix namespace
  18. 15 Feb, 2023 1 commit
  19. 14 Feb, 2023 2 commits
  20. 10 Feb, 2023 1 commit
  21. 09 Feb, 2023 2 commits
    • [PHI decoupling] move strided_memcpy.h to phi (#50346) · 17318c1a
      Huang Jiyi committed
      * decouple strided_memcpy
      
      * move strided_memcpy
      
      * move strided_memcpy to phi
      
      * fix namespace
      
      * update
      
      * fix gpu compile bugs
    • Add MultiTenosrAdam OP (#49220) · 10654c77
      yuehuayingxueluo committed
      * add multi_tenosr_adam
      
      * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py
      
      * fix adam.py optimizer.py
      
      * fix adamw.py
      
      * fix test_multi_tensor_adam.py
      
      * fix CI bug
      
      * fix CI coverage
      
      * fix ci bug
      
      * fix betapow
      
      * fix some bugs
      
      * fix test_adamw_op.py
      
      * fix CI coverage
      
      * fix multi_tensor_adam_kernel.cc
      
      * fix CI bug
      
      * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py
      
      * fix code style
      
      * update C++ parts
      
      * remove python parts modification temporarily
      
      * add C++ ut
      
      * update betapow copy code logic
      
      * fix ci ut
      
      * fix windows ci
      
      * fix coverage ci
      
      * improve coverage rate
      
      ---------
      Co-authored-by: sneaxiy <sneaxiy@126.com>
  22. 08 Feb, 2023 1 commit
  23. 07 Feb, 2023 1 commit
  24. 03 Feb, 2023 1 commit
  25. 02 Feb, 2023 2 commits
  26. 01 Feb, 2023 3 commits
    • Fix div 0 error of case11: paddle.nn.functional.max_pool1d/max_pool2d/max_pool3d (#50010) · 3ab6faa8
      RedContritio committed
      * add stride check for MaxPool
      
      * add unittests
    • Combination of multiple paddle::memory::allocate operation into one for ops (#49126) · bdae5481
      limingshu committed
      * A leap of try for cudaLaunchCooperativeKernel
      
      * fix bugs
      
      * Totally replace the lar cuda kernel
      
      * Fix bugs
      
      * fix code according to comments
      
      * fix codes according to review comments
      
      * adding some function overload
      
      * relocate the power operation.
      
      * add bf16 support for index select relevant ops
      
      * revert bf16 type change.
      
      * add changes for more op
      
      * fix code writing bugs
    • H2D data transfer optimization for split kernel (#49086) · 057ba778
      limingshu committed
      * profile reduce kernel for fp16 and reduceHigherdim
      
      * use reinterpret_cast
      
      * fix for CI on ROCm
      
      * add Macro for ROCm
      
      * ROCm CI config
      
      * ROCm CI config
      
      * unit test repair
      
      * pull
      
      * add common_funcs.h
      
      * reduceType
      
      * Update reduce_function.h
      
      * not higher
      
      * rename
      
      * implement of matmul using cublasLt instead of cublas
      
      * cublasLt bugfix
      
      * Update matmul_kernel_impl.h
      
      * Update matmul_kernel_impl_via_blasLt.h
      
      * for-loop-algo
      
      * PR comments changes
      
      * add macro
      
      * ci unused variable isCublasLt
      
      * ci unused variable isCublasLt macro
      
      * split matmul to autotune
      
      * rewrite the split kernel with segmented_array
      
      * rewrite the split kernel with segmented_array
      
      * rewrite the split kernel with segmented_array
      
      * add some method for cuda_graph
      
      * fix bugs for rocm
      
      * change for ci-error
      
      * I don't know why ci-model-benchmark reports an error, so I restored the original code to see whether it works.
      
      * add some changes for passing mode_benchmark and coverage ci
      
      * fix ci error
      
      * fix ci-rocm error
      
      * add some changes for header
      
      ---------
      Co-authored-by: zhangbopd <1299246947@qq.com>
      Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>