1. 10 May, 2023 1 commit
    • [cherry-pick 2.5] Broadcast && Dropout_nd Performance Optimization into Release/2.5 (#53623) · f9ea2301
      Committed by Bo Zhang
      * Support different dtypes of inputs for broadcast for dropout optimization  (#52093)
      
      * change judgement for DropoutGradGPUKernelDriver
      
      * add UnrollerWithoutVecSize and after this Loaddata to be refined
      
      * pass unittest
      
      * use same unroller with XPU
      
      * BroadcastWithInt64Index
      
      * BroadcastDataLoader template partial specialization
      
      * fix compile errs in ROCms
      
      * PR comment
      
      * dropout_nd_optimization (#51479)
      
      * with printf
      
      * add DropOutNdForwardKernel
      
      * PR comment
      
      * Dropout optimize & clean broadcast inT and ElementwiseType (#52969)
      
      * change judgement for DropoutGradGPUKernelDriver
      
      * add UnrollerWithoutVecSize and after this Loaddata to be refined
      
      * pass unittest
      
      * use same unroller with XPU
      
      * BroadcastWithInt64Index
      
      * BroadcastDataLoader template partial specialization
      
      * fix compile errs in ROCms
      
      * clean ElementwiseT and InT for BroadcastKernel
      
      * default axis and clean inT
      
      * remove redundant fast divmod computation
      
      * optimize drop_nd & drop_nd_grad
      
      * optimize BroadcastDataLoader bf16 fp16
      
      * rm InT etc. after merge develop
      
      * delete constexpr for windows ci
      
      * fix conflict
      
      * fix conflic with develop
      
      * fix conflic
      
      * new clean
      
      * clean
      
      * Fix xpu2 kp compile error (#53548)
      
      * fix conflict
      
      * conflict
  2. 08 March, 2023 1 commit
  3. 03 March, 2023 1 commit
  4. 21 February, 2023 1 commit
  5. 17 February, 2023 1 commit
  6. 01 September, 2022 1 commit
  7. 30 August, 2022 1 commit
  8. 21 July, 2022 1 commit
    • [ Phi ] svd transfer (#44392) · ba89a3d3
      Committed by xiongkun
      * svd cpu forward
      
      * svd gpu forward
      
      * transfer the backward of svd
      
      * remove cusolver in svd_grad
      
      * svd kernel bug fix
      
      * fix bugs
      
      * fix bugs.
      
      * fix bug
  9. 21 June, 2022 1 commit
  10. 05 June, 2022 1 commit
  11. 19 April, 2022 1 commit
  12. 15 April, 2022 1 commit
  13. 17 March, 2022 1 commit
  14. 16 March, 2022 1 commit
  15. 15 March, 2022 1 commit
  16. 14 March, 2022 1 commit
    • 【phi】migrate matrix_rank to phi (#40074) · b9d4285b
      Committed by crystal
      * migrate matrix_rank to phi
      
      * migrate eigh and matrix_rank to phi
      
      * fix matrix_rank
      
      * optimize code
      
      * move matrix_rank to phi
      
      * add max functor
      
      * migrate matrix_rank to phi
      
      * optimize code