1. 30 Sep, 2022 1 commit
    • support pure bfloat16 for more ops (#46364) · b7b231a6
      sneaxiy committed
      * support pure bfloat16
      
      * support bf16 linear
      
      * update PR to pass CI
      
      * tiny fix where_grad_kernel.cu
      
      * add bfloat16 to selu_grad to pass CI
      
      * fix selu grad compilation error
  2. 12 Aug, 2022 1 commit
    • [geometric] Add paddle.geometric.send_ue_recv API (#43174) · 615b15a3
      Siming Dai committed
      * add init file
      
      * add op definition and infermeta
      
      * add kernel definition funcs
      
      * add broadcast infer shape
      
      * add gpu forward kernel
      
      * delete SUB and DIV
      
      * add x_grad
      
      * add template
      
      * add e_grad for min and max
      
      * fix small bug
      
      * temp commit
      
      * temp commit
      
      * add e_grad for sum and mean
      
      * fix some compile bug
      
      * fix compile bugs
      
      * fix compile problem
      
      * add sum forward unittest
      
      * fix broadcast error, add kernel sig, register e_grad, change unit test
      
      * fix grad
      
      * add temp grad fix
      
      * temp commit
      
      * add min max unittest
      
      * add max, min unittest, fix mul bug
      
      * add cpu forward sum and mean
      
      * add forward min max, fix mean unittest
      
      * add cpu backward min max
      
      * fix code-style
      
      * add backward sum mean
      
      * fix rocm ci
      
      * set unittest timeout
      
      * fix bug of x broadcast to e, gpu grad
      
      * fix bug of x broadcast to e, cpu grad
      
      * rename BOOST_GET_CONST macro
      
      * fix rocm ci
      
      * mv graph_send_e_recv to graph_send_ue_recv
      
      * move out_size to IntArray
      
      * add eager op test
      
      * fix max pool type bug, add unittest for api
      
      * revise api doc
      
      * add fp16 for atomic min and max, add unittest
      
      * add unittest
      
      * add fp16 support for graph_send_recv
      
      * fix unittest fp16 bug
      
      * change OutSizeTensor to Out_size
      
      * move E to Y
      
      * add copyright, fix comment
      
      * review code
      
      * fix thread block size
      
      * fix thread block size
      
      * change api attribute name: pool_type to reduce_op, compute_type to message_op
      
      * change api attribute name, move pool_type to reduce_op, move compute_type to message_op
  3. 26 Jun, 2022 1 commit
  4. 05 Jun, 2022 1 commit
  5. 01 Mar, 2022 1 commit
    • [bf16] add bf16 kernel: scale gather sum (#39683) · 6d26b332
      zhangbo9674 committed
      * add scale gather sum
      
      * refine CUDA_ATOMIC_WRAPPER ADD for bf16
      
      * add gather unittest
      
      * solve conflict
      
      * add scale unittest
      
      * add sum unittest
      
      * solve conflict
      
      * refine gather unittest
      
      * refine unittest
  6. 25 Feb, 2022 1 commit
  7. 24 Feb, 2022 1 commit
  8. 09 Dec, 2021 1 commit
  9. 03 Dec, 2021 1 commit
  10. 19 Nov, 2021 1 commit
    • Add paddle.incubate.graph_send_recv API (#37205) · 39012536
      Siming Dai committed
      * add cpu version, using set: sum, min, max
      
      * add cpu version: mean
      
      * improve cpu code and fix dynamic memory allocation problem
      
      * fix arg error, add index judge, delete fp16
      
      * fix bug in CudaAtomicMax and CudaAtomicMin
      
      * add CUDA version
      
      * fix grad_op bug for index
      
      * add op test, add correct cpu grad op
      
      * Add correct CUDA Mean grad
      
      * [Add] Successful MEAN and SUM
      
      * [Add] Successful MIN and MAX in CPU
      
      * [Add] Successful MIN and MAX in CUDA
      
      * fix windows dtype ci
      
      * fix ROCM ci by adding HIP flag
      
      * rename fused_gather_scatter to send_recv
      
      * unify name as send and recv
      
      * change zero index return time
      
      * add send_recv incubate api
      
      * fix index data type, add unittest case for API
      
      * delete redundant input tensor
      
      * fix en example and docs, add default value in pool_type
      
      * add shape judge and max grid judge
      
      * fix comment
      
      * fix index type bug
      
      * add const &
      
      * fix en docs
      
      * delete numpy in examples
      
      * add unittest for int input
      
      * fix send_recv comment
      
      * change send_recv to graph_send_recv
  11. 01 Jun, 2021 1 commit
  12. 07 Apr, 2021 1 commit
  13. 08 Feb, 2021 1 commit
  14. 25 Dec, 2020 1 commit
  15. 26 Sep, 2020 1 commit
  16. 25 Sep, 2020 1 commit
  17. 24 Sep, 2020 1 commit
  18. 31 Jul, 2018 1 commit
  19. 30 Jul, 2018 1 commit
  20. 03 May, 2018 1 commit
  21. 02 May, 2018 2 commits
  22. 30 Apr, 2018 1 commit
  23. 10 Apr, 2018 2 commits
  24. 28 Feb, 2018 1 commit
  25. 26 Feb, 2018 1 commit
  26. 24 Feb, 2018 1 commit
  27. 12 Feb, 2018 1 commit
  28. 10 Feb, 2018 1 commit
  29. 23 Nov, 2017 1 commit
  30. 18 Sep, 2017 1 commit
  31. 23 Aug, 2017 1 commit
  32. 22 Aug, 2017 2 commits