1. 11 Jul, 2023 1 commit
  2. 03 Jul, 2023 1 commit
  3. 29 Jun, 2023 1 commit
    • Add fused_rope forward op (#54351) · a215c46a
      niuliling123 committed
      * style
      
      * more
      
      * update ctest
      
      * Update legacy_backward.yaml
      
      * Update legacy_ops.yaml
      
      * Update legacy_ops.yaml
      
      * update
      
      * update
      
      * update for move
  4. 28 Jun, 2023 1 commit
  5. 26 Jun, 2023 1 commit
    • remove ops from OpsWithFluidKernelNeedMoveToPhi set (#54007) · 733eca85
      Sonder committed
      * remove ops from OpsWithFluidKernelNeedMoveToPhi set
      
      * open static build flag
      
      * OpsWithFluidKernelNeedMoveToPhi
      
      * open new_executor_static_build
      
      * add infermeta for cudnn_lstm
      
      * fix
      
      * update
      
      * fix
      
      * update
      
      * update
      
      * update
      
      * fix pow2 decay
      
      * fix pow2 decay
      
      * recover analysis_predictor.cc
      
      * fix pow2 decay
      
      * fix cudnn lstm
      
      * add output register info for svd
      
      * fix pow2_decay_with_linear_warmup_kernel
      
      * recover test lstm cudnn
      
      * recover svd register codes
      
      * fix register info
      
      * fix reduce sum register info
      
      * add output info for adadelta
      
      * add output info for adadelta
      
      * add output info for adamax
      
      * fix complex abs register info
      
      * add register info for cudnn_lstm_grad
      
      * recover
      
      * fix lstm cudnn
      
      * fix
      
      * fix xpu output register info
      
      * remove std::cout
      
      * add backend
      
      * remove output info in pow2_decay_with_linear_warmup_kernel
      
      * add judgment in TensorShouldBeFakeInitialized
      
      * recover power_
      
      * close new_executor_static_build
      
      * fix set_value_xpu
  6. 16 Jun, 2023 1 commit
  7. 01 Jun, 2023 1 commit
  8. 23 May, 2023 1 commit
  9. 10 May, 2023 1 commit
    • add index_put api (#52886) · f3393f49
      傅剑寒 committed
      * add index_put api
      
      * fix value broadcast in backward and add test case in static
      
      * add timeout=120s for index_put
      
      * add op_compat for index_put
      
      * add inplace index_put test
      
      * add test case when index tensor in indices is int32 when indices.size less than x.dims
      
      * add index_put api backward in cpu place
      
      * add backward test case
      
      * refactor code to delete some duplicated code
      
      * replace reshape with resize for decrease extra memcpy
      
      * add datatype flag in backward yaml
      
      * fix bug in documentation
      
      * Update python/paddle/tensor/manipulation.py
      
      ---------
      Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
  10. 24 Apr, 2023 1 commit
  11. 19 Apr, 2023 1 commit
  12. 11 Apr, 2023 2 commits
  13. 04 Apr, 2023 1 commit
  14. 27 Mar, 2023 1 commit
  15. 24 Mar, 2023 1 commit
    • Memory Efficient Attention (#51867) · e5ad3859
      ZhangDY-6483 committed
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependency
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: liuyuang <liuyuang@baidu.com>
  16. 22 Mar, 2023 1 commit
    • Add fused_linear_param_grad_add_kernel (#51805) · f59c5d8b
      sneaxiy committed
      * add fused_linear_param_grad_add_kernel
      
      * fix compile error
      
      * remove flag
      
      * fix ci compile error
      
      * fix ci compile error
      
      * revert pylayer revision
      
      * fix ci ut
      
      * improve performance
  17. 08 Mar, 2023 1 commit
  18. 06 Mar, 2023 1 commit
  19. 03 Mar, 2023 1 commit
  20. 01 Mar, 2023 1 commit
  21. 17 Feb, 2023 1 commit
    • Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2
      yuehuayingxueluo committed
      * rename multi_tensor_adam to fused_adam
      
      * fix some bugs
      
      * fix CI coverage
      
      * rename test_fused_adam.py
      
      * fix some bug
      
      * add test_fused_adam_op.py
      
      * fix some bugs
      
      * fix fused_adam_op.cc
      
      * fix CI bugs
      
      * fix CI bug
      
      * fix CI bug
  22. 16 Feb, 2023 1 commit
  23. 09 Feb, 2023 1 commit
    • Add MultiTenosrAdam OP (#49220) · 10654c77
      yuehuayingxueluo committed
      * add multi_tenosr_adam
      
      * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py
      
      * fix adam.py optimizer.py
      
      * fix adamw.py
      
      * fix test_multi_tensor_adam.py
      
      * fix CI bug
      
      * fix CI coverage
      
      * fix ci bug
      
      * fix betapow
      
      * fix some bugs
      
      * fix test_adamw_op.py
      
      * fix CI coverage
      
      * fix multi_tensor_adam_kernel.cc
      
      * fix CI bug
      
      * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py
      
      * fix code style
      
      * update C++ parts
      
      * remove python parts modification temporarily
      
      * add C++ ut
      
      * update betapow copy code logic
      
      * fix ci ut
      
      * fix windows ci
      
      * fix coverage ci
      
      * improve coverage rate
      
      ---------
      Co-authored-by: sneaxiy <sneaxiy@126.com>
  24. 23 Dec, 2022 1 commit
  25. 22 Dec, 2022 1 commit
  26. 09 Dec, 2022 1 commit
  27. 17 Nov, 2022 1 commit
  28. 02 Nov, 2022 1 commit
  29. 01 Nov, 2022 1 commit
  30. 31 Oct, 2022 1 commit
  31. 12 Oct, 2022 1 commit
  32. 19 Sep, 2022 1 commit
    • [PHI]Move sum op to PHI (#45860) · 4b3f2af1
      YuanRisheng committed
      * move sum
      
      * fix ci bugs
      
      * fix ci bugs
      
      * fix set_lod bugs
      
      * fix infershape bugs
      
      * fix ci bugs
      
      * fix ci unittest bug
      
      * fix ci bugs
      
      * perfect code
      
      * update code according comment
      
      * add unittest
      
      * fix ci bugs
  33. 09 Sep, 2022 1 commit
  34. 07 Sep, 2022 1 commit
  35. 30 Aug, 2022 1 commit
    • [phi] Transfer coalesce_tensor to phi (#45478) · cf9d651b
      HongyuJia committed
      * add coalesce_tensor kernel
      
      * polish coalesce_tensor kernel
      
      * add sig and InferMeta
      
      * add testcase
      
      * add legacy_api.yaml
      
      * fix infermeta
      
      * fix yaml
      
      * fix kernel implementation
      
      * add compile dependency of phi/kernels
      
      * fix MetaConfig
      
      * add python api
      
      * add and fix testcase
      
      * rnn.py add import
      
      * change _C_ops.coalesce_tensor
      
      * remove useless comments
      
      * add SetBackend
      
      * restore XPU kernel temporarily
      
      * fix code according to PR comments
  36. 16 Aug, 2022 2 commits
    • [Phi] Move amp ops into phi (#45079) · b4f67757
      Chen Weihang committed
      * move check finite and unscale kernel into phi
      
      * move infershape into phi
      
      * move update_loss_scaling kernel into phi
      
      * remove original kernels
      
      * move update loss scaling infershape into phi
      
      * add header for xpu and npu
      
      * solve coverage failed
      
      * fix npu test failed
      
      * remove mutable data in cu file
      
      * fix new executor failed
      
      * add valid check for meta tensor output
    • [geometric]Add paddle.geometric.send_uv API (#44848) · 88724a53
      Siming Dai committed
      * initial commit
      
      * fix op maker bug
      
      * fix mul grad bug
      
      * add unittest
      
      * fix add grad bug, add cpu kernel
      
      * add paddle.geometric.message_passing
      
      * add paddle.geometric.send_uv api, add unittest
      
      * add fp16 judgement
      
      * fix file typo, move compute_type to message_op
      
      * add impl file
      
      * fix unittest timeout time
      
      * add review revise
  37. 12 Aug, 2022 1 commit
    • [geometric]Add paddle.geometric.send_ue_recv API (#43174) · 615b15a3
      Siming Dai committed
      * add init file
      
      * add op definition and infermeta
      
      * add kernel definition funcs
      
      * add broadcast infer shape
      
      * add gpu forward kernel
      
      * delete SUB and DIV
      
      * add x_grad
      
      * add template
      
      * add e_grad for min and max
      
      * fix small bug
      
      * temp commit
      
      * temp commit
      
      * add e_grad for sum and mean
      
      * fix some compile bug
      
      * fix compile bugs
      
      * fix compile problem
      
      * add sum forward unittest
      
      * fix broadcast error, add kernel sig, register e_grad, change unit test
      
      * fix grad
      
      * add temp grad fix
      
      * temp commit
      
      * add min max unittest
      
      * add max, min unittest, fix mul bug
      
      * add cpu forward sum and mean
      
      * add forward min max, fix mean unittest
      
      * add cpu backward min max
      
      * fix code-style
      
      * add backward sum mean
      
      * fix rocm ci
      
      * set unittest timeout
      
      * fix bug of x broadcast to e, gpu grad
      
      * fix bug of x broadcast to e, cpu grad
      
      * rename BOOST_GET_CONST macro
      
      * fix rocm ci
      
      * mv graph_send_e_recv to graph_send_ue_recv
      
      * move out_size to IntArray
      
      * add eager op test
      
      * fix max pool type bug, add unittest for api
      
      * revise api doc
      
      * add fp16 for atomic min and max, add unittest
      
      * add unittest
      
      * add fp16 support for graph_send_recv
      
      * fix unittest fp16 bug
      
      * change OutSizeTensor to Out_size
      
      * move E to Y
      
      * add copyright, fix comment
      
      * review code
      
      * fix thread block size
      
      * fix thread block size
      
      * change api attribute name: pool_type to reduce_op, compute_type to message_op
      
      * change api attribute name, move pool_type to reduce_op, move compute_type to message_op
  38. 08 Aug, 2022 1 commit