1. 14 Aug, 2023 1 commit
    • Add rmsnorm residual bias add and quant (#55965) · 2ac6a7e4
      MarDino committed
      * add rmsnorm residual bias add and quant
      
      * refine python interface
      
      * add rmsnorm unittest
      
      * Add layernorm
      
      * fix layernorm unittest
      
      * refine unittest
      
      * fix example code
      
      * fix review comment
  2. 10 Aug, 2023 1 commit
    • Add variable_length_memory_efficient_attention (#55400) · 4036c937
      lzy committed
      * add variable_length_memory_efficient_attention
      * update variable_length_memory_efficient_attention unittest
      * update variable_length_mem_eff_attn's docs and unittest
      * update variable_length_mem_eff_attn's docs
      * Update test_variable_length_memory_efficient_attention.py
      * Update variable_length_memory_efficient_attention.cu
      * fix codestyle
      * fix variable_length_fmha's docs and unittest
      * fix variable_length_fmha's docs
  3. 08 Aug, 2023 2 commits
  4. 03 Aug, 2023 1 commit
  5. 26 Jul, 2023 1 commit
  6. 13 Jul, 2023 1 commit
  7. 11 Jul, 2023 1 commit
  8. 03 Jul, 2023 1 commit
  9. 29 Jun, 2023 1 commit
    • Add fused_rope forward op (#54351) · a215c46a
      niuliling123 committed
      * style
      
      * more
      
      * update ctest
      
      * Update legacy_backward.yaml
      
      * Update legacy_ops.yaml
      
      * Update legacy_ops.yaml
      
      * update
      
      * update
      
      * update for move
  10. 28 Jun, 2023 1 commit
  11. 26 Jun, 2023 1 commit
    • remove ops from OpsWithFluidKernelNeedMoveToPhi set (#54007) · 733eca85
      Sonder committed
      * remove ops from OpsWithFluidKernelNeedMoveToPhi set
      
      * open static build flag
      
      * OpsWithFluidKernelNeedMoveToPhi
      
      * open new_executor_static_build
      
      * add infermate for cudnn_lstm
      
      * fix
      
      * update
      
      * fix
      
      * update
      
      * update
      
      * update
      
      * fix pow2 decay
      
      * fix pow2 decay
      
      * recover analysis_predictor.cc
      
      * fix pow2 decay
      
      * fix cudnn lstm
      
      * add output register info for svd
      
      * fix pow2_decay_with_linear_warmup_kernel
      
      * recover test lstm cudnn
      
      * recover svg register codes
      
      * fix register info
      
      * fix reduce sum register info
      
      * add output info for adadelta
      
      * add output info for adadelta
      
      * add output info for adamax
      
      * fix complex abs register info
      
      * add register info for cudnn_lstm_grad
      
      * recover
      
      * fix lstm cudnn
      
      * fix
      
      * fix xpu output registe info
      
      * remove std::cout
      
      * add backend
      
      * remove output info in pow2_decay_with_linear_warmup_kernel
      
      * add judgment in TensorShouldBeFakeInitialized
      
      * recover power_
      
      * close new_executor_static_build
      
      * fix set_value_xpu
  12. 16 Jun, 2023 1 commit
  13. 01 Jun, 2023 1 commit
  14. 23 May, 2023 1 commit
  15. 10 May, 2023 1 commit
    • add index_put api (#52886) · f3393f49
      傅剑寒 committed
      * add index_put api
      
      * fix value broadcast in backward and add test case in static
      
      * add timeout=120s for index_put
      
      * add op_compat for index_put
      
      * add inplace index_put test
      
      * add test case when index tensor in indices is int32 when indices.size less than x.dims
      
      * add index_put api backward in cpu place
      
      * add backward test case
      
      * refactor code to delete some duplicated code
      
      * replace reshape with resize for decrease extra memcpy
      
      * add datatype flag in backward yaml
      
      * fix bug in documentation
      
      * Update python/paddle/tensor/manipulation.py
      
      ---------
      Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
  16. 24 Apr, 2023 1 commit
  17. 19 Apr, 2023 1 commit
  18. 11 Apr, 2023 2 commits
  19. 04 Apr, 2023 1 commit
  20. 27 Mar, 2023 1 commit
  21. 24 Mar, 2023 1 commit
    • Memory Efficient Attention (#51867) · e5ad3859
      ZhangDY-6483 committed
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add depense
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: liuyuang <liuyuang@baidu.com>
  22. 22 Mar, 2023 1 commit
    • Add fused_linear_param_grad_add_kernel (#51805) · f59c5d8b
      sneaxiy committed
      * add fused_linear_param_grad_add_kernel
      
      * fix compile error
      
      * remove flag
      
      * fix ci compile error
      
      * fix ci compile error
      
      * revert pylayer revision
      
      * fix ci ut
      
      * improve performance
  23. 08 Mar, 2023 1 commit
  24. 06 Mar, 2023 1 commit
  25. 03 Mar, 2023 1 commit
  26. 01 Mar, 2023 1 commit
  27. 17 Feb, 2023 1 commit
    • Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2
      yuehuayingxueluo committed
      * rename multi_tensor_adam to fused_adam
      
      * fix some bugs
      
      * fix CI coverage
      
      * rename test_fused_adam.py
      
      * fix some bug
      
      * add test_fused_adam_op.py
      
      * fix some bugs
      
      * fix fused_adam_op.cc
      
      * fix CI bugs
      
      * fix CI bug
      
      * fix CI bug
  28. 16 Feb, 2023 1 commit
  29. 09 Feb, 2023 1 commit
    • Add MultiTenosrAdam OP (#49220) · 10654c77
      yuehuayingxueluo committed
      * add multi_tenosr_adam
      
      * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py
      
      * fix adam.py optimizer.py
      
      * fix adamw.py
      
      * fix test_multi_tensor_adam.py
      
      * fix CI bug
      
      * fix CI coverage
      
      * fix ci bug
      
      * fix betapow
      
      * fix some bugs
      
      * fix test_adamw_op.py
      
      * fix CI coverage
      
      * fix multi_tensor_adam_kernel.cc
      
      * fix CI bug
      
      * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py
      
      * fix code style
      
      * update C++ parts
      
      * remove python parts modification temporarily
      
      * add C++ ut
      
      * update betapow copy code logic
      
      * fix ci ut
      
      * fix windows ci
      
      * fix coverage ci
      
      * improve coverage rate
      
      ---------
      Co-authored-by: sneaxiy <sneaxiy@126.com>
  30. 23 Dec, 2022 1 commit
  31. 22 Dec, 2022 1 commit
  32. 09 Dec, 2022 1 commit
  33. 17 Nov, 2022 1 commit
  34. 02 Nov, 2022 1 commit
  35. 01 Nov, 2022 1 commit
  36. 31 Oct, 2022 1 commit
  37. 12 Oct, 2022 1 commit
  38. 19 Sep, 2022 1 commit
    • [PHI]Move sum op to PHI (#45860) · 4b3f2af1
      YuanRisheng committed
      * move sum
      
      * fix ci bugs
      
      * fix ci bugs
      
      * fix set_lod bugs
      
      * fix infershape bugs
      
      * fix ci bugs
      
      * fix ci unittest bug
      
      * fix ci bugs
      
      * perfect code
      
      * update code according comment
      
      * add unittest
      
      * fix ci bugs