1. 29 June 2023, 1 commit
    • Add fused_rope forward op (#54351) · a215c46a
      Committed by niuliling123
      * style
      
      * more
      
      * update ctest
      
      * Update legacy_backward.yaml
      
      * Update legacy_ops.yaml
      
      * Update legacy_ops.yaml
      
      * update
      
      * update for move
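      For context: the op fuses the rotary position embedding (RoPE) rotation of q/k into a single kernel. Below is a minimal NumPy sketch of the unfused computation such an op replaces; the function name, shapes, and the rotate-half layout are illustrative assumptions, not the Paddle API.

          import numpy as np

          def rope_reference(x, base=10000.0):
              """Apply rotary position embedding to x of shape [seq_len, head_dim].

              Illustrative reference only: a fused op computes the same rotation
              for q and k in one kernel instead of separate sin/cos/mul/add ops.
              """
              seq_len, head_dim = x.shape
              half = head_dim // 2
              # Per-pair rotation frequencies: theta_i = base^(-2i / head_dim)
              inv_freq = base ** (-np.arange(half) * 2.0 / head_dim)
              pos = np.arange(seq_len)[:, None]       # [seq_len, 1]
              angle = pos * inv_freq[None, :]         # [seq_len, half]
              cos, sin = np.cos(angle), np.sin(angle)
              x1, x2 = x[:, :half], x[:, half:]
              # Rotate each (x1, x2) pair by its position-dependent angle.
              return np.concatenate([x1 * cos - x2 * sin,
                                     x1 * sin + x2 * cos], axis=-1)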
  2. 12 June 2023, 1 commit
  3. 09 June 2023, 1 commit
  4. 23 May 2023, 1 commit
  5. 22 May 2023, 1 commit
    • [dygraph] unify _non_static_mode(), in_dygraph_mode() and in_dynamic_mode() (#53856) · 3794d171
      Committed by Meteor Liu
      * [dygraph] unify _non_static_mode(), in_dygraph_mode() and in_dynamic_mode()
      
      * fixed cyclic reference that caused partial import
      
      * fixed bad change
      
      * fix bad import
      
      * fix UT failures caused by the in_dynamic_mode change
      
      * fixed usage of in_dynamic_mode() or in_dygraph_mode()
      
      * revert python3 to python in .pre-commit-config.yaml
      
      * fix merge conflicts
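      The end state of this series, in user terms: one public check, paddle.in_dynamic_mode(), instead of the internal _non_static_mode() and in_dygraph_mode() helpers. A small sketch of the unified usage (the printed comments describe Paddle 2.x defaults):

          import paddle

          def describe_mode():
              # paddle.in_dynamic_mode() is the public, unified check that
              # replaces the internal _non_static_mode()/in_dygraph_mode()
              # helpers, which were inconsistent across modules.
              if paddle.in_dynamic_mode():
                  return "dynamic (eager) mode"
              return "static graph mode"

          print(describe_mode())    # dynamic by default in Paddle 2.x
          paddle.enable_static()
          print(describe_mode())    # static after enable_static()
          paddle.disable_static()   # restore dynamic mode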
  6. 19 May 2023, 1 commit
    • Add flash attention to speed up fused_gate_attention (#52731) · d29c1f8e
      Committed by limingshu
      * Reorganize the forward code of flash-attention.
      
      * Fix forward.
      
      * Remove some unused code.
      
      * Simplify code and fix backward.
      
      * Change all LOG(INFO) to VLOG and fix the backward.
      
      * Add scale for AF2 flash_attn; many thanks to xreki and shaojie for debugging this code.
      
      * Decrease the effect of debug printing on performance.
      
      * Unify the initialization of flash_attn arguments.
      
      * Rewrite the reshape of temp_mask and temp_bias.
      
      * Add use_flash_attn support in the API.
      
      * Fix compiling error on CI.
      
      * Try to crop the flash-attention lib.
      
      * Correct the condition for whether flash-attn can be used.
      
      * Remove the softmax_out argument.
      
      * Remove is_causal.
      
      * Polish code.
      
      * Fix qkv_transpose_out's shape and scaling of Q * K.
      
      * Update commit of flash-attention.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
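      For reference, what the flash-attention path must reproduce is plain scaled dot-product attention, including the 1/sqrt(head_dim) scaling of Q * K that one of the bullets above fixes. A hedged NumPy sketch (shapes and the bias argument are illustrative assumptions; a flash-attention kernel computes the same result tile by tile without materializing the full score matrix):

          import numpy as np

          def attention_reference(q, k, v, bias=None):
              """q, k, v: [seq_len, head_dim]. Plain O(n^2) reference.

              Flash attention computes the same output blockwise in on-chip
              memory; correctness checks compare against exactly this.
              """
              scale = 1.0 / np.sqrt(q.shape[-1])        # scaling of Q * K
              scores = (q @ k.T) * scale                # [seq_len, seq_len]
              if bias is not None:                      # e.g. a pair bias in AF2-style gate attention
                  scores = scores + bias
              scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
              weights = np.exp(scores)
              weights /= weights.sum(axis=-1, keepdims=True)
              return weights @ v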
  7. 06 May 2023, 1 commit
  8. 17 April 2023, 1 commit
  9. 31 March 2023, 1 commit
  10. 29 March 2023, 1 commit
  11. 24 March 2023, 1 commit
    • Memory Efficient Attention (#51867) · e5ad3859
      Committed by ZhangDY-6483
      * first version, notest
      
      * return final result, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependency
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      ---------
      Co-authored-by: liuyuang <liuyuang@baidu.com>
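      The "lse" bullets above refer to the per-row log-sum-exp statistics that memory-efficient attention keeps so it can process keys in chunks and merge partial softmax results exactly. A minimal NumPy sketch of that online merge, with chunk size and shapes as illustrative assumptions:

          import numpy as np

          def chunked_attention(q, k, v, chunk=128):
              """Process keys/values in chunks, merging partial softmax results
              with a running max (m) and log-sum-exp style accumulators so the
              full [n, n] score matrix is never materialized."""
              n, d = q.shape
              scale = 1.0 / np.sqrt(d)
              m = np.full((n, 1), -np.inf)   # running row max
              l = np.zeros((n, 1))           # running softmax denominator
              acc = np.zeros_like(q)         # running weighted sum of v
              for s in range(0, k.shape[0], chunk):
                  scores = (q @ k[s:s + chunk].T) * scale
                  m_new = np.maximum(m, scores.max(axis=-1, keepdims=True))
                  correction = np.exp(m - m_new)      # rescale earlier partials
                  p = np.exp(scores - m_new)
                  l = l * correction + p.sum(axis=-1, keepdims=True)
                  acc = acc * correction + p @ v[s:s + chunk]
                  m = m_new
              return acc / l                 # the lse itself is m + log(l)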
  12. 23 March 2023, 1 commit
  13. 22 March 2023, 1 commit
  14. 17 March 2023, 1 commit
  15. 10 March 2023, 1 commit
  16. 22 February 2023, 1 commit
  17. 15 February 2023, 1 commit
  18. 01 February 2023, 1 commit
  19. 05 January 2023, 2 commits
  20. 23 December 2022, 1 commit
  21. 22 December 2022, 1 commit
  22. 07 December 2022, 1 commit
  23. 29 November 2022, 1 commit
  24. 28 November 2022, 1 commit
    • Clear fluid API: warpctc, nce, identity_loss (#48142) · d983fc34
      Committed by yuehuayingxueluo
      * clear fluid api: warpctc, nce, identity_loss
      
      * fix test_layers.py and __init__.py
      
      * fix loss.py
      
      * change __init__.py and api calling method
      
      * fix nce
      
      * fix fluid.data
      
      * delete warpctc api document
      
      * fix loss.py
      
      * fix ctc_loss
      
      * fix test_warpctc_op.py
      
      * fix test_layers.py
      
      * fix some bugs
      
      * fix conflict
      
      * fix ci bug
      
      * Empty Commit test=allcase
      
      * fix ci bug
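      For migration context: code that used the removed fluid.layers.warpctc moves to paddle.nn.functional.ctc_loss. A hedged sketch of the 2.x call; shapes and argument names below follow the public ctc_loss API as documented, and the random inputs are placeholders.

          import paddle
          import paddle.nn.functional as F

          # Shapes follow the 2.x ctc_loss convention:
          #   log_probs: [max_logit_len, batch, num_classes + 1] (blank at index 0)
          #   labels:    [batch, max_label_len], int32
          log_probs = paddle.randn([20, 2, 27])
          labels = paddle.randint(1, 27, [2, 5], dtype='int32')
          input_lengths = paddle.to_tensor([20, 20], dtype='int64')
          label_lengths = paddle.to_tensor([5, 4], dtype='int64')

          # Replaces the removed fluid.layers.warpctc call.
          loss = F.ctc_loss(log_probs, labels, input_lengths, label_lengths,
                            blank=0, reduction='mean')
          print(loss)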
  25. 22 November 2022, 1 commit
    • Fix docs (#47986) · 91f4d1ce
      Committed by ustiniankw
      * list112-122, test=document_fix
      
      * precommitfix, test=document_fix
      
      * list112-127, test=document_fix
      
      * fix_ResNetBasicBlock, test=document_fix
      
      * pre-commit_resnet, test=document_fix
      
      * refix, test=document
      
      * refix, test=document_fix
  26. 03 November 2022, 1 commit
  27. 02 November 2022, 1 commit
  28. 23 October 2022, 1 commit
  29. 20 October 2022, 1 commit
  30. 12 October 2022, 1 commit
  31. 10 October 2022, 1 commit
  32. 23 September 2022, 1 commit
  33. 14 September 2022, 1 commit
  34. 26 August 2022, 1 commit
  35. 30 June 2022, 1 commit
  36. 28 June 2022, 1 commit
  37. 21 June 2022, 1 commit
  38. 17 June 2022, 1 commit
  39. 14 June 2022, 1 commit