1. 26 May, 2023 1 commit
    • [PHI Decoupling]Create PHI shared lib (#53735) · da50a009
      Committed by YuanRisheng
      * create phi so
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * add file
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * perfect so
      
      * fix py3 bugs
      
      * delete all static target in phi
      
      * fix windows bugs
      
      * fix py3 bugs
      
      * fix ci bugs
      
      * fix windows bugs
      
      * fix bugs: gflags can't be linked by dynamic and static lib
      
      * fix bugs that can not load 3rd party
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix conflict
      
      * fix xpu bugs
      
      * fix mac compile bugs
      
      * fix psgpu bugs
      
      * fix inference failed
      
      * deal with conflict
      
      * fix LIBRARY_PATH bug
      
      * fix windows bugs
      
      * fix onednn error
      
      * fix windows compile bugs
      
      * fix windows compile bugs
      
      * fix test_cuda_graph_static_mode_error aborted
      
      * fix windows bugs
      
      * fix mac-python3 error
      
      * fix hip compile bugs
      
      * change mode to static
      
      * change to static mode
      
      * fix ci bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix bugs
      
      * add static flag
      
      * add PADDLE_API
      
      * change position of PADDLE_API
      
      * fix windows bugs
      
      * change mode to dynamic lib
      
      * fix windows static bugs
      
      * deal with conflict
      
      * fix windows unit bug
      
      * fix coverage
      
      * deal with conflict
      
      * fix windows-inference
      
      * fix py3 bugs
      
      * fix bugs when compile type_info
      
      * fix compile bugs
      
      * fix py3 bugs
      
      * fix windows bugs
      
      * fix windows openblas
      
      * fix xpu bugs
      
      * fix enforce_test in windows
      
      * update code according to comment
      
      * fix windows cmake bug
      
      * fix windows bugs
      
      * fix windows bugs
      
      * delete cinn unittest
      
      * fix cinn bugs
      
      ---------
      Co-authored-by: lzydev <1528794076@qq.com>
  2. 24 May, 2023 2 commits
  3. 23 May, 2023 3 commits
  4. 19 May, 2023 2 commits
    • Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e
      Committed by limingshu
      * Reorganize the forward codes of flash-attention.
      
      * Fix forward.
      
      * Remove some unused code.
      
      * Simplify codes and fix backward.
      
      * Change all LOG(INFO) to VLOG and fix the backward.
      
      * add scale for AF2 flash_attn, many thanks to xreki and shaojie for debugging this code
      
      * decrease the effect of debug print on performance
      
      * Unify the initialization of flashattn arguments.
      
      * Rewrite the reshape of temp_mask and temp_bias.
      
      * API support use_flash_attn.
      
      * Fix compiling error on CI.
      
      * Try to crop the flash-attention lib.
      
      * Correct the condition for whether flash-attn can be used.
      
      * Remove the softmax_out argument.
      
      * Remove is_causal.
      
      * Polish codes.
      
      * Fix qkv_transpose_out's shape and scaling of Q * K.
      
      * Update commit of flash-attention.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • test,test=develop (#53818) · 63ffd733
      Committed by Galaxy1458
  5. 18 May, 2023 1 commit
  6. 16 May, 2023 1 commit
  7. 15 May, 2023 3 commits
    • Silu double grad (#53605) · 94c38803
      Committed by xiaoguoguo626807
      * add rules
      
      * modify no kernel yaml parse
      
      * success op generate
      
      * success test_silu_double
      
      * modify bug
      
      * modify static error
      
      * modify silu_grad input
      
      * modify kernel signature
      
      * modify kernel signature
      
      * code style
      
      * code style
      
      * review
      
      * delete opinfo modify
    • remove some [-Wunused-parameter] warning (#53689) · 3e1fffea
      Committed by Galaxy1458
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
      
      * test,test=develop
    • Transpose layout (#53351) · 3dce9f0a
      Committed by niuliling123
      * update
      
      * Update backward.h
      
      * Update composite_backward_api.h
      
      * Update tensor_utils.cc
      
      * Update backward.cc
      
      * update
      
      * style
      
      * update
      
      * add ctest
      
      * code style
  8. 13 May, 2023 1 commit
    • Revert elementwise add (#53745) · b75d8c7e
      Committed by xiaoguoguo626807
      * modify concat_grad add sum comp rule
      
      * delete default mul_double_grad
      
      * delete high grad test
      
      * recover yaml
      
      * modify yaml
      
      * recover add_double_grad prim
  9. 11 May, 2023 2 commits
  10. 08 May, 2023 1 commit
  11. 05 May, 2023 1 commit
  12. 28 April, 2023 2 commits
  13. 27 April, 2023 3 commits
  14. 25 April, 2023 1 commit
    • [PHI]Add flags macro for PHI (#52991) · 22e96bde
      Committed by YuanRisheng
      * add flags for phi
      
      * fix compile bugs
      
      * fix ci bugs
      
      * fix inference bugs
      
      * fix cinn bugs
      
      * fix cinn bugs
      
      * polish code according to comment
      
      * fix ci bugs
      
      * fix ci bugs
  15. 24 April, 2023 4 commits
  16. 23 April, 2023 3 commits
    • [CustomDevice] add pipeline parallel support (#53220) · 040f8aa5
      Committed by ronnywang
      * [CustomDevice] add pipeline parallel support
      
      * update
      
      * update
    • apply gcc12 to gpups (#52960) · cbfd43e4
      Committed by risemeup1
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * apply gcc12 to gpups
      
      * test
      
      * test
      
      * apply gcc12 to gpups
      
      * apply_gcc12_to_gpups
      
      * fix compiler bug
      
      * fix compiler bug
      
      * test
      
      * fix dangling-pointer compiler
      
      * fix dangling-pointer compiler
      
      * fix dangling-pointer compiler
      
      * apply_gcc12_to_gpups
      
      * apply gcc12 to gpups
      
      * Update cuda_streams_py.cc
    • Delete temp param in eager_gen (#53047) · 328195d7
      Committed by niuliling123
      * Delete temp param in eager_gen
  17. 19 April, 2023 1 commit
  18. 18 April, 2023 2 commits
  19. 17 April, 2023 1 commit
  20. 15 April, 2023 1 commit
  21. 13 April, 2023 1 commit
  22. 12 April, 2023 2 commits
  23. 11 April, 2023 1 commit