1. 10 Apr, 2023 3 commits
  2. 07 Apr, 2023 1 commit
  3. 06 Apr, 2023 3 commits
    • support more custom vjp (#52533) · 29c28e2f
      Jiabin Yang committed
      29c28e2f
    • feat: add composite rule of roll grad (#52532) · 348a36b5
      Kang Zhao committed
      * feat: add relu composite rule
      
      * feat: add relu composite rule, maximum op
      
      * feat: add relu composite rule, maximum op
      
      * feat: add relu composite rule, polish comments
      
      * feat: add relu composite rule, polish comments
      
      * feat: add relu composite rule, add python api of relu
      
      * feat: add relu composite rule, commit hook
      
      * fix: maximum type error & ban cinn test
      
      * fix: maximum input sequence bugs
      
      * resolve conflicts
      
      * fix: code style bugs
      
      * add: relu fp16 test
      
      * feat: add rsqrt composite rule
      
      * feat: add rsqrt composite rule
      
      * resolve conflicts of composite rule
      
      * fix: delete check eager
      
      * feat: add roll grad composite rule
      
      * fix minus shift
      
      * fix test roll op
      348a36b5
    • Fix flash attention bug (#52551) · 8ac5a6b6
      sneaxiy committed
      * fix flash attn
      
      * fix another API
      8ac5a6b6
  4. 04 Apr, 2023 1 commit
  5. 31 Mar, 2023 1 commit
  6. 30 Mar, 2023 4 commits
  7. 29 Mar, 2023 1 commit
  8. 28 Mar, 2023 5 commits
  9. 27 Mar, 2023 1 commit
    • Add fuse_ops.yaml and fused_backward.yaml (#52010) · 10145cb6
      HappyHeavyRain committed
      * add fused_yaml fused_backward
      
      * fix eager_function bug
      
      * add some comment of fused yaml file
      
      * add 'support_dygraph_mode' configuration in fused yaml
      
      * delete some 'fused_api.h' in include file
      
      * add fused flag in api_gen
      10145cb6
  10. 24 Mar, 2023 1 commit
    • Memory Efficient Attention (#51867) · e5ad3859
      ZhangDY-6483 committed
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependency
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: liuyuang <liuyuang@baidu.com>
      e5ad3859
  11. 23 Mar, 2023 1 commit
  12. 22 Mar, 2023 3 commits
  13. 21 Mar, 2023 1 commit
  14. 20 Mar, 2023 3 commits
  15. 16 Mar, 2023 1 commit
  16. 10 Mar, 2023 2 commits
    • [New features]Add function node in phi_kernel for MKLDNN (#51073) · a0a6dc6a
      HappyHeavyRain committed
      * Add function node in phi_kernel for MKLDNN
      
      * fix the bug in 'BuildInferVarKernelContext'
      
      * add infer_varkernel_utils.cc
      
      * fix the bug: the first two parameters of 'BuildInferVarKernelContext' can't be template variables
      
      * change the code according to first review
      
      * change the code according to first review
      
      * change the mode of paddle_build.sh
      
      * change 'infer_var_kernel_fn_' to 'get_kerneltype_forvar_fn_'
      
      * add the error information
      
      * fix NotFound information warning
      
      * fix NotFound information warning
      
      * fix NotFound information warning
      a0a6dc6a
    • add flashattn raw kernel (#51383) · f951832d
      Chitsing KUI committed
      f951832d
  17. 09 Mar, 2023 2 commits
    • add prim erf grad (#50436) · b7e4d974
      GGBond8488 committed
      * add prim erf grad
      
      * add yaml config for prim erf grad
      
      * add math.h
      
      * add cmath
      
      * add math defines
      
      * use define math
      
      * use define math
      
      * define M_2_SQRTPI
      
      * M_2_SQRTPI math
      
      * try math.h
      
      * fix typo
      
      * remove pow in erf grad
      
      * use new optest
      
      * add fp16 fp32 test
      
      * remove fp16 test
      b7e4d974
    • Add softplus double grad (#50261) · 542844b4
      will-jl944 committed
      * add softplus double grad
      
      * use constant method
      542844b4
  18. 08 Mar, 2023 1 commit
  19. 06 Mar, 2023 1 commit
  20. 03 Mar, 2023 1 commit
  21. 01 Mar, 2023 2 commits
    • Integration flash attention (#49869) · 61611786
      Chitsing KUI committed
      * flash attn
      
      * seed
      
      * almost
      
      * softmax
      
      * fix workspace
      
      * add unit test; Linux only
      
      * fix setup
      
      * fix datatype include
      
      * fix setup typo
      
      * fix def scope
      
      * new error api
      
      * use paddle fork
      
      * fix attr bug; complete ut
      
      * update flash hash
      
      * fix rng reset
      
      * fix offset
      
      * fix comments
      61611786
    • add topk prim backward (#50679) · 296b3ff0
      zqw_1997 committed
      * tmp gather vjp
      
      * support gather
      
      * remove useless code
      
      * fix compiling error
      
      * fix ut
      
      * add eager test
      
      * add eager test
      
      * add seed
      
      * small change
      
      * fix cpu error
      
      * fix transpose op compat
      
      * remove tensor index case
      
      * fix prim_cinn
      
      * small commit
      
      * add cumsum prim backward
      
      * small commit
      
      * skip axis=None test case
      
      * fix op generate error
      
      * fix static test error
      
      * remove unused code
      
      * fix static test error
      
      * small commit
      
      * skip cpu float16 test case
      
      * skip eager cpu cumsum float16 test case
      
      * add eager and static UT
      
      * fix ut
      
      * add composite backward rule
      
      * fix error
      
      * fix type error and format error
      
      * add try cpu+float16 test
      
      * fix test bugs
      
      * remove test for cpu+float16 and make y[0] be the grad arg
      
      * add cinn test
      
      * fix UT
      
      * fix the wrong dim of v in test cases
      
      * change y[0] to y[1] for grad in UT
      
      * reshape flatten out
      
      * Disable cinn single test
      
      * use scatter_nd_add
      
      * modify the reshape part of topk_grad
      
      * delete useless build file
      
      * to make the syntax right
      
      * modify bug
      
      * try use of put_along_axis
      
      * remove cinn test
      
      * reformat todo
      
      * add silu composite rule
      
      * fix code style.
      
      * add cinn test
      
      * fix composite grad maker code gen
      
      * add prim in cumsum op test
      
      * remove old test
      
      * fix typo
      
      * pass the static test
      
      * fix typo
      
      * modify optest and delete old test files
      
      * remove normal test_top_k_op test
      
      * fix typo
      
      * pass axis=None test case
      
      * buffer comment
      
      * for debug
      
      * add silu fp16 unit test.
      
      * add static guard
      
      * remove forward prim test
      
      * remove same name axis
      
      * modify the test_top_v2_op.py to pass all local tests
      
      * delete the useless testcase
      
      * fix mistake
      
      * add more testcases to test dtype16 and dtype32
      
      ---------
      Co-authored-by: JiabinYang <360788950@qq.com>
      Co-authored-by: GGBond8488 <857631483@qq.com>
      Co-authored-by: zxcd <228587199@qq.com>
      Co-authored-by: Charles-hit <wanghao107@baidu.com>
      296b3ff0
  22. 23 Feb, 2023 1 commit
    • Support 'complex promote' in yaml (#50611) · 91a3d159
      HappyHeavyRain committed
      * support 'complex promote' in yaml
      
      * change the complex_promote
      
      * change 'kron' in math.py
      
      * change 'kron' comment in python
      
      * change kron comment in python
      
      * change kron comment in python
      91a3d159