    Memory Efficient Attention (#51867) · e5ad3859
    Committed by ZhangDY-6483
    * first version, no tests
    
    * return final result, no tests
    
    * use infinity() instead of max
    
    * unit test structure
    
    * initial unit test setup
    
    * generate lse (log-sum-exp; see the sketch at the end of this log)
    
    * update
    
    * add dependency
    
    * reconstruct cmake
    
    * move file
    
    * add memory efficient attention and fix blasimpl
    
    * update
    
    * update cmake
    
    * add namespace
    
    * update cmake
    
    * use .cu
    
    * update for pad3d
    
    * bug fix
    
    * bug fix
    
    * update
    
    * bug fix
    
    * update enforce
    
    * add test case
    
    * merge the lse pad
    
    * fix kernel_fn of backward
    
    * fix PADDLE_ENFORCE_EQ and phi_api
    
    * fix PADDLE_ENFORCE
    
    * fix PADDLE_ENFORCE
    
    * rerun coverage
    
    * fix memory efficient attention test
    
    * rerun ci
    
    * add cuda version condition
    
    * add cuda version condition
    
    * delete WIP test
    
    * replace PADDLE_ENFORCE
    
    * edit the namespace of datatype in multiple.cc
    
    * rerun
    
    * rerun
    
    ---------
    Co-authored-by: liuyuang <liuyuang@baidu.com>
memory_efficient_attention.py 4.4 KB
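
The commit messages above mention initialising the running max with infinity() rather than a finite max and generating/padding the per-row lse (log-sum-exp) reused by the backward kernel. The following NumPy sketch is not the CUDA kernel added in this PR; the helper name `attention_with_lse` is hypothetical and the shapes are simplified. It only illustrates those two quantities: the running max starts at -inf, and lse is returned alongside the attention output.

```python
# Minimal sketch (assumption: plain single-head attention, no masking or dropout)
# of the lse quantity and the -infinity max initialisation named in the commit log.
import numpy as np

def attention_with_lse(q, k, v, scale=None):
    """Reference attention that also returns lse = log(sum(exp(scores))) per query row."""
    head_dim = q.shape[-1]
    scale = 1.0 / np.sqrt(head_dim) if scale is None else scale
    scores = (q @ k.T) * scale                        # [num_queries, num_keys]
    # Start the running max at -infinity (not a finite "max" value).
    row_max = np.full(scores.shape[0], -np.inf)
    row_max = np.maximum(row_max, scores.max(axis=-1))
    shifted = np.exp(scores - row_max[:, None])       # numerically stable exponentials
    denom = shifted.sum(axis=-1)
    lse = row_max + np.log(denom)                     # the quantity a backward pass can reuse
    out = (shifted / denom[:, None]) @ v
    return out, lse

q = np.random.randn(4, 8)
k = np.random.randn(6, 8)
v = np.random.randn(6, 8)
out, lse = attention_with_lse(q, k, v)
print(out.shape, lse.shape)                           # (4, 8) (4,)
```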