[Paddle Inference] Add masked multihead attention kernel and export API. (#55344)
* support_mmha
* add_python_api
* add_api_doc
* fix_doc_error
* fix_infermeta
* add_infermeta
* add_bf16_cuda_check
* add_bf16_check
* fix_ci_windows
* fix_ci_windows_kernel_register
* fix_test_mmha
* add_cumoffsets
* remove_bias
* delete_mmha_reshape_input_output
* rename_delete_hfile
* remove_fluid
---------
Co-authored-by: yangjianfengo1 <yangjianfeng01@baidu.com>