Commits · 0473369f2b9f9d7fcebb84936fe53a67cff91812 · PaddlePaddle / Paddle

07 Aug, 2023 1 commit

[WIP] Integration flash attention 2 (#55758) · 0473369f

Committed by umiswing on Aug 07, 2023

* Work for fa-2 padded fwd. Code to be cleaned.

* Work for fa2 unpadded fwd.

* Work for padded-bwd, dk get small diff on np.random.seed(0)

* Anyway I pass paddle's utest, except return softmax without dropout.

* Clean code.

* Modify interface.

* Clean code and add some check.

* Easy compile for dev.

* Fix ci.

* Fix ci-build.

* Add std c++17 option again.

* Limit max job when compiling fa2.

* Remove const_cast

* Add fwd params, to be cleaned.

* Clean code.

* Add bwd params.

* Clean code.

* Add enforce.

* Use v2.0.4

* Pass RNG state to fa2 capi

* Fix review.

* Add assert

* Skip compile for sm less than 80.

0473369f
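
The final bullet of the commit above skips compilation for SM versions below 80, which matches FlashAttention-2 targeting NVIDIA Ampere (compute capability 8.0) and newer GPUs. The build logic itself is not shown on this page, so the following is only a minimal sketch of such a gate. The helper names `parse_sm`, `supports_fa2`, and `filter_archs` are hypothetical and are not part of Paddle's build system.

```python
# Hypothetical helpers illustrating an "SM >= 80" gate; not Paddle's actual build code.

def parse_sm(arch: str) -> int:
    """Turn an arch string such as 'sm_80' or '8.6' into an integer like 80 or 86.

    Assumption: only plain forms like 'sm_XY' or 'X.Y' are passed in.
    """
    text = arch.lower().replace("sm_", "").replace(".", "")
    return int(text)


def supports_fa2(arch: str, min_sm: int = 80) -> bool:
    """Return True when the architecture meets the assumed minimum SM version."""
    return parse_sm(arch) >= min_sm


def filter_archs(archs):
    """Keep only the architectures for which the FA2 kernels would be compiled."""
    return [a for a in archs if supports_fa2(a)]


if __name__ == "__main__":
    print(filter_archs(["sm_70", "7.5", "sm_80", "8.6"]))
```

With a gate like this, a build that targets only pre-Ampere architectures ends up with an empty list and can skip the FlashAttention-2 step entirely.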

05 Jun, 2023 1 commit

third-party lib offline compilation support for mkldnn flashattn gtest (#54319) · 20a9d2fd

Committed by Wang Xin on Jun 05, 2023

* third-party lib offline compilation support for mkldnn flashattn and gtest

* fix bug

* ignore dirty

20a9d2fd

19 May, 2023 1 commit

Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e

Committed by limingshu on May 19, 2023

* Reorganize the forward codes of flash-attention.

* Fix forward.

* Remove some unused codes.

* Simplify codes and fix backward.

* Change all LOG(INFO) to VLOG and fix the backward.

* add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes

* decrease the effect of debug print on performance

* Unify the initialize of flashattn arguments.

* Rewrite the reshape of temp_mask and temp_bias.

* API support use_flash_attn.

* Fix compiling error on CI.

* Try to crop the flash-attention lib.

* Correct the condition of whether can use flash-attn.

* Remove the softmax_out argument.

* Remove is_causal.

* Polish codes.

* Fix qkv_transpose_out's shape and scaling of Q * K.

* Update commit of flash-attention.

---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

d29c1f8e

20 Apr, 2023 1 commit

[FlashAttn] add flash randomness control (#52902) · 00ac8014

Committed by Chitsing KUI on Apr 20, 2023

* add flash randomness control

* fix VLOG undefined

00ac8014

29 Mar, 2023 1 commit

Fix flashattn build error on jetson (#51665) · fb5910f4

Committed by chalsliu on Mar 29, 2023

* Fix flashattn build error on jetson

* Fix nvcc not found on jetson

fb5910f4

01 Mar, 2023 1 commit

Integration flash attention (#49869) · 61611786

Committed by Chitsing KUI on Mar 01, 2023

* flash attn

* seed

* almost

* softmax

* fix workspace

* add unit test; linux only

* fix setup

* fix datatype include

* fix setup typo

* fix def scope

* new error api

* use paddle fork

* fix attr bug; complete ut

* update flash hash

* fix rng reset

* fix offset

* fix comments

61611786

PaddlePaddle / Paddle synced successfully more than 1 year ago
