Commits · 428fb8043113fdd52e966512f88a91c29da3b7ea · BaiXuePrincess / Paddle

13 Dec, 2022 · 1 commit

Save fused_attention op memory when dropout_rate = 0.0 (#48902) · 428fb804

Committed by sneaxiy on Dec 13, 2022

* save fused_attention memory when dropout_rate = 0.0

* add ut

* fix ut bug

* fix fused_layernorm_residual_dropout_bias_test.cu

07 Dec, 2022 · 1 commit

[phi::DenseTensor] Replace Tensor with phi::DenseTensor (#48682) · 65420271

Committed by 张春乔 on Dec 07, 2022

05 Dec, 2022 · 1 commit

Transpose optimization for AlphaFold2 (#45230) · a0f43889

Committed by limingshu on Dec 05, 2022

* first commit

* fix bugs according to ci

* add some changes

* change file name into function.cu.h

* remove const_cast

30 Nov, 2022 · 1 commit

[PHI decoupling] migrate transpose_op.cu.h and gpu_utils.h to phi (#48286) · 8a9bef70

Committed by Netpunk on Nov 30, 2022

* migrate transpose_op.cu.h and gpu_utils.h

* format code style

* fix some problems

* format code

* reset transpose_op.cc

* test commit

* recover transpose_op.h

* delete transpose_op.h

* adjust header files order in transpose_op.cc

18 Nov, 2022 · 1 commit

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e

Committed by MarDino on Nov 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

28 Sep, 2022 · 1 commit

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

Committed by Chen Weihang on Sep 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

01 Aug, 2022 · 1 commit

unify gpu context (#44740) · 86763023

Committed by Leo Chen on Aug 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

01 Jul, 2022 · 1 commit

Addition of switch_auto_tune option for transpose op (#43310) · 53d5abe3

Committed by limingshu on Jul 01, 2022

* 2nd part of transpose update

* add switch_auto_tune option.

* add some changes according to Ci

* refine the structure of auto_tune_base.

* merge develop changes

* reset the switch_set_range and change unittest of transpose auto-tune

* change the kernel auto-tune logic

26 Jun, 2022 · 1 commit

format all files in fluid using new config (#43776) · 576236a0

Committed by Sing_chan on Jun 26, 2022

09 Jun, 2022 · 1 commit

Implement dropout_nd operator to optimize dropout with axis not None. (#42463) · caa57498

Committed by crystal on Jun 09, 2022

Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

05 Jun, 2022 · 1 commit

【code format check upgrade】 step2：clang-format (#42840) · a3730dc8

Committed by Sing_chan on Jun 05, 2022

30 May, 2022 · 1 commit

Implement fused_gate_attention operator for AlphaFold. (#42018) · fdcdbec5

Committed by crystal on May 30, 2022

24 May, 2022 · 1 commit

[Phi]Move grad_add op kernel into phi and delete elementwise_add_op file (#42903) · 4d7a9eef

Committed by YuanRisheng on May 24, 2022

* move grad_add

* fix unittest bugs

* fix compile bugs

16 May, 2022 · 1 commit

fused_multi_transformer add fused softmax mask (#42636) · f9d5ae4e

Committed by WangXi on May 16, 2022

19 Apr, 2022 · 1 commit

fix inf in fused_attention (#41933) · 6bd39b5e

Committed by WangXi on Apr 19, 2022

11 Mar, 2022 · 1 commit

[hybrid] Support tensor parallel and cache structure for fused attention op. (#40101) · 1882c496

Committed by Yuang Liu on Mar 11, 2022

10 Mar, 2022 · 1 commit

Move dropout to phi (#40148) · 99fc1b08

Committed by hong on Mar 10, 2022

* move dropout to phi; test=develop

* fix xpu, npu compile error; test=develop

25 Feb, 2022 · 1 commit

[Phi] Support cudnn kernel moving & move softmax kernels (#39547) · 8895379a

Committed by Chen Weihang on Feb 25, 2022

* support cudnn kernel moving

* polish cmake rules

* add unittest for coverage

* remove orig kernel

* remove softmax cudnn kernel

* fix softmax test failed

* fix npu func error

* resolve conflict

* rename gpu dnn kernels

* fix name rule error

* fix compile error

* update fp16 namespace

20 Feb, 2022 · 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

18 Feb, 2022 · 1 commit

[Pten] blas and lapack migration (#39587) · 8c7ee8c2

Committed by Feiyu Chan on Feb 18, 2022

* move blas related files

* move lapack related files

18 Jan, 2022 · 1 commit

[Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3

Committed by Zhanlue Yang on Jan 18, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Patched python level LoDTensor

* Merge Tensor into DenseTensor

* Fixed namespace issues,test=allcases

* Fixed merge issues

* Fixed inference issues

* Fixed NPU test issues

* Fixed merge issues

08 Nov, 2021 · 1 commit

【fix-bug】Support attn_mask=None input cases for fused_attention_op. (#36951) · 472dcca4

Committed by Li Min on Nov 08, 2021

The current fused_attention_op does not support attn_mask=None as an input. This PR adds that support, along with the corresponding unit tests.

23 Sep, 2021 · 1 commit

Add fused_attention_op: add impl wrappers. (#35903) · 88ea8e6f

Committed by Li Min on Sep 23, 2021
