提交 · 078e8c7893c48d22b3db5d687602ee8824c1c4cc · BaiXuePrincess / Paddle

09 10月, 2022 1 次提交
- H
  
  [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780) · 078e8c78
  由 Haohongxiang 提交于 10月 09, 2022
  
  078e8c78
28 9月, 2022 1 次提交

Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e

由 Chen Weihang 提交于 9月 28, 2022

* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict

e12a905e

07 9月, 2022 1 次提交
- W
  
  Fix fused cuda op's mutable data [2] (#45562) · 4bbbed9a
  由 Wilber 提交于 9月 07, 2022
  
  4bbbed9a
01 8月, 2022 1 次提交

unify gpu context (#44740) · 86763023

由 Leo Chen 提交于 8月 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

86763023

12 7月, 2022 1 次提交
- Y
  
  fix fused attention, ffn, fm under new process group (#44259) · f6ff2221
  由 Yuang Liu 提交于 7月 12, 2022
  
  f6ff2221
26 6月, 2022 1 次提交
- S
  
  format all files in fluid using new config (#43776) · 576236a0
  由 Sing_chan 提交于 6月 26, 2022
  
  576236a0
17 6月, 2022 1 次提交

Support optional residual add in fused_attention and fused_feedforward. (#43474) · 19e866f9

由 Yiqun Liu 提交于 6月 17, 2022

* Support optional residual add in fused_attention and fused_feedforward.

* Add checkpoint and add the check of add_residual when pre_layer_norm is false.

* Add TODO and change the python api to add add_residual argument.

19e866f9

05 6月, 2022 1 次提交
- S
  
  【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  由 Sing_chan 提交于 6月 05, 2022
  
  a3730dc8
24 5月, 2022 1 次提交
- Y
  [Phi]Move grad_add op kernel into phi and delete elementwise_add_op file (#42903) · 4d7a9eef
  由 YuanRisheng 提交于 5月 24, 2022
```
* move grad_add

* fix unittest bugs

* fix compile bugs
```
  4d7a9eef
09 3月, 2022 1 次提交
- W
  
  [hybrid] fused_feedforward op support tensor model parallel (#40160) · e0866dc6
  由 WangXi 提交于 3月 09, 2022
  
  e0866dc6
20 2月, 2022 1 次提交

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

由 Chen Weihang 提交于 2月 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

18 2月, 2022 1 次提交
- F
  [Pten] blas and lapck migration (#39587) · 8c7ee8c2
  由 Feiyu Chan 提交于 2月 18, 2022
```
* move blas related files
* move lapack related files
```
  8c7ee8c2
18 1月, 2022 1 次提交

[Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3

由 Zhanlue Yang 提交于 1月 18, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Patched python level LoDTensor

* Merge Tensor into DenseTensor

* Fixed namespace issues,test=allcases

* Fixed merge issues

* Fixed inference issues

* Fixed NPU test issues

* Fixed merge issues

2052f1e3

12 11月, 2021 1 次提交
- Z
  [fix]fix the bug of fused_attention and fused_feedforward (#36972) · 6486e242
  由 zhangkaihuo 提交于 11月 12, 2021
```
* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter
```
  6486e242
28 10月, 2021 1 次提交
- L
  Fix fused_attention_op and fused_feedforward_op bug when pre_layer_norm is false. (#36793) · ff3018d7
  由 Li Min 提交于 10月 28, 2021
```
* Fix bug when pre_layer_norm is false.
```
  ff3018d7
25 10月, 2021 2 次提交

add op: fused_feedforward(backward) (#35611) · 2dd0a46a

由 zhangkaihuo 提交于 10月 25, 2021

这个PR是fused_feedforward反向的代码

相关kernel实现：fused_dropout_act_bias, fused_residual_dropout_bias, fused_layernorm_residual_dropout_bias

fused_feedforward是一个融合算子，该算子对transformer模型的feed forward层的算子进行融合和封装，使得前端只呈现一个接口，通过融合减少部分访存和kernel launch的时间，以此提升性能。

2dd0a46a

add op: fused_feedforward(forward) (#35843) · b18cbfb2

由 zhangkaihuo 提交于 10月 25, 2021

这个PR只包含fused_feedforward前向的代码。

相关kernel实现：fused_dropout_act_bias, fused_residual_dropout_bias, fused_layernorm_residual_dropout_bias

b18cbfb2

BaiXuePrincess / Paddle 与 Fork 源项目一致

BaiXuePrincess / Paddle
与 Fork 源项目一致