Commits · 1be70bc516c2faa5282779aeaf82028f7524f7f6 · BaiXuePrincess / Paddle

05 Jan, 2023 1 commit
- Add transpose_qkv_wb flags to the fused_attention_op. (#49494) · ec857b85
  Committed by Yuang Liu on Jan 05, 2023
23 Dec, 2022 1 commit
- make FusedMultiTransformer supports RoPE (#48842) · 644dfc60
  Committed by lzy on Dec 23, 2022
22 Dec, 2022 1 commit
- [Paddle Inference] Add moe phi kernel (#48703) · def2a87f
  Committed by xiaoxiaohehe001 on Dec 22, 2022
07 Dec, 2022 1 commit
- Remove reduntant numpy output in Example code (1/3), test=document_fix (#48678) · e75c651d
  Committed by Kevin吴嘉文 on Dec 07, 2022
29 Nov, 2022 1 commit
- [CodeStyle][isort] introduce isort (part4) (#48402) · f85def97
  Committed by Nyakku Shigure on Nov 29, 2022
```
* isort all files

* revert conflicting files

* revert conflicting files

* revert conflicting files
```
28 Nov, 2022 1 commit

clear fluid api: warpctc, nce, identity_loss (#48142) · d983fc34

Committed by yuehuayingxueluo on Nov 28, 2022

* clear fluid api: warpctc, nce, identity_loss

* fix test_layers.py __init__.py

* fix loss.py

* change __init__.py and api calling method

* fix nce

* fix nce

* fix fluid.data

* delete warpctc api document

* fix loss.py

* fix ctc_loss

* fix test_warpctc_op.py

* fix test_layers.py

* fix some bug

* fix conflict

* fix ci bug

* Empty Commit test=allcase

* fix ci bug

22 Nov, 2022 1 commit

Fixdocs (#47986) · 91f4d1ce

Committed by ustiniankw on Nov 22, 2022

* list112-122, test=document_fix

* precommitfix, test=document_fix

* list112-127, test=document_fix

* fix_ResNetBasicBlock, test=document_fix

* pre-commit_resnet, test=document_fix

* refix, test=document

* refix, test=document_fix

03 Nov, 2022 1 commit

[CodeStyle][py2][U008] remove unnecessary args in `super()` (#47549) · 3de3e45e

Committed by Nyakku Shigure on Nov 03, 2022

* [CodeStyle][py2][U008] remove unnecessary args in `super()`

* remove remained args

* revert changes in test_pylayer_op

* Revert "revert changes in test_pylayer_op"

This reverts commit ff185a9ae738afac3b0264f61bde6c6b7f72e7c4.

* revert some changes in example code

02 Nov, 2022 1 commit
- Remove redundant numpy import (#47483) · 20db5221
  Committed by Kevin吴嘉文 on Nov 02, 2022
23 Oct, 2022 1 commit
- [CodeStyle][black] use black instead of yapf (#46014) · 7097630f
  Committed by Nyakku Shigure on Oct 23, 2022
```
* update config

* re-blacken python code

* temporarily disable date and diff_py_file

* skip a format
```
20 Oct, 2022 1 commit

[CodeStyle][W605] Add escape symbols to some strings (#46752) · e1c0461d

Committed by Tony Cao on Oct 20, 2022

* Fix W605 in tools folder by adding escape symbols

* Fix W605 in incubate and some other folders

* Fix W605 in /fluid/test folders

* Update tools/analysisPyXml.py
Co-authored-by: NNyakku Shigure <sigure.qaq@gmail.com>

* Add some changes to manual and auto escape symbols

* revert changes in transformer.py

* Fix new code with W605 error: add escape symbols

* revert changes in transformer.py

* revert changes in transformer.py
Co-authored-by: NNyakku Shigure <sigure.qaq@gmail.com>

12 Oct, 2022 1 commit

[CodeStyle][F401] remove unused imports in... · 6977df8c

Committed by Shuangchi He on Oct 12, 2022

[CodeStyle][F401] remove unused imports in python_paddle/inference_device_profiler_text_metric_incubate_quantization_libs_audio_amp_jit. (#46762)

10 Oct, 2022 1 commit

make fused_multi_transformer support dynamically set the cache_kvs' shape and... · 9ea279a4

Committed by carryyu on Oct 10, 2022

make fused_multi_transformer support dynamically set the cache_kvs' shape and support input prefix_caches. (#46777)

* make fused_multi_transformer support dynamically set the cache_kvs' shape and support input prefix_caches.

23 Sep, 2022 1 commit
- [CodeStyle][W191][E101] remove tabs in python files (#46288) · ed2bb051
  Committed by Nyakku Shigure on Sep 23, 2022
14 Sep, 2022 1 commit
- [CodeStyle][W291] trim trailing whitespace in python file (#45937) · de8c0ba5
  Committed by Nyakku Shigure on Sep 14, 2022
```
* trim trailing whitespace

* fix `.cmake-format.py`

* revert npu ut changes, avoid npu ci error
```
26 Aug, 2022 1 commit
- [Eager] delete final state pre-name (#45306) · 126940b3
  Committed by wanghuancoder on Aug 26, 2022
30 Jun, 2022 1 commit
- Add new attr of fused_multi_transformer (#43730) · c2a5bb91
  Committed by Zhang Zheng on Jun 30, 2022
```
* Add new attr of fused_multi_transformer

* fix format

* add note

* add in layer

* fixfixfixfix
```
28 Jun, 2022 1 commit
- [fused_transformer] update transformer fustion for dygraph, test=allcases (#43858) · 99b3727d
  Committed by Yuang Liu on Jun 28, 2022
21 Jun, 2022 1 commit
- Fix code example of fused_attention and fused_feedforward. (#43635) · 223fb7b3
  Committed by Yiqun Liu on Jun 21, 2022
17 Jun, 2022 1 commit

Support optional residual add in fused_attention and fused_feedforward. (#43474) · 19e866f9

Committed by Yiqun Liu on Jun 17, 2022

* Support optional residual add in fused_attention and fused_feedforward.

* Add checkpoint and add the check of add_residual when pre_layer_norm is false.

* Add TODO and change the python api to add add_residual argument.

14 Jun, 2022 1 commit
- fix is_test bug in fused_feedforward. (#43508) · 193ab32c
  Committed by Li Min on Jun 14, 2022
13 Jun, 2022 1 commit
- fused_attention fused_feedforward api support Model Tensor Parallel (#42985) · 31ddaae2
  Committed by WangXi on Jun 13, 2022
05 Jun, 2022 1 commit

【code format check upgrade】 step2：yapf (#42944) · a072fca8

Committed by Sing_chan on Jun 05, 2022

* use yapf to format all python file

* yapf exclude two unittests file for they rely on writing and reading file, and format will break them

* disable diff_py_file because too many diff files cause command following failed

01 Jun, 2022 1 commit

Make fuse_gemm_epilogue support transpose_x and transpose_y (#40558) · 048b0013

Committed by sneaxiy on Jun 01, 2022

* support weight transpose

* add ut

* add template

* fix transpose error

* fix transpose_comment

* add api tests

* add skipif

* add doc

31 May, 2022 1 commit
- Rename dropout is test (#43098) · 67497119
  Committed by Li Min on May 31, 2022
```
* replace dropout_is_test with is_test.
* improve atol on a100.
```
30 May, 2022 1 commit
- Add fused_bias_dropout_residual_ln op and layer. (#43062) · dceccd9d
  Committed by Li Min on May 30, 2022
```
* add fused_bias_dropout_residual_ln op and layer.
```
12 May, 2022 1 commit
- Fix some typos in paddle/. (#42408) · 2012672c
  Committed by Shuangchi He on May 12, 2022
26 Apr, 2022 1 commit
- Add fused_multi_transformer op to optimize transformer generation performance (#41814) · 9dadf7df
  Committed by WangXi on Apr 26, 2022
25 Mar, 2022 1 commit

Refactor Dygraph Flags (#40786) · 3085d5e4

Committed by Jiabin Yang on Mar 25, 2022

* refactor eager flags

* fix flags error when we switch from eager to dygraph

* fix ci problem

* fix ci

* fix ci

* merge develop and fix code style

* merge develop and fix code style

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* merge develop

11 Mar, 2022 1 commit
- [hybrid] Support tensor parallel and cache structure for fused attention op. (#40101) · 1882c496
  Committed by Yuang Liu on Mar 11, 2022
24 Feb, 2022 1 commit
- fix 'invalid escape sequence' (#39842) · 4e26fa57
  Committed by Leo Chen on Feb 24, 2022
```
* fix 'invalid escape sequence'

* fix assert error
```
28 Jan, 2022 1 commit
- recovery code (#39287) · 45f9c9eb
  Committed by zhangkaihuo on Jan 28, 2022
27 Jan, 2022 1 commit

Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3

Committed by zhangkaihuo on Jan 27, 2022

* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter

* for pure fp16

* Add a SparseCsrTensor

* remove unused functional

* remove const

* remove SetMemoberTensor

* remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows

* SparseCooTensor

* add SetMember

* merge upstream; add SetMember

* merge upstream

* merge upstream; add newline at end of file

* add newline at end of file

* remove newline at end of file

* remove newline at end of file

* stash

* user pten::framework::make_ddim

* user pten::framework::make_ddim

* merge upstream; use the latest mutable_data

* merge upstream; use the latest mutable_data

* return mutable dense tensor

26 Nov, 2021 1 commit
- Fix bugs when bias add none in static graph for fused_attention op. (#37566) · 097e098d
  Committed by Li Min on Nov 26, 2021
```
* Fix bugs when bias is none for static graph for fused_attention op.
```
23 Nov, 2021 1 commit
- Add support bias is none for fused_attention op. (#37411) · 1a8786cf
  Committed by Li Min on Nov 23, 2021
```
Add support for bias is none for fused_attention op.
```
16 Nov, 2021 1 commit

Fix attn_bias_add bug. (#37147) · a9e7a854

Committed by Li Min on Nov 16, 2021

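The fused_attention_op implementation uses bias_add, which is built on kernel primitives. The WriteData API of the kernel primitives and its internal implementation were later changed so that the out-of-bounds check moved into a template parameter. As a result, the call took the wrong branch and performed out-of-bounds writes, corrupting the contents of other GPU memory regions. Symptom: a single run of test_fused_attention_op_api.py almost never fails, but running it repeatedly in a loop with inputs of different shapes produces wrong results. The failure is intermittent, so the bug is hard to notice.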
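
The failure mode described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Paddle's kernel primitive code: `write_data`, `is_boundary`, and the flat `memory` list are hypothetical stand-ins for a block-wise write whose bounds check is enabled only by a flag chosen at the call site.

```python
# Hypothetical simulation: a flat list plays the role of device memory that
# holds an output buffer followed directly by an unrelated neighbour buffer.

def write_data(memory, offset, values, block_size, valid, is_boundary):
    """Write one block of `block_size` slots starting at `offset`.

    Only the first `valid` slots belong to the destination buffer. The
    bounds check runs only when `is_boundary` is True, mimicking a check
    that is selected by a flag instead of being performed unconditionally.
    """
    for i in range(block_size):
        if is_boundary and i >= valid:
            break
        memory[offset + i] = values[i]


def run(last_block_is_boundary):
    # Output buffer: 6 slots (zeros). Neighbour buffer: 2 slots (sevens).
    memory = [0] * 6 + [7] * 2
    block = [1, 1, 1, 1]
    # First block lies fully inside the output buffer, no check needed.
    write_data(memory, 0, block, 4, 4, is_boundary=False)
    # Second block starts at offset 4; only 2 of its 4 slots are in range.
    write_data(memory, 4, block, 4, 2, is_boundary=last_block_is_boundary)
    return memory


correct = run(last_block_is_boundary=True)
corrupted = run(last_block_is_boundary=False)
```

With the flag set correctly the neighbour buffer keeps its values; with the wrong flag the last block overwrites it. Whether the damage shows up depends on what happens to live next to the output buffer, which is consistent with a bug that only appears intermittently across different input shapes.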

12 Nov, 2021 1 commit
- [fix]fix the bug of fused_attention and fused_feedforward (#36972) · 6486e242
  Committed by zhangkaihuo on Nov 12, 2021
```
* fix bug:
1. atten: set the default value of attn_dropout_rate to None
2. ffn: add activation parameter
```
28 Oct, 2021 1 commit
- [fix-doc-bug] Fix fused_attention_op english doc test=document_fix (#36803) · 11c2874e
  Committed by Li Min on Oct 28, 2021
```
* Fix fused_attention english doc test=document_fix
```
27 Oct, 2021 1 commit

Fused transformer encoder layer and fused feedforward layer (#36604) · 9f3613f3

Committed by zhangkaihuo on Oct 27, 2021

This PR adds the layer-level code for fused_transformer, including the FusedFeedForward layer and the FusedTransformerEncoderLayer.

26 Oct, 2021 1 commit

Add fused attention op backward and python layer. (#36498) · 5119428e

Committed by Li Min on Oct 26, 2021

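Purpose: this PR aims to improve the compute performance of the attention module.
To reduce the framework's per-op scheduling overhead, this PR implements the attention module by hand in C++ and exposes it as a single large attention op.
To reduce memory-access overhead, this PR applies two optimizations:
(1) when computing q, k, v, the shared input X is used to reduce the gemm, transpose, and bias add from three calls each to one;
(2) kernel fusion is used so that data is passed between different CUDA kernels through registers.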
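
Optimization (1) can be sketched without any GPU code. The example below is an illustrative plain-Python version, assuming the three projection weights are concatenated column-wise so that one matrix multiply and one bias add replace three; the function names (`qkv_separate`, `qkv_fused`) and the toy data are hypothetical and are not taken from the PR.

```python
# Sketch only: nested lists stand in for tensors, rows are sequence positions.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of m."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def qkv_separate(x, weights, biases):
    """Baseline: one gemm and one bias add per projection (three of each)."""
    return [add_bias(matmul(x, w), b) for w, b in zip(weights, biases)]


def qkv_fused(x, weights, biases):
    """One gemm and one bias add on concatenated weights, then a split."""
    width = len(weights[0][0])
    # Concatenate the weight matrices column-wise: row i of every matrix
    # is joined into one long row.
    w_cat = [sum((w[i] for w in weights), []) for i in range(len(weights[0]))]
    b_cat = sum(biases, [])
    out = add_bias(matmul(x, w_cat), b_cat)
    # Split the wide result back into q, k, v.
    return [[row[i * width:(i + 1) * width] for row in out]
            for i in range(len(weights))]


x = [[1.0, 2.0], [3.0, 4.0]]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],  # W_q
    [[2.0, 0.0], [0.0, 2.0]],  # W_k
    [[0.0, 1.0], [1.0, 0.0]],  # W_v
]
biases = [[0.5, 0.5], [0.0, 0.0], [1.0, -1.0]]

q_sep, k_sep, v_sep = qkv_separate(x, weights, biases)
q, k, v = qkv_fused(x, weights, biases)
```

Both paths produce the same q, k, and v; the fused path simply reads X once and launches one multiply instead of three, which is where the memory-access saving comes from.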

BaiXuePrincess / Paddle is in sync with its fork source project