Commits · 645e81f0704515915545c23a3c6d452a84e34218 · PaddlePaddle / Paddle

19 May, 2023 · 19 commits
- [XPU] fix fallback (#53801) · 4b85e5db
  Committed by wz1qqx on May 19, 2023
- add minimum grad composite rules (#52561) · 97690816
  Committed by warrentdrew on May 19, 2023
```
* add minimum grad composite rules

* add public python api

* fix format

* fix format

* update testcase

* fix testcase

* fix format

* fix cmakelist.txt

* fix format

* fix param problem

* fix op and composite rule

* fix bf16 cpu support problem

* fix bf16 cpu issue

* fix axis error log

* add axis for maximum

* revert commit

* remove .orig

* fix generic problem

* revert max op

* fix axis error

* fix maximum axis

* fix test_check_output

* fix cinn

* fix minimum maximum axis check
```
- [IR] fine-tune the implementation of ir component. (#53894) · 9d9f0ce5
  Committed by 王明冬 on May 19, 2023
- Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e
  Committed by limingshu on May 19, 2023
```
* Reorganize the forward codes of flash-attention.

* Fix forward.

* Remove some unused codes.

* Simplify codes and fix backward.

* Change all LOG(INFO) to VLOG and fix the backward.

* add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes

* decrease the effect of debug print on performance

* Unify the initialize of flashattn arguments.

* Rewrite the reshape of temp_mask and temp_bias.

* API support use_flash_attn.

* Fix compiling error on CI.

* Try to crop the flash-attention lib.

* Correct the condition of whether can use flash-attn.

* Remove the softmax_out argument.

* Remove is_causal.

* Polish codes.

* Fix qkv_transpose_out's shape and scaling of Q * K.

* Update commit of flash-attention.

---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
- fix_windows_static_assert_error (#53750) · 4dc28b54
  Committed by limingshu on May 19, 2023
- fix compile error on custom_devices (#53940) · b086404b
  Committed by RedContritio on May 19, 2023
- test,test=develop (#53811) · 10758725
  Committed by Galaxy1458 on May 19, 2023
- test,test=develop (#53843) · c1f4005a
  Committed by Galaxy1458 on May 19, 2023
- test,test=develop (#53839) · c174aa22
  Committed by Galaxy1458 on May 19, 2023
- 【prim】merge branch for GradOpMaker codeGen to clear code (#53874) · 6cb53e91
  Committed by xiaoguoguo626807 on May 19, 2023
```
* review

* modify opcompat bug

* modify pybind
```
- delete bf16 of cross entropy (#53922) · 69d3f4e3
  Committed by Danyang Zhang on May 19, 2023
```
* delete bf16 of cross entropy

* delete bf16 of cross entropy
```
- [Paddle-TRT] Decrease peak memory (#53930) · eb193d8d
  Committed by zhoutianzi666 on May 19, 2023
```
* decrease_peak_memory
```
- test,test=develop (#53931) · 6179a3ec
  Committed by Galaxy1458 on May 19, 2023
- test,test=develop (#53818) · 63ffd733
  Committed by Galaxy1458 on May 19, 2023
- test,test=develop (#53847) · 39f365c4
  Committed by Galaxy1458 on May 19, 2023
- test,test=develop (#53851) · 16fcbb9b
  Committed by Galaxy1458 on May 19, 2023
- test,test=develop (#53945) · 2f42cb7f
  Committed by Galaxy1458 on May 19, 2023
- [CustomDevice] fix buffered reader exception (#53925) · b922e711
  Committed by ronnywang on May 19, 2023
- Move raw kernel to legacy (#53830) · 051add42
  Committed by zhangyuqin1998 on May 19, 2023
18 May, 2023 · 20 commits
- [XPU] fix bug on XPUPlace and AllGather (#53926) · 4a4ffe9a
  Committed by houj04 on May 18, 2023
- test,test=develop (#53938) · 3ad67b9a
  Committed by Galaxy1458 on May 18, 2023
- [AMP OP&Test]support prod、meshgrid、expand_as bf16 dtype (#53865) · 706503d0
  Committed by Charles-hit on May 18, 2023
```
* add meshgrid,expand_as, prod and grad bf16 kernel

* fix bf16 for optest

* modify code style

* fix amp test
```
- [inference][trt]Remove trt sparse weight api (#53905) · 1007690b
  Committed by Zhang Jun on May 18, 2023
```
* Revert "[inference][trt]add trt sparse weights switch (#53562)"

This reverts commit 4a69a536.

* remove kSPARSE_WEIGHTS

* remove kFASTER_DYNAMIC_SHAPES_0805 and add 'TrtMajorVersion' function
```
- adjust inference lib dir (#53091) · 9b0f621c
  Committed by Yuanle Liu on May 18, 2023
- remove CopyWithContext limitation (#53771) · d53d8fdc
  Committed by engineer1109 on May 18, 2023
- Fused elementwises kernels and ops (#51427) · fb4a6ecf
  Committed by Hulek on May 18, 2023
```
* Fused elementwises kernels and ops

* change fuse pass name

* adjust .pbtxt files

* adjust quantization attributes

* add missing arguments and fix others, review fixed

* simplify fused kernel registration

* fix elementwise unit tests

* reuse one fused elementwise op

* adjust proto

* Add supported datatypes

* Change 'Scale' to 'scale' in tests, change some tests to onednn

* Revert breaking changes

* Fix unit tests

* Delete obsolete test cases

* Delete commented out code

* Fix codestyle

* delete temporary condition

* fix conflicts and delete duplicate fusing

* Fix code after merge

* Move tests to new directory

* fix tests volatility

* Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py

* Update CMakeLists.txt add mkldnn op test

---------
Co-authored-by: Silv3S <slawomir.siwek@intel.com>
```
- move fusion_group kernel to phi (#53781) · 26da689d
  Committed by huangjiyi on May 18, 2023
- Add segment_pool tests (#53785) · 0bed2203
  Committed by co63oc on May 18, 2023
- Fix typos (#53912) · 117e951b
  Committed by co63oc on May 18, 2023
- Fix typos, test=document_fix (#53927) · e916e80c
  Committed by co63oc on May 18, 2023
- add fp16 and bf16 for trunc (#53876) · d8407c51
  Committed by LoneRanger on May 18, 2023
- move sequence_mask op InferShape func (#53782) · a862debf
  Committed by Wang Xin on May 18, 2023
```
* move sequence_mask op InferShape func

* add dtype infer
```
- Fix typos in elementwise dir (#53907) · 2782b291
  Committed by co63oc on May 18, 2023
- [Dy2static-Fallback] add set_eval_frame function in pybind. (#52006) · 7b1695af
  Committed by xiongkun on May 18, 2023
```
* [Dy2static-Fallback] add set_eval_frame function in pybind.
1. add set_eval_frame function in pybind.

* add unittest for eval frame hooker.

* [support py38]

* fix-GeneratorExit error in eval frame hooker

* support python == 3.9

* support 3.10

* fix some comments
```
- Fix typos in executor_statistics.cc (#53917) · 1ac28b6b
  Committed by co63oc on May 18, 2023
- support auto generate for op layer_norm (#53178) · 4f07b653
  Committed by RedContritio on May 18, 2023
```
* simplify layer_norm_op.cc

* support auto generate for op layer_norm

* update unittest for composite_layer_norm

* remove layer_norm_op.cc from scripts

* replace layer_norm_op with generated_op

* add get_expected_kernel for layer_norm

* update cmake kernel register function for layer_norm_mkldnn_op
```
- [Fix Typo] Fix gpu_info.h, Wheter->Whether (#53564) · 236e742d
  Committed by HongyuJia on May 18, 2023
- fix -Werror=format-security (#53886) · 6d7076cc
  Committed by engineer1109 on May 18, 2023
- Fix typos in send_v2_op.cu.cc (#53904) · 65ce6886
  Committed by co63oc on May 18, 2023
17 May, 2023 · 1 commit
- [CustomDevice] suport device_guard for custom device (#53808) · 9e045eeb
  Committed by duanyanhui on May 17, 2023
```
* suport device_guard for npu

* fix comment

* fix typo
```

PaddlePaddle / Paddle synced successfully about 1 year ago
