提交 · 786c6e996c9658e7b38333b8d7442cc1507267e6 · PaddlePaddle / Paddle

15 8月, 2023 1 次提交
- L
  
  Merge reduce type of auto_parallel and phi kernel (#56202) · 786c6e99
  由 LiYuRio 提交于 8月 15, 2023
  
  786c6e99
14 8月, 2023 7 次提交
- G
  
  [clang-tidy] Open cppcoreguidelines-avoid-c-arrays Check (#56208) · 0f5148fb
  由 gouzil 提交于 8月 14, 2023
  
  0f5148fb
- C
  
  [clang-tidy] No.31 enable modernize-use-bool-literals (#56216) · 2c307457
  由 cyberslack_lee 提交于 8月 14, 2023
  
  2c307457
- L
  
  support r to s unbalanced split (#56149) · 8fc9a817
  由 LiYuRio 提交于 8月 14, 2023
  
  8fc9a817
- write the common functions p_norm_kernel.cu and p_norm_grad_kernel.cu to p_norm_utils.h (#56191) · 7f0bdf07
  由周波涛提交于 8月 14, 2023
  
  7f0bdf07
- J
  
  [XPU] Add take_along_axis xpu kernel and plugin (#56125) · b2e06fc9
  由 jiangfan06 提交于 8月 14, 2023
  
  b2e06fc9
- S
  
  [Fluid] Move fused_softmax_mask_upper_triangle to phi (#55769) · 6e40fc1d
  由 Sonder 提交于 8月 14, 2023
  
  6e40fc1d
- Add rmsnorm residual bias add and quant (#55965) · 2ac6a7e4
  由 MarDino 提交于 8月 14, 2023
```
* add rmsnorm residual bias add and quant

* refine python interface

* add rmsnorm unittest

* Add layernorm

* fix layernorm unittest

* refine unittest

* fix example code

* fix review comment
```
  2ac6a7e4
11 8月, 2023 4 次提交
- U
  [Sparse] Fix bugs in parameter freezing (#56154) · 8bdb336d
  由 umiswing 提交于 8月 11, 2023
```
* Add enforce for sparse_bn.

* Add enforce for sp conv.
```
  8bdb336d
- Y
  Fix the shape of input sin and cos for fused_rope. (#56132) · f60c698f
  由 Yiqun Liu 提交于 8月 11, 2023
```
* Fix the shape of input sin and cos for fused_rope.

* Update shape in unittest.
```
  f60c698f
- W
  
  [XPU]Add flip kernel (#55932) · ee003457
  由 wz1qqx 提交于 8月 10, 2023
  
  ee003457
- H
  
  [XPU] Add fast_gather_nd plugin (#56103) · 460e4fc6
  由 hong19860320 提交于 8月 11, 2023
  
  460e4fc6
10 8月, 2023 5 次提交

L

Implement reshard from s to r with same process_mesh (#56039) · 4569ae13
由 LiYuRio 提交于 8月 10, 2023

4569ae13
J

[XPU] Add gather_nd fp16 and add check_dtype_op_blacklist (#55860) · 307128d1
由 jiangfan06 提交于 8月 10, 2023

307128d1

Add variable_length_memory_efficient_attention (#55400) · 4036c937

由 lzy 提交于 8月 10, 2023

* add variable_length_memory_efficient_attention
* update variable_length_memory_efficient_attention unittest
* update variable_length_mem_eff_attn's docs and unittest
* update variable_length_mem_eff_attn's docs
* Update test_variable_length_memory_efficient_attention.py
* Update variable_length_memory_efficient_attention.cu
* fix codestyle
* fix variable_length_fmha's docs and unittest
* fix variable_length_fmha's docs

4036c937

add tanh_triple_grad composite logic (#56072) · 7c4a3556

由 lxd-cumt 提交于 8月 10, 2023

* decompose tanh_triple_grad and add it into prim_white_list test=develop

* fix TanhTripleGradKernel bugs test=develop

* decompose tanh_triple_grad test=develop

7c4a3556

[XPU kernel] fix warpctc issue (#55950) · 689bcad5

由 RuohengMa 提交于 8月 10, 2023

* [XPU kernel] fix warpctc issue

* fix issue

* temporal hack to circumvent depthwise_conv2d precision issue

* reset test case

689bcad5

09 8月, 2023 6 次提交
- X
  [oneDNN]rename macro to PADDLE_WITH_DNNL (#52208) · 6ff4c130
  由 Xinyu Chen 提交于 8月 09, 2023
```
* onednn: rename macro to PADDLE_WITH_DNNL

* onednn: rename macro to CINN_WITH_DNNL
```
  6ff4c130
- C
  
  Add FP16 & BF16 for nanmedian (#56056) · 4ae9945b
  由 cyberslack_lee 提交于 8月 09, 2023
  
  4ae9945b
- N
  
  change index's dtype for int to int64 (#55949) · 8d181e37
  由 niuliling123 提交于 8月 09, 2023
  
  8d181e37
- H
  
  [XPU] add fused_softmax_mask and fused_softmax_mask_grad. (#55914) · b982af4a
  由 houj04 提交于 8月 09, 2023
  
  b982af4a
- H
  [XPU] add pos_weight for sigmoid_cross_entropy_with_logits. (#55001) · 4315bc4c
  由 houj04 提交于 8月 09, 2023
```
* [XPU] add pos_weight for sigmoid_cross_entropy_with_logits.

* update xdnn version.
```
  4315bc4c
- R
  
  fix atan2 grad (#56067) · 1bf2ab48
  由 ronnywang 提交于 8月 09, 2023
  
  1bf2ab48
08 8月, 2023 6 次提交
- W
  move `decayed_adagrad_op` to phi (#55995) · 0d920178
  由 Wang Xin 提交于 8月 08, 2023
```
* move decayed_adagrad_op to phi

* fix bug
```
  0d920178
- H
  
  move dgc kernel to phi (#56003) · 3c03ade8
  由 huangjiyi 提交于 8月 08, 2023
  
  3c03ade8
- L
  
  [XPU] register multiclass_nms3 and norm xpu kernel to optimize model (#56064) · ba992136
  由 leolishaohao 提交于 8月 08, 2023
  
  ba992136
- F
  
  optimize op structure (#55988) · 6bd7f860
  由 freeliuzc 提交于 8月 08, 2023
  
  6bd7f860
- N
  
  Modefied reduce op for store temp_data with MpType (#55709) · 03ca04fe
  由 niuliling123 提交于 8月 08, 2023
  
  03ca04fe
- H
  
  add data op data type (#56033) · 7472057c
  由 hong 提交于 8月 08, 2023
  
  7472057c
07 8月, 2023 4 次提交

Add attn_mask supported for FlashAttnKernel. (#55969) · 42e0c6b8

由 yin wei 提交于 8月 07, 2023

* add mask

* add backword

* add enforce info

* update scale

* integrate code

* update enforce

* add enforce eq

* add error type

* update enforce

* add test_flash_attention

* Polish codes and fix compiling errors.

* Set num_splits to 0 for flash-attn with tensor mask.

* Fix the compiling error for non flash-attn case.

---------
Co-authored-by: NLiu Yiqun <liuyiqun01@baidu.com>

42e0c6b8

G

[clang-tidy] NO.6 enable `modernize-avoid-c-arrays` step: 2 (#55954) · 5ada98b8
由 gouzil 提交于 8月 07, 2023

5ada98b8
R

[clang-tidy] enable modernize-use-equals-default (#55983) · 30a02d27
由 Ruibin Cheung 提交于 8月 07, 2023

30a02d27

[WIP] Integration flash attention 2 (#55758) · 0473369f

由 umiswing 提交于 8月 07, 2023

* Work for fa-2 padded fwd. Code to be cleaned.

* Work for fa2 unpadded fwd.

* Work for padded-bwd, dk get small diff on np.random.seed(0)

* Anyway I pass paddle's utest, except return softmax without dropout.

* Clean code.

* Modify interface.

* Clean code and add some check.

* Easy compile for dev.

* Fix ci.

* Fix ci-build.

* Add std c++17 option again.

* Limit max job when compiling fa2.

* Remove const_cast

* Add fwd params, to be cleaned.

* Clean code.

* Add bwd params.

* Clean code.

* Add enforce.

* Use v2.0.4

* Pass RNG state to fa2 capi

* Fix review.

* Add assert

* Skip compile for sm less than 80.

0473369f

04 8月, 2023 4 次提交
- K
  [NewIR] Rename feed with place to data (#55778) · 274e5e54
  由 kangguangli 提交于 8月 04, 2023
```
* fix bug: feed_with_place should consider variable existence

* fix

* fix build scope

* change method to set feed var name

* remove feed_with_place to placeholder

* fix

* rename to data

* fix

* fix
```
  274e5e54
- H
  [NewIR]New ir aot placement refactor (#55810) · dd1379e4
  由 hong 提交于 8月 04, 2023
```
* refacot aot

* update

* fix bugs

* remove some test

* fix bug

* fix bug

* fix bug

* fix bug

* update
```
  dd1379e4
- Z
  
  [clang-tidy] NO.12 enable modernize-use-nullptr check(#55800) · 1e4f627d
  由 Zhenghai Zhang 提交于 8月 04, 2023
  
  1e4f627d
- J
  
  [XPU] Add int support for elementwise_sub/elementwise_div (#55920) · 97ab6aa6
  由 jiangfan06 提交于 8月 04, 2023
  
  97ab6aa6
03 8月, 2023 3 次提交
- Y
  
  Optim fused linear grad add (#55927) · 91873469
  由 Yuang Liu 提交于 8月 03, 2023
  
  91873469
- Y
  
  FLUID: move limit_by_capacity to PHI (#55948) · 230c6ce1
  由 yangguohao 提交于 8月 03, 2023
  
  230c6ce1
- W
  
  [clang-tidy] [No.4] enable `modernize-loop-convert` (#55704) · 81ccd99e
  由 Wang Xin 提交于 8月 03, 2023
  
  81ccd99e

PaddlePaddle / Paddle 大约 1 年 前同步成功

PaddlePaddle / Paddle
大约 1 年前同步成功