提交 · 4ea1d04107ff42072bb2e416e00c9d68859ff98c · PaddlePaddle / Paddle

25 5月, 2023 3 次提交
- T
  
  【Hackathon 4th No.26】为 Paddle 新增 paddle.sparse.nn.Softmax 稀疏 API 的 coo 格式计算逻辑 (#53613) · 4ea1d041
  由 thunder95 提交于 5月 25, 2023
  
  4ea1d041
- [Zero-Dim] support ReshapeTransform/nll_loss/matmul support 0D (#53828) · a64a722a
  由 zhouweiwei2014 提交于 5月 25, 2023
  
  a64a722a
- H
  
  [XPU] change cuda_rng_state to rng_state in fleet random (#54077) · 23baa8c6
  由 houj04 提交于 5月 25, 2023
  
  23baa8c6
24 5月, 2023 3 次提交
- L
  
  suppport optional input for unbind_grad (#54085) · f2ed4011
  由 Leo Chen 提交于 5月 24, 2023
  
  f2ed4011
- Z
  
  [Sparse]sparse BatchNorm support 2D input (#53893) · 8032d57e
  由 zhangkaihuo 提交于 5月 24, 2023
  
  8032d57e
- H
  
  [Dygraph] Fix NCCL_VERSION Check Tools (#53990) · 4fa8a676
  由 Haohongxiang 提交于 5月 24, 2023
  
  4fa8a676
23 5月, 2023 10 次提交
- Z
  [AMP OP&Test] Support float16 in selu (#54030) · 6133ca4e
  由 Zhang Zheng 提交于 5月 23, 2023
```
* [AMP OP&Test] Support float16 in selu

* fix
```
  6133ca4e
- F
  [CINN] Enable check_cinn on some tests (#53710) · 97fe79a9
  由 Fisher 提交于 5月 23, 2023
```
* Enable check_cinn on some tests

Tests: bitwise, compare, shape, assign_value, sum, expand_v2,
lookup_table, lookup_table_v2

* Enable more CINN tests

Tests with CINN: expand_v2, matmul, matmul_v2, mul, norm, one_hot_v2
Add target select in cinn_launch_op

* Revert test_mul_op

* Improve op unit tests
```
  97fe79a9
- C
  
  Fix typos (#54015) · adca3654
  由 co63oc 提交于 5月 23, 2023
  
  adca3654
- C
  
  Fix typos (#54011) · 11aa5edd
  由 co63oc 提交于 5月 23, 2023
  
  11aa5edd
- C
  
  Fix typos (#53960) · d89e0367
  由 co63oc 提交于 5月 23, 2023
  
  d89e0367
- C
  
  fix typos(#53967) · c36a000d
  由 cyberslack_lee 提交于 5月 23, 2023
  
  c36a000d
- L
  add host memory stats (#54036) · 01345a51
  由 Leo Chen 提交于 5月 23, 2023
```
* add host memory stats

* add ut
```
  01345a51
- R
  [CustomDevice] fix auto_paralell (#53842) · 3aa5d64e
  由 ronnywang 提交于 5月 23, 2023
```
* [CustomDevice] fix auto_paralell

* update

* update

* update
```
  3aa5d64e
- Z
  fix processing logic of the arange function when dtype is empty. (#53800) · 3ac1ccf9
  由 zxcd 提交于 5月 23, 2023
```
* fix processing logic of the arange function when dtype is empty.

* update commit version

* fix ValueError when end is None.

* add unitest for new case.

* fix tensor type.

* remove paddle.to_tensor(), add more test unit.

* remove useless line.

* fix enable_static

* add new test unit.

* fix by comment.
```
  3ac1ccf9
- H
  [0D-Tensor] Support elementwise_add (#53955) · 26c824db
  由 HongyuJia 提交于 5月 23, 2023
```
* [0D-Tensor] Support elementwise_add

* support elementwise_add ZeroDim2&3
```
  26c824db
22 5月, 2023 8 次提交

精简 virtual pipeline 调度逻辑 (#54003) · 6fde2056

由 zhenhailiu 提交于 5月 22, 2023

* unify code

* remove useless code

* polish

* python/paddle/distributed/fleet/meta_parallel/pipeline_parallel.py

* polish

* polish

6fde2056

[dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() (#53856) · 3794d171

由 Meteor Liu 提交于 5月 22, 2023

* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()

* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()

* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()

* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()

* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()

* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()

* fixed cyclic reference that caused patial import

* fixed bad change

* fix bad import

* fix bad import

* fix bad import

* fix ut failed caused by change in_dynamic_mode

* fix ut failed caused by change in_dynamic_mode

* fixed usage of in_dynamic_mode() or in_dygraph_mode()

* revert python3 to python in .pre-commit-config.yaml

* fix merge conflicts

3794d171

Z

remove depthwise_conv from extra black list (#53901) · 98f4446a
由 Zhang Ting 提交于 5月 22, 2023

98f4446a
N

Fix ctest error in test_amp_api (#53885) · 56947361
由 niuliling123 提交于 5月 22, 2023

56947361
N

Delete the chinese decription in ctest (#54018) · f7083f47
由 niuliling123 提交于 5月 22, 2023

f7083f47
J

fix device changed in setitem-numpy case (#53987) · ae35f502
由 JYChen 提交于 5月 22, 2023

ae35f502

Add multiclass_nms3 GPU kernel (#52401) · f71c805e

由 Tian Zheng 提交于 5月 22, 2023

* Add GPU kernel for multiclass_nms3 op

* Make multiclass_nms3 gpu kernel output consistent with cpu kernel

* Fix API incompatibility

* Fix unittests on builds without CUDA

* Fix ROCM build

* Remove fluid headers; Use default atol for unittest

* Change function and variable naming

* Add comments; Reduce redundant code

* Use paddle test framework

f71c805e

N
Print python trace back when debugmode = CHECK_NAN_INF_AND_ABORT and backward... · d2fa26f6
由 niuliling123 提交于 5月 22, 2023
```
Print python trace back when debugmode = CHECK_NAN_INF_AND_ABORT  and backward has nan/inf  (#52808)
```
d2fa26f6

20 5月, 2023 1 次提交
- S
  
  add info for topology (#54000) · 83a12b11
  由 ShenLiang 提交于 5月 20, 2023
  
  83a12b11
19 5月, 2023 5 次提交

add minimum grad composite rules (#52561) · 97690816

由 warrentdrew 提交于 5月 19, 2023

* add minimum grad composite rules

* add public python api

* fix format

* fix format

* update testcase

* fix testcase

* fix format

* fix cmakelist.txt

* fix format

* fix param problem

* fix op and composite rule

* fix bf16 cpu support problem

* fix bf16 cpu issue

* fix axis error log

* add axis for maximum

* revert commit

* remove .orig

* fix generic problem

* revert max op

* fix axis error

* fix maximum axis

* fix test_check_output

* fix cinn

* fix minimum maximum axis check

97690816

Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e

由 limingshu 提交于 5月 19, 2023

* Reorganize the forward codes of flash-attention.

* Fix forward.

* Remove some noused codes.

* Simplify codes and fix backward.

* Change all LOG(INFO) to VLOG and fix the backward.

* add scale for AF2 flash_attn, much thanks to xreki and shaojie for debug these codes

* decrease the effect of debug print on performance

* Unify the initialize of flashattn arguments.

* Rewirte the reshape of temp_mask and temp_bias.

* API support use_flash_attn.

* Fix compiling error on CI.

* Try to crop the flash-attention lib.

* Correct the condition of whether can use flash-attn.

* Remove the softmax_out argument.

* Remove is_causal.

* Polish codes.

* Fix qkv_transpose_out's shape and scaling of Q * K.

* Update commit of flash-attention.

---------
Co-authored-by: NLiu Yiqun <liuyiqun01@baidu.com>

d29c1f8e

Z
Add large dim test of log_softmax (#53954) · 1b6972fd
由 Zhang Zheng 提交于 5月 19, 2023
```
* Add large dim test of log_softmax

* fix
```
1b6972fd
C

fix meshgird and expand_as test (#53951) · 14e0ce71
由 Charles-hit 提交于 5月 19, 2023

14e0ce71
D
delete bf16 of cross entropy (#53922) · 69d3f4e3
由 Danyang Zhang 提交于 5月 19, 2023
```
* delete bf16 of cross entropy

* delete bf16 of cross entropy
```
69d3f4e3

18 5月, 2023 10 次提交

H

[XPU] fix bug on XPUPlace and AllGather (#53926) · 4a4ffe9a
由 houj04 提交于 5月 18, 2023

4a4ffe9a
C
[AMP OP&Test]support prod、meshgrid、expand_as bf16 dtype (#53865) · 706503d0
由 Charles-hit 提交于 5月 18, 2023
```
* add meshgrid,expand_as, prod and grad bf16 kernel

* fix bf16 for optest

* modify code style

* fix amp test
```
706503d0
P
[Bug Fix] fix parameter not passed in InstanceNorm (#53900) · dc1202b2
由 PuQing 提交于 5月 18, 2023
```
* fix parameter not passed

* fix repr
```
dc1202b2
H
[CINN] Fix TestGelu unittest of CINN (#53859) · 2b7cbd1b
由 HongyuJia 提交于 5月 18, 2023
```
* [CINN] Fix TestGelu unittest of CINN

* pass if_enable_cinn
```
2b7cbd1b
C

Add einsum tests (#53722) · c3c85794
由 co63oc 提交于 5月 18, 2023

c3c85794

Fused elementwises kernels and ops (#51427) · fb4a6ecf

由 Hulek 提交于 5月 18, 2023

* Fused elementwises kernels and ops

* change fuse pass name

* adjust .pbtxt files

* adjust quantization attributes

* add missing arguments and fix others, review fixed

* simplify fused kernel registration

* fix elementwise unit tests

* reuse one fused elementwise op

* adjust proto

* Add supported datatypes

* Change 'Scale' to 'scale' in tests, change some tests to onednn

* Revert breaking changes

* Fix unit tests

* Delete obsolete test cases

* Delete commented out code

* Fix codestyle

* delete temporary condition

* fix conflicts and delete duplicate fusing

* Fix code after merge

* Move tests to new directory

* fix tests volatility

* Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py

* Update CMakeLists.txt add mkldnn op test

---------
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>

fb4a6ecf

C

Add segment_pool tests (#53785) · 0bed2203
由 co63oc 提交于 5月 18, 2023

0bed2203
C

Fix typos (#53912) · 117e951b
由 co63oc 提交于 5月 18, 2023

117e951b
L

add fp16 and bf16 for trunc (#53876) · d8407c51
由 LoneRanger 提交于 5月 18, 2023

d8407c51

support auto generate for op layer_norm (#53178) · 4f07b653

由 RedContritio 提交于 5月 18, 2023

* simplify layer_norm_op.cc

* support auto generate for op layer_norm

* update unittest for composite_layer_norm

* remove layer_norm_op.cc from scripts

* replace layer_norm_op with generated_op

* add get_expected_kernel for layer_norm

* update cmake kernel register function for layer_norm_mkldnn_op

4f07b653

PaddlePaddle / Paddle 1 年多 前同步成功

PaddlePaddle / Paddle
1 年多前同步成功