Commits · 71e28b12815018a8420962fc6a20a3526085938d · PaddlePaddle / Paddle

Aug 31, 2023 · 3 commits
- Add fused_scale_bias_relu_conv_bnstats OP (#55026) · 71e28b12
  Committed by Tian Zheng on Aug 31, 2023
```
* Add fused_scale_bias_relu_conv_bnstats op

* Review changes

* Fix no CUDNN Frontend build

* Fix PADDLE_ENFORCE format

* Fix PADDLE_ENFORCE CI error

* Rename kernel filename

* Refactor unittest to use paddle eager_op_test

* Fix padding bugs

* Review changes

* test=cuda117

* test=cuda117
```
  71e28b12
- [Fluid] Move distributed_fused_lamb_init to phi (#55993) · 0bc369ef
  Committed by Zero Rains on Aug 31, 2023
  
  0bc369ef
- [ROCM] Remove the constraint with a maximum number of threads per block of 256, P1 (#56699) · d7679426
  Committed by ronnywang on Aug 31, 2023
  
  d7679426
Aug 30, 2023 · 3 commits

Add paddle custom flags support (#56256) · 2ef4ec71

Committed by huangjiyi on Aug 30, 2023

* update

* replace gflags header

* replace DEFINE_<type> with PD_DEFINE_<type>

* fix bug

* fix bug

* fix bug

* update cmake

* add :: before some paddle namespace

* fix link error

* fix CI-Py3

* allow commandline parse

* fix SetFlagsFromEnv

* fix bug

* fix bug

* fix CI-CINN

* fix CI-Coverage-build

* fix CI-Windows-build

* fix CI-Inference

* fix bug

* fix bug

* fix CI-CINN

* fix inference api test

* fix infer_ut test

* revert infer_ut gflags usage

* update

* fix inference

* remove flags export macro

* revert inference demo_ci gflags usage

* update

* update

* update

* update

* update

* update

* update

* update

* fix bug when turn on WITH_GFLAGS

* turn on WITH_GFLAGS

* fix bug when turn on WITH_GFLAGS

* fix bug when turn on WITH_GFLAGS

* update

* update and add unittest

* add unittest

* fix conflict

* rerun ci

* update

* resolve conflict

2ef4ec71

[ROCM] Remove the constraint with a maximum number of threads per block of 256, P4 (#56702) · 8c154880
Committed by ronnywang on Aug 30, 2023

8c154880

【complex op】No.6 add complex support for logical_and/or/xor/not (#56323) · 5cbf5bd4

Committed by iSerendipity on Aug 30, 2023

* 【complex op】No.6 add complex support for logical_and/or/xor/not

* fix dtype check

* modify the docs

* add special condition for not raise when x.dtype is complex

* add random generate for complex dtype

* fix generate for complex

* fix

* fix

* add corner case for complex type

* fix ut

* fix ut

5cbf5bd4

Aug 29, 2023 · 13 commits
- Remove need_move_to_phi (#56371) · daac3829
  Committed by Sonder on Aug 29, 2023
```
* remove flag

* open static build flag

* add searchsorted to list

* add register info for fused layernorm

* fix fused_layernorm_kernel output register info

* fix stft register info

* add include

* fix register info

* add skip fake init for fused_layernorm:residual_out

* fix error

* add distributed_fused_lamb_init to StaticBuildBlackList

* set static_build flag to false
```
  daac3829
- [DCU] support cum & multinomial for dcu (#56612) · 0c3e4cf6
  Committed by duanyanhui on Aug 29, 2023
```
* support cum & multinomial for dcu

* rm comment
```
  0c3e4cf6
- [ROCM] Remove the constraint with a maximum number of threads per block of 256, P2 (#56700) · 76b328bc
  Committed by ronnywang on Aug 29, 2023
  
  76b328bc
- [ROCM] Remove the constraint with a maximum number of threads per block of 256, P3 (#56701) · 593a4428
  Committed by ronnywang on Aug 29, 2023
  
  593a4428
- [Fluid] move lars_momentum to phi (#55798) · b0c2ee26
  Committed by gouzil on Aug 29, 2023
```
* [Fluid] move lars_momentum to phi

* add sig

* fix optional Output

* off check_dygraph

* fix input

* fix operator[]

* fix

* try fix AllocateTmpTensor

* fix

* fix type

* Update paddle/phi/kernels/gpu/lars_momentum_kernel.cu

* fix type

* rollback

* Add Registration

* try fix win

* try fix win

* try use double

* try use operator *(float,const Derived &)

* try auto

* fix

* fix

* fix

* fix dtype

* fix type

* fix index
```
  b0c2ee26
- make variable_length_memory_efficient_attention supports mask_broadcast_heads (#56673) · 6839a7b9
  Committed by lzy on Aug 29, 2023
  
  6839a7b9
- Revert "[NewIR]Fix new ir output dtype bug (#56620)" (#56739) · f5d9981e
  Committed by zhangbo9674 on Aug 29, 2023
```
This reverts commit 1409e4ec.
```
  f5d9981e
- [clang-tidy] No.26,27 enable misc-unused-using-decls,misc-unused-alias-decls (#56485) · 138bdf40
  Committed by cyberslack_lee on Aug 29, 2023
```
* fix

* fix
```
  138bdf40
- [clang-tidy] No. 53,54 enable cppcoreguidelines-c-copy-assignment-signature... · cc9e8699
  Committed by xiaoye on Aug 29, 2023
```
[clang-tidy] No. 53,54 enable cppcoreguidelines-c-copy-assignment-signature and bugprone-use-after-move (#56601)
```
  cc9e8699
- [clang-tidy] enable bugprone-misplaced-widening-cast check (#56635) · 11421705
  Committed by gouzil on Aug 29, 2023
  
  11421705
- [clang-tidy] NO.8 enable `cppcoreguidelines-narrowing-conversions`. step:1 (#56218) · b702d2ae
  Committed by gouzil on Aug 29, 2023
  
  b702d2ae
- [clang-tidy] enable clang-analyzer-core.UndefinedBinaryOperatorResult (#56636) · a694e679
  Committed by gouzil on Aug 29, 2023
  
  a694e679
- [clang-tidy] enable clang-analyzer-core.uninitialized.Assign check (#56637) · f2c7d162
  Committed by gouzil on Aug 29, 2023
  
  f2c7d162
Aug 28, 2023 · 6 commits

[NewIR]Fix new ir output dtype bug (#56620) · 1409e4ec

Committed by hong on Aug 28, 2023

* update

* fix batch norm grad args def

* fix bug

* fix combine slice bug

* fix slice bug

* update builtin split

1409e4ec

[clang-tidy] NO.65 enable `clang-analyzer-cplusplus.InnerPointer` check (#56693) · c0f5dac6
Committed by Zhenghai Zhang on Aug 28, 2023
```
* enable clang-analyzer-cplusplus.InnerPointer check

* fix bug
```
c0f5dac6

Change the print in debugging to RuntimeError (#56622) · d7e0f875
Committed by niuliling123 on Aug 28, 2023

d7e0f875

【inplace api】Batch add inplace api gt_, ge_, lt_, le_, eq_, not_equal_,... · c5fc413a

Committed by GGBond8488 on Aug 28, 2023

【inplace api】Batch add inplace api gt_, ge_, lt_, le_, eq_, not_equal_, logical_and_, logical_or_, logical_xor_, logical_not_, divide_, floor_divide_, bitwise_and_ , bitwise_or_, bitwise_xor_, bitwise_not_ (#55509)

* tmp commit

* add atan2

* add inplace api

* fix error

* add inplace divide

* add inplace api

* add more inplace

* add more inplace

* fix logical_not error

* support sinh and cosh in cpu

* support asin, acos, atan, asinh, acosh, atanh in cpu

* fix typo

* fix typo

* mv out atan2 ldexp

* mv out atan2 ldexp

* support sinh and cosh in gpu

* support asin, acos, atan, asinh, acosh, atanh in gpu

* fix ge error

* fix dygraph compare error

* fix dygraph compare error

* check complex in python

* fix cast inplace error

* open inplace test

* fix ops.yaml error

* mv cast inplace to python

* fix coverage ci

* add last inplace

* fix inplace error

* fix cast error

* fix error

* add nan_to_num_

* fix typo

* fix sparse cast error

* remove gpu 4

* fix static cast error

* tmp commit

* add atan2

* add inplace api

* fix error

* add inplace divide

* add inplace api

* add more inplace

* add more inplace

* fix logical_not error

* fix typo

* fix typo

* mv out atan2 ldexp

* mv out atan2 ldexp

* fix ge error

* fix dygraph compare error

* fix dygraph compare error

* fix cast inplace error

* open inplace test

* fix ops.yaml error

* mv cast inplace to python

* fix coverage ci

* add last inplace

* fix inplace error

* fix cast error

* fix error

* add nan_to_num_

* fix typo

* fix sparse cast error

* remove gpu 4

* fix static cast error

* fix cast error

* fix

* Revert "check complex in python"

This reverts commit c822064261d774dd58ad46a4f90ba8b467700a05.

* add renorm , fix error

* add coverage

* fix cumsum inplace version error

* add cast inplace impl

* rm test.log

* fix multiply_dyfunction and add multiply_backward test

* add and use is_same_tensor

* fix typo

* fix some error

* fix typo

---------
Co-authored-by: NScotty <jmhgchn@gmail.com>
Co-authored-by: NScotty <527407973@qq.com>

c5fc413a

optimize unique and index_put (#56582) · d674ea95
Committed by lijin23 on Aug 28, 2023

d674ea95

[Phi] move shuffle_batch to phi (#56547) · 30708028

Committed by Sonder on Aug 28, 2023

* move shuffle_batch to phi

* remove useless codes

* add test_shuffle_batch_op to STATIC_BUILD_TESTS

* move shuffle_batch_kernel.cc to cpu folder

* move shuffle_batch_grad to phi

* rm shuffle_batch_op.h

* change year at file head

30708028

Aug 25, 2023 · 4 commits
- [Reshard] Support create shard tensor and non-zero dim reshard (#56553) · 99795a13
  Committed by LiYuRio on Aug 25, 2023
```
* support create shard dist tensor

* support non-zero shard to replicated

* change reshard signature
```
  99795a13
- New ir support fuse bn add act (#56247) · d3f4596a
  Committed by hong on Aug 25, 2023
```
* support new ir load combine

* update

* polish code

* remove print

* update

* update

* update

* polish code

* fix bug

* polish code

* fix compile bug

* fix bug

* revert code

* remove useless code

* polish code
```
  d3f4596a
- [CustomDevice] add comm context support (#56301) · 62397cd2
  Committed by ronnywang on Aug 25, 2023
  
  62397cd2
- [Paddle Inference] Add bias input of mmha and simplify mmha. (#56411) · 636dc2ff
  Committed by xiaoxiaohehe001 on Aug 25, 2023
```
* add_bias_and_simplify_mmha
```
  636dc2ff
Aug 24, 2023 · 4 commits
- Add enable/disable_model_check_nan_inf op (#54081) · 1c0db09a
  Committed by niuliling123 on Aug 24, 2023
  
  1c0db09a
- refine fill with tensor (#56568) · 9ad06e06
  Committed by wanghuancoder on Aug 24, 2023
  
  9ad06e06
- [New IR]Support build New IR model in python (#56315) · 4f652ac2
  Committed by YuanRisheng on Aug 24, 2023
  
  4f652ac2
- [XPU] Add embedding plugin (#56488) · 2a5adc5a
  Committed by csy0225 on Aug 24, 2023
  
  2a5adc5a
Aug 23, 2023 · 2 commits
- move c_identity to phi (#56215) · 9ed58bff
  Committed by Wang Xin on Aug 23, 2023
  
  9ed58bff
- [IR] Ir fill constant (#56520) · e914f7fc
  Committed by wanghuancoder on Aug 23, 2023
```
* support ir fill constant
```
  e914f7fc
Aug 22, 2023 · 5 commits
- [XPU] modify add_layernorm_xpu kernel (#56429) · eb0e4d4b
  Committed by jiangfan06 on Aug 22, 2023
  
  eb0e4d4b
- [Fluid] NO.4 Migrate c_split to PHI (#56327) · 5dc7ff04
  Committed by Ruibin Cheung on Aug 22, 2023
  
  5dc7ff04
- [XPU][PHI Kernels] add index_put kernel for xpu (#56169) · 332a73b1
  Committed by lijin23 on Aug 22, 2023
```
* add inverse kernel for xpu

* add more kernels

* add index_put kernel for xpu

* add index_put kernel for xpu

* remove unused headers

* refine test

* wait to avoid memory bugs for xpu

* refine inverse
```
  332a73b1
- fix delete_repeated_ops_pass, fix multiclass_nms3 (#56434) · 3e55f255
  Committed by zhupengyang on Aug 22, 2023
  
  3e55f255
- [Paddle Inference] refactor linear_compress (#55490) · ffff3da0
  Committed by FormlessUnit on Aug 22, 2023
```
* Modify kernels to support quantized_matmul

---------
Co-authored-by: Nsuperxf <1208713646@qq.com>
```
  ffff3da0

PaddlePaddle / Paddle · synced successfully about 1 year ago