Commits · 613beeb6702995b618fd0205b7279815b7beb47e · PaddlePaddle / Paddle

03 Aug, 2023 3 commits
- [XPU] Fix compilation errors of XPU plugin on multiple versions of GCC (#55924) · 613beeb6
  by hong19860320 on Aug 03, 2023
- fix security bug (#55870) · 08f28b40
  by wanghuancoder on Aug 03, 2023
```
* fix security bug
```
- fix security bug (#55865) · dcf30692
  by wanghuancoder on Aug 03, 2023
```
* fix security bug
```
02 Aug, 2023 6 commits

[clang-tidy] NO.6 enable `modernize-avoid-c-arrays` check (#55774) · c000091e

by gouzil on Aug 02, 2023

* [clang-tidy] modernize-avoid-c-arrays

* rollback

* [clang-tidy] fix

* close modernize-avoid-c-arrays

* fix PHI_DEFINE_string; add PHI_DEFINE_bool NOLINT

* fix PHI_DEFINE_string

* fix next_h_state and parity err

* fix win32

* fix cuda_graph

* fix accuracy_kernel

* fix math_function

* fix fused_softmax_mask_kernel.cu load_data and warp_reduce; rollback concat_and_split_functor ins_addr

* fix fused_dropout_add_grad_kernel

* fix

* rollback cu

* rollback concat_and_split_functor.cu

* rollback

[XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
by wz1qqx on Aug 02, 2023

[Inference] Replace groupNorm when data types are bf16 and fp16, and data... · e61d892a

by yangjianfengo1 on Aug 02, 2023

[Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399)

* finish

* cpergroup odd

* fix bf16

* single channel

* code style

* jingdu duiqi

* add head_file

* add bf16 head file

* bf16 2

* bf16

* bf16 head

* bf16 compile

* py test

* bf16 compile

* bf16 compile

* unset py test

* nhwc

* test

* mean var

* bf16 success

* su

* ctest success

* use is_same_as

* is_same

* use is_same

* rtol

* gpu_stream

* del sigmod

* fix bfloat16 type

* use cuda_bf16_hpp

* use_cuda_arch

* bfloat162float2

* del inplace_tol

* del max_releative_tol

* temp store

* jingdu duiqi

* temp store

* plugin

* jingdu duiqi

* duiqi

* include cuda.h

* del half

* half single

* ci

* add const

* ci

* cudamemset

* del printf

* fp16 test

* add half compute

* del br16 ci

* del ci

* ci approve

* del fluid include

Add FP16 & BF16 for erfinv (#55287) · 6d7efd09
by cyberslack_lee on Aug 02, 2023

fix security bug (#55782) · 19da5c0c
by wanghuancoder on Aug 02, 2023
```
* fix security bug
```

[XPU] Add gather_squeeze_pass (#55605) · d13a49d6
by jiangfan06 on Aug 02, 2023

01 Aug, 2023 5 commits
- move prune_gate_by_capacity to phi (#55780) · 6b93ba0a
  by Sonder on Aug 01, 2023
```
* move prune_gate_by_capacity to phi

* fix

* fix registe info

* remove useless codes
```
- [phi] move nop to phi (#55816) · 719b1ed3
  by gouzil on Aug 01, 2023
- [NewIR]New ir support print op (#55648) · 75c29ac1
  by hong on Aug 01, 2023
```
* new ir support print op

* fix gpu bug

* fix bug

* update

* remove layout to string

* remove usless header

* polish code

* fix bug

* posolis code
```
- [ROCM] fix concat and split (#55821) · d7aef892
  by ronnywang on Aug 01, 2023
- [XPU] Add fast_where fusion op and XPU micro kernel (#55628) · 07e788f1
  by hong19860320 on Aug 01, 2023
31 Jul, 2023 6 commits

[NewIR]fix new ir shadow typo (#55706) · 2265d63c
by hong on Jul 31, 2023
```
* fix new ir shadow typo

* update
```

[Fluid] Move random routing to phi (#55773) · ada393b3
by Sonder on Jul 31, 2023

Support stride2 (#55156) · 859fc01b
by wanghuancoder on Jul 31, 2023
```
support stride
```

rename BatchNormGradFunctor (#55717) · eee4b8fb

by zhangyuqin1998 on Jul 31, 2023

* rename BatchNormGradFunctor

* Update batch_norm_grad_kernel.cc

* Update batch_norm_grad_kernel.cu

* Update batch_norm_grad_kernel.cc

* fix

* Update batch_norm_grad_kernel.cc

Add float16 and bfloat16 support and test for argsort (#55105) · 60e37d17
by cyberslack_lee on Jul 31, 2023

【PaddlePaddle Hackathon 4】No.56 : add fp16 test and bf16 for poisson (#51662) · 608a3f28

by LoneRanger on Jul 31, 2023

* add fp16 and bf16 support for poisson

* add fp16 and bf16 support for searchsorted

* fix bug

* Update test_searchsorted_op.py

fix function name

* Update test_poisson_op.py

fix function name

* fix bug

* remove the searchorted

* Update test_poisson_op.py

* fix bug of TestPoissonBF16Op

* Update test_poisson_op.py

* Update test_poisson_op.py

* Update test_poisson_op.py

* fix bug of import

* fix bug

28 Jul, 2023 2 commits
- 【Complex op】add complex support for sin, cos, tan, tanh (#55380) · 3bedec8a
  by Scotty on Jul 28, 2023
```
* add complex dtype for tanh

* add test case

* support complex for sin, cos and tan

* support gpu

* fix error in cpu

* fix gpu error

* set check_prim to False only for complex type
```
- [bug fix] fix scatter 0d index grad error (#55738) · 3e2c6a56
  by Yuang Liu on Jul 28, 2023
27 Jul, 2023 3 commits

add int32/int64 for outer/matmul Kernel. (#55584) · ff2142f2
by zxcd on Jul 27, 2023
```
* add int32/int64 for outer/matmul Kernel.

* fix by comment.

* fix by comment
```

[NewIR]Fix new ir dygraph 2 static concat grad bug (#55634) · 51ebcf68

by hong on Jul 27, 2023

* add kernel dialect

* change DenseTensorTypeStorage to DenseTensorType

* add test case

* add first pd_op to kernel dialect

* lower pd op to kernel dialect

* update

* update

* remove useless code

* add attrite print test

* fix bug

* update

* update

* update

* update

* polish code

* fix bug

* polish  code  and add python test

* add test

* fix test error

* relax constraint when inserting get_parameter

* add env flag

* fix bug

* dygraph2static support new ir

* fix bug

* revert test env

* change cc_test_old to cc_test

* update

* fix build_static bug

* update test

* fix type test error

* udpate cmake

* disable test in windows

* fix inference compile

* fix program translator error

* only run on cpu, not support gpu yet

* fix conflict

* polish code

* fix bug

* add feed with place op

* update

* remove useless unitest

* udpate mkldnn

* update

* update

* align mkldnn version

* new ir support builtin slice op

* fix bug

* fix phi kernel adaptor bug

* add enable static

* add enable_static

* remove useless test case

* change feed list to single variable

* update

* add feed with place and shaddow output op

* fix bug

* remove usless code

* support gpu

* fix bug

* fix bug

* remove template

* add more data type

* fix cimpile bug

* udpate

* remove useless code

* revert dygraph2st test

* remove usless code

* revert op

* fix bug

* remove instance norm

* fix concat grad bug

* revert code

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>

【inplace api】batch add inplace api paddle.log_, paddle.i0_,... · 58a03d41

by GGBond8488 on Jul 27, 2023

【inplace api】batch add inplace api paddle.log_, paddle.i0_, paddle.nn.functional.leaky_relu_... (#55576)

* batch add inplace api

* add inplace test

* add activation inplace

* fix test

* remove atan2 ge, gt, le, lt, nq

* remove atan2 ge, gt, le, lt, nq

* fix windows ci error

* rerun ci

* fix typro

* fix bugs

---------
Co-authored-by: zhangrui34 <v_zhangrui34@baidu.com>

26 Jul, 2023 4 commits
- add sin and cos optional parameters to fused_rope op (#55415) · 581d05bb
  by tianhaodongbd on Jul 26, 2023
- Add FP16 & BF16 for lamb (#55641) · 84a56b4a
  by Difer on Jul 26, 2023
- [Reshard] Implement replicated to split with same placement (#55552) · 9f3b5f15
  by LiYuRio on Jul 26, 2023
```
* Implement replicated to split reshard function

* fix link error in clang

* refine split functor

* simplify reshard code
```
- add modernize-redundant-void-arg check (#55652) · 12fb18dd
  by gouzil on Jul 26, 2023
25 Jul, 2023 7 commits

fix a bug caused by hipcc lambda value capture (#55612) · 8db3ff1f
by lishicheng1996 on Jul 25, 2023

Bugfix, fast layer norm, OOB (#55639) · 017a6164

by Jeng Bai-Cheng on Jul 25, 2023

* Fix LayerNormForward perf issue

* Bugfix, fast_layer_norm OOB

* apply pre-commit

---------
Co-authored-by: Shijie Wang <jaywan@nvidia.com>

add all false bool indices support for index_put (#55655) · c737f0ae
by 傅剑寒 on Jul 25, 2023

fix bugs in rnn op (#55656) · 0cd422b6
by Lucas on Jul 25, 2023

fix div 0 bug (#55644) · 690ffe81
by wanghuancoder on Jul 25, 2023

[NewIR]new ir dygraph to static supoort gpu (#55620) · fb9bec5d

by hong on Jul 25, 2023

* add kernel dialect

* change DenseTensorTypeStorage to DenseTensorType

* add test case

* add first pd_op to kernel dialect

* lower pd op to kernel dialect

* update

* update

* remove useless code

* add attrite print test

* fix bug

* update

* update

* update

* update

* polish code

* fix bug

* polish  code  and add python test

* add test

* fix test error

* relax constraint when inserting get_parameter

* add env flag

* fix bug

* dygraph2static support new ir

* fix bug

* revert test env

* change cc_test_old to cc_test

* update

* fix build_static bug

* update test

* fix type test error

* udpate cmake

* disable test in windows

* fix inference compile

* fix program translator error

* only run on cpu, not support gpu yet

* fix conflict

* polish code

* fix bug

* add feed with place op

* update

* remove useless unitest

* udpate mkldnn

* update

* update

* align mkldnn version

* new ir support builtin slice op

* fix bug

* fix phi kernel adaptor bug

* add enable static

* add enable_static

* remove useless test case

* change feed list to single variable

* update

* add feed with place and shaddow output op

* fix bug

* remove usless code

* support gpu

* fix bug

* fix bug

* remove template

* add more data type

* fix cimpile bug

* udpate

* remove useless code

* revert dygraph2st test

* remove usless code

* revert op

* fix bug

* new ir dygraph2static support gpu

* remove usless code

* code polish

* add const

* revert code and remove useless code

* revert code

* revert legacy op yaml

* remove useless code

* delete std::move

---------
Co-authored-by: kangguangli <kangguangli@hotmail.com>

[XPU] Add FP16 support for arg_min_max (#55642) · 14094aad
by jiangfan06 on Jul 25, 2023

24 Jul, 2023 1 commit
- [PHI] add fused_softmax_mask and fused_softmax_mask_grad for CPU. (#55616) · b10b899c
  by houj04 on Jul 24, 2023
20 Jul, 2023 3 commits
- [NewIR]Change feed list to variable list && support GPU (#55401) · 75517841
  by hong on Jul 20, 2023
```
* add feed with place op

* remove useless unitest

* udpate mkldnn

* update

* new ir support builtin slice op

* fix phi kernel adaptor bug

* add enable_static

* remove useless test case

* change feed list to single variable

* support gpu

* fix bug

* remove template

* add more data type

* fix cimpile bug
```
- [XPU] fuse cast to conv2d/fc in mixed precision model (#54493) · 4df00939
  by zhupengyang on Jul 20, 2023
- rename hard_sigmoid to hardsigmoid for kernel name (#55559) · c3080386
  by zyfncg on Jul 20, 2023

PaddlePaddle / Paddle synced successfully about 1 year ago