Commits · a2d3c3359d05a92cdbb231025f6802ab44323f76 · PaddlePaddle / Paddle

28 Mar, 2023 11 commits
- support auto generate for cumprod (#52047) · a2d3c335
  Committed by 张春乔 on Mar 28, 2023
```
* mv cumprod

* add attrs

* Update backward.yaml

* Update backward.yaml
```
- add support to set chunk size of auto_growth_allocator (#52204) · b3efc923
  Committed by Leo Chen on Mar 28, 2023
```
* add flag to set chunk size

* use the flag

* add vlog

* add ut

* rename ut
```
- Add overflow check in memory efficient attention implementation (#52191) · ecff3864
  Committed by sneaxiy on Mar 28, 2023
```
* add overflow check in memory efficient attention

* fix ci compile error

* fix ci compile error
```
- fix int8 support for full kernel (#52194) · c145fd1e
  Committed by houj04 on Mar 28, 2023
```
* fix int8 support for full kernel

* fix ut.
```
- support auto generate for huber_loss (#51951) · 2ba4515e
  Committed by cyberslack_lee on Mar 28, 2023
```
* fix huber_loss

* fix

* fix ops.yaml add intermediate

* fix

* fix test
```
- support auto generate static for one_hot_v2 (#52134) · b6af72eb
  Committed by RedContritio on Mar 28, 2023
```
* support auto generate static for one_hot_v2

* format
```
- add autogen code support for margin_cross_entropy (#52130) · 8c8c6d9d
  Committed by Wang Xin on Mar 28, 2023
- [Static graph operator auto-generation] support auto generate for log_softmax (#52036) · ad9b88ad
  Committed by RedContritio on Mar 28, 2023
```
* support auto generate for log_softmax

* add data_type
```
- [API/OP] Support FP16/BF16 in paddle.nonzero API/OP (#51640) · 2e92357b
  Committed by Haohongxiang on Mar 28, 2023
- [AMP OP&Test] add fp16/bf16 unittest for conv ops (#51787) · ad5536eb
  Committed by wangxinxin08 on Mar 28, 2023
```
* add unittest for conv2d/depthwise_conv2d/conv2d_transpose

* add bf16 for DWConv and ConvTranspose

* fix unitest of conv2d_transpose

* modify DWConv2d op and unittest

* fix unittest of conv2d_transpose_bf16

* modify unittest name according to review

* modify atol of DWConv2D unittest
```
- Modify the registration information of the interpolate kernel (#52163) · 3b055199
  Committed by csy0225 on Mar 28, 2023
27 Mar, 2023 13 commits

edit formate of mea (#52147) · 13baef48
Committed by ZhangDY-6483 on Mar 27, 2023

[Zero-Dim] add FLAGS_set_to_1d, control whether to hack process to 1D, add ut for xpu (#51899) · 134c9c0c
Committed by zhouweiwei2014 on Mar 27, 2023

Add fuse_ops.yaml and fused_backward.yaml (#52010) · 10145cb6

Committed by HappyHeavyRain on Mar 27, 2023

* add fused_yaml fused_backward

* fix eager_funciton bug

* add some comment of fused yaml file

* add 'support_dygraph_mode' configuration in fused yaml

* delete some 'fused_api.h' in include file

* add fused flag in api_gen

elementwise: onednn: support zero dimension inputs (#51656) · 2c1d494e
Committed by Xinyu Chen on Mar 27, 2023

[CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output (#52114) · 04025237

Committed by HongyuJia on Mar 27, 2023

* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete dtype,shape func of multi_inplace op

* [CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output

Automatically generate 'assign' operator (#51940) · 888a30c9

Committed by HappyHeavyRain on Mar 27, 2023

* support assign op

* support assign infer_var_type

* change code according to review

* change code according to review

* only save 'get_infer_var_type_func'

* rest file mode

unbind support bool dtype (#52080) · 553630aa
Committed by Leo Chen on Mar 27, 2023
```
* unbind support bool dtype

* replace np.array_equal
```
Add data type of int, int64 for add kernel. Modify the code style of (#50443) · 62bff0e0
Committed by Leo Guo on Mar 27, 2023
```
instance_norm_grad kernel. Fix bugs that the data type of input is different from output in reduce_sum kernel. test=kunlun
```
fix_gcc12_error (#52083) · f7267412
Committed by risemeup1 on Mar 27, 2023
```
* fix_gcc12_error

* fix gcc12 error

* fix gcc12 error
```

fix_gcc12_error (#52007) · b2bd74f7

Committed by risemeup1 on Mar 27, 2023

* fix_gcc12_error

* patch on eigen3 for fixing gcc12 error

* Update multiary.cc

Fused elementwise_(mul/div) (#50428) · 968f7f24

Committed by Sławomir Siwek on Mar 27, 2023

* extract Op and OPMaker to .h

* extend pattern for fused_op

* set "with_residual" default to false

* adjust fuse passes

* remove fc+eltwise flag

* fused_output_scale

* activation attrs

* remove extra attrs

* fix int8/bf16 unit tests

* simplify RecomputeOutputDims

* remove unused method

* Add description for attributes

* add extra check

* adjust op compats

* update quantize test

* fix protobuf parsing error

* fix int8 performance

* fused elementwises

* merge develop

* remove activation

* restore activation for existing add/sub ops

[XPU] layer_norm support fp16 input of scale and bias. (#52091) · 14abafa1
Committed by houj04 on Mar 27, 2023

Fix memory efficient attention bug (#52117) · 019e1cf5

Committed by sneaxiy on Mar 27, 2023

* fix mea compile error

* support 2-D bias

* add inline to avoid compile error

* polish codes

25 Mar, 2023 1 commit
- [Fix Bug] fix get_new_shape and get_new_data_from_tensor not support fallback... · db5204ec
  Committed by Ruibin Cheung on Mar 25, 2023
```
[Fix Bug] fix get_new_shape and get_new_data_from_tensor not support fallback to CPU on custom device (#52002)
```
24 Mar, 2023 7 commits

add phi operator allreduce/reduce (#51857) · 47f87ad3

Committed by TaoTao Li on Mar 24, 2023

* add all_reduce, reduce kernel and api

* fix all_reduce reduce ut

fix reduce op maker conflict

fix merge conflicts

* fix conflicts, rename ReduceOp->ReduceBaseOp in reduce_ops

rename allreduce op, to remove

* fix code format

fix comments

* modify test_collective_reduce_api ut timeout

* fix PR-CI-Build

fix comments: format phi operator

[PHI Decoupling]Remove memory header (Part3) (#51288) · 3d78e759

Committed by YuanRisheng on Mar 24, 2023

* decouple memory copy

* fix ci bugs

* fix ci compile bugs

* fix rocm compile

* fix ci bugs

* decouple memory

* deal with conflict

* fix xpu compile bugs

* fix xpu bugs

* deal with xpu bugs

* fix cmake bugs

* fix windows bugs

* fix ci bugs

* fix ci bugs

* delete redundance code

* add code for pybind

* fix py3 bugs

* fix ci bugs

[PHI]fix momentum dtype infer (#51353) · 648ec795
Committed by PuQing on Mar 24, 2023
```
* fix momentum dtype infer

* fix momentum datatype

* fix on cpu

* add momentum
```
[PaddlePaddle Hackathon 4 No.40] Optimize the GPU compute performance of the kthvalue op for Paddle (#51835) · e18f5339
Committed by thunder95 on Mar 24, 2023
```
* untracked files

* kthvalue perf

* remove unused files

* fix isnan

* fix isnan2

* fix bug

* try to fix rocm error
```

Memory Efficient Attention (#51867) · e5ad3859

Committed by ZhangDY-6483 on Mar 24, 2023

* first version, notest

* return final rst, notest

* use infinity() instead of max

* ut structure

* start up of ut

* generate lse

* update

* add depense

* reconstruct cmake

* move file

* add memory efficient attention and fix blasimpl

* update

* update cmake

* add namespace

* update cmake

* use .cu

* update for pad3d

* bug fix

* bug fix

* update

* bug fix

* update enforce

* add test case

* merge the lse pad

* fix kernel_fn of backward

* fix PADDLE_ENFORCE_EQ and phi_api

* fix PADDLE_ENFORCE

* fix PADDLE_ENFORCE

* rerun coverage

* fix memory efficient attention test

* rerun ci

* add cuda version condition

* add cuda version condition

* delete WIP test

* replace PADDLE_ENFORCE

* edit the namespace of datatype in multiple.cc

* rerun

* rerun

---------
Co-authored-by: Nliuyuang <liuyuang@baidu.com>

remove copy of index for gather_nd_grad and scatter_nd_add op in xpu (#51871) · b110085f
Committed by zhangyikun02 on Mar 24, 2023

Fix roll kernel gpu bug. (#52012) · b6d0dac9
Committed by Yuang Liu on Mar 24, 2023

23 Mar, 2023 8 commits
- [Fix Bug] Fix customOP + customDevice scenario selects wrong place (#51996) · 2bf0d1c8
  Committed by HongyuJia on Mar 23, 2023
- [CustomOP Optional] CustomOP supports optional vector<Tensor> input (#51973) · 6a10e604
  Committed by HongyuJia on Mar 23, 2023
- [Polish Log] Polish Tensor operants' log: 'OperantsManager reusing XXX mode... · 5754aae5
  Committed by HongyuJia on Mar 23, 2023
```
[Polish Log] Polish Tensor operants' log: 'OperantsManager reusing XXX mode API {func_name}' (#51991)

* [Polish Log] Polish Tensor operants' log: 'OperantsManager reusing XXX mode API {func_name}'

* Make API name more precise
```
- pool2d and pool2d_grad support case of kernel_size > kh/kw for xpu (#51870) · 5f388221
  Committed by zhangyikun02 on Mar 23, 2023
- 【prim】delete high order prim flag && add special prune rules for node.cc (#51676) · 978d544b
  Committed by xiaoguoguo626807 on Mar 23, 2023
```
* delete prim flag for matmul_2_grad

* delete prim flag for matmul_2_grad

* add new setgradoutmeta for matmul_double_grad_node

* modify test and delete log

* deal with review
```
- [Prim] add meshgrid composite rule (#51061) · 53bb883d
  Committed by chenjian on Mar 23, 2023
```
* add meshgrid composite rule

* add meshgrid composite rule

* update

* add into CMakeLists

* fix

* update

* update

* optimize code

* fix meshgrid op

* update test
```
- [XPU] support lod_reset (#51967) · c491b361
  Committed by ZhouMengLei1999 on Mar 23, 2023
- Remove fluid deps in fused_linear_param_grad_add_kernel.cu (#51975) · 5da1a27b
  Committed by sneaxiy on Mar 23, 2023
```
* remove fluid deps in fused_linear_param_grad_add_kernel

* fix compile error

* fix ut error

* follow comments
```

PaddlePaddle / Paddle synced successfully about 1 year ago
