Commits · f2c96bc264854a3176890c51187f94ddad3ee44b · PaddlePaddle / Paddle

29 Mar, 2023 · 1 commit
- Fix generate_kernels.py in CUDA 12.0 (#52232) · f2c96bc2
  Committed by sneaxiy on Mar 29, 2023
```
* fix generate_kernels.py in CUDA 12.0

* fix attrs bug
```
  f2c96bc2
28 Mar, 2023 · 20 commits

Add basic functionalities to support Scalar & Scalars in op attr (#51984) · 2e9fd5e4

Committed by Feiyu Chan on Mar 28, 2023

Add basic functionalities to support Scalar & Scalars in operator attribute.

1. extend the allowed operator attribute types: add `paddle::experimental::Scalar` and the corresponding protobuf message types;
2. enhance Scalar: add formatting and equality;
3. add code to handle Scalar & Scalars in opmaker, conversion from paddle operator to phi kernel, opdesc construction and manipulation, tensorrt converter, tracer, operator construction, etc.;
4. bind `paddle::experimental::Scalar` to Python as `libpaddle.Scalar`;
5. add functionality to canonicalize an attribute map according to its OpProto (if the op the attribute map is used for has an OpProto);
6. add code to manipulate the Scalar proto message via the protobuf Python API.

Add unittests.

1. add test cases for formatting and equality of Scalars, and for WrapAsScalars;
2. add test cases for 'casting' between different morphs of attributes;
3. add test cases for extracting scalar & scalars from attribute;
4. add test cases for CanonicalizeScalarAttrs (and fix a bug in type index offset);
5. fix gmock's library filename on the Windows platform;
6. clean code: use canonicalize_attrs instead of inlining the function;
7. add test cases for libpaddle.Scalar in Python code;
8. add test cases for `make_scalar_proto`, which manipulates the proto message `Scalar` via the protobuf Python API.

2e9fd5e4

[inference] Remove log about fluid and fix uninitialization warning (#51558) · e91a7896
Committed by Zhang Jun on Mar 28, 2023
```
* Remove log about fluid
* Remove useless forward declarations
* Fix uninitialization warning (trt onehot)
```
e91a7896
support auto generate for kldiv_loss (#51886) · cdba7e36
Committed by cyberslack_lee on Mar 28, 2023

cdba7e36
support auto generate for cumprod (#52047) · a2d3c335
Committed by 张春乔 on Mar 28, 2023
```
* mv cumprod

* add attrs

* Update backward.yaml

* Update backward.yaml
```
a2d3c335
Del old dygraph optest7 (#51999) · 6d0fa6f2
Committed by wanghuancoder on Mar 28, 2023
```
* delete old dygraph op test
```
6d0fa6f2

【prim】change layernorm_grad rules (#51879) · 789aac8a

Committed by xiaoguoguo626807 on Mar 28, 2023

* support layer_norm prim and cinn test

* enable cinn test

* fix merge conflict

* polish input for check_output_with_place

* fix merge conflict

* add more test case

* fix merge conflict

* polish test case

* polish op_test

* change ln_g rules

* modify scale is none case

* modify scale is none case

* add public_python_api for check prim

* modify setoutputgrad and fp64bug

* add todo & delete log

* delete Single***varname

* delete get varname

* modify FP64 bug

* delete op test

* recover

* fix conflict

---------
Co-authored-by: NWeilong Wu <veyron_wu@163.com>

789aac8a

add support to set chunk size of auto_growth_allocator (#52204) · b3efc923
Committed by Leo Chen on Mar 28, 2023
```
* add flag to set chunk size

* use the flag

* add vlog

* add ut

* rename ut
```
b3efc923
Add overflow check in memory efficient attention implementation (#52191) · ecff3864
Committed by sneaxiy on Mar 28, 2023
```
* add overflow check in memory efficient attention

* fix ci compile error

* fix ci compile error
```
ecff3864
fix int8 support for full kernel (#52194) · c145fd1e
Committed by houj04 on Mar 28, 2023
```
* fix int8 support for full kernel

* fix ut.
```
c145fd1e
support auto generate for huber_loss (#51951) · 2ba4515e
Committed by cyberslack_lee on Mar 28, 2023
```
* fix huber_loss

* fix

* fix ops.yaml add intermediate

* fix

* fix test
```
2ba4515e
support auto generate static for one_hot_v2 (#52134) · b6af72eb
Committed by RedContritio on Mar 28, 2023
```
* support auto generate static for one_hot_v2

* format
```
b6af72eb
add autogen code support for margin_cross_entropy (#52130) · 8c8c6d9d
Committed by Wang Xin on Mar 28, 2023

8c8c6d9d
[Static graph operator auto-generation] support auto generate for log_softmax (#52036) · ad9b88ad
Committed by RedContritio on Mar 28, 2023
```
* support auto generate for log_softmax

* add data_type
```
ad9b88ad
[API/OP] Support FP16/BF16 in paddle.nonzero API/OP (#51640) · 2e92357b
Committed by Haohongxiang on Mar 28, 2023

2e92357b

[AMP OP&Test] add fp16/bf16 unittest for conv ops (#51787) · ad5536eb

Committed by wangxinxin08 on Mar 28, 2023

* add unittest for conv2d/depthwise_conv2d/conv2d_transpose

* add bf16 for DWConv and ConvTranspose

* fix unitest of conv2d_transpose

* modify DWConv2d op and unittest

* fix unittest of conv2d_transpose_bf16

* modify unittest name according to review

* modify atol of DWConv2D unittest

ad5536eb

[XPU] fix bug of AnalyseOpFuncType about xpu op : memcpy_d2d of xpu is actually async (#52042) · 93d20c44
Committed by ZhouMengLei1999 on Mar 28, 2023

93d20c44
[CustomDevice] fix reducer (#52115) · e7c249cb
Committed by ronnywang on Mar 28, 2023

e7c249cb
[CodeStyle][C405] Unnecessary <list/tuple> literal - rewrite as a set literal (#51972) · 9fa98349
Committed by Infinity_lee on Mar 28, 2023

9fa98349
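
The C405 rule named in this commit title comes from flake8-comprehensions: it flags `set([...])` and `set((...))`, where a list or tuple literal is built only to be copied into a set. A minimal sketch of the rewrite follows. The lines actually changed in the PR are not shown on this page, so the names `supported_before`, `supported_after`, and `is_supported` are hypothetical illustrations, not code from the commit.

```python
# Before: flagged by C405, a list literal is created and then copied into a set.
supported_before = set(["float16", "bfloat16", "float32"])

# After: a set literal builds the same set directly, with no temporary list.
supported_after = {"float16", "bfloat16", "float32"}


def is_supported(dtype: str) -> bool:
    """Hypothetical helper: membership test against the set literal."""
    return dtype in supported_after
```

Both spellings produce equal sets, so the change is purely stylistic; the set literal simply skips the intermediate list.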

[Hackathon NO.77] Add bitwise operator for Paddle-TRT (#51971) · 864b50c3

Committed by Young-Flash on Mar 28, 2023

* add bitwise_not trt converter

* run pre-commit

* modify neg_one_tensor_dims init way

* fix BOOL type support requires TensorRT 8.4

* fix int8 & uint8 type

* improve data type readability

* modify filter logic

* fix coverage CI

864b50c3

Modify the registration information of the interpolate kernel (#52163) · 3b055199
Committed by csy0225 on Mar 28, 2023

3b055199

27 Mar, 2023 · 18 commits
- [PHI]Support register functor kernel into PHI (#51914) · bcea3b89
  Committed by YuanRisheng on Mar 27, 2023
```
* perfect structure kernel registry

* fix ci bugs
```
  bcea3b89
- [NewExe]Adjust ExecutorCache Capacity from 4 into 10 (#52104) · 897fb6ab
  Committed by Aurelius84 on Mar 27, 2023
  
  897fb6ab
- edit formate of mea (#52147) · 13baef48
  Committed by ZhangDY-6483 on Mar 27, 2023
  
  13baef48
- [Zero-Dim] add FLAGS_set_to_1d, control whether to hack process to 1D, add ut for xpu (#51899) · 134c9c0c
  Committed by zhouweiwei2014 on Mar 27, 2023
  
  134c9c0c
- Add fuse_ops.yaml and fused_backward.yaml (#52010) · 10145cb6
  Committed by HappyHeavyRain on Mar 27, 2023
```
* add fused_yaml fused_backward

* fix eager_funciton bug

* add some comment of fused yaml file

* add 'support_dygraph_mode' configuration in fused yaml

* delete some 'fused_api.h' in include file

* add fused flag in api_gen
```
  10145cb6
- elementwise: onednn: support zero dimension inputs (#51656) · 2c1d494e
  Committed by Xinyu Chen on Mar 27, 2023
  
  2c1d494e
- [CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output (#52114) · 04025237
  Committed by HongyuJia on Mar 27, 2023
```
* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete dtype,shape func of multi_inplace op

* [CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output
```
  04025237
- Automatically generate 'assign' operator (#51940) · 888a30c9
  Committed by HappyHeavyRain on Mar 27, 2023
```
* support assign op

* support assign infer_var_type

* change code according to review

* change code according to review

* only save 'get_infer_var_type_func'

* rest file mode
```
  888a30c9
- fix scope reuse problem (#52119) · 97fc2a0f
  Committed by Leo Chen on Mar 27, 2023
  
  97fc2a0f
- Revert "fix softmaxce null point in shape test (#51850)" (#52086) · d92c6477
  Committed by wanghuancoder on Mar 27, 2023
```
This reverts commit 9c238d2b.
```
  d92c6477
- unbind support bool dtype (#52080) · 553630aa
  Committed by Leo Chen on Mar 27, 2023
```
* unbind support bool dtype

* replace np.array_equal
```
  553630aa
- add custom device mixed precision inference api (#50884) · a6449634
  Committed by engineer1109 on Mar 27, 2023
```
fix bug

remove useless

fix bug

add pybind

remove log

fix style

fix style

change api
```
  a6449634
- Add data type of int, int64 for add kernel. Modify the code style of (#50443) · 62bff0e0
  Committed by Leo Guo on Mar 27, 2023
```
instance_norm_grad kernel. Fix bugs that the data type of input is different from output in reduce_sum kernel. test=kunlun
```
  62bff0e0
- fix_gcc12_error (#52083) · f7267412
  Committed by risemeup1 on Mar 27, 2023
```
* fix_gcc12_error

* fix gcc12 error

* fix gcc12 error
```
  f7267412
- fix_gcc12_error (#52007) · b2bd74f7
  Committed by risemeup1 on Mar 27, 2023
```
* fix_gcc12_error

* patch on eigen3 for fixing gcc12 error

* Update multiary.cc
```
  b2bd74f7
- Fused elementwise_(mul/div) (#50428) · 968f7f24
  Committed by Sławomir Siwek on Mar 27, 2023
```
* extract Op and OPMaker to .h

* extend pattern for fused_op

* set "with_residual" default to false

* adjust fuse passes

* remove fc+eltwise flag

* fused_output_scale

* activation attrs

* remove extra attrs

* fix int8/bf16 unit tests

* simplify RecomputeOutputDims

* remove unused method

* Add description for attributes

* add extra check

* adjust op compats

* update quantize test

* fix protobuf parsing error

* fix int8 performance

* fused elementwises

* merge develop

* remove activation

* restore activation for existing add/sub ops
```
  968f7f24
- [XPU] layer_norm support fp16 input of scale and bias. (#52091) · 14abafa1
  Committed by houj04 on Mar 27, 2023
  
  14abafa1
- Fix memory efficient attention bug (#52117) · 019e1cf5
  Committed by sneaxiy on Mar 27, 2023
```
* fix mea compile error

* support 2-D bias

* add inline to avoid compile error

* polish codes
```
  019e1cf5
25 Mar, 2023 · 1 commit
- [CodeStyle][PLR0402] import a.b to from a import b (#52125) · 8c17fc0b
  Committed by 张春乔 on Mar 25, 2023
  
  8c17fc0b
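
The PLR0402 code in this commit title appears to refer to the lint rule that flags `import a.b as b` and suggests `from a import b` instead (Pylint calls it `consider-using-from-import`). A minimal sketch using only the standard library follows. The modules touched by the PR are not listed on this page, so `os.path` stands in as an assumed example, and `path_via_alias` and `same_module` are hypothetical names.

```python
# Before: the flagged form, where the alias just repeats the submodule name.
import os.path as path

path_via_alias = path  # keep a handle so the two spellings can be compared

# After: the equivalent from-import suggested by the rule.
from os import path


def same_module() -> bool:
    """Both spellings bind the very same module object."""
    return path_via_alias is path
```

Since both statements bind the same module object, the rewrite does not change behavior; it only makes the import shorter and more conventional.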

PaddlePaddle / Paddle · synced successfully about 1 year ago
