Commits · fd0d4fa437c5d8b80ba81585bb716bee4d674f36 · PaddlePaddle / Paddle

Feb 11, 2023 · 1 commit

[TRT] elementwise_add+transpose fusion (#50081) · fd0d4fa4

Committed by Wang Bojun on Feb 11, 2023

* eleadd_trans first version

log fix

* refine code for linear format, add pass check

* linear format refine and ut fix

* fix ut

* windows ut

* windows ut 2

* move tensorMeta and alloc to configure


Feb 10, 2023 · 21 commits
- remove if constexpr(), which is not supported on gcc54 (#50395) · 22bcb75a
  Committed by umiswing on Feb 10, 2023
- Remove redundant UTs for interptercore (#50392) · 17d10a5d
  Committed by Ruibiao Chen on Feb 10, 2023
- Fix bugs and add unit tests in instance_norm_grad_kernel when d_scale and (#50394) · 4c373e6b
  Committed by Leo Guo on Feb 10, 2023
```
d_bias are nullptr. Modify the code style of full_kernel.cc. Add new data
type for concat, elementwise_add, gather, scale, scatter ops. test=kunlun
```
- fix ninja error while using python>=3.9 (#48867) · 243cae59
  Committed by risemeup1 on Feb 10, 2023
- Fix inferMefer in transpose2_grad (#50388) · 42a75145
  Committed by Aurelius84 on Feb 10, 2023
```
* Fix inferMefer in transpose2_grad

* fix infershape

* fix unittest
```
- add xpu batch norm ncdhw layout, test=kunlun (#50384) · ca520280
  Committed by ykkk2333 on Feb 10, 2023
- fix bugs about ParallelEnv (#50405) · c1f2c52c
  Committed by yuehuayingxueluo on Feb 10, 2023
- [fluid clean]clean fluid.distribute_lookup_table (#50350) · 291c55a2
  Committed by wangxiaoning on Feb 10, 2023
```
* fluid clean

* fix optimizer

* fix distributed_transpiler

* fix fluid.__init__

* remove from fluid.init
```
- fix stackoverflow case13 gather (#50243) · bf80664c
  Committed by Infinity_lee on Feb 10, 2023
- Fix UFA illegal address access of case2: paddle.scatter (#50025) · fb228c4a
  Committed by RedContritio on Feb 10, 2023
```
* add dim check in scatter

* add check in scatter.cu

* add unittest

* remove unnecessary log and comment

---------

Co-authored-by: RedContritio <>
```
- [Bug Fix] Fix NLP-Bert model performance loss (#50333) · e1a792fe
  Committed by HongyuJia on Feb 10, 2023
```
* fix NLP-Bert model performance loss

* fix windows compile error
```
- Fix test_fleet_exe_dist_model_run bug (#48492) · ffbda80c
  Committed by risemeup1 on Feb 10, 2023
```
* fix test_fleet_exe_dist_model_run

* test
```
- fix_conv2d_transpose_double_grad (#50386) · 428c01d6
  Committed by Weilong Wu on Feb 10, 2023
- [XPU] add fc_xpu op&pass to optimize ernie model (#50277) · 945f918c
  Committed by zhupengyang on Feb 10, 2023
- Fix Python IndexError of Case14: paddle.nn.functional.glu (#50016) · 62fe3cf5
  Committed by LoneRanger on Feb 10, 2023
```
* add a value-range check on the dimension for split

* validate the axis value for glu and add unit tests

* improve the glu unit tests

* fix glu
```
- [Dy2St]Fix func.__self__ problem in FunctionSpec (#50404) · 3374600e
  Committed by Aurelius84 on Feb 10, 2023
- [Zero-Dim] support input 0D Tensor for std/var (#49735) · 86cc694f
  Committed by mhy-666 on Feb 10, 2023
```
* add test_std

* add test_var

* fix std/var assertequal

* fix std/var assertequal

* fix std/var assertequal

* -madd api name to reduce_api

* fix

* fix var

* fix

* fix

* fix stat

* fix unitest

* fix stat/var

* fix stat/var, unittest

* fix stat/std, unittest

* add unittest of var,std, fix stat/var,std

* fix stat/var, unittest

* fix

* fix unittest

* fix

* fix

* fix

* fix unittest
```
- fix default_attr=nullptr bug (#50383) · efef3035
  Committed by HongyuJia on Feb 10, 2023
- [phi decoupling] remove AllocatorFacade in phi (#50380) · d1bfb4b7
  Committed by Huang Jiyi on Feb 10, 2023
```
* remove AllocatorFacade in phi

* fix include

* fix bugs
```
- [phi decoupling] rm gradient_accumulator in phi (#50385) · 13f57ec0
  Committed by Huang Jiyi on Feb 10, 2023
```
* rm gradient_accumulator in phi

* update
```
- [XPU] bind op: atan & deformable_conv_v1 (#50373) · e15ef948
  Committed by wangshengxiang on Feb 10, 2023

Feb 09, 2023 · 18 commits

[trt][inference]support int64 shapetensor as engine input (#50170) · 14a92c8c
Committed by Zhang Jun on Feb 09, 2023
```
* update

* support int64 shape tensor as engine input

* add inference_predictor ut
```

Modify full kernel for xpu. test=kunlun (#50209) · 18e0e01d
Committed by Leo Guo on Feb 09, 2023

[kunlun] support async send/recv via group (#50329) · 350cd82a
Committed by Roc on Feb 09, 2023
```
Co-authored-by: zhangxiaoci <zhangxiaoci@baidu.com>
```

consider grad_op exist in forward program. (#50321) · 3862f347
Committed by xiongkun on Feb 09, 2023

Adjust mkldnn_placement_pass to check library type and data type (#49899) · ebdf3ef9
Committed by joanna.wozna.intel on Feb 09, 2023
```
* Adjust mkldnn_placement_pass to check library type and data type

* Check if var has inputs

* Remove unrelated test

* Refactor
```

remove paddle.fluid.dygraph.parallel.ParallelEnv (#50157) · 9dd1f4bf

Committed by zqw_1997 on Feb 09, 2023

* remove dygraph.parallel.ParallelEnv

* logger.py error: AttributeError: module 'paddle' has no attribute 'distributed'

* move the implementation to the root folder

* logger.py import ParallelEnv from paddle.parallel to avoid circular import

* add the comment of why import ParallelEnv from paddle.parallel in logger.py and remove the api interface in the paddle/parallel.py

* outdated Env and note removed

* decouple the logger.py and ParallelEnv

* remove another ref of parallel in init.py


[PHI decoupling] move strided_memcpy.h to phi (#50346) · 17318c1a

Committed by Huang Jiyi on Feb 09, 2023

* decouple strided_memcpy

* move strided_memcpy

* move strided_memcpy to phi

* fix namespace

* update

* fix gpu compile bugs


remove layout_utils in phi (#50355) · 90650534
Committed by Huang Jiyi on Feb 09, 2023

Add MultiTenosrAdam OP (#49220) · 10654c77

Committed by yuehuayingxueluo on Feb 09, 2023

* add multi_tenosr_adam

* update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py

* fix adam.py optimizer.py

* fix adamw.py

* fix test_multi_tensor_adam.py

* fix CI bug

* fix CI coverage

* fix ci bug

* fix betapow

* fix some bugs

* fix test_adamw_op.py

* fix CI coverage

* fix multi_tensor_adam_kernel.cc

* fix CI bug

* fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py

* fix code style

* update C++ parts

* remove python parts modification temporarily

* add C++ ut

* update betapow copy code logic

* fix ci ut

* fix windows ci

* fix coverage ci

* improve coverage rate

---------
Co-authored-by: sneaxiy <sneaxiy@126.com>

[audio] fix doc typo (#50343) · d676d552
Committed by YangZhou on Feb 09, 2023
```
* fix typo

* add sox_io in audio test

* fix

* fix
```

fix gc bug and start interceptor (#50344) · 5d5cb256
Committed by LiYuRio on Feb 09, 2023

Fix bugs in pass_base.py (#50136) · 5cae5fdd

Committed by yuehuayingxueluo on Feb 09, 2023

* fix the processing order of passes in pass_base.py

* fix processing order

* add _PASS_PROCESS_ORDER_LIST

* delete some pass in _PASS_PROCESS_ORDER_LIST

* add assert in pass_base.py

* remove fuse_optimizer

* add _fusion_opt_list_rule

* add test_pass_base_list.py

* fix some bug

* add fused_attention

* add some passes to list

* fix ci bug

* fix ci bug


[rm fluid] for the non distribution (#50313) · 7edfac9e
Committed by wangzhen38 on Feb 09, 2023

[Paddle-TRT] GroupNorm int8 nchw32 fake kernel (#50146) · d93c63a0

Committed by zhoutianzi666 on Feb 09, 2023

* add fmha_flashattention oss plugin

* add fmhca

* add oss fmhca

* code reconstruct and add ut

* code style refine

* fix ut and enforce check

* refine trt version check

refine compile

fix compile

* fix cross ut

* code refine

* use runtime trt version check

* bug fix and code refine

* compile fix

* merge develop

* add GN QDQ kernel

* support GN int8 fake kernel

* add with_int8

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8  UT

* add version > 8000 in GN int8 UT

* add some check in .cu

* add stdlib.h in UT

* little change  in .cu

* remove rand_r use rand

* remove use rand

* setAxis(1)

* when int8 is on allow fall back to fp16

---------
Co-authored-by: wwbitejotunn <wang_bojun@outlook.com>


clean communicator (#50339) · d9b70950
Committed by wangxiaoning on Feb 09, 2023

[BugFix][ConditionalBlock] fix judgement about scope validation (#50086) · 61f9f136
Committed by kangguangli on Feb 09, 2023
```
* fix judgement about scope validation

* fix ci bug: same address is not enough for data consistency

* remove useless check
```

Fix pscore test (#50349) · fe811625
Committed by pangengzheng on Feb 09, 2023

fix bn composite error shape (#50338) · e389f2fc
Committed by Jiabin Yang on Feb 09, 2023

PaddlePaddle / Paddle · synced successfully more than 1 year ago