Commits · 82cf1fad8fd2e62e30d01434d4038a1ecfea74e8 · PaddlePaddle / Paddle

12 Feb, 2023 1 commit
- [prim] generate static prim api (#50315) · 82cf1fad
  Committed by Xiaoxu Chen on Feb 12, 2023
11 Feb, 2023 2 commits

[Tensor Operator] Overload Tensor Operator (#50098) · 14e45f6b

Committed by HongyuJia on Feb 11, 2023

* init commit

* fix tensor operator*

* fix compile bug

* bug reproduce

* update commit

* polish codes

* fix compile bug

* test begin

* test begin

* compile finish

* restore origin composite_backward_api

* pass local CI

* fix merge error

* fix merge error

* change py_test from GPU->CPU, test custom op

* polish codes, modify prim unittest

* modify prim unittest

* determine phi_tensor_operants location

* polish codes

* add header file

* solve windows unresolved symbol

* fix some CI error

* add overload defination

* fix CI inference and Windows

* polish codes according to reviewers' opinion

* polish codes according to reviewers' opinion

[TRT] elementwise_add+transpose fusion (#50081) · fd0d4fa4

Committed by Wang Bojun on Feb 11, 2023

* eleadd_trans first version

log fix

* refine code for linear format, add pass check

* linear format refine and ut fix

* fix ut

* windows ut

* windows ut 2

* move tensorMeta and alloc to configure

10 Feb, 2023 14 commits
- remove if constexpr(), which is not supported on gcc54 (#50395) · 22bcb75a
  Committed by umiswing on Feb 10, 2023
- Fix bugs and add unit tests in instance_norm_grad_kernel when d_scale and (#50394) · 4c373e6b
  Committed by Leo Guo on Feb 10, 2023
```
d_bias are nullptr. Modify the code style of full_kernel.cc. Add new data
type for concat, elementwise_add, gather, scale, scatter ops. test=kunlun
```
- Fix inferMefer in transpose2_grad (#50388) · 42a75145
  Committed by Aurelius84 on Feb 10, 2023
```
* Fix inferMefer in transpose2_grad

* fix infershape

* fix unittest
```
- add xpu batch norm ncdhw layout, test=kunlun (#50384) · ca520280
  Committed by ykkk2333 on Feb 10, 2023
- fix stackoverflow case13 gather (#50243) · bf80664c
  Committed by Infinity_lee on Feb 10, 2023
- Fix UFA illegal address access of case2: paddle.scatter (#50025) · fb228c4a
  Committed by RedContritio on Feb 10, 2023
```
* add dim check in scatter

* add check in scatter.cu

* add unittest

* remove unnecessary log and comment

---------

Co-authored-by: RedContritio <>
```
- [Bug Fix] Fix NLP-Bert model performance loss (#50333) · e1a792fe
  Committed by HongyuJia on Feb 10, 2023
```
* fix NLP-Bert model performance loss

* fix windows compile error
```
- Fix test_fleet_exe_dist_model_run bug (#48492) · ffbda80c
  Committed by risemeup1 on Feb 10, 2023
```
* fix test_fleet_exe_dist_model_run

* test
```
- fix_conv2d_transpose_double_grad (#50386) · 428c01d6
  Committed by Weilong Wu on Feb 10, 2023
- [XPU] add fc_xpu op&pass to optimize ernie model (#50277) · 945f918c
  Committed by zhupengyang on Feb 10, 2023
- fix default_attr=nullptr bug (#50383) · efef3035
  Committed by HongyuJia on Feb 10, 2023
- [phi decoupling] remove AllocatorFacade in phi (#50380) · d1bfb4b7
  Committed by Huang Jiyi on Feb 10, 2023
```
* remove AllocatorFacade in phi

* fix include

* fix bugs
```
- [phi decoupling] rm gradient_accumulator in phi (#50385) · 13f57ec0
  Committed by Huang Jiyi on Feb 10, 2023
```
* rm gradient_accumulator in phi

* update
```
- [XPU] bind op: atan & deformable_conv_v1 (#50373) · e15ef948
  Committed by wangshengxiang on Feb 10, 2023

09 Feb, 2023 15 commits

[trt][inference]support int64 shapetensor as engine input (#50170) · 14a92c8c
Committed by Zhang Jun on Feb 09, 2023
```
* update

* support int64 shape tensor as engine input

* add inference_predictor ut
```

Modify full kernel for xpu. test=kunlun (#50209) · 18e0e01d
Committed by Leo Guo on Feb 09, 2023

[kunlun] support async send/recv via group (#50329) · 350cd82a
Committed by Roc on Feb 09, 2023
```
Co-authored-by: zhangxiaoci <zhangxiaoci@baidu.com>
```

Adjust mkldnn_placement_pass to check library type and data type (#49899) · ebdf3ef9
Committed by joanna.wozna.intel on Feb 09, 2023
```
* Adjust mkldnn_placement_pass to check library type and data type

* Check if var has inputs

* Remove unrelated test

* Refactor
```

[PHI decoupling] move strided_memcpy.h to phi (#50346) · 17318c1a

Committed by Huang Jiyi on Feb 09, 2023

* decouple strided_memcpy

* move strided_memcpy

* move strided_memcpy to phi

* fix namespace

* update

* fix gpu compile bugs

remove layout_utils in phi (#50355) · 90650534
Committed by Huang Jiyi on Feb 09, 2023

Add MultiTenosrAdam OP (#49220) · 10654c77

Committed by yuehuayingxueluo on Feb 09, 2023

* add multi_tenosr_adam

* update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py

* fix adam.py optimizer.py

* fix adamw.py

* fix test_multi_tensor_adam.py

* fix CI bug

* fix CI coverage

* fix ci bug

* fix betapow

* fix some bugs

* fix test_adamw_op.py

* fix CI coverage

* fix multi_tensor_adam_kernel.cc

* fix CI bug

* fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py

* fix code style

* update C++ parts

* remove python parts modification temporarily

* add C++ ut

* update betapow copy code logic

* fix ci ut

* fix windows ci

* fix coverage ci

* improve coverage rate

---------
Co-authored-by: sneaxiy <sneaxiy@126.com>

fix gc bug and start interceptor (#50344) · 5d5cb256
Committed by LiYuRio on Feb 09, 2023

[Paddle-TRT] GroupNorm int8 nchw32 fake kernel (#50146) · d93c63a0

Committed by zhoutianzi666 on Feb 09, 2023

* add fmha_flashattention oss plugin

* add fmhca

* add oss fmhca

* code reconstruct and add ut

* code style refine

* fix ut and enforce check

* refine trt version check

refine compile

fix compile

* fix cross ut

* code refine

* use runtime trt version check

* bug fix and code refine

* compile fix

* merge develop

* add GN QDQ kernel

* support GN int8 fake kernel

* add with_int8

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8 fake kernel

* add GN int8  UT

* add verison > 8000  in GN int8  UT

* add some check in .cu

* add stdlib.h in UT

* little change  in .cu

* remove rand_r use rand

* remove use rand

* setAxis(1)

* when int8 is on allow fall back to fp16

---------
Co-authored-by: wwbitejotunn <wang_bojun@outlook.com>

[BugFix][ConditionalBlock] fix judgement about scope validation (#50086) · 61f9f136
Committed by kangguangli on Feb 09, 2023
```
* fix judgement about scope validation

* fix ci bug: same address is not enough for data consistency

* remove useless check
```

Fix pscore test (#50349) · fe811625
Committed by pangengzheng on Feb 09, 2023

[IR] Type system stage1: add class TypeId, class AbstractType, class TypeStorage (#50242) · f11c913e

Committed by zhangbo9674 on Feb 09, 2023

* add TypeID

* Specification comment code

* refine code

* add AbstractType

* add TypeStorage

* fix unittest bug

* change dir

* change dir

* refine code

* fix bug

* Refine code by comment

* delete unused code

* normative naming rules

* refine code by comment

* refine doc

* refine codestyle

add logical_and, logical_or and logical_xor for xpu (#50228) · 0036316e
Committed by zhangyikun02 on Feb 09, 2023

[TRT] Transpose layernorm fusion with different input format (#50082) · b2bb7ec9
Committed by Wang Bojun on Feb 09, 2023
```
* trans_layernorm
```

fix set_value_65965 (#50340) · b3f60f39
Committed by 傅剑寒 on Feb 09, 2023

08 Feb, 2023 8 commits
- fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe
  Committed by Paulina Gacek on Feb 08, 2023
```
* QuantTranpose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse  added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added
```
- Add bf16 support for fused matmul (#50254) · b47923b4
  Committed by Sławomir Siwek on Feb 08, 2023
```
* add support for bf16 fused_ops

* fused_matmul only
```
- [code style]fix cpplint codestyle (#50314) · 209d534d
  Committed by wangxiaoning on Feb 08, 2023
```
* fix codestyle

* fix std
```
- [inference][trt] Disable ShapeTensor for nearest_interp_v2 when trt version < 8.2 (#50258) · fa284076
  Committed by Zhang Jun on Feb 08, 2023
```
* update

* update

* format code

* update

* Update test_trt_convert_nearest_interp_v2.py
```
- Fused attention pass mp support (#50320) · e44ff495
  Committed by Yuang Liu on Feb 08, 2023
- [pglbox]hidden unzip (#50292) · a7539508
  Committed by zmxdream on Feb 08, 2023
```
* hidden unzip

* fix

* fix
```
- Export custom operator-related function symbols (#50238) · f9c801ff
  Committed by weishengying on Feb 08, 2023
- Use inference, save construct time (#50163) · 7a82b6de
  Committed by HongyuJia on Feb 08, 2023

PaddlePaddle / Paddle synced successfully about 1 year ago