Commits · a5827f0ee3104fa19938f4e70eb5229063949900 · PaddlePaddle / Paddle

27 Feb, 2023 · 1 commit

[Bfloat16]register bfloat16 datatype for squared l2 norm (#50908) · 3c121040

Committed by shaojie_wang on Feb 26, 2023

* register bfloat16 datatype for squared l2 norm

* register bfloat16 datatype for softmax with upper triangular mask

* register bfloat16 for tril triu cuda kernel

3c121040

26 Feb, 2023 · 1 commit

Enable matmul + bias fusion in fused_gat_attention. (#50755) · 57f6a469

Committed by Yiqun Liu on Feb 26, 2023

* Enable matmul + bias fusion in fused_gat_attention.

* Add a variable to control whether using fused matmul + bias.

57f6a469

24 Feb, 2023 · 4 commits

【Prim】Fix prim amp (#50518) · 6664a232

Committed by Jiabin Yang on Feb 24, 2023

* change amp with to_prim

* fix prim amp

* fix rules

* fix liear

* add amp test

* add test

* disable this test on cpu

* disable this test on cpu

---------
Co-authored-by: cyber-pioneer <chenzhuo@tju.edu.cn>

6664a232

fix composite grad maker code gen (#50854) · 07c416c8

Committed by Charles-hit on Feb 24, 2023

07c416c8

support 'backend' in static ops (#50671) · 363825df

Committed by HappyHeavyRain on Feb 24, 2023

* support 'backend' in static ops

* change bitwise_xx comment in python

* change bitwise_xxx comment in python

* change 'backend' and 'data_type' in GetExpectedKernelType

363825df

【prim】Slice grad (#50771) · f6dea800

Committed by xiaoguoguo626807 on Feb 24, 2023

* support prim test in OpTest

* fix cmake

* fix op test

* fix test_input_spec

* disable cinn in reduce_sum unit test

* add bfloat16 dtype for sum

* add approve rules

* polish code

* add clear jit program function

* convert grad out from tensor to numpy

* remove unnecessary code

* add only_prim flag

* fix flag

* fix op test

* add attr

* fix optest comp inplace error

* fix op test

* fix op test with guard

* add initialization of check_comp flag

* fix comp inplace error in op test

* rename check_comp with check_prim and add bfloat16 dtype convert

* rename comp_op_type to prim_op_type

* rename comp to prim

* remove useless code

* skip ci check for only prim

* add no_grad_vars and grad_outputs in prim test

* fix var_dict

* fix op test for only_prim

* fix dy2static bugs

* polish some code

* temp

* modify op test

* except cinn test

* modify bfp16

* modify pad grad

* add pad_grad dtype

* start cinn part

---------
Co-authored-by: Charles-hit <wanghao107@baidu.com>

f6dea800

23 Feb, 2023 · 3 commits

[phi decoupling] move generator implementation from fluid to phi (#50746) · 4e417409

Committed by Huang Jiyi on Feb 23, 2023

* move fluid generator to phi

* move fluid generator to phi

* update .gitignore

* fix bugs

* fix cannot find "glog/logging.h" in "generator.h"

* fix bugs

4e417409

Support 'complex promote' in yaml (#50611) · 91a3d159

Committed by HappyHeavyRain on Feb 23, 2023

* support 'complex promote' in yaml

* change the compplex_promote

* change 'kron' in math.py

* change 'kron' comment in python

* change kron comment in python

* change kron comment in python

91a3d159

kunlun support c_softmax_with_cross_entropy (#49934) · f43b5fe5

Committed by jameszhang on Feb 23, 2023

* kunlun support c_softmax_with_cross_entropy

* fix grad calc error

* replace mutable_data() and ShareDataWith()

* update xdnn

* update xpu toolchain to 20230215

* remove fluid from test file

f43b5fe5

22 Feb, 2023 · 3 commits

* remove broadcast (#50701) · 2fa91d71

Committed by TaoTao Li on Feb 22, 2023

2fa91d71

Fix some typos. (#50429) · 93b2bf4b

Committed by Shuangchi He on Feb 22, 2023

* Fix some typos.
Signed-off-by: Yulv-git <yulvchi@qq.com>

* pre-commit
Signed-off-by: Yulv-git <yulvchi@qq.com>

---------
Signed-off-by: Yulv-git <yulvchi@qq.com>

93b2bf4b

【Prim】Add gather vjp (#50305) · 4db8e5c7

Committed by Jiabin Yang on Feb 22, 2023

* tmp gather vjp

* support gather

* remove useless code

* fix compiling error

* fix ut

* add eager test

* add eager test

* add seed

* fix cpu error

* fix transpose op compat

* remove tensor index case

* fix prim_cinn

* fix ut

4db8e5c7

21 Feb, 2023 · 3 commits

Support bw invoke fw (#50260) · d8845735

Committed by HappyHeavyRain on Feb 21, 2023

* support bw invoke fw

* fix scale in static_backward.yaml

* fix the bug in tensorrt/convert

* move 'scale','sign' into ops.yaml

* add scale_grad of scale in op_compat.yaml

* change generated_static_op in CMakeLists.txt

d8845735

add c_reduce_sum/unstack/all_reduce_datatype for kunlun (#50606) · 397c9403

Committed by QingshuChen on Feb 21, 2023

397c9403

[phi decoupling] move sequence_padding from fluid to phi (#50639) · 5f443601

Committed by Huang Jiyi on Feb 21, 2023

* move sequence_padding to phi

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix buga

* fix bugs

* revert and update phi::XPUContext

5f443601

20 Feb, 2023 · 1 commit

[phi decoupling] move serialization from phi to fluid (#50608) · 6b3c48c1

Committed by Huang Jiyi on Feb 20, 2023

* move save_op to fluid

* fix namespace

* move_load_kernel

* fix kernel_register

* move serialization to fluid

* fix test

* fix bugs

6b3c48c1

17 Feb, 2023 · 3 commits

Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2

Committed by yuehuayingxueluo on Feb 17, 2023

* rename multi_tensor_adam to fused_adam

* fix some bugs

* fix CI coverage

* rename test_fused_adam.py

* fix some bug

* add test_fused_adam_op.py

* fix some bugs

* fix fused_adam_op.cc

* fix CI bugs

* fix CI bug

* fix CI bug

e6af9bd2

upgrade oneDNN to 2.7.3 (#46301) · f803b239

Committed by Sławomir Siwek on Feb 17, 2023

* change SHA

* update to oneDNN 2.7

* update to 2.7.1

* update to 2.7.2

* add supported hardsigmoid

* update to 2.7.3

* limit cpu threads for int8 test

* group activations

f803b239

[phi decoupling] move platform/transform to phi (#50498) · fe332794

Committed by Huang Jiyi on Feb 17, 2023

* move platform::transform to phi

* fix bugs

* move transform_test to phi

* fix cmake

* update namespace

* fix cmake

fe332794

16 Feb, 2023 · 4 commits

[XPU][Fleet] Support multi-card infer for xpu (#50490) · 517d8074

Committed by shentanyue on Feb 16, 2023

* support xpu multi-card infer

* add ut

* clean code

* clean code

* fix

* fix

* fix

* fix

517d8074

[Phi decouple] move layer_norm_kernel.cu.h to phi (#50506) · 8910bb4a

Committed by Huang Jiyi on Feb 16, 2023

* move layer_norm_kernel.cu.h to phi

* fix bugs

* fix namespace

* fix bugs

* fix CI-Windwos

* replace mutable_data

* fix bugs

* fix bugs

8910bb4a

Use StandaloneExecutor in FleetExecutor (#50239) · df207283

Committed by Ruibiao Chen on Feb 16, 2023

* Use StandaloneExecutor in FleetExecutor

* Update FLAGS

* Fix CI errors

* Update code

* Add force_root_scope_vars config

* Update code

* Fix CI errors

* Fix test_layer_new errors

df207283

[phi decoupling] remove variable.h in phi (#50407) · 905cefd4

Committed by Huang Jiyi on Feb 16, 2023

* move variable_utils from phi_api_utils to fluid

* fix coment

* update include

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* update

* update

* fix CI-Windows-OpenBLAS

* fix bugs

* fix bugs

* fix bugs

* update include

* move variable_utils to phi_utils

* fix namespace

905cefd4

15 Feb, 2023 · 5 commits

fix npu save_combine (#50496) · 3c14b38e

Committed by duanyanhui on Feb 15, 2023

3c14b38e

delete onednn kernel of feed (#50503) · 8decfb78

Committed by zyfncg on Feb 15, 2023

8decfb78

[PHI Decoupling]Remove Profiler header (Part2) (#50183) · 8fabca11

Committed by YuanRisheng on Feb 15, 2023

* move profiler

* add file

* fix mac compile bugs

* fix ci bugs

* fix mac bugs

* fix ci bugs

* fix compile bugs

* perfect code according comment

8fabca11

make FusedMultiTransformer supports variable-lengths. (#49560) · 53df50c7

Committed by lzy on Feb 15, 2023

* make FusedMultiTransformer supports variable-lengths.

* modify ffn2 when cuda_version >= 11.6 because of #49392.

* code style

* delete remove_padding

53df50c7

fix some protobuf update problems (#49875) · d84b918b

Committed by risemeup1 on Feb 15, 2023

* Improved prootbuf upgrades

* Improved prootbuf upgrades

* Improved prootbuf upgrades

* limit protobuf version>=3.20.0

d84b918b

14 Feb, 2023 · 1 commit

Decrease usage of GetVecSize for optimizing host computation efficiency (#50353) · 976606fe

Committed by limingshu on Feb 14, 2023

* first commit.

* a little changes

* add some changes for get vec_size efficiently

* fix bugs

---------
Co-authored-by: zhangbopd <1299246947@qq.com>

976606fe

13 Feb, 2023 · 2 commits

add xpu pool3d kernels (#50233) · 1281b612

Committed by ykkk2333 on Feb 13, 2023

* add xpu adagrad and where_grad kernels, test=kunlun

* add xpu pool3d kernels, test=kunlun

1281b612

Upgrade protobuf to 4.21.x (#49168) · 15d93394

Committed by risemeup1 on Feb 13, 2023

* upgrade protobuf to 3.19.0 in cmake

* recover protobuf python version

* fix distribute compile

* fix

* fix framework.data_feed_pb2

* fix macos ifdef

* fix lite

* test

* update protoc from 3.19.0 t0 3.20.0

* test

* debug

* test

* test

* debug

* debug

* debug

* debug

* test

* debug

* update protocol from 3.20.0 to 4.21.12

* modify graph_brpc_client.h

* modify graph_brpc_client.h

* test

* test

* test

* fix third_party cache problem on build ci

* updata proto

* test

* test

* test

* test

* test

* test

* fix coverage failed test

* try to fix test_exe_fleet_model_run

* fix cinn bug

* fix windows compile problem

* fix python/requirements

---------
Co-authored-by: pangyoki <pangyoki@126.com>

15d93394

12 Feb, 2023 · 1 commit

[prim] generate static prim api (#50315) · 82cf1fad

Committed by Xiaoxu Chen on Feb 12, 2023

82cf1fad

10 Feb, 2023 · 2 commits

Fix inferMefer in transpose2_grad (#50388) · 42a75145

Committed by Aurelius84 on Feb 10, 2023

* Fix inferMefer in transpose2_grad

* fix infershape

* fix unittest

42a75145

[XPU] add fc_xpu op&pass to optimize ernie model (#50277) · 945f918c

Committed by zhupengyang on Feb 10, 2023

945f918c

09 Feb, 2023 · 4 commits

[trt][inference]support int64 shapetensor as engine input (#50170) · 14a92c8c

Committed by Zhang Jun on Feb 09, 2023

* update

* support int64 shape tensor as engine input

* add inference_predictor ut

14a92c8c

[PHI decoupling] move strided_memcpy.h to phi (#50346) · 17318c1a

Committed by Huang Jiyi on Feb 09, 2023

* decouple strided_memcpy

* move strided_memcpy

* move strided_memcpy to phi

* fix namespace

* update

* fix gpu compile bugs

17318c1a

Add MultiTenosrAdam OP (#49220) · 10654c77

Committed by yuehuayingxueluo on Feb 09, 2023

* add multi_tenosr_adam

* update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py

* fix adam.py optimizer.py

* fix adamw.py

* fix test_multi_tensor_adam.py

* fix CI bug

* fix CI coverage

* fix ci bug

* fix betapow

* fix some bugs

* fix test_adamw_op.py

* fix CI coverage

* fix multi_tensor_adam_kernel.cc

* fix CI bug

* fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py

* fix code style

* update C++ parts

* remove python parts modification temporarily

* add C++ ut

* update betapow copy code logic

* fix ci ut

* fix windows ci

* fix coverage ci

* improve coverage rate

---------
Co-authored-by: sneaxiy <sneaxiy@126.com>

10654c77

[BugFix][ConditionalBlock] fix judgement about scope validation (#50086) · 61f9f136

Committed by kangguangli on Feb 09, 2023

* fix judgement about scope validation

* fix ci bug: same address is not enough for data consistency

* remove useless check

61f9f136

08 Feb, 2023 · 2 commits

fuse quantize+transpose and transpose+dequantize (#49509) · 197a4ffe

Committed by Paulina Gacek on Feb 08, 2023

* QuantTranpose pattern is being found by pass

* quant + transpose fuse

* code style changes

* UT written, reorder fixed

* Dequantize + transpose2 fuse  added

* pass name changed

* UT added & shift corrected

* got rid of redundancy

* review changes

* AsIntermediate corrected

* compat added

197a4ffe

Fused attention pass mp support (#50320) · e44ff495

Committed by Yuang Liu on Feb 08, 2023

e44ff495

PaddlePaddle / Paddle · last synced successfully about 1 year ago
