Commits · 4d97b25d1838ec89af4f4e156f9eb004fb314841 · PaddlePaddle / Paddle

06 Apr, 2023 4 commits

Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d

Committed by Sławomir Siwek on Apr 06, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modification

* restore matmul(v1) version 0

* merge newest develop

* remove dependency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all dependencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* merge code from other PR

* 2023

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* resolve conflicts

* codestyle

* simplify isgemmlinear

* 2023

* remove import

* reuse methods

* matmul_v2_mkldnn cleanup

* simplify ExecuteMatMulV1Grad

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetitive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* reduce number of modified files

* adjust ExecuteMatmul

* add scales for ut

* dates

* limit number of modified files

* fluid imports

* remove alpha

* codestyle

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

4d97b25d

mv PADDLE_WITH_ASCEND_CL (#52535) · 80dd1672
Committed by 张春乔 on Apr 06, 2023

80dd1672

[CINN] fix CINN graph symbolization topo sort fixed (#52556) · 2acc2b14
Committed by jiangcheng on Apr 06, 2023

2acc2b14

[oneDNN]disable interpolate operators by default (#52462) · 690767ed
Committed by Xinyu Chen on Apr 06, 2023

690767ed

04 Apr, 2023 5 commits

delete [-Wno-error=terminate], test=develop (#52490) · 15aa73df
Committed by Galaxy1458 on Apr 04, 2023
```
* delete [-Wno-error=terminate], test=develop

* remove GPUps[-Wterminate],test=develop
```
15aa73df
Autogen embedding static graph code (#52460) · 5b7c8f9e
Committed by lzydev on Apr 04, 2023
```
* autogen embedding

* deal

* fix bug in CompatMetaTensor::share_lod
```
5b7c8f9e

Improve new executor static build (#51149) · 5bac67d4

Committed by Ruibiao Chen on Apr 04, 2023

* Improve new executor static build

* Skip GC for static build

* Skip infershape for static build

* Handle read_op

* Add fused_attention to OpsWithFluidKernelNeedMoveToPhi

* Fix argsort typos

* Add sequence_pool to OpsWithFluidKernelNeedMoveToPhi

* Fix skip share lod errors

* Fix errors for adam

* Fix errors for eigvals, memcpy and fake_quantize

* Add static_build.cc

* Add black list

* Fix CI errors

* Fix CI errors

* Fix CI errors

* Fix TensorArray

* Fix TensorArray

* Add update_loss_scaling to OpsNeedSetOutputDtypeWhenRegisterPhiKernel

* Fix copy

* Fix errors

* Fix momentum

* Skip mkldnn

* Fix CI errors

* Fix c_sync_calc_stream_op

* Fix CINN

* Fix while op

* All CI pass, disable FLAGS to merge code, enable it after more tests in future

* Add UTs

* Fix typos

* Fix typos

* Add mkldnn UT

* Remove mkldnn test

* Fix typos

* Fix dist test

* Fix typos

* Fix CI errors

* Fix CI errors

* Add UTs

* Fix typos

* Fix typos

* Add sparse tests

* ToComplexType -> ToComplex

* Add test_matmul_op_static_build to disable_win_inference_test

5bac67d4

change skip-layernorm to adapt a new method (#52456) · 8a66d999
Committed by handiz on Apr 04, 2023
```
* change skip-layernorm to adapt a new method

* fix review problem and add vlog

* fix review problem
```
8a66d999

Fix inplace op dims not changed (#52416) · 8e7aa296
Committed by csy0225 on Apr 04, 2023

8e7aa296

03 Apr, 2023 3 commits
- [CustomOP Optional Inplace] Custom operator supports inplace optional vector Tensor input (#52421) · 59c9d75e
  Committed by HongyuJia on Apr 03, 2023
```
* [CustomOP Optional Inplace] Custom operator supports inplace optional vector Tensor input

* uncomment unittest codes
```
  59c9d75e
- remove WITH_ASCEND_CL PADDLE_WITH_ASCEND_CL WITH_ASCEND_CXX11 (#52448) · 0b60f28c
  Committed by engineer1109 on Apr 03, 2023
  
  0b60f28c
- [XPU]add conv_fuse pass && kernel (#52247) · eddf1ad6
  Committed by wz1qqx on Apr 03, 2023
  
  eddf1ad6
01 Apr, 2023 1 commit

Delete the /paddle/fluid/platform/device/npu directory (#52384) · 69436bf5

Committed by jjyaoao on Apr 01, 2023

* Delete the /paddle/fluid/platform/device/npu directory

* clear Cmakelists

* Try removing npu in the header file

69436bf5

31 Mar, 2023 4 commits
- fix bug in op_desc (#52396) · 07c7926f
  Committed by Leo Chen on Mar 31, 2023
  
  07c7926f
- [CustomOP Optional Inplace] Custom op supports inplace optional tensor (#52216) · fcd77346
  Committed by HongyuJia on Mar 31, 2023
```
* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete custom_inplace_setup.py

* [CustomOP Optional Inplace] Custom operator supports inplace optional Tensor input

* fix bug for vector<Tensor> inplace test
```
  fcd77346
- [PHI Decoupling]Remove distribute header (#52202) · e923642e
  Committed by YuanRisheng on Mar 31, 2023
```
* remove distribute

* fix py3 bugs

* fix gpu-ps bugs

* fix compile bugs

* fix unittest bugs
```
  e923642e
- [Paddle-TRT] fix skiplayernorm, add trt_version check (#52342) · 4e23af72
  Committed by Wangzheee on Mar 31, 2023
```
* fix skiplayernorm, add trt_version check
```
  4e23af72
30 Mar, 2023 7 commits
- [XPU] add delete_cast_op_pass (#52305) · 8b622d58
  Committed by zhupengyang on Mar 30, 2023
  
  8b622d58
- Speedup worker (#51760) · 8ca86d72
  Committed by pangengzheng on Mar 30, 2023
```
* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in evn_dict

* align infer auc with DistPsArch pre-stable

* async and multi thread data feed

* rewrite dense tensor initialization

* async infer shape and reuse memory
```
  8ca86d72
- register fluid kernels to phi [part 1] (#52014) · 93d01787
  Committed by huangjiyi on Mar 30, 2023
```
* update assign_pos

* update attention_lstm

* update barrier

* update batch_fc

* update beam_search

* update beam_search_decode

* update bilateral_slice

* fix bug

* Handle Structure kernel for InterpreterCore::RunOperator

* fix bug

* fix rocm compile

* fix rocm compile

* Revert "fix rocm compile"

* test

* revert test and update cmake

---------
Co-authored-by: chenruibiao <chenruibiao@baidu.com>
```
  93d01787
- [XPU] add delete_concat_op_pass (#52304) · 70ebef81
  Committed by zhupengyang on Mar 30, 2023
  
  70ebef81
- rename Scalar related utility functions(use CamelCase) (#52280) · e5a0dc31
  Committed by Feiyu Chan on Mar 30, 2023
  
  e5a0dc31
- Skip device transfer when arg-defs is set to Allbackend (#52294) · 54497c47
  Committed by Ruibiao Chen on Mar 30, 2023
  
  54497c47
- [BugFix]Fix segment fault in order setting (#52293) · d2cdc7e3
  Committed by ShenLiang on Mar 29, 2023
```
* fix bug in proto

* add utest
```
  d2cdc7e3
29 Mar, 2023 3 commits

Add output defines for graph_sample_neighbors and group_norm (#51503) · 37bd7e78

Committed by hjyp on Mar 29, 2023

* register output type for GraphSampleNeighbors and GroupNorm

* Update return type

* fix return type

* update

* fix detail

37bd7e78

[XPU] optimize pass (#52099) · 599388e3
Committed by zhupengyang on Mar 29, 2023

599388e3

Add Fuse Adamw Pass (#50484) · 66098bff

Committed by yuehuayingxueluo on Mar 29, 2023

* add fuse adamw pass

* fix some bugs

* fix CI bug

* change chunk_size

* fix CI bug

* rm test_fused_adam_op.py

* fix CI bugs

* fix fuse_adamw_op_pass.cc

* change code style

* fix CI bug

* fix ut bug and use_adamw_op_pass.cc

* fix test_fuse_adamw_pass.py

* fix CI bug

* remove fluid

* fix ci bug

* fix CI bug

66098bff

28 Mar, 2023 2 commits

Add basic functionalities to support Scalar & Scalars in op attr (#51984) · 2e9fd5e4

Committed by Feiyu Chan on Mar 28, 2023

Add basic functionalities to support Scalar & Scalars in operator attribute.

1. extend allowed types in operator's attribute type, add `paddle::experimental::Scalar`, add corresponding protobuf Message types;
2. Scalar enhancement, add formatting, equality;
3. add code to handle Scalar & Scalars in opmaker, conversion from paddle operator to phi kernel, opdesc construction and manipulation, tensorrt converter, tracer, operator construction, etc;
4. bind `paddle::experimental::Scalar` to python, as `libpaddle.Scalar`;
5. add functionality to canonicalize attribute map according to OpProto(if the op the attribute map used for has an OpProto);
6. add code to manipulate the Scalar proto message via the protobuf Python API.

Add unittests.

1. add test cases for formatting, equality for Scalars, and WrapAsScalars;
2. add test cases for 'casting' between different morphs of attributes;
3. add test cases for extracting scalar & scalars from attribute;
4. add test cases for CanonicalizeScalarAttrs(and fix a bug in type index offset);
5. fix gmock's library filename on Windows platform;
6. clean code: use canonicalize_attrs instead of inlining the function;
7. add test cases for libpaddle.Scalar in Python code;
8. add test cases for `make_scalar_proto`, which manipulates the proto message `Scalar` via the protobuf Python API.

2e9fd5e4

[XPU] fix bug of AnalyseOpFuncType about xpu op : memcpy_d2d of xpu is actually async (#52042) · 93d20c44
Committed by ZhouMengLei1999 on Mar 28, 2023

93d20c44

27 Mar, 2023 5 commits

[PHI]Support register functor kernel into PHI (#51914) · bcea3b89
Committed by YuanRisheng on Mar 27, 2023
```
* perfect structure kernel registry

* fix ci bugs
```
bcea3b89

[NewExe]Adjust ExecutorCache Capacity from 4 into 10 (#52104) · 897fb6ab
Committed by Aurelius84 on Mar 27, 2023

897fb6ab

[CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output (#52114) · 04025237

Committed by HongyuJia on Mar 27, 2023

* [CustomOP Inplace] Automap inplace dtype and shape, prepare for vector<Tensor> output

* delete dtype,shape func of multi_inplace op

* [CustomOP Inplace] Automap inplace dtype and shape, support vector<Tensor> output

04025237

fix_gcc12_error (#52083) · f7267412
Committed by risemeup1 on Mar 27, 2023
```
* fix_gcc12_error

* fix gcc12 error

* fix gcc12 error
```
f7267412

Fused elementwise_(mul/div) (#50428) · 968f7f24

Committed by Sławomir Siwek on Mar 27, 2023

* extract Op and OPMaker to .h

* extend pattern for fused_op

* set "with_residual" default to false

* adjust fuse passes

* remove fc+eltwise flag

* fused_output_scale

* activation attrs

* remove extra attrs

* fix int8/bf16 unit tests

* simplify RecomputeOutputDims

* remove unused method

* Add description for attributes

* add extra check

* adjust op compats

* update quantize test

* fix protobuf parsing error

* fix int8 performance

* fused elementwises

* merge develop

* remove activation

* restore activation for existing add/sub ops

968f7f24

23 Mar, 2023 5 commits
- [CustomOP Optional] CustomOP supports optional vector<Tensor> input (#51973) · 6a10e604
  Committed by HongyuJia on Mar 23, 2023
  
  6a10e604
- add output defs for clip_by_norm kernel (#51993) · 33897a95
  Committed by iSerendipity on Mar 23, 2023
  
  33897a95
- Remove fluid deps in fused_linear_param_grad_add_kernel.cu (#51975) · 5da1a27b
  Committed by sneaxiy on Mar 23, 2023
```
* remove fluid deps in fused_linear_param_grad_add_kernel

* fix compile error

* fix ut error

* follow comments
```
  5da1a27b
- register fluid activation kernel to phi (#51927) · aaa14780
  Committed by Huang Jiyi on Mar 23, 2023
```
* update

* update

* update

* update

* update

* fix test
```
  aaa14780
- [PHI] Add nanmedian output defs (#51358) · a82911a5
  Committed by PuQing on Mar 23, 2023
```
* add nanmedian output defs

* remove the multiclass_nms3 momentum
```
  a82911a5
22 Mar, 2023 1 commit

[Zero-Dim] Support 0-D tensor for some oneDNN unary kernels (#51687) · 2a3d75bc

Committed by YangQun on Mar 22, 2023

* support 0-d tensor for element wise unary ops

* fix python code style check

* fix approval check

* support 0-d tensor for onednn softmax and logsoftmax kernels

* fix comments

* fix some unittests

2a3d75bc

PaddlePaddle / Paddle synced successfully more than 1 year ago