Commits · 6d396ace08f03a11911ac19b0d167b80b7a3053d · PaddlePaddle / Paddle

08 May, 2023 2 commits
- 张
  
  rm npu (#53566) · 6d396ace
  Committed by 张春乔 on May 08, 2023
  
  6d396ace
- W
  
  [XPU] Optimize fp16 xpu models (#53523) · 0a59825e
  Committed by wz1qqx on May 08, 2023
  
  0a59825e
05 May, 2023 2 commits
- S
  
  [XPU] Fusion of gather and assign operators to fused_mt op for reducing memory usage (#53262) · 2039115c
  Committed by shentanyue on May 05, 2023
  
  2039115c
- S
  
  [XPU] Fix the out_max of the branch in xpu_conv2d op(#53343) · d27f15ed
  Committed by sprouteer on May 05, 2023
  
  d27f15ed
04 May, 2023 1 commit
- W
  
  Fix a bug in constant folding pass (#53456) · ace61b8b
  Committed by weishengying on May 04, 2023
  
  ace61b8b
27 April, 2023 1 commit
- Z
  
  xpu quant weight only (#53306) · 1c97aa69
  Committed by zhupengyang on April 27, 2023
  
  1c97aa69
25 April, 2023 2 commits
- S
  
  [XPU][BUG] Fix link_xpu_op_max_pass bug (#53258) · be1b3fc3
  Committed by sprouteer on April 25, 2023
  
  be1b3fc3
- Y
  [PHI]Add flags macro for PHI (#52991) · 22e96bde
  Committed by YuanRisheng on April 25, 2023
```
* add flags for phi

* fix compile bugs

* fix ci bugs

* fix inference bugs

* fix cinn' bugs

* fix cinn bugs

* perfect code according comment

* fix ci bugs

* fix ci bugs
```
  22e96bde
24 April, 2023 2 commits
- Z
  
  transform cachekv datalayout of fused_multi_transformer_xpu (#53144) · bfa5d6b8
  Committed by zhupengyang on April 24, 2023
  
  bfa5d6b8
- G
  remove some [-Wunused-parameter] (#53185) · 834eb2ba
  Committed by Galaxy1458 on April 24, 2023
```
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test ,test=develop
```
  834eb2ba
23 April, 2023 1 commit

remove some [-Wunused-parameter] (#53162) · b02687cc

Committed by Galaxy1458 on April 23, 2023

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

b02687cc

21 April, 2023 1 commit
- Z
  
  optimize write_read_array, gather if beam_size=1 (#53130) · e8e9d6c5
  Committed by zhupengyang on April 21, 2023
  
  e8e9d6c5
19 April, 2023 1 commit
- C
  
  fix_delete_repeated_ops_pass bug (#53042) · b64b8163
  Committed by csy0225 on April 19, 2023
  
  b64b8163
17 April, 2023 1 commit

[Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a

Committed by zhoutianzi666 on April 17, 2023

* initial commit for cutlass_teller

* second commit for cutlass_teller

* add conv2d_depthwise python template

* add conv2d_depthwise cutlass template

* /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h

* refine code in Conv2dFusionCanSupport

* add macro in cutlass_teller.h

* add 3x3 5x5 teller

* add groups not 1 or conv2d_depthwise teller

* only generate conv2d_depthwise kernels whose ic is a multiple of 8

* add EXPLICIT in cutlass_teller.h

* final commit

* add split_k_slices in conv2d_depthwise

* make stages == 2

* refactor part of the code

* add CutlassFusionType

* solve illegal memory

* make stride_h=stride_w && make dilation==1

* must check HasAttr(use_cutlass) before GetAttrIfExists

* add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String

* modify decl.h and util.cu

bd3b096a

14 April, 2023 1 commit
- Z
  
  delete cast if lookup_table_v2 support fp16; delete repeated ops (#52888) · 7aafeb45
  Committed by zhupengyang on April 14, 2023
  
  7aafeb45
13 April, 2023 5 commits

[Paddle-Trt] Replace fc mul matmul matmul_v2 with matrix_multiply (#52222) · ef734e84
Committed by Wangzheee on April 13, 2023
```
* Paddle-Trt: Replace fc mul matmul matmul_v2 with matrix_multiply
```
ef734e84

Fix delete_isolated_node_pass problem (#52856) · 0f2dc4ca
Committed by csy0225 on April 13, 2023

0f2dc4ca

[enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26

Committed by HongyuJia on April 13, 2023

* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h

* Add logging.h for profiler.cc

* Add logging.h for gloo_utils.h

* Add logging.h for addmm_kernel_impl.h

* Add logging.h for addmm_grad_kernel_impl.h

* Add logging.h for p_send_kernel.cu

* Add logging.h for determinant_grad_kernel_impl.h

* Add logging.h for p_recv_kernel.cu

* Add logging.h for elementwise_grad_base.h

* Add logging.h for transfer_layout_kernel.cc

* Add logging.h for eigvals_kernel.cc and index_select_impl.h

* Add logging.h for all files in kernel directory

* Add logging.h for xpu_info.cc

* Add logging.h for xpu

5664ea26


delete useless cast, elementwise_mul (#52831) · 0695fb88
Committed by zhupengyang on April 13, 2023

0695fb88

[XPU] Fix instance_norm, conv2d_xpu, inplace optimizer bugs. (#52627) · fa8abeec
Committed by csy0225 on April 13, 2023

fa8abeec

12 April, 2023 1 commit
- Y
  
  move delete_cast_op_pass (#52788) · d12b1ffa
  Committed by Yuanle Liu on April 12, 2023
  
  d12b1ffa
11 April, 2023 1 commit
- W
  
  [XPU] fix error pattern and rename max name (#52726) · 259b0aad
  Committed by wz1qqx on April 11, 2023
  
  259b0aad
10 April, 2023 1 commit
- X
  [Paddle Inference] Support two inputs of multihead attention named qk_multihead. (#52455) · 6934ac79
  Committed by xiaoxiaohehe001 on April 10, 2023
```
* Support two inputs of multihead attention named qk_multihead
```
  6934ac79
06 April, 2023 3 commits

Committed by huangjiyi on April 06, 2023

* update

* fix compile bug

* fix bug

* fix bug

* revert crop_op

* fix xpu compile

* fix cinn compile

* fix bug

* fix bug

* fix bug

* fix bug

* update

* update

* update

058ca61d

Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d

Committed by Sławomir Siwek on April 06, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modifciation

* restore matmul(v1) version 0

* merge newest develop

* remove depedency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all depedencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* merge code from other PR

* 2023

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* resolve conflicts

* codestyle

* simplify isgemmlinear

* 2023

* remove import

* reuse methods

* matmul_v2_mkldnn cleanup

* simplify ExecuteMatMulV1Grad

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetetive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* reduce numer of modified files

* adjust ExecuteMatmul

* add scales for ut

* dates

* limit number of modified files

* fluid imports

* remove alpha

* codestyle

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

4d97b25d


[oneDNN]disable interpolate operators by default (#52462) · 690767ed
Committed by Xinyu Chen on April 06, 2023

690767ed

04 April, 2023 1 commit
- H
  change skip-layernorm to adapt a new method (#52456) · 8a66d999
  Committed by handiz on April 04, 2023
```
* change skip-layernorm to adapt a new method

* fix review problem and add vlog

* fix review problem
```
  8a66d999
03 April, 2023 1 commit
- W
  
  [XPU]add conv_fuse pass && kernel (#52247) · eddf1ad6
  Committed by wz1qqx on April 03, 2023
  
  eddf1ad6
31 March, 2023 2 commits
- Y
  [PHI Decoupling]Remove distribute header (#52202) · e923642e
  Committed by YuanRisheng on March 31, 2023
```
* remove distribute

* fix py3 bugs

* fix gpu-ps bugs

* fix compile bugs

* fix unittest bugs
```
  e923642e
- W
  [Paddle-TRT] fix skiplayernorm, add trt_version check (#52342) · 4e23af72
  Committed by Wangzheee on March 31, 2023
```
* fix skiplayernorm, add trt_version check
```
  4e23af72
30 March, 2023 2 commits
- Z
  
  [XPU] add delete_cast_op_pass (#52305) · 8b622d58
  Committed by zhupengyang on March 30, 2023
  
  8b622d58
- Z
  
  [XPU] add delete_concat_op_pass (#52304) · 70ebef81
  Committed by zhupengyang on March 30, 2023
  
  70ebef81
29 March, 2023 2 commits


[XPU] optimize pass (#52099) · 599388e3
Committed by zhupengyang on March 29, 2023

599388e3

Add Fuse Adamw Pass (#50484) · 66098bff

Committed by yuehuayingxueluo on March 29, 2023

* add fuse adamw pass

* fix some bugs

* fix CIbug

* change chunk_size

* fix CI bug

* rm test_fused_adam_op.py

* fix CI bugs

* fix fuse_adamw_op_pass.cc

* change code style

* fix CI bug

* fix ut bug and use_adamw_op_pass.cc

* fix test_fuse_adamw_pass.py

* fix CI bug

* remove fluid

* fix ci bug

* fix CI bug

66098bff

27 March, 2023 1 commit

Fused elementwise_(mul/div) (#50428) · 968f7f24

Committed by Sławomir Siwek on March 27, 2023

* extract Op and OPMaker to .h

* extend pattern for fused_op

* set "with_residual" default to false

* adjust fuse passes

* remove fc+eltwise flag

* fused_output_scale

* activation attrs

* remove extra attrs

* fix int8/bf16 unit tests

* simplify RecomputeOutputDims

* remove unused method

* Add description for attributes

* add extra check

* adjust op compats

* update quantize test

* fix protobuf parsing error

* fix int8 performance

* fused elementwises

* merge develop

* remove activation

* restore activation for existing add/sub ops

968f7f24

22 March, 2023 5 commits


Correct lstm qat test (#51499) · 31f81685
Committed by joanna.wozna.intel on March 22, 2023

31f81685

Add fused_feed_forward pass (#50423) · 5dda0ef6

Committed by Ghost Screaming on March 22, 2023

* Add fused_feed_forward pass for semi-automatic static graph training.

* Add fused_feedforward property in parallel_executor.cc

* Polish code.

* Polish fused feed_forward pass code. Support use_dropout1 and
use_dropout2 option.

* Support model parallel in fused_feedforward pass.

5dda0ef6

Extract fused_transpose op dedicated for oneDNN fuse passes (#50021) · 02296977

Committed by Sławomir Siwek on March 22, 2023

* extract common methods to reuse

* add header for transpose ops

* fused_transpose

* Split big function

* transpose2 tests

* fused_transpose

* Apply extra attributes

* add pbtxt file

* update pbtxt

* Merge develop

* add more strict op compats

* code  style

* remove mkldnn_data_type

* unify SetOutMemDescWithReshape2FuseSupport

* adjust quantize-dequantize for transpose

* remove appendact

* transpose2 quantization

* fix int8 tests

* adjust transpose_op to current develop

* delete fusion code from transpose_kernel

* add fused transpose to NHWC unittest

* change order

02296977


[XPU] optimize graph if beam_size=1 (#51732) · 720b14e3
Committed by zhupengyang on March 22, 2023

720b14e3

remove duplicate mkldnn_data_type (#51598) · 80472116
Committed by Sylwester Fraczek on March 21, 2023

80472116

PaddlePaddle / Paddle synced successfully about 1 year ago
