Commits · 1eb30775cc42235d07f9cc73508a6e50d78d5d29 · PaddlePaddle / Paddle

18 Apr, 2023 · 1 commit
- remove mlu(#53007) · 4d5a3ad6
  Committed by 张春乔 on Apr 18, 2023
  
  4d5a3ad6
17 Apr, 2023 · 4 commits

[Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a

Committed by zhoutianzi666 on Apr 17, 2023

* initial commit for cutlass_teller

* second commit for cutlass_teller

* add conv2d_depthwise python template

* add conv2d_depthwise cutlass template

* /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h

* refine code in Conv2dFusionCanSupport

* add macro in cutlass_teller.h

* add 3x3 5x5 teller

* add groups not 1 or conv2d_depthwise teller

* only generate conv2d_depthwise kernels whose ic is a multiple of 8

* add EXPLICIT in cutlass_teller.h

* final commit

* add split_k_slices in conv2d_depthwise

* make stages == 2

* refactor some code

* add CutlassFusionType

* solve illegal memory

* make stride_h=stride_w && make dilation==1

* must check HasAttr(use_cutlass) before GetAttrIfExists

* add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String

* modify decl.h and util.cu

bd3b096a

remove some [-Wunused-paramter] warning (#52924) · 337cc2ca
Committed by Galaxy1458 on Apr 17, 2023

337cc2ca

Add output defs for some kernelsPhi register (#52941) · 23f87442

Committed by Sonder on Apr 17, 2023

* add register info for eigh and eig_gard

* add sync_batch_norm_op.cu register info

* add lamb output register info

* add unique register info

* change type name

* change type name

* add output register info for check_finite_and_unscale

* update cmake and config file

* add register info for adagrad

* fix build error

* add sync to run_unittests.sh

* add register info for unique_consecutive

* fix build error

* add eigh to STATIC_BUILD_TESTS

* update eig_kernel.cc

* update eig_kernel.cc

* fix infer mate error

* fix unique register error

* fix lamb register info error

* fix lamb register info

* update lamb register info

* fix lamb

* remove one Output Register

* update static build file

* add eigh op to disable_wingpu_test

* update run_unittests

23f87442

[Dygraph] Support delaying div loss by accumulate_steps in PipelineLayer (#52848) · 0abdcff6
Committed by Haohongxiang on Apr 17, 2023

0abdcff6

14 Apr, 2023 · 4 commits

delete SupportNPU(), SupportMLU() (#52911) · 8601859e
Committed by jjyaoao on Apr 14, 2023
```
* delete SupportNPU(), SupportMLU()

* delete npu branch
```
8601859e

1. modify set_value op, use Scalars to represent attr `values`, instead of a... · dd2a749a

Committed by Feiyu Chan on Apr 14, 2023

1. modify set_value op to use Scalars to represent attr `values`, instead of a bunch of attributes of various types; (#52408)

2. add a program converter, with set_value op as an example, which provides the functionality to convert `paddle::framework::ProgramDesc` between old and new formats (the differences are mainly some operators with incompatible updates in their definitions);
3. the program version and operator version map are now always saved when serializing `paddle::framework::ProgramDesc`, to identify the version;
4. provide an option `legacy_format=false` in serialization of `paddle::framework::ProgramDesc`, which decides whether to convert the ProgramDesc back to a legacy format that paddle 2.4.2 or earlier versions can load and execute;
5. deserialization of `paddle::framework::ProgramDesc` now automatically detects whether the bytes it receives are in legacy format (contains any of the operators that have been incompatibly updated and have any attribute of type `Scalar`) and converts them to the new format. If you want a faithful deserialization without the automatic conversion, you can use protobuf's deserialization instead. Though it is not recommended, it can be used for testing.
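
The conversion behavior described in this commit message can be pictured with a toy model. The sketch below is a minimal, hypothetical illustration in plain Python: programs are dicts rather than real `paddle::framework::ProgramDesc` objects, and `is_legacy`, `to_new_format`, `to_legacy_format`, `load`, `save`, and the typed attribute names are all assumptions made for illustration, not Paddle's API.

```python
import copy

# Hypothetical typed attribute names for the legacy op layout in this sketch.
LEGACY_VALUE_ATTRS = ("bool_values", "int64_values", "fp32_values")


def is_legacy(program):
    """Return True if any set_value op still carries non-empty typed value attributes."""
    return any(
        op["type"] == "set_value"
        and any(op["attrs"].get(name) for name in LEGACY_VALUE_ATTRS)
        for op in program["ops"]
    )


def to_new_format(program):
    """Merge the typed value attributes of each set_value op into one `values` list."""
    converted = copy.deepcopy(program)
    for op in converted["ops"]:
        if op["type"] != "set_value":
            continue
        values = list(op["attrs"].get("values", []))
        for name in LEGACY_VALUE_ATTRS:
            values.extend(op["attrs"].pop(name, []))
        op["attrs"]["values"] = values
    return converted


def to_legacy_format(program):
    """Split `values` back into typed attributes, chosen by Python type (an assumption)."""
    converted = copy.deepcopy(program)
    for op in converted["ops"]:
        if op["type"] != "set_value":
            continue
        for value in op["attrs"].pop("values", []):
            if isinstance(value, bool):  # check bool first: bool is a subclass of int
                name = "bool_values"
            elif isinstance(value, int):
                name = "int64_values"
            else:
                name = "fp32_values"
            op["attrs"].setdefault(name, []).append(value)
    return converted


def load(program):
    """Deserialize with automatic legacy detection and conversion."""
    return to_new_format(program) if is_legacy(program) else copy.deepcopy(program)


def save(program, legacy_format=False):
    """Serialize, optionally converting back to the legacy layout."""
    return to_legacy_format(program) if legacy_format else copy.deepcopy(program)
```

In this toy model, `load` always hands back the new layout, while `save(program, legacy_format=True)` mirrors the opt-in conversion back to a layout that older readers could consume. The real converter works on serialized protobuf programs and operator version maps, which this sketch does not attempt to model.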

dd2a749a

delete cast if lookup_table_v2 support fp16; delete repeated ops (#52888) · 7aafeb45
Committed by zhupengyang on Apr 14, 2023

7aafeb45
rem cncl (#52434) · 25bd5ed8
Committed by Kim Yann on Apr 14, 2023

25bd5ed8

13 Apr, 2023 · 7 commits
- [Paddle-Trt] Replace fc mul matmul matmul_v2 with matrix_multiply (#52222) · ef734e84
  Committed by Wangzheee on Apr 13, 2023
```
* Paddle-Trt: Replace fc mul matmul matmul_v2 with matrix_multiply
```
  ef734e84
- Fix delete_isolated_node_pass problem (#52856) · 0f2dc4ca
  Committed by csy0225 on Apr 13, 2023
  
  0f2dc4ca
- [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26
  Committed by HongyuJia on Apr 13, 2023
```
* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h

* Add logging.h for profiler.cc

* Add logging.h for gloo_utils.h

* Add logging.h for addmm_kernel_impl.h

* Add logging.h for addmm_grad_kernel_impl.h

* Add logging.h for p_send_kernel.cu

* Add logging.h for determinant_grad_kernel_impl.h

* Add logging.h for p_recv_kernel.cu

* Add logging.h for elementwise_grad_base.h

* Add logging.h for transfer_layout_kernel.cc

* Add logging.h for eigvals_kernel.cc and index_select_impl.h

* Add logging.h for all files in kernel directory

* Add logging.h for xpu_info.cc

* Add logging.h for xpu
```
  5664ea26
- delete useless cast, elementwise_mul (#52831) · 0695fb88
  Committed by zhupengyang on Apr 13, 2023
  
  0695fb88
- fix distributed comm context (#52787) · 1acb845a
  Committed by TaoTao Li on Apr 13, 2023
  
  1acb845a
- Support print stack when place=CUDAPlace (#52841) · e7652a37
  Committed by niuliling123 on Apr 13, 2023
  
  e7652a37
- [XPU] Fix instance_norm, conv2d_xpu, inplace optimizer bugs. (#52627) · fa8abeec
  Committed by csy0225 on Apr 13, 2023
  
  fa8abeec
12 Apr, 2023 · 2 commits
- [CINN] add cinn sub-graph save into graphviz flag (#52766) · 8d7c15a7
  Committed by jiangcheng on Apr 12, 2023
  
  8d7c15a7
- move delete_cast_op_pass (#52788) · d12b1ffa
  Committed by Yuanle Liu on Apr 12, 2023
  
  d12b1ffa
11 Apr, 2023 · 4 commits
- [Paddle Inference] Predictor support paddle::Tensor (#50445) · 10fd4a95
  Committed by Yuanle Liu on Apr 11, 2023
  
  10fd4a95
- [XPU] fix error pattern and rename max name (#52726) · 259b0aad
  Committed by wz1qqx on Apr 11, 2023
  
  259b0aad
- [AMP OP&Test]Add fp16/bf16 support isnan/isfinite/isinf op (#52259) · aaf873b2
  Committed by WJJ1995 on Apr 11, 2023
```
* add bfp16 test for isfinite

* fixed for ci

* deal with comments

* fixed test

* skip test in cpu

* deal with comments

* fixed for ci

* fixed testcase

* fixed for ci

* fixed for testcase
```
  aaf873b2
- mp sync params & grads & opt states. (#51428) · 6b74cf76
  Committed by wuhuachaocoding on Apr 11, 2023
  
  6b74cf76
10 Apr, 2023 · 3 commits

[Paddle Inference] Support two inputs of multihead attention named qk_multihead. (#52455) · 6934ac79
Committed by xiaoxiaohehe001 on Apr 10, 2023
```
* Support two inputs of multihead attention named qk_multihead
```
6934ac79

[Opt Performance] Optimize custom operator performance (#52597) · 01247e33

Committed by HongyuJia on Apr 10, 2023

* [Opt Performance] Optimize custom operator performance, reconstruct python API auto-gen, add cache and use const inference

* opt AutoGradMeta implementation

* remove profiler codes

* fix unit test

* change year, 2021->2023

* fix int64_t parse bug

01247e33

[StandaloneExe] Remove flag about Executor (#52671) · d6ee0a13

Committed by kangguangli on Apr 10, 2023

* add strategy force_sequential_run

* remove flag

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

d6ee0a13

08 Apr, 2023 · 1 commit
- [StandaloneExe] add strategy force_sequential_run (#52652) · e1692dc7
  Committed by kangguangli on Apr 08, 2023
```
* add strategy force_sequential_run

* fix

* fix

* fix

* fix

* fix
```
  e1692dc7
07 Apr, 2023 · 2 commits
- fix_build_ci_error (#52576) · 8630375c
  Committed by risemeup1 on Apr 07, 2023
```
* fix_build_ci_error

* fix_build_ci_error

* fix_build_ci_error
```
  8630375c
- clean up WITH_MLU (#52546) · e75c01f9
  Committed by Wang Xin on Apr 07, 2023
  
  e75c01f9
06 Apr, 2023 · 6 commits

[StandaloneExe] improving sequentialRun mode of standaloneExecutor (#52111) · 14fe4b54

Committed by kangguangli on Apr 06, 2023

* Verify SequentialRun Model of StandaloneExecutor

* fix

* fix

* fix

* remove redundant code

* fix CI

* fix CI

* recover multi-step dependency

14fe4b54

Committed by huangjiyi on Apr 06, 2023

* update

* fix compile bug

* fix bug

* fix bug

* revert crop_op

* fix xpu compile

* fix cinn compile

* fix bug

* fix bug

* fix bug

* fix bug

* update

* update

* update

058ca61d

Remove oneDNN-specific attributes from matmul (#49444) · 4d97b25d

Committed by Sławomir Siwek on Apr 06, 2023

* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces

* clean attrs in python tests

* delete checkpoint and restore matmul version

* remove unused code

* matmul and reshape/transpose fuses migrated

* split MatmulOneDNN headers

* fuse activation and eltwise_add

* add fuse_activation

* matmul_transpose_reshape/reshape_transpose_matmul

* matmul + elementwise_add (fused)

* activation temporary modifciation

* restore matmul(v1) version 0

* merge newest develop

* remove depedency from other PR

* revert pbtxt

* remove placeholders from matmul_v2

* add description in OPMaker

* remove matmul_v2_op.h and all depedencies

* remove dims changing in base op

* add possibility to fuse already fused_matmul

* restart broken CI

* Empty-Commit

* revert matmul_utils.h

* codestyle

* adjust imports

* add pbtxt file

* 100% matmul unit tests coverage

* trigger CI with minimal changes to develop

* adjust changes to develop

* add fused_matmul op

* inherit base ops

* add "v2"

* move OPMaker

* Gradually add fused_matmul files

* second batch of fused_matmul changes

* split infershapes of matmul_v2 and fused_matmul

* merge code from other PR

* 2023

* inherit fused_matmul from matmul_v2

* Update paddle/phi/backends/onednn/onednn_reuse.h
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* Update paddle/phi/kernels/fusion/onednn/fused_matmul_kernel.cc
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* resolve conflicts

* codestyle

* simplify isgemmlinear

* 2023

* remove import

* reuse methods

* matmul_v2_mkldnn cleanup

* simplify ExecuteMatMulV1Grad

* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetetive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* reduce numer of modified files

* adjust ExecuteMatmul

* add scales for ut

* dates

* limit number of modified files

* fluid imports

* remove alpha

* codestyle

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

4d97b25d

mv PADDLE_WITH_ASCEND_CL (#52535) · 80dd1672
Committed by 张春乔 on Apr 06, 2023

80dd1672
[CINN] fix CINN graph symbolization topo sort fixed (#52556) · 2acc2b14
Committed by jiangcheng on Apr 06, 2023

2acc2b14
[oneDNN]disable interpolate operators by default (#52462) · 690767ed
Committed by Xinyu Chen on Apr 06, 2023

690767ed

04 Apr, 2023 · 5 commits

delete [-Wno-error=terminate], test=develop (#52490) · 15aa73df
Committed by Galaxy1458 on Apr 04, 2023
```
* delete [-Wno-error=terminate], test=develop

* remove GPUps[-Wterminate],test=develop
```
15aa73df
Autogen embedding static graph code (#52460) · 5b7c8f9e
Committed by lzydev on Apr 04, 2023
```
* autogen embedding

* deal

* fix bug in CompatMetaTensor::share_lod
```
5b7c8f9e

Improve new executor static build (#51149) · 5bac67d4

Committed by Ruibiao Chen on Apr 04, 2023

* Improve new executor static build

* Skip GC for static build

* Skip infershape for static build

* Handle read_op

* Add fused_attention to OpsWithFluidKernelNeedMoveToPhi

* Fix argsort typos

* Add sequence_pool to OpsWithFluidKernelNeedMoveToPhi

* Fix skip share lod errors

* Fix errors for adam

* Fix errors for eigvals, memcpy and fake_quantize

* Add static_build.cc

* Add black list

* Fix CI errors

* Fix CI errors

* Fix CI errors

* Fix TensorArray

* Fix TensorArray

* Add update_loss_scaling to OpsNeedSetOutputDtypeWhenRegisterPhiKernel

* Fix copy

* Fix errors

* Fix momentum

* Skip mkldnn

* Fix CI errors

* Fix c_sync_calc_stream_op

* Fix CINN

* Fix while op

* All CI pass, disable FLAGS to merge code, enable it after more tests in future

* Add UTs

* Fix typos

* Fix typos

* Add mkldnn UT

* Remove mkldnn test

* Fix typos

* Fix dist test

* Fix typos

* Fix CI errors

* Fix CI errors

* Add UTs

* Fix typos

* Fix typos

* Add sparse tests

* ToComplexType -> ToComplex

* Add test_matmul_op_static_build to disable_win_inference_test

5bac67d4

change skip-layernorm to adapt a new method (#52456) · 8a66d999
Committed by handiz on Apr 04, 2023
```
* change skip-layernorm to adapt a new method

* fix review problem and add vlog

* fix review problem
```
8a66d999
Fix inplace op dims not changed (#52416) · 8e7aa296
Committed by csy0225 on Apr 04, 2023

8e7aa296

03 Apr, 2023 · 1 commit
- [CustomOP Optional Inplace] Custom operator supports inplace optional vector Tensor input (#52421) · 59c9d75e
  Committed by HongyuJia on Apr 03, 2023
```
* [CustomOP Optional Inplace] Custom operator supports inplace optional vector Tensor input

* uncomment unittest codes
```
  59c9d75e

PaddlePaddle / Paddle synced successfully about 1 year ago
