Commits · e337d2807a6da9ba70f7a56b334aae781066215e · PaddlePaddle / Paddle

30 Nov, 2022 5 commits
- Z
  Fix the name map of operator from Phi to fluid (#48496) · e337d280
  Committed by zyfncg on Nov 30, 2022
```
* rename some kernel name

* fix compile problem
```
  e337d280
- Z
  Fix bug of wrong eigen dependency (#48485) · 35902ec6
  Committed by zyfncg on Nov 30, 2022
```
* fix bug of eigen_dependency

* fix xpu compile
```
  35902ec6
- R
  Add int8 support in fused_multi_transformer_pass and fuse_multi_transformer_layer_pass (#48209) · 12486712
  Committed by RichardWooSJTU on Nov 30, 2022
```
* delete unnecessary shape and slice op
Co-authored-by: Your Name <you@example.com>
```
  12486712
- J
  use correct xpu stream for synchronization (#48470) · 16562a9d
  Committed by james on Nov 30, 2022
```
Some legacy code still uses xpu_wait() for stream sync, but it only syncs
the default stream. This PR replaces those calls with dev_ctx.Wait() to
ensure that the correct stream is always used.
```
  16562a9d
- Z
  
  optimize for argsort with xpu, test=kunlun (#48440) · 7bf7e6e0
  Committed by zhangyikun02 on Nov 30, 2022
  
  7bf7e6e0
29 Nov, 2022 19 commits

Committed by lzy on Nov 29, 2022

* fix mma_tensorcore (__CUDA_ARCH__)

* disable tensorcore by default.

Disable tensorcore by default, because the `__CUDA_ARCH__` check causes undefined behavior in some environments. It can be enabled manually on a machine that supports tensorcore.

bf4d1792

rename use_cudnn to use_gpudnn in phi (#48443) · 41f15537
Committed by HongyuJia on Nov 29, 2022

41f15537

[PHI] transpose2 kernel migration (#47748) · d86aa4ca

Committed by Paulina Gacek on Nov 29, 2022

* transpose2 kernel migrated

* Got rid of mutable_data

* x modification added

* ops added in extra info file

* Formatting fix

* 2 fuse passes with transpose2 commented

* nr of outs changed in 2 passes, passes uncommented

* Changes in passes reverted

* transpose changed in operator.cc

* MKLDNN check in operator.cc

* Transpose fixes

* Fix deleted from operato

* template corrected
Co-authored-by: Paulina Gacek <paulinagacek@intel.com>

d86aa4ca

Replace LoDTensor with phi::DenseTensor in fluid\operators (#48417) · 91dd8a2e

Committed by 张春乔 on Nov 29, 2022

* replace LoDTensor with phi::DenseTensor in fluid\operators

* replace LoDTensor with phi::DenseTensor in fluid\operators

* Update split_lod_tensor_op.cc

* Update warpctc_op.cc

* Update broadcast_tensors_op.cc

* Update crf_decoding_op.cc

* Update lstm_op.cc

* Update lstm_op.cc

* Update lod_reset_op.cc

* Update gru_op.cc

* Update linear_chain_crf_op.cc

* resume 2 files for conflict

* Update gru_op.cc

* Update linear_chain_crf_op.cc

* Update lstm_op.cc

91dd8a2e

[CodeStyle][isort] introduce isort (part4) (#48402) · f85def97
Committed by Nyakku Shigure on Nov 29, 2022
```
* isort all files

* revert conflicting files

* revert conflicting files

* revert conflicting files
```
f85def97

[Paddle Inference] Add take_along_axis trt converter (#48358) · 9ae6c854
Committed by xiaoxiaohehe001 on Nov 29, 2022

9ae6c854

[PHI decoupling]migrate enforce_custom.h from fluid to phi (#48422) · 9896ac1e
Committed by Asthestarsfalll on Nov 29, 2022
```
* migrate enforce_custom.h from fluid to phi

* move to backends/custom/
```
9896ac1e

eltwise_div + scale [PHI] (#48484) · fa10524d
Committed by Sławomir Siwek on Nov 29, 2022

fa10524d

Optimize the implementation of the argsort operator. (#47738) · 9e9b705a
Committed by Vvsmile on Nov 29, 2022
```
Optimize the implementation of the argsort operator
```
9e9b705a

[PHI] Migrate matmul kernel (#48162) · f41ccbd5

Committed by Sławomir Siwek on Nov 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict

f41ccbd5

[Control Flow] replace executor in while op with InterpreterCore (#47573) · 6dbfbfa5

Committed by kangguangli on Nov 29, 2022

* fix:add no support for cuda_arch<700

* replace Executor in while op with InterpreterCore

* cache InterpreterCore as the member of WhileOp

* fix bug: tensor place changed because of assign op in while loop

* refine code

* refine code

* refine code

* hot fix

* fix compile

* merge develop

* follow comments

* add log for test

* remove LoDTensor

* set flag control_flow_use_new_executor false
Co-authored-by: fengshuai <fengshuai03@baidu.com>
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

6dbfbfa5

add floor fp32 op *test=kunlun (#48458) · 9d4b4be3
Committed by haosicheng on Nov 29, 2022

9d4b4be3

Bugfix for Collective default calc stream (#48308) · a66bb67a
Committed by JZ-LIANG on Nov 29, 2022
```
* get default calc stream from execution ctx instead of global dev ctx pool.
```
a66bb67a

Support rsqrt op. (#48223) · fc882c7b
Committed by gem5 on Nov 29, 2022

fc882c7b

[Fluid API]Remove multiple APIs in control_flow (#48279) · c0d31dac

Committed by LiYuRio on Nov 29, 2022

* remove lod_tensor_to_array, array_to_lod_tensor, DynamicRNN

* remove less_equal, greater_than, greater_equal, equal, not_equal

c0d31dac

[PHI decoupling] Move MKLDNN code (#48352) · fa051eec
Committed by Sławomir Siwek on Nov 29, 2022

fa051eec

Generate static graph code for lerp by yaml (#48322) · d5387de2

Committed by HappyHeavyRain on Nov 29, 2022

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* remove the 'attrs' of lerp, test=develop
Signed-off-by: lizhiyu02 <1528794076@qq.com>
Signed-off-by: lizhiyu02 <1528794076@qq.com>

d5387de2

[Sparse]BatchNorm use inplace (#48254) · d33d6db0
Committed by zhangkaihuo on Nov 29, 2022

d33d6db0

group the index in not cutlass mode (#48439) · 41ba2722
Committed by zhangkaihuo on Nov 29, 2022

41ba2722

28 Nov, 2022 16 commits
- S
  
  eltwises + scale fuse pass (#48400) · a0930484
  Committed by Sławomir Siwek on Nov 28, 2022
  
  a0930484
- J
  Reenabled reshape, squeeze and flatten oneDNN kernels (#48359) · 98aaf797
  Committed by jakpiase on Nov 28, 2022
```
* re-enabled reshape, squeeze and flatten kernels

* added formatting
```
  98aaf797
- W
  fix: multihead matmul biasqk broadcast support for [1,1,seq,seq] shape (#47975) · 11b9d85f
  Committed by Wang Bojun on Nov 28, 2022
```
* add trt support
```
  11b9d85f
- Z
  Generate static graph code for some ops by yaml (part5) (#48284) · b5c6c36c
  Committed by zyfncg on Nov 28, 2022
```
* generate static graph code for some operators

* add some ops generate

* revert npu gelu
```
  b5c6c36c
- H
  [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
  Committed by huangjiyi on Nov 28, 2022
```
* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug
```
  fd9c91c3
- 张
  
  replace LoDTensor with phi::DenseTensor in fluid\operators\*\ except sequence_ops (#48418) · 30a31a53
  Committed by 张春乔 on Nov 28, 2022
  
  30a31a53
- Y
  Optimize the log of broadcast and decrease the log level. (#48327) · 8424cf28
  Committed by Yiqun Liu on Nov 28, 2022
```
* Optimize the log of broadcast and decrease the log level.

* Remove the redundant brackets.

* Change op benchmark ci to test the tests module.

* Remove the observe of elementwise and reduce_ops sub-directory.
```
  8424cf28
- Y
  [BugFix]Fix OneDNN Kernels Bug when use pass (#48364) · df82fd35
  Committed by YuanRisheng on Nov 28, 2022
```
* Fix onednn kernel bugs

* fix gpu bugs
```
  df82fd35
- A
  
  migrate top_k_function_cuda.h from fluid to phi (#48251) · b4b926f4
  Committed by Asthestarsfalll on Nov 28, 2022
  
  b4b926f4
- P
  
  add cpu_info.h (#48403) · 923ad5dc
  Committed by PuQing on Nov 28, 2022
  
  923ad5dc
- Q
  [NPU] apply npu_identity to conv bn and copy2cpu, test=develop (#48039) · 32143f44
  Committed by Qi Li on Nov 28, 2022
```
* [NPU] apply npu_identity to conv bn and copy2cpu, test=develop

* update npu identity to share data with x, test=develop

* address review comments, test=develop
```
  32143f44
- Z
  Add trace mode for interpretercore (#48370) · bb1fffd6
  Committed by zhangbo9674 on Nov 28, 2022
```
* add trace mode for interpretercore

* fix bug

* add a ctrl flag

* add record for memcpyd2h

* polish code

* polish code
```
  bb1fffd6
- R
  Remove kSyncRun in StreamAnalyzer (#48425) · e7d459ac
  Committed by Ruibiao Chen on Nov 28, 2022
```
* Remove kSyncRun in StreamAnalyzer

* Update code
```
  e7d459ac
- H
  [Phi decouple] remove dependence on "paddle/fluid/platform/device/xpu/xxx.h" in phi (#48420) · 2bae75ed
  Committed by huangjiyi on Nov 28, 2022
```
* rm fluid “xpu_header.h” deps in phi

* move part of xpu_op_list.h from fluid to phi

* add fluid xpu_op_list deps

* add glog deps for xpu_op_list in phi

* fix PR-CI-Kunlun
```
  2bae75ed
- Z
  Fix bug of TransToFluidOpName (#48355) · d3f52efd
  Committed by zyfncg on Nov 28, 2022
```
* add fluid_op_name_map

* rename some kernel name

* add comments for op-kernel map

* refine map name of op to kernel
```
  d3f52efd
- Use phi layernorm (#48276) · 86d92092
  Committed by MarDino on Nov 28, 2022
  
  86d92092

PaddlePaddle / Paddle synced successfully about 1 year ago
