Commits · de443726c837797175d0aabd5e4493c6595b8b41 · PaddlePaddle / Paddle

29 Nov, 2022 10 commits

[PHI] Migrate matmul kernel (#48162) · f41ccbd5

Committed by Sławomir Siwek on Nov 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict

f41ccbd5

[Control Flow] replace executor in while op with InterpreterCore (#47573) · 6dbfbfa5

Committed by kangguangli on Nov 29, 2022

* fix:add no support for cuda_arch<700

* replace Executor in while op with InterpreterCore

* cache InterpreterCore as the member of WhileOp

* fix bug: tensor place changed because of assign op in while loop

* refine code

* refine code

* refine code

* hot fix

* fix compile

* merge develop

* follow comments

* add log for test

* remove LoDTensor

* set flag control_flow_use_new_executor false

Co-authored-by: Nfengshuai <fengshuai03@baidu.com>
Co-authored-by: Nzhiqiu <chenqiuliang@baidu.com>

6dbfbfa5

add floor fp32 op *test=kunlun (#48458) · 9d4b4be3
Committed by haosicheng on Nov 29, 2022

9d4b4be3

Bugfix for Collective default calc stream (#48308) · a66bb67a
Committed by JZ-LIANG on Nov 29, 2022
```
* get default calc stream from execution ctx instead of global dev ctx pool.
```
a66bb67a

Support rsqrt op. (#48223) · fc882c7b
Committed by gem5 on Nov 29, 2022

fc882c7b

[Fluid API]Remove multiple APIs in control_flow (#48279) · c0d31dac

Committed by LiYuRio on Nov 29, 2022

* remove lod_tensor_to_array, array_to_lod_tensor, DynamicRNN

* remove less_equal, greater_than, greater_equal, equal, not_equal

c0d31dac

[PHI decoupling] Move MKLDNN code (#48352) · fa051eec
Committed by Sławomir Siwek on Nov 29, 2022

fa051eec

Generate static graph code for lerp by yaml (#48322) · d5387de2

Committed by HappyHeavyRain on Nov 29, 2022

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* remove the 'attrs' of lerp, test=develop

Signed-off-by: lizhiyu02 <1528794076@qq.com>
Signed-off-by: lizhiyu02 <1528794076@qq.com>

d5387de2

[Sparse]BatchNorm use inplace (#48254) · d33d6db0
Committed by zhangkaihuo on Nov 29, 2022

d33d6db0

group the index in not cutlass mode (#48439) · 41ba2722
Committed by zhangkaihuo on Nov 29, 2022

41ba2722

28 Nov, 2022 22 commits
- eltwises + scale fuse pass (#48400) · a0930484
  Committed by Sławomir Siwek on Nov 28, 2022
  
  a0930484
- Reenabled reshape, squeeze and flatten oneDNN kernels (#48359) · 98aaf797
  Committed by jakpiase on Nov 28, 2022
```
* re-enabled reshape, squeeze and flatten kernels

* added formatting
```
  98aaf797
- fix: multihead matmul biasqk broadcast support for [1,1,seq,seq] shape (#47975) · 11b9d85f
  Committed by Wang Bojun on Nov 28, 2022
```
* add trt support
```
  11b9d85f
- Generate static graph code for some ops by yaml (part5) (#48284) · b5c6c36c
  Committed by zyfncg on Nov 28, 2022
```
* generate static graph code for some operators

* add some ops generate

* revert npu gelu
```
  b5c6c36c
- [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
  Committed by huangjiyi on Nov 28, 2022
```
* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug
```
  fd9c91c3
- replace LoDTensor with phi::DenseTensor in fluid\operators\*\ except sequence_ops (#48418) · 30a31a53
  Committed by 张春乔 on Nov 28, 2022
  
  30a31a53
- Optimize the log of broadcast and decrease the log level. (#48327) · 8424cf28
  Committed by Yiqun Liu on Nov 28, 2022
```
* Optimize the log of broadcast and decrease the log level.

* Remove the redundant brackets.

* Change op benchmark ci to test the tests module.

* Remove the observe of elementwise and reduce_ops sub-directory.
```
  8424cf28
- [BugFix]Fix OneDNN Kernels Bug when use pass (#48364) · df82fd35
  Committed by YuanRisheng on Nov 28, 2022
```
* Fix onednn kernel bugs

* fix gpu bugs
```
  df82fd35
- migrate top_k_function_cuda.h from fluid to phi (#48251) · b4b926f4
  Committed by Asthestarsfalll on Nov 28, 2022
  
  b4b926f4
- add cpu_info.h (#48403) · 923ad5dc
  Committed by PuQing on Nov 28, 2022
  
  923ad5dc
- [NPU] apply npu_identity to conv bn and copy2cpu, test=develop (#48039) · 32143f44
  Committed by Qi Li on Nov 28, 2022
```
* [NPU] apply npu_identity to conv bn and copy2cpu, test=develop

* update npu identity to share data with x, test=develop

* address review comments, test=develop
```
  32143f44
- Add trace mode for interpretercore (#48370) · bb1fffd6
  Committed by zhangbo9674 on Nov 28, 2022
```
* add trace mode for interpretercore

* fix bug

* add a ctrl flag

* add record for memcpyd2h

* polish code

* polish code
```
  bb1fffd6
- Remove kSyncRun in StreamAnalyzer (#48425) · e7d459ac
  Committed by Ruibiao Chen on Nov 28, 2022
```
* Remove kSyncRun in StreamAnalyzer

* Update code
```
  e7d459ac
- [Phi decouple] remove dependece to "paddle/fluid/platform/device/xpu/xxx.h" in phi (#48420) · 2bae75ed
  Committed by huangjiyi on Nov 28, 2022
```
* rm fluid “xpu_header.h” deps in phi

* move part of xpu_op_list.h from fluid to phi

* add fluid xpu_op_list deps

* add glog deps for xpu_op_list in phi

* fix PR-CI-Kunlun
```
  2bae75ed
- Fix bug of TransToFluidOpName (#48355) · d3f52efd
  Committed by zyfncg on Nov 28, 2022
```
* add fluid_op_name_map

* rename some kernel name

* add comments for op-kernel map

* refine map name of op to kernel
```
  d3f52efd
- Use phi layernorm (#48276) · 86d92092
  Committed by MarDino on Nov 28, 2022
  
  86d92092
- add pbtxt (#48326) · d7540a4a
  Committed by wenbin on Nov 28, 2022
  
  d7540a4a
- [Paddle Inference] Add gather_nd trt converter. (#47589) · 20c3224d
  Committed by xiaoxiaohehe001 on Nov 28, 2022
```
* add_gather_nd_

* add_gather_nd_

* add_gather_nd_
```
  20c3224d
- fix expand as op (#48336) · 827fd5cd
  Committed by Thomas Young on Nov 28, 2022
```
* fix expand as op

* fix bug
```
  827fd5cd
- add square fp16 *test=kunlun (#48095) · 81d0a3cc
  Committed by haosicheng on Nov 28, 2022
  
  81d0a3cc
- 【fluid api clear】Remove reduce sum (#48330) · 8d00f76e
  Committed by xiaoguoguo626807 on Nov 28, 2022
```
* remove fluid.reduce_sum

* remove fluid.reduce_sum

* modify axis and import paddle

* modify keepdim and out_name

* modift unittest

* modift unittest

* modify CI_static and loss.py

* modify test_mse_loss

* modify static ci

* modify static ci datatype

* add import paddle in test

* fix conflict

* fix conflict

* modify ci

* modify ci

* fix_conflict

* fix bug

* code_style
```
  8d00f76e
- Remove LoDTensor and Tensor in fluid except operators folder (#48416) · 4527d249
  Committed by 张春乔 on Nov 28, 2022
```
* Update communicator.cc

* Update communicator.cc

* remove LoDTensor

* remove LoDTensor and Tensor
```
  4527d249

26 Nov, 2022 2 commits
- add reciprocal trt converter (#48230) · d80330fe
  Committed by gem5 on Nov 26, 2022
  
  d80330fe
- fix jit input var not ready error (#48351) · ab6a3dad
  Committed by Leo Chen on Nov 26, 2022
```
* hot fix

* fix compile

* merge develop

* follow comments
```
  ab6a3dad

25 Nov, 2022 6 commits
- fix loopup_table plugin deserialize size error (#48379) · 128ef1ae
  Committed by zhangxin81 on Nov 25, 2022
```
* fix loopup_table plugin deserialize size error
```
  128ef1ae
- for xpu multi thread bug test (#48373) · a1bdc652
  Committed by wanghuancoder on Nov 25, 2022
```
* for xpu multi thread bug test
```
  a1bdc652
- [Paddle Inference]fix token prune plugin (#48367) · c6de4342
  Committed by Wangzheee on Nov 25, 2022
```
* fix
```
  c6de4342
- Group norm fp16 support (#48222) · 34fd65cf
  Committed by Wang Bojun on Nov 25, 2022
```
* group norm fp16 support
```
  34fd65cf
- [PROFILER] add flops for Profiler (#47766) · 3d1981ad
  Committed by Chitsing KUI on Nov 25, 2022
```
* attr ready

* op ip ready

* start dynamic

* end2end ok

* input shape to map, stat by op

* layer wip

* first version ready

* fix proto depds

* fix profiler deps

* fix flops typo, rm tuple shape
```
  3d1981ad
- Refactor stream anayzer (#48158) · 889318d8
  Committed by Ruibiao Chen on Nov 25, 2022
```
* Move stream_anayzer to interpreter

* Refactor StreamAnalyzer

* Refactor RunNextInstructionList

* Remove no_data_transform_index

* Fix typos

* Fix data_transfer OpFuncType error

* Add event for depend_op

* Update transfer OpFuncType for heter place
```
  889318d8

PaddlePaddle / Paddle: last synced successfully about 1 year ago
