Commits · 485de16aa3214e157aad4fa95b7bbf6958ea48e0 · BaiXuePrincess / Paddle

29 Nov, 2022 · 8 commits

eltwise_div + scale [PHI] (#48484) · fa10524d
Committed by Sławomir Siwek on Nov 29, 2022

fa10524d

[PHI] Migrate matmul kernel (#48162) · f41ccbd5

Committed by Sławomir Siwek on Nov 29, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

* matmul fwd

* add extra attr

* temp disable passes

* re-enable passes

* workaround for matmul+act

* fix for matmul+eltwise_add

* fix typo

* merge bugfix #48364

* remove merge conflict

f41ccbd5

[Control Flow] replace executor in while op with InterpreterCore (#47573) · 6dbfbfa5

Committed by kangguangli on Nov 29, 2022

* fix:add no support for cuda_arch<700

* replace Executor in while op with InterpreterCore

* cache InterpreterCore as the member of WhileOp

* fix bug: tensor place changed because of assign op in while loop

* refine code

* refine code

* refine code

* hot fix

* fix compile

* merge develop

* follow comments

* add log for test

* remove LoDTensor

* set flag control_flow_use_new_executor false
Co-authored-by: fengshuai <fengshuai03@baidu.com>
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

6dbfbfa5

Bugfix for Collective default calc stream (#48308) · a66bb67a
Committed by JZ-LIANG on Nov 29, 2022
```
* get default calc stream from execution ctx instead of global dev ctx pool.
```
a66bb67a

[Fluid API]Remove multiple APIs in control_flow (#48279) · c0d31dac

Committed by LiYuRio on Nov 29, 2022

* remove lod_tensor_to_array, array_to_lod_tensor, DynamicRNN

* remove less_equal, greater_than, greater_equal, equal, not_equal

c0d31dac

[PHI decoupling] Move MKLDNN code (#48352) · fa051eec
Committed by Sławomir Siwek on Nov 29, 2022

fa051eec

Generate static graph code for lerp by yaml (#48322) · d5387de2

Committed by HappyHeavyRain on Nov 29, 2022

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* generate static graph code for lerp by yaml, test=develop

* modify the op_compat.yaml of lerp, test=develop

* remove the 'attrs' of lerp, test=develop
Signed-off-by: lizhiyu02 <1528794076@qq.com>
Signed-off-by: lizhiyu02 <1528794076@qq.com>

d5387de2

[Sparse]BatchNorm use inplace (#48254) · d33d6db0
Committed by zhangkaihuo on Nov 29, 2022

d33d6db0

28 Nov, 2022 · 9 commits
- eltwises + scale fuse pass (#48400) · a0930484
  Committed by Sławomir Siwek on Nov 28, 2022
  
  a0930484
- Reenabled reshape, squeeze and flatten oneDNN kernels (#48359) · 98aaf797
  Committed by jakpiase on Nov 28, 2022
```
* re-enabled reshape, squeeze and flatten kernels

* added formatting
```
  98aaf797
- fix: multihead matmul biasqk broadcast support for [1,1,seq,seq] shape (#47975) · 11b9d85f
  Committed by Wang Bojun on Nov 28, 2022
```
* add trt support
```
  11b9d85f
- Generate static graph code for some ops by yaml (part5) (#48284) · b5c6c36c
  Committed by zyfncg on Nov 28, 2022
```
* generate static graph code for some operators

* add some ops generate

* revert npu gelu
```
  b5c6c36c
- [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
  Committed by huangjiyi on Nov 28, 2022
```
* decouple cudnn_desc.h from fluid

* move cudnn_desc.h from fluid to phi

* fix bugs

* decouple cudnn_helper.h from fluid

* fix bugs

* move cudnn_helper.h from fluid to phi

* add fluid cudnn_helper.h

* move miopen_desc.h from fluid to phi

* move miopen_helper.h from fluid to phi

* fix bugs

* move gpu_dnn.h from fluid to phi

* fix bugs

* update copyright year

* simplify gpu_dnn.h in fluid

* fix bugs

* fix xpu build bug

* fix compile bug

* fix bug
```
  fd9c91c3
- replace LoDTensor with phi::DenseTensor in fluid\operators\*\ except sequence_ops (#48418) · 30a31a53
  Committed by 张春乔 on Nov 28, 2022
  
  30a31a53
- migrate top_k_function_cuda.h from fluid to phi (#48251) · b4b926f4
  Committed by Asthestarsfalll on Nov 28, 2022
  
  b4b926f4
- Use phi layernorm (#48276) · 86d92092
  Committed by MarDino on Nov 28, 2022
  
  86d92092
- add pbtxt (#48326) · d7540a4a
  Committed by wenbin on Nov 28, 2022
  
  d7540a4a
24 Nov, 2022 · 2 commits

[PHI decoupling] simplify "convert_utils.h" in fluid (#48168) · de4310e6

Committed by huangjiyi on Nov 24, 2022

* rm dependence to "convert_utils.h" in some files

* fix bugs

* replace DataType2String with DataTypeToString

* replace framework::DataTypeSize with phi::SizeOf

* mv convert_function from fluid to phi and rm old map

* recommit with pre-commit

* replace ProtoVarType with ProtoDataType and update comment.

* fix error about include "dnnl.hpp"

* revert add dep mkldnn to convert_utils in phi

* add mkldnn deps in convert_utils.h in phi

* move deps to convert_utils.h in phi

de4310e6

[PHI] Migrate batch_norm_grad kernel (#48288) · 561b7278
Committed by Sławomir Siwek on Nov 24, 2022

561b7278

23 Nov, 2022 · 4 commits
- [PHI decoupling] move im2col from fluid to phi (#48174) · 88cac16b
  Committed by huangjiyi on Nov 23, 2022
```
* decouple im2col from fluid

* move im2col to phi

* fix build error

* delete redundant comment
```
  88cac16b
- add support of controlflow op for custom device (#48259) · edf46919
  Committed by duanyanhui on Nov 23, 2022
  
  edf46919
- add warpctc kernel and change cast_v2 to cast for xpu, test=kunlun (#48134) · 25ffe9c2
  Committed by zhangyikun02 on Nov 23, 2022
  
  25ffe9c2
- Use cublaslt in multi transformer FFN (#48052) · b07e6b45
  Committed by MarDino on Nov 23, 2022
```
* use fused mlp in multi transformer
* Restruct code
* use cublaslt to fuse ffn
* fix conflict
```
  b07e6b45
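
For readers unfamiliar with the im2col helper moved in #48174 above, the idea can be sketched in a few lines. This is a minimal pure-Python illustration for a single-channel 2D input with no padding, not Paddle's implementation; the function names `im2col` and `conv2d_via_im2col` and the one-patch-per-row layout are assumptions made for this sketch.

```python
def im2col(image, kh, kw, stride=1):
    """Flatten every kh x kw window of a 2D list into one row.

    Assumption for this sketch: single channel, no padding, and each
    output row holds one patch (real libraries may use the transpose).
    """
    h, w = len(image), len(image[0])
    cols = []
    for i in range(0, h - kh + 1, stride):
        for j in range(0, w - kw + 1, stride):
            patch = [image[i + di][j + dj]
                     for di in range(kh) for dj in range(kw)]
            cols.append(patch)
    return cols


def conv2d_via_im2col(image, kernel, stride=1):
    """Cross-correlate image with kernel as a matrix-vector product."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [v for row in kernel for v in row]
    return [sum(a * b for a, b in zip(patch, flat_kernel))
            for patch in im2col(image, kh, kw, stride)]


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(im2col(img, 2, 2))
    print(conv2d_via_im2col(img, [[1, 0], [0, 1]]))
```

Once the windows are laid out this way, convolution reduces to a dense matrix product, which is why such helpers usually sit next to the matmul kernels.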
22 Nov, 2022 · 7 commits

[PHI] Migrate elementwise_div + all elementwise grad kernels (#48210) · 78b30e97
Committed by Piotr Paturej on Nov 22, 2022
```
* Migrate elementwise_div

* Migrate elementwise grad kernels
```
78b30e97
fix typo error (#48156) · 0cdca676
Committed by HongyuJia on Nov 22, 2022

0cdca676
[PHI decoupling] move vol2col from fluid to phi (#48175) · aa36c6aa
Committed by huangjiyi on Nov 22, 2022
```
* move vol2col from fluid to phi

* update copyright year
```
aa36c6aa
CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203) · df4dfda0
Committed by Tian Zheng on Nov 22, 2022
```
* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
```
df4dfda0

Some residualdata fixes (#48118) · 7bbdbe5b

Committed by Sylwester Fraczek on Nov 22, 2022

Removed ResidualData and Bias from ExtraAttrProperties because they are not attributes.
Removed a bug in the check for the ResidualData attribute in matmul_elementwise_add_fuse_pass.
Removed residualData from the list of matmul outputs in cpu_bfloat16_pass.cc because it is an input.
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

7bbdbe5b

Delete caching from requantize_mkldnn_op and changed to Acquire API (#48113) · 7d6a4a54
Committed by Hulek on Nov 22, 2022
```
* Delete caching from requantize_mkldnn_op and changed to Acquire API
* Fixed codestyle and implementation
```
7d6a4a54

[PHI decoupling] remove "gpu_device_function.h" in fluid. (#48117) · 4da1a0fe

Committed by huangjiyi on Nov 22, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* rm dependence to "gpu_device_function.h" in fluid

* rm gpu_device_function.h etc in fluid

* fix rocm-compile bugs

* fix cuda_helper_test.cu bugs

4da1a0fe

21 Nov, 2022 · 5 commits

add fc-residual quantization (#46917) · fed0ed34

Committed by Sylwester Fraczek on Nov 21, 2022

* add fc-residual quantization

* revert removal of check for use_mkldnn

* fix bug

* add disable_logs

* review fix

call twice AreScalesPresntForNodes instead of if-else

* rewrite residual input to output

* revert fc mkldnn taking residual data

* format fix

* fix LoDTensor->DenseTensor

* LoDTensor->DenseTensor

* output->input

* revert changes to unsupported script

revert changes to unsupported script

* remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

fed0ed34

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

Committed by Sławomir Siwek on Nov 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

55f6fb3d

mma qk tensor_core (#48087) · d79eda71

Committed by lzy on Nov 21, 2022

* use mma for QK dot computing in fused_multi_transformer.
* Update fused_multi_transformer_op.cu.h

d79eda71

[PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d
Committed by huangjiyi on Nov 21, 2022
```
* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template
```
3501ff7d

Unify `ProcessGroupNCCL` APIs underlying implementation (#48163) · 88410225

Committed by Wen Sun on Nov 21, 2022

* refactor: replace Collective & PointToPoint with NCCLEnv

* refactor: rename to RunFnInNCCLEnv

* refactor: pass std::function by value

88410225

18 Nov, 2022 · 5 commits

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e

Committed by MarDino on Nov 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

d595928e

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada

Committed by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

4ab18ada

[PHI] Migrate conv_transpose kernel (#48119) · 9aacb31b

Committed by Zuza Gawrysiak on Nov 18, 2022

* Migrate conv_transpose to phi

* Move handler to kernel

* kernel m

* Fix formatting

* handler

* remove fluid

* revert tcp_store

* tcp_store

* remove unused

* Fix declaration

* add dnn input

* Fix typo
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

9aacb31b

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540

Committed by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment

b0e28540
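
The "fast gelu" and "use tanh function instead" bullets in #47679 above refer to approximating GELU with a tanh expression. A minimal sketch of a fused bias-add plus GELU step, using the widely published tanh approximation, might look like the following. It is plain Python for illustration only; the names `gelu_exact`, `gelu_tanh`, and `bias_add_gelu` are hypothetical and do not come from the Paddle kernel.

```python
import math


def gelu_exact(x):
    """Reference GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Tanh approximation of GELU (the usual 'fast' variant)."""
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))


def bias_add_gelu(row, bias, approximate=True):
    """Add a bias vector to a row, then apply GELU elementwise.

    'approximate' plays the role of a fast-gelu switch in this sketch.
    """
    act = gelu_tanh if approximate else gelu_exact
    return [act(v + b) for v, b in zip(row, bias)]


if __name__ == "__main__":
    row, bias = [-1.0, 0.0, 2.0], [0.5, 0.5, 0.5]
    print(bias_add_gelu(row, bias, approximate=True))
    print(bias_add_gelu(row, bias, approximate=False))
```

Fusing the two steps avoids writing the biased intermediate back to memory, which is the usual motivation for a kernel of this kind.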

[PHI decoupling] remove "gpu_primitives.h" in fluid (#48063) · 9918bf9c
Committed by Wang Xin on Nov 18, 2022
```
* remove "gpu_primitives.h" in fluid namespace

* fix PR-CI-GpuPS fail

* fix PR-CI-GpuPS fail
```
9918bf9c

BaiXuePrincess / Paddle: in sync with the fork source project