Commits · ea96172efde585df86035de6c34582c0853e4655 · PaddlePaddle / Paddle

Sep 15, 2022 · 6 commits
- refine PADDLE_WITH_MKLDNN code (#46053) · ea96172e
  Committed by HongyuJia on Sep 15, 2022
```
* refine PADDLE_WITH_MKLDNN code

* fix data_norm_op

* polish addmm_op
```
  ea96172e
- updating mul and matmul with set_mem_desc (#45624) · 416e0de7
  Committed by Jacek Czaja on Sep 15, 2022
```
* - mul & matmul changes

- fix

- bs16 correction of strides

* - cosmetic fixes

* - lint

* - fix

* - fix

* - format -> mem_desc

* - fix

* - fix

* - fix

* - fix

* - fix
```
  416e0de7
- Clear extra attrs of elementwise op in OpMaker (#45845) · b26efe0d
  Committed by zyfncg on Sep 15, 2022
```
* clear extra attrs of elementwise op in opmaker

* fix op_debug_string_test

* fix bug of grad_add

* fix sort of runtime attrs
```
  b26efe0d
- Support 0 shapes input Tensor for MKL slice (#45930) · 1d78681d
  Committed by WangZhen on Sep 15, 2022
```
Support 0 shapes input Tensor for MKL slice kernel
```
  1d78681d
- [CodeStyle] trim trailing whitespace in .h, .cc, .cu, etc. (#46006) · 8dde7aea
  Committed by Nyakku Shigure on Sep 15, 2022
  
  8dde7aea
- General Plugin Mechanism (#45355) · bc77e6d5
  Committed by weishengying on Sep 15, 2022
  
  bc77e6d5
Sep 14, 2022 · 8 commits
- [PHI] Support bmm and bmm_grad in xpu (#45887) · 6bd2762c
  Committed by Jiabin Yang on Sep 14, 2022
```
* support bmm and bmm_grad in xpu

* add error removal

* test=kunlun

* refactor code for better structure

* test=kunlun

* add fp16 kernel for bmm

* test=kunlun
```
  6bd2762c
- [CodeStyle] trim trailing whitespace in .md and .rst (#45990) · 3404ff67
  Committed by Nyakku Shigure on Sep 14, 2022
```
* [CodeStyle] trim trailing whitespace in .md and .rst

* empty commit, test=document_fix
```
  3404ff67
- Migrate scale and scatter to phi, and modify the code style for... · 1349584e
  Committed by Leo Guo on Sep 14, 2022
```
Migrate scale and scatter to phi, and modify the code style for roi_align_kernel. test=kunlun (#45938)
```
  1349584e
- [XPU] migrate reduce kernels to phi, test=kunlun (#45973) · 5829069d
  Committed by ykkk2333 on Sep 14, 2022
  
  5829069d
- Fix DistributedFusedLAMB NaN problem (#46011) · 6833ecfe
  Committed by sneaxiy on Sep 14, 2022
```
* fix distributed_fused_lamb nan

* remove CUDA_ASSERT
```
  6833ecfe
- Simplify the codes of conv. (#45966) · 3a5b5048
  Committed by Yiqun Liu on Sep 14, 2022
  
  3a5b5048
- add mean,sum,ge,gt,ne,abs primitive operators for supporting deepxde (#45888) · 62176f63
  Committed by Xiaoxu Chen on Sep 14, 2022
```
* add reduce_mean,reduce_sum primitive ops

* add ne_p gt_p primitive operators

* add ge_p abs_p primitive operators
```
  62176f63
- [MLU] add mergedAdam kernel. (#45965) · bf6ec262
  Committed by Chenxiao Niu on Sep 14, 2022
  
  bf6ec262
Sep 13, 2022 · 3 commits
- [CustomDevice] register load_combine op (#45980) · b2122239
  Committed by ronnywang on Sep 13, 2022
  
  b2122239
- Clear extra attributes of activation op in OpMaker (#45772) · c7b373f2
  Committed by zyfncg on Sep 13, 2022
```
* clear extra attr of activation op in opmaker

* fix syntax bug

* fix mkldnn kernel

* fix merge conflict

* fix bug
```
  c7b373f2
- migrate squeeze kernel to phi, test=kunlun (#45968) · d3366853
  Committed by ykkk2333 on Sep 13, 2022
  
  d3366853
Sep 10, 2022 · 1 commit
- [MLU] fix compute error of dropout op (#45923) · 36915474
  Committed by qipengh on Sep 10, 2022
  
  36915474
Sep 09, 2022 · 7 commits
- make memcpy op to support custom_device (#45918) · 1ed8e9b8
  Committed by duanyanhui on Sep 09, 2022
```
* make memcpy op to support custom device

* fix bug
```
  1ed8e9b8
- [new-exe] convert fused_all_reduce_op_handle to program (#45774) · e755c07e
  Committed by Leo Chen on Sep 09, 2022
```
* add operator<< for BuildStrategy

* add fake_coalesce

* fit allreduce mode for new_exe

* remove debug code

* follow comments
```
  e755c07e
- [Phi] Migrate load kernel (#45891) · a001f263
  Committed by Chen Weihang on Sep 09, 2022
```
* migrate load kernel

* remove load op

* fix test failed
```
  a001f263
- convfusion_cache (#45902) · 3bad26ec
  Committed by xiaoxiaohehe001 on Sep 09, 2022
  
  3bad26ec
- [CustomDevice] add dy2static support (#45878) · abc85c50
  Committed by ronnywang on Sep 09, 2022
```
* [CustomDevice] add dy2static support

* update
```
  abc85c50
- [Phi] Add fusion kernel dir and migrate fused_softmax_mask op (#45802) · 2b4f44d5
  Committed by Chen Weihang on Sep 09, 2022
```
* add fusion dir and fuse_softmax_mask kernel

* remove fusion kernel dir

* migrate infershape

* fix code error
```
  2b4f44d5
- fix fused_gemm_epilogue compile error (#45899) · 7d000112
  Committed by sneaxiy on Sep 09, 2022
  
  7d000112
Sep 08, 2022 · 7 commits
- [PHI] Migrate cast, clip+grad and pool+grad oneDNN kernels (#45775) · 1a929c31
  Committed by piotrekobi on Sep 08, 2022
```
* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* remove fluid code

* onednn renaming

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Migrate pool+grad, clip+grad and cast oneDNN kernels to PHI

* Refactor grad kernels into separate files

* Fix CI failures

* Fix Codestyle

* Implement reviewer suggestions

* Add new lines after includes for readability
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>
```
  1a929c31
- Migrate roi_align and roi_align_grad to phi. test=kunlun (#45858) · 8add11a0
  Committed by Leo Guo on Sep 08, 2022
  
  8add11a0
- polish code comment, test=doc (#45859) · 447d79da
  Committed by HongyuJia on Sep 08, 2022
  
  447d79da
- xpu-paddlepaddle-40 [Task] fused_gemm_epilogue supports xpu (#45706) · 7085cb97
  Committed by taixiurong on Sep 08, 2022
```
* add gemm_epilogue

* xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun
```
  7085cb97
- cinn_launch op: fix dtype of tensor is always mutable_data<float> (#45835) · ef53e1b4
  Committed by TeFeng Chen on Sep 08, 2022
  
  ef53e1b4
- [Dy2Static] Filter int64/int32/int16/bool in conditional op (#45759) · 36046a89
  Committed by xiongkun on Sep 08, 2022
```
* stop pass filter int32/int16/int64/bool inputs in cond_op

* fix bugs: except block 0, the backward vars and forward vars exist in different blocks.

* fix code by review
```
  36046a89
- fix fused_gemm_epilogue_op compile error (#45862) · 569d6c5b
  Committed by sneaxiy on Sep 08, 2022
  
  569d6c5b
Sep 07, 2022 · 8 commits
- [Phi] Migrate save kernel (#45665) · fc66fdb7
  Committed by Chen Weihang on Sep 07, 2022
```
* add save kernel

* add save_sr_kernel

* remove original save_op

* add save gpu kernel

* remove combine kernel

* add port.h include

* add save selected rows test

* remove useless kernel.h
```
  fc66fdb7
- rename the template type name for transpose (#45834) · 9b70c556
  Committed by Yuang Liu on Sep 07, 2022
  
  9b70c556
- Construct exec and ctx only once in cond op to speed up (#45794) · ba653e7b
  Committed by WangZhen on Sep 07, 2022
```
* Construct exec and ctx only once in cond op to speed up

* Fix construct function error
```
  ba653e7b
- Fix fused cuda op's mutable data [2] (#45562) · 4bbbed9a
  Committed by Wilber on Sep 07, 2022
  
  4bbbed9a
- [PHI] Migrate reduce sum+grad, mean+grad, min and max oneDNN kernels (#45536) · 22255528
  Committed by piotrekobi on Sep 07, 2022
```
* gaussian random

* mkldnn to onednn renaming

* fix merge conflicts

* Migrate reduce_op oneDNN kernels to phi

* Remove unnecessary header

* remove fluid code

* onednn renaming

* Change std::vector<int64_t> to IntArray

* Fix code style

* Move classes from mkldnn_reuse.h to onednn_reuse.h

* Move more functions from mkldnn_helper.h to onednn_helper.h

* Change MKLDNN to OneDNN in VLOG message

* Implement reviewer suggestions
Co-authored-by: NSilv3S <slawomir.siwek@intel.com>
```
  22255528
- [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
  fe169bf1
- [alphafold] Transpose support large tensors where their numel is bigger than INT32_MAX (#45753) · d9a9e638
  Committed by Yuang Liu on Sep 07, 2022
  
  d9a9e638
- Clear extra attrs of reduce op in OpMaker (#45786) · 63b6a11b
  Committed by zyfncg on Sep 07, 2022
```
* clear extra attrs of reduce op in opmaker

* fix reduce_mean
```
  63b6a11b

PaddlePaddle / Paddle · synced successfully about 1 year ago
