Commits · 4c375454585cae5612fd9a4f92d325b340c92b69 · PaddlePaddle / Paddle

10 Nov, 2022 12 commits
- H
  [PHI Decoupling] remove dependency on "paddle/fluid/platform/errors.h" and... · 4c375454
  Committed by huangjiyi on Nov 10, 2022
```
[PHI Decoupling] remove dependency on "paddle/fluid/platform/errors.h" and "paddle/fluid/platform/fast_divmod.h" in phi. (#47815)

* rm "paddle/fluid/platform/errors.h" in phi

* rm "paddle/fluid/platform/fast_divmod.h" in phi
```
  4c375454
- [Zero-Dim] support input 0D Tensor for xpu compare kernel, test=kunlun (#47812) · d01109fc
  Committed by zhouweiwei2014 on Nov 10, 2022
  
  d01109fc
- H
  
  rm "paddle/fluid/platform/place.h" in phi (#47823) · 03f976d6
  Committed by huangjiyi on Nov 10, 2022
  
  03f976d6
- P
  change cudnn error to cuda error if compiled cuda version is incompatible with... · b96a21df
  Committed by pangyoki on Nov 10, 2022
```
change cudnn error to cuda error if compiled cuda version is incompatible with installed cuda version (#47743)

* fix cudnn error

* fix

* fix

* fix
```
  b96a21df
- J
  XPU multi-card support eager mode (#47445) · 3b91f8f3
  Committed by james on Nov 10, 2022
```
* XPU support eager mode

* add unittest for XPU eager mode

* minor bugfix

* minor bugfix, test=kunlun

* correct copyright info

* 1. remove unused vars/funcs
2. ProcessGroupBKCL inherit from ProcessGroupStream

* bugfix for fp16 in eager mode multi-card, test=kunlun

* rebase & fix a few issues

* use new processgroup interface, test=kunlun

* fix compile issue, test=kunlun
```
  3b91f8f3
- W
  skip_merge_layernorm (#47810) · 1c6013dd
  Committed by wenbin on Nov 10, 2022
```
* skip_merge_layernorm

* add UT

* modify comments
```
  1c6013dd
- Z
  Add CI check for script of auto code-gen (#47814) · 00ea0b2f
  Committed by zyfncg on Nov 10, 2022
```
* add ci check for code-gen script

* update
```
  00ea0b2f
- J
  fix paddle with cinn cannot link relu op bug (#47793) · 8e65ac5d
  Committed by jiangcheng on Nov 10, 2022
```
* fix paddle with cinn cannot link relu op bug

* change cmake activation_op to generator_op
```
  8e65ac5d
- R
  Fuse multi transformer layer pass (#47541) · 1e3245a8
  Committed by RichardWooSJTU on Nov 10, 2022
```
* add fuse_multi_transformer_layer_pass
```
  1e3245a8
- C
  
  support pow_triple_grad op (#47799) · 7964119b
  Committed by Charles-hit on Nov 10, 2022
  
  7964119b
- W
  Refactor collective communication P2P C++ API (#47801) · d926c270
  Committed by Wen Sun on Nov 10, 2022
```
* refactor: send, recv, send_partial, recv_partial

* refactor: rm useless const ref
```
  d926c270
- Z
  
  fix amp cast bug for bn (#47802) · 5004c33a
  Committed by zhangbo9674 on Nov 10, 2022
  
  5004c33a
09 Nov, 2022 23 commits
- H
  [PHI decoupling] remove "paddle/fluid/platform/dynload/xxx.h" in phi (#47787) · 7c302538
  Committed by huangjiyi on Nov 09, 2022
```
* rm "paddle/fluid/platform/dynload/cudnn.h" in phi

* rm "paddle/fluid/platform/dynload/mklml.h" in phi

* rm "paddle/fluid/platform/dynload/rocblas.h" in phi

* replace "paddle::platform::dynload::" with "phi::dynload::" in phi

* revert "blas_impl.cu.h"
```
  7c302538
- H
  
  clean repetitious GetKernelTypeForVar (#47763) · c551e55d
  Committed by HongyuJia on Nov 09, 2022
  
  c551e55d
- L
  clean unused code: locked allocator (#47789) · 788d9328
  Committed by Leo Chen on Nov 09, 2022
```
* remove locked allocator

* fix ut

* add header file
```
  788d9328
- J
  
  Fix U2++ perf (#47780) · b1fb2360
  Committed by joanna.wozna.intel on Nov 09, 2022
  
  b1fb2360
- W
  Get grads from cpp for optimizer to avoid gpu idel time (#47709) · 261ebb0c
  Committed by WangZhen on Nov 09, 2022
```
* Get params and grads in cpp to avoid gpu idle time

* Using python param instead of cpp return param to fix test_asp_optimize_dynamic.py

* Get grads from cpp and construct params_grads on python

* Check meta and remove comments
```
  261ebb0c
- W
  [PHI decoupling] remove framework/data_type.h from phi (#47776) · 1631836f
  Committed by Wang Xin on Nov 09, 2022
```
* remove framework/data_type.h from phi

* fix CI fail: map proto::VarType to phi::DataType

* refactor code to add more detailed comments
```
  1631836f
- P
  Enable fc passes (#45704) · 7e914386
  Committed by Paulina Gacek on Nov 09, 2022
```
* Analysis API interface for disabling fc passes

* Unit tests corrected

* Python API added

* test runs only when PADDLE_WITH_MKLDNN

* Fc op changed to relu in matmul_op_test

* Disable fc passes in tests where acc drops

* code formatting

* Unit test for analysisConf added

* Unit test gpu added

* fc passes disabled when iterations=0 in gru test

* style

* passes disabled when fp32 in gru test

* fc passes disabled in lstm test

* Import from inference, not fluid in doc
```
  7e914386
- T
  [CodeStyle][E266] remove multiple '#' in comments (#47772) · 8c8cf0fd
  Committed by Tony Cao on Nov 09, 2022
```
* fix flake8 CodeStyle E266

* fix comments
```
  8c8cf0fd
- J
  
  fix for missing reorders in profiling (#47777) · a97b3630
  Committed by jakpiase on Nov 09, 2022
  
  a97b3630
- S
  
  cleanup unused code (#47762) · fb16fea3
  Committed by Sławomir Siwek on Nov 09, 2022
  
  fb16fea3
- J
  Final changes to introduce mem_desc to be hold in Tensor (#46768) · 14f261ad
  Committed by Jacek Czaja on Nov 09, 2022
```
* first commit

- more fixes

- compilation fix

- compilation fix

- fix

- another fix

- yet another fix

- Fix

- fix to fused ops

- compilation fix

- compilation fix

- another compilation fix

- another fix

- fix

- fix

- fix

- fix

- yet another fix

- fix

- fix

- cosmetic fix

:- lint

- Revert some changes (to be brought back later)

- fix to build

- Added prototype of slice

- fix

compilation fix

- compilation fix

- fix

- fix

- Fix

- fix

 fix
	modified:   cmake/flags.cmake

* lint

* rerun of CI

* - Fix

* - lint

* - lint2
```
  14f261ad
- H
  
  rm "paddle/fluid/platform/dynload/cublas.h" in phi (#47778) · 692a9632
  Committed by huangjiyi on Nov 09, 2022
  
  692a9632
- Z
  Generate static graph code for some ops by yaml (part2) (#47752) · ccb47076
  Committed by zyfncg on Nov 09, 2022
```
* generate static graph code of some op

* polish code

* fix bug

* update default value
```
  ccb47076
- H
  
  rm #include "paddle/fluid/framework/data_layout.h" in phi (#47770) · fd80288e
  Committed by huangjiyi on Nov 09, 2022
  
  fd80288e
- C
  
  add sin triple grad operator (#47753) · 267b218f
  Committed by cyber-pioneer on Nov 09, 2022
  
  267b218f
- C
  [PHI decoupling] Move fluid op generator into fluid (#47714) · f369b2b1
  Committed by Chen Weihang on Nov 09, 2022
```
* move fluid op generator into fluid

* remove parsed op

* resolve sig undef error

* append python interp find logic

* remove dup code
```
  f369b2b1
- W
  
  refactor: ProcessGroupNCCL (#47740) · ae14bad1
  Committed by Wen Sun on Nov 09, 2022
  
  ae14bad1
- Q
  Revert "[NPU] add more attrs into npu storiages, test=develop (#47645)" (#47751) · 87d97246
  Committed by Qi Li on Nov 09, 2022
```
This reverts commit 1568d64f.
```
  87d97246
- L
  
  new mp_allreduce_sum_op (#47715) · 18d33346
  Committed by LiYuRio on Nov 09, 2022
  
  18d33346
- F
  fix ScaleKernel configuration error where input numel is 0 (#47111) · 38ba5f2e
  Committed by FlyingQianMM on Nov 09, 2022
```
* fix scale kernel configuration error where input numel is 0

* fix code style

* add unit test case for scale op when numel of input x is zero

* fix ci codestyle check

* add cpu and gpu unit test case for scale op when numel of input x is zero

* add uninitialized judgment for input of scale
```
  38ba5f2e
- W
  [Paddle Inference]upgrade scale and slice op convert for Paddle-TensorRT (#47746) · cdd7b956
  Committed by Wangzheee on Nov 09, 2022
```
* upgrade scale and slice op convert for Paddle-TensorRT
```
  cdd7b956
- Z
  
  [Sparse]optimize sparse convolution and fix MaskHelper bug (#47703) · 1aa64d13
  Committed by zhangkaihuo on Nov 09, 2022
  
  1aa64d13
- W
  refine python call error report (#47724) · 5c7fce47
  Committed by wanghuancoder on Nov 09, 2022
```
* refine python call error report
```
  5c7fce47
08 Nov, 2022 5 commits
- R
  
  [CustomDevice] fix the not ready kernel can not register. (#47758) · 4b0f1b0c
  Committed by ronnywang on Nov 08, 2022
  
  4b0f1b0c
- W
  
  Fix compiler error with_trt (#47716) · 6934ae2b
  Committed by Wilber on Nov 08, 2022
  
  6934ae2b
- L
  
  refine comm api implementation (#47713) · 84c9a0d6
  Committed by LiYuRio on Nov 08, 2022
  
  84c9a0d6
- [Zero-Dim] support input 0D Tensor for sundary api (#47734) · 3198af20
  Committed by zhouweiwei2014 on Nov 08, 2022
```
* [Zero-Dim] support input 0D Tensor for sundary api

* fix comment
```
  3198af20
- S
  Migrate old C++ unit tests to Python framework (#47006) · 0c9f09b8
  Committed by Sławomir Siwek on Nov 08, 2022
```
* softplus+activation

* fc + elementwise_add test refactored

* rename MKLDNN to OneDNN

* fc+activation tests refactored

* remove softplus ut

* whitespace

* whitespace

* codestyle

* codestyle

* add more cases to fc+act

* remove softplus+hard_sigmoid pass

* remove softplus + hard_sigmoid UT

* add approximate for gelu

* swish beta range

* new codestyle

* reduce number of tests
```
  0c9f09b8

PaddlePaddle / Paddle synced successfully more than 1 year ago