Commits · 8031a4dc8b05dcfee95af2ca613fc736fc7f9830 · 机器未来 / Paddle

09 Mar, 2022 · 3 commits

[Phi] move Reduce max kernel into phi (#40225) · 8031a4dc

Committed by chentianyu03 on Mar 09, 2022

* add reduce_max kernel

* add reduce max kernel

* update reduce max Argumentmapping

* remove reduce_max kernel

* remove reduce_max kernel

* add reduce max infermeta

* rename reduce infermeta

8031a4dc

fix batch_norm op kernel (#40171) · fb4215b2
Committed by JingZhuangzhuang on Mar 09, 2022

fb4215b2

fix take_along_axis cuda op register bug (#40270) · fcae3430

Committed by Yang on Mar 09, 2022

* fix take_along_axis cuda op register bug

* add comma after float

Co-authored-by: Chen Weihang <chenwhpro@163.com>

fcae3430

08 Mar, 2022 · 29 commits
- support ema optimizer in sharding optimizers (#39860) · e548f65f
  Committed by Roc on Mar 08, 2022
  
  e548f65f
- Rename phi::func::TensorReduceImpl to phi::func::ReduceKernel. (#40183) · 688743bf
  Committed by Yiqun Liu on Mar 08, 2022
  
  688743bf
- Add profiler statistic (#40249) · c1d81ec1
  Committed by chenjian on Mar 08, 2022
```
* add python profiler package

* update according to review

* fix bug

* fix bug

* fix bug

* add unit test

* Revert "add unit test"

This reverts commit 4e69ff71b0645e069afe5dd8fea0d07717852c48.

* reduce for pr

* add unit test

* modify for pr

* fix unittest

* update for ci coverage

* modify according to review

* fix bug

* improve coverage

* add profiler code

* add statistic code

* reduce content for pr
```
  c1d81ec1
- [Phi] Remove gpudnn suffix & polish cmake (#40239) · 3a77d027
  Committed by Chen Weihang on Mar 08, 2022
```
* remove gpudnn suffix & polish cmake

* fix typo
```
  3a77d027
- fix yolov3 return value in dygraph mode. test=develop (#40185) · 9aa6bfc7
  Committed by Kaipeng Deng on Mar 08, 2022
  
  9aa6bfc7
- remove isinstance Dataset check. test=develop (#40184) · 2ce007ca
  Committed by Kaipeng Deng on Mar 08, 2022
  
  2ce007ca
- Fix fold python examples (#38636) · d4a4eb9d
  Committed by xiaoting on Mar 08, 2022
```
* fix fold python examples, test=develop

* fix size type, test=develop

* fix python example, test=develop

* fix fold shape check

* fix fold dygraph mode, test=develop
```
  d4a4eb9d
- [Phi] move ops: maxout/take_along_axis/put_along_axis (#39959) · 48b4366c
  Committed by Yang on Mar 08, 2022
```
* [Phi] move put_along_axis/take_along_axis/maxout

* use phi::Copy
```
  48b4366c
- Add exception throw for norm_conv when platform is not supported (#40166) · 00566ead
  Committed by Zhang Zheng on Mar 08, 2022
```
* Add throw for norm_conv when platform is not supported

* fix format
```
  00566ead
- add the implementation of process group for hccl (#40228) · 73583f86
  Committed by lilong12 on Mar 08, 2022
```
* add pg_hccl
```
  73583f86
- [Phi] Move matrix inverse into phi (#40237) · 7024ade7
  Committed by Chen Weihang on Mar 08, 2022
```
* move matrix inverse into phi

* change license year
```
  7024ade7
- [Phi]Move Relu/Cos/Sin/Tan/Acos/Asin/Atan/Sinh/Cosh/Asinh/Acosh/Atanh kernels... · 975f99ab
  Committed by YuanRisheng on Mar 08, 2022
```
[Phi]Move Relu/Cos/Sin/Tan/Acos/Asin/Atan/Sinh/Cosh/Asinh/Acosh/Atanh kernels in Activation to Phi (#40175)

* move activation op

* adjust code format

* fix compile bugs

* fix ci bugs

* code format adjust

* code format adjust2

* activate ci status

* modify according to comment
```
  975f99ab
- add support for concat and variadic tensor list (#40229) · f1fe2ad4
  Committed by xiongkun on Mar 08, 2022
  
  f1fe2ad4
- [PHI] Support string type attr in yaml (#40218) · 47d1d5af
  Committed by zyfncg on Mar 08, 2022
```
* support str attr in yaml

* fix bug
```
  47d1d5af
- [IPU] update ipu unittests p4 (#40073) · 061044a0
  Committed by Allen Guo on Mar 08, 2022
```
* update ipu UTs part4

* rename uts

* sync api changes

* update uts for new api
```
  061044a0
- [IPU] update ipu unittests p2 (#40069) · a279a4f8
  Committed by Allen Guo on Mar 08, 2022
```
* update ipu UTs part2

* clean git

* rename ut

* rename ut 1

* sync api changes

* update uts for new api

* update uts for new api

* fix re-define
```
  a279a4f8
- [phi] transfer accuracy op and pass the unittests (#39982) · 13f2b1e3
  Committed by xiongkun on Mar 08, 2022
```
* transfer accuracy op and pass the ci

* remove header file

* fix code

* fix code

* fix

* fix
```
  13f2b1e3
- [phi] move isnan_v2, isfinite_v2, isinf_v2 to phi (#40076) · 3c536f2e
  Committed by WJJ1995 on Mar 08, 2022
```
* support isfinite for phi

* mark v2

* fixed bugs

* fixed include bugs

* deal with comments

* decoupling selected_rows

* rm bfloat16

* fixed infermeta

* fixed code style

* rm useless code

* replace pt by pd
```
  3c536f2e
- support code auto-gene for sparse backward api (#40196) · f876320a
  Committed by zyfncg on Mar 08, 2022
  
  f876320a
- add share dims (#40238) · d4b007af
  Committed by YuanRisheng on Mar 08, 2022
  
  d4b007af
- [custom kernel]Upgrade support for multiple libs (#40223) · c39aa18e
  Committed by Aganlengzi on Mar 08, 2022
```
* [custom kernel]Upgrade support for multi libs

* upgrade phi_custom_kernel deps
```
  c39aa18e
- [MLU] add fleet init api and collective api pytest for mlu (#40010) · c722ee69
  Committed by mhhhh1 on Mar 08, 2022
```
* [MLU] add fleet init api and collective api pytest for mlu

* fix no value for argument 'data_type' in method call
```
  c722ee69
- [Phi] move the graph_send_recv op to the phi (#40092) · 6bd2d2b1
  Committed by wawltor on Mar 08, 2022
```
* [Phi] transfer old kernel to pten kernel for the graph_send_recv op

* update the code for the define of graph_send_recv

* fix the gradient problem for graph_send_recv

* fix the compile problem

* update the enforce message for the windows

* update the code for the compiler

* update compiler problem for the windows

* update the code for windows

* fix some format problem
```
  6bd2d2b1
- remove unnecessary constant fill in sequence conv test=kunlun. (#40126) · 413a743e
  Committed by tanzhipeng on Mar 08, 2022
  
  413a743e
- [Phi] move InferShape for truncated_gaussian_random and gaussian_random (#40191) · 81d4142b
  Committed by furnace on Mar 08, 2022
```
* [Phi] move InferShape for truncated_gaussian_random and gaussian_random

* [Phi] delete useless codes
```
  81d4142b
- fix paddle.median torch diff (#40118) · 0c33c47e
  Committed by ronnywang on Mar 08, 2022
  
  0c33c47e
- [phi] move sigmoid_cross_entopy_with_logits log_loss cumsum auc infershape to phi (#40200) · fe1cc8bd
  Committed by Linjie Chen on Mar 08, 2022
```
* move infershapes to phi

* update code format

* update code format
```
  fe1cc8bd
- add profiler statistic helper (#40111) · 1f857cb9
  Committed by chenjian on Mar 08, 2022
```
* add profiler helper

* fix unittest

* improve test coverage rate
```
  1f857cb9
- add python profiler package (#40065) · 10325a82
  Committed by chenjian on Mar 08, 2022
```
* add python profiler package

* update according to review

* fix bug

* fix bug

* fix bug

* add unit test

* Revert "add unit test"

This reverts commit 4e69ff71b0645e069afe5dd8fea0d07717852c48.

* reduce for pr

* add unit test

* modify for pr

* fix unittest

* update for ci coverage

* modify according to review

* fix bug

* improve coverage
```
  10325a82
07 Mar, 2022 · 8 commits

[infrt] fold the infrt.cvtTensorOp. test=develop (#40214) · b798fb07
Committed by 王明冬 on Mar 07, 2022

b798fb07
[OpTest] Support to test paddle API end-to-end for check_eager (#40169) · 79a32715
Committed by xiongkun on Mar 07, 2022
```
* add python api test in TestOp

* test_python_api if self.python_api is set

* fix code by CR
```
79a32715

refactor unittest for nearest_interp_v2_op_xpu. test=kunlun (#39804) · c09adab8

Committed by houj04 on Mar 07, 2022

* refactor unittest for nearest_interp_v2_op_xpu. test=kunlun

* fix code style. test=kunlun

* fix code style. test=kunlun

c09adab8

[Phi]Move bincount OP to phi (#39947) · 1c29196e
Committed by 0x45f on Mar 07, 2022
```
* move bincount OP to phi

* fix dtype

* set_dtype by weights or x

* fix conflicts
```
1c29196e

cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU (#39437) · 2a3d9eca

Committed by Ming-Xu Huang on Mar 07, 2022

* Added cuBlasLtHandle_t to device context.

* Added fused_gemm_epilogue op.

1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue.
2. Support fusing Act(X*Y + bias); X's dims must be >= 2 and Y's dims should be 2.
3. Act currently supports only ReLU. (Will add GeLU in the future).
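
As a rough illustration of what the fused op computes, here is a minimal, unfused reference of Act(X*Y + bias) in plain Python. It is a sketch only and not Paddle's implementation: the function names (`linear_act_reference`, `relu`, `gelu`) are hypothetical, it handles only a 2-D X, and it uses the erf form of GeLU as an assumption.

```python
import math


def relu(v):
    # ReLU activation: clamp negatives to zero.
    return max(0.0, v)


def gelu(v):
    # GeLU in its erf form (assumed variant for this sketch).
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def linear_act_reference(x, y, bias, act=relu):
    """Unfused reference for act(x @ y + bias).

    x is an M x K list of rows, y is a K x N list of rows,
    bias has N entries. Hypothetical helper, not a Paddle API.
    """
    k, n = len(y), len(y[0])
    out = []
    for row in x:
        assert len(row) == k, "inner dimensions must match"
        out.append([
            act(sum(row[i] * y[i][j] for i in range(k)) + bias[j])
            for j in range(n)
        ])
    return out


# Usage: identity weights, so the result is relu(x + bias).
print(linear_act_reference([[1.0, -2.0]],
                           [[1.0, 0.0], [0.0, 1.0]],
                           [0.5, 0.5]))  # [[1.5, 0.0]]
```

A fused kernel produces the same values in one pass, avoiding the intermediate matmul and add outputs that this reference materializes.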

* Added UT to fused_gemm_epilogue op.

* Added LinearAct Pattern

1. Added LinearAct into graph_pattern_detector.* to define (2.)'s
pattern.
2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
3. act currently only support ReLU (Will support GeLU in the future).
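
The pattern described above can be sketched as a toy matcher over nested tuples. This is only an illustration under stated assumptions: the tuple encoding, the function name `match_linear_act`, and the op-name strings are hypothetical and do not reflect the API of graph_pattern_detector. The sketch accepts both relu and gelu, although at this point in the history only ReLU was supported.

```python
# Toy op graph: an op is a tuple (op_name, *inputs); a variable is a string.
ACTIVATIONS = {"relu", "gelu"}


def match_linear_act(node):
    """Return the operands if node is act(elementwise_add(matmul_v2(x, w), bias)).

    Returns None when the node does not have that shape.
    Hypothetical helper for illustration only.
    """
    if not (isinstance(node, tuple) and len(node) == 2
            and node[0] in ACTIVATIONS):
        return None
    add = node[1]
    if not (isinstance(add, tuple) and len(add) == 3
            and add[0] == "elementwise_add"):
        return None
    matmul, bias = add[1], add[2]
    if not (isinstance(matmul, tuple) and len(matmul) == 3
            and matmul[0] == "matmul_v2"):
        return None
    return {"act": node[0], "x": matmul[1], "w": matmul[2], "bias": bias}


graph = ("relu", ("elementwise_add", ("matmul_v2", "x", "w"), "bias"))
print(match_linear_act(graph))
```

A fusion pass would replace each matched subgraph with a single fused op; subgraphs that do not match are left unchanged.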

* Added FuseGemmEpiloguePass

1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU}
fusion (GeLU will be supported in the future).
2. Only support matmul_v2 from nn.Linear.

* Added pybind to BuildStrategy.fuse_gemm_epilogue_.

* Added UT for fuse_gemm_epilogue_pass.

* GeLU support and EpilogueSingleton

1. Added GeLU support to fused_gemm_epilogue op.
2. Added EpilogueSingleton to cache auxiliary pointer.
3. Added related UTs.

* Rename cublaslt_epilogue_op to gemm_epilogue_op.*.

* Added both train and infer pattern to LinearAct.

1. Added support of fwd graph with grad_ops linking to LinearAct.
2. Added related changes to fuse_gemm_epilogue_pass for above
modification.

* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.

* Added identity activation support to gemm_epilogue_op.

* Added Linear Fusion (matmul_v2 + ele_add)

1. Added matmul_v2 + ele_add pattern to LinearActPattern.
2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.

* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*

* Add fused_gemm_epilogue_grad op.

1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.

* Add UTs to fused_gemm_epilogue_grad_op.

* Change attribute name in fused_gemm_epilogue_grad_op for clarity.

* Allow DX and DBias to be dispensable in fused_gemm_epilogue_grad op.

* Added ElementwiseAdd+Matmul+Act graph pattern detection.

* Fuse backward of Linear( Act(x))

1. Added backward fusion pass to Linear( Act(x)).
2. Added backward fusion pass to Linear(x).

* Added UTs to backward fusion of Linear(Act(x)).

* Complete document of arguments to fused_gemm_epilogue_op.

* Made arguments of some functions pass by reference.

* Modify code with review comments.

1. Made arguments of some function pass by reference.
2. Removed redundant code.
3. Followed Google code style to change code.

* Made 'const' code style be consistent

* Fixed random seed of python UTs.

* Set compile constraints for cuBlasLt

1. Require CUDA 11.6+
2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.

* Code Review from Paddle

1. Changed argument name is_first_gemm to without_x_gradient for
clarity.
2. Applied PADDLE_THROW in fused_gemm_epilogue_op.

* Remove EpilogueSingleton

1. Applied ReserveSpace to replace Epilogue for passing auxiliary
pointers between FWD and BWD.

* Fix a logical error and enhance UTs.

1. Added act op count checking in UTs.
2. Fix issue to fuse backward of ReLU(Linear(X)).
3. TODO: solve GELU fusion issues.

* Fix Linear and GeLU fusion issues.

1. Modified graph_pattern_detector to fit both linear with gelu and
linear with relu.
2. Modified data range in UTs to allow negative values.

* Removed fused_gemm_epilogue_op.h.

* Rename namespace pten to phi.

* Rename name of arguments in fused_gemm_epilogue_op

1. bias -> Bias.
2. out -> Out.
3. reserve_space -> ReserveSpace.

* Change EpiloguePassActivationCache to a local variable.

1. Removed singleton in EpiloguePassActivationCache.
2. Made EpiloguePassActivationCache an argument to each pass
function.

2a3d9eca

[phi] move is_empty to phi (#39919) · 72964335

Committed by WJJ1995 on Mar 07, 2022

* Add is_empty

* fixed for CI

* fixed code style

* resolve conflict

* deal with comments

* replace pt by pd

72964335

Add mlir trt engine type. (#40197) · 6fd96a04
Committed by Wilber on Mar 07, 2022
```
* infrt add trt engine

* update engine name
```
6fd96a04
[Phi]Move elementwise_div grad/double grad Kernel to Phi (#40172) · c52a664e
Committed by YuanRisheng on Mar 07, 2022
```
* move elementwise_div grad

* change mutable_data to alloc

* fix compile bugs
```
c52a664e

机器未来 / Paddle · in sync with the fork's source project