Commits · 6934ac797f6ae6d3c83529af2f510ac194452d66 · PaddlePaddle / Paddle

10 Apr, 2023 · 34 commits
- [Paddle Inference] Support two inputs of multihead attention named qk_multihead. (#52455) · 6934ac79
  committed by xiaoxiaohehe001 on Apr 10, 2023
```
* Support two inputs of multihead attention named qk_multihead
```
- [Opt Performance] Optimize custom operator performance (#52597) · 01247e33
  committed by HongyuJia on Apr 10, 2023
```
* [Opt Performance] Optimize custom operator performance, reconstruct python API auto-gen, add cache and use const inference

* opt AutoGradMeta implementation

* remove profiler codes

* fix unit test

* change year, 2021->2023

* fix int64_t parse bug
```
- Autogen code bilinear_tensor_product (#52690) · 90c3bddf
  committed by gouzil on Apr 10, 2023
```
* add autogen code bilinear_tensor_product

* [phi] rm cc file
```
- 【Hackathon4 No58】fix exponential and pad (#51300) · 3ee2b237
  committed by cyberslack_lee on Apr 10, 2023
- Autogen softmax_with_cross_entropy (#52515) · 351ccb63
  committed by lzydev on Apr 10, 2023
```
* autogen softmax_with_cross_entropy

* fix error in softmax_with_cross_entropy version
```
- [Approval For Phi] Add approval check for including third-party in phi headerfiles (#52653) · f9aaa1e4
  committed by HongyuJia on Apr 10, 2023
- [StandaloneExe] Remove flag about Executor (#52671) · d6ee0a13
  committed by kangguangli on Apr 10, 2023
```
* add strategy force_sequential_run

* remove flag

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix
```
- [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc (#52573) · 3c0b1795
  committed by HongyuJia on Apr 10, 2023
```
* [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc

* Add gflags.h for other files

* Add gflags.h for other files

* Add gflags.h for blas_impl.hip.h

* Add gflags.h for miopen_helper.h
```
- [AMP OP&Test] Add fp16 and bf16 test to activation (#52521) · 6bd5fd75
  committed by Vvsmile on Apr 10, 2023
```
* adjust defalut tolerance of output and grad

* fix a bug in the grad of OpTest

* fix the type of setting defalut value in optest, both forward and
backward

* add defalut

* fix test_sum_op

* adjust tolerance

* fix the tolerance of eager

* add bf16 and fp16 to the activation tests

* remove some fixs

* fix activation

* fix fp16

* fix gelu

* fix the activation tests

* add bfloat16 specialization to singrad and cosgrad

* fix bugs

* fix bugs

* add unittest

* add skip

* add fp/bf to rrelu/rrelu_grad

* git add rrelu

* fix bugs
```
- update (#51297) · 70eaf9de
  committed by Wilber on Apr 10, 2023
- 【AMP OP&Test】instance_norm fp16 and bf16 support. (#52241) · 7c98abd9
  committed by qizhaoaoe on Apr 10, 2023
```
* add fp16 and bf16 support for instance_norm

* fix /= operator which not support bf16

* fix instance_norm_grad kernel and unittests.

* fix fp32 unittests.

* fix instance_norm_kernel and unittests.

* fix instance_norm_grad_kernel and unittest threshold.

* add fp16/bf16 for instance_norm_grad_grad op.

* add bf16 dtype check.

* fix conflicts.

* fix cpu support for fp32 op and fix type in instance_norm_grad_kernel.

* fix type in instance_norm_kernel.

* fix bf16 outputs in unittests and refine codes.

* fix dx computation.

* delete unuseful params and head including.

* add fp16/bf16 for static graph.

* fix device condiction for instance_norm op.

* fix instance_norm_grad_grad and bf16 op tests.

* fix op_test to support grad of bf16 can be compared with fp32.

* remove updates.

* add self-defined grad.
```
- fix version message (#50318) · de44b3ac
  committed by chalsliu on Apr 10, 2023
- add autogen code support for logcumsumexp op (#52682) · 891cf433
  committed by Wang Xin on Apr 10, 2023
- register fluid kerenls to phi [part 7] (#52577) · aa35331f
  committed by huangjiyi on Apr 10, 2023
```
* update

* fix bug

* fix ci-windows-openblas

* fix test_partial_sum_op

* fix codestyle
```
- remove infrt V1.1 (#52672) · 6913feb0
  committed by jjyaoao on Apr 10, 2023
- 【PaddlePaddle Hackathon 4 No.36】Optimize the GPU compute performance of the tile op for Paddle (#52482) · 61fe2198
  committed by Zero Rains on Apr 10, 2023
```
* fix divide zero bug for softmax_with_cross_entropy

* change the single test way

* can run but slow. the most important is that I do not know why it slow

* remove some useless commet

* change the copyright to correct

* remove some useless change

* if repeat_times == 1, we will not use BroadcastKernel
```
- support auto generate for eigvalsh (#52687) · 93404a61
  committed by cyberslack_lee on Apr 10, 2023
- 【PaddlePaddle Hackathon 4 No.44】Optimize the GPU compute performance of the logsumexp op for Paddle (#52509) · 0e776965
  committed by Asthestarsfalll on Apr 10, 2023
```
* Optimize the performance of logsumexp

* Support zero-dim tensor
```
- support custom device on macos (#52620) · 575cafb4
  committed by lishicheng1996 on Apr 10, 2023
- add tensor_utils.h into all.h (#52600) · 3cbcaf1a
  committed by zyfncg on Apr 10, 2023
- add autogen code support for affine_grid op (#52560) · 90280542
  committed by Wang Xin on Apr 10, 2023
```
* add autogen code support for affine_grid op

* update op_compat.yaml for affine_grid

* update op_compat.yaml for affine_grid

* fix AffineGridGradInferMeta

* fix CI error

* update AffineGridInferMeta
```
- [AMP OP & Test] Tril & Triu (#52411) · ec008a71
  committed by Roc on Apr 10, 2023
- Fix shape error when check no shape var type (#52629) · 648f58aa
  committed by WangZhen on Apr 10, 2023
- modify ~MatmulDescriptor and remove [-Wunused-function] (#52618) · 45f660dd
  committed by Galaxy1458 on Apr 10, 2023
```
* delete [-Wno-error=terminate], test=develop

* remove GPUps[-Wterminate],test=develop

* remove some -Wno-, test=develop

* modify ~MatmulDescriptor

* mess
```
- [CustomOP unittest] Remove useless comment in custom operator's unit test (#52710) · 50ef5c5a
  committed by HongyuJia on Apr 10, 2023
- fix gcc12 error (#52646) · 66a4804b
  committed by risemeup1 on Apr 10, 2023
- change cmake/operators.cmake (#52679) · fd6a0607
  committed by jjyaoao on Apr 10, 2023
- delete paddle/fluid/operators/math,metrics,optimizers,reduce_ops/*_npu.* (#52674) · a6aa701e
  committed by jjyaoao on Apr 10, 2023
- delete paddle/fluid/operators/collective/*_npu.* (#52677) · b451aff8
  committed by jjyaoao on Apr 10, 2023
- optimize setup.py (#52621) · c1cad896
  committed by risemeup1 on Apr 10, 2023
```
* optimize setup.py

* add ninja
```
- delete paddle/fluid/operators/controlflow/*_npu.* (#52676) · 4500b64a
  committed by jjyaoao on Apr 10, 2023
- delete paddle/fluid/operators/elementwise/*_npu.* (#52675) · 599a201f
  committed by jjyaoao on Apr 10, 2023
- Remove WITH_ASCEND (#52669) · 0f3bbe10
  committed by 张春乔 on Apr 10, 2023
```
* mv WITH_ASCEND_CL

* mv WITH_ASCEND

* rollback

* remove WITH_ASCEND

* remove WITH_ASCEND
```
- [bug fix] fix pow composite (#52645) · f2d1f284
  committed by wangzhen38 on Apr 10, 2023
```
* [bug fix] fix pow composite

* [bug fix] for ci
```
09 Apr, 2023 · 6 commits
- [Prim] support amp O1 in prim (#52598) · 58d5af00
  committed by cyber-pioneer on Apr 09, 2023
- [PHI CAPI] support complex dtype kernel (#52414) · b60f48ce
  committed by ronnywang on Apr 09, 2023
```
* [PHI CAPI] support complex dtype kernel

* update
```
- fix fused_dropout_add bug (#52644) · 5df1296d
  committed by Chitsing KUI on Apr 09, 2023
- [BugFix] Fix random seed bug in hybridparallel (#52656) · 61ca8b39
  committed by ShenLiang on Apr 08, 2023
```
* add seed control

* fix bug
```
- add bf16 for some ops in static mode (#51582) · 6cd095fc
  committed by shaojie_wang on Apr 08, 2023
- add autogen code support for matrix_nms. (#52479) · 8abc5333
  committed by scotty on Apr 09, 2023
```
* add autogen code support for matrix_nms.

* update
```

PaddlePaddle / Paddle · synced successfully about 1 year ago