Commits · b835d958e53ab7f18f77cbb9797e607f83db4447 · PaddlePaddle / Paddle

Apr 12, 2023 · 4 commits

fix convert_to_mixed_precision api save model bug (#52767) · b835d958
Committed by Yuanle Liu on Apr 12, 2023
```
* update save model

* update
```
b835d958

Committed by RedContritio on Apr 12, 2023

* move python/paddle/fluid/tests/unittests/xpu to test/xpu

* update CMakeLists.txt

* remove xpu in fluid/tests/unittests/

* add path to op_test_xpu

* fix incorrect path

* update test script

* fix test_adadelta_op_xpu error

9a7c83bd

[AMP OP&Test] support bf16 for batch norm (#52407) · 523f8a26

Committed by Guoxia Wang on Apr 12, 2023

* [AMP OP&Test] support bf16 for batchnorm

* codestyle

* Update batch_norm_grad_kernel.cu

* Update batch_norm_kernel.cu

* fix codestyle

* fix

* fix

* fix

* fix

* fix

* Update batch_norm_kernel.cc

523f8a26

fix force sync bug in paddle.grad (#52779) · 7a78a571
Committed by wanghuancoder on Apr 12, 2023

7a78a571

Apr 11, 2023 · 22 commits
- autogen unique (#52738) · 8e9bfa7f
  Committed by lzydev on Apr 11, 2023
  
  8e9bfa7f
- remove -Wimplicit-fallthrough (#52717) · 74542577
  Committed by Galaxy1458 on Apr 11, 2023
```
* delete [-Wno-error=terminate], test=develop

* remove GPUps[-Wterminate],test=develop

* remove some -Wno-, test=develop

* modify ~MatmulDescriptor

* mess

* remove -Wimplicit-fallthrough, test=develop

* remove -Wimplicit-fallthrough, test=develop

* remove -Wimplicit-fallthrough, test=develop

* remove -Wimplicit-fallthrough, test=develop

* remove , test=develop
```
  74542577
- [Paddle Inference] Predictor support paddle::Tensor (#50445) · 10fd4a95
  Committed by Yuanle Liu on Apr 11, 2023
  
  10fd4a95
- [XPU] fix error pattern and rename max name (#52726) · 259b0aad
  Committed by wz1qqx on Apr 11, 2023
  
  259b0aad
- [prim]use Operator to reconstruct the primitive operator defined in c++ (#51997) · dd74b3d1
  Committed by Xiaoxu Chen on Apr 11, 2023
  
  dd74b3d1
- support auto generate for op average_accumulates (#52704) · 6741dd22
  Committed by RedContritio on Apr 11, 2023
  
  6741dd22
- support auto generate static for randperm (#52531) · 4a74f4c5
  Committed by RedContritio on Apr 11, 2023
```
* support auto generate static for randperm

* remove enforce in randperm infermeta
```
  4a74f4c5
- delete remote_prefetch (#52748) · 3951c40d
  Committed by zhangyuqin1998 on Apr 11, 2023
  
  3951c40d
- fix check nan bug (#52729) · 6366cffe
  Committed by wanghuancoder on Apr 11, 2023
  
  6366cffe
- [AMP OP&Test]Add fp16/bf16 support isnan/isfinite/isinf op (#52259) · aaf873b2
  Committed by WJJ1995 on Apr 11, 2023
```
* add bfp16 test for isfinite

* fixed for ci

* deal with comments

* fixed test

* skip test in cpu

* deal with comments

* fixed for ci

* fixed testcase

* fixed for ci

* fixed for testcase
```
  aaf873b2
- mp sync params & grads & opt states. (#51428) · 6b74cf76
  Committed by wuhuachaocoding on Apr 11, 2023
  
  6b74cf76
- [BUG Fixs] adadelta lr support (#49732) · 23032590
  Committed by wangzhen38 on Apr 11, 2023
  
  23032590
- support auto generate for op merged_momentum optimizer (#52708) · 2a420036
  Committed by RedContritio on Apr 11, 2023
```
* fix error in generator/type_mapping.py

* support auto generate for op merged_momentum optimizer
```
  2a420036
- support auto generate for flatten (flatten_contiguous_range) (#52512) · 410e25fb
  Committed by RedContritio on Apr 11, 2023
```
* support auto generate for flatten (flatten_contiguous_range)

* add data_type for flatten_grad
```
  410e25fb
- Add output defs for eigh kernel (#51362) · da0c7e14
  Committed by LinearTemporalLogic on Apr 11, 2023
```
* Add output defs for eigh kernel

* fix

* update

* update

* fix

* fix
```
  da0c7e14
- add autogen code support for reverse op (#52701) · ab754417
  Committed by Wang Xin on Apr 11, 2023
```
* add autogen code support for reverse op

* bug fixed
```
  ab754417
- support auto generate for op adagrad optimizer (#52695) · c4e1fcba
  Committed by RedContritio on Apr 11, 2023
  
  c4e1fcba
- [AMP OP&Test] add bf16 fp16 type support for expand_v2_op and top_k_v2_op (#51263) · 5b09dd56
  Committed by Thomas Young on Apr 11, 2023
  
  5b09dd56
- support auto generate for op momentum optimizer (#52611) · a6ae1e35
  Committed by RedContritio on Apr 11, 2023
```
* support auto generate for op momentum optimizer

* remove momentum_op.* and update signature

* fix dgc momentum op maker error
```
  a6ae1e35
- update xpu.cmake to 20230408 (#52409) · 757aa470
  Committed by ykkk2333 on Apr 11, 2023
  
  757aa470
- remove paddle/infrt/ (#52719) · e041ffca
  Committed by jjyaoao on Apr 11, 2023
```
* remove paddle/infrt/

* delete .lit_test_times.txt
```
  e041ffca
- fix c_embedding bug (#52742) · 4a790cba
  Committed by Chitsing KUI on Apr 11, 2023
  
  4a790cba

Apr 10, 2023 · 14 commits

Autogen segment_pool (#52538) · 1bc00955
Committed by lzydev on Apr 10, 2023
```
* autogen segment_pool

* delete legacy_dygraph about segment_pool
```
1bc00955
delete paddle/fluid/operators/*_npu.* (#52678) · a7707efb
Committed by jjyaoao on Apr 10, 2023
```
* delete paddle/fluid/operators/*_npu.*

* try pass CI

* try pass CI
```
a7707efb
【Hackathon No57】 add fp16 & bf16 for flip, fp16 for gaussian (#52380) · 2b0fffc2
Committed by Difer on Apr 10, 2023
```
* add_fp_bf_for_flip_gaussian_random

* forget convert uint

* fix some error

* fix some error
```
2b0fffc2
delete paddle/fluid/operators/amp/*_npu.* (#52673) · d7a1a178
Committed by jjyaoao on Apr 10, 2023
```
* delete paddle/fluid/operators/*_npu.*

* try pass code-style
```
d7a1a178

[AMP] support master_grad for amp training (#52235) · 4970dd65

Committed by Zhang Ting on Apr 10, 2023

* support set master_grad

* move register_hook to auto_cast

* update unittest

* fix fp16 test

* update for review comments

4970dd65

[Paddle Inference] Support two inputs of multihead attention named qk_multihead. (#52455) · 6934ac79
Committed by xiaoxiaohehe001 on Apr 10, 2023
```
* Support two inputs of multihead attention named qk_multihead
```
6934ac79

[Opt Performance] Optimize custom operator performance (#52597) · 01247e33

Committed by HongyuJia on Apr 10, 2023

* [Opt Performance] Optimize custom operator performance, reconstruct python API auto-gen, add cache and use const inference

* opt AutoGradMeta implementation

* remove profiler codes

* fix unit test

* change year, 2021->2023

* fix int64_t parse bug

01247e33

Autogen code bilinear_tensor_product (#52690) · 90c3bddf
Committed by gouzil on Apr 10, 2023
```
* add autogen code bilinear_tensor_product

* [phi] rm cc file
```
90c3bddf
【Hackathon4 No58】fix exponential and pad (#51300) · 3ee2b237
Committed by cyberslack_lee on Apr 10, 2023

3ee2b237
Autogen softmax_with_cross_entropy (#52515) · 351ccb63
Committed by lzydev on Apr 10, 2023
```
* autogen softmax_with_cross_entropy

* fix error in softmax_with_cross_entropy version
```
351ccb63

[StandaloneExe] Remove flag about Executor (#52671) · d6ee0a13

Committed by kangguangli on Apr 10, 2023

* add strategy force_sequential_run

* remove flag

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

d6ee0a13

[enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc (#52573) · 3c0b1795

Committed by HongyuJia on Apr 10, 2023

* [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc

* Add gflags.h for other files

* Add gflags.h for other files

* Add gflags.h for blas_impl.hip.h

* Add gflags.h for miopen_helper.h

3c0b1795

[AMP OP&Test] Add fp16 and bf16 test to activation (#52521) · 6bd5fd75

Committed by Vvsmile on Apr 10, 2023

* adjust default tolerance of output and grad

* fix a bug in the grad of OpTest

* fix the type of setting default value in optest, both forward and backward

* add default

* fix test_sum_op

* adjust tolerance

* fix the tolerance of eager

* add bf16 and fp16 to the activation tests

* remove some fixes

* fix activation

* fix fp16

* fix gelu

* fix the activation tests

* add bfloat16 specialization to singrad and cosgrad

* fix bugs

* fix bugs

* add unittest

* add skip

* add fp/bf to rrelu/rrelu_grad

* git add rrelu

* fix bugs

6bd5fd75

【AMP OP&Test】instance_norm fp16 and bf16 support. (#52241) · 7c98abd9

Committed by qizhaoaoe on Apr 10, 2023

* add fp16 and bf16 support for instance_norm

* fix /= operator, which does not support bf16

* fix instance_norm_grad kernel and unittests.

* fix fp32 unittests.

* fix instance_norm_kernel and unittests.

* fix instance_norm_grad_kernel and unittest threshold.

* add fp16/bf16 for instance_norm_grad_grad op.

* add bf16 dtype check.

* fix conflicts.

* fix cpu support for fp32 op and fix type in instance_norm_grad_kernel.

* fix type in instance_norm_kernel.

* fix bf16 outputs in unittests and refine codes.

* fix dx computation.

* delete unused params and header includes.

* add fp16/bf16 for static graph.

* fix device condition for instance_norm op.

* fix instance_norm_grad_grad and bf16 op tests.

* fix op_test to support grad of bf16 can be compared with fp32.

* remove updates.

* add self-defined grad.

7c98abd9

PaddlePaddle / Paddle · synced successfully about 1 year ago
