Commits · e5616448deda49ee9617473647f6b2e7316bb3b5 · PaddlePaddle / Paddle

15 Mar, 2023 · 15 commits
- [AMP OP&Test]fix index_select bf16 test (#51652) · e5616448
  Committed by wangxiaoning on Mar 15, 2023
- 【Prim】Support amp logic for layer_norm and softmax (#51473) · 64076727
  Committed by Jiabin Yang on Mar 15, 2023
```
* support amp logic for layer_norm and softmax

* fix layer_norm amp

* fix layernorm api and dropout fp16

* fix layernorm api and dropout fp16

* fix bn, ln dtype in float16

* fix dropout fp16

* fix comment
```
- [with_data_parallel][part13] remove with_data_parallel in example code (#51588) · 14f1973d
  Committed by kangguangli on Mar 15, 2023
```
* remove with_data_parallel in example code

* revert python/paddle/fluid/data_feeder.py

* fix static.nn.fc api
```
- support gather test on prim and cinn (#51376) · 5b3c7ee7
  Committed by Weilong Wu on Mar 15, 2023
```
* support gather test on prim and cinn

* reset timeout for gather
```
- [Prim] add pow composite rule (#51070) · 2d9e103e
  Committed by chenjian on Mar 15, 2023
```
* add pow composite rule

* fix test

* fix unit test

* update test

* fix test

* update
```
- fix python syntax issue (#51658) · 9045b882
  Committed by Weilong Wu on Mar 15, 2023
- [AMP OP&Test] Support bf16/fp16 for roll op and add ut. (#51565) · 1fbf423a
  Committed by Yuang Liu on Mar 15, 2023
- fix quantization int8 weight save bug (#51500) · 8fc9a19f
  Committed by Guanghua Yu on Mar 15, 2023
- 【AMP OP&Test】Add fp16 test for divide, matmul, pnorm (#51005) · c2b24166
  Committed by Siming Dai on Mar 15, 2023
```
* add fp16 test for divide, matmul, pnorm

* add cumsum fp16 unittest

* fix threshold

* revert cumsum

* fix code-style

* fix according to review

* fix kernel not found
```
- add inplace sigmoid_ and multiply_ (#50267) · b3caa233
  Committed by Guoxia Wang on Mar 15, 2023
- [JitLayer]Fix Load error when load path like 'export.jit' (#46279) · 77d9b4c3
  Committed by WangZhen on Mar 15, 2023
- Delete hardswish_raw op (#51634) · 3e636ec9
  Committed by zhangyuqin1998 on Mar 15, 2023
```
* Delete hardswish_raw op

* fix ut
```
- [CustomDevice] fix SyncDefaultStream for process_group_custom (#51618) · bcec0dce
  Committed by ronnywang on Mar 15, 2023
```
* [CustomDevice] fix SyncDefaultStream for process_group_custom

* update
```
- refine amp scaler (#51340) · 1e232e27
  Committed by wanghuancoder on Mar 15, 2023
```
* refine _found_inf
```
- 【prim】 modify_yaml (#51436) · 870c0837
  Committed by xiaoguoguo626807 on Mar 15, 2023
```
* modify_yaml

* delete default param

* add output for matmul_double_grad
```
14 Mar, 2023 · 25 commits
- [Zero Dim] hack process Tensor.numpy() from 0D to 1D to avoid much incompatible (#51586) · 4a8b97ee
  Committed by zhouweiwei2014 on Mar 14, 2023
- Adjust tolerance without modify grad (#51459) · 145a6cbb
  Committed by Vvsmile on Mar 14, 2023
- delete numpy version (#49556) · 117df481
  Committed by pangyoki on Mar 14, 2023
- fix -Werror=maybe-uninitialized (#51608) · ae428a0a
  Committed by engineer1109 on Mar 14, 2023
- Fix typos (#51379) · e34c79c7
  Committed by chenxujun on Mar 14, 2023
- [Zero-Dim] correct some code to adapt to 0D Tensor (#51562) · 6737226f
  Committed by zhouweiwei2014 on Mar 14, 2023
- add split and split_with_num composite rule (#51341) · bb9eb20f
  Committed by ccrrong on Mar 14, 2023
```
* add split_with_num composite rule

* add split_with_num composite rule

* add split composite rule

* update

* update test

* update test

* delete split_with_num_grad
```
- add symbol InitDevices InitMemoryMethod (#51553) · a5ebe6ae
  Committed by engineer1109 on Mar 14, 2023
```
fix abi

fix tab
```
- implement expand as using tile (#51577) · 300b687a
  Committed by qizhaoaoe on Mar 14, 2023
- Optimization for layerNormGrad [Part1] (#51282) · 7a3d05d9
  Committed by limingshu on Mar 14, 2023
```
* first commit

* fix code bugs in for_loop

* fix bugs in cuLoadAddStridedInputs.

* optimization for LayerNormBackwardComputeGradInput

* add unitest for validating the optimization

* fix windows ci error
```
- [Divide by 0 Error] add DataNormKernel check (#51583) · e4ba5f86
  Committed by gouzil on Mar 14, 2023
- cuda graph support multi-stream for new executor (#51389) · 579fb5fd
  Committed by pangyoki on Mar 14, 2023
```
* cuda graph support multi-stream for new executor

* fix windows compile error

* delete create_cuda_graph_stream
```
- fix cmakelist (#51546) · 26007b1d
  Committed by zhaoyingli on Mar 14, 2023
- [AMP OP&Test] Append bf16/fp16 support 4 elementwise_max (#51151) · 143eceeb
  Committed by YuhangLi on Mar 14, 2023
```
* wisemax fp16 support

* add bf16 support 4 elementwise_max

* append broadcast 4 op 4 fp16 / bf16

* fix elewise_max ut bf16 numeric delta

* append fp/bf16 uts

* add fp/bf16 uts

* change bf16 uts delta

* fix some issue

* add prim 4 fp16
```
- fix rank=1 (#51413) · b4f49aa1
  Committed by wangxiaoning on Mar 14, 2023
- fix test_layernorm_shift_partition_pass time out (#51612) · b642461d
  Committed by wenbin on Mar 14, 2023
- [dy2static] fix the speed problem introduced by #50883 (#51606) · 46d6080d
  Committed by xiongkun on Mar 14, 2023
- [TRT] Fix conv2d filter of trt elementwiseadd_trans fusion UT (#51294) · dca81a43
  Committed by Wang Bojun on Mar 14, 2023
```
* fix conv2d filter
```
- 【prim】test composite rules with -1 shape (#51435) · 82a7c33e
  Committed by xiaoguoguo626807 on Mar 14, 2023
```
* init

* modify
```
- fix (#51552) · c3f8ba9b
  Committed by wangxiaoning on Mar 14, 2023
- add output defs for histogram kernel (#51317) · 2876f6f8
  Committed by Infinity_lee on Mar 14, 2023
- 【AMP OP&Test】add fp16 and bf16 test (#51286) · 376dbb82
  Committed by zhiboniu on Mar 14, 2023
```
* add fp16 and bf16 test

* update
```
- add register of select (#51595) · 93867e20
  Committed by Ackeraa on Mar 14, 2023
```
add register of select
Co-authored-by: Nwqgo <1552367872@qq.com>
```
- update empty api to support complex dtype at static mode (#51377) · fc0497bd
  Committed by Li-fAngyU on Mar 14, 2023
```
* update empty api to support compex dtype at static mode

* code style

* code style

* add type descriptions to the comments
```
- [Tensor Operants & Prim-Relevant] Multiply operants replace by scale (#51469) · 2d0e8c3b
  Committed by HongyuJia on Mar 14, 2023

PaddlePaddle / Paddle · synced successfully more than 1 year ago