Commits · a186e60dedebd402add5ae811b46ba7ae11ef55f · BaiXuePrincess / Paddle

30 Dec, 2022 3 commits

Unify the English translations of static graph mode and dynamic graph mode in the docs (#49170) · a186e60d

Authored by Sanbu on Dec 30, 2022

* 1219

* temporarily change the num_diff_files limit, test=document_fix

* Revert "temporarily change the num_diff_files limit, test=document_fix"

This reverts commit 8e70f00ef468d2dad0e38b3da06295ed62990d20.

* for codestyle

* remove duplicate license

* `static mode` -> `static graph mode`

* Update hybrid_parallel_inference.py

* Update layer_function_generator.py

* Update manipulation.py

* reset
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

a186e60d

Fix default GetExpectedKernelType for ops supported tensor attrs (#49414) · 8a859554
Authored by WangZhen on Dec 30, 2022
```
* Fix default GetExpectedKernelType for ops supported tensor attrs
```
8a859554
Yj/rm legacy part 0 (#49424) · 3ffcd693
Authored by 姜永久 on Dec 30, 2022
```
* rm legacy

* clear in_legacy

* fix tracer
```
3ffcd693

29 Dec, 2022 4 commits
- Add scale and floor_divide ut cases (#49418) · a30e3602
  Authored by Lin Manhui on Dec 29, 2022
  
  a30e3602
- auto parallel bf16 (#49079) · 418edae5
  Authored by xu98bin on Dec 29, 2022
```
* auto parallel bf16
```
  418edae5
- rm legacy dygraph part7 (#49285) · df3f74df
  Authored by 姜永久 on Dec 29, 2022
```
* rm legacy dygraph part7

* rm non_static_mode

* modify

* modify

* add static test

* set static for lstm_cudnn test

* reset tracer

* reset varbase

* fix
```
  df3f74df
- fused_attention_op paratmers stop grad support (#49351) · 0bb999b6
  Authored by Wang Bojun on Dec 29, 2022
```
* fusedAttenGrad_noGrad

* code style fix

* add ut

* remove unnecessary log
```
  0bb999b6
28 Dec, 2022 9 commits

skip this ut when cuda < 11.2 && cuda_arch < 8 (#49313) · 0c52e8a8

Authored by RichardWooSJTU on Dec 28, 2022

0c52e8a8

rm legacy nn part2 (#49259) · 69e51c77

Authored by 姜永久 on Dec 28, 2022

* rm legacy nn part2

* rm _non_static_mode

* modify

* modify unpool test

* modify unpool test

* modify loss

* keep legacy for layer_norm

69e51c77

remove fluid.contrib.fused_elemwise_activation, sequence_topk_avg_pooling,... · da357615

Authored by zqw_1997 on Dec 28, 2022

remove fluid.contrib.fused_elemwise_activation, sequence_topk_avg_pooling, var_conv_2d, match_matrix_tensor and tree_conv (#49331)

da357615

[new-exec] Ahead-Of-Time choosing kernel (#48789) · 63d2d722

Authored by Leo Chen on Dec 28, 2022

* add skip run

* alloc minimum memory

* skip check_size in Alloc

* skip check_size in Alloc

* skip check_size in Alloc

* fix cases when tensor is initialized or empty

* alloc empty output for place info

* add test

* increase timeout

* format code

* skip cpu

* add cudnn_deterministic

* fit for hostAlloc

* follow comments

* change check_size to fake_alloc

63d2d722

[ 0d-Tensor ] einsum support 0d tensor. (#49177) · 71bde066

Authored by xiongkun on Dec 28, 2022

* einsum support 0d tensor.
1. support 0d tensor in multi-operands.
2. add 9 unittests for einsum 0d tensor.

* override NVIDIA_TF32_OVERRIDE to avoid accuracy problem in 11.2 and 11.8

71bde066

[AutoParallel] adapt for clip (#49249) · df944772

Authored by zhaoyingli on Dec 28, 2022

* [AutoParallel] adapt for clip

* fix unittest

* enable_static

* fix dist_fill_constant_batch_size_like

* fix process_mesh.shape

* update cond of modifying shape_list

df944772

rm legacy fluid part4 (#49281) · f1072973

Authored by 姜永久 on Dec 28, 2022

* rm legacy fluid part4

* rm non_static_mode

* minor change

* modify initializer

* rm legacy for initializer

* fix dataloader test

f1072973

Fix CUDA11.8 Unittest Accuracy (#49373) · 76f43f6d

Authored by Huihuang Zheng on Dec 28, 2022

This PR increases the delta in the unit test for CUDA 11.8. The reasons for this fix:
(1) CUDA 11.8 appears to produce a larger delta in the accuracy result. Our other seresnext targets under the parallel executor, such as the CPU and all-reduce test cases, already use an added delta, so we did the same for the GPU base case with CUDA 11.8.
(2) A new executor is under development by the PaddlePaddle team, so the unit test for the old executor can be relaxed.

76f43f6d

delete old dygraph pylayer (#49339) · 0b60b784
Authored by wanghuancoder on Dec 28, 2022
```
* delete old dygraph pylayer
```
0b60b784

27 Dec, 2022 8 commits
- fux bug of UT test_version (#49349) · 8a4e67a1
  Authored by zhouweiwei2014 on Dec 27, 2022
  
  8a4e67a1
- add unbind op for xpu (#49356) · 16931039
  Authored by zhangyikun02 on Dec 27, 2022
  
  16931039
- fix fold for large bs (#49337) · 9dde26f6
  Authored by xiaoting on Dec 27, 2022
```
* fix fold for large bs

* fix fold for large bs
```
  9dde26f6
- [AutoParallel] fix input order (#49329) · a9533953
  Authored by zhaoyingli on Dec 27, 2022
```
* fix input order

* add unittest

* update cmakelist
```
  a9533953
- [AutoParallel] quantization pass support export (#48072) · 27ce06aa
  Authored by zhaoyingli on Dec 27, 2022
```
* [AutoParallel] quantization pass support export

* support subgraph

* move_presist_var_to_global_block

* update unittest

* fix ci-coverage

* fix codestyle

* fix fake_dequantize_op

* remove unused var

* fix ci error and aprroval error

* add unittest for fp16 in test_dequant_linear

* replace mutable data

* fix unittest in non-cuda-core

* fix unittest
Co-authored-by: carryyu <569782149@qq.com>
Co-authored-by: wufeisheng <wfs1997@163.com>
```
  27ce06aa
- delete old dygraph sharding (#49334) · 2bbdc47a
  Authored by wanghuancoder on Dec 27, 2022
```
* delete old dygraph sharding
```
  2bbdc47a
- Support priority scheduling for standalone executor (#49275) · 0839bba3
  Authored by Ruibiao Chen on Dec 27, 2022
```
* Support priority scheduling for standalone executor

* Add CPU test
```
  0839bba3
- delete legacy dygraph code in python/paddle/tensor (#49286) · 861fef52
  Authored by wanghuancoder on Dec 27, 2022
```
* delete _in_legacy_dygraph
```
  861fef52
26 Dec, 2022 5 commits

Add collective communication APIs to improve completeness (#49252) · dec67d6d

Authored by Wen Sun on Dec 26, 2022

* feat: broadcast_object_list & scatter_object_list

* chore: update ut conf

* get_backend & is_available

* docs: update requirements

* fix: resolve conflicts
Co-authored-by: LiYuRio <liyuruijx@163.com>

dec67d6d

rm legacy unittest part5 (#49282) · a72a0da0
Authored by 姜永久 on Dec 26, 2022
```
* rm legacy unittest part5

* add custom op
```
a72a0da0

[fluid clean]replace fliud.io.load_inference_model from util_factory.py (#49156) · 3f896dce

Authored by wangxiaoning on Dec 26, 2022

* add index sample fp16 support

* remove fluid APIs in distributed_strategy.py and role_maker.py

* Revert "remove fluid APIs in distributed_strategy.py and role_maker.py"

This reverts commit 223bbee990d3bf69e252fc3c0f19e3873550a264.

* move load_inference_model to distributed

* fix origin develop codes diff

* move _endpoints_replacement

* delete line

* reset line

* add unittest case of load_inference_model

* fix unittest

* fix unittest

* fix coverage

* fix coverage

3f896dce

Revert params in paddle.nn.SpectralNorm and paddle.nnFlatten.forward (#49311) · 945f777f
Authored by Roc on Dec 26, 2022

945f777f

[Auto Parallel] Merge the python and c++ impls of ProcessMesh (#47503) · 1c0afa79

Authored by Yulong Ao on Dec 26, 2022

* [Auto Parallel] Rename methods of ProcessMesh

* [Auto Parallel] Impl the python process_mesh by the c++ one

* [Auto Parallel] Add some minor modifications

* [Auto Parallel] Rename some methods

* [Auto Parallel] Remove unnecessary codes

* [Auto Parallel] Add back some removed files

* [Auto Parallel] Fix bugs

* [Auto Parallel] Fix a bug

* Update process_mesh.cc

* [Auto Parallel] Fix a bug

1c0afa79

23 Dec, 2022 8 commits
- suport recompute for kunlun (#49069) · 98c17a68
  Authored by QingshuChen on Dec 23, 2022
  
  98c17a68
- make FusedMultiTransformer supports RoPE (#48842) · 644dfc60
  Authored by lzy on Dec 23, 2022
  
  644dfc60
- Fix arange gpu kernel (#49273) · e073313d
  Authored by Yuanle Liu on Dec 23, 2022
  
  e073313d
- fix matmul double and triple grad (#48779) · 13c4fd59
  Authored by Charles-hit on Dec 23, 2022
```
* fix matmul double and triple grad

* remove some comment

* add matmul_double_grad unit test

* fix matmul triple grad

* fix dot triple grad and add unit test

* modify codestyle

* fix dot_grad

* refactor dot triple grad

* disable some unit test

* fix unit test

* fix unit test in double grad
```
  13c4fd59
- remove paddle.fluid.contrib.layers.BasicLSTMUnit, basic_lstm, BasicGRUUnit, basic_gru (#49268) · a1319074
  Authored by zqw_1997 on Dec 23, 2022
```
* rm paddle.fluid.contrib.layers.BasicLSTMUnit basic_lstm BasicGRUUnit basic_gru

* rm dependency in __init__.py
```
  a1319074
- rm eager guard test (#49245) · cb34ee0f
  Authored by 姜永久 on Dec 23, 2022
```
* rm eager guard test

* retain grad for xpu test
```
  cb34ee0f
- [Custom Extension] Fix custom double_grad backward=None (#49224) · ca4155c8
  Authored by HongyuJia on Dec 23, 2022
```
* fix custom double_grad backward=None

* fix custom_relu.cu bug && polish testcase of double_grad

* remove old dynamic graph test
```
  ca4155c8
- add rnn-t loss and api (#49199) · c088f9ec
  Authored by Hui Zhang on Dec 23, 2022
```
* add warp transducer code
```
  c088f9ec
22 Dec, 2022 3 commits

fix custom operator testcase CI error (#49270) · 9f47aac9
Authored by HongyuJia on Dec 22, 2022

9f47aac9

[eager] use CPUAllocator directly (#47125) · 4537ba23

Authored by Weilong Wu on Dec 22, 2022

* [eager] use CPUAllocator directly

* modify pstring sizeof 48 default

* rm CPU test for NaiveBestFitAllocator

* fix Mac ci compile errors

* use UNUSED to state unused_obj

* mv UNUSED statement to allocator_facade.cc

* fix roi_align

* fix yolov3 test case

* recover original code

* recover original code

* fix trt roi_align test
Co-authored-by: jerrywgz <jerrywgz@126.com>

4537ba23

delete distribute old dygraph test cast (#49100) · a1d6e2a8
Authored by wanghuancoder on Dec 22, 2022
```
* delete distribute old dygraph test cast
```
a1d6e2a8

BaiXuePrincess / Paddle is up to date with its fork source project