Commits · 1df2ee6c66825017ef87a9a31b9258553093cbb9 · PaddlePaddle / Paddle

16 Jun, 2023 · 4 commits
- int32/int64 forward (#54687) · 1df2ee6c
  Committed by Hui Zhang on 16 Jun, 2023
- fix lamb optimizer always_adapt (#54654) · 2a56f4b3
  Committed by zhiboniu on 16 Jun, 2023
```
* fix lamb always_adapt

* fix optest

* fix all optests
```
- fix batch_norm cuda grad kernel test mode bug (#54681) · eb9d07e5
  Committed by cyber-pioneer on 16 Jun, 2023
  
- [inference][trt] zero-dim support for cumsum and bitwise_not op (#54097) · 73fa98ed
  Committed by bukejiyu on 16 Jun, 2023
```
* 0-dims support cumsum and bitwise_not
* Update cumsum_op.cc
* Update test_trt_convert_bitwise_not.py
---------
Co-authored-by: Zhang Jun <ewalker@live.cn>
```

15 Jun, 2023 · 6 commits

fix mac unittest bugs when use static phi (#54656) · b7a6e981
Committed by YuanRisheng on 15 Jun, 2023

exp/expm1 support int32/int64/float16 forward (#54556) · 58ae8c7c

Committed by Hui Zhang on 15 Jun, 2023

* fix for log xxx

* add int32/int64 for cpu/gpu; add float16/bfloat16 for cpu forward

* fix docstring

* fix bug

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bug

* using cast

* fix test

* fix api

* fix other bugs

* fix ci bug for not using dygraph guard

* add bfloat16 test

* fix ut

* bf16

* exp/expm1 support int32/int64

* fix ut

* fix ut

* fix ut

[IR] [Baby step] New interprector support new ir (#54570) · ce0c5c27

Committed by hong on 15 Jun, 2023

* add kernel dialect

* change DenseTensorTypeStorage to DenseTensorType

* add test case

* add first pd_op to kernel dialect

* lower pd op to kernel dialect

* update

* update

* remove useless code

* add attribute print test

* fix bug

* update

* update

* update

* update

* polish code

* fix bug

* polish code and add python test

* add test

* fix test error

* add env flag

* fix bug

* revert test env

* change cc_test_old to cc_test

* fix build_static bug

* fix type test error

* update cmake

* disable test in windows

* fix inference compile

[inference][trt]modify test timeout and test_trt_convert_activation bug fix (#54491) · 1f3dd978
Committed by bukejiyu on 15 Jun, 2023
```
* modify tensorrt ci timeout

* activation ci bug fix

* comment out int8 mode test_trt_dynamic_shape_groupnorm
```

fix batch_norm optest code (#54661) · 3a8484c4
Committed by cyber-pioneer on 15 Jun, 2023

Fix sync batch norm op under cuda 12 (#54640) · 7fef4ee9

Committed by Ghost Screaming on 15 Jun, 2023

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

* Remove climits.

* Fix problem of pickle and NCCL_P2P_DISABLE in distributed testcases in cuda12.

* Fix problem of TimeOut of distributed testcases under cuda12.

* Fix bug of test_sync_batch_norm_op_static_build accuracy problem under cuda12.

* Remove useless code modification.

14 Jun, 2023 · 12 commits

[AutoTuner] Add auto tuner to obtain optima configuration (#54460) · e12d2867

Committed by caozhou on 14 Jun, 2023

* add auto tuner

* fix prune

* fix sharding prune and mbs candidates

* fix cfg

* fix launch

* fix launch

* add unittest

* fix code style

Fix cuda12 timeout problems. (#54615) · a90d9088

Committed by Ghost Screaming on 14 Jun, 2023

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

* Remove climits.

* Fix problem of pickle and NCCL_P2P_DISABLE in distributed testcases in cuda12.

* Fix problem of TimeOut of distributed testcases under cuda12.

* Remove useless modification.

* Remove useless modification.

[prim] move batch_norm prim test to op_test (#54458) · 58b4c60f

Committed by cyber-pioneer on 14 Jun, 2023

* move batch_norm prim test to op_test

* fix optest bug

* add test to cmake

* add cinn test case

* fix batch_norm prim grad bf16

* fix code

* add cuda check

* fix batch_norm bfloat16

* fix cpu bfloat16 bug

* skip non-bfloat16-supported platform

* fix code

* fix cinn rtol and atol in bfloat16

* fix name

* fix config

support group_norm and cumsum prim ops bf16 dtype (#54580) · f7eb03c6
Committed by Charles-hit on 14 Jun, 2023

[Zero-Dim] paddle.nanmedian/nanquantile support 0D Tensor (#54500) · 3d4d995f
Committed by zhouweiwei2014 on 14 Jun, 2023
```
* [Zero-Dim] paddle.nanmedian support 0D Tensor

* fix CI
```

[Zero-Dim] add 0D test case (#54581) · ca59c72b
Committed by zhouweiwei2014 on 14 Jun, 2023

set xpu context at runtime (#54587) · d0d7d01f
Committed by zhupengyang on 14 Jun, 2023

Support code generation for op fill_any (#54378) · 4277f61f
Committed by huangjiyi on 14 Jun, 2023
```
* update

* update
```

[BugFix]: Fix ci test bugs in test_fuse_gemm_epilogue_pass.py and test_fused_gemm_epilogue_op.py (#54519) · ded7d190

Committed by yuehuayingxueluo on 14 Jun, 2023

* fix ci bugs in fused_linear

* fix code style

[IR] Support mutable attribute as input for paddle dialect OP build method (#54563) · d658940a
Committed by zhangbo9674 on 14 Jun, 2023
```
* support mutable attr is input for build

* add ut

* solve conflict
```

Fix A100 CUDA12 ut (#54487) · a96c6dc7

Committed by sneaxiy on 14 Jun, 2023

* fix A100 CUDA12 ut

* fix ci uts

* fix test_sync_batch_norm_op

* fix sync bn op ut again by separating 2 files

* fix codestyle ci

* combine other PRs

* fix codestyle

* fix codestyle ci

[IR&PASS] part 3-2: add PatternApplicator and FrozenRewritePatternSet, refine PatternMatch code, add some api for Builder (#54492) · 548fb821

Committed by Yuanle Liu on 14 Jun, 2023

* [IR&PASS] add PatternApplicator and FrozenRewritePatternSet, refine PatternMatch code, add some api for Builder and TypeId

* fix comment

13 Jun, 2023 · 16 commits

Fix c++17 bug (#54228) · 1b5e1e81

Committed by risemeup1 on 13 Jun, 2023

* update

* update

* update

* update

* update

* test

* update

* test

* fix_c++17_bug

* fix coverage compile error

* test

* test

* test

* fix C++17 error

* fix c++17 error

* fix c++17 error

* test

* test

* test

* test

* fix cinn compile error

* compile to compiler

* set cinn c++14

---------
Co-authored-by: huangjiyi <947613776@qq.com>
Co-authored-by: huangjiyi <43315610+huangjiyi@users.noreply.github.com>

Construct dist tensor (#54425) · e32c4375
Committed by LiYuRio on 13 Jun, 2023
```
* construct dist tensor

* move constructor to header
```

Pipeline model, clean up self.data (#54374) · b5fe3f63

Committed by zhenhailiu on 13 Jun, 2023

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

[CINN] Enable CINN unittest on atan2, tile, top_k, where (#54280) · cf7cd247

Committed by Fisher on 13 Jun, 2023

* Enable check_cinn on atan2, tile, top_k and where

* Update cmakelists in legacy_test

* Reformat code

* Enable check_cinn on op take_along_axis legacy test

* Enable check_cinn on pool2d

* Remove check_cinn=False

* Try fix tile test error

* Rename enable_cinn to test_cinn

* Refactor test_tile_op

* Replace all enable_cinn to check_cinn

* Revert pool2d test timeout

* Remove check_prim and use enable_cinn

[Sparse] Add Spconv2d static mode support. (#54371) · 1a30fe54
Committed by umiswing on 13 Jun, 2023

【Hackathon 4 No.9】Add pca_lowrank API to Paddle (#53743) · 4ebb4764
Committed by NetPunk on 13 Jun, 2023

Fix cuda12 timeout (#54540) · 7309f8ab

Committed by TaoTao Li on 13 Jun, 2023

* fix a100 cuda12 timeout

* fix cuda12 pickle loads problem

* fix ist_sharding_save ut

【prim】delete multiply_triple_grad dygraph path (#54558) · 10188e8f
Committed by xiaoguoguo626807 on 13 Jun, 2023
```
* multiply_triple delete

* add case

* add timeout
```

fix the timeout bug of some communication api on A100 (#54513) · 60e3e350
Committed by Yichen Zhang on 13 Jun, 2023

fix api rename (#54592) · 0379e586
Committed by kangguangli on 13 Jun, 2023

【Hackathon 4 No.17】Add cummax / cummin API to Paddle (#53546) · 3a3fb1fe
Committed by NetPunk on 13 Jun, 2023

Optimize initialize time by decrease the number of pp group (#53559) · 6bbe92a1

Committed by LiYuRio on 13 Jun, 2023

* use global group to pass meta

* use batch isend irecv

* add partial send/recv

* remove communication group

* remove p2p on npu and xpu

* remove virtual pp ut

[IR]Lower pd op to kernel dialect (#54469) · bb848e6b

Committed by hong on 13 Jun, 2023

* add kernel dialect

* change DenseTensorTypeStorage to DenseTensorType

* add test case

* add first pd_op to kernel dialect

* lower pd op to kernel dialect

* update

* update

* remove useless code

* add attribute print test

* fix bug

* polish code

[IR] polish the new ir api name. (#54562) · eac99c5b
Committed by winter-wang on 13 Jun, 2023

move single card ut to legacy_test dir (#54560) · 53f24669
Committed by Yuang Liu on 13 Jun, 2023

[inference][trt]layer_norm op with dynamic shape support INormalizationLayer in TRT8.6 (#54379) · 3e36f43b

Committed by bukejiyu on 13 Jun, 2023

* layer_norm op with dynamic shape support INormalizationLayer in TRT8.6

* Using trt layer to make layers_norm op in lower than trt8.6
layer_norm op with dynamic shape support INormalizationLayer in TRT8.6

12 Jun, 2023 · 2 commits

[IR] Support custom op printer (#54499) · 7d688871

Committed by kangguangli on 12 Jun, 2023

* adapt_startup_program

* refactor program translator

* polish

* add custom op printer hook

* fix merge conflicts

* fix top level op printer

* adapt full int array op

* modify by reviews

* fix

log/Log10/log2/log1p support int32/int64/float16/bfloat16 forward (#54089) · 2ddd0473

Committed by Hui Zhang on 12 Jun, 2023

* fix for log xxx

* add int32/int64 for cpu/gpu; add float16/bfloat16 for cpu forward

* fix docstring

* fix bug

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bug

* using cast

* fix test

* fix api

* fix other bugs

* fix ci bug for not using dygraph guard

* add bfloat16 test

* fix ut

* bf16


PaddlePaddle / Paddle: synced successfully about 1 year ago