Commits · release/2.4 · PaddlePaddle / Paddle

Feb 15, 2023 · 3 commits
- [Paddle-TRT]fix slice, bilinear_interp_v2 in trt 7011 (#50187) · f0422a28
  Committed by zhoutianzi666 on Feb 15, 2023
```
* fix bug
* disable bilinear_interp_v2
* add verison check in  py UT
```
- fix roll bug (#50391) · fd679d31
  Committed by zhoutianzi666 on Feb 15, 2023
- prefix (#50381) · d9a134c3
  Committed by Wang Bojun on Feb 15, 2023
Feb 13, 2023 · 2 commits
- [Cherry-pick] Support build with gcc12 for CUDA less than 12.0 (#50291) · 0e92adce
  Committed by chalsliu on Feb 13, 2023
```
* Support build with gcc12 for CUDA less than 12.0

* fix version message test=document_fix
```
- add p2p (#50337) · 5a1b6f5d
  Committed by ShenLiang on Feb 12, 2023
Feb 10, 2023 · 2 commits
- [cherry-pick] remove if constexpr(), which is not supported on gcc54 (#50421) · 913f40ee
  Committed by zhangkaihuo on Feb 10, 2023
```
att, cherry-pick #48563
```
- [cherry-pick] Fix bn performance degradation (#50382) · eb610740
  Committed by zhangkaihuo on Feb 10, 2023
```
att, cherry-pick: #48563 , #50287
```
Feb 7, 2023 · 2 commits
- [cherry-pick 2.4] Fix to_dlpack (#50138) (#50250) · 59fec5d6
  Committed by Siming Dai on Feb 7, 2023
```
* Fix to_dlpack (#50138)

* fix to_dlpack for loop

* fix reference count

* fix conflicts
```
- 2.4:modify cmake file for cuda11.8 compile (#50185) · b50f04ab
  Committed by zqw_1997 on Feb 7, 2023
```
* 2.4:modify cmake file for cuda11.8 compile

* fix small mistake

* mistake resolved
```
Feb 6, 2023 · 1 commit
- [cherry-pick 2.4]fix compil error on windows for cuda11.6/7/8 (#50205) · 11f478fe
  Committed by zhouweiwei2014 on Feb 6, 2023
Feb 3, 2023 · 2 commits
- add deprecated decorator for paddle.utils.profiler (#50134) · edd0541b
  Committed by JYChen on Feb 3, 2023
- [Cherry-Pick][Dy2St]Support call backward() without params in dy2st (#49812) (#50144) · 4c82e455
  Committed by WangZhen on Feb 3, 2023
```
* [Dy2St]Support call backward() without params in dy2st (#49812)

* Support call backward() without params in dy2st

* format code

* format code
```
Feb 2, 2023 · 2 commits

[cherry-pick] Optimize sparse kernel and fix some bug (#50118) · 8c5e432b
Committed by zhangkaihuo on Feb 2, 2023
```
cherry-pick some PR about optimize sparse kernel and fix some bug:
#47736 #47703 #47604 #46679 #48439 #49009 #49734
```

[cherry-pick][pass] Upgrade Constant Folding Pass (#50105) · e32ff656

Committed by Zhang Jun on Feb 2, 2023

* constant folding/trt subgrash pass debug
* constant folding set persistalbe var in OP block, and remove unsed log
* set node var persistalbe

Jan 31, 2023 · 1 commit
- [cherry-pick]BatchNorm use inplace (#49529) · dddc5d9d
  Committed by zhangkaihuo on Jan 31, 2023
```
att, cherry-pick#48254, and resolve conflict
```
Jan 19, 2023 · 1 commit

[cherry-pick]Fix paddle.queeze_ bug (#49937) · 34fafb11

Committed by heliqi on Jan 19, 2023

* Fix paddle.queeze_ bug (#49903)

* fix queeze_ bug

* fix slove use squeeze_kernel

* fix slove use squeeze_kernel

* fix slove use squeeze_kernel

* add test case

* Update squeeze_kernel.h

Jan 13, 2023 · 2 commits
- fix_arg_release24 (#49771) · 0699afb1
  Committed by xiaoxiaohehe001 on Jan 13, 2023
- fix fc kernel diff (#49781) · 01c26ab2
  Committed by Yuanle Liu on Jan 13, 2023
```
* fix fc kernel diff

* disable fc_elementwise_layernorm_fuse_pass
```
Jan 12, 2023 · 1 commit
- fix_split_infermeta (#49745) · 8a934047
  Committed by xiaoxiaohehe001 on Jan 12, 2023
Jan 9, 2023 · 1 commit
- fix bugs of paddle.multiplex API (#49368) (#49642) · 6d2d8e50
  Committed by Haohongxiang on Jan 9, 2023
Jan 4, 2023 · 2 commits
- [Cherry-pick][Paddle Inference] fix mixed precision diff (#49477) · 1d25c663
  Committed by Yuanle Liu on Jan 4, 2023
```
* disable scale op in amp pass

* Do not insert redundant cast op

* fix fused_fc_elementwise_layernorm kernel diff

* fix fc kerenl diff
```
- [Cherry-pick] add condition of skipif (#49407) · 7696ae02
  Committed by YUNSHEN XIE on Jan 4, 2023
```
* resolve conflict

* fix format error
```
Jan 3, 2023 · 2 commits
- [Cherry pick] fix fold for big bs (#49491) · 2a438b0a
  Committed by xiaoting on Jan 3, 2023
```
* fix fold for large bs

* fix fold for large bs

* fix pre-commit
```
- cherry-pick:Some version of TensorRT don't support qkv_plugin (#49425) · d7855fe8
  Committed by feng_shuai on Jan 3, 2023
```
* cherry-pick:Some version of TensorRT don't support qkv_plugin

* cherry-pick:support coverage CI
```
Dec 30, 2022 · 1 commit

[MLU] cherry-pick from develop to release/2.4 (#48313) · 6e154fc6

Committed by Chenxiao Niu on Dec 30, 2022

* [MLU] fix compute error of dropout op (#45923)

* [MLU] add mergedAdam kernel. (#45965)

* [MLU] add int64 support for mlu one_hot_v2 (#46313)

* [MLU] fix profiler compile failure (#46208)

* [MLU] add barrier_op kernel. (#46417)

* [MLU] fluid: add mluop (#46429)

* [MLU] add huber_loss kernel. (#46455)

* [MLU] add mlu kernel for add_reduce_max_grad (#45651)
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>

* [MLU] add_fluid_mluop_yolo_box (#46573)

* [MLU] fix phi::Tensor compile error of mlu. (#46649)

* [MLU] add fluid MLUOps prior_box (#46585)

* [MLU] fix cmake error (#46772)

* [MLU]fix unittest of sync_bn (#46797)

* [MLU] add masterparam support for mlu adamw. (#46804)

* [MLU] add int64 support for allgather. (#46830)

* [MLU] fix compile error & add mlu blacklist function. (#47439)

* [MLU] fix softmax_with_cross_entropy failed in 370-X8.

* [MLU] fix cncl stuck caused by multiple initializations.

* [MLU] fix code style check.
Co-authored-by: qipengh <huangqipeng@cambricon.com>
Co-authored-by: cifar10 <41565156+cifar10@users.noreply.github.com>
Co-authored-by: Lux et Veritas <1004239791@qq.com>
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
Co-authored-by: ronnywang <ronny1996@163.com>

Dec 29, 2022 · 2 commits
- [cherry-pick]fix bug of UT test_version, test=document_fix (#49401) · 96e974a0
  Committed by zhouweiwei2014 on Dec 29, 2022
- [Cherry-pick]Move sum op to PHI && Fix MetaTensor's bug when run infermeta (#49342) · 8015fbd6
  Committed by YuanRisheng on Dec 29, 2022
```
* cherry-pick 45860

* [BUG FIX]Fix MetaTensor's bug when run infermeta (#46265)

* fix sum bug

* fix ci bugs

* fix ci bugs

* update code according comment
```
Dec 28, 2022 · 1 commit
- [Cherry-pick] Fix CUDA11.8 Unittest Accuracy (#49374) · 8aa5be90
  Committed by Huihuang Zheng on Dec 28, 2022
```
Fix CUDA11.8 Unittest Accuracy
```
Dec 27, 2022 · 2 commits

update jetson ampere sm (#49364) · b5fdd175
Committed by Yuanle Liu on Dec 27, 2022

[Cherry-pick] Fix custom operator backward=None (#48656) (#48715) · 39eb77a6

Committed by HongyuJia on Dec 27, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* fix custom operator backward=None (#48656)

* [Custom Extension] Fix custom double_grad backward=None (#49224)

* fix custom double_grad backward=None

* fix custom_relu.cu bug && polish testcase of double_grad

* remove old dynamic graph test

* add import fluid

* add import fluid
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

Dec 22, 2022 · 3 commits

fix unittest in post training quantization (#49257) · 5d29a5bf
Committed by Guanghua Yu on Dec 22, 2022

Fix mixed precision bug (#49239) · 11c7f570

Committed by Yuanle Liu on Dec 22, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* fix mixed precision inference
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

[Docs]update readme; test=document_fix (#49246) · 612bdb17
Committed by Ligoml on Dec 22, 2022

Dec 21, 2022 · 2 commits
- fix unittests (#49203) (#49210) · 7c36b887
  Committed by Aganlengzi on Dec 21, 2022
- cherry-pick #75b734 (#49201) · fb19648a
  Committed by zhangkaihuo on Dec 21, 2022
Dec 20, 2022 · 1 commit
- Fix nullptr to TestFuseGemmEpilogueReluBWDFP* (#48997) (#49090) · cdab3a44
  Committed by ShenLiang on Dec 20, 2022
```
Co-authored-by: Ming-Xu Huang <mingh@nvidia.com>
```
Dec 19, 2022 · 1 commit

[cherry-pick][Inference] support mixed precision inference (#49077) · ddcd1b61

Committed by Yuanle Liu on Dec 19, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* [Paddle Inference] Add float_to_half_pass to support  inference with mixed precision (#47993)

* [Inference] optimize some code and fix some bug (#48780)

* clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass

* fix unitest timeout

* [Paddle Inference] clean unused code  (#48392)

* fix

* update

* update
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

Nov 29, 2022 · 1 commit

[cherry-pick] updating mul and matmul with set_mem_desc and fix squeeze_transpose for MKLDNN (#47951) · 9e2ba9b9

Committed by yeliang2258 on Nov 29, 2022

* Fix slice bugs in MKLDNN when input dims are zeros (#46671)

* fix slice bugs

* fix

* update code

* fix

* update code

* updating mul and matmul with set_mem_desc (#45624)

* - mul & matmul changes

- fix

- bs16 correction of strides

* - cosmetic fixes

* - lint

* - fix

* - fix

* - format -> mem_desc

* - fix

* - fix

* - fix

* - fix

* - fix

* fix squueze_transpose (#47911)
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

Nov 28, 2022 · 1 commit

Cherrypick NV fixes to release/2.4 (#48263) · 7a0b8625

Committed by zlsh80826 on Nov 28, 2022

* Reduce squeeze2_matmul_fuse_pass, flattent tests time (#47098)

* Add missing fp32 config and reduce the testing combination

* Reduce trt matmul pass test max examples

* Loose TRT fp16 tests tolerance (#47100)

* Loose TRT half test tolerance to 1e-3 (#47101)

* Loose TRT half test tolerance to 1e-3 (#47106)

* Update distributed_strategy.proto (#46531)

* Close popen pipe after used (#47053)

* Add launch_bounds (#47285)

* Fix TRT UT failures (#47488)

* Format cherry-picked commits

* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)

* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
Co-authored-by: Shijie <505749828@qq.com>
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
Co-authored-by: Tian Zheng <tizheng@nvidia.com>

Nov 25, 2022 · 1 commit
- Fix wrong eigen header include in data_type.h (#48157) (#48260) · a2f61fef
  Committed by zyfncg on Nov 25, 2022
```
* Fix wrong eigen header include

* fix compile bug
```

PaddlePaddle / Paddle · synced successfully more than 1 year ago