Commits · 34fafb11924e032f64c4e29918736b9b85026a96 · BaiXuePrincess / Paddle

19 Jan, 2023 · 1 commit

[cherry-pick]Fix paddle.squeeze_ bug (#49937) · 34fafb11

Committed by heliqi on Jan 19, 2023

* Fix paddle.squeeze_ bug (#49903)

* fix squeeze_ bug

* fix solve use squeeze_kernel

* fix solve use squeeze_kernel

* fix solve use squeeze_kernel

* add test case

* Update squeeze_kernel.h

34fafb11

13 Jan, 2023 · 2 commits
- fix_arg_release24 (#49771) · 0699afb1
  Committed by xiaoxiaohehe001 on Jan 13, 2023
  
  0699afb1
- fix fc kernel diff (#49781) · 01c26ab2
  Committed by Yuanle Liu on Jan 13, 2023
```
* fix fc kernel diff

* disable fc_elementwise_layernorm_fuse_pass
```
  01c26ab2
12 Jan, 2023 · 1 commit
- fix_split_infermeta (#49745) · 8a934047
  Committed by xiaoxiaohehe001 on Jan 12, 2023
  
  8a934047
09 Jan, 2023 · 1 commit
- fix bugs of paddle.multiplex API (#49368) (#49642) · 6d2d8e50
  Committed by Haohongxiang on Jan 09, 2023
  
  6d2d8e50
04 Jan, 2023 · 1 commit

[Cherry-pick][Paddle Inference] fix mixed precision diff (#49477) · 1d25c663

Committed by Yuanle Liu on Jan 04, 2023

* disable scale op in amp pass

* Do not insert redundant cast op

* fix fused_fc_elementwise_layernorm kernel diff

* fix fc kernel diff

1d25c663

03 Jan, 2023 · 2 commits
- [Cherry pick] fix fold for big bs (#49491) · 2a438b0a
  Committed by xiaoting on Jan 03, 2023
```
* fix fold for large bs

* fix fold for large bs

* fix pre-commit
```
  2a438b0a
- cherry-pick:Some version of TensorRT don't support qkv_plugin (#49425) · d7855fe8
  Committed by feng_shuai on Jan 03, 2023
```
* cherry-pick:Some version of TensorRT don't support qkv_plugin

* cherry-pick:support coverage CI
```
  d7855fe8
30 Dec, 2022 · 1 commit

[MLU] cherry-pick from develop to release/2.4 (#48313) · 6e154fc6

Committed by Chenxiao Niu on Dec 30, 2022

* [MLU] fix compute error of dropout op (#45923)

* [MLU] add mergedAdam kernel. (#45965)

* [MLU] add int64 support for mlu one_hot_v2 (#46313)

* [MLU] fix profiler compile failure (#46208)

* [MLU] add barrier_op kernel. (#46417)

* [MLU] fluid: add mluop (#46429)

* [MLU] add huber_loss kernel. (#46455)

* [MLU] add mlu kernel for add_reduce_max_grad (#45651)
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>

* [MLU] add_fluid_mluop_yolo_box (#46573)

* [MLU] fix phi::Tensor compile error of mlu. (#46649)

* [MLU] add fluid MLUOps prior_box (#46585)

* [MLU] fix cmake error (#46772)

* [MLU]fix unittest of sync_bn (#46797)

* [MLU] add masterparam support for mlu adamw. (#46804)

* [MLU] add int64 support for allgather. (#46830)

* [MLU] fix compile error & add mlu blacklist function. (#47439)

* [MLU] fix softmax_with_cross_entropy failed in 370-X8.

* [MLU] fix cncl stuck caused by multiple initializations.

* [MLU] fix code style check.
Co-authored-by: qipengh <huangqipeng@cambricon.com>
Co-authored-by: cifar10 <41565156+cifar10@users.noreply.github.com>
Co-authored-by: Lux et Veritas <1004239791@qq.com>
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
Co-authored-by: ronnywang <ronny1996@163.com>

6e154fc6

29 Dec, 2022 · 1 commit

[Cherry-pick]Move sum op to PHI && Fix MetaTensor's bug when run infermeta (#49342) · 8015fbd6

Committed by YuanRisheng on Dec 29, 2022

* cherry-pick 45860

* [BUG FIX]Fix MetaTensor's bug when run infermeta (#46265)

* fix sum bug

* fix ci bugs

* fix ci bugs

* update code according comment

8015fbd6

27 Dec, 2022 · 1 commit

[Cherry-pick] Fix custom operator backward=None (#48656) (#48715) · 39eb77a6

Committed by HongyuJia on Dec 27, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* fix custom operator backward=None (#48656)

* [Custom Extension] Fix custom double_grad backward=None (#49224)

* fix custom double_grad backward=None

* fix custom_relu.cu bug && polish testcase of double_grad

* remove old dynamic graph test

* add import fluid

* add import fluid
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

39eb77a6

22 Dec, 2022 · 1 commit

Fix mixed precision bug (#49239) · 11c7f570

Committed by Yuanle Liu on Dec 22, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* fix mixed precision inference
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

11c7f570

20 Dec, 2022 · 1 commit
- Fix nullptr to TestFuseGemmEpilogueReluBWDFP* (#48997) (#49090) · cdab3a44
  Committed by ShenLiang on Dec 20, 2022
```
Co-authored-by: Ming-Xu Huang <mingh@nvidia.com>
```
  cdab3a44
19 Dec, 2022 · 1 commit

[cherry-pick][Inference] support mixed precision inference (#49077) · ddcd1b61

Committed by Yuanle Liu on Dec 19, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* [Paddle Inference] Add float_to_half_pass to support inference with mixed precision (#47993)

* [Inference] optimize some code and fix some bug (#48780)

* clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass

* fix unittest timeout

* [Paddle Inference] clean unused code (#48392)

* fix

* update

* update
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

ddcd1b61

29 Nov, 2022 · 1 commit

[cherry-pick] updating mul and matmul with set_mem_desc and fix... · 9e2ba9b9

Committed by yeliang2258 on Nov 29, 2022

[cherry-pick] updating mul and matmul with set_mem_desc and fix squeeze_transpose for MKLDNN (#47951)

* Fix slice bugs in MKLDNN when input dims are zeros (#46671)

* fix slice bugs

* fix

* update code

* fix

* update code

* updating mul and matmul with set_mem_desc (#45624)

* - mul & matmul changes

- fix

- bs16 correction of strides

* - cosmetic fixes

* - lint

* - fix

* - fix

* - format -> mem_desc

* - fix

* - fix

* - fix

* - fix

* - fix

* fix squeeze_transpose (#47911)
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>

9e2ba9b9

28 Nov, 2022 · 1 commit

Cherrypick NV fixes to release/2.4 (#48263) · 7a0b8625

Committed by zlsh80826 on Nov 28, 2022

* Reduce squeeze2_matmul_fuse_pass, flatten tests time (#47098)

* Add missing fp32 config and reduce the testing combination

* Reduce trt matmul pass test max examples

* Loose TRT fp16 tests tolerance (#47100)

* Loose TRT half test tolerance to 1e-3 (#47101)

* Loose TRT half test tolerance to 1e-3 (#47106)

* Update distributed_strategy.proto (#46531)

* Close popen pipe after used (#47053)

* Add launch_bounds (#47285)

* Fix TRT UT failures (#47488)

* Format cherry-picked commits

* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)

* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
Co-authored-by: Shijie <505749828@qq.com>
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
Co-authored-by: Tian Zheng <tizheng@nvidia.com>

7a0b8625

25 Nov, 2022 · 2 commits
- Fix wrong eigen header include in data_type.h (#48157) (#48260) · a2f61fef
  Committed by zyfncg on Nov 25, 2022
```
* Fix wrong eigen header include

* fix compile bug
```
  a2f61fef
- update (#48350) · b9b7f009
  Committed by Wilber on Nov 25, 2022
  
  b9b7f009
16 Nov, 2022 · 1 commit
- Fix mac link python (#48017) · 3fa7a736
  Committed by wanghuancoder on Nov 16, 2022
```
* fix mac link python

* refine
```
  3fa7a736
11 Nov, 2022 · 1 commit
- Fix slice bugs in MKLDNN when input dims are zeros (#46671) (#47887) · 5033b6c2
  Committed by yeliang2258 on Nov 11, 2022
```
* fix slice bugs

* fix

* update code

* fix

* update code
```
  5033b6c2
10 Nov, 2022 · 3 commits

Fuse multi transformer layer pass (#47541) (#47830) · 3a6cc57c
Committed by RichardWooSJTU on Nov 10, 2022
```
* add fuse_multi_transformer_layer_pass
```
3a6cc57c

[Cherry-pick] Fix python link error (#47811) · ff642c68

Committed by Chen Weihang on Nov 10, 2022

* Fix Python Link Order Error (#46259)

* fix cc_library link python lib (#47605)

* fix cc_library link python lib
Co-authored-by: engineer <1292846099@qq.com>
Co-authored-by: wanghuancoder <wanghuan29@baidu.com>

ff642c68

【Cherry-pick PR47743】change cudnn error to cuda error if compiled cuda version... · 76b883c2

Committed by pangyoki on Nov 10, 2022

【Cherry-pick PR47743】change cudnn error to cuda error if compiled cuda version is incompatible with installed cuda version (#47744)

* cherry-pick pr47743

* fix

* fix

* fix

76b883c2

09 Nov, 2022 · 1 commit
- [cherry-pick] Squeeze2 and transpose2 fuse using oneDNN(#47712) · ea5f44b8
  Committed by Hui Zhang on Nov 09, 2022
```
* squeeze2 + transpose2 fuse onednn cherrypick 2.4

* format

* fix merge
```
  ea5f44b8
08 Nov, 2022 · 2 commits
- add fuse_multi_transformer passes to fp16. test=develop (#47733) · 34f67a88
  Committed by Kaipeng Deng on Nov 08, 2022
  
  34f67a88
- [CHERRY-PICK] Added caching to oneDNN FC and op+unsqueeze2 and op+reshape2 fuse passes (#47690) · d0e19af3
  Committed by jakpiase on Nov 08, 2022
```
* fc cherrypick

* another files added

* added transpose cherrypick

* reverted somebody's fc changes

* minor fix

* minor fix

* cherry-pick of fc+act changes

* minor fix

* fix
```
  d0e19af3
07 Nov, 2022 · 3 commits

[cherry-pick2.4]docs fix (#47669) · cf668ab3

Committed by Ligoml on Nov 07, 2022

* #46165

* #45752

* fix some doc bug test=document_fix (#45488)

* fix some doc bug test=document_fix

* fix some docs issues, test=document_fix

* beta -> \beta in softplus

* threshold -> \varepsilon in softplus

* parameter name

* delta -> \delta in smooth_l1_loss

* fix some docs test=document_fix

* fix docs test=document_fix

* fix docs && add blank lines test=document_fix

* Update python/paddle/nn/functional/activation.py, test=document_fix

* Update python/paddle/nn/layer/activation.py, test=document_fix
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

* [docs] add ipustrategy Hyperlink (#46422)

* [docs] add ipustrategy Hyperlink

* fix ipu_shard_guard docs; test=document_fix

* [docs] add set_ipu_shard note

* [docs] fix hyperlink

* update framework.py

* fix mlu_places docs; test=document_fix

* fix put_along_axis docs; test=document_fix

* fix flake8 W293 error, test=document_fix

* fix typo in typing, test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

* #46659

* Update README_cn.md (#46927)

Fixed typos

* #46738

* fix paddle.get_default_dtype (#47040)

Chinese and English return values are inconsistent

* fix bug
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
Co-authored-by: Infinity_lee <luhputu0815@gmail.com>
Co-authored-by: mrcangye <chenloong@88.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
Co-authored-by: gouzil <66515297+gouzil@users.noreply.github.com>
Co-authored-by: Hamid Zare <12127420+hamidzr@users.noreply.github.com>
Co-authored-by: Sqhttwl <61459740+Sqhttwl@users.noreply.github.com>
Co-authored-by: OccupyMars2025 <31559413+OccupyMars2025@users.noreply.github.com>
Co-authored-by: 超级码牛 <54444805+SuperCodebull@users.noreply.github.com>
Co-authored-by: jzhang533 <jzhang533@gmail.com>

cf668ab3

【Cherry-pick PR47666】add cudnn error if compiled cudnn version is incompatible... · 764cea0c

Committed by pangyoki on Nov 07, 2022

【Cherry-pick PR47666】add cudnn error if compiled cudnn version is incompatible with installed cudnn version (#47673)

* Cherry-pick PR47666, add cudnn error (#47666)

* [CherryPick] Cherry pick #45916 #46031 #47299  (#47610)

* [ Dy2Static ] Fix bugs when select inputs meeting different shape or undefined-var (#45916)

* fix select_input with different shape errors:
1. select_input_with_buildin_type directly return non-undefinedvar branch when meeting undefined var
2. the output shape of select_input is inferred from inputs.

* reverse the logic in select_input

* [warning] added warning message in cond block when one branch returns variable and another returns None (#46031)

* [cherry-pick] Allow manually set py_reader name in standalone executor (#45898) (#45931)

* Allow manually set py_reader name in standalone executor

* [BugFix] while cond receives dict as input (#47299)

* fix bugs while cond receives dict as input

* add unittest

* change flatten -> _is_sequence_except_dict

* code format
Co-authored-by: feifei-111 <wuzhanfei@baidu.com>
Co-authored-by: xiongkun <xiongkun03@baidu.com>
Co-authored-by: feifei-111 <wuzhanfei@baidu.com>

764cea0c

Revert "SparseConv support duplicate coordinates (#44976)" (#45202) (#47699) · 7145db6e
Committed by zhangkaihuo on Nov 07, 2022
```
Revert SparseConv support duplicate coordinates
```
7145db6e

03 Nov, 2022 · 4 commits
- FC/matmul(v2) + scale fuse pass (#47420) · 99c872fa
  Committed by Sławomir Siwek on Nov 03, 2022
  
  99c872fa
- Fix ComputePropagateScalesMkldnnPass of MKLDNN (#47574) (#47639) · 559b9754
  Committed by yeliang2258 on Nov 03, 2022
```
* add constant_folding_pass pass for mkldnn int8

* update UpdateScaleOpInOutScales
```
  559b9754
- [Sparse] Unified api args name (#47529) (#47627) · 75088bbf
  Committed by zhangkaihuo on Nov 03, 2022
```
Unified api args name
```
  75088bbf
- [cherry pick] fix memory copy in prepare_data of FusedMultiTransformer pass (#47308) · ba4fbe71
  Committed by Kaipeng Deng on Nov 03, 2022
```
* fix memory copy in prepare_data. test=develop

* add cache_kv fp16 support. test=develop

* fit for simplify_with_basic_ops_pass. test=develop
```
  ba4fbe71
02 Nov, 2022 · 1 commit
- [geometric] Optimize graph sample speed (#47531) (#47548) · 7a1cf277
  Committed by Siming Dai on Nov 02, 2022
  
  7a1cf277
01 Nov, 2022 · 2 commits

[cherry-pick][code-gen] Support code-gen for opmaker of sparse op (#46993) (#47417) · 601626ac

Committed by zyfncg on Nov 01, 2022

* support generating code of opmaker for backward op invoke forward op (#46912)

* [code-gen] Support code-gen for opmaker of sparse op (#46993)

* support generating code of opmaker for backward op invoke forward op

* support code-gen of opmaker for sparse op

* refine logic of choosing phi kernel

* fix compile bug

* fix code_gen bug

* fix bug

* fix kernel signature code-gen

* fix compile bug of VarType

* fix compile bug of VarType

* fix test_sparse_conv_op

* fix test_sparse_norm_op

* [Phi] Refactor logic of judging whether having a phi kernel (#46920)

* refine logic of choosing phi kernel

* fix compile bug

* update cmake

601626ac

fix p2p comm memory release logic (#47497) (#47517) · 0201ccc4
Committed by Yuang Liu on Nov 01, 2022

0201ccc4

29 Oct, 2022 · 1 commit
- [JITLayer]Enable OneDNN on CPU and Fix zero shape (#47428) (#47436) · f4788442
  Committed by Aurelius84 on Oct 29, 2022
```
* [JITLayer]Enable OneDNN on CPU and Fix zero shape
```
  f4788442
28 Oct, 2022 · 3 commits
- [Dy2St]Fix abnormal growth of memory in train mode and no_grad for Dy2St (#47398) (#47414) · 7618cbdc
  Committed by WangZhen on Oct 28, 2022
```
* [Dy2St]Fix abnormal growth of memory in train mode and no_grad for Dy2St 
```
  7618cbdc
- [Cherry-pick][JIT] Add Predictor for JITLayer (#47379) (#47419) · c42929c5
  Committed by Aurelius84 on Oct 28, 2022
```
* [JIT] Add Predictor for JITLayer (#47379)

* add predictor_engine

* add predictor_engine

* fix zero shape

* fix lodTensor

* fix unittest

* fix code style

* update CmakeList

* fix new executor
```
  c42929c5
- [cherry-pick]add sync_batch_norm_bn and deliver indices_dict (#47407) · 0fa8309a
  Committed by zhangkaihuo on Oct 28, 2022
```
add sync_batch_norm_bn and deliver indices_dict 
```
  0fa8309a

BaiXuePrincess / Paddle is in sync with the fork's source project
