Commits · 6f3c96438b4ac3a199195d300b38563221c661b6 · PaddlePaddle / Paddle

14 Apr, 2023 1 commit
- Eb118 BF16 Adoption (#52827) · 6f3c9643
  Committed by JZ-LIANG on Apr 14, 2023
```
* pr1

* pr2

* pr3

* fixed unittest

* adopt for scale
```
12 Apr, 2023 3 commits
- Cherry-pick the support of bf16 of grad_clip, in #51285. (#52816) · 8cbc75ca
  Committed by Yiqun Liu on Apr 12, 2023
- Cherry Pick Random Ctrl (#52778) · 3869a3b4
  Committed by JZ-LIANG on Apr 12, 2023
- Unify the static amp codes of fp16 and bf16. Reimplement #52694 in release/2.4. (#52697) · 6959eae5
  Committed by Yiqun Liu on Apr 12, 2023
11 Apr, 2023 1 commit

Cherry pick for fix of operator precision. (#52705) · d1e8b1e2

Committed by Yiqun Liu on Apr 11, 2023

* Fix scale kernel for low precision, cherry pick #50998.

* Fix the FP16 precision problem of add_n. (#50129)

* Change squared_l2_norm to reuse ReduceKernel, and register fp16 and bf16 kernel, which is cherry pick #48315.

* Cherry-pick the fix of MPTypeTrait in KP, which is implemented in #50993.

* Cherry-pick the multi-precision support of AdamW for bf16, #48041.

* Fix compiling error.

* Cherry-pick the fix of CubTensorReduceImpl for bfloat16 in #50993.

* Fix unittest.

---------
Co-authored-by: liuruyan <44316842+liuruyan@users.noreply.github.com>

10 Apr, 2023 1 commit
- Broadcast the master weight along with param for distributed training. (#52638) · d12588d2
  Committed by Yiqun Liu on Apr 10, 2023
```
* Broadcast the master weight along with param for distributed training.

* Fix codestyle.
```
09 Apr, 2023 2 commits

Add bfloat16 support for several operators and apis. (#52696) · ba9a22db

Committed by Yiqun Liu on Apr 09, 2023

* Cherry-pick the register of bfloat16 for amp_kernel, pull request #45541.

* Cherry-pick the master_grad support of adamw, pull request #51141.

* add bf16 for some ops in static mode (#51582)

* Add bfloat16 support for some api in static mode.

* Fix codestyle.

* Revert the change of layer_function_generator.py.

---------
Co-authored-by: Shaojie WANG <wsjmessi@163.com>

Cherry pick the support of bfloat16 for several operators. (#52608) · 95c3d613

Committed by Yiqun Liu on Apr 09, 2023

* Register exp/expm1/logit bf16 activation op kernels (#48702)

* register more bf16 ops

* update to register corresponding backward ops

* Addition of bf16 type support for Compare OP  (#46413)

* first commit

* clarify the quotes

* change code style format

* support bfloat16

* add bfloat16 support for more ops (#48272)

* [Bfloat16]register bfloat16 datatype for squared l2 norm (#50908)

* Sync the pull request #51903.

* Add some header files back.

* modify cmake file for cuda11.8 compile (#49020)

* modify cmake file for cuda11.8 compile

* add op_library(fused_embedding_eltwise_layernorm_op DEPS bert_encoder_functor)

* Fix compiling error.

* Cherry-pick pull request #51396.

---------
Co-authored-by: sneaxiy <32832641+sneaxiy@users.noreply.github.com>
Co-authored-by: limingshu <61349199+JamesLim-sy@users.noreply.github.com>
Co-authored-by: Shaojie WANG <wsjmessi@163.com>
Co-authored-by: zqw_1997 <118182234+zhengqiwen1997@users.noreply.github.com>

03 Apr, 2023 1 commit
- make micro bsz configurable (#52447) · 722f880e
  Committed by zhaoyingli on Apr 03, 2023
20 Mar, 2023 1 commit
- Cherry-pick fleet executor and auto parallel (#50071) · 92c2dcbd
  Committed by LiYuRio on Mar 20, 2023
09 Mar, 2023 1 commit
- Extra Sync for Tensor Parallel (#50637) · 4bacf2ab
  Committed by JZ-LIANG on Mar 09, 2023
17 Feb, 2023 1 commit
- Add rpc ops to fetch data from remote service (#50220) · 9025fddd
  Committed by Wen Sun on Feb 17, 2023
04 Jan, 2023 1 commit
- [Cherry-pick] add condition of skipif (#49407) · 7696ae02
  Committed by YUNSHEN XIE on Jan 04, 2023
```
* resolve conflict

* fix format error
```
03 Jan, 2023 2 commits
- [Cherry pick] fix fold for big bs (#49491) · 2a438b0a
  Committed by xiaoting on Jan 03, 2023
```
* fix fold for large bs

* fix fold for large bs

* fix pre-commit
```
- cherry-pick:Some version of TensorRT don't support qkv_plugin (#49425) · d7855fe8
  Committed by feng_shuai on Jan 03, 2023
```
* cherry-pick:Some version of TensorRT don't support qkv_plugin

* cherry-pick:support coverage CI
```
30 Dec, 2022 1 commit

[MLU] cherry-pick from develop to release/2.4 (#48313) · 6e154fc6

Committed by Chenxiao Niu on Dec 30, 2022

* [MLU] fix compute error of dropout op (#45923)

* [MLU] add mergedAdam kernel. (#45965)

* [MLU] add int64 support for mlu one_hot_v2 (#46313)

* [MLU] fix profiler compile failure (#46208)

* [MLU] add barrier_op kernel. (#46417)

* [MLU] fluid: add mluop (#46429)

* [MLU] add huber_loss kernel. (#46455)

* [MLU] add mlu kernel for add_reduce_max_grad (#45651)
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>

* [MLU] add_fluid_mluop_yolo_box (#46573)

* [MLU] fix phi::Tensor compile error of mlu. (#46649)

* [MLU] add fluid MLUOps prior_box (#46585)

* [MLU] fix cmake error (#46772)

* [MLU]fix unittest of sync_bn (#46797)

* [MLU] add masterparam support for mlu adamw. (#46804)

* [MLU] add int64 support for allgather. (#46830)

* [MLU] fix compile error & add mlu blacklist function. (#47439)

* [MLU] fix softmax_with_cross_entropy failed in 370-X8.

* [MLU] fix cncl stuck caused by multiple initializations.

* [MLU] fix code style check.
Co-authored-by: qipengh <huangqipeng@cambricon.com>
Co-authored-by: cifar10 <41565156+cifar10@users.noreply.github.com>
Co-authored-by: Lux et Veritas <1004239791@qq.com>
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
Co-authored-by: ronnywang <ronny1996@163.com>

29 Dec, 2022 2 commits
- [cherry-pick]fix bug of UT test_version, test=document_fix (#49401) · 96e974a0
  Committed by zhouweiwei2014 on Dec 29, 2022
- [Cherry-pick]Move sum op to PHI && Fix MetaTensor's bug when run infermeta (#49342) · 8015fbd6
  Committed by YuanRisheng on Dec 29, 2022
```
* cherry-pick 45860

* [BUG FIX]Fix MetaTensor's bug when run infermeta (#46265)

* fix sum bug

* fix ci bugs

* fix ci bugs

* update code according to comment
```
28 Dec, 2022 1 commit
- [Cherry-pick] Fix CUDA11.8 Unittest Accuracy (#49374) · 8aa5be90
  Committed by Huihuang Zheng on Dec 28, 2022
```
Fix CUDA11.8 Unittest Accuracy
```
27 Dec, 2022 1 commit

[Cherry-pick] Fix custom operator backward=None (#48656) (#48715) · 39eb77a6

Committed by HongyuJia on Dec 27, 2022

* [Release2.4] Revert python link prs (#48573)

* Revert "Fix mac link python (#48017)"

This reverts commit 3fa7a736.

* Revert "[Cherry-pick] Fix python link error (#47811)"

This reverts commit ff642c68.

* Update config.go

* fix custom operator backward=None (#48656)

* [Custom Extension] Fix custom double_grad backward=None (#49224)

* fix custom double_grad backward=None

* fix custom_relu.cu bug && polish testcase of double_grad

* remove old dynamic graph test

* add import fluid

* add import fluid
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

22 Dec, 2022 1 commit
- fix unittest in post training quantization (#49257) · 5d29a5bf
  Committed by Guanghua Yu on Dec 22, 2022
21 Dec, 2022 2 commits
- fix unittests (#49203) (#49210) · 7c36b887
  Committed by Aganlengzi on Dec 21, 2022
- cherry-pick #75b734 (#49201) · fb19648a
  Committed by zhangkaihuo on Dec 21, 2022
28 Nov, 2022 1 commit

Cherrypick NV fixes to release/2.4 (#48263) · 7a0b8625

Committed by zlsh80826 on Nov 28, 2022

* Reduce squeeze2_matmul_fuse_pass, flatten tests time (#47098)

* Add missing fp32 config and reduce the testing combination

* Reduce trt matmul pass test max examples

* Loose TRT fp16 tests tolerance (#47100)

* Loose TRT half test tolerance to 1e-3 (#47101)

* Loose TRT half test tolerance to 1e-3 (#47106)

* Update distributed_strategy.proto (#46531)

* Close popen pipe after used (#47053)

* Add launch_bounds (#47285)

* Fix TRT UT failures (#47488)

* Format cherry-picked commits

* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)

* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
Co-authored-by: Shijie <505749828@qq.com>
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
Co-authored-by: Tian Zheng <tizheng@nvidia.com>

24 Nov, 2022 1 commit
- [cherry-pick2.4]en-docs warning&error fix (#48332) · 1490aaa9
  Committed by ustiniankw on Nov 24, 2022
```
* fixdocs, test=document_fix

* fixdocs, test=document_fix
```
11 Nov, 2022 1 commit
- rename fw_bw func name of interleave pp (#47571) (#47862) · 4465ba27
  Committed by Haohongxiang on Nov 11, 2022
10 Nov, 2022 1 commit
- 【cherry-pick】update Recompute doc (#47784) · 2e9e65d8
  Committed by wuhuachaocoding on Nov 10, 2022
```
* cherry-pick recompute doc update.

* update.
```
09 Nov, 2022 1 commit
- [Cherry-pick] remove functions not belong to public-api from __all__ (#47577) · 51248f89
  Committed by JYChen on Nov 09, 2022
```
* remove functions not belong to public-api from __all__

* fix code style

* fix error in distributed
```
07 Nov, 2022 4 commits

[cherry-pick2.4]docs fix (#47669) · cf668ab3

Committed by Ligoml on Nov 07, 2022

* #46165

* #45752

* fix some doc bug test=document_fix (#45488)

* fix some doc bug test=document_fix

* fix some docs issues, test=document_fix

* beta -> \beta in softplus

* threshold -> \varepsilon in softplus

* parameter name

* delta -> \delta in smooth_l1_loss

* fix some docs test=document_fix

* fix docs test=document_fix

* fix docs && add blank lines test=document_fix

* Update python/paddle/nn/functional/activation.py, test=document_fix

* Update python/paddle/nn/layer/activation.py, test=document_fix
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

* [docs] add ipustrategy Hyperlink (#46422)

* [docs] add ipustrategy Hyperlink

* fix ipu_shard_guard docs; test=document_fix

* [docs] add set_ipu_shard note

* [docs] fix hyperlink

* update framework.py

* fix mlu_places docs; test=document_fix

* fix put_along_axis docs; test=document_fix

* fix flake8 W293 error, test=document_fix

* fix typo in typing, test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

* #46659

* Update README_cn.md (#46927)

Fixed typos.

* #46738

* fix paddle.get_default_dtype (#47040)

Chinese and English return values are inconsistent

* fix bug
Co-authored-by: 张春乔 <83450930+Liyulingyue@users.noreply.github.com>
Co-authored-by: Infinity_lee <luhputu0815@gmail.com>
Co-authored-by: mrcangye <chenloong@88.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
Co-authored-by: gouzil <66515297+gouzil@users.noreply.github.com>
Co-authored-by: Hamid Zare <12127420+hamidzr@users.noreply.github.com>
Co-authored-by: Sqhttwl <61459740+Sqhttwl@users.noreply.github.com>
Co-authored-by: OccupyMars2025 <31559413+OccupyMars2025@users.noreply.github.com>
Co-authored-by: 超级码牛 <54444805+SuperCodebull@users.noreply.github.com>
Co-authored-by: jzhang533 <jzhang533@gmail.com>

update the split logic for uniform (#47670) (#47705) · 3a014783
Committed by Yuang Liu on Nov 07, 2022
```
* code format change

* update the split logic for uniform (#47670)
```

[cherry-pick2.4]fix numpy issue in codeblock examples (#47664) · d5809836

Committed by Ligoml on Nov 07, 2022

* #46765

* #47042

* Remove redundant numpy import (#47483)

* #47555

* resolve conflict

* resolve conflict

* resolve conflict

* resolve conflict

* resolve conflict

* for_codestyle

* fix sample code paddle.linalg.multi_dot
Co-authored-by: Kevin吴嘉文 <417333277@qq.com>

[Cherry-pick][BugFix]Fix set_attr modify underly type (#47500) (#47566) · 58c47e8d

Committed by Aurelius84 on Nov 07, 2022

* Fix set_attr modify underly type (#47500)

* reformat code

* Revert "reformat code"

This reverts commit f11a5d7658633e53c279f11612254937e2d87feb.

04 Nov, 2022 2 commits

[CherryPick] Cherry pick #45916 #46031 #47299 (#47610) · 72e1eb6b

Committed by xiongkun on Nov 04, 2022

* [ Dy2Static ] Fix bugs when select inputs meeting different shape or undefined-var (#45916)

* fix select_input with different shape errors:
1. select_input_with_buildin_type directly return non-undefinedvar branch when meeting undefined var
2. the output shape of select_input is inferred from inputs.

* reverse the logic in select_input

* [warning] added warning message in cond block when one branch returns variable and another returns None (#46031)

* [cherry-pick] Allow manually set py_reader name in standalone executor (#45898) (#45931)

* Allow manually set py_reader name in standalone executor

* [BugFix] while cond receives dict as input (#47299)

* fix bugs while cond receives dict as input

* add unittest

* change flatten -> _is_sequence_except_dict

* code format
Co-authored-by: feifei-111 <wuzhanfei@baidu.com>

[cherry-pick2.4]for CodeStyle (#47608) · cfee9c13
Committed by Ligoml on Nov 04, 2022
```
* only run pre-commit

* only run pre-commit
```
03 Nov, 2022 3 commits
- FC/matmul(v2) + scale fuse pass (#47420) · 99c872fa
  Committed by Sławomir Siwek on Nov 03, 2022
- [Sparse] Unified api args name (#47529) (#47627) · 75088bbf
  Committed by zhangkaihuo on Nov 03, 2022
```
Unified api args name
```
- support unbalanced data for pipeline (#47199) (#47569) · d4bf8b1a
  Committed by ShenLiang on Nov 03, 2022
```
* add unbalanced data

* fix utest
```
01 Nov, 2022 3 commits

[cherry-pick]Fix english documents of sparse api (#47496) · 61953b90
Committed by zhangkaihuo on Nov 01, 2022
```
Fix english documents of sparse api
```

[cherry-pick][code-gen] Support code-gen for opmaker of sparse op (#46993) (#47417) · 601626ac

Committed by zyfncg on Nov 01, 2022

* support generating code of opmaker for backward op invoke forward op (#46912)

* [code-gen] Support code-gen for opmaker of sparse op (#46993)

* support generating code of opmaker for backward op invoke forward op

* support code-gen of opmaker for sparse op

* refine logic of choose phi kernel

* fix compile bug

* fix code_gen bug

* fix bug

* fix kernel signature code-gen

* fix compile bug of VarType

* fix compile bug of VarType

* fix test_sparse_conv_op

* fix test_sparse_norm_op

* [Phi] Refactor logic of judging whether having a phi kernel (#46920)

* refine logic of choose phi kernel

* fix compile bug

* update cmake

add missing scale parameter (#47522) · 5ffd4afe
Committed by sneaxiy on Nov 01, 2022
