- 15 May 2023 (1 commit)
  Committed by shaojie_wang
  * fix embedding model weight type mismatch error
  * Update fp16_utils.py
  Co-authored-by: Zhang Ting <zhangting_2017@163.com>

- 12 May 2023 (1 commit)
  Committed by Zhang Ting

- 08 May 2023 (1 commit)
  Committed by Zhang Ting

- 24 Apr 2023 (1 commit)
  Committed by Zhang Ting
  * support promote dtype for static amp training (the promote rule is sketched below)
  * unify O1 and O2
  * update for unittest
  * fix op_role
  * add use_promote arg
  * fix doc
  * add promote unittest
  * polish unittests
  * fix control flow and test

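The "promote" strategy referenced above decides each op's compute dtype from its inputs rather than from a fixed white/black list alone. A minimal sketch of that rule, in plain Python with a simplified dtype model (illustrative assumptions only, not Paddle's implementation):

```python
# Sketch of a promote rule: an op stays in low precision only when every
# floating-point input is already low precision; any float32 input
# promotes the whole op to float32.
def promote_op_dtype(input_dtypes, low_dtype="float16"):
    float_inputs = [d for d in input_dtypes if d in (low_dtype, "float32")]
    if float_inputs and all(d == low_dtype for d in float_inputs):
        return low_dtype
    return "float32"

assert promote_op_dtype(["float16", "float16"]) == "float16"
assert promote_op_dtype(["float16", "float32"]) == "float32"  # promoted
```
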
- 18 Apr 2023 (1 commit)
  Committed by Yiqun Liu
  * Implement a common AmpTestBase.
  * Support overload of decorate.
  * Change the ignore list of flake8 and fix an error.

- 14 Apr 2023 (1 commit)
  Committed by Yiqun Liu
  * Unify the static amp code for fp16 and bf16.
  * Polish apis and add unittest.
  * Add operator stats collecting tools for programs (a toy version is sketched below).
  * Add a check of the number of bfloat16 operators in unittest.
  * Add a warning for operators not supported by amp.
  * Add testing of BF16 O1 and O2.

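A toy version of an operator-stats collector in the spirit of that commit; `collect_op_stats` and the `(op_type, dtype)` record format are assumptions for illustration, not Paddle's API:

```python
from collections import Counter, defaultdict

def collect_op_stats(ops):
    """Count how many times each op type ran in each dtype.

    `ops` is an iterable of (op_type, compute_dtype) pairs; a unittest can
    then assert on, e.g., the expected number of bfloat16 matmuls.
    """
    stats = defaultdict(Counter)
    for op_type, dtype in ops:
        stats[op_type][dtype] += 1
    return stats

stats = collect_op_stats([("matmul", "bfloat16"),
                          ("matmul", "bfloat16"),
                          ("softmax", "float32")])
assert stats["matmul"]["bfloat16"] == 2
```
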
- 12 Apr 2023 (1 commit)
  Committed by qizhaoaoe
  * fix dtype cast in amp.
  * add test case and update docs.
  * remove set_prim.

- 31 Mar 2023 (1 commit)
  Committed by 张春乔
  * autofix
    Co-authored-by: Liyulingyue <83450930+Liyulingyue@users.noreply.github.com>
  * revert changes in python/paddle/distributed/fleet/utils/hybrid_parallel_util.py
  * empty commit to trigger CI
  * fix test_slice
  Co-authored-by: SigureMo <sigure.qaq@gmail.com>

- 16 Mar 2023 (1 commit)
  Committed by liuruyan

- 10 Mar 2023 (1 commit)
  Committed by liuruyan

- 17 Jan 2023 (1 commit)
  Committed by zhangkaihuo

- 12 Jan 2023 (1 commit)
  Committed by zhangkaihuo

- 23 Oct 2022 (1 commit)
  Committed by Nyakku Shigure
  * update config
  * re-blacken python code
  * temporarily disable date and diff_py_file
  * skip a format

- 27 Sep 2022 (1 commit)
  Committed by Nyakku Shigure
  * [CodeStyle] remove all future imports
  * revert test_error.py
  * restore future imports in example code

- 23 Aug 2022 (2 commits)

- 05 Jun 2022 (1 commit)
  Committed by Sing_chan
  * use yapf to format all python files
  * have yapf exclude two unittest files, since they rely on writing and reading files and formatting would break them
  * disable diff_py_file, because too many diff files caused the following command to fail

- 26 Apr 2022 (1 commit)
  Committed by WangXi

- 15 Apr 2022 (1 commit)
  Committed by Allen Guo
  * add mixed-precision support for ipu
  * restore cast_model_to_fp16 api
  * update UTs

- 17 Dec 2021 (1 commit)
  Committed by sneaxiy
  * support multi precision update for LAMB (the master-weight idea is sketched below)
  * hide some apis
  * fix CI UTs
  * fix lamb output of dygraph
  * remove some changes belonging to other PRs
  * try to fix Py3 CI compile error
  * fix test_imperative_optimizer, add lars ut, add layer_norm ut
  * fix ut, fix format
  * fix ut
  * fix windows ci

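The "multi precision" update added here keeps an fp32 master copy of each fp16 parameter so that small updates are not lost to fp16 rounding. A numpy sketch of the idea, using plain SGD as a stand-in for LAMB (the function name and SGD substitution are illustrative assumptions):

```python
import numpy as np

def multi_precision_sgd_step(param_fp16, master_fp32, grad_fp16, lr=0.1):
    # Apply the update against the fp32 master weights, then cast back
    # down to fp16 for the next forward/backward pass.
    master_fp32 -= lr * grad_fp16.astype(np.float32)
    param_fp16[...] = master_fp32.astype(np.float16)

master = np.ones(4, dtype=np.float32)
param = master.astype(np.float16)
# An update this small would vanish entirely in a pure-fp16 accumulator.
multi_precision_sgd_step(param, master, np.full(4, 1e-4, np.float16))
```
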
- 27 Oct 2021 (1 commit)
  Committed by zhangkaihuo
  This PR adds the layer-level code of fused_transformer, including the FusedFeedForward layer and FusedTransformerEncoderLayer.

- 14 Oct 2021 (1 commit)
  Committed by Zhang Zheng

- 05 Aug 2021 (1 commit)
  Committed by WangXi

- 07 May 2021 (1 commit)
  Committed by joanna.wozna.intel
  * Add casting initializers for bf16 training
  * Changes after review
  * Correct test and add comment

- 21 Apr 2021 (1 commit)
  Committed by huangxu96

- 15 Apr 2021 (1 commit)
  Committed by fangshuixun007
  fix test sync_with_cpp (#32212)

- 26 Mar 2021 (1 commit)
  Committed by lilong12
  * update, test=develop

- 13 Jan 2021 (1 commit)
  Committed by huangxu96

- 08 Jan 2021 (1 commit)
  Committed by Zhen Wang
  * add cast ops before and after unsupported fp16 ops (the pass is sketched below).
  * Keep the partial net in FP32 pattern.
  * Support check_finite_and_unscale and update_loss_scaling for the FP16 calculation mode.
  * Add fp16 support for the adam op.
  * add multi precision attr for adam.
  * Fix the bug in the test_multi_precision_fp16_train UT.
  * Code format for CI.
  * Fix the redefinition error about MPTypeTrait on windows.
  * fix bugs in the _create_accumulators func in Momentum.
  * fix bug when inserting post cast op.
  * Add the update_loss_scaling op to the allow_set of UnusedVarCheck.
  * Update for ci coverage.
  * Add some doc for OptimizerWithMixedPrecision.
  * Fix the code style.
  * Improve the doc of `amp_init`.
  * Change fp16 testing for users who have the infer program defined separately.

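The first item above describes a cast-insertion pass: ops that cannot run in fp16 are kept in fp32, with casts added at each precision boundary. A toy sketch of that boundary handling, over a linear list of ops (op names and the program representation are simplified assumptions, not Paddle's internals):

```python
UNSUPPORTED_FP16 = {"softmax_with_cross_entropy"}  # example entry

def insert_casts(ops):
    """Rewrite a linear (op_type, dtype) program so that unsupported ops
    run in fp32, with cast ops added before and after each one."""
    rewritten = []
    for op_type, dtype in ops:
        if dtype == "float16" and op_type in UNSUPPORTED_FP16:
            rewritten.append(("cast", "fp16->fp32"))   # pre cast
            rewritten.append((op_type, "float32"))     # op kept in fp32
            rewritten.append(("cast", "fp32->fp16"))   # post cast
        else:
            rewritten.append((op_type, dtype))
    return rewritten
```
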
- 15 Dec 2020 (1 commit)
  Committed by huangxu96
  * add alias for fluid.contrib.mixed_precision

- 02 Dec 2020 (2 commits)
  Committed by Zhen Wang
  * add the weight decay func for the momentum op
  * Add the multi_precision function to the Momentum Optimizer.
  * Make sure that the initial values of the master weights are the same as the fp16 weights.
  * add static loss scaling.
  * add the rescale_grad function for pure fp16 training.
  * use the original momentum updating method.
  * Polish some code, such as variable names.
  * add docstrings for apis.
  * update the var creation details of _create_master_weight.
  * do not modify code about imperative momentum updating.
  * Fix the error in the test_dist_sparse_tensor_load_momentum UT.
  * add a unit test for multi precision fp16 training.
  * add more unit tests for CI.
  * Use lower threshold values for allclose comparison in the test_multi_precision_fp16_train UT.
  * For CI Coverage Checking.

  Committed by furnace
  * add fp16 for layer_norm op
  * revert layernorm api
  * fix forward
  * fix backward for layernorm with fp16
  * fix unit test for layernorm with fp16
  * fix with_mkldnn compile error for layernorm with fp16
  * 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
  Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

- 04 Nov 2020 (1 commit)
  Committed by Leo Chen
  * skip reader ops in the mixed_precision decorator
  * add ut

- 23 Sep 2020 (1 commit)
  Committed by Zhang Ting
  * add fused_bn_add_relu op

- 14 Sep 2020 (1 commit)
  Committed by Zhen Wang
  Update amp_check_finite_and_scale_op and add an update_loss_scaling op for static graph amp training. (#26240)
  * update amp_check_finite_and_scale_op for static amp.
  * use amp_check_finite_and_scale in static graph amp.
  * set grads to zero when they contain infinite values (in the amp_check_finite_and_scale op).
  * add the update_loss_scaling op in cpp (the scaling rule is sketched below).
  * add an update_loss_scaling_op unit test.
  * update the doc of the check_finite_and_unscale op.
  * Update the logic for skipping the gradient update when the gradients contain infinite values.
  * update the way grads are zeroed.
  * update test_update_loss_scaling_op.py.
  * log info when infinite grads are found.
  * add a unit test for the UpdateLossScaling layer.

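A sketch of the dynamic loss-scaling rule that check_finite_and_unscale and update_loss_scaling implement together: shrink the scale and skip the step on overflow, and grow it again after a long run of finite gradients. The constants here are common defaults, assumed for illustration rather than taken from Paddle:

```python
def update_loss_scaling(scale, good_steps, found_inf,
                        growth_interval=1000, growth=2.0, backoff=0.5):
    """Return (new_scale, new_good_steps) after one training step."""
    if found_inf:
        # Overflow: the optimizer update is skipped, the scale shrinks,
        # and the streak of good steps restarts.
        return max(scale * backoff, 1.0), 0
    good_steps += 1
    if good_steps >= growth_interval:
        # A long run of finite gradients: try a larger scale to keep
        # small gradients representable in fp16.
        return scale * growth, 0
    return scale, good_steps
```
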
- 03 Sep 2020 (1 commit)
  Committed by Zhen Wang

- 15 Apr 2020 (1 commit)
  Committed by mapingshuo
  * allow amp and recompute to work together

- 26 Nov 2019 (1 commit)
  Committed by Zhen Wang
  * fix some typos in AMP. test=develop
  * delete useless code. test=develop

- 30 Oct 2019 (1 commit)
  Committed by gongweibao
  * add custom black varname (sketched below) test=develop
  * fix dtype test=develop
  * fix num test=develop
  * fix ut test=develop
  * fix coverage test=develop
  * fix black var names test=develop

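A sketch of how a custom black-variable list can combine with the usual op black list when deciding which ops must stay in fp32; the structure is an assumption for illustration, not Paddle's implementation:

```python
def keep_fp32(op_type, var_names, black_ops, custom_black_vars):
    # An op stays fp32 if its type is black-listed, or if it reads or
    # writes any variable the user black-listed by name.
    return op_type in black_ops or bool(set(var_names) & custom_black_vars)

assert keep_fp32("matmul", {"loss"}, {"softmax"}, {"loss"})
assert not keep_fp32("matmul", {"x"}, {"softmax"}, {"loss"})
```
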
- 19 Sep 2019 (1 commit)
  Committed by Jie Fang
  Optimize amp for multi-GPU to enable FP16 gradient transfer across GPUs.