- 15 May 2023 (1 commit)
  Committed by shaojie_wang
  * fix embedding model weight type mismatch error
  * Update fp16_utils.py
  Co-authored-by: Zhang Ting <zhangting_2017@163.com>

- 12 May 2023 (1 commit)
  Committed by Zhang Ting

- 08 May 2023 (1 commit)
  Committed by Zhang Ting

- 24 Apr 2023 (1 commit)
  Committed by Zhang Ting
  * support promote dtype for static amp training (the promote rule is sketched below)
  * unify O1 and O2
  * update for unittest
  * fix op_role
  * add use_promote arg
  * fix doc
  * add promote unittest
  * polish unittests
  * fix control flow and test

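The "promote" strategy referenced above decides each op's compute dtype from its inputs rather than from a fixed white/black list alone. A minimal sketch of that rule, in plain Python with a simplified dtype model (illustrative assumptions only, not Paddle's implementation):

```python
# Sketch of a promote rule: an op stays in low precision only when every
# floating-point input is already low precision; any float32 input
# promotes the whole op to float32.
def promote_op_dtype(input_dtypes, low_dtype="float16"):
    float_inputs = [d for d in input_dtypes if d in (low_dtype, "float32")]
    if float_inputs and all(d == low_dtype for d in float_inputs):
        return low_dtype
    return "float32"

assert promote_op_dtype(["float16", "float16"]) == "float16"
assert promote_op_dtype(["float16", "float32"]) == "float32"  # promoted
```
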
- 18 Apr 2023 (1 commit)
  Committed by Yiqun Liu
  * Implement a common AmpTestBase.
  * Support overload of decorate.
  * Change the ignore list of flake8 and fix an error.

- 14 Apr 2023 (1 commit)
  Committed by Yiqun Liu
  * Unify the static amp code for fp16 and bf16.
  * Polish apis and add unittest.
  * Add operator stats collecting tools for programs (a toy version is sketched below).
  * Add a check of the number of bfloat16 operators in unittest.
  * Add a warning for operators not supported by amp.
  * Add testing of BF16 O1 and O2.

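A toy version of an operator-stats collector in the spirit of that commit; `collect_op_stats` and the `(op_type, dtype)` record format are assumptions for illustration, not Paddle's API:

```python
from collections import Counter, defaultdict

def collect_op_stats(ops):
    """Count how many times each op type ran in each dtype.

    `ops` is an iterable of (op_type, compute_dtype) pairs; a unittest can
    then assert on, e.g., the expected number of bfloat16 matmuls.
    """
    stats = defaultdict(Counter)
    for op_type, dtype in ops:
        stats[op_type][dtype] += 1
    return stats

stats = collect_op_stats([("matmul", "bfloat16"),
                          ("matmul", "bfloat16"),
                          ("softmax", "float32")])
assert stats["matmul"]["bfloat16"] == 2
```
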
- 12 Apr 2023 (1 commit)
  Committed by qizhaoaoe
  * fix dtype cast in amp.
  * add test case and update docs.
  * remove set_prim.

- 31 Mar 2023 (1 commit)
  Committed by 张春乔
  * autofix
    Co-authored-by: Liyulingyue <83450930+Liyulingyue@users.noreply.github.com>
  * revert changes in python/paddle/distributed/fleet/utils/hybrid_parallel_util.py
  * empty commit to trigger CI
  * fix test_slice
  Co-authored-by: SigureMo <sigure.qaq@gmail.com>

- 16 Mar 2023 (1 commit)
  Committed by liuruyan

- 10 Mar 2023 (1 commit)
  Committed by liuruyan

- 17 Jan 2023 (1 commit)
  Committed by zhangkaihuo

- 12 Jan 2023 (1 commit)
  Committed by zhangkaihuo

- 23 Oct 2022 (1 commit)
  Committed by Nyakku Shigure
  * update config
  * re-blacken python code
  * temporarily disable date and diff_py_file
  * skip a format

- 27 Sep 2022 (1 commit)
  Committed by Nyakku Shigure
  * [CodeStyle] remove all future imports
  * revert test_error.py
  * restore future imports in example code

- 23 Aug 2022 (2 commits)

- 05 Jun 2022 (1 commit)
  Committed by Sing_chan
  * use yapf to format all python files
  * have yapf exclude two unittest files, since they rely on writing and reading files and formatting would break them
  * disable diff_py_file, because too many diff files caused the following command to fail

- 26 Apr 2022 (1 commit)
  Committed by WangXi

- 15 Apr 2022 (1 commit)
  Committed by Allen Guo
  * add mixed-precision support for ipu
  * restore cast_model_to_fp16 api
  * update UTs

- 17 Dec 2021 (1 commit)
  Committed by sneaxiy
  * support multi precision update for LAMB (the master-weight idea is sketched below)
  * hide some apis
  * fix CI UTs
  * fix lamb output of dygraph
  * remove some changes belonging to other PRs
  * try to fix Py3 CI compile error
  * fix test_imperative_optimizer, add lars ut, add layer_norm ut
  * fix ut, fix format
  * fix ut
  * fix windows ci

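The "multi precision" update added here keeps an fp32 master copy of each fp16 parameter so that small updates are not lost to fp16 rounding. A numpy sketch of the idea, using plain SGD as a stand-in for LAMB (the function name and SGD substitution are illustrative assumptions):

```python
import numpy as np

def multi_precision_sgd_step(param_fp16, master_fp32, grad_fp16, lr=0.1):
    # Apply the update against the fp32 master weights, then cast back
    # down to fp16 for the next forward/backward pass.
    master_fp32 -= lr * grad_fp16.astype(np.float32)
    param_fp16[...] = master_fp32.astype(np.float16)

master = np.ones(4, dtype=np.float32)
param = master.astype(np.float16)
# An update this small would vanish entirely in a pure-fp16 accumulator.
multi_precision_sgd_step(param, master, np.full(4, 1e-4, np.float16))
```
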
- 27 Oct 2021 (1 commit)
  Committed by zhangkaihuo
  This PR adds the layer-level code of fused_transformer, including the FusedFeedForward layer and FusedTransformerEncoderLayer.

- 14 Oct 2021 (1 commit)
  Committed by Zhang Zheng

- 05 Aug 2021 (1 commit)
  Committed by WangXi

- 07 May 2021 (1 commit)
  Committed by joanna.wozna.intel
  * Add casting initializers for bf16 training
  * Changes after review
  * Correct test and add comment

- 21 Apr 2021 (1 commit)
  Committed by huangxu96

- 15 Apr 2021 (1 commit)
  Committed by fangshuixun007
  fix test sync_with_cpp (#32212)

- 26 Mar 2021 (1 commit)
  Committed by lilong12
  * update, test=develop

- 13 Jan 2021 (1 commit)
  Committed by huangxu96

- 08 Jan 2021 (1 commit)
  Committed by Zhen Wang
  * add cast ops before and after unsupported fp16 ops (the pass is sketched below).
  * Keep the partial net in FP32 pattern.
  * Support check_finite_and_unscale and update_loss_scaling for the FP16 calculation mode.
  * Add fp16 support for the adam op.
  * add multi precision attr for adam.
  * Fix the bug in the test_multi_precision_fp16_train UT.
  * Code format for CI.
  * Fix the redefinition error about MPTypeTrait on windows.
  * fix bugs in the _create_accumulators func in Momentum.
  * fix bug when inserting post cast op.
  * Add the update_loss_scaling op to the allow_set of UnusedVarCheck.
  * Update for ci coverage.
  * Add some doc for OptimizerWithMixedPrecision.
  * Fix the code style.
  * Improve the doc of `amp_init`.
  * Change fp16 testing for users who have the infer program defined separately.

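The first item above describes a cast-insertion pass: ops that cannot run in fp16 are kept in fp32, with casts added at each precision boundary. A toy sketch of that boundary handling, over a linear list of ops (op names and the program representation are simplified assumptions, not Paddle's internals):

```python
UNSUPPORTED_FP16 = {"softmax_with_cross_entropy"}  # example entry

def insert_casts(ops):
    """Rewrite a linear (op_type, dtype) program so that unsupported ops
    run in fp32, with cast ops added before and after each one."""
    rewritten = []
    for op_type, dtype in ops:
        if dtype == "float16" and op_type in UNSUPPORTED_FP16:
            rewritten.append(("cast", "fp16->fp32"))   # pre cast
            rewritten.append((op_type, "float32"))     # op kept in fp32
            rewritten.append(("cast", "fp32->fp16"))   # post cast
        else:
            rewritten.append((op_type, dtype))
    return rewritten
```
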
- 15 Dec 2020 (1 commit)
  Committed by huangxu96
  * add alias for fluid.contrib.mixed_precision

- 02 Dec 2020 (2 commits)
  Committed by Zhen Wang
  * add the weight decay func for the momentum op
  * Add the multi_precision function to the Momentum Optimizer.
  * Make sure that the initial values of the master weights are the same as the fp16 weights.
  * add static loss scaling.
  * add the rescale_grad function for pure fp16 training.
  * use the original momentum updating method.
  * Polish some code, such as variable names.
  * add docstrings for apis.
  * update the var creation details of _create_master_weight.
  * do not modify code about imperative momentum updating.
  * Fix the error in the test_dist_sparse_tensor_load_momentum UT.
  * add a unit test for multi precision fp16 training.
  * add more unit tests for CI.
  * Use lower threshold values for allclose comparison in the test_multi_precision_fp16_train UT.
  * For CI Coverage Checking.

  Committed by furnace
  * add fp16 for layer_norm op
  * revert layernorm api
  * fix forward
  * fix backward for layernorm with fp16
  * fix unit test for layernorm with fp16
  * fix with_mkldnn compile error for layernorm with fp16
  * 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
  Co-authored-by: zhiqiu <chenqiuliang@baidu.com>

- 04 Nov 2020 (1 commit)
  Committed by Leo Chen
  * skip reader ops in the mixed_precision decorator
  * add ut

- 23 Sep 2020 (1 commit)
  Committed by Zhang Ting
  * add fused_bn_add_relu op

- 14 Sep 2020 (1 commit)
  Committed by Zhen Wang
  Update amp_check_finite_and_scale_op and add an update_loss_scaling op for static graph amp training. (#26240)
  * update amp_check_finite_and_scale_op for static amp.
  * use amp_check_finite_and_scale in static graph amp.
  * set grads to zero when they contain infinite values (in the amp_check_finite_and_scale op).
  * add the update_loss_scaling op in cpp (the scaling rule is sketched below).
  * add an update_loss_scaling_op unit test.
  * update the doc of the check_finite_and_unscale op.
  * Update the logic for skipping the gradient update when the gradients contain infinite values.
  * update the way grads are zeroed.
  * update test_update_loss_scaling_op.py.
  * log info when infinite grads are found.
  * add a unit test for the UpdateLossScaling layer.

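A sketch of the dynamic loss-scaling rule that check_finite_and_unscale and update_loss_scaling implement together: shrink the scale and skip the step on overflow, and grow it again after a long run of finite gradients. The constants here are common defaults, assumed for illustration rather than taken from Paddle:

```python
def update_loss_scaling(scale, good_steps, found_inf,
                        growth_interval=1000, growth=2.0, backoff=0.5):
    """Return (new_scale, new_good_steps) after one training step."""
    if found_inf:
        # Overflow: the optimizer update is skipped, the scale shrinks,
        # and the streak of good steps restarts.
        return max(scale * backoff, 1.0), 0
    good_steps += 1
    if good_steps >= growth_interval:
        # A long run of finite gradients: try a larger scale to keep
        # small gradients representable in fp16.
        return scale * growth, 0
    return scale, good_steps
```
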
- 03 Sep 2020 (1 commit)
  Committed by Zhen Wang

- 15 Apr 2020 (1 commit)
  Committed by mapingshuo
  * allow amp and recompute to work together

- 26 Nov 2019 (1 commit)
  Committed by Zhen Wang
  * fix some typos in AMP. test=develop
  * delete useless code. test=develop

- 30 Oct 2019 (1 commit)
  Committed by gongweibao
  * add custom black varname (sketched below) test=develop
  * fix dtype test=develop
  * fix num test=develop
  * fix ut test=develop
  * fix coverage test=develop
  * fix black var names test=develop

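A sketch of how a custom black-variable list can combine with the usual op black list when deciding which ops must stay in fp32; the structure is an assumption for illustration, not Paddle's implementation:

```python
def keep_fp32(op_type, var_names, black_ops, custom_black_vars):
    # An op stays fp32 if its type is black-listed, or if it reads or
    # writes any variable the user black-listed by name.
    return op_type in black_ops or bool(set(var_names) & custom_black_vars)

assert keep_fp32("matmul", {"loss"}, {"softmax"}, {"loss"})
assert not keep_fp32("matmul", {"x"}, {"softmax"}, {"loss"})
```
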
- 19 Sep 2019 (1 commit)
  Committed by Jie Fang
  Optimize amp for multi-GPU to enable FP16 gradient transfer across GPUs.