- 14 Apr, 2023 1 commit
Committed by Yiqun Liu
* Unify the static amp code of fp16 and bf16.
* Polish apis and add unittest.
* Add operator stats collecting tools for program.
* Add a check on the number of bfloat16 operators in unittest.
* Add a warning for operators not supported by amp.
* Add testing of BF16 O1 and O2.
-
- 06 Apr, 2023 1 commit
Committed by Kim Yann
* remove is_compiled_with_npu
* remove npu related code
* make lint happy
* remove test
* remove some tests
* Update grad_scaler.py
* fix an error
-
- 14 Feb, 2023 1 commit
Committed by mhy-666
-
- 17 Jan, 2023 1 commit
Committed by zhangkaihuo
-
- 12 Jan, 2023 1 commit
Committed by zhangkaihuo
-
- 09 Dec, 2022 1 commit
Committed by cyber-pioneer
-
- 02 Dec, 2022 1 commit
Committed by heyanru
-
- 08 Nov, 2022 1 commit
Committed by Nyakku Shigure
* [CodeStyle][py2][U004] unnecessary explicit `object` inheritance in class definition
* fix an increment
-
- 23 Oct, 2022 1 commit
Committed by Nyakku Shigure
* update config
* re-blacken python code
* temporarily disable date and diff_py_file
* skip a format
-
- 14 Sep, 2022 1 commit
Committed by Nyakku Shigure
* trim trailing whitespace
* fix `.cmake-format.py`
* revert npu ut changes, avoid npu ci error
-
- 05 Jun, 2022 1 commit
Committed by Sing_chan
* use yapf to format all python files
* exclude two unittest files from yapf because they rely on writing and reading files, and formatting would break them
* disable diff_py_file because too many diff files cause the following command to fail
-
- 28 Apr, 2022 1 commit
Committed by sneaxiy
* add gradient merge for DistributedFusedLamb
* use master acc gradient
* fix CI ut
* polish
* remove math_function_impl.h change
* fix test_update_loss_scaling_op.py
* try to fix XPU/NPU CI
* add gm ut
-
- 19 Feb, 2022 1 commit
Committed by sneaxiy
* add DistributedFusedLamb op
* polish code
* fix compile error
* make compatible with pten changes
* fix rocm compile error
* improve coverage
* update upstream/develop
* fix cast_with_ptr.h
* add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1
* fix clip before allreduce
* add use_master_param_norm
* code polish
* fix bug
* fix ROCM ci
-
- 17 Dec, 2021 1 commit
Committed by sneaxiy
* support multi precision update for LAMB
* hide some api
* fix ci uts
* fix lamb output of dygraph
* remove some changes to some PR
* try to fix Py3 CI compile error
* fix test_imperative_optimizer, add lars ut, add layer_norm ut
* fix ut, fix format
* fix ut
* fix windows ci
-
- 17 Aug, 2021 1 commit
Committed by Roc
-
- 22 Jul, 2021 1 commit
Committed by Leo Chen
* copy found_inf to cpu in advance to improve performance
* add npu test
* add npu test
* refine code
* refine memcpy op
* fix adam
-
- 19 Jul, 2021 1 commit
Committed by Leo Chen
* pass found_inf to adam
* add unittest
* fix bug
* refine unittest
* change unit test's directory
* disable unittest on cpu
-
- 16 Jul, 2021 1 commit
Committed by Leo Chen
* add clear_float_status op
* refine infershape
* fix typo
* refine check_finite_and_scale
* refine code
-
- 10 Jun, 2021 1 commit
Committed by Baibaifan
-
- 23 Apr, 2021 1 commit
Committed by Leo Chen
* refactor_check_finite_and_scale_npu_kernel
* fix compile
* add alloc_float_status op
* add alloc_float_status op
* add FloatStatus for check_finite_and_unscale
* refine code
* remove unnecessary logic
* refine for fleet
-
- 22 Apr, 2021 1 commit
Committed by Yuang Liu
-
- 21 Apr, 2021 1 commit
Committed by Yuang Liu
-
- 13 Jan, 2021 1 commit
Committed by huangxu96
-
- 08 Jan, 2021 1 commit
Committed by Zhen Wang
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change for fp16 testing if users have the infer program defined in a separate way.
-
- 05 Jan, 2021 1 commit
Committed by WangXi
-
- 09 Dec, 2020 1 commit
Committed by Aurelius84
-
- 30 Nov, 2020 1 commit
Committed by WangXi
-
- 12 Oct, 2020 1 commit
Committed by WangXi
-
- 14 Sep, 2020 1 commit
Committed by Zhen Wang
Update amp_check_finite_and_scale_op and add an updating_loss_scaling op for static graph amp training. (#26240)
* update amp_check_finite_and_scale_op for static_amp.
* use amp_check_finite_and_scale in static graph amp.
* set grads to zero when grads contain infinite values (as for the amp_check_finite_and_scale op).
* add update_loss_scaling op in cpp.
* add update_loss_scaling_op unit test.
* update the doc of the check_finite_and_unscale op
* Update the process of skipping gradient updates when the gradients contain infinite values.
* update the way to zero grads.
* update test_update_loss_scaling_op.py
* add log info when infinite grads are found.
* add the unit test for UpdateLossScaling Layer.
-
- 08 Jan, 2020 1 commit
Committed by gongweibao
-
- 26 Nov, 2019 1 commit
Committed by Zhen Wang
* fix some typos in AMP. test=develop
* delete useless code. test=develop
-
- 15 Oct, 2019 1 commit
Committed by gongweibao
-
- 10 Oct, 2019 1 commit
Committed by gongweibao
-
- 19 Sep, 2019 1 commit
Committed by Jie Fang
Optimize amp for multi-gpu to enable FP16 gradient transfer across gpus
-
- 10 Sep, 2019 1 commit
Committed by gongweibao
Fix float16 optimizer
-
- 06 Sep, 2019 1 commit
Committed by Jie Fang
init new amp, optimize inserting cast op for batchnorm
-
- 28 Jun, 2019 1 commit
Committed by Jie Fang
test=develop
-
- 25 Jun, 2019 1 commit
Committed by Jie Fang
test=develop
-
- 16 May, 2019 1 commit
Committed by Jie Fang
* init auto loss scaling test=develop
* change API.spec
* change ifelse to switch and use reduce_sum to optimize checking isfinite test=develop
* Remove redundant code test=develop
-
- 25 Apr, 2019 1 commit
Committed by Yibing Liu
* Init mixed precision training interface
* Add fp16 test script test=develop
* All initializers support float16 test=develop
* Code cleanup & add more code annotations test=develop
* Update API spec test=develop
* Add usage example in doc test=develop
-