1. 22 May 2023, 1 commit
  2. 18 May 2023, 1 commit
    • [AMP]Master grad in static graph (#53362) · 972581d8
      Authored by shaojie_wang
      * add master gradients on static graph
      
      * add unit test for bf16 master grad static graph
      
      * use float16 as v100 test dtype
      
      * only skip GPU which do not support bf16
      
      * use linear layer to test master grad
      
* 1. push master grad creation before all optimizer ops; 2. remove useless unittests; 3. use a function to create master grad states
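The "master grad" pattern in this commit keeps an FP32 copy of every bf16/fp16 gradient and hands that copy to the optimizer, so small updates are not rounded away in the low-precision format. A minimal NumPy sketch of the idea (the name `create_master_grads` is illustrative, not Paddle's actual API):

```python
import numpy as np

def create_master_grads(low_precision_grads):
    # Cast every low-precision gradient up to float32 once,
    # before any optimizer op runs.
    return [g.astype(np.float32) for g in low_precision_grads]

# fp16 stands in for bf16 here; NumPy has no native bfloat16 dtype.
grads_fp16 = [np.array([1e-4, 2e-4], dtype=np.float16)]
master_grads = create_master_grads(grads_fp16)
assert master_grads[0].dtype == np.float32
```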
3. 16 May 2023, 1 commit
  4. 11 May 2023, 1 commit
  5. 24 Apr 2023, 2 commits
  6. 18 Apr 2023, 1 commit
  7. 14 Apr 2023, 1 commit
    • [AMP] Unify the static amp codes of fp16 and bf16. (#52694) · dfcba7f4
      Authored by Yiqun Liu
      * Unify the static amp codes of fp16 and bf16.
      
      * Polish apis and add unittest.
      
      * Add operator stats collecting tools for program.
      
* Add a check of the number of bfloat16 operators in the unittest.
      
      * Add warning for operator not supported for amp.
      
      * Add testing of BF16 O1 and O2.
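On the user side, the unification boils down to choosing a low-precision dtype for the same AMP levels. A hedged dygraph-style usage sketch, assuming Paddle's public `paddle.amp.auto_cast` (exact keyword names may vary across releases):

```python
import paddle

model = paddle.nn.Linear(4, 4)
opt = paddle.optimizer.SGD(parameters=model.parameters())

# O1 autocasts selected ops; dtype picks the low-precision format
# ('float16' here, 'bfloat16' being the other unified option).
with paddle.amp.auto_cast(level='O1', dtype='float16'):
    loss = model(paddle.rand([2, 4])).mean()
loss.backward()
opt.step()
opt.clear_grad()
```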
8. 06 Apr 2023, 1 commit
    • rem is_compiled_with_npu (#52385) · 7976e2a3
      Authored by Kim Yann
      * rem is_compiled_with_npu
      
* rem npu-related code
      
      * make lint happy
      
      * rem test
      
      * remove some tests
      
      * Update grad_scaler.py
      
      * fix an error
9. 14 Feb 2023, 1 commit
  10. 17 Jan 2023, 1 commit
  11. 12 Jan 2023, 1 commit
  12. 09 Dec 2022, 1 commit
  13. 02 Dec 2022, 1 commit
  14. 08 Nov 2022, 1 commit
  15. 23 Oct 2022, 1 commit
  16. 14 Sep 2022, 1 commit
  17. 05 Jun 2022, 1 commit
    • 【code format check upgrade】 step2: yapf (#42944) · a072fca8
      Authored by Sing_chan
      * use yapf to format all python file
      
* exclude two unittest files from yapf because they rely on writing and reading files, and reformatting would break them
      
* disable diff_py_file because too many diff files caused the subsequent command to fail
18. 28 Apr 2022, 1 commit
  19. 19 Feb 2022, 1 commit
    • Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61
      Authored by sneaxiy
      * add DistributedFusedLamb op
      
      * polish code
      
      * fix compile error
      
* make compatible with pten changes
      
      * fix rocm compile error
      
* improve coverage
      
      * update upstream/develop
      
      * fix cast_with_ptr.h
      
      * add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1
      
      * fix clip before allreduce
      
      * add use_master_param_norm
      
      * code polish
      
      * fix bug
      
      * fix ROCM ci
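As background on what the fused op computes: LAMB rescales each layer's Adam-style step by a per-layer trust ratio ||w||/||u||. A single-tensor NumPy sketch of that rule (bias correction omitted; the real op fuses this across all parameters and shards the work over ranks):

```python
import numpy as np

def lamb_update(w, g, m, v, lr=1e-3, beta1=0.9, beta2=0.999,
                eps=1e-6, wd=0.01):
    # Adam-style first and second moments (bias correction omitted).
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    u = m / (np.sqrt(v) + eps) + wd * w   # step with decoupled weight decay
    # Layer-wise trust ratio: ||w|| / ||u||.
    w_norm, u_norm = np.linalg.norm(w), np.linalg.norm(u)
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    return w - lr * trust * u, m, v

w, m, v = np.ones(3), np.zeros(3), np.zeros(3)
w, m, v = lamb_update(w, np.full(3, 0.1), m, v)
```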
20. 17 Dec 2021, 1 commit
    • Refine some AMP operators for BERT (#37923) · d80fe268
      Authored by sneaxiy
      * support multi precision update for LAMB
      
      * hide some api
      
      * fix ci uts
      
      * fix lamb output of dygraph
      
      * remove some changes to some PR
      
      * try to fix Py3 CI compile error
      
      * fix test_imperative_optimizer, add lars ut, add layer_norm ut
      
      * fix ut, fix format
      
      * fix ut
      
      * fix windows ci
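"Multi precision" above is the master-weight pattern: the optimizer state and parameter update live in FP32 while the network computes in FP16. A hedged NumPy sketch with plain SGD standing in for LAMB (names are illustrative, not the actual kernel):

```python
import numpy as np

def sgd_multi_precision(grad_fp16, master_fp32, lr=0.1):
    # Update the FP32 master weight, then cast back to FP16 for the
    # next forward/backward pass; avoids FP16 round-off on tiny updates.
    master_fp32 = master_fp32 - lr * grad_fp16.astype(np.float32)
    return master_fp32.astype(np.float16), master_fp32

p = np.array([1.0], dtype=np.float16)
master = p.astype(np.float32)
p, master = sgd_multi_precision(np.array([1e-4], dtype=np.float16), master)
```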
21. 17 Aug 2021, 1 commit
  22. 22 Jul 2021, 1 commit
  23. 19 Jul 2021, 1 commit
  24. 16 Jul 2021, 1 commit
  25. 10 Jun 2021, 1 commit
  26. 23 Apr 2021, 1 commit
    • [NPU] refactor check_finite_and_scale npu kernel (#32407) · 39a59dcf
      Authored by Leo Chen
      * refactor_check_finite_and_scale_npu_kernel
      
      * fix compile
      
      * add alloc_float_status op
      
      * add alloc_float_status op
      
      * add FloatStatus for check_finite_and_unscale
      
      * refine code
      
* remove unnecessary logic
      
      * refine for fleet
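For context, check_finite_and_unscale divides each gradient by the loss scale and flags any non-finite value so the optimizer step can be skipped. A minimal NumPy sketch of those semantics (the FloatStatus plumbing added here is NPU-specific and omitted):

```python
import numpy as np

def check_finite_and_unscale(grads, scale):
    # Unscale gradients and flag any inf/nan produced by the
    # fp16/bf16 backward pass.
    found_inf = False
    out = []
    for g in grads:
        g = g / scale
        found_inf = found_inf or not np.isfinite(g).all()
        out.append(g)
    return out, found_inf

grads, found_inf = check_finite_and_unscale(
    [np.array([1024.0, np.inf])], scale=1024.0)
assert found_inf
```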
27. 22 Apr 2021, 1 commit
  28. 21 Apr 2021, 1 commit
  29. 13 Jan 2021, 1 commit
  30. 08 Jan 2021, 1 commit
    • Support pure fp16 training for AMP API. (#29544) · 7f7dfccf
      Authored by Zhen Wang
      * add cast ops before and after unsupported fp16 ops.
      
      * Keep partial net in FP32 pattern.
      
      * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
      
      * Add fp16 support for adam op.
      
      * add multi precision attr for adam.
      
      * Fix the bug of test_multi_precision_fp16_train UT.
      
      * Code format for CI.
      
* Fix the redefinition error for MPTypeTrait on Windows.
      
      * fix bugs of the _create_accumulators func in Momentum.
      
      * fix bug when inserting post cast op.
      
      * Add the update_loss_scaling op in allow_set of UnusedVarCheck.
      
      * Update for ci coverage.
      
      * Add some doc for OptimizerWithMixedPrecision.
      
      * Fix the code style.
      
* Improve the doc of `amp_init`.
      
* Adjust fp16 testing for users who define the inference program separately.
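The first bullet is the core graph rewrite for pure-fp16 (O2) training: each op that cannot run in fp16 is bracketed by a cast-to-fp32 and a cast-back-to-fp16, so the rest of the program stays fp16. A toy sketch of the rule (`Op` and the unsupported set are illustrative, not Paddle's IR):

```python
# Ops that cannot run in fp16 get cast pairs around them (illustrative set).
UNSUPPORTED_FP16 = {"softmax_with_cross_entropy", "reduce_mean"}

class Op:
    def __init__(self, type_):
        self.type = type_

def insert_casts(ops):
    rewritten = []
    for op in ops:
        if op.type in UNSUPPORTED_FP16:
            rewritten.append(("cast", "fp16 -> fp32"))
            rewritten.append((op.type, "runs in fp32"))
            rewritten.append(("cast", "fp32 -> fp16"))
        else:
            rewritten.append((op.type, "runs in fp16"))
    return rewritten

print(insert_casts([Op("matmul"), Op("reduce_mean"), Op("matmul")]))
```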
31. 05 Jan 2021, 1 commit
  32. 09 Dec 2020, 1 commit
  33. 30 Nov 2020, 1 commit
  34. 12 Oct 2020, 1 commit
  35. 14 Sep 2020, 1 commit
    • Update amp_check_finite_and_scale_op and add an updating_loss_scaling op for static graph amp training. (#26240) · d708b210
      Authored by Zhen Wang
      
      * update amp_check_finite_and_scale_op for static_amp.
      
      * use amp_check_finite_and_scale in static graph amp.
      
* set grads to zero when they contain infinite values (for the amp_check_finite_and_scale op).
      
      * add update_loss_scaling op in cpp.
      
      * add update_loss_scaling_op unit test.
      
      * update the doc of the check_finite_and_unscale op
      
* Update the logic for skipping the gradient update when gradients contain infinite values.
      
      * update the way to zero grads.
      
      * update test_update_loss_scaling_op.py
      
* add log info when infinite grads are found.
      
      * add the unit test for UpdateLossScaling Layer.
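The update_loss_scaling op follows the standard dynamic loss-scaling schedule: shrink the scale after inf/nan steps, grow it after a streak of finite ones. A minimal sketch; the counts and ratios below are common defaults used as assumptions, not the op's exact attributes:

```python
def update_loss_scaling(state, found_inf,
                        incr_every_n=1000, incr_ratio=2.0,
                        decr_every_n=2, decr_ratio=0.5):
    # state = {"scale": float, "good": int, "bad": int}
    if found_inf:
        state["good"] = 0
        state["bad"] += 1
        if state["bad"] >= decr_every_n:   # shrink after repeated overflow
            state["scale"] *= decr_ratio
            state["bad"] = 0
    else:
        state["bad"] = 0
        state["good"] += 1
        if state["good"] >= incr_every_n:  # grow after a streak of finite steps
            state["scale"] *= incr_ratio
            state["good"] = 0
    return state

state = {"scale": 32768.0, "good": 0, "bad": 0}
state = update_loss_scaling(state, found_inf=True)
```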
36. 08 Jan 2020, 1 commit
  37. 26 Nov 2019, 1 commit
  38. 15 Oct 2019, 1 commit
  39. 10 Oct 2019, 1 commit