1. 24 Apr, 2023 2 commits
  2. 18 Apr, 2023 1 commit
  3. 14 Apr, 2023 1 commit
    • [AMP] Unify the static amp codes of fp16 and bf16. (#52694) · dfcba7f4
      Yiqun Liu committed
      * Unify the static amp codes of fp16 and bf16.
      
      * Polish apis and add unittest.
      
      * Add operator stats collecting tools for program.
      
      * Add the check of number of bfloat16 operators in unittest.
      
      * Add warning for operator not supported for amp.
      
      * Add testing of BF16 O1 and O2.
  4. 13 Apr, 2023 1 commit
  5. 12 Apr, 2023 2 commits
  6. 10 Apr, 2023 1 commit
  7. 06 Apr, 2023 1 commit
    • rem is_compiled_with_npu (#52385) · 7976e2a3
      Kim Yann committed
      * rem is_compiled_with_npu
      
      * rem npu related code
      
      * make lint happy
      
      * rem test
      
      * remove some tests
      
      * Update grad_scaler.py
      
      * fix an error
  8. 03 Apr, 2023 1 commit
  9. 30 Mar, 2023 3 commits
  10. 15 Mar, 2023 1 commit
  11. 09 Mar, 2023 1 commit
    • Fix hybrid parallel training strategy using bf16 (#51103) · 8db15a42
      Ghost Screaming committed
      * Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
      is wrong.
      
      * Remove climits.
      
      * Fix bug of hybrid parallel strategy with recompute using bf16.
      
      * Fix bug of recompute_hybrid ctx.amp_dtype
      
      * Fix bug of amp_dtype.
      
      * Fix bug of auto_cast.
  12. 24 Feb, 2023 1 commit
    • Revert grad scale optimization pr (#50839) · 8a503522
      Weilong Wu committed
      * Revert "fixoptminizer _set_auxiliary_var bug (#50335)"
      
      This reverts commit c44005f0.
      
      * Revert "refine optimizer create accumulators (#50188)"
      
      This reverts commit 244e7546.
      
      * Revert "fix found_inf bug for custom optimizer (#50158)"
      
      This reverts commit 64573f9f.
      
      * Revert "refine amp scaler found_inf (#49864)"
      
      This reverts commit 382e9a06.
      
      * fix code format
      
      * fix conflict
  13. 13 Feb, 2023 1 commit
  14. 03 Feb, 2023 1 commit
  15. 30 Jan, 2023 1 commit
  16. 19 Jan, 2023 1 commit
    • [KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9
      jameszhang committed
      * [KUNLUN] add op: maxpool_with_index
      
      * use DeviceContext::Alloc() instead of DenseTensor::mutable_data()
      
      * fix file format
      
      * solve clip unittest failure
      
      * minor fix
      
      * Revert "solve clip unittest failure" since the issue is fixed
      in #49535
      
      This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.
      
      * align with xdnn on the definition of mask in max_pool_with_index
      
      * minor
  17. 12 Jan, 2023 1 commit
  18. 11 Jan, 2023 1 commit
  19. 06 Jan, 2023 1 commit
  20. 05 Jan, 2023 1 commit
  21. 15 Dec, 2022 1 commit
    • Fix the documentation of paddle.amp.decorate and other APIs (#48983) · c5af51ca
      mjxs committed
      * APIs involved:
      paddle.amp.decorate
      paddle.static.npu_places
      paddle.signal.istft
      paddle.signal.stft
      paddle.linalg.eigvalsh
      paddle.randint_like
      
      * change signal.stft
      
      * mark low of randint_like as optional
      
      * ; test=docs_preview
      
      * modified the annotation format; test=docs_preview
      
      * modified the formula format
      
      * modified models etc. of decorate
      
      * test=document_fix
      Co-authored-by: NLigoml <39876205+Ligoml@users.noreply.github.com>
  22. 29 Nov, 2022 1 commit
  23. 03 Nov, 2022 1 commit
  24. 23 Oct, 2022 1 commit
  25. 14 Sep, 2022 2 commits
  26. 05 Jun, 2022 1 commit
    • 【code format check upgrade】 step2:yapf (#42944) · a072fca8
      Sing_chan committed
      * use yapf to format all python files
      
      * exclude two unittest files from yapf because they rely on writing and reading files, and formatting would break them
      
      * disable diff_py_file because too many diff files cause the following command to fail
  27. 09 May, 2022 1 commit
  28. 07 Mar, 2022 1 commit
  29. 18 Feb, 2022 1 commit
    • [AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848
      zhangbo9674 committed
      * support dtype param for auto_cast
      
      * add amp_dtype for tracer
      
      * add unsupported bf16 list
      
      * support bf16 amp for O2
      
      * refine python interface for bfloat16
      
      * refine code
      
      * refine code
      
      * refine unittest
      
      * refine code
      
      * refine code
      
      * add bf16 o1
      
      * refine code by comment
      
      * add gradient accumulator
      
      * add recompute
  30. 29 Nov, 2021 1 commit
  31. 21 Oct, 2021 1 commit
  32. 22 Sep, 2021 1 commit
  33. 17 Sep, 2021 1 commit
    • [AMP] Support pure fp16 training mode for dygraph (#35521) · adaeee4d
      zhangbo9674 committed
      * add pure fp16 major function in auto_cast & tracer
      
      * support master weight in dygraph for pure fp16
      
      * check mix dtype of fp16&fp32 for check_finite_and_unscale op
      
      * change pure fp16 function name
      
      * refine some bug in auto_cast
      
      * refine auto_cast interface logic
      
      * add param _casted_by_pure_fp16 for class Layer
      
      * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator
      
      * refine pure_fp16_decorator as decorator
      
      * add unittest
      
      * add comment
      
      * add comment
      
      * support recompute
      
      * add comment for auto_cast and decorator
      
      * support to_static_state_dict for paddle.jit.save
      
      * remove the limit on the number of models and optimizers
      
      * add lookup_table in black_list
      
      * fix momentum and layer state_dict
      
      * fix bug in layer state_dict
      
      * fix bug in layer state_dict_helper
      
      * refine unittest
      
      * refine test_momentun_op
      
      * refine interface and some code
      
      * refine amp_decorator interface
      
      * refine pure fp16 interface
      
      * refine master weight interface
  34. 16 Aug, 2021 1 commit
  35. 11 Aug, 2021 1 commit
    • [AMP] add state_dict and load_state_dict and unittest for class GradScaler (#34300) · 99f8f5c8
      zhangbo9674 committed
      * add state_dict and load_state_dict and unittest for class GradScaler
      
      * refine unittest for coverage of load_state_dict
      
      * refine comments of code-block
      
      * refine some comments
      
      * refine state_dict code and unittest
      
      * add #require gpu, xpu for GradScaler get/set example code
      
      * add #require gpu, xpu for GradScaler get/set example code
      
      * refine example code
      
      * refine unittest for state_dict
      
      * refine unittest for state_dict
      
      * fix bug of DataLoader in TestGradScalerStateDict
      
      * add flag FLAGS_cudnn_deterministic