1. 11 Jul, 2023 1 commit
    • support sharding parallel (#54634) · b7a05057
      pangengzheng committed
      * support sharding parallel
      
      * fix name
      
      * fix
      
      * update
      
      * test amp for sharding
      
      ---------
      
      Co-authored-by: pangengzheng <pangengzheng.baidu.com>
  2. 16 May, 2023 1 commit
  3. 08 May, 2023 1 commit
  4. 27 Apr, 2023 1 commit
  5. 24 Apr, 2023 2 commits
  6. 18 Apr, 2023 1 commit
  7. 12 Apr, 2023 1 commit
  8. 10 Apr, 2023 1 commit
  9. 06 Apr, 2023 1 commit
    • rem is_compiled_with_npu (#52385) · 7976e2a3
      Kim Yann committed
      * rem is_compiled_with_npu
      
      * rem npu related code
      
      * make lint happy
      
      * rem test
      
      * remove some tests
      
      * Update grad_scaler.py
      
      * fix an error
  10. 03 Apr, 2023 1 commit
  11. 30 Mar, 2023 2 commits
  12. 09 Mar, 2023 1 commit
    • Fix hybrid parallel training strategy using bf16 (#51103) · 8db15a42
      Ghost Screaming committed
      * Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
      is wrong.
      
      * Remove climits.
      
      * Fix bug of hybrid parallel strategy with recompute using bf16.
      
      * Fix bug of recompute_hybrid ctx.amp_dtype
      
      * Fix bug of amp_dtype.
      
      * Fix bug of auto_cast.
  13. 13 Feb, 2023 1 commit
  14. 19 Jan, 2023 1 commit
    • [KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9
      jameszhang committed
      * [KUNLUN] add op: maxpool_with_index
      
      * use DeviceContext::Alloc() instead of DenseTensor::mutable_data()
      
      * fix file format
      
      * solve clip unittest failure
      
      * minor fix
      
      * Revert "solve clip unittest failure" since the issue is fixed
      in #49535
      
      This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.
      
      * align with xdnn on the definition of mask in max_pool_with_index
      
      * minor
  15. 12 Jan, 2023 1 commit
  16. 11 Jan, 2023 1 commit
  17. 06 Jan, 2023 1 commit
  18. 05 Jan, 2023 1 commit
  19. 15 Dec, 2022 1 commit
    • Fix the documentation of paddle.amp.decorate and other APIs (#48983) · c5af51ca
      mjxs committed
      * The APIs involved are
      paddle.amp.decorate
      paddle.static.npu_places
      paddle.signal.istft
      paddle.signal.stft
      paddle.linalg.eigvalsh
      paddle.randint_like
      
      * change signal.stft
      
      * mark the low parameter of randint_like as optional
      
      * ; test=docs_preview
      
      * fixed the annotation format; test=docs_preview
      
      * fixed the formula format
      
      * updated models and other parameters of decorate
      
      * test=document_fix
      Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
  20. 29 Nov, 2022 1 commit
  21. 23 Oct, 2022 1 commit
  22. 14 Sep, 2022 2 commits
  23. 09 May, 2022 1 commit
  24. 07 Mar, 2022 1 commit
  25. 18 Feb, 2022 1 commit
    • [AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848
      zhangbo9674 committed
      * support dtype param for auto_cast
      
      * add amp_dtype for tracer
      
      * add unsupported bf16 list
      
      * support bf16 amp for O2
      
      * refine python interface for bfloat16
      
      * refine code
      
      * refine code
      
      * refine unittest
      
      * refine code
      
      * refine code
      
      * add bf16 o1
      
      * refine code by comment
      
      * add gradient accumulator
      
      * add recompute
  26. 29 Nov, 2021 1 commit
  27. 17 Sep, 2021 1 commit
    • [AMP] Support pure fp16 training mode for dygraph (#35521) · adaeee4d
      zhangbo9674 committed
      * add pure fp16 major function in auto_cast & tracer
      
      * support master weight in dygraph for pure fp16
      
      * check mix dtype of fp16&fp32 for check_finite_and_unscale op
      
      * change pure fp16 function name
      
      * refine some bug in auto_cast
      
      * refine auto_cast interface logic
      
      * add param _casted_by_pure_fp16 for class Layer
      
      * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator
      
      * refine pure_fp16_decorator as decorator
      
      * add unittest
      
      * add comment
      
      * add comment
      
      * support recompute
      
      * add comment for auto_cast and decorator
      
      * support to_static_state_dict for paddle.jit.save
      
      * unlimite models num and optimizers num
      
      * add lookup_table in black_list
      
      * fix momentum and layer state_dict
      
      * fix bug in layer state_dict
      
      * fix bug in layer state_dict_helper
      
      * refine unittest
      
      * refine test_momentun_op
      
      * refine interface and some code
      
      * refine amp_decorator interface
      
      * refine pure fp16 interface
      
      * refine master weight interface
  28. 11 Jun, 2021 1 commit
  29. 27 Apr, 2021 1 commit
  30. 28 Oct, 2020 1 commit
  31. 21 Oct, 2020 1 commit
    • 2.0rc api rename (#28088) · 7c1aa0d6
      cnn committed
      * rename manual_seed to seed
      
      * rename xxx1d-->xxx1D, xxx2d-->xxx2D, xxx3d-->xxx3D
      
      * rename manual_seed --> seed
      
      * do not rename .cc, .cu and .h file
      
      * rename manual_seed --> seed
      
      * rename manual_seed --> seed
      
      * rename manual_seed --> seed
      
      * rename manual_seed --> seed
      
      * disable_static on doc example code
      
      * donot change manual_seed on generator
      
      * add enable_static on sample code
      
      * convert python/paddle/fluid/layers/nn.py to bak
      
      * fix typo
      
      * fix code style
      
      * fix seed to manual_seed when call functions of Generator()
      
      * fix bug
  32. 30 Sep, 2020 1 commit