1. 16 Mar, 2023 1 commit
    • [Auto Parallel Performance] Support BF16 Training (#51285) · 9ded5707
      Committed by JZ-LIANG
      * update env setting
      
      * update pass logic
      
      * dist op support bf16
      
      * backward cast update
      
      * update setting
      
      * update backward
      
      * revert amp pass
      
      * update fp16 backward logic
      
      * register c_embedding bf16
      
      * revert engine
      
      * add unitest
      
      * add unitest
      
      * update unitest
      
      * update cmake
      
      * update math
      
      * update math.py
      
      * update unitest
      
      * update unitest
      
      * revise unitest
      
      * revise unitest
      
      * update unitest
      
      * update unitest
      
      * update unitest
  2. 27 Feb, 2023 1 commit
  3. 21 Feb, 2023 1 commit
  4. 13 Feb, 2023 1 commit
    • [Auto Parallel] Fix a bug of dist_scale (#50288) · 7f7e9320
      Committed by Yulong Ao
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Merge dist attrs of Python into C++
      
      * [Auto Parallel] Add back deleted importing
      
      * [Auto Parallel] Add back removed unittest
      
      * [Auto Parallel] Remove type qualifiers of return types
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix a bug of the quant pass
      
      * [Auto Parallel] Fix the code style
      
      * [Auto Parallel] Clear some fluid APIs
      
      * [Auto Parallel] Fix a bug of dist_scale
  5. 16 Jan, 2023 1 commit
    • [Auto Parallel] Clear some fluid APIs (#49793) · e70af91d
      Committed by Yulong Ao
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Merge dist attrs of Python into C++
      
      * [Auto Parallel] Add back deleted importing
      
      * [Auto Parallel] Add back removed unittest
      
      * [Auto Parallel] Remove type qualifiers of return types
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix a bug of the quant pass
      
      * [Auto Parallel] Fix the code style
      
      * [Auto Parallel] Clear some fluid APIs
  6. 10 Jan, 2023 1 commit
  7. 06 Jan, 2023 1 commit
    • [Auto Parallel] Merge dist attrs from python into c++ (#49214) · c7899074
      Committed by Yulong Ao
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Merge dist attrs of Python into C++
      
      * [Auto Parallel] Add back deleted importing
      
      * [Auto Parallel] Add back removed unittest
      
      * [Auto Parallel] Remove type qualifiers of return types
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix a bug of the quant pass
      
      * [Auto Parallel] Fix the code style
  8. 04 Jan, 2023 1 commit
    • [Auto Parallel-Performance] Sharding Comm Optimization (#48604) · 5592f8ad
      Committed by JZ-LIANG
      * remove deps and prior comm
      
      * grad comm fuse
      
      * add deps for amp&global norm
      
      * stage2 broadcast prior deps
      
      * stage2 grad overlap
      
      * stream_analyzer bugfix
      
      * overlap enable
      
      * dep op namescope
      
      * depend support multiple inputs
      
      * check finite deps
      
      * stage2 param comm overlap
      
      * Set kD2HStream
      
      * grad comm hierarchical
      
      * grad comm hierarchical
      
      * new unitest
      Co-authored-by: chenruibiao <chenruibiao@baidu.com>
  9. 29 Dec, 2022 1 commit
  10. 28 Dec, 2022 1 commit
    • [AutoParallel] adapt for clip (#49249) · df944772
      Committed by zhaoyingli
      * [AutoParallel] adapt for clip
      
      * fix unittest
      
      * enable_static
      
      * fix dist_fill_constant_batch_size_like
      
      * fix process_mesh.shape
      
      * update cond of modifying shape_list
  11. 26 Dec, 2022 1 commit
    • [Auto Parallel] Merge the python and c++ impls of ProcessMesh (#47503) · 1c0afa79
      Committed by Yulong Ao
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Fix a bug
  12. 29 Nov, 2022 1 commit
  13. 24 Nov, 2022 1 commit
  14. 22 Nov, 2022 1 commit
  15. 15 Nov, 2022 2 commits
  16. 14 Nov, 2022 1 commit
  17. 10 Nov, 2022 1 commit
  18. 03 Nov, 2022 1 commit
  19. 23 Oct, 2022 1 commit
  20. 18 Oct, 2022 1 commit
    • [Auto Parallel]Add parallel tuner (#46189) · 3108ba11
      Committed by caozhou
      * add parallel tuner
      
      * add unittest
      
      * fix unittest
      
      * set timeout of unittest
      
      * set unittest timeout
      
      * fix auto_mode setting
      
      * update unittest
      
      * sync from develop and update unittest
      
      * remove unused import
      
      * update unittest
      
      * update cmakelist
      
      * add unittests
  21. 14 Oct, 2022 1 commit
  22. 12 Oct, 2022 1 commit
    • [CodeStyle][F401] remove unused imports in python/paddle/distributed (#46758) · fe716a0b
      Committed by Nyakku Shigure
      * [CodeStyle][F401] remove unused import in python/paddle/distributed
      
      * remove pass
      
      * empty commit
      
      * Fix ValueError: list.remove(x): x not in list for meta_optimizer_names.
      
      * Fix split import.
      
      * add noqa after meta_optimizers in factory
      
      * restort collective ops
      
      * expand `import *`
      
      * add noqa after required imports
      
      * try to fix APIs without core.ops
      
      * Revert "try to fix APIs without core.ops"
      
      This reverts commit 6172beaf601e84bf61f2490c12c4739f0edaa5eb.
      
      * fix an increment
      
      * empty commit
      
      * add noqa after required imports
      
      * expand `import *`, fix ci error
      Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
  23. 28 Sep, 2022 1 commit
  24. 14 Sep, 2022 2 commits
  25. 13 Sep, 2022 1 commit
  26. 05 Sep, 2022 1 commit
  27. 31 Aug, 2022 1 commit
  28. 25 Aug, 2022 1 commit
  29. 23 Aug, 2022 1 commit
  30. 16 Aug, 2022 2 commits
  31. 12 Aug, 2022 1 commit
  32. 09 Aug, 2022 1 commit
  33. 03 Aug, 2022 1 commit
  34. 29 Jul, 2022 1 commit
  35. 28 Jul, 2022 1 commit
  36. 25 Jul, 2022 1 commit
    • [Auto Parallel] Add dist op cost (#44146) · d0f4465d
      Committed by caozhou
      * update comp cost
      
      * add dist default op cost
      
      * add dist fill constant batch size like op cost
      
      * add elewise op cost
      
      * add fill_constant_batch_size_like op cost unittest
      
      * add unittest and remove fill_constant_batch_size_like grad op cost
      
      * add to cmakelist
      
      * fix unittest bug
  37. 13 Jul, 2022 1 commit