1. 16 Mar, 2023 1 commit
    • [Auto Parallel Performance] Support BF16 Training (#51285) · 9ded5707
      JZ-LIANG committed
      * update env setting
      
      * update pass logic
      
      * dist op support bf16
      
      * backward cast update
      
      * update setting
      
      * update backward
      
      * revert amp pass
      
      * update fp16 backward logic
      
      * register c_embedding bf16
      
      * revert engine
      
      * add unitest
      
      * add unitest
      
      * update unitest
      
      * update cmake
      
      * update math
      
      * update math.py
      
      * update unitest
      
      * update unitest
      
      * revise unitest
      
      * revise unitest
      
      * update unitest
      
      * update unitest
      
      * update unitest
  2. 13 Mar, 2023 1 commit
  3. 09 Mar, 2023 1 commit
    • Remove paddle.fluid.layers.utils.* (#51033) · 86e990d4
      zqw_1997 committed
      * move fluid.utils to paddle.utils.layers_utils
      
      * fix error
      
      * delete original fluid layers utils
      
      * remove import and old utils
      
      * remove more old utils import
      
      * change import path of fill_constant in the layers_utils.py
      
      * fix mistake
      
      * fix error
      
      * expose in __init__.py
      
      * for comment
      
      * when change the ref of func is_sequence, it should change to the root of is_sequence instead
      
      * for codecheck
  4. 08 Mar, 2023 1 commit
  5. 06 Mar, 2023 1 commit
  6. 27 Feb, 2023 3 commits
  7. 22 Feb, 2023 1 commit
  8. 21 Feb, 2023 1 commit
  9. 16 Feb, 2023 1 commit
  10. 15 Feb, 2023 1 commit
    • align tool (#49865) · 4632ca13
      xu98bin committed
      * auto parallel align tool
      
      * modify function get_var's return
      
      * add save and load in align_tool
      
      * modify load function and save function
      
      * add finding different ops in align tool
      
      * full auto parallel align tool
      
      add test file for auto parallel align tool
      
      set timeout for test
      
      modify get_backward_tmp_var function
      
      add annotation for align tool
      
      modify test file
      
      modify code to restart CI
      
      remove timeout
      
      * set timeout
  11. 13 Feb, 2023 1 commit
    • [Auto Parallel] Fix a bug of dist_scale (#50288) · 7f7e9320
      Yulong Ao committed
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Merge dist attrs of Python into C++
      
      * [Auto Parallel] Add back deleted importing
      
      * [Auto Parallel] Add back removed unittest
      
      * [Auto Parallel] Remove type qualifiers of return types
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix a bug of the quant pass
      
      * [Auto Parallel] Fix the code style
      
      * [Auto Parallel] Clear some fluid APIs
      
      * [Auto Parallel] Fix a bug of dist_scale
  12. 09 Feb, 2023 1 commit
    • remove paddle.fluid.dygraph.parallel.ParallelEnv (#50157) · 9dd1f4bf
      zqw_1997 committed
      * remove dygraph.parallel.ParallelEnv
      
      * logger.py error: AttributeError: module 'paddle' has no attribute 'distributed'
      
      * move the implementation to the root folder
      
      * logger.py import ParallelEnv from paddle.parallel to avoid circular import
      
      * add the comment of why import ParallelEnv from paddle.parallel in logger.py and remove the api interface in the paddle/parallel.py
      
      * outdated Env and note removed
      
      * decouple the logger.py and ParallelEnv
      
      * remove another ref of parallel in init.py
  13. 03 Feb, 2023 1 commit
  14. 02 Feb, 2023 1 commit
  15. 16 Jan, 2023 1 commit
    • [Auto Parallel] Clear some fluid APIs (#49793) · e70af91d
      Yulong Ao committed
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Merge dist attrs of Python into C++
      
      * [Auto Parallel] Add back deleted importing
      
      * [Auto Parallel] Add back removed unittest
      
      * [Auto Parallel] Remove type qualifiers of return types
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix a bug of the quant pass
      
      * [Auto Parallel] Fix the code style
      
      * [Auto Parallel] Clear some fluid APIs
  16. 12 Jan, 2023 1 commit
  17. 11 Jan, 2023 1 commit
    • add FusedLinear pass (#49606) · 0f08a432
      yuehuayingxueluo committed
      * add FusedLinear pass
      
      * add fused_op_list and rename PASSES to OP_FUSION
      
      * add fused_passes_list to constants.py
      
      * add test_passes.py
      
      * fix test_fused_passes.py
      
      * fix add if float(paddle.version.cuda()) >= 11.6:
      
      * renamed test_fused_passes.py
      
      * fix CMakeList.txt
  18. 10 Jan, 2023 1 commit
  19. 07 Jan, 2023 1 commit
    • Enable standalone executor for fleet training (#49293) · 67fc8e93
      Ruibiao Chen committed
      * Enable standalone executor for fleet training
      
      * Update code
      
      * Replace use_standalone_executor utils in auto parallel
      
      * Update code
      
      * Disable standalone executor for test_pass_sharding
      
      * Update code
      
      * Set sequential run for auto parallel
      
      * Fix dist_attr bug
      
      * Set sequential run for auto parallel
  20. 06 Jan, 2023 1 commit
    • [Auto Parallel] Merge dist attrs from python into c++ (#49214) · c7899074
      Yulong Ao committed
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Merge dist attrs of Python into C++
      
      * [Auto Parallel] Add back deleted importing
      
      * [Auto Parallel] Add back removed unittest
      
      * [Auto Parallel] Remove type qualifiers of return types
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix a bug of the quant pass
      
      * [Auto Parallel] Fix the code style
  21. 04 Jan, 2023 1 commit
    • [Auto Parallel-Performance] Sharding Comm Optimization (#48604) · 5592f8ad
      JZ-LIANG committed
      * remove deps and prior comm
      
      * grad comm fuse
      
      * add deps for amp&global norm
      
      * stage2 broadcast prior deps
      
      * stage2 grad overlap
      
      * stream_analyzer bugfix
      
      * overlap enable
      
      * dep op namescope
      
      * depend support multiple inputs
      
      * check finite deps
      
      * stage2 param comm overlap
      
      * Set kD2HStream
      
      * grad comm hierarchical
      
      * grad comm hierarchical
      
      * new unitest
      Co-authored-by: chenruibiao <chenruibiao@baidu.com>
  22. 30 Dec, 2022 2 commits
  23. 29 Dec, 2022 1 commit
  24. 28 Dec, 2022 1 commit
    • [AutoParallel] adapt for clip (#49249) · df944772
      zhaoyingli committed
      * [AutoParallel] adapt for clip
      
      * fix unittest
      
      * enable_static
      
      * fix dist_fill_constant_batch_size_like
      
      * fix process_mesh.shape
      
      * update cond of modifying shape_list
  25. 27 Dec, 2022 2 commits
  26. 26 Dec, 2022 1 commit
    • [Auto Parallel] Merge the python and c++ impls of ProcessMesh (#47503) · 1c0afa79
      Yulong Ao committed
      * [Auto Parallel] Rename methods of ProcessMesh
      
      * [Auto Parallel] Impl the python process_mesh by the c++ one
      
      * [Auto Parallel] Add some minor modifications
      
      * [Auto Parallel] Rename some methods
      
      * [Auto Parallel] Remove unnecessary codes
      
      * [Auto Parallel] Add back some removed files
      
      * [Auto Parallel] Fix bugs
      
      * [Auto Parallel] Fix a bug
      
      * Update process_mesh.cc
      
      * [Auto Parallel] Fix a bug
  27. 25 Dec, 2022 1 commit
  28. 21 Dec, 2022 2 commits
  29. 14 Dec, 2022 2 commits
    • [AutoParallel] recompute tuning (#48608) · 170a31f9
      zhaoyingli committed
      * [AutoParallel] recompute tuning
      
      * fix conflict
      
      * update comment
      
      * bug fix
      
      * update rc algo
      
      * tiny fix
      
      * fix clear process_group
      
      * remove comment
      
      * update segment print
      
      * fix import OpRole
      
      * adapt amp pass and grad_clip pass for opt_tuner
      
      * update tuning config
      
      * fix import
      
      * annotate recompute info on ops and upgrade recompute pass
      
      * add op_namescope for seed op
      
      * record reserved vars
      
      * fix recompute var's dist_attr
      
      * fix strategy unittest
      
      * adapt for fp16
      
      * update unittest
      
      * revert copy opt
      
      * update unittest
      
      * rename set_recompute_segments
      
      * fix unittest
    • [Bugfix] recompute dep filter param (#49010) · b9fad5da
      JZ-LIANG committed
      * recompute dep filter param
      
      * recompute dep for reshard
  30. 12 Dec, 2022 1 commit
  31. 08 Dec, 2022 1 commit
  32. 05 Dec, 2022 1 commit
  33. 29 Nov, 2022 2 commits