1. 18 October 2022 (2 commits)
    • Cherry pick for sharding (#47061) · 5b642140
      Authored by Yuang Liu
      * [dygraph sharding] Overlap the reduce and the calculation for sharding stage 2. (#46495)
      
      * [dygraph sharding stage 2] sharding broadcast overlap (#46656)
      
      * Multi groups for broadcast of sharding stage 2 (#46894)
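As background for what "overlap the reduce and the calculation" means in sharding stage 2, here is a minimal pure-Python sketch. All names (`shard_owner`, `stage2_reduce`) are hypothetical, not Paddle's implementation: each rank owns the reduced gradient for a slice of the parameters, so the reduce of one parameter's gradient can be issued as soon as its backward finishes, while the remaining gradients are still being computed. The sketch models only the ownership and reduction, not the asynchrony itself.

```python
def shard_owner(param_index, world_size):
    """Round-robin assignment of parameters to ranks (an assumption)."""
    return param_index % world_size

def stage2_reduce(grads_per_rank):
    """Sum each parameter's gradient across ranks, keeping the result
    only on the rank that owns that parameter's optimizer state."""
    world_size = len(grads_per_rank)
    num_params = len(grads_per_rank[0])
    sharded = [dict() for _ in range(world_size)]
    for p in range(num_params):
        total = sum(rank_grads[p] for rank_grads in grads_per_rank)
        sharded[shard_owner(p, world_size)][p] = total
    return sharded

# Two ranks, three parameters: rank 0 owns params 0 and 2, rank 1 owns param 1.
print(stage2_reduce([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
# -> [{0: 5.0, 2: 9.0}, {1: 7.0}]
```

Because each rank keeps only its own shard, the per-parameter reduces are independent and can be interleaved with backward computation, which is the overlap the commit title refers to.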
    • [cherry-pick] Fix perf issues of mp/pp/fuse in eager mode (#47071) · b84edd90
      Authored by Haohongxiang
      * [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)
      
      * [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)
      
      * update
  2. 17 October 2022 (1 commit)
    • [Cherry-pick] Collective communication APIs (#46922) · 5fba2a98
      Authored by Wen Sun
      * Support both use_calc_stream and sync_op in send recv APIs (#46023)
      
      * Support both use_calc_stream and sync_op in allgather API (#46295)
      
      * Support both use_calc_stream and sync_op in collective communication API (#46761)
      
      * Move group and all reduce from collective to communication (#45848)
      
      * Completes bfloat16 dtype for collective api in eager mode (#45844)
      
      * Fix collective APIs cannot be recognized when building docs (#46962)
      Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>
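The recurring phrase "both use_calc_stream and sync_op" refers to letting callers choose between a blocking collective call and one that returns a waitable task. Below is a toy model of that semantic distinction; `Task` and `all_reduce` here are illustrative stand-ins, not Paddle's actual classes or signatures.

```python
class Task:
    """A minimal handle for an asynchronously issued collective."""
    def __init__(self, compute):
        self._compute = compute
        self._result = None
        self._done = False

    def wait(self):
        """Block until the collective result is available."""
        if not self._done:
            self._result = self._compute()
            self._done = True
        return self._result

def all_reduce(local_values, sync_op=True):
    """Sum values across ranks (simulated here as a plain sum).

    sync_op=True blocks and returns the result directly;
    sync_op=False returns a Task whose wait() must be called
    before the output is used.
    """
    task = Task(lambda: sum(local_values))
    if sync_op:
        return task.wait()
    return task

assert all_reduce([1, 2, 3]) == 6            # synchronous call
task = all_reduce([1, 2, 3], sync_op=False)  # asynchronous call
assert task.wait() == 6
```

The asynchronous form is what lets communication overlap with computation; the synchronous form is simpler to reason about and is the safer default.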
  3. 11 October 2022 (1 commit)
    • Cherry pick for dygraph pp (#46876) · 9cc3f69f
      Authored by Yuang Liu
      * bug fix for virtual pipeline parallel (#45922)
      
      * don't wait for send op under dygraph pp (#46209)
      
      * [interleave pp] sync recv for 1f1b (#46399)
      
      * [dygraph pp] all sync for allgather partial (#46483)
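The "1f1b" in these commit titles is the one-forward-one-backward pipeline schedule: each stage runs a few warm-up forwards, then alternates one forward with one backward, then drains the remaining backwards. A minimal sketch of the schedule order (the function name and warm-up formula are illustrative assumptions, not Paddle's code):

```python
def one_f_one_b(num_microbatches, stage, num_stages):
    """Return the (op, microbatch) sequence for one pipeline stage.

    Warm-up with (num_stages - stage - 1) forwards, then alternate
    forward/backward in a steady state, then drain the backwards.
    """
    warmup = min(num_stages - stage - 1, num_microbatches)
    schedule = [("F", i) for i in range(warmup)]
    f, b = warmup, 0
    while f < num_microbatches:          # steady state: 1F then 1B
        schedule.append(("F", f))
        f += 1
        schedule.append(("B", b))
        b += 1
    while b < num_microbatches:          # cool-down: remaining backwards
        schedule.append(("B", b))
        b += 1
    return schedule

# Last of two stages has no warm-up: strict F/B alternation.
print(one_f_one_b(4, 1, 2))
# -> [('F', 0), ('B', 0), ('F', 1), ('B', 1), ('F', 2), ('B', 2), ('F', 3), ('B', 3)]
```

Because backwards start before all forwards finish, the schedule's correctness depends on the p2p sends/recvs between stages being properly synchronized, which is what the "sync recv for 1f1b" fix above addresses.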
  4. 27 September 2022 (2 commits)
  5. 26 September 2022 (1 commit)
    • cherry-pick V2.4 (#46358) · 536d9d8c
      Authored by ziyoujiyi
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fix gloo compile warning
      
      * adapt for nn fl-ps
      
      * flps del fake-init op
      
      * add learning_rate_0 initializer op
      
      * bug fix
      
      * .
      
      * .
  6. 22 September 2022 (3 commits)
  7. 20 September 2022 (3 commits)
  8. 19 September 2022 (6 commits)
    • Recompute unify incubate (#46073) (#46210) · 4bced24a
      Authored by wuhuachaocoding
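For readers unfamiliar with the "recompute" being unified here: it is activation recomputation (checkpointing), where the forward pass saves only each block's input and the activations are recomputed during backward instead of being kept in memory. A hypothetical toy sketch (the two-function split and the sample blocks are assumptions for illustration):

```python
def forward(x, blocks):
    """Run blocks forward, saving only each block's input."""
    saved = []
    for f, _ in blocks:
        saved.append(x)
        x = f(x)
    return x, saved

def backward(grad_out, blocks, saved):
    """Recompute each block's forward from its saved input, then
    chain its local derivative into the incoming gradient."""
    for (f, df), x in zip(reversed(blocks), reversed(saved)):
        _ = f(x)                 # recomputed forward (discarded in this toy)
        grad_out = df(x) * grad_out
    return grad_out

blocks = [(lambda x: x * x, lambda x: 2 * x),   # y = x^2
          (lambda y: 3 * y, lambda y: 3.0)]     # z = 3y
out, saved = forward(2.0, blocks)
print(out)                       # -> 12.0, since z = 3x^2 at x = 2
print(backward(1.0, blocks, saved))
# -> 12.0, since dz/dx = 6x = 12 at x = 2
```

The trade is extra forward compute for a large reduction in stored activations, which matters most in deep pipelines.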
    • [cherry-pick] add abs,mean,sum,ge,gt,pow,etc higher-order differentiation operators (#46184) · ad8beaaf
      Authored by Xiaoxu Chen
      * [cherry-pick] extend reduce_mean,reduce_sum,eq,ne,ge,abs,pow,etc higher order operators
      
      * add reduce_mean,reduce_sum primitive ops
      * add ne_p gt_p primitive operators
      * add ge_p abs_p primitive operators
      * add cast primitive operators
      * add pow,square prim2orig rules
      * add elementwise_div orig2prim rule
      
      * [cherry-pick] add mean,sum,ge,gt,ne,abs,etc higher-order differentiation operators(#45888)
      
      * add reduce_mean,reduce_sum primitive ops
      
      * add ne_p gt_p primitive operators
      
      * add ge_p abs_p primitive operators
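The orig2prim/prim2orig rules named in these commits rewrite composite operators into a small closed set of primitive operators (`mul_p`, `div_p`, `reduce_sum_p`, ...) whose differentiation rules are easy to state, so higher-order derivatives fall out of repeatedly transforming the primitive program. A hedged sketch of the idea; the rule table and evaluator below are toys, not Paddle's transform:

```python
def orig2prim(op, *args):
    """Rewrite a composite op into primitive ops (a toy rule table)."""
    if op == "square":   # square(x) -> mul_p(x, x)
        return ("mul_p", args[0], args[0])
    if op == "mean":     # mean(x) -> div_p(reduce_sum_p(x), len(x))
        return ("div_p", ("reduce_sum_p", args[0]), len(args[0]))
    raise NotImplementedError(op)

def eval_prim(expr):
    """Evaluate a nested primitive-op expression."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    vals = [eval_prim(a) for a in args]
    if op == "mul_p":
        return vals[0] * vals[1]
    if op == "div_p":
        return vals[0] / vals[1]
    if op == "reduce_sum_p":
        return sum(vals[0])
    raise NotImplementedError(op)

print(eval_prim(orig2prim("square", 3.0)))        # -> 9.0
print(eval_prim(orig2prim("mean", [1.0, 2.0, 3.0])))  # -> 2.0
```

Keeping the primitive set small is the point: one linearization/transpose rule per primitive covers every composite op lowered onto it.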
    • refactor mp. (#45803) (#46121) · e5dc9d61
      Authored by wuhuachaocoding
      * refactor mp.
      
      * update setup.py.
      
      * update mp_layers.py for compatibility.
      
      * add documents for mp_layers.py
      
      * update init.py
      
      * update collective.py.
      
      * update.
      
      * update mp_ops.py
      
      * update.
      
      * update code style.
      
      * update code style.
    • [Cherry-pick][Auto Parallel] Improve the APIs (#46164) · c5cc4278
      Authored by Yulong Ao
      * [AutoParallel] adapt gradient merge pass (#45915)
      
      * adapt gradient merge
      
      * fix op_role
      
      * fix strategy
      
      * [Auto Parallel] Gradient Fuse Allreduce (#45643)
      
      * bugfix (#45332)
      
      * dist embedding support lookup table v1
      
      * add unitest
      
      * customize wait_comm
      
      * group gradients
      
      * bugfix
      
      * update program
      
      * [Auto Parallel] Improve the APIs (#45776)
      
      * [Auto Parallel] Use c++ dist attr in the completion process
      
      * [Auto Parallel] Add minor changes
      
      * [Auto Parallel] Use c++ dist attr in the completion process
      
      * [Auto Parallel] Add minor changes
      
      * [Auto Parallel] Add the serialization process for dist attrs
      
      * [Auto Parallel] Remove unnecessary comments
      
      * [Auto Parallel] Fix some bugs
      
      * [Auto Parallel] Fix the code style
      
      * [Auto Parallel] Remove unnecessary impls
      
      * [Auto Parallel] Fix the importing error
      
      * [Auto Parallel] Fix the copy from bugs of op dist attr
      
      * [Auto Parallel] Replace the use of constexpr if
      
      * [Auto Parallel] Redesign the shard_tensor, shard_op and ProcessMesh
      
      * [Auto Parallel] Change API of the completion unittest
      
      * [Auto Parallel] Fix the bug when set_attr an int
      
      * [Auto Parallel] Add the unittest for the serialization
      
      * [Auto Parallel] Add some unit tests
      
      * [Auto Parallel] Unify the strategy
      
      * [Auto Parallel] Improve the engine api
      
      * [Auto Parallel] Reset the changes made to the framework
      
      * [Auto Parallel] Change the engine unittest
      
      * [Auto Parallel] Update API of the completion and partitioner
      
      * [Auto Parallel] Update unit tests using engine api
      
      * update shard annotation
      
      * [Auto Parallel] Remove the modifications of other modules
      
      * [Auto Parallel] Add docs for APIs
      
      * add new strategy
      
      * [Auto Parallel] Replace the logger
      
      * [Auto Parallel] Restore the test_program.py
      
      * [Auto Parallel] Change the import rules
      
      * [Auto Parallel] Add the examples for Engine
      
      * [Auto Parallel] Do some minor changes
      
      * [Auto Parallel] Remove yaml dependency
      
      * [Auto Parallel] Fix the unittests
      
      * add valid after train
      
      * bug fix
      Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
      Co-authored-by: caozhou <caozhou@radi.ac.cn>
      Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>
      
      * [Auto Parallel] Bugfix allreduce fuse for MP (#46086)
      
      * bugfix
      
      * bugfix
      
      * typos fixed
      
      * update strategy (#46138)
      Co-authored-by: zhaoyingli <86812880+zhaoyinglia@users.noreply.github.com>
      Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
      Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
      Co-authored-by: caozhou <caozhou@radi.ac.cn>
      Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>
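Several commits above mention the gradient merge pass and fused allreduce. The core idea of gradient merge is simple to sketch: gradients from k micro-steps are accumulated locally and the optimizer update (plus any allreduce, which the fuse pass batches into larger buffers) only runs every k steps. The function below is a minimal toy with an assumed unit learning rate, not the pass itself:

```python
def train_with_gradient_merge(grads_per_step, k):
    """SGD-style updates applied only every k steps, using the
    averaged accumulated gradient; returns the parameter after
    each step (unit learning rate assumed)."""
    param, acc, history = 0.0, 0.0, []
    for step, g in enumerate(grads_per_step, start=1):
        acc += g
        if step % k == 0:        # merge point: update, then reset
            param -= acc / k
            acc = 0.0
        history.append(param)
    return history

print(train_with_gradient_merge([1.0, 3.0, 2.0, 4.0], k=2))
# -> [0.0, -2.0, -2.0, -5.0]
```

Merging is what makes a large effective batch fit in memory, and fusing the deferred allreduces into a few big tensors amortizes communication launch overhead.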
    • Revert "Simplify size op impl (#45808)" (#46168) · dabb8f23
      Authored by Chen Weihang
      This reverts commit c252b1de.
    • rename fleetx, develop=document_fix (#46141) · 7a6db0a3
      Authored by ShenLiang
  9. 17 September 2022 (1 commit)
    • V2.4 - cherry-pick (#46126) · a76fa414
      Authored by ziyoujiyi
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fix gloo compile warning
      
      * adapt for nn fl-ps
  10. 15 September 2022 (1 commit)
  11. 09 September 2022 (3 commits)
  12. 08 September 2022 (1 commit)
  13. 07 September 2022 (2 commits)
  14. 06 September 2022 (2 commits)
  15. 05 September 2022 (1 commit)
  16. 02 September 2022 (3 commits)
  17. 01 September 2022 (3 commits)
  18. 31 August 2022 (3 commits)
  19. 29 August 2022 (1 commit)