1. 17 Oct, 2022 1 commit
    • [Cherry-pick] Collective communication APIs (#46922) · 5fba2a98
      Committed by Wen Sun
      * Support both use_calc_stream and sync_op in send recv APIs (#46023)
      
      * Support both use_calc_stream and sync_op in allgather API (#46295)
      
      * Support both use_calc_stream and sync_op in collective communication API (#46761)
      
      * Move group and all reduce from collective to communication (#45848)
      
      * Completes bfloat16 dtype for collective api in eager mode (#45844)
      
      * Fix collective APIs cannot be recognized when building docs (#46962)
      Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>
  2. 07 Sep, 2022 1 commit
  3. 06 Sep, 2022 1 commit
  4. 01 Sep, 2022 2 commits
  5. 31 Aug, 2022 1 commit
  6. 26 Aug, 2022 1 commit
  7. 25 Aug, 2022 1 commit
  8. 23 Aug, 2022 2 commits
  9. 22 Aug, 2022 1 commit
  10. 17 Aug, 2022 1 commit
  11. 16 Aug, 2022 1 commit
  12. 15 Aug, 2022 1 commit
    • [XPU] add some collective ops. (#45049) · 7e2a20d5
      Committed by houj04
      * [XPU] add some collective ops. test=kunlun
      
      * use XPUOpTestWrapper. test=kunlun
      
      * skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
  13. 13 Aug, 2022 1 commit
    • fl-ps: support split sparse params in local & remote (#44864) · 3f5c405f
      Committed by ziyoujiyi
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
      
      * fix logging risk
      
      * fix logging possible risk
      
      * write trainer_desc file
      
      * support split sparse params in local & remote
      
      * fix import paddle.fluid.core.PSGPU
      
      * fix import paddle.fluid.core.PSGPU
      
      * add remote_sparse & local_sparse config
      
      * fix unittest
      
      * fix test_dist_fleet_geo table error
      
      * fix PADDLE_ENFORCE error
      
      * fix other's pr conflict
  14. 12 Aug, 2022 2 commits
    • fix nccl comm in sync_bn (#45100) · 1e965756
      Committed by LiYuRio
    • [Auto Parallel] Pybind ProcessMesh and DeviceMesh (#45013) · 5bf3dec9
      Committed by Yulong Ao
      * [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh
      
      * [Auto Parallel] Fix the unittest problem
      
      * [Auto Parallel] Explicitly add the src file for auto_parallel target
      
      * [Auto Parallel] Add the proto dependency explicitly
      
      * [Auto Parallel] Fix the cmake bug on windows and mac
      
      * [Auto Parallel] Remove the pybind11 header file in process_mesh.h
  15. 11 Aug, 2022 1 commit
  16. 09 Aug, 2022 2 commits
    • [Auto Parallel] Add the c++ dist attrs (#44989) · 2c77b575
      Committed by Yulong Ao
      * [Auto Parallel] Add the c++ dist attrs
      
      * [Auto Parallel] Remove some codes to be less than 1000 lines
    • refine save/load interface for distributed cpups (#44862) · 7b29c89b
      Committed by zhaocaibei123
      * save load
      
      * save load
      
      * add unittest
      
      * first commit
      
      * second commit
      
      * third commit
      
      * remove SaveLocalFS in memory sparse table
      
      * save dense param
      
      * update
      
      * push slot
      
      * fix push show clk: int -> float
      
      * add unittest
      
      * fix sample
      
      * unittest
      
      * add AsExtra for op
      
      * unittest
      
      * modify fs.py
      
      * modify fs.py
      
      * fix some bugs
      
      * add dataset hdfs config
      
      * local change
      
      * dataset use different hadoop ugi/fs_name
      
      * add
      
      * fix conflict
      
      * fix
      
      * remove logs
      
      * code style
      
      * fix
      
      * code style
      
      * code style
      
      * fix
      
      * code style
      
      * save_dense_param
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * change momentum in dense optimizer
      
      * fix
      
      * fix
      
      * change fluid => paddle.static
      
      * remove some useless code
      Co-authored-by: esythan <esythan@126.com>
  17. 08 Aug, 2022 3 commits
  18. 04 Aug, 2022 1 commit
  19. 03 Aug, 2022 1 commit
  20. 01 Aug, 2022 2 commits
  21. 29 Jul, 2022 2 commits
  22. 28 Jul, 2022 1 commit
  23. 26 Jul, 2022 1 commit
    • add horizontal federation learning ps feature (#44327) · 4bc22b69
      Committed by ziyoujiyi
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
  24. 22 Jul, 2022 1 commit
  25. 21 Jul, 2022 1 commit
  26. 20 Jul, 2022 1 commit
  27. 19 Jul, 2022 2 commits
  28. 16 Jul, 2022 1 commit
  29. 15 Jul, 2022 1 commit
  30. 11 Jul, 2022 2 commits