1. 23 Oct, 2022 1 commit
  2. 14 Sep, 2022 1 commit
  3. 26 Jul, 2022 1 commit
    • add horizontal federation learning ps feature (#44327) · 4bc22b69
      ziyoujiyi committed
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
  4. 05 Jun, 2022 1 commit
    • 【code format check upgrade】 step2:yapf (#42944) · a072fca8
      Sing_chan committed
      * use yapf to format all Python files
      
      * exclude two unit-test files from yapf: they rely on writing and reading files, and formatting would break them
      
      * disable diff_py_file because too many changed files cause the following command to fail
  5. 25 Mar, 2022 1 commit
    • Refactor Dygraph Flags (#40786) · 3085d5e4
      Jiabin Yang committed
      * refactor eager flags
      
      * fix flags error when we switch from eager to dygraph
      
      * fix ci problem
      
      * fix ci
      
      * fix ci
      
      * merge develop and fix code style
      
      * merge develop and fix code style
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * merge develop
  6. 25 Jan, 2022 1 commit
  7. 18 Jan, 2022 1 commit
  8. 18 Nov, 2021 1 commit
    • [heterps]change default executor for heter trainer (#37314) · c98d175d
      zmx committed
      * fix pslib. test=develop
      
      * add device to train_from_dataset. test=develop
      
      * refine fleet.stop_worker. test=develop
      
      * fix ut. test=develop
      
      * fix ut. test=develop
      
      * fix executor & ut. test=develop
      
      * fix executor & ut. test=develop
      
      * fix executor & ut. test=develop
  9. 11 Nov, 2021 1 commit
    • [Heterps]Refactor Heter Pipeline Parameter Server (#36845) · a2da1efa
      zmx committed
      * change username
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * update
      
      * update
      
      * update unittests
      
      * fix
      
      * update
      
      * fix
      
      * update
      
      * fix
      
      * fix
      
      * fix
      
      * update
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update send_and_recv op. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix ut. test=develop
      
      * fix unit. notest,test=coverage
      
      * fix ut. notest, test=coverage
      
      * update. notest,test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix. notest, test=coverage
      
      * fix. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * add func. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix. test=develop
      
      * fix. test=develop
  10. 06 May, 2021 1 commit
  11. 07 Apr, 2021 1 commit
    • 【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3
      zhang wenhui committed
      * Ascend rc (#30483)
      
      * Fix compilation on CANN20.1 and older (#30494)
      
      Fix compilation on CANN20.1 and older
      
      * Add distribution supported (#30578)
      
      Add distribution supported
      
      * Build parser for Hcom* operators (#30627)
      
      Build parser for Hcom* operators
      
      * Pass device_ids info from launch to trainer. (#30632)
      
      Pass device_ids info from launch to trainer
      
      * Add Hccl program group (#30642)
      
      Add Hccl program group
      
      * Add startup bash files of test_ascend_group. (#30645)
      
      Add startup bash files of test_ascend_group
      
      * cleanup (#30646)
      
      cleanup test_ascend_group.py
      
      * [Feature] Build parser to support distributed training (#30658)
      
      [Feature] Build parser to support distributed training
      
      * fix compilation on ascend-20.1 (#30722)
      
      fix compilation on ascend-20.1
      
      * Dev/fix ascend string (#30749)
      
      Dev/fix ascend string
      
      * code style (#30781)
      
      code style
      
      * Merge ascend_optimizer and ascend_parser. (#30776)
      
      Merge ascend_optimizer and ascend_parser.
      
      * Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)
      
      Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug
      
      * Add paddle ascend distribution training supported (#30796)
      
      Add paddle ascend distribution training supported
      
      * pass cxx_flags to gloo cmake (#30857)
      
      * Destroy session first. (#30954)
      
      Destroy session first.
      
      * merge
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style, test=develop
      
      * fix, test=develop
      
      * fix
      
      * fix log fatal, test=develop
      
      * fix enforce style, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix rccl, test=develop
      
      * fix test, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix node_num, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix ids str, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      
      * fix style code, test=develop
      Co-authored-by: hutuxian <hutuxian2011@sina.cn>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: Void Main <voidmain1313113@gmail.com>
      Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
      Co-authored-by: dingsiyu <18369187719@163.com>
      Co-authored-by: OleNet <olenet@126.com>
  12. 31 Dec, 2020 1 commit
  13. 08 Dec, 2020 1 commit
  14. 26 Nov, 2020 1 commit
  15. 15 Oct, 2020 1 commit
  16. 14 Oct, 2020 1 commit
  17. 13 Oct, 2020 1 commit
    • 【paddle.fleet】Update fleetrun & ps-heter (#27472) · c5f2802d
      Chengmo committed
      * refine fleetrun.ps_launch
      
      * update fleet run for multi device support
      
      * ps_graph support ps-gpu
      
      * fix heter save
      
      * add heter save unittest
      
      * fix unittest & simple code
      
      * update fleetrun
      
      * fix fleetrun
      
      * fix launch barrier
      
      * fix role maker
      
      * add paddlecloud rolemaker unittest
      
      * rename heter_worker_device_guard
  18. 30 Sep, 2020 1 commit
  19. 29 Sep, 2020 2 commits
  20. 28 Sep, 2020 3 commits
  21. 27 Sep, 2020 1 commit
  22. 23 Sep, 2020 1 commit
  23. 20 Sep, 2020 1 commit
    • 【paddle.fleet】Fix/role maker api fix (#27326) · d6b54de4
      tangwei12 committed
      * fix fleet util and gloo
      
      * fix worker endpoints
      
      * fix
      
      * fix UT
      
      * fix gloo
      
      * fix gloo
      
      * update gloo
      
      * update gloo
      
      * update gloo
      
      * update gloo
      
      * update gloo
      
      * fix gloo wrapper for hdfs
      
      * add file gloo and UT
      
      * fix UT
      
      * fix UT
      
      * fix UT
      
      * hide public method of RoleMaker
      
      * fix UT
      
      * GPU fleetrun support gloo
      
      * parameterserver fleetrun support gloo
      
      * add UT
      
      * add UT
      
      * fix UT
      
      * fix get server endpoint
      
      * fix get server endpoint
      
      * fix UT
      
      * hide public method of rolemaker
      
      * hide public method of rolemaker
      
      * hide public method of rolemaker
      
      * Update test_fleet_rolemaker_new.py
      
      * hide public method of rolemaker
      
      * hide public method of rolemaker
  24. 18 Sep, 2020 1 commit
  25. 17 Sep, 2020 1 commit
  26. 03 Sep, 2020 1 commit
  27. 30 Aug, 2020 1 commit
  28. 29 Aug, 2020 1 commit
  29. 22 Aug, 2020 1 commit
  30. 18 Aug, 2020 1 commit
  31. 13 Aug, 2020 1 commit
  32. 07 Aug, 2020 1 commit
  33. 06 Jul, 2020 1 commit
  34. 23 Mar, 2020 1 commit
  35. 17 Sep, 2018 1 commit
  36. 03 Sep, 2018 1 commit
  37. 15 Aug, 2018 1 commit