1. 26 Aug, 2022 1 commit
  2. 23 Aug, 2022 2 commits
  3. 16 Aug, 2022 1 commit
  4. 15 Aug, 2022 1 commit
    • refactor fleet. (#44833) · 8636d2a2
      wuhuachaocoding authored
      * refactor fleet.
      
      * refact fleet.py.
      
      * update fleet/__init__.py.
      
      * update fleet.py
      
      * update code style.
      
      * update fleet
      
      * update fleet
      
      * update fleet
      
      * update fleet
      
      * update model.py
      
      * update fleet.
      
      * update __init__.py
      
      * update fleet.
      
      * update fleet.
      
      * update fleet
      
      * update fleet
      
      * update fleet
      
      * update fleet.
      
      * update optimizer.py
      
      * update optimizer
      
      * update fleet.py
      
      * update scaler.py
      
      * update setup.py.in
  5. 13 Aug, 2022 1 commit
    • fl-ps: support split sparse params in local & remote (#44864) · 3f5c405f
      ziyoujiyi authored
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
      
      * fix logging risk
      
      * fix logging possible risk
      
      * write trainer_desc file
      
      * support split sparse params in local & remote
      
      * fix import paddle.fluid.core.PSGPU
      
      * fix import paddle.fluid.core.PSGPU
      
      * add remote_sparse & local_sparse config
      
      * fix unittest
      
      * fix test_dist_fleet_geo table error
      
      * fix PADDLE_ENFORCE error
      
      * fix other's pr conflict
  6. 12 Aug, 2022 1 commit
  7. 10 Aug, 2022 1 commit
  8. 09 Aug, 2022 2 commits
    • refine save/load interface for distributed cpups (#44862) · 7b29c89b
      zhaocaibei123 authored
      * save load
      
      * save load
      
      * add unittest
      
      * first commit
      
      * second commit
      
      * third commit
      
      * remove SaveLocalFS in memory sparse table
      
      * save dense param
      
      * update
      
      * push slot
      
      * fix push show clk: int -> float
      
      * add unittest
      
      * fix sample
      
      * unittest
      
      * add AsExtra for op
      
      * unittest
      
      * modify fs.py
      
      * modify fs.py
      
      * fix some bugs
      
      * add dataset hdfs config
      
      * local change
      
      * dataset use different hadoop ugi/fs_name
      
      * add
      
      * fix conflict
      
      * fix
      
      * remove logs
      
      * code style
      
      * fix
      
      * code style
      
      * code style
      
      * fix
      
      * code style
      
      * save_dense_param
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * change momentum in dense optimizer
      
      * fix
      
      * fix
      
      * change fluid => paddle.static
      
      * remove some unuseful code
      Co-authored-by: esythan <esythan@126.com>
    • [model parallel] enable mp to use fused linear (#44968) · e84250e8
      Yuang Liu authored
  9. 08 Aug, 2022 1 commit
  10. 03 Aug, 2022 2 commits
  11. 01 Aug, 2022 1 commit
  12. 26 Jul, 2022 1 commit
    • add horizontal federation learning ps feature (#44327) · 4bc22b69
      ziyoujiyi authored
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
  13. 22 Jul, 2022 1 commit
  14. 20 Jul, 2022 1 commit
  15. 13 Jul, 2022 2 commits
  16. 27 Jun, 2022 1 commit
  17. 24 Jun, 2022 1 commit
    • Fix hang bug of TCPStore (#43724) · 4c9330d6
      gongweibao authored
      * tmp fix
      
      * init
      
      * compile ok
      
      * compile ok
      
      * add vlogs
      
      * add test
      
      * fix termination error
      
      * add testfile
      
      * add
      
      * fix window compile
      
      * fix window compile
      
      * fix windows compile
      
      * fix windows compile
      
      * fix windows compile
      
      * fix windows compile
      
      * fix windows compile
      
      * fix windows compile
      
      * fix kunlun compile
      
      * fix compilation
      
      * fix compilation
      
      * fix compilation
      
      * tmp fix
      
      * add windows
      
      * add windows
      
      * add more logs
      
      * change timeout to protected
      
      * SB
      
      * add
      
      * add
      
      * fix timeout
      
      * add
      
      * fix test
      
      * fix test
      
      * fix test
      
      * fix ut
      
      * fix ut
      
      * fix ut
  18. 16 Jun, 2022 1 commit
  19. 14 Jun, 2022 2 commits
  20. 13 Jun, 2022 1 commit
  21. 09 Jun, 2022 1 commit
  22. 07 Jun, 2022 1 commit
  23. 05 Jun, 2022 1 commit
    • 【code format check upgrade】 step2:yapf (#42944) · a072fca8
      Sing_chan authored
      * use yapf to format all python file
      
      * yapf exclude two unittests file for they rely on writing and reading file, and format will break them
      
      * disable diff_py_file because too many diff files cause command following failed
  24. 02 Jun, 2022 1 commit
  25. 31 May, 2022 1 commit
  26. 26 May, 2022 1 commit
  27. 25 May, 2022 1 commit
  28. 23 May, 2022 1 commit
  29. 19 May, 2022 1 commit
  30. 16 May, 2022 1 commit
  31. 12 May, 2022 1 commit
  32. 10 May, 2022 1 commit
  33. 22 Apr, 2022 1 commit
    • Ssd sparse table (#41812) · cca57c4a
      zhaocaibei123 authored
      * [cherry-pick2.3]fix compile bug of windows cuda11.5 (#41464)
      
      cherry-pick
      
      fix compile bug of windows cuda11.5 #41433
      
      * fix bug of missing boost when compile cache.cc (#41449)
      
      【cherry-pick #41430】fix bug of random compile failure, due to incorrect compile order of dependencies
      
      * Fix eager try catch (#41438) (#41477)
      
      [Cherry-Pick]Fix eager try catch (#41438)
      
      * Cherry-pick-PR41407, fix device_id bug for final_state op in multiprocess testcase (#41407) (#41475)
      
      Cherry-pick PR #41407
      
      * [BugFix] Add error hint for one_hot gpu version (#41335) (#41495)
      
      * add one_hot gpu hint
      
      * move allow_out_of_range judgement
      
      * delete useless unittest
      
      * fix bugs of reshape double grad infermeta (#41459) (#41493)
      
      * [cherrypick-2.3] modify infer gpu memory strategy (#41427), remove cudnn_deterministic=True (#41341)  (#41491)
      Co-authored-by: JingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      
      * [Cherry-pick][ROCm] fix dcu error in device event base, test=develop (#41523)
      
      Cherry-pick of #41521
      
      * [Cherry-Pick]Cherry pick PR41200, PR41474, PR41382 (#41509)
      
      * Use `self`as a parameter of _hash_with_id function to avoid error caused by hash_id reuse (#41200)
      
      * Add fill_constant_batch_size YAML and UT (#41474)
      
      * Switch some dy2st UT to eager mode (#41382)
      
      * Switch some dy2st UT to eager mode
      
      * Fix test_lstm and remove test_transformer
      
      * Run test_resnet_v2 in old dy mode
      
      * Unittest recover (#41431)
      
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove code unused
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      
      * recover unittest
      
      * fix
      
      * remove
      
      * fix
      
      * remove code unuseful
      
      * remove
      
      * fix
      
      * recover
      
      * remove
      Co-authored-by: esythan <esythan@126.com>
      
      * add ssd sparse table
      
      * fix
      
      * add cache shuffle
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add unit test
      
      * fix
      Co-authored-by: Zhou Wei <1183042833@qq.com>
      Co-authored-by: Sing_chan <51314274+betterpig@users.noreply.github.com>
      Co-authored-by: 0x45f <23097963+0x45f@users.noreply.github.com>
      Co-authored-by: pangyoki <pangyoki@126.com>
      Co-authored-by: Siming Dai <908660116@qq.com>
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: Zhang Jun <ewalker@live.cn>
      Co-authored-by: JingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      Co-authored-by: Qi Li <qili93@qq.com>
      Co-authored-by: esythan <esythan@126.com>
  34. 19 Apr, 2022 2 commits