1. 22 4月, 2022 1 次提交
    • Z
      Ssd sparse table (#41812) · cca57c4a
      zhaocaibei123 提交于
      * [cherry-pick2.3]fix compile bug of windows cuda11.5 (#41464)
      
      cherry-pick
      
      fix compile bug of windows cuda11.5 #41433
      
      * fix bug of missing boost when compile cache.cc (#41449)
      
      【chery-pick #41430】fix bug of random compile failure, due to incorrect compile order of dependencies
      
      * Fix eager try catch (#41438) (#41477)
      
      [Cherry-Pick]Fix eager try catch (#41438)
      
      * Cherry-pick-PR41407, fix device_id bug for final_state op in multiprocess testcase (#41407) (#41475)
      
      Cherry-pick PR #41407
      
      * [BugFix] Add error hint for one_hot gpu version (#41335) (#41495)
      
      * add one_hot gpu hint
      
      * move allow_out_of_range judgement
      
      * delete useless unittest
      
      * fix bugs of reshape double grad infermeta (#41459) (#41493)
      
      * [cherrypick-2.3] modify infer gpu memory strategy (#41427), remove cudnn_deterministic=True (#41341)  (#41491)
      Co-authored-by: NJingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      
      * [Cherry-pick][ROCm] fix dcu error in device event base, test=develop (#41523)
      
      Cherry-pick of #41521
      
      * [Cherry-Pick]Cherry pick PR41200, PR41474, PR41382 (#41509)
      
      * Use `self`as a parameter of _hash_with_id function to avoid error caused by hash_id reuse (#41200)
      
      * Add fill_constant_batch_size YAML and UT (#41474)
      
      * Switch some dy2st UT to eager mode (#41382)
      
      * Sitch some dy2st UT to eager mode
      
      * Fix test_lstm and remove test_transformer
      
      * Run test_resnet_v2 in old dy mode
      
      * Unittest recover (#41431)
      
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove code unused
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      
      * recover unittest
      
      * fix
      
      * remove
      
      * fix
      
      * remove code unuseful
      
      * remove
      
      * fix
      
      * recover
      
      * remove
      Co-authored-by: Nesythan <esythan@126.com>
      
      * add ssd sparse table
      
      * fix
      
      * add cache shuffle
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add unit test
      
      * fix
      Co-authored-by: zhouweiwei2014's avatarZhou Wei <1183042833@qq.com>
      Co-authored-by: NSing_chan <51314274+betterpig@users.noreply.github.com>
      Co-authored-by: N0x45f <23097963+0x45f@users.noreply.github.com>
      Co-authored-by: Npangyoki <pangyoki@126.com>
      Co-authored-by: NSiming Dai <908660116@qq.com>
      Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: NZhang Jun <ewalker@live.cn>
      Co-authored-by: NJingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      Co-authored-by: NQi Li <qili93@qq.com>
      Co-authored-by: Nesythan <esythan@126.com>
      cca57c4a
  2. 19 4月, 2022 1 次提交
  3. 13 4月, 2022 1 次提交
    • W
      the one ps proto (#41659) · b12af9e1
      wangguanqun 提交于
      * the one ps proto
      
      * the one ps proto
      
      * fix
      
      * fix
      
      * fix
      
      * fix windows ci
      
      * fix windows ci
      
      * add dependency
      
      * add dependency
      b12af9e1
  4. 12 4月, 2022 1 次提交
  5. 02 4月, 2022 1 次提交
  6. 31 3月, 2022 1 次提交
  7. 28 3月, 2022 1 次提交
  8. 25 3月, 2022 1 次提交
    • J
      Refactor Dygraph Flags (#40786) · 3085d5e4
      Jiabin Yang 提交于
      * refactor eager flags
      
      * fix flags error when we switch from eager to dygraph
      
      * fix ci problem
      
      * fix ci
      
      * fix ci
      
      * merge develop and fix code style
      
      * merge develop and fix code style
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * merge develop
      3085d5e4
  9. 24 3月, 2022 1 次提交
  10. 23 3月, 2022 1 次提交
    • Z
      two-phase training for ps (#40762) · b1a4668c
      zhaocaibei123 提交于
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
      
      * ps optimizer multi programs
      
      * cvm & datanorm backend
      
      * fix dim
      
      * fix unittest
      
      * fix
      
      * the one ps merge
      
      * remove comm
      
      * add DownpourLiteWorker
      
      * all
      
      * fix
      
      * fix
      
      * device worker downpour lite
      
      * fix
      
      * fix bug in global shuffle
      
      * save inference model
      
      * fix & add log
      
      * fix
      
      * remove log
      
      * fix
      
      * fix save summary
      
      * fix
      
      * fix pscore
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * remove logs
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add some comments
      
      * fix
      Co-authored-by: Nesythan <esythan@126.com>
      b1a4668c
  11. 03 3月, 2022 1 次提交
  12. 25 1月, 2022 1 次提交
  13. 18 1月, 2022 1 次提交
  14. 17 1月, 2022 1 次提交
  15. 30 12月, 2021 1 次提交
  16. 17 12月, 2021 2 次提交
  17. 09 12月, 2021 1 次提交
  18. 06 12月, 2021 1 次提交
  19. 30 11月, 2021 1 次提交
  20. 26 11月, 2021 1 次提交
  21. 24 11月, 2021 1 次提交
  22. 22 11月, 2021 1 次提交
  23. 18 11月, 2021 1 次提交
    • Z
      [heterps]change default executor for heter trainer (#37314) · c98d175d
      zmx 提交于
      * fix pslib. test=develop
      
      * add device to train_from_dataset. test=develop
      
      * refine fleet.stop_worker. test=develop
      
      * fix ut. test=develop
      
      * fix ut. test=develop
      
      * fix executor & ut. test=develop
      
      * fix executor & ut. test=develop
      
      * fix executor & ut. test=develop
      c98d175d
  24. 11 11月, 2021 1 次提交
    • Z
      [Heterps]Refactor Heter Pipeline Parameter Server (#36845) · a2da1efa
      zmx 提交于
      * change username
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * update
      
      * update
      
      * update unittests
      
      * fix
      
      * update
      
      * fix
      
      * update
      
      * fix
      
      * fix
      
      * fix
      
      * update
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update send_and_recv op. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix ut. test=develop
      
      * fix unit. notest,test=coverage
      
      * fix ut. notest, test=coverage
      
      * update. notest,test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix. notest, test=coverage
      
      * fix. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * add func. notest, test=coverage
      
      * fix ut. notest, test=coverage
      
      * fix. test=develop
      
      * fix. test=develop
      a2da1efa
  25. 25 10月, 2021 1 次提交
  26. 18 10月, 2021 1 次提交
  27. 13 10月, 2021 1 次提交
  28. 13 9月, 2021 1 次提交
  29. 10 9月, 2021 1 次提交
  30. 08 9月, 2021 2 次提交
    • Y
      [Auto Parallel] Integrate all modules (#35483) · 12155358
      Yulong Ao 提交于
      * add auto_parallel dir
      
      * mv to paddle.distributed
      
      * add shard_xx api
      
      * add distributed attrs for var
      
      * add ut, test=develop
      
      * add dist
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update, test=develop
      
      * update, test=develop
      
      * update
      
      * update
      
      * delete unused proto
      
      * resotre op_desc
      
      * restore type_defs
      
      * update var_desc
      
      * remove dimss_mapping for proto_pybind
      
      * update interface.py
      
      * update framework.py
      
      * update
      
      * update
      
      * add auto_parallel dir
      
      * mv to paddle.distributed
      
      * add shard_xx api
      
      * add distributed attrs for var
      
      * add ut, test=develop
      
      * [WIP] Add the auto completion feature and related codes
      
      * [WIP] Improve the auto completion and related codes
      
      * [WIP] Make the auto completion to support data-parallel
      
      * [WIP] Make the completion support mp and dp+mp
      
      * [WIP] Refactor auto completion unit test for MLP
      
      * [WIP] Refactor the implementation of DistributedOperatorImpl
      
      * [WIP] Improve dims_mapping update rule and fix a bug
      
      * [WIP] Support auto completion for one transformer decoder layer
      
      * [WIP] Add a minor change
      
      * [WIP] Fix a bug within the uint test
      
      * Shard XShape tensor, add embedding completion and refactor code
      
      * Add the distributed_operators dir to setup.py.in
      
      * Improve the completion process and add the unittest for gpt
      
      * fix process_mesh ut
      
      * fix process_mesh ut
      
      * update
      
      * update, test=develop
      
      * Add support for automatically completing distributed attrs of special ops
      
      * update
      
      * update
      
      * update
      
      * fix doc sample codes, test=develop
      
      * improve coverage, test=develop
      
      * add static_mode check, test=develop
      
      * Model the cluster for cost model and physical mapping
      
      * update, test=develop
      
      * add set_placement, test=develop
      
      * Add the check to make sure the candidate tensors' size is great than zero
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update, test=develop
      
      * Auto mark dist attrs annotated by user
      
      * update ndarray to nested list, test=develop
      
      * update, test=develop
      
      * Add auto-completion module for auto-parallel (based on PR#33804)
      
      * Remove unnecessary files
      
      * Remove unrelated files for the auto completion pr
      
      * Update the unit test to improve the coverage
      
      * Modify codes based on reviews
      
      * Minor changes for CI
      
      * Improve some codes based on new comments
      
      * Fix bugs caused by shallow copy in attributes.py
      * Imporve amend_distributed_attr_for_program in context.py
      * Other changes for weihang's comments
      
      * support shard reader
      
      * support shard reader
      
      * add parallel mode
      
      * update process mesh
      
      * add method to compute comm_group
      
      * implement dist_embedding forward func
      
      * implement dist matmul forward func
      
      * implement dist reshape forward func
      
      * add transpiler framework
      
      * add transpiler forward
      
      * implement transpiler forward
      
      * implement transpiler backward & update
      
      * add process
      
      * add unitest
      
      * chmod
      
      * chmod
      
      * chmod
      
      * update unitest
      
      * add unitest for gpt
      
      * remove unused print
      
      * rename transpiler --> partitioner
      
      * rename transpiler --> partitioner
      
      * chmod
      
      * chmod
      
      * bug fixed
      
      * remove amp function
      
      * update case for dp mode
      
      * update case for dp mode
      
      * [Auto Parallel] Integrate all parts with the newest code
      
      * Integrate all parts of auto parallel and improve codes
      
      * Integrate all parts by AutoParallelizer
      * Add unit test for AutoParallelizer
      * Improve auto completion module for pipeline parallel
      * Add support for matmul_v2 in dist_matmul
      * Correct the typo "stratergy" to "strategy"
      
      * Modify distributed_strategy.proto to conform the main stream
      
      * Restore parts of distributed_strategy to conform the develop branch
      Co-authored-by: Nsandyhouse <lilong12@baidu.com>
      Co-authored-by: NJZ-LIANG <jianzhongliang10@gmail.com>
      12155358
    • Z
      Enable program passes on Fleet APIs (#34955) · 5f369881
      Zeng Jinle 提交于
      * add fleet api for program pass
      
      * turn on apply pass for CI test
      
      * fix disable fuse_all_optimizer bug
      
      * try to test ci
      
      * fix CI
      
      * fill unspecified op role
      
      * fix fuse_allreduce
      
      * add ut to improve coverage
      
      * remove useless change
      
      * improve c++ coverage
      
      * follow some comments
      
      * test ir pass pipeline
      
      * update doc
      
      * reduce ut time again
      5f369881
  31. 20 8月, 2021 1 次提交
  32. 18 8月, 2021 1 次提交
  33. 02 8月, 2021 1 次提交
  34. 30 7月, 2021 1 次提交
  35. 27 7月, 2021 1 次提交
  36. 08 7月, 2021 1 次提交
  37. 01 7月, 2021 2 次提交