1. 14 Oct, 2022 1 commit
  2. 13 Oct, 2022 3 commits
    • combine dp and stage2 hybrid parallel. (#46795) · a95b6f33
      Committed by wuhuachaocoding
      * combine dp and stage2 hybrid parallel.
      
      * update condition.
    • [WIP] PaddlePaddle distributed reinforcement learning feature development (#45998) · f0afcabc
      Committed by Xinger
      * add rpc module in cpp side
      
      * add rpc module in python side
      
      * support win32 and mac for rpc
      
      * code optimization
      
      * optimize code
      
      * update rpc
      
      * update rpc launch
      
      * rpc remove rank and world_size api
      
      * fix logger import bug
      
      * remove support for win and mac
      
      * remove support for xpu, npu, cinn and rocm
      
      * remove support for xpu, npu, cinn and rocm
      
      * fix shutdown barrier timeout bug
      
      * update:python_rpc_handler to shared ptr
      
      * fix master shutdown first bug
      
      * tests support for cpu
      
      * update log to vlog
      
      * update get service info api
      
      * add single process test case
      
      * remove process group
      
      * remove some useless dependencies
      
      * update rpc api comments
      
      * update rpc comments: Example to Examples
      
      * update rpc api comments
      
      * update rpc api comments
      
      * update launch api comments
      
      * update init_rpc comments
      
      * update rpc sync and async comments
      
      * fix bug: init_rpc can't be called repeatedly in a process
      
      * update rpc api comment: make master endpoint unique
      
      * update rpc api:service to worker, timeout_ms to timeout
      
      * rename ServiceInfo to WorkerInfo
      
      * refactor: rename server to worker, log to vlog
      
      * add launch test
      
      * remove unused codes
      
      * refine
  3. 12 Oct, 2022 5 commits
    • bugfix (#46921) · acdaa4fb
      Committed by JZ-LIANG
    • [Auto Parallel] Improve the fine-grained APIs (#46552) · 686fa07a
      Committed by Yulong Ao
      * [Auto Parallel] Support different dataloaders
      
      * [Auto Parallel] Add num_shards config for dataset
      
      * [Auto Parallel] Unify the logger and outputs of Engine API
      
      * [Auto Parallel] Fix the bugs of to_static
      
      * [Auto Parallel] Adjust the test_to_static.py
      
      * [Auto Parallel] Add the prepare API and replace __call__ with run
      
      * [Auto Parallel] Improve the private implementations of Engine
      
      * [Auto Parallel] Set capacity of dataloader for opt tuning
      
      * [Auto Parallel] [WIP] Change the fine-grained API
      
      * [Auto Parallel] Improve APIs to support different user cases
      
      * [Auto Parallel] Add removed config
      
      * [Auto Parallel] Add imports
      
      * [Auto Parallel] Fix bugs for to_static
      
      * [Auto Parallel] Remove unnecessary imports
    • [Zero-Dim] support input 0D Tensor for some unary api (#45992) · 05c2b9ba
      Committed by zhouweiwei2014
      * [Zero-Dim] support input 0D Tensor for unary api
      
      * fix CI
    • Multi groups for broadcast of sharding stage 2 (#46894) · 95768115
      Committed by Yuang Liu
    • [CodeStyle][F401] remove unused imports in python/paddle/distributed (#46758) · fe716a0b
      Committed by Nyakku Shigure
      * [CodeStyle][F401] remove unused import in python/paddle/distributed
      
      * remove pass
      
      * empty commit
      
      * Fix ValueError: list.remove(x): x not in list for meta_optimizer_names.
      
      Fix ValueError: list.remove(x): x not in list for meta_optimizer_names.
      
      * Fix split import.
      
      Fix split import.
      
      * add noqa after meta_optimizers in factory
      
      * restore collective ops
      
      * expand `import *`
      
      * add noqa after required imports
      
      * try to fix APIs without core.ops
      
      * Revert "try to fix APIs without core.ops"
      
      This reverts commit 6172beaf601e84bf61f2490c12c4739f0edaa5eb.
      
      * fix an increment
      
      * empty commit
      
      * add noqa after required imports
      
      * expand `import *`, fix ci error
      Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
  4. 11 Oct, 2022 5 commits
  5. 10 Oct, 2022 5 commits
  6. 09 Oct, 2022 1 commit
  7. 08 Oct, 2022 2 commits
  8. 30 Sep, 2022 1 commit
  9. 29 Sep, 2022 2 commits
  10. 28 Sep, 2022 7 commits
  11. 27 Sep, 2022 2 commits
  12. 26 Sep, 2022 1 commit
  13. 23 Sep, 2022 2 commits
    • ed2bb051
    • fl-ps bug fix (#46356) · 56d30534
      Committed by ziyoujiyi
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
      
      * fix logging risk
      
      * fix logging possible risk
      
      * write trainer_desc file
      
      * support split sparse params in local & remote
      
      * fix import paddle.fluid.core.PSGPU
      
      * fix import paddle.fluid.core.PSGPU
      
      * add remote_sparse & local_sparse config
      
      * fix unittest
      
      * fix test_dist_fleet_geo table error
      
      * fix PADDLE_ENFORCE error
      
      * fix other's pr conflict
      
      * forbidden ssd table
      
      * .
      
      * recover ssd table code
      
      * recover file mode
      
      * debug auc 0.5
      
      * adapt for nn fl-ps
      
      * adapt for nn fl-ps
      
      * add learning_rate_0 initializer op
      
      * recover ssd table
      
      * modify file mode
      
      * flps del fake-init op
      
      * bug fix
  14. 22 Sep, 2022 3 commits
    • [interleave pp] sync recv for 1f1b (#46399) · f7784700
      Committed by Yuang Liu
    • Fix the En docs (delete some expression like 'This OP') (#46165) · 3a928a8c
      Committed by 张春乔
      * 1. Delete some expression like 'This Op'
      2. remove import numpy as np
      
      * test=document_fix
      
      * fix eg; test=document_fix
      
      * fix 'import numpy' cases; test=document_fix
      
      * fix 'import numpy' cases; test=document_fix
      
      * fix some docs; test=document_fix
      
      * delete raise; test=document_fix
      
      * add some introduction; test=document_fix
      
      * add some introduction; test=document_fix
      
      * test=document_fix
      
      * Fix 'note' format; test=document_fix
      
      * Fix Returns of cholesky; test=document_fix
      
      * Fix Example format; test=document_fix
      
      * Fix det; test=document_fix
      
      * Fix eig; test=document_fix
      
      * Fix eigh; test=document_fix
      
      * Fix eigh; test=document_fix
      
      * Apply suggestions from code review;test = document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * Apply suggestions from code review;test = document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * Apply suggestions from code review;test = document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * test=document_fix
      
      * test=document_fix
      
      * KLDiv;test=document_fix
      
      * norm example code; test=document_fix
      
      * revert python/paddle/fluid/**/*
      
      * revert python/paddle/distributed/spawn.py
      
      * revert python/paddle/fluid/*
      
      * fix a `Note` format
      
      * Fix inv; test=document_fix
      
      * Fix lu; test=document_fix
      
      * Fix lu_unpack; test=document_fix
      
      * Fix matrix_power; test=document_fix
      
      * Fix multi_dot; test=document_fix
      
      * Fix solve; test=document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
    • [Dygraph] Fix bugs of mp in eager mode (#46303) · 11002430
      Committed by Haohongxiang
      * fix bugs of mp
      
      * fix bugs of mp
      
      * update
      
      * update
      
      * fix bug