1. 30 Aug, 2023 1 commit
    • [Auto Parallel] Compatible new comm library upgrade (#56604) · ade51aa5
      Ghost Screaming committed
      * for verify
      
      fluid operator support new comm library
      
      * u
      
      * u
      
      * u
      
      * Compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter and c_scatter.
      
      * Remove useless comments in process_group.py
      
      * Polish code style.
      
      * Fix some problems.
      
      * Remove use fluid api in phi comm_context_manager.
      
      * Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro checks.
      
      * Fix bug of HIP architecture.
      
      * Fix some problems.
      1. Remove useless logging.
      2. Fix conditional compilation for HIP.
      3. Fix problems in test_pass_generation_pipeline.py. It first calls paddle.distributed.init_parallel_env(),
      then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env() calls
      paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding NCCLCommContext. Because dev_id is not
      set at that point, NCCLCommContext's dev_ctx is not initialized.
      
      * Fix some problems.
      
      * Polish code.
      
      * Polish code.
      
      * Revert compatible upgrade for communication operators. Their upgrades
      will be submitted in another PR.
      
      * Remove StaticTCPStore.
      
      * Remove useless modification.
      
      * Remove useless set_cuda_device_id.
      
      * Polish code.
      
      * Remove fluid header files in phi files.
      
      * Remove useless comments.
      
      * Fix problems of hip arch.
      
      * Fix some problems.
      
      * Polish code.
      
      * Polish code style.
      
      ---------
      Co-authored-by: hitywt <yuwentao126@126.com>
  2. 25 Aug, 2023 1 commit
  3. 22 Aug, 2023 1 commit
  4. 07 Aug, 2023 1 commit
  5. 01 Aug, 2023 1 commit
  6. 13 Jul, 2023 1 commit
  7. 29 Jun, 2023 1 commit
  8. 22 May, 2023 1 commit
    • [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() (#53856) · 3794d171
      Meteor Liu committed
      * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()
      
      * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()
      
      * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()
      
      * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()
      
      * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()
      
      * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode()
      
      * fixed cyclic reference that caused partial import
      
      * fixed bad change
      
      * fix bad import
      
      * fix bad import
      
      * fix bad import
      
      * fix UT failures caused by changing in_dynamic_mode
      
      * fix UT failures caused by changing in_dynamic_mode
      
      * fixed usage of in_dynamic_mode() or in_dygraph_mode()
      
      * revert python3 to python in .pre-commit-config.yaml
      
      * fix merge conflicts
  9. 17 Apr, 2023 1 commit
  10. 14 Apr, 2023 1 commit
  11. 06 Apr, 2023 1 commit
    • rem is_compiled_with_npu (#52385) · 7976e2a3
      Kim Yann committed
      * rem is_compiled_with_npu
      
      * rem npu related code
      
      * make lint happy
      
      * rem test
      
      * remove some tests
      
      * Update grad_scaler.py
      
      * fix an error
  12. 03 Apr, 2023 1 commit
  13. 25 Mar, 2023 1 commit
  14. 22 Mar, 2023 1 commit
  15. 13 Jan, 2023 1 commit
    • [Custom Device] Clear ProcessGroup Manually (#49182) · a923a757
      duanyanhui committed
      * clear ProcessGroupCustom manually
      
      * fix bug
      
      * fix bug
      
      * move destroy ProcessGroup to ProcessGroupIdMap
      
      * enable destroy to all device
      
      * remove unused comments
      
      * change to internal api
      
      * Update process_group.cc
      
      * Update process_group.cc
  16. 10 Jan, 2023 1 commit
  17. 09 Jan, 2023 1 commit
  18. 30 Dec, 2022 1 commit
  19. 26 Dec, 2022 1 commit
  20. 25 Dec, 2022 1 commit
  21. 08 Dec, 2022 1 commit
    • Clean fluid APIs in distributed and fleet files (#48851) · 911d6bb1
      Ghost Screaming committed
      * Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
      is wrong.
      
      * Remove climits.
      
      * Clean fluid API in paddle/distributed and paddle/fleetx folders.
      The following files are included:
      python/paddle/distributed/__init__.py
      python/paddle/distributed/collective.py
      python/paddle/distributed/fleet/utils/fs.py
      python/paddle/distributed/fleet/utils/hybrid_parallel_inference.py
      python/paddle/distributed/fleet/utils/hybrid_parallel_util.py
      python/paddle/distributed/fleet/utils/internal_storage.py
      python/paddle/distributed/launch/context/device.py
      python/paddle/distributed/parallel.py
      python/paddle/distributed/parallel_with_gloo.py
      python/paddle/distributed/spawn.py
      python/paddle/framework/__init__.py
      Note that 'paddle.fluid.dygraph.parallel.ParallelEnv'
       and 'fluid.framework.core' are kept unchanged in those files.
      ParallelEnv is used by paddle.fluid.dygraph.parallel.DataParallel.
      However, APIs in paddle.fluid.dygraph.parallel can't be
      migrated to paddle.distributed, as there are cyclic import
      dependencies in modules like paddle.static and paddle.tensor.
      'fluid.framework.core' will be changed to import framework.core
      after fluid.core is migrated.
      
      * Change TODO authors.
  22. 29 Nov, 2022 1 commit
  23. 28 Nov, 2022 1 commit
  24. 25 Nov, 2022 1 commit
  25. 21 Nov, 2022 1 commit
  26. 14 Nov, 2022 2 commits
  27. 10 Nov, 2022 1 commit
    • XPU multi-card support eager mode (#47445) · 3b91f8f3
      james committed
      * XPU support eager mode
      
      * add unittest for XPU eager mode
      
      * minor bugfix
      
      * minor bugfix, test=kunlun
      
      * correct copyright info
      
      * 1. remove unused vars/funcs
      2. ProcessGroupBKCL inherits from ProcessGroupStream
      
      * bugfix for fp16 in eager mode multi-card, test=kunlun
      
      * rebase & fix a few issues
      
      * use new processgroup interface, test=kunlun
      
      * fix compile issue, test=kunlun
  28. 08 Nov, 2022 1 commit
  29. 04 Nov, 2022 1 commit
  30. 28 Oct, 2022 1 commit
  31. 23 Oct, 2022 1 commit
  32. 12 Oct, 2022 1 commit
    • [CodeStyle][F401] remove unused imports in python/paddle/distributed (#46758) · fe716a0b
      Nyakku Shigure committed
      * [CodeStyle][F401] remove unused import in python/paddle/distributed
      
      * remove pass
      
      * empty commit
      
      * Fix ValueError: list.remove(x): x not in list for meta_optimizer_names.
      
      Fix ValueError: list.remove(x): x not in list for meta_optimizer_names.
      
      * Fix split import.
      
      Fix split import.
      
      * add noqa after meta_optimizers in factory
      
      * restore collective ops
      
      * expand `import *`
      
      * add noqa after required imports
      
      * try to fix APIs without core.ops
      
      * Revert "try to fix APIs without core.ops"
      
      This reverts commit 6172beaf601e84bf61f2490c12c4739f0edaa5eb.
      
      * fix an increment
      
      * empty commit
      
      * add noqa after required imports
      
      * expand `import *`, fix ci error
      Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
  33. 11 Oct, 2022 1 commit
  34. 10 Oct, 2022 1 commit
  35. 22 Sep, 2022 1 commit
    • Fix the En docs (delete some expression like 'This OP') (#46165) · 3a928a8c
      张春乔 committed
      * 1. Delete some expressions like 'This Op'
      2. Remove import numpy as np
      
      * test=document_fix
      
      * fix eg; test=document_fix
      
      * fix 'import numpy' cases; test=document_fix
      
      * fix 'import numpy' cases; test=document_fix
      
      * fix some docs; test=document_fix
      
      * delete raise; test=document_fix
      
      * add some introduction; test=document_fix
      
      * add some introduction; test=document_fix
      
      * test=document_fix
      
      * Fix 'note' format; test=document_fix
      
      * Fix Returns of cholesky; test=document_fix
      
      * Fix Example format; test=document_fix
      
      * Fix det; test=document_fix
      
      * Fix eig; test=document_fix
      
      * Fix eigh; test=document_fix
      
      * Fix eigh; test=document_fix
      
      * Apply suggestions from code review;test = document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * Apply suggestions from code review;test = document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * Apply suggestions from code review;test = document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
      
      * test=document_fix
      
      * test=document_fix
      
      * KLDiv;test=document_fix
      
      * norm example code; test=document_fix
      
      * revert python/paddle/fluid/**/*
      
      * revert python/paddle/distributed/spawn.py
      
      * revert python/paddle/fluid/*
      
      * fix a `Note` format
      
      * Fix inv; test=document_fix
      
      * Fix lu; test=document_fix
      
      * Fix lu_unpack; test=document_fix
      
      * Fix matrix_power; test=document_fix
      
      * Fix multi_dot; test=document_fix
      
      * Fix solve; test=document_fix
      Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
  36. 21 Sep, 2022 1 commit
  37. 19 Sep, 2022 1 commit
  38. 16 Sep, 2022 1 commit
    • refactor mp. (#45803) · fa97e5ba
      wuhuachaocoding committed
      * refactor mp.
      
      * update setup.py.
      
      * update mp_layers.py for compatibility.
      
      * add documents for mp_layers.py
      
      * update init.py
      
      * update collective.py.
      
      * update.
      
      * update mp_ops.py
      
      * update.
      
      * update code style.
      
      * update code style.
  39. 14 Sep, 2022 1 commit