- 30 Aug, 2023 1 commit
-
-
Committed by Ghost Screaming
* verify that fluid operators support the new comm library * u * u * u * compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter and c_scatter. * Remove useless comments in process_group.py * Polish code style. * Fix some problems. * Remove use of fluid API in phi comm_context_manager. * Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro checks. * Fix bug of HIP architecture. * Fix some problems. 1. Remove useless logging. 2. Fix conditional compilation for HIP. 3. Fix problems of test_pass_generation_pipeline.py. It calls paddle.distributed.init_parallel_env() first, then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env() calls paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding NCCLCommContext. But dev_id is not set, so NCCLCommContext's dev_ctx is not initialized. * Fix some problems. * Polish code. * Polish code. * Revert compatible upgrade for communication operators. Their upgrades will be submitted in another PR. * Remove StaticTCPStore. * Remove useless modification. * Remove useless set_cuda_device_id. * Polish code. * Remove fluid header files in phi files. * Remove useless comments. * Fix problems of hip arch. * Fix some problems. * Polish code. * Polish code style. --------- Co-authored-by: hitywt <yuwentao126@126.com>
-
- 25 Aug, 2023 1 commit
-
-
Committed by ronnywang
-
- 22 Aug, 2023 1 commit
-
-
Committed by PommesPeter
* fix: updated code examples. * fix: added paddle.seed * fix: updated code style * Apply suggestions from code review * refactor: refine detail of code examples * Update python/paddle/distributed/auto_parallel/static/process_mesh_v2.py * fix: refine detail * fix: refine detail * Update python/paddle/distributed/auto_parallel/static/process_mesh_v2.py Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com> * refactor: refine detail * refactor: refine detail * fix: refine doc --------- Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
-
- 07 Aug, 2023 1 commit
-
-
Committed by LiYuRio
* make tcp store a global instance * fix windows compile error
-
- 01 Aug, 2023 1 commit
-
-
Committed by LiYuRio
* use string as key for comm_context_manager * remove device_id from comm_context
-
- 13 Jul, 2023 1 commit
-
-
Committed by lil-Xing
* add phi operator c_concat and ut * update create_var use * update copyright
-
- 29 Jun, 2023 1 commit
-
-
Committed by TaoTao Li
* update dygraph collective fix ut * remove debug log
-
- 22 May, 2023 1 commit
-
-
Committed by Meteor Liu
* [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * [dygraph]unify _non_static_mode() in_dygraph_mode() and in_dynamic_mode() * fixed cyclic reference that caused partial import * fixed bad change * fix bad import * fix bad import * fix bad import * fix ut failed caused by change in_dynamic_mode * fix ut failed caused by change in_dynamic_mode * fixed usage of in_dynamic_mode() or in_dygraph_mode() * revert python3 to python in .pre-commit-config.yaml * fix merge conflicts
-
- 17 Apr, 2023 1 commit
-
-
Committed by 张春乔
* remove hccl in .py files * remove ascend in setup.py.in * remove ascend in setup.py
-
- 14 Apr, 2023 1 commit
-
-
Committed by ronnywang
-
- 06 Apr, 2023 1 commit
-
-
Committed by Kim Yann
* rem is_compiled_with_npu * rem npu related code * make lint happy * rem test * remove some tests * Update grad_scaler.py * fix an error
-
- 03 Apr, 2023 1 commit
-
-
Committed by Kim Yann
* rem is_compiled_with_mlu * fix some mlu_place and mlu_device_count * make lint happy
-
- 25 Mar, 2023 1 commit
-
-
Committed by 张春乔
-
- 22 Mar, 2023 1 commit
-
-
Committed by Ainavo
* replace assert false with AssertionError * fix the redundant parts of the configuration file
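
The commit message does not state its motivation. One common reason for this kind of change is that Python removes `assert` statements when run with `-O`, whereas an explicit `raise AssertionError` always executes. A minimal sketch of the two forms follows; the function names are hypothetical and are not taken from the Paddle code base.

```python
def check_positive_assert(value):
    # `assert` statements are compiled away when Python runs with -O,
    # so this guard silently disappears in optimized mode.
    if value <= 0:
        assert False, "value must be positive"
    return value


def check_positive_raise(value):
    # Raising AssertionError explicitly keeps the check active
    # regardless of interpreter optimization flags.
    if value <= 0:
        raise AssertionError("value must be positive")
    return value
```

Both functions behave the same in a default interpreter run; only the second keeps rejecting invalid input under `python -O`.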
-
- 13 Jan, 2023 1 commit
-
-
Committed by duanyanhui
* clear ProcessGroupCustom manually * fix bug * fix bug * move destroy ProcessGroup to ProcessGroupIdMap * enable destroy to all device * remove unused comments * change to internal api * Update process_group.cc * Update process_group.cc
-
- 10 Jan, 2023 1 commit
-
-
Committed by Wen Sun
* refactor: gloo comm context migration * fix: headers & avoid mutable_data usage * fix: cmake gloo dep * style: rename funcs * refactor: move to new files * fix: gloo deps * refactor: simplify create device
-
- 09 Jan, 2023 1 commit
-
-
Committed by LiYuRio
* comm_context and static init * refactor: move to phi/core/distributed * refactor: avoid mutable_data usage * fix: windows sock * fix: device without nccl Co-authored-by: Wen Sun <syl1887415157@126.com>
-
- 30 Dec, 2022 1 commit
-
-
Committed by Sanbu
* 1219 * temporarily change the num_diff_files limit, test=document_fix * Revert "temporarily change the num_diff_files limit, test=document_fix" This reverts commit 8e70f00ef468d2dad0e38b3da06295ed62990d20. * for codestyle * remove duplicate license * `static mode` -> `static graph mode` * Update hybrid_parallel_inference.py * Update layer_function_generator.py * Update manipulation.py * reset Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com> Co-authored-by: SigureMo <sigure.qaq@gmail.com>
-
- 26 Dec, 2022 1 commit
-
-
Committed by Wen Sun
* feat: broadcast_object_list & scatter_object_list * chore: update ut conf * get_backend & is_available * docs: update requirements * fix: resolve conflicts Co-authored-by: LiYuRio <liyuruijx@163.com>
-
- 25 Dec, 2022 1 commit
-
-
Committed by wanghuancoder
* delete legacy dygraph code in python/paddle/distributed * refine
-
- 08 Dec, 2022 1 commit
-
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong. * Remove climits. * Clean fluid API in paddle/distributed and paddle/fleetx folders. This includes the following files: python/paddle/distributed/__init__.py python/paddle/distributed/collective.py python/paddle/distributed/fleet/utils/fs.py python/paddle/distributed/fleet/utils/hybrid_parallel_inference.py python/paddle/distributed/fleet/utils/hybrid_parallel_util.py python/paddle/distributed/fleet/utils/internal_storage.py python/paddle/distributed/launch/context/device.py python/paddle/distributed/parallel.py python/paddle/distributed/parallel_with_gloo.py python/paddle/distributed/spawn.py python/paddle/framework/__init__.py Note that 'paddle.fluid.dygraph.parallel.ParallelEnv' and 'fluid.framework.core' remain unchanged in those files. ParallelEnv is used by paddle.fluid.dygraph.parallel.DataParallel. However, APIs in paddle.fluid.dygraph.parallel can't be migrated to paddle.distributed, as there are cyclic import dependencies in modules like paddle.static and paddle.tensor. And 'fluid.framework.core' will be changed to import framework.core after fluid.core is migrated. * Change TODO authors.
-
- 29 Nov, 2022 1 commit
-
-
Committed by Nyakku Shigure
* isort all files * revert conflicting files * revert conflicting files * revert conflicting files
-
- 28 Nov, 2022 1 commit
-
-
Committed by Wen Sun
* refactor: move wait * refactor: move barrier * fix: fix incorrect import
-
- 25 Nov, 2022 1 commit
-
-
Committed by Wen Sun
* refactor: move all_gather
-
- 21 Nov, 2022 1 commit
-
-
Committed by LiYuRio
-
- 14 Nov, 2022 2 commits
- 10 Nov, 2022 1 commit
-
-
Committed by james
* XPU support eager mode * add unittest for XPU eager mode * minor bugfix * minor bugfix, test=kunlun * correct copyright info * 1. remove unused vars/funcs 2. ProcessGroupBKCL inherits from ProcessGroupStream * bugfix for fp16 in eager mode multi-card, test=kunlun * rebase & fix a few issues * use new processgroup interface, test=kunlun * fix compile issue, test=kunlun
-
- 08 Nov, 2022 1 commit
-
-
Committed by LiYuRio
-
- 04 Nov, 2022 1 commit
-
-
Committed by LiYuRio
-
- 28 Oct, 2022 1 commit
-
-
Committed by LiYuRio
-
- 23 Oct, 2022 1 commit
-
-
Committed by Nyakku Shigure
* update config * re-blacken python code * temporarily disable date and diff_py_file * skip a format
-
- 12 Oct, 2022 1 commit
-
-
Committed by Nyakku Shigure
* [CodeStyle][F401] remove unused import in python/paddle/distributed * remove pass * empty commit * Fix ValueError: list.remove(x): x not in list for meta_optimizer_names. Fix ValueError: list.remove(x): x not in list for meta_optimizer_names. * Fix split import. Fix split import. * add noqa after meta_optimizers in factory * restore collective ops * expand `import *` * add noqa after required imports * try to fix APIs without core.ops * Revert "try to fix APIs without core.ops" This reverts commit 6172beaf601e84bf61f2490c12c4739f0edaa5eb. * fix an increment * empty commit * add noqa after required imports * expand `import *`, fix ci error Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
-
- 11 Oct, 2022 1 commit
-
-
Committed by Wen Sun
-
- 10 Oct, 2022 1 commit
-
-
Committed by LiYuRio
-
- 22 Sep, 2022 1 commit
-
-
Committed by 张春乔
* 1. Delete some expressions like 'This Op' 2. remove import numpy as np * test=document_fix * fix eg; test=document_fix * fix 'import numpy' cases; test=document_fix * fix 'import numpy' cases; test=document_fix * fix some docs; test=document_fix * delete raise; test=document_fix * add some introduction; test=document_fix * add some introduction; test=document_fix * test=document_fix * Fix 'note' format; test=document_fix * Fix Returns of cholesky; test=document_fix * Fix Example format; test=document_fix * Fix det; test=document_fix * Fix eig; test=document_fix * Fix eigh; test=document_fix * Fix eigh; test=document_fix * Apply suggestions from code review;test = document_fix Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com> * Apply suggestions from code review;test = document_fix Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com> * Apply suggestions from code review;test = document_fix Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com> * test=document_fix * test=document_fix * KLDiv;test=document_fix * norm example code; test=document_fix * revert python/paddle/fluid/**/* * revert python/paddle/distributed/spawn.py * revert python/paddle/fluid/* * fix a `Note` format * Fix inv; test=document_fix * Fix lu; test=document_fix * Fix lu_unpack; test=document_fix * Fix matrix_power; test=document_fix * Fix multi_dot; test=document_fix * Fix solve; test=document_fix Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
-
- 21 Sep, 2022 1 commit
-
-
Committed by LiYuRio
-
- 19 Sep, 2022 1 commit
-
-
Committed by Chen Weihang
This reverts commit c252b1de.
-
- 16 Sep, 2022 1 commit
-
-
Committed by wuhuachaocoding
* refactor mp. * update setup.py. * update mp_layers.py for compatibility. * add documents for mp_layers.py * update init.py * update collective.py. * update. * update mp_ops.py * update. * update code style. * update code style.
-
- 14 Sep, 2022 1 commit
-
-
Committed by Nyakku Shigure
* trim trailing whitespace * fix `.cmake-format.py` * revert npu ut changes, avoid npu ci error
-