- 29 Jun, 2023 1 commit
-
-
Committed by TaoTao Li
* update dygraph collective fix ut
* remove debug log
-
- 14 Apr, 2023 1 commit
-
-
Committed by ronnywang
-
- 04 Apr, 2023 1 commit
-
-
Committed by yuehuayingxueluo
* add gloo gather
* add gloo_tools
* fix CI bug
* use gloo gather
* remove redundant code
* fix process_group_gloo.py
* rename send_recv
* fix conflict
* fix conflict
* fix codestyle
* fix CI bug
* add PADDLE_ENFORCE_NE
-
- 31 Mar, 2023 1 commit
-
-
Committed by zhenhailiu
* gather with doc
* resolve comment
* polish
* polish
* code style
* polish doc
* add_test
* polish
* polish
* add test check
* add test check
* polish
* polish
* polish
* polish
* fix_time_out
* polish
* fix timeout
* fix_timeout
* polish
* polish
* polish
* polish
* polish
-
- 07 Mar, 2023 1 commit
-
-
Committed by Chen Weihang
-
- 09 Feb, 2023 1 commit
-
-
Committed by Roc
Co-authored-by: zhangxiaoci <zhangxiaoci@baidu.com>
-
- 13 Jan, 2023 1 commit
-
-
Committed by duanyanhui
* clear ProcessGroupCustom manually
* fix bug
* fix bug
* move destroy ProcessGroup to ProcessGroupIdMap
* enable destroy to all device
* remove unused comments
* change to internal api
* Update process_group.cc
* Update process_group.cc
-
- 09 Jan, 2023 1 commit
-
-
Committed by LiYuRio
* comm_context and static init
* refactor: move to phi/core/distributed
* refactor: avoid mutable_data usage
* fix: windows sock
* fix: device without nccl
Co-authored-by: Wen Sun <syl1887415157@126.com>
-
- 05 Jan, 2023 1 commit
-
-
Committed by Wen Sun
* refactor: use base class
* fix: incorrect deps
* fix: add missing header
* refactor: update class structures
* fix: bkcl typo
* fix: remove redundant def
-
- 26 Dec, 2022 1 commit
-
-
Committed by Roc
* revert concat and change concat to stack
* let stack kernel support int8, uint8 and bool type
-
- 19 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 17 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 16 Dec, 2022 1 commit
-
-
Committed by Wen Sun
-
- 21 Nov, 2022 3 commits
- 18 Nov, 2022 2 commits
-
-
Committed by Wen Sun
-
Committed by james
* fix device id issue for xpu eager
  xpu device id is not correctly set in eager mode, thus vars are on dev0 unless XPUDeviceGurad is called, leading to this error message for all node rank != 0: "NotImplementedError: (Unimplemented) Place Place(xpu:0) is not supported."
* fix typo
* fix pybind error
-
- 17 Nov, 2022 1 commit
-
-
Committed by Wen Sun
-
- 14 Nov, 2022 3 commits
- 10 Nov, 2022 2 commits
-
-
Committed by james
* XPU support eager mode
* add unittest for XPU eager mode
* minor bugfix
* minor bugfix, test=kunlun
* correct copyright info
* 1. remove unused vars/funcs 2. ProcessGroupBKCL inherit from ProcessGroupStream
* bugfix for fp16 in eager mode multi-card, test=kunlun
* rebase & fix a few issues
* use new processgroup interface, test=kunlun
* fix compile issue, test=kunlun
-
Committed by Wen Sun
* refactor: send, recv, send_partial, recv_partial
* refactor: rm useless const ref
-
- 09 Nov, 2022 1 commit
-
-
Committed by Wen Sun
-
- 08 Nov, 2022 2 commits
- 07 Nov, 2022 1 commit
-
-
Committed by Wen Sun
-
- 28 Oct, 2022 2 commits
-
-
Committed by Haohongxiang
-
Committed by Haohongxiang
* fix no sync bugs
* update
* update task chain fix: update wait chain feat: add `GetDeviceContext` for gloo
* fix oom
* fix dev
* update
* update
Co-authored-by: LiYuRio <liyuruijx@163.com>
Co-authored-by: ForFishes <2282912238@qq.com>
-
- 11 Oct, 2022 1 commit
-
-
Committed by Wen Sun
-
- 08 Oct, 2022 1 commit
-
-
Committed by Haohongxiang
-
- 30 Sep, 2022 1 commit
-
-
Committed by Wen Sun
-
- 21 Sep, 2022 1 commit
-
-
Committed by wuhuachaocoding
-
- 16 Sep, 2022 1 commit
-
-
Committed by Wen Sun
-
- 31 Aug, 2022 1 commit
-
-
Committed by LiYuRio
-
- 03 Aug, 2022 1 commit
-
-
Committed by ronnywang
* [CustomDevice] add custom ccl 2/2
* update
* update
* update launch
-
- 22 Jul, 2022 1 commit
-
-
Committed by Haohongxiang
-
- 11 Jul, 2022 1 commit
-
-
Committed by Haohongxiang
* fix conflict
* new pg apis
* add docs of new apis
* update
* fix coverage
* update
* fix bug
* fix reduce scatter
* fix api
* update
Co-authored-by: ForFishes <2282912238@qq.com>
-
- 22 Jun, 2022 1 commit
-
-
Committed by Haohongxiang
* fix bugs
* update
* update
* update
* code style
* code style check
-