- 17 Feb 2023, 1 commit

  Committed by Wen Sun

- 03 Nov 2022, 1 commit

  Committed by ShenLiang
  * add unbalanced data
  * fix unit test

- 18 Oct 2022, 1 commit

  Committed by Yuang Liu
  * [dygraph sharding] Overlap the reduce and the calculation for sharding stage 2. (#46495)
  * [dygraph sharding stage 2] sharding broadcast overlap (#46656)
  * Multi groups for broadcast of sharding stage 2 (#46894)
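
  The commits above overlap gradient reduction and parameter broadcast with computation in dygraph sharding stage 2. Below is a minimal sketch of enabling stage 2 via the `group_sharded_parallel` helper, assuming a multi-GPU launch (e.g. `paddle.distributed.launch`); argument names may vary across Paddle releases.

```python
import paddle
from paddle.distributed import fleet
from paddle.distributed.sharding import group_sharded_parallel

fleet.init(is_collective=True)

model = paddle.nn.Linear(1024, 1024)
opt = paddle.optimizer.AdamW(learning_rate=1e-3, parameters=model.parameters())

# level="os_g" shards optimizer states and gradients (stage 2); the
# gradient reduce can then overlap with backward computation.
model, opt, scaler = group_sharded_parallel(model, opt, level="os_g")

x = paddle.randn([8, 1024])
loss = model(x).mean()
loss.backward()  # communication is issued while backward still runs
opt.step()
opt.clear_grad()
```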

- 17 Oct 2022, 1 commit

  Committed by Wen Sun
  * Support both use_calc_stream and sync_op in send recv APIs (#46023)
  * Support both use_calc_stream and sync_op in allgather API (#46295)
  * Support both use_calc_stream and sync_op in collective communication API (#46761)
  * Move group and all reduce from collective to communication (#45848)
  * Completes bfloat16 dtype for collective api in eager mode (#45844)
  * Fix collective APIs cannot be recognized when building docs (#46962)
  Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>
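
  These PRs let each collective API run either synchronously or asynchronously (`sync_op`), and optionally on the calculation stream instead of a dedicated communication stream (`use_calc_stream`, exposed through the `paddle.distributed.stream` variants). A hedged sketch follows; signatures are from the 2.4-era API and may differ in other releases.

```python
import paddle
import paddle.distributed as dist

dist.init_parallel_env()
data = paddle.ones([4, 4])

# Asynchronous collective: returns a task to wait on explicitly.
task = dist.all_reduce(data, sync_op=False)
task.wait()

# Stream-level API: use_calc_stream=True issues the communication on the
# calculation stream, so no extra synchronization is needed afterwards
# (use_calc_stream requires sync_op=True).
dist.stream.all_reduce(data, sync_op=True, use_calc_stream=True)
```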

- 27 Sep 2022, 1 commit

  Committed by LiYuRio

- 22 Sep 2022, 1 commit

  Committed by Roc
  Unify the logger manager in the Fleet API; hide the APIs under distributed/utils that users do not need.

- 20 Sep 2022, 2 commits

  Committed by HongyuJia
  * polish code comments
  * polish data_device_transform.cc

  Committed by zhaoyingli
  * [Auto Parallel] Change the import way of Auto Parallel (#46115)
  * fix strategy (#46256)
  * [Auto Parallel] performance improvement for Sharding-DP hybrid parallelism (#46180)
    * remove unneeded grad allreduce communication under sharding-dp
    * bugfix
  Co-authored-by: Yulong Ao <aoyulong@baidu.com>
  Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
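
  Per #46115, Auto Parallel is now imported from `paddle.distributed.fleet` as `auto` rather than from scattered modules. A hedged sketch of the new import plus a sharding-dp style strategy config follows; the config field names below are assumptions based on the 2.4-era API, not verified against this exact commit.

```python
from paddle.distributed.fleet import auto

# Unified entry point for Auto Parallel (the new import way).
strategy = auto.Strategy()
strategy.auto_mode = "semi"

# Sharding-DP hybrid parallelism: field names assumed, not verified.
strategy.sharding.enable = True
strategy.sharding.degree = 2
strategy.sharding.stage = 2
```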

- 19 Sep 2022, 2 commits

  Committed by wuhuachaocoding

  Committed by Yulong Ao
  * [AutoParallel] adapt gradient merge pass (#45915)
    * adapt gradient merge
    * fix op_role
    * fix strategy
  * [Auto Parallel] Gradient Fuse Allreduce (#45643)
    * bugfix (#45332)
    * dist embedding support lookup table v1
    * add unit test
    * customize wait_comm
    * group gradients
    * bugfix
    * update program
  * [Auto Parallel] Improve the APIs (#45776)
    * [Auto Parallel] Use c++ dist attr in the completion process
    * [Auto Parallel] Add minor changes
    * [Auto Parallel] Add the serialization process for dist attrs
    * [Auto Parallel] Remove unnecessary comments
    * [Auto Parallel] Fix some bugs
    * [Auto Parallel] Fix the code style
    * [Auto Parallel] Remove unnecessary impls
    * [Auto Parallel] Fix the importing error
    * [Auto Parallel] Fix the copy from bugs of op dist attr
    * [Auto Parallel] Replace the use of constexpr if
    * [Auto Parallel] Redesign the shard_tensor, shard_op and ProcessMesh
    * [Auto Parallel] Change API of the completion unittest
    * [Auto Parallel] Fix the bug when set_attr an int
    * [Auto Parallel] Add the unittest for the serialization
    * [Auto Parallel] Add some unit tests
    * [Auto Parallel] Unify the strategy
    * [Auto Parallel] Improve the engine api
    * [Auto Parallel] Reset the changes made to the framework
    * [Auto Parallel] Change the engine unittest
    * [Auto Parallel] Update API of the completion and partitioner
    * [Auto Parallel] Update unit tests using engine api
    * update shard annotation
    * [Auto Parallel] Remove the modifications of other modules
    * [Auto Parallel] Add docs for APIs
    * add new strategy
    * [Auto Parallel] Replace the logger
    * [Auto Parallel] Restore the test_program.py
    * [Auto Parallel] Change the import rules
    * [Auto Parallel] Add the examples for Engine
    * [Auto Parallel] Do some minor changes
    * [Auto Parallel] Remove yaml dependency
    * [Auto Parallel] Fix the unittests
    * add valid after train
    * bug fix
  * [Auto Parallel] Bugfix allreduce fuse for MP (#46086)
    * bugfix
    * typos fixed
  * update strategy (#46138)
  Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
  Co-authored-by: zhaoyingli <86812880+zhaoyinglia@users.noreply.github.com>
  Co-authored-by: caozhou <caozhou@radi.ac.cn>
  Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>
  Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
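
  Several of the commits above touch the Engine API ("Improve the engine api", "add valid after train"). Here is a minimal sketch of training and then validating through `auto.Engine`, assuming the 2.4-era signatures; the dataset is a synthetic placeholder.

```python
import numpy as np
import paddle
from paddle.distributed.fleet import auto
from paddle.io import Dataset

class RandomDataset(Dataset):
    """Synthetic data: 1024-dim features, 10 classes."""
    def __init__(self, n=256):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        x = np.random.rand(1024).astype("float32")
        y = np.random.randint(0, 10, [1]).astype("int64")
        return x, y

model = paddle.nn.Linear(1024, 10)
loss = paddle.nn.CrossEntropyLoss()
opt = paddle.optimizer.AdamW(learning_rate=1e-3)

engine = auto.Engine(model, loss, opt, metrics=paddle.metric.Accuracy())
engine.fit(RandomDataset(), epochs=1, batch_size=64)   # train
engine.evaluate(RandomDataset(64), batch_size=64)      # valid after train
```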

- 08 Sep 2022, 1 commit

  Committed by LiYuRio

- 07 Sep 2022, 1 commit

  Committed by Yuang Liu

- 06 Sep 2022, 3 commits
- 05 Sep 2022, 1 commit

  Committed by Roc

- 02 Sep 2022, 1 commit

  Committed by wuhuachaocoding

- 01 Sep 2022, 1 commit

  Committed by Roc

- 31 Aug 2022, 2 commits
- 29 Aug 2022, 1 commit

  Committed by Wen Sun

- 26 Aug 2022, 1 commit

  Committed by Roc
  * add simple reformatted CI files
  * update
  * add readme for new unit tests
  * reset mlu
  * update for samples
  * add base api
  * reset some dist unit tests
  * add warning in generated CMakeLists file
  * update readme for new dist unit tests
  * add all collective tests
  * retain base file and launcher file
  * Update README.md
  * fix env PYTHONPATH
  * Update gen_ut_cmakelists.py
  * add docs for gen_ut_cmakelists.py
  * prettify codes
  * comment name == "name"
  * update for comments
  * update function's help
  * update for run type
  * update readme
  * mv collective test files
  * update for all collective tests
  * update for all tests
  * update for checking name
  * Update CMakeLists.txt
  * update testlist.csv
  * retain test_parallel_dygraph_dataparallel in unittests
  * set broadcast op for all platforms
  * retain test_broadcast_tensors_op
  * fix
  * rm some collective files
  * update more collective tests
  * update gen_ut_supports recursion
  * fix nccl version
  * fix a bug and try to pass
  * add csv
  * update for timeout
  * remove tcp store
  * update for more dist tests
  * move multi node tests
  * fix for auto parallel
  * update path in python file
  * reset some tests in unittests
  * fix port
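
  The gen_ut_cmakelists.py workflow above generates CMakeLists files from a CSV test list. Below is a purely illustrative sketch of that idea; the column names and the emitted CMake snippet are assumptions, not the script's actual format.

```python
import csv

def gen_cmake(csv_path):
    """Map each CSV row (assumed columns: name, timeout, envs) to CMake."""
    entries = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            entries.append(
                'add_test(NAME {name} COMMAND python {name}.py)\n'
                'set_tests_properties({name} PROPERTIES TIMEOUT {timeout} '
                'ENVIRONMENT "{envs}")'.format(**row)
            )
    return "\n".join(entries)
```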

- 18 Aug 2022, 1 commit

  Committed by Roc