- 19 Oct, 2022 1 commit
Committed by Ghost Screaming
* Fix a bug in the reduce_sum op: the result was wrong when input.numel() > INT32_MAX.
* Support the allow_partial switch, which can be configured in pipeline_configs. If the tensors sent from different hosts are not the same, they should not be sent partially and then concatenated into a whole tensor.
* Rename allow_partial to enable_partial_send_recv.
* Add global variable _enable_partial_send_recv
-
- 18 Oct, 2022 2 commits
Committed by Yuang Liu
* [dygraph sharding] Overlap the reduce and the calculation for sharding stage 2. (#46495)
* [dygraph sharding stage 2] sharding broadcast overlap (#46656)
* Multi groups for broadcast of sharding stage 2 (#46894)
-
Committed by Haohongxiang
* [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)
* [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)
* update
-
- 17 Oct, 2022 1 commit
Committed by Wen Sun
* Support both use_calc_stream and sync_op in send recv APIs (#46023)
* Support both use_calc_stream and sync_op in allgather API (#46295)
* Support both use_calc_stream and sync_op in collective communication API (#46761)
* Move group and all reduce from collective to communication (#45848)
* Complete bfloat16 dtype for collective API in eager mode (#45844)
* Fix collective APIs not being recognized when building docs (#46962)
Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>
-
- 11 Oct, 2022 1 commit
Committed by Yuang Liu
* bug fix for virtual pipeline parallel (#45922)
* don't wait for send op under dygraph pp (#46209)
* [interleave pp] sync recv for 1f1b (#46399)
* [dygraph pp] all sync for allgather partial (#46483)
-
- 27 Sep, 2022 1 commit
Committed by LiYuRio
-
- 22 Sep, 2022 2 commits
Committed by Roc
Unify the logger manager in FleetAPI. Hide APIs under distributed/utils that users don't need.
-
Committed by Haohongxiang
* fix bugs of mp
* fix bugs of mp
* update
* update
* fix bug
-
- 20 Sep, 2022 2 commits
Committed by HongyuJia
* polish code comments
* polish data_device_transform.cc
-
Committed by zhaoyingli
* [Auto Parallel] Change the import way of Auto Parallel (#46115)
* fix strategy (#46256)
* [Auto Parallel] performance improvement for Sharding-DP hybrid parallelism (#46180)
* remove unneeded grad allreduce communication when sharding-dp
* remove unneeded grad allreduce communication when sharding-dp
* bugfix
* bugfix
* bugfix
Co-authored-by: Yulong Ao <aoyulong@baidu.com>
Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
-
- 19 Sep, 2022 3 commits
Committed by wuhuachaocoding
-
Committed by wuhuachaocoding
* refactor mp.
* update setup.py.
* update mp_layers.py for compatibility.
* add documents for mp_layers.py
* update init.py
* update collective.py.
* update.
* update mp_ops.py
* update.
* update code style.
* update code style.
-
Committed by ShenLiang
-
- 09 Sep, 2022 1 commit
Committed by Yuang Liu
-
- 07 Sep, 2022 2 commits
- 06 Sep, 2022 1 commit
Committed by Yuang Liu
-
- 02 Sep, 2022 1 commit
Committed by wuhuachaocoding
-
- 01 Sep, 2022 1 commit
Committed by wangguanqun
* config
* fix unittest
* zero init & cache & patch config
* add barrier to save and load
* add unittest
-
- 26 Aug, 2022 3 commits
Committed by Yuang Liu
-
Committed by wanghuancoder
-
Committed by Yuang Liu
-
- 23 Aug, 2022 2 commits
Committed by zhaoyingli
* add quant pass
-
Committed by LiYuRio
-
- 16 Aug, 2022 1 commit
Committed by Haohongxiang
* reconstruct_of_fleet_api
* update
-
- 15 Aug, 2022 1 commit
Committed by wuhuachaocoding
* refactor fleet.
* refactor fleet.py.
* update fleet/__init__.py.
* update fleet.py
* update code style.
* update fleet
* update fleet
* update fleet
* update fleet
* update model.py
* update fleet.
* update __init__.py
* update fleet.
* update fleet.
* update fleet
* update fleet
* update fleet
* update fleet.
* update optimizer.py
* update optimizer
* update fleet.py
* update scaler.py
* update setup.py.in
-
- 13 Aug, 2022 1 commit
Committed by ziyoujiyi
* back fl
* delete ssl cert
* .
* make warning
* .
* unittest paral degree
* solve unittest
* heter & multi cloud comm ready
* .
* .
* fl-ps v1.0
* .
* support N + N mode
* .
* .
* .
* .
* delete print
* .
* .
* .
* .
* fix bug
* .
* .
* fl-ps with coordinator ready
* merge dev
* update message parse only
* update fl client scheduler
* fix bug
* update multithreads sync
* fix ci errors
* update role_maker.py
* update role_maker.py
* fix ci error: windows py import error
* fix ci error: windows py import error
* fix windows ci pylib import error
* add dump fields & params
* try to fix windows import fleet error
* fix ps FLAGS error
* fix logging risk
* fix logging possible risk
* write trainer_desc file
* support split sparse params in local & remote
* fix import paddle.fluid.core.PSGPU
* fix import paddle.fluid.core.PSGPU
* add remote_sparse & local_sparse config
* fix unittest
* fix test_dist_fleet_geo table error
* fix PADDLE_ENFORCE error
* fix other's pr conflict
-
- 12 Aug, 2022 1 commit
Committed by hong
-
- 10 Aug, 2022 1 commit
Committed by Aurelius84
* [OpAttr] Support VarDesc* and vector<VarDesc*> in Attribute
* add unittest for inference predictor
-
- 09 Aug, 2022 2 commits
Committed by zhaocaibei123
* save load
* save load
* add unittest
* first commit
* second commit
* third commit
* remove SaveLocalFS in memory sparse table
* save dense param
* update
* push slot
* fix push show clk: int -> float
* add unittest
* fix sample
* unittest
* add AsExtra for op
* unittest
* modify fs.py
* modify fs.py
* fix some bugs
* add dataset hdfs config
* local change
* dataset use different hadoop ugi/fs_name
* add
* fix conflict
* fix
* remove logs
* code style
* fix
* code style
* code style
* fix
* code style
* save_dense_param
* fix
* fix
* fix
* fix
* change momentum in dense optimizer
* fix
* fix
* change fluid => paddle.static
* remove some useless code
Co-authored-by: esythan <esythan@126.com>
-
Committed by Yuang Liu
-
- 08 Aug, 2022 1 commit
Committed by Haohongxiang
-
- 03 Aug, 2022 2 commits
Committed by ronnywang
* [CustomDevice] add custom ccl 2/2
* update
* update
* update launch
-
Committed by muyuliufeng
-
- 01 Aug, 2022 1 commit
Committed by danleifeng
Co-authored-by: seemingwang <zsasuke@qq.com>
Co-authored-by: DesmonDay <908660116@qq.com>
Co-authored-by: seemingwang <seemingwang@users.noreply.github.com>
Co-authored-by: Thunderbrook <a754913769@163.com>
Co-authored-by: xuewujiao <105861147+xuewujiao@users.noreply.github.com>
Co-authored-by: root <root@yq01-sys-hic-k8s-v100-box-a225-0693.yq01.baidu.com>
Co-authored-by: Thunderbrook <52529258+Thunderbrook@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>
Co-authored-by: yaoxuefeng <yaoxuefeng@baidu.com>
Co-authored-by: lxsbupt <luoxsbupt@163.com>
Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0008.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: qingshui <qshuihu@gmail.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>
-
- 26 Jul, 2022 1 commit
Committed by ziyoujiyi
* back fl
* delete ssl cert
* .
* make warning
* .
* unittest paral degree
* solve unittest
* heter & multi cloud comm ready
* .
* .
* fl-ps v1.0
* .
* support N + N mode
* .
* .
* .
* .
* delete print
* .
* .
* .
* .
* fix bug
* .
* .
* fl-ps with coordinator ready
* merge dev
* update message parse only
* update fl client scheduler
* fix bug
* update multithreads sync
* fix ci errors
* update role_maker.py
* update role_maker.py
* fix ci error: windows py import error
* fix ci error: windows py import error
* fix windows ci pylib import error
* add dump fields & params
* try to fix windows import fleet error
* fix ps FLAGS error
-
- 22 Jul, 2022 1 commit
Committed by Haohongxiang
-
- 20 Jul, 2022 1 commit
Committed by danleifeng
* add adam/sharedadam optimizer for gpups; edit optimizer struct; test=develop
-
- 13 Jul, 2022 2 commits
Committed by ShenLiang
-
Committed by Jiabin Yang
* fix sharding in eager
* support eager sharding
-