- 22 Sep, 2022 1 commit
-
-
Committed by zhaoyingli
-
- 20 Sep, 2022 3 commits
-
-
Committed by ziyoujiyi
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud comm ready * . * . * fix gloo compile warning * adapt for nn fl-ps * flps del fake-init op * add learning_rate_0 initializer op
-
Committed by HongyuJia
* polish code comments * polish data_device_transform.cc
-
Committed by zhaoyingli
* [Auto Parallel] Change the import way of Auto Parallel (#46115) * fix strategy (#46256) * [Auto Parallel] performance improvement for Sharding-DP hybrid parallelism (#46180) * remove no need grad allreduce communication when sharding-dp * remove no need grad allreduce communication when sharding-dp * bugfix * bugfix * bugfix Co-authored-by: Yulong Ao <aoyulong@baidu.com> Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
-
- 19 Sep, 2022 6 commits
-
-
Committed by wuhuachaocoding
-
Committed by Xiaoxu Chen
* [cherry-pick] extend reduce_sum,reduce_sum,eq,ne,ge,abs,pow,etc higher order operators * add reduce_mean,reduce_sum primitive ops * add ne_p gt_p primitive operators * add ge_p abs_p primitive operators * add cast primitive operators * add pow,square prim2orig rules * add elementwise_div orig2prim rule * [cherry-pick] add mean,sum,ge,gt,ne,abs,etc higher-order differentiation operators(#45888) * add reduce_mean,reduce_sum primitive ops * add ne_p gt_p primitive operators * add ge_p abs_p primitive operators
-
Committed by wuhuachaocoding
* refactor mp. * update setup.py. * update mp_layers.py for compatibility. * add documents for mp_layers.py * update init.py * update collective.py. * update. * update mp_ops.py * update. * update code style. * update code style.
-
Committed by Yulong Ao
* [AutoParallel] adapt gradient merge pass (#45915) * adapt gradient merge * fix op_role * fix strategy * [Auto Parallel] Gradient Fuse Allreduce (#45643) * bugfix (#45332) * dist embedding support lookup table v1 * add unittest * customize wait_comm * group gradients * bugfix * update program * [Auto Parallel] Improve the APIs (#45776) * [Auto Parallel] Use c++ dist attr in the completion process * [Auto Parallel] Add minor changes * [Auto Parallel] Use c++ dist attr in the completion process * [Auto Parallel] Add minor changes * [Auto Parallel] Add the serialization process for dist attrs * [Auto Parallel] Remove unnecessary comments * [Auto Parallel] Fix some bugs * [Auto Parallel] Fix the code style * [Auto Parallel] Remove unnecessary impls * [Auto Parallel] Fix the importing error * [Auto Parallel] Fix the copy from bugs of op dist attr * [Auto Parallel] Replace the use of constexpr if * [Auto Parallel] Redesign the shard_tensor, shard_op and ProcessMesh * [Auto Parallel] Change API of the completion unittest * [Auto Parallel] Fix the bug when set_attr an int * [Auto Parallel] Add the unittest for the serialization * [Auto Parallel] Add some unit tests * [Auto Parallel] Unify the strategy * [Auto Parallel] Improve the engine api * [Auto Parallel] Reset the changes made to the framework * [Auto Parallel] Change the engine unittest * [Auto Parallel] Update API of the completion and partitioner * [Auto Parallel] Update unit tests using engine api * update shard annotation * [Auto Parallel] Remove the modifications of other modules * [Auto Parallel] Add docs for APIs * add new strategy * [Auto Parallel] Replace the logger * [Auto Parallel] Restore the test_program.py * [Auto Parallel] Change the import rules * [Auto Parallel] Add the examples for Engine * [Auto Parallel] Do some minor changes * [Auto Parallel] Remove yaml dependency * [Auto Parallel] Fix the unittests * add valid after train * bug fix Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
Co-authored-by: caozhou <caozhou@radi.ac.cn> Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com> * [Auto Parallel] Bugfix allreduce fuse for MP (#46086) * bugfix * bugfix * typos fixed * update strategy (#46138) Co-authored-by: zhaoyingli <86812880+zhaoyinglia@users.noreply.github.com> Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com> Co-authored-by: zhaoyingli <zhaoyingli@baidu.com> Co-authored-by: caozhou <caozhou@radi.ac.cn> Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>
-
Committed by Chen Weihang
This reverts commit c252b1de.
-
Committed by ShenLiang
-
- 17 Sep, 2022 1 commit
-
-
Committed by ziyoujiyi
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud comm ready * . * . * fix gloo compile warning * adapt for nn fl-ps
-
- 15 Sep, 2022 1 commit
-
-
Committed by Charles-hit
-
- 09 Sep, 2022 3 commits
-
-
Committed by zhaoyingli
* adapt lazy init and fix pass * add unittest * update comment * fix amp and sharding * remove clip_by_norm
-
Committed by Chen Weihang
* simplify size op * trans to cuda manually * fix copy error
-
Committed by Yuang Liu
-
- 08 Sep, 2022 1 commit
-
-
Committed by LiYuRio
-
- 07 Sep, 2022 2 commits
- 06 Sep, 2022 2 commits
- 05 Sep, 2022 1 commit
-
-
Committed by zhaoyingli
* dist_matmul trans * update unittest * update cmakelist
-
- 02 Sep, 2022 3 commits
-
-
Committed by JZ-LIANG
* bugfix (#45332) * customize wait_comm
-
Committed by wuhuachaocoding
-
Committed by Haohongxiang
-
- 01 Sep, 2022 3 commits
-
-
Committed by kuizhiqing
* add fetch and prune for build cinn pass * add prune flag
-
Committed by wangguanqun
* config * fix unittest * zero init & cache & patch config * add barrier to save and load * add unittest
-
Committed by zhaoyingli
-
- 31 Aug, 2022 3 commits
-
-
Committed by JZ-LIANG
* bugfix (#45332) * dist embedding support lookup table v1 * add unittest * update unittest cmake
-
Committed by zhaoyingli
* add grad_clip pass * add unittest * add notes * update func * add dist_attr for new op
-
Committed by LiYuRio
-
- 29 Aug, 2022 1 commit
-
-
Committed by Wen Sun
-
- 26 Aug, 2022 3 commits
-
-
Committed by Yuang Liu
-
Committed by wanghuancoder
-
Committed by Yuang Liu
-
- 25 Aug, 2022 2 commits
-
-
Committed by ziyoujiyi
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud comm ready * . * . * fl-ps v1.0 * . * support N + N mode * . * . * . * . * delete print * . * . * . * . * fix bug * . * . * fl-ps with coordinator ready * merge dev * update message parse only * update fl client scheduler * fix bug * update multithreads sync * fix ci errors * update role_maker.py * update role_maker.py * fix ci error: windows py import error * fix ci error: windows py import error * fix windows ci pylib import error * add dump fields & params * try to fix windows import fleet error * fix ps FLAGS error * fix logging risk * fix logging possible risk * write trainer_desc file * support split sparse params in local & remote * fix import paddle.fluid.core.PSGPU * fix import paddle.fluid.core.PSGPU * add remote_sparse & local_sparse config * fix unittest * fix test_dist_fleet_geo table error * fix PADDLE_ENFORCE error * fix other's pr conflict * forbidden ssd table * . * recover ssd table code * recover file mode
-
Committed by JZ-LIANG
* support high order differential with data parallel overlap * update unittest
-
- 23 Aug, 2022 4 commits
-
-
Committed by zhaoyingli
* add quant pass
-
Committed by LiYuRio
-
Committed by JZ-LIANG
-
Committed by JZ-LIANG
* bugfix * remove scaling * support rescale_grad opt * add unittest
-