- 02 Mar, 2021 1 commit
-
-
Committed by lilong12
* update, test=develop (#30692)
* align the default value of some configuration for fleet to that of single cards (#30740)
* update, test=develop
-
- 26 Feb, 2021 1 commit
-
-
Committed by WangXi
-
- 25 Feb, 2021 1 commit
-
-
Committed by tangwei12
* fix entry
* fix distributed lookup table fuse case
* fix entry bug at first time
* move entry from paddle.fluid -> paddle.distributed
* fix ut with paddle.enable_static()
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: malin10 <malin10@baidu.com>
-
- 23 Feb, 2021 1 commit
-
-
Committed by tangwei12
* test=develop, save/load, shrink
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
Co-authored-by: 123malin <malin10@baidu.com>
-
- 20 Jan, 2021 2 commits
- 19 Jan, 2021 2 commits
- 18 Jan, 2021 1 commit
-
-
Committed by 123malin
* test=develop, fix fleet.metrics(mse, rmse, mae)
-
- 15 Jan, 2021 1 commit
-
-
Committed by 123malin
* test=develop, add distributed_infer (#30300)
* test=develop, add distributed_infer
* test=develop, fix unittest cmakefile conflict
* test=develop, fix test_dist_fleet_base
-
- 14 Jan, 2021 1 commit
-
-
Committed by Chengmo
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
-
- 13 Jan, 2021 2 commits
- 12 Jan, 2021 2 commits
-
-
Committed by Chengmo
* Fix server.h include device_context (#30243)
* fix cmake
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
* 【Paddle.Fleet】Support local save sparse param (#30175)
* add save tensor support
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
* add sparse embedding & load vars for 2.0 & gloo bug fix (#30306)
* add sparse embedding & load vars for 2.0
Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b
* fix hdfs gloo
Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6
* fix gloo hdfs
Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e
* move loadvar/sparse embedding from incubate to static
Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0
Co-authored-by: tangwei12 <tangwei12@baidu.com>
-
Committed by Chengmo
-
- 11 Jan, 2021 2 commits
-
-
Committed by WangXi
* Optimization grad merge performance (#29784)
* [fleet] combine amp and gradient merge, test=develop (#30086)
* fix assign_op_xpu concat_op_xpu warning (#30120)
Co-authored-by: liuyuhui <liuyuhui@baidu.com>
-
Committed by Chen Weihang
att, cherry-pick of #30219
-
- 08 Jan, 2021 1 commit
-
-
Committed by Chen Weihang
* Simplify the options of spawn based on fleetrun (#30144)
* Simplify the options of spawn based on fleetrun
* polish details
* polish doc details
* cleanup enum test=develop (#29294)
Co-authored-by: gongweibao <weibao.gong@gmail.com>
-
- 06 Jan, 2021 1 commit
-
-
Committed by gongweibao
* fix log test=release/2.0
* fix ut test=develop
-
- 05 Jan, 2021 2 commits
-
-
Committed by gongweibao
-
Committed by Chen Weihang
Set FLAGS_selected_gpus for spawn. When a child process starts, it inherits the main process's configuration and sets the FLAGS once, but the environment variable has not been set at that point, so FLAGS_selected_gpus stays the same as in the main process (usually empty). The flags are therefore updated manually here. Note: a unit test was added and then removed; its output showed that nvidia-smi on the CI machine reports only two GPUs, and more than two are needed to test this issue.
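The ordering problem described in this commit message can be illustrated with a minimal sketch. It does not use Paddle; the `Flags` class and `child_setup` function are hypothetical stand-ins, assuming only that the flag registry reads the environment once when it is created and is not refreshed afterwards.

```python
import os


class Flags:
    """Hypothetical flag registry that snapshots environment variables once."""

    def __init__(self, names):
        # Values are read from the environment a single time, at construction.
        self._values = {name: os.environ.get(name, "") for name in names}

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        self._values[name] = value


def child_setup(flags, gpu_id):
    """Mimic a spawned worker: the env var is set only after flags were read."""
    os.environ["FLAGS_selected_gpus"] = str(gpu_id)
    # The registry still holds the value captured earlier (the stale one).
    stale = flags.get("FLAGS_selected_gpus")
    # Manual update, analogous to the fix described in the commit message.
    flags.set("FLAGS_selected_gpus", os.environ["FLAGS_selected_gpus"])
    return stale, flags.get("FLAGS_selected_gpus")


# Start from a clean environment so the snapshot is empty, as in the main process.
os.environ.pop("FLAGS_selected_gpus", None)
flags = Flags(["FLAGS_selected_gpus"])
stale, fresh = child_setup(flags, 1)
```

Here `stale` holds the value captured before the environment variable existed, and `fresh` holds the value after the explicit update. The sketch only shows why an explicit update step is needed; it makes no claim about how Paddle stores its flags internally.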
-
- 31 Dec, 2020 3 commits
- 25 Dec, 2020 1 commit
-
-
Committed by tangwei12
* add ps table (#29463)
* add ps table
Change-Id: I468a04bd071d21ff52654926fcf4d5f3da19e178
* add service (#29560)
* add service, remove ut on mac
* fix heter_profiler & add heter stop method
* fix code style
* merge pscore
Change-Id: Ie7f60d1cdde6755a0c29db26863c6283e9843d57
* fix cmake
Change-Id: I6773509a7b4ca79139ecc40b7bf3eb318ceff8bb
* fix conflict
Change-Id: I35575be0c96a8520f9d756ea7f1ff0b904a165ba
* fix conflict
Change-Id: Ic926ea0b0d67803226d51241397ba3b510226bfa
-
- 22 Dec, 2020 2 commits
- 17 Dec, 2020 1 commit
-
-
Committed by ShenLiang
* Fix the download bug in the case of multiple machines (#29551)
* fix the download bug
* add sort for ips
* Fix bug of matmul_v2 for broadcast case (#29599)
* fix bug of matmul_v2 for broadcast
* Rebuild group automatically in dynamic graph distributed (#29255)
* add tensor_indices in AssignGroupBySize
* add rebuild group in reducer
* fix error message of gather nd (#29521)
-
- 16 Dec, 2020 1 commit
-
-
Committed by JZ-LIANG
* Sharding add hybrid-dp feature
* update sharding in distributed_strategy
* update sharding unittest
* revise code format for sharding
-
- 08 Dec, 2020 1 commit
-
-
Committed by lilong12
* update, test=develop (#29331)
-
- 04 Dec, 2020 1 commit
-
-
Committed by ShenLiang
-
- 03 Dec, 2020 2 commits
- 01 Dec, 2020 1 commit
-
-
Committed by 123malin
* fix fleet api doc
-
- 30 Nov, 2020 2 commits
- 27 Nov, 2020 4 commits
-
-
Committed by ShenLiang
* add reducer
* refine event for memorycopy
* add concat&split for allreduce
* apply concat & split for fuse tensor
* fix nccl dep
* fix the unittest, compile problem and ddp initialize problem
* fix unittest for mac & add some comments & solve the repeated param in sublayers
* fix unittest for windows & fix document
-
Committed by Chen Long
-
Committed by lilong12
-
Committed by lilong12
-