- 30 Aug, 2020 1 commit
-
-
Committed by Chengmo
* Support Heter Parameter Server
-
- 30 Jul, 2020 1 commit
-
-
Committed by tangwei12
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957) * Integrated Trainer of Parameter Server
-
- 08 Jul, 2020 1 commit
-
- 12 Jun, 2020 1 commit
-
-
Committed by tangwei12
* fix sync barrier with barrier monitor, test=develop
-
- 03 Jun, 2020 1 commit
-
-
Committed by Chen Weihang
* remove REPLACE_ENFORCE_GLOG compile option & add ci rule prohibit LOG(FATAL) using, test=develop * remove ci test case, test=develop * replace all LOG(FATAL) & polish message, test=develop * fix typo, test=develop * polish error info detail, test=develop
-
- 07 Apr, 2020 1 commit
-
-
Committed by qingqing01
* Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO
-
- 10 Feb, 2020 1 commit
-
-
Committed by Wilber
Compile without nccl deps. [1/2] Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 13 Jan, 2020 1 commit
-
-
Committed by 123malin
* test=develop, bug fix for sparse recorder
-
- 01 Nov, 2019 1 commit
-
-
Committed by 123malin
* update pserver decay blocks * update distributed notify handler
-
- 18 Oct, 2019 1 commit
-
-
Committed by gongweibao
-
- 16 Oct, 2019 1 commit
-
-
Committed by gongweibao
-
- 15 Oct, 2019 1 commit
-
-
Committed by 123malin
* bug fix: invalid learning rate decay in pserver async mode
-
- 07 Oct, 2019 1 commit
-
-
Committed by tangwei12
Heartbeat for distributed async training.
-
- 18 Sep, 2019 1 commit
-
-
Committed by 123malin
* rpc retry for asycsend/get/prefetch * test=develop, change retry vlog level to 3 * test=develop, set default grpc_retry_times is 3
-
- 12 Aug, 2019 1 commit
-
-
Committed by gongweibao
Polish fleet API to support cuda collective mode and nccl2 mode
-
- 11 Apr, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 10 Apr, 2019 2 commits
-
-
Committed by Qiao Longfei
-
Committed by gongweibao
-
- 27 Mar, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 25 Mar, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 24 Mar, 2019 2 commits
-
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 14 Mar, 2019 2 commits
-
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 21 Feb, 2019 1 commit
-
-
Committed by Dun
* refine profiler && add runtime tracer * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * fix bug && test=develop * add thread id map && test=develop * test=develop * testing * bug fix * remove cuda event && refine code && test=develop * test=develop * test=develop * test=develop * fix windows temp file && test=develop * test=develop * fix windows bug && test=develop * fix start up issue && test=develop * code polish && test=develop * remove unused code && test=develop * add some cupti cbid && test=develop * add FLAGS_multiple_of_cupti_buffer_size && test=develop * fix compile error && test=develop * add keyword && test=develop * fix && test=develop * code polish && test=develop
-
- 23 Jan, 2019 1 commit
-
-
Committed by tangwei12
checkpoint for distributed training.
-
- 18 Jan, 2019 1 commit
-
-
Committed by Qiao Longfei
-
- 28 Dec, 2018 1 commit
-
-
Committed by Wu Yi
* wip * wip * refactor no.1 dir structure test=develop * fix linking test=develop * fix includes test=develop * fix build test=develop * fix build test=develop
-