- 10 Feb, 2020 1 commit
Committed by Wilber
Compile without nccl deps. [1/2] Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 13 Jan, 2020 1 commit
Committed by 123malin
* test=develop, bug fix for sparse recorder
-
- 01 Nov, 2019 1 commit
Committed by 123malin
* update pserver decay blocks * update distributed notify handler
-
- 18 Oct, 2019 1 commit
Committed by gongweibao
-
- 16 Oct, 2019 1 commit
Committed by gongweibao
-
- 15 Oct, 2019 1 commit
Committed by 123malin
* bug fix: invalid learning rate decay in pserver async mode
-
- 07 Oct, 2019 1 commit
Committed by tangwei12
Heartbeat for distributed async training.
-
- 18 Sep, 2019 1 commit
Committed by 123malin
* rpc retry for async send/get/prefetch * test=develop, change retry vlog level to 3 * test=develop, set default grpc_retry_times to 3
-
- 12 Aug, 2019 1 commit
Committed by gongweibao
Polish fleet API to support cuda collective mode and nccl2 mode
-
- 11 Apr, 2019 1 commit
Committed by Qiao Longfei
-
- 10 Apr, 2019 2 commits
Committed by Qiao Longfei
-
Committed by gongweibao
-
- 27 Mar, 2019 1 commit
Committed by Qiao Longfei
-
- 25 Mar, 2019 1 commit
Committed by Qiao Longfei
-
- 24 Mar, 2019 2 commits
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 14 Mar, 2019 2 commits
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 21 Feb, 2019 1 commit
Committed by Dun
* refine profiler && add runtime tracer * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * fix bug && test=develop * add thread id map && test=develop * test=develop * testing * bug fix * remove cuda event && refine code && test=develop * test=develop * test=develop * test=develop * fix windows temp file && test=develop * test=develop * fix windows bug && test=develop * fix start up issue && test=develop * code polish && test=develop * remove unused code && test=develop * add some cupti cbid && test=develop * add FLAGS_multiple_of_cupti_buffer_size && test=develop * fix compile error && test=develop * add keyword && test=develop * fix && test=develop * code polish && test=develop
-
- 23 Jan, 2019 1 commit
Committed by tangwei12
checkpoint for distributed training.
-
- 18 Jan, 2019 1 commit
Committed by Qiao Longfei
-
- 28 Dec, 2018 1 commit
Committed by Wu Yi
* wip * wip * refactor no.1 dir structure test=develop * fix linking test=develop * fix includes test=develop * fix build test=develop * fix build test=develop
-