- 28 Jul, 2022 1 commit
Committed by LiYuRio
- 23 Jun, 2022 1 commit
Committed by zlsh80826
* Reduce gather op unit tests size and increase the timeout
* Add NVIDIA_TF32_OVERRIDE for multi-processes environment
* Remove record test for device event ut
- 21 Jun, 2022 1 commit
Committed by gongweibao
- 15 Jun, 2022 1 commit
Committed by zhaoyingli
* use tempfile to place temporary files
* update
* revert test_communicator
* fix test_dist_base
- 05 Jun, 2022 1 commit
Committed by Sing_chan
* use yapf to format all python file
* yapf exclude two unittests file for they rely on writing and reading file, and format will break them
* disable diff_py_file because too many diff files cause command following failed
- 31 May, 2022 1 commit
Committed by Weilong Wu
* [Eager] fix collective_global_gather
* fix eager_ode = 1
- 28 May, 2022 1 commit
Committed by ShenLiang
* fix alltoall
* rename utest
- 13 Sep, 2021 1 commit
Committed by 李季
* upload global scatter and global gather operators related files
- 21 Jun, 2021 1 commit
Committed by tianshuo78520a
* del py2 code2
* fix test timeout
- 09 Jun, 2021 1 commit
Committed by WangXi
- 26 May, 2021 1 commit
Committed by JZ-LIANG
- 27 Apr, 2021 1 commit
Committed by lilong12
* add alltoall api, test=develop
- 26 Apr, 2021 1 commit
Committed by lilong12
* add sendrecv, test=develop
- 21 Apr, 2021 1 commit
Committed by liuyuhui
- 31 Dec, 2020 3 commits
- 21 Oct, 2020 1 commit
Committed by lilong12
* modify ut cmakefile, test=develop
- 29 Sep, 2020 1 commit
Committed by lilong12
* add gloo initializer, test=develop
- 28 Sep, 2020 2 commits
- 28 Aug, 2020 1 commit
Committed by lilong12
- 27 Aug, 2020 1 commit
Committed by lilong12
add collective op for cpu using gloo and paddle.distributed.* apis
- 21 Aug, 2020 1 commit
Committed by lilong12
- 24 Nov, 2019 1 commit
- 27 Jun, 2019 1 commit
Committed by HaoRen
* fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop
* supports collective training in executor
* make fetch_list runable with variables, add more unittest for use_program_cache test=develop
* fix comment test=develop
* use unique name for nccl_id
* supports output to stream in program_to_code
* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
* set op role in collective training
* add collective op role
* remove orig file
* add build optimizer by strategy
* add collective strategy
* refine collective strategy
* add multi-process role maker
* refine strategy building factory so that we can easily plugin more strategy
* scale loss grad in collective sgd transpiler
* add support for distributed fc
* code format
* revert some features for dist fc
* add support for distributed fc training
* fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop
* supports collective training in executor
* make fetch_list runable with variables, add more unittest for use_program_cache test=develop
* use unique name for nccl_id
* supports output to stream in program_to_code
* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
* set op role in collective training
* add collective op role
* fix comment test=develop
* remove orig file
* add build optimizer by strategy
* add collective strategy
* refine collective strategy
* add multi-process role maker
* refine strategy building factory so that we can easily plugin more strategy
* scale loss grad in collective sgd transpiler
* add support for distributed fc
* code format
* revert some features for dist fc
* add support for distributed fc training
* test=develop add collective op unittest standard
* test=develop remove the test_collective directory
* test=develop remove the test_collective directory
* remove slicegather test
* code format for reducescatter
* update attr of shard_index_op
* Modify macro nccl_helper
* remove test without distribute
* macro collective_helper
* marcro update
* test=develop update support python3.5
* test=develop change gpu memory use to 0.1 when test
* test=develop update ut equal func
* test=develop set flags to 1.5
* test=develop fix pickle dumple py35
* test=develop fix divide in slice and add sync_comm_stream update atol and rtol to 1e-05 rm shard_index op and test modify read input from file to read from memory remove origin_program in framework and add i/o in c_sync_calc_stream
* test=develop update unittest sync operator I/O