- 08 Jan, 2021 1 commit
-
-
Committed by Chen Weihang
Simplify the options of spawn based on fleetrun (#30144)
* Simplify the options of spawn based on fleetrun
* polish details
* polish doc details
* cleanup enum test=develop (#29294)

Co-authored-by: Ngongweibao <weibao.gong@gmail.com>
-
- 05 Jan, 2021 1 commit
-
-
Committed by Chen Weihang
Set FLAGS_selected_gpus for spawn. When a child process starts, it inherits the configuration of the main process and sets the FLAGS once, but the environment variable has not been set at that point, so FLAGS_selected_gpus stays the same as in the main process (usually empty). The flag is therefore updated manually here.
Note: a unit test was added and then removed. Its output showed that nvidia-smi on the CI machine reports only two GPUs, and more than two GPUs are needed to test this issue.
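The ordering problem described in this commit message can be illustrated with a small, self-contained sketch. It does not use Paddle: `FlagRegistry`, `child_setup`, and `run` are hypothetical stand-ins, assuming only that flags are read from the environment once at start-up and that the child sets the environment variable afterwards.

```python
import os

FLAG = "FLAGS_selected_gpus"


class FlagRegistry:
    """Hypothetical stand-in for a flag store that reads the environment once."""

    def __init__(self, names):
        # Snapshot the environment at construction time only.
        self._values = {name: os.environ.get(name, "") for name in names}

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        self._values[name] = value


def child_setup(flags, gpu_id, manual_update):
    # The child exports the variable only after the flags were initialised.
    os.environ[FLAG] = str(gpu_id)
    if manual_update:
        # The fix described above: push the new value into the flags by hand.
        flags.set(FLAG, os.environ[FLAG])
    return flags.get(FLAG)


def run(manual_update):
    # Start from a clean environment, as in a main process with no GPU selected.
    os.environ.pop(FLAG, None)
    flags = FlagRegistry([FLAG])
    return child_setup(flags, gpu_id=1, manual_update=manual_update)


stale = run(manual_update=False)  # flag keeps the inherited (empty) value
fixed = run(manual_update=True)   # flag follows the environment variable
```

Without the manual update the flag keeps the value captured before the environment variable existed, which matches the "usually empty" behaviour the commit describes.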
-
- 26 Nov, 2020 1 commit
-
-
Committed by gongweibao
-
- 24 Nov, 2020 1 commit
-
-
Committed by Chen Weihang
* polish parallel api impl & doc details
* add unittest for coverage
* remove spawn test in py2.7
* add parallel api into white list
-
- 14 Oct, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 29 Sep, 2020 1 commit
-
-
Committed by lilong12
* add gloo initializer, test=develop
-
- 28 Sep, 2020 2 commits
- 31 Aug, 2020 1 commit
-
-
Committed by Chen Weihang
* remove backend argument of init_parallel_env
* remove keep name table in transformer
* add cpu version check
* add skip unittest for init_parallel_env
* polish doc: remove func use & update example
-
- 28 Aug, 2020 1 commit
-
-
Committed by Chen Weihang
* add dygraph parallel run interface
* polish implement & unified env property name
* add print config arg
* refactor init_parallel_env function
* Compatible with multiprocessing and launch modes
* set default trainer start port
* support run in python 2
* polish python2 support code
* remove python2 support
* refine launch import
* polish dome design details
* refactor api implementation & path
* use new method _set_expected_place
* add spawn unittest framework & mnist test
* add more unittests & doc
* fix unittest failed
* polish English doc
* self review and polish details
* refactor code by reviewer's comments
* fix unittest failed
* fix parallel_env unittest
* fix several typos
* fix error introduced when fixing typos
* add unpublic note for start_processes
* polish details by xiaoguang's comment
* verify correctly when spawn nprocs=-1
* refactor spawn & init_parallel_env design
* polish doc details
* open spawn unittests
* try to fix doc compile error
* try to fix unknown doc format error
* add skip unittest when not gpu
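The item "verify correctly when spawn nprocs=-1" suggests that -1 is a sentinel resolved to a concrete process count before workers start. A minimal sketch of such a check follows, assuming that -1 means "use every visible device". The helper `resolve_nprocs` and its parameters are hypothetical and are not taken from the commit itself.

```python
def resolve_nprocs(nprocs, visible_devices, device_count):
    """Hypothetical helper: turn a requested process count into a concrete one.

    visible_devices is a comma-separated device list such as "0,1,2";
    an empty string is assumed to mean that all device_count devices are usable.
    """
    if visible_devices:
        available = len([d for d in visible_devices.split(",") if d.strip()])
    else:
        available = device_count

    if nprocs == -1:
        # Sentinel value: use every available device.
        return available
    if nprocs < 1 or nprocs > available:
        raise ValueError(
            "nprocs must be -1 or between 1 and %d, got %d" % (available, nprocs)
        )
    return nprocs


all_visible = resolve_nprocs(-1, "0,1", 8)   # limited by the visible list
all_devices = resolve_nprocs(-1, "", 4)      # falls back to the device count
explicit = resolve_nprocs(2, "", 4)          # explicit request within range
```

Validating the count up front means an over-sized request fails with a clear error before any child process is started.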
-