- 27 Sep, 2018 1 commit
-
-
Committed by chengduo
* add GraphNum test=develop
* add graph number check in parallelExecutor test=develop
* fix transformer_model bug test=develop
* fix graph num
-
- 25 Sep, 2018 1 commit
-
-
Committed by Xin Pan
-
- 20 Sep, 2018 1 commit
-
-
Committed by chengduo
* Add Preface
* Add demo code
* Save file
* Refine code
* seems can work
* use elementwise strategy
* Use ElementwiseComputeEx
* Add comments
* extract functions from operator
* Refine code
* Follow comment
* code refine
* add op_fuse pass
* add backward
* code refine
* use TopologySortOperations
* follow comments
* refine IsFusible
* code enhance
* fix op_fusion_pass
* refine code
* refine fuse_elemwise_act_op
* adjust the input and output
* refine logic
* add intermediate_edge
* disable inplace
* follow comments
* refine logic
* follow comments
* Remove the removable IntermediateOut
* change strategy
* code refine
* enable fuse backward
* code refine
* code refine
* rename unit test
* follow comments
-
- 17 Sep, 2018 2 commits
- 15 Sep, 2018 1 commit
-
-
Committed by sneaxiy
-
- 10 Sep, 2018 2 commits
- 14 Aug, 2018 1 commit
-
-
Committed by yuyang18
-
- 09 Aug, 2018 1 commit
-
-
Committed by Xin Pan
Reduce one level of inheritance.
-
- 27 Jul, 2018 1 commit
-
-
Committed by Xin Pan
-
- 26 Jul, 2018 5 commits
- 22 Jul, 2018 1 commit
-
-
Committed by Xin Pan
-
- 18 Jul, 2018 5 commits
- 15 Jul, 2018 1 commit
-
-
Committed by chengduo
* Add learning rate decay test
* fix test name
* doesn't share @LR_DECAY_COUNTER@
-
- 13 Jul, 2018 1 commit
-
-
Committed by chengduo
* refine multi-thread CPU Parallel exe
* refine multi thread CPU Parallel exe
* Refine CPU version for ParallelExecutor
* add share_parameter_between_cards_
* Fix ParallelExecutor bug
* Fix unit test
* Fix parameter opt balance
* Fix with opti (param->grad)
* Add grad to op var
* Remove shard_param_between_cards
-
- 12 Jul, 2018 2 commits
-
-
Committed by Yancey1989
-
Committed by Yancey1989
-
- 29 Jun, 2018 1 commit
-
-
Committed by chengduo
* Fix tensorcopy bug
* follow comment
* Refine TensorCopy
-
- 28 Jun, 2018 1 commit
-
-
Committed by chengduo
-
- 26 Jun, 2018 4 commits
- 21 Jun, 2018 1 commit
-
-
Committed by fengjiayi
-
- 20 Jun, 2018 1 commit
-
-
Committed by Yancey1989
-
- 14 Jun, 2018 1 commit
-
-
Committed by Qiyang Min
* 1. Create a buddy allocator in each place before NcclBcast of the variables
  2. Check the memory usage of all GPUs rather than only the first one
* 1. Make NCCLGroupGuard guard only the ncclBcast part, which avoids ncclGroupEnd blocking exception throwing
  2. Note the usage of NCCLGroupGuard
* Remove the memory usage check of GPUs
* Fix code style
-
- 12 Jun, 2018 1 commit
-
-
Committed by Yancey1989
-
- 11 Jun, 2018 1 commit
-
-
Committed by chengduoZH
Replace use_event with use_cuda: use_event means the program runs with CUDA, so use_cuda may be more intuitive.
-
- 10 Jun, 2018 2 commits
-
-
Committed by chengduoZH
-
Committed by chengduoZH
-
- 08 Jun, 2018 1 commit
-
-
Committed by chengduoZH
-