- 02 3月, 2020 1 次提交
-
-
由 Zhen Wang 提交于
* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results. * add the unit test for fetch_unmerged. * update ut for multi-card and multi-cpu. * add the error message and the user suggestion in FetchOpHandle. test=develop
-
- 28 2月, 2020 1 次提交
-
-
由 tianshuo78520a 提交于
-
- 11 2月, 2020 1 次提交
-
-
由 Wilber 提交于
支持不依赖nccl进行编译。[1/2] 多卡下,如果没有打开WITH_NCCL开关编译,多卡不能通信,则只能选择一张卡使用。 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 05 2月, 2020 1 次提交
-
-
由 Wilber 提交于
cmake选项中添加了WITH_NCCL,显示指定是否编译NCCL的部分代码,WITH_NCCL默认打开,但如果WITH_GPU为OFF,则关闭WITH_NCCL 添加了PADDLE_WITH_NCCL定义 单机单卡能够关闭NCCL编译,多卡的话需要默认打开NCCL,如果关闭NCCL,则只能使用单卡 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 06 1月, 2020 1 次提交
-
-
由 Huihuang Zheng 提交于
-
- 28 11月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* fix ref_cnt pass, test=develop * add cpp unittests to reference_count_pass, test=develop * follow comments, test=develop
-
- 22 11月, 2019 1 次提交
-
-
由 Chen Weihang 提交于
* polish code details, test=develop * futher polish hint msg, test=develop
-
- 12 11月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
-
- 23 9月, 2019 1 次提交
-
-
由 wopeizl 提交于
* remove the useless warning for user to avoid confuse test=develop
-
- 18 9月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* fix memory reuse bug on feeding variables, test=develop * add comments to reference count members, test=develop
-
- 17 9月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
-
- 11 9月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* make leaky relu inplacable, test=develop * force add unittests to pass coverage, test=develop
-
- 30 8月, 2019 1 次提交
-
-
由 chengduo 提交于
* update executor feed
-
- 08 8月, 2019 1 次提交
-
-
由 Leo Chen 提交于
* fix memory overlapping of fetch var (return of executor.run), test=develop * fix wrong usage of ParallelExecutor in op_test, test=develop * remove useless parameter and simplify code * avoid tensor destruct untimely, test=develop * add testcase independent of OpTest, test=develop
-
- 29 7月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* remove legacy memory optimization codes, test=develop * follow huihuang's comments,test=develop * follow luotao's comments, test=develop
-
- 26 7月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* first version memory optimize pass, test=develop * remove move_tensor_sharing_pass, test=develop * refine code comments, add unittests, test=develop * turn off memory_optimize by default, test=develop * follow huihuang's comments, test=develop * follow chengduoZH's comments, test=develop * fix grammar error, add const qualifier, fix pass_test exception message, test=develop * follow chengduoZH's comments 2nd, test=develop
-
- 11 7月, 2019 2 次提交
-
-
由 gongweibao 提交于
-
由 Zeng Jinle 提交于
* feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest,test=develop * fix out_var first version bug, test=develop * follow comments,test=develop
-
- 27 6月, 2019 1 次提交
-
-
由 chengduo 提交于
* update pe reduce config test=develop * drop the local_exe_scopes of the previous parallel_executor test=develop
-
- 26 6月, 2019 1 次提交
-
-
由 chengduo 提交于
test=develop
-
- 24 6月, 2019 1 次提交
-
-
由 chengduo 提交于
test=develop
-
- 19 6月, 2019 1 次提交
-
-
由 chengduo 提交于
* update execution_strategy option default value test=develop * fix doc error test=develop
-
- 18 6月, 2019 1 次提交
-
-
由 chengduo 提交于
* remove nccl dep when the number of GPU is 1 test=develop
-
- 14 6月, 2019 1 次提交
-
-
由 gongweibao 提交于
-
- 13 6月, 2019 1 次提交
-
-
由 chengduo 提交于
* update CPU_NUM config test=develop
-
- 08 6月, 2019 1 次提交
-
-
由 gongweibao 提交于
-
- 06 6月, 2019 1 次提交
-
-
由 wopeizl 提交于
* fix the ParallelExecutor on Windows test=develop * restrict to use one GPU only under windows
-
- 03 6月, 2019 1 次提交
-
-
由 chengduo 提交于
test=develop
-
- 30 5月, 2019 1 次提交
-
-
由 chengduo 提交于
* add event for fast executor and add threads for scopebuffer executor test=develop
-
- 29 5月, 2019 1 次提交
-
-
由 gongweibao 提交于
-
- 27 5月, 2019 1 次提交
-
-
由 gongweibao 提交于
-
- 12 5月, 2019 1 次提交
-
-
由 chengduo 提交于
* reset drop local scope counter test=develop
-
- 08 5月, 2019 1 次提交
-
-
由 chengduo 提交于
* move pass to ir * polish code test=develop * fix dependency test=develop
-
- 07 5月, 2019 1 次提交
-
-
由 songhao 提交于
integer', test=develop
-
- 11 4月, 2019 1 次提交
-
-
由 dongdaxiang 提交于
test=develop
-
- 03 4月, 2019 1 次提交
-
-
由 chengduo 提交于
-
- 28 3月, 2019 1 次提交
-
-
由 chengduo 提交于
* modify the interface of Pass::Allay test=develop * Polish code test=develop * Fix Travis CI test=develop * fix Pass::Apply interface test=develop * Fix Travis CI test=develop
-
- 20 3月, 2019 1 次提交
-
-
由 Wu Yi 提交于
* wip allreduce in op * wip * wip * wip * wip adding test * wip for conflict with mp mode * fix tests test=develop * fix cpu build test=develop * fix travis clang format test=develop * fix cpu build test=develop * update api.spec test=develop * delete comment test=develop * fix cpplint test=develop * fix test=develop * follow comment test=develop * add file test=develop * fix build test=develop * update test=develop * to be compatible with sync_bn, and fix mp mode in develop test=develop
-
- 15 3月, 2019 1 次提交
-
-
由 qingqing01 提交于
* Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy() build_strategy.sync_batch_norm = True binary = fluid.compiler.CompiledProgram(tp).with_data_parallel( loss_name=loss_mean.name, build_strategy=build_strategy)
-
- 13 3月, 2019 1 次提交
-
-
由 Yan Xu 提交于
* fix broadcast with mp mode * polish code test=develop * fix bcast strategy test=develop * fic cpplint test=develop * fix py3 failed test=develop * fix comment test=develop * update comment test=develop
-