Commits · 89cfa491567240ae1777cb0803496417566f9511 · BaiXuePrincess / Paddle

02 Mar, 2020 1 commit

Committed by Zhen Wang on Mar 02, 2020

* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results.

* add the unit test for fetch_unmerged.

* update ut for multi-card and multi-cpu.

* add the error message and the user suggestion in FetchOpHandle. test=develop

89cfa491

28 Feb, 2020 1 commit

fix typo word (#22784) · 433cef03

Committed by tianshuo78520a on Feb 28, 2020

433cef03
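
11 Feb, 2020 1 commit

Compile without nccl deps. [1/2] (#22509) · a90fa540

Committed by Wilber on Feb 11, 2020

Support compiling without the NCCL dependency. [1/2]

In a multi-GPU setup, if the build was made without the WITH_NCCL switch turned on, the GPUs cannot communicate with each other, so only a single GPU can be used.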
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

a90fa540

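05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

Added a WITH_NCCL option to cmake that explicitly specifies whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

On a single machine with a single GPU, NCCL compilation can be turned off. Multi-GPU use needs NCCL on, which is the default; if NCCL is turned off, only a single GPU can be used.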
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

7bc4b095
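
The option rules stated in the commit message above can be sketched as follows. This is only an illustration in Python, not Paddle's actual CMake or C++ logic; the function names `resolve_with_nccl` and `usable_gpu_count` are hypothetical and exist only for this example.

```python
def resolve_with_nccl(with_gpu: bool, with_nccl: bool = True) -> bool:
    """Effective WITH_NCCL value (hypothetical helper).

    WITH_NCCL defaults to ON, but is forced OFF when WITH_GPU is OFF.
    """
    return with_gpu and with_nccl


def usable_gpu_count(requested_gpus: int, with_gpu: bool, with_nccl: bool = True) -> int:
    """Number of GPUs that can be used under the stated rules (hypothetical helper)."""
    if requested_gpus < 0:
        raise ValueError("requested_gpus must be non-negative")
    if not with_gpu:
        # A CPU-only build has no GPUs to use.
        return 0
    if resolve_with_nccl(with_gpu, with_nccl):
        # NCCL is available, so the GPUs can communicate.
        return requested_gpus
    # Without NCCL the GPUs cannot communicate: at most one can be used.
    return min(requested_gpus, 1)


if __name__ == "__main__":
    for with_gpu, with_nccl in [(True, True), (True, False), (False, True)]:
        print(
            f"WITH_GPU={with_gpu} WITH_NCCL={with_nccl} -> "
            f"effective WITH_NCCL={resolve_with_nccl(with_gpu, with_nccl)}, "
            f"usable GPUs out of 4: {usable_gpu_count(4, with_gpu, with_nccl)}"
        )
```

The sketch treats the two rules separately: one resolves the effective build option, the other derives how many devices remain usable for a given build.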

06 Jan, 2020 1 commit

Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029) · dd436156

Committed by Huihuang Zheng on Jan 06, 2020

dd436156

28 Nov, 2019 1 commit

Polish reference count pass (#21324) · 89966525

Committed by Zeng Jinle on Nov 28, 2019

* fix ref_cnt pass, test=develop

* add cpp unittests to reference_count_pass, test=develop

* follow comments, test=develop

89966525

22 Nov, 2019 1 commit

Polish some PE code details (#21274) · 95250852

Committed by Chen Weihang on Nov 22, 2019

* polish code details, test=develop

* futher polish hint msg, test=develop

95250852

12 Nov, 2019 1 commit

remove so many logs of parallel executor, test=develop (#21105) · d625aaf0

Committed by Zeng Jinle on Nov 12, 2019

d625aaf0

23 Sep, 2019 1 commit

remove the useless warning for user to avoid confuse test=develop (#19871) · 5452b6a1

Committed by wopeizl on Sep 23, 2019

* remove the useless warning for user to avoid confuse test=develop

5452b6a1

18 Sep, 2019 1 commit

[Bug fix] Disable memory reuse on feeded variables (#19835) · db26de83

Committed by Zeng Jinle on Sep 18, 2019

* fix memory reuse bug on feeding variables, test=develop

* add comments to reference count members, test=develop

db26de83

17 Sep, 2019 1 commit

disable memory optimization passes when FLAGS_use_ngraph=True, test=develop (#19778) · 754fd57e

Committed by Zeng Jinle on Sep 17, 2019

754fd57e

11 Sep, 2019 1 commit

Make leaky relu inplacable (#19676) · 0daa5c97

Committed by Zeng Jinle on Sep 11, 2019

* make leaky relu inplacable, test=develop

* force add unittests to pass coverage, test=develop

0daa5c97

30 Aug, 2019 1 commit

Support feed single persistable variable to PE (#19417) · e340df01

Committed by chengduo on Aug 30, 2019

* update executor feed

e340df01

08 Aug, 2019 1 commit

Fix memory overwriting of tensors returned by executor (#19030) · 8f537354

Committed by Leo Chen on Aug 08, 2019

* fix memory overlapping of fetch var (return of executor.run), test=develop

* fix wrong usage of ParallelExecutor in op_test, test=develop

* remove useless parameter and simplify code

* avoid tensor destruct untimely, test=develop

* add testcase independent of OpTest, test=develop

8f537354

29 Jul, 2019 1 commit

Remove legacy C++ memory optimization codes (#18834) · 8008ab4e

Committed by Zeng Jinle on Jul 29, 2019

* remove legacy memory optimization codes, test=develop

* follow huihuang's comments,test=develop

* follow luotao's comments, test=develop

8008ab4e

26 Jul, 2019 1 commit

Feature/mem opt pass refactor (#18735) · a802da65

Committed by Zeng Jinle on Jul 26, 2019

* first version memory optimize pass, test=develop

* remove move_tensor_sharing_pass, test=develop

* refine code comments, add unittests, test=develop

* turn off memory_optimize by default, test=develop

* follow huihuang's comments, test=develop

* follow chengduoZH's comments, test=develop

* fix grammar error, add const qualifier, fix pass_test exception message, test=develop

* follow chengduoZH's comments 2nd, test=develop

a802da65

11 Jul, 2019 2 commits

Polish backwards optimizer dependency codes and use more default values. (#18255) · c0a82748

Committed by gongweibao on Jul 11, 2019

c0a82748

Feature/buffer_shared_inplace (#17911) · d3003a16

Committed by Zeng Jinle on Jul 11, 2019

* feature/buffer_shared_inplace, test=develop

* refine code, test=develop

* fix elementwise_add op cpu inplace and sum inplace bug, test=develop

* add unittest and debug log, test=develop

* fix parallel_executor scope bug, polish code, test=develop

* fix sum op, activation op, single_in_place_inference bug, test=develop

* remove kLocalExecScopeName, test=develop

* fix unittest,test=develop

* fix out_var first version bug, test=develop

* follow comments,test=develop

d3003a16

27 Jun, 2019 1 commit

Fix Bug-prone code of PE (#18354) · 8ed33bf9

Committed by chengduo on Jun 27, 2019

* update pe reduce config
test=develop

*  drop the local_exe_scopes of the previous parallel_executor
test=develop

8ed33bf9

26 Jun, 2019 1 commit

update reduce config (#18334) · 135a59ed

Committed by chengduo on Jun 26, 2019

test=develop

135a59ed

24 Jun, 2019 1 commit

update alloc_continuous_space_for_grad_pass (#18287) · 14e1e165

Committed by chengduo on Jun 24, 2019

test=develop

14e1e165

19 Jun, 2019 1 commit

Update execution_strategy option default value (#18183) · 25f3cd64

Committed by chengduo on Jun 19, 2019

* update execution_strategy option default value
test=develop

* fix doc error
test=develop

25f3cd64

18 Jun, 2019 1 commit

Remove nccl dep when the number of GPU is 1 (#18158) · 4978db2c

Committed by chengduo on Jun 18, 2019

* remove nccl dep when the number of GPU is 1
test=develop

4978db2c

14 Jun, 2019 1 commit

Fix reinitialized ncclid error! (#18025) · f5caf344

Committed by gongweibao on Jun 14, 2019

f5caf344

13 Jun, 2019 1 commit

Update CPU_NUM config (#18059) · b5a1c146

Committed by chengduo on Jun 13, 2019

* update CPU_NUM config
test=develop

b5a1c146

08 Jun, 2019 1 commit

Fix sync_batch_norm_op ncclallreduce error! (#17918) · dd4cd352

Committed by gongweibao on Jun 08, 2019

dd4cd352

06 Jun, 2019 1 commit

Make ParallelExecutor support Windows GPU (#17787) · 453a49b1

Committed by wopeizl on Jun 06, 2019

* fix the ParallelExecutor on Windows
test=develop
* restrict to use one GPU only under windows

453a49b1

03 Jun, 2019 1 commit

polish error doc (#17772) · 863c7516

Committed by chengduo on Jun 03, 2019

test=develop

863c7516

30 May, 2019 1 commit

Add Event in ScopeBuffer Executor (#17667) · 67c8dade

Committed by chengduo on May 30, 2019

* add event for fast executor and add threads for scopebuffer executor
test=develop

67c8dade

29 May, 2019 1 commit

fix 2dconn test=develop (#17681) · 0d561ef4

Committed by gongweibao on May 29, 2019

0d561ef4

27 May, 2019 1 commit

Add multi-ncclcomm and 2D ncclallreduce support. (#17263) · 65bbf950

Committed by gongweibao on May 27, 2019

65bbf950

12 May, 2019 1 commit

Add DropLocalExeScopes in ParallelExecutor (#17297) · bc833945

Committed by chengduo on May 12, 2019

* reset drop local scope counter
test=develop

bc833945

08 May, 2019 1 commit

Code Clean: Move all pass to paddle::framework::ir (#17228) · 04bd413a

Committed by chengduo on May 08, 2019

* move pass to ir

* polish code
test=develop

* fix dependency
test=develop

04bd413a

07 May, 2019 1 commit

fix build warning like 'comparison between signed and unsigned (#17240) · c2e20e2a

Committed by songhao on May 07, 2019

integer', test=develop

c2e20e2a

11 Apr, 2019 1 commit

remove all warnings · 3c2d2368

Committed by dongdaxiang on Apr 11, 2019

test=develop

3c2d2368

03 Apr, 2019 1 commit

Fix the bug of AllReduceDepPass (#16393) · ea2a2f77

Committed by chengduo on Apr 02, 2019

ea2a2f77

28 Mar, 2019 1 commit

Fix the interface of Pass::Apply (#16484) · ed61d67c

Committed by chengduo on Mar 27, 2019

* modify the interface of Pass::Allay
test=develop

* Polish code
test=develop

* Fix Travis CI
test=develop

* fix Pass::Apply interface
test=develop

* Fix Travis CI
test=develop

ed61d67c

20 Mar, 2019 1 commit

Collective ops (#15572) · 6382b62f

Committed by Wu Yi on Mar 20, 2019

* wip allreduce in op

* wip

* wip

* wip

* wip adding test

* wip for conflict with mp mode

* fix tests test=develop

* fix cpu build test=develop

* fix travis clang format test=develop

* fix cpu build test=develop

* update api.spec test=develop

* delete comment test=develop

* fix cpplint test=develop

* fix test=develop

* follow comment test=develop

* add file test=develop

* fix build test=develop

* update test=develop

* to be compatible with sync_bn, and fix mp mode in develop test=develop

6382b62f

15 Mar, 2019 1 commit

Support sync batch norm. (#16121) · 8ad672a2

Committed by qingqing01 on Mar 15, 2019

* Support Sync Batch Norm.
* Note, do not enable it in one device.

Usage:

build_strategy = fluid.BuildStrategy()
build_strategy.sync_batch_norm = True
binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
        loss_name=loss_mean.name,
        build_strategy=build_strategy)

8ad672a2

13 Mar, 2019 1 commit

fix broadcast on mp mode (#15951) · 30568473

Committed by Yan Xu on Mar 13, 2019

* fix broadcast with mp mode

* polish code test=develop

* fix bcast strategy test=develop

* fic cpplint test=develop

* fix py3 failed test=develop

* fix comment test=develop

* update comment test=develop

30568473
