Commits · 71168dad57402e3c15531cfb5342e94f4a456cf6 · BaiXuePrincess / Paddle

21 Aug, 2019 · 1 commit

[Cherry Pick] Bug fix and speedup dygraph multi-cards on v1.5 (#19298) · 71168dad

Committed by chengduo on Aug 21, 2019

* add warning info for CPU_NUM
test=develop

* update dygraph parallel.py
test=develop

* prune the feed op in compiler
test=release/1.5

* remove compile from PE
test=develop

* test CUDAPinnedPlace in reader
test=release/1.5

71168dad

20 Aug, 2019 · 1 commit
- [Cherry pick] Fix register op without gradient (#19272) · 305bd25b
  Committed by chengduo on Aug 20, 2019
```
* fix REGISTER_OP_WITHOUT_GRADIENT
test=develop
```
  305bd25b
29 Jul, 2019 · 1 commit
- [Cherry pick] Fix backward error (#18835) · cc3ba765
  Committed by chengduo on Jul 29, 2019
```
* fix backward bug
```
  cc3ba765
08 Jul, 2019 · 1 commit

CHERRY-Pick: Inference: fix mask rcnn model diff, optim memory usage, memory leak. #18532 (#18547) · bc9fd1fc

Committed by Zhaolong Xing on Jul 08, 2019

fix mask rcnn
add an interface for setting optim_cache_dir (e.g., in TRT int8 mode, when the model is loaded from memory, there should be an interface for setting the TRT calibration table data directory)

test=release/1.5

bc9fd1fc

05 Jul, 2019 · 1 commit
- checkerrpick Make fuse_all_reduce_op_pass support mix_precision test=develop test=release (#18490) · 3232618a
  Committed by gongweibao on Jul 05, 2019
  3232618a
28 Jun, 2019 · 1 commit

Update the Anakin interfaces for content-dnn and MLU, test=release/1.5 (#18028) · 924e53b7

Committed by 石晓伟 on Jun 28, 2019

* Update the Anakin interfaces for content-dnn and MLU (#17890)

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* modify the access level of anakin engine (#18015)

test=develop

* fix ci test cmake test=develop

924e53b7

27 Jun, 2019 · 1 commit

Cherry pick Fix Bug-prone code of PE (#18355) · b09ba8a7

Committed by chengduo on Jun 27, 2019

* update pe reduce config
test=release/1.5

*  drop the local_exe_scopes of the previous parallel_executor
test=release/1.5

b09ba8a7

26 Jun, 2019 · 1 commit
- update reduce config (#18335) · 401c03fc
  Committed by chengduo on Jun 26, 2019
```
test=release/1.5
```
  401c03fc
24 Jun, 2019 · 1 commit
- update alloc_continuous_space_for_grad_pass (#18288) · cac315f9
  Committed by chengduo on Jun 24, 2019
```
test=release/1.5
```
  cac315f9
19 Jun, 2019 · 3 commits
- [Cherry-pick] Update execution_strategy option default value (#18184) · 6e3c9dd7
  Committed by chengduo on Jun 19, 2019
```
* update execution_strategy option default value
test=release/1.5

* fix doc error
test=release/1.5
```
  6e3c9dd7
- [Cherry Pick] Not init nccl when rank is 1 (#18170) · 041bc72c
  Committed by chengduo on Jun 19, 2019
```
* remove nccl dep when the number of GPU is 1
test=develop

* use multi card run syncBN
test=release/1.5
```
  041bc72c
- add trainer_desc proto DEPS (#18019) (#18130) · 39002b08
  Committed by hutuxian on Jun 19, 2019
```
Add trainer_desc proto DEPS to solve CI random fail.
```
  39002b08
17 Jun, 2019 · 1 commit

Pipeline Concurrency (#17402) (#17971) · a0732cba

Committed by hutuxian on Jun 17, 2019

cherry-pick for (https://github.com/PaddlePaddle/Paddle/pull/17402)

Add Pipeline Concurrency Train Mode:
- Cpp: pipeline_trainer & section_worker
- Python: PipelineOptimizer
- Add a new data_feed type: PrivateInstantDataFeed
- Add a test demo of pipeline trainer and the test model is gnn
- win32 is not supported yet

a0732cba
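
The pipeline train mode described in this commit chains sections so that each one can work on a different batch at the same time. The sketch below illustrates only that general idea with Python threads and queues. It is not the code from `pipeline_trainer` or `section_worker`; the names `run_pipeline` and `_STOP` are hypothetical and exist only for this example.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel marking the end of the stream


def section_worker(fn, inbox, outbox):
    """Apply fn to each item taken from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next section
            return
        outbox.put(fn(item))


def run_pipeline(sections, batches):
    """Push batches through the sections, one thread per section."""
    queues = [queue.Queue() for _ in range(len(sections) + 1)]
    threads = [
        threading.Thread(target=section_worker,
                         args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(sections)
    ]
    for t in threads:
        t.start()
    for batch in batches:
        queues[0].put(batch)
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

Because every section has a single worker and the queues are FIFO, results come out in the order the batches went in. While section 2 handles batch N, section 1 can already start on batch N+1.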

15 Jun, 2019 · 1 commit
- [Cherry pick]Update CPU_NUM config (#18110) · be8c82cc
  Committed by chengduo on Jun 15, 2019
```
* update CPU_NUM config
test=develop
```
  be8c82cc
14 Jun, 2019 · 1 commit
- cherrpick fixncclid 18025 test=release/1.5 (#18093) · 751497db
  Committed by gongweibao on Jun 14, 2019
  751497db
13 Jun, 2019 · 1 commit
- Polish codes of old prs (#17981) · 73eacf3e
  Committed by gongweibao on Jun 13, 2019
  73eacf3e
10 Jun, 2019 · 2 commits
- Remove attribute in Allocator::Allocate (#17878) · 3ece61f7
  Committed by Zeng Jinle on Jun 10, 2019
```
* remove attribute in Allocator::Allocate, test=develop

* fix travis ci error, test=develop
```
  3ece61f7
- Fix FLAGS_fuse_parameter_memory_size unit from Bytes to MBytes. (#17924) · 972c54cd
  Committed by gongweibao on Jun 10, 2019
  972c54cd
08 Jun, 2019 · 1 commit
- Fix sync_batch_norm_op ncclallreduce error! (#17918) · dd4cd352
  Committed by gongweibao on Jun 08, 2019
  dd4cd352
06 Jun, 2019 · 2 commits
- Add backward and optimizer operator dependency pass. (#17746) · fbbdc9cc
  Committed by gongweibao on Jun 06, 2019
  fbbdc9cc
- Make ParallelExecutor support Windows GPU (#17787) · 453a49b1
  Committed by wopeizl on Jun 06, 2019
```
* fix the ParallelExecutor on Windows
test=develop
* restrict to use one GPU only under windows
```
  453a49b1
05 Jun, 2019 · 1 commit

[NGraph] some ngraph updates to enable bert (#17739) · a4c528a3

Committed by baojun on Jun 05, 2019

* delay infershape test=develop

* fall back subblock to paddle test=develop

* fix edge cases test=develop

* remove output duplicates test=develop

* handle reshape2_grad infershape test=develop

a4c528a3

04 Jun, 2019 · 2 commits
- fix DropLocalExeScopes (#17829) · 43752047
  Committed by chengduo on Jun 04, 2019
```
test=develop
```
  43752047
- enable mkldnn primitive reuse for platform reorder (#17826) · 50326563
  Committed by Leo Zhao on Jun 04, 2019
```
test=develop
```
  50326563
03 Jun, 2019 · 1 commit
- polish error doc (#17772) · 863c7516
  Committed by chengduo on Jun 03, 2019
```
test=develop
```
  863c7516
31 May, 2019 · 1 commit

fix prepare context redundant code problem, optimize executor by cach… (#17743) · d5239109

Committed by guru4elephant on May 31, 2019

* fix prepare context redundant code problem, optimize executor by caching create_variables
test=develop

* cache sub_scope, program, var when use_program_cache=True is set

* make fetch_list runnable with variables, add more unit tests for use_program_cache

d5239109
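
The commit message above says that the sub-scope, program, and variables are cached when `use_program_cache=True` is set. A minimal sketch of that caching pattern follows, assuming a toy executor where a "program" is just a Python callable. `CachingExecutor`, `_prepare`, and `prepare_calls` are hypothetical names for illustration and are not Paddle's actual executor internals.

```python
class CachingExecutor:
    """Toy executor that reuses prepared state across run() calls."""

    def __init__(self):
        self._cache = {}        # maps id(program) -> prepared context
        self.prepare_calls = 0  # counts how often the costly step ran

    def _prepare(self, program):
        # Stand-in for the costly setup (creating variables and a sub-scope).
        self.prepare_calls += 1
        # Holding a reference to the program keeps its id() stable and unique.
        return {"program": program, "scope": {}}

    def run(self, program, feed, use_program_cache=False):
        if use_program_cache:
            key = id(program)
            ctx = self._cache.get(key)
            if ctx is None:
                ctx = self._prepare(program)
                self._cache[key] = ctx
        else:
            ctx = self._prepare(program)  # rebuilt on every call
        return ctx["program"](feed)
```

With the flag set, repeated runs of the same program pay the preparation cost once. Without it, every call prepares again, which is the redundant work the commit title refers to.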

30 May, 2019 · 2 commits

Add Event in ScopeBuffer Executor (#17667) · 67c8dade
Committed by chengduo on May 30, 2019
```
* add event for fast executor and add threads for scopebuffer executor
test=develop
```
67c8dade

Enhance fused_elementwise_activation op and add python api in contrib.layers (#17236) · 8fd39f3e

Committed by Yiqun Liu on May 30, 2019

* Enhance fused_elementwise_activation op.
test=develop

* Move the api fused_elementwise_activation to contrib.
test=develop

* Add including files.
test=develop

* Add the support of sigmoid in fused_elementwise_activation op.

* Update API.spec.
test=develop

8fd39f3e

29 May, 2019 · 2 commits
- fix 2dconn test=develop (#17681) · 0d561ef4
  Committed by gongweibao on May 29, 2019
  0d561ef4
- Capi for a ngraph engine (#17037) · 5eb81fe5
  Committed by mozga-intel on May 28, 2019
  5eb81fe5
28 May, 2019 · 1 commit

[MKL-DNN] conv_transpose mkldnn bias pass (#17644) · 6d8075ec

Committed by Jacek Czaja on May 28, 2019

* - changes to graph detector

- Changes to pass

- Added ut for new pass

- use_pass

- Added pass to mkldnn passes

- fix to registration

- improved verbose messaging for conv bias passes

- Lint fixes

test=develop

* - Lint fixes

test=develop

6d8075ec

27 May, 2019 · 3 commits

add Concat quantization (#17448) · 96845d21

Committed by Sylwester Fraczek on May 27, 2019

* add Concat quantization
add unit test for quantizing concat
fix for wrong value when the input is not in map of calculated scales
add use_quantizer to concat_op.cc
add scale_algo rules for concat

test=develop

* missing fix for multiple inputs quantize-squash

* wojtuss review fix: adding comment

test=develop

96845d21

Add multi-ncclcomm and 2D ncclallreduce support. (#17263) · 65bbf950
Committed by gongweibao on May 27, 2019

65bbf950

Code clean of Allocator (#17602) · 4aa931dd

Committed by Zeng Jinle on May 27, 2019

* Revert "Revert "Fix allocator bug""

This reverts commit 174d0d0b.

* Revert "fix travis ci"

This reverts commit 5656fa9f.

test=develop

* add inlined_vector.h, test=develop

* add inlined_vector_test,test=develop

* clean code of allocator,test=develop

* delete zero_size_allocator.h,test=develop

* fix failed unittest,test=develop

4aa931dd

25 May, 2019 · 1 commit

TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc

Committed by Zhaolong Xing on May 25, 2019

* fluid int8 train and trt int8 predict align.
trt int8 predict init
op converter

* 2. align fluid int8 train and trt int8 inference.
enhance quant dequant fuse pass
enhance op converter, trt engine, trt engine op, trt subgraph pass.

* 3. add delete_quant_dequant_pass for trt

test=develop

* 4. add the missing file
test=develop

* 5. modified the C++ interface but forgot to update the pybind code
fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
test=develop

61221ebc

24 May, 2019 · 5 commits

[MKL-DNN] Add Fully Connected Op for inference only(#15226) · 0c39b97b

Committed by Michał Gallus on May 24, 2019

* fuse mul and elementwise add to fc

* Reimplement the FC forward operator

* Fix FC MKLDNN integration by transposing weights

* Add FC MKLDNN Pass

test=develop

* FC MKLDNN Pass: change memcpy to std::copy

* Fix MKLDNN FC handling of mismatch input and weights dims

* Lower tolerance for MKL-DNN in resnet50 test

test=develop

* Adjust FC to support MKLDNN Op placement

test=develop

* Adjust Placement Op to set use_mkldnn attribute for graph

test=develop

* MKLDNN FC: fix weights format so that gemm version is called

test=develop

* FC MKLDNN: Remove tolerance decrease from tester_helper

* FC MKL-DNN: Refactor the code, change input reorder to weight reorder

* MKL-DNN FC: Introduce operator caching

test=develop

* FC MKL-DNN: Fix the tensor type in ExpectedKernelType

test=develop

* FC MKL-DNN: fix style changes

test=develop

* FC MKL-DNN: fallback to native on non-supported dim sizes

test=develop

* FC MKLDNN: fix CMake paths

test=develop

* FC MKLDNN: Refine placement pass graph mkldnn attribute

test=develop

* Fix Transpiler error for fuse_conv_eltwise

test=develop

* Fix missing STL includes in files

test=develop

* FC MKL-DNN: Enable new output size computation

Also, refine pass to comply with newest interface.
test=develop

* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled

* FC MKL-DNN: Allow Weights to use oi or io format

* FC MKL-DNN: Adjust UT to work with correct dims

test=develop

* Enable MKL DEBUG for resnet50 analyzer

test=develop

* FC MKL-DNN: Improve Hashing function

test=develop

* FC MKL-DNN: Fix shape for fc weights in transpiler

* FC MKL-DNN: Update input pointer in re-used fc primitive

* Add log for not handling fc fuse for unsupported dims

test=develop

* FC MKL-DNN: Move transpose from pass to Op Kernel

test=develop

* FC MKL-DNN: Disable transpose in unit test

test=develop

* FC MKL-DNN: Remove fc_mkldnn_pass from default list

* Correct Flag for fake data analyzer tests

test=develop

* FC MKL-DNN: Add comment about fc mkldnn pass disablement

test=develop

* FC MKL-DNN: Disable fc in int8 tests

test=develop

0c39b97b

add __str__ method for tensor and lodtensor to support print test=dev… (#17588) · 6724a652
Committed by wopeizl on May 24, 2019
```
* add __str__ method for tensor and lodtensor to support print test=develop
```
6724a652
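
The commit above adds a `__str__` method so that tensors can be passed to `print`. The following is a small sketch of that mechanism in plain Python. The `Tensor` class here is a hypothetical stand-in with made-up fields, not Paddle's pybind-backed tensor type, and the output format is an assumption chosen for the example.

```python
class Tensor:
    """Hypothetical stand-in for a tensor type, for illustration only."""

    def __init__(self, shape, data):
        self.shape = tuple(shape)
        self.data = list(data)

    def __str__(self):
        # print(obj) calls str(obj), which dispatches to this method.
        return f"Tensor(shape={self.shape}, data={self.data})"


t = Tensor([2], [1.0, 2.0])
print(t)
```

Without `__str__` (or `__repr__`), `print` falls back to the default object representation, which shows only the type name and a memory address.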

Conv concat relu quantization (#17466) · 5b2a3c4b

Committed by Sylwester Fraczek on May 24, 2019

* add conv_concat_relu fuse

test=develop

* add test code

test=develop

* added missing include with unordered_map

test=develop

* review fixes for wojtuss

test=develop

* remove 'should (not) be fused' comment statements

one of them was invalid anyway

test=develop

5b2a3c4b

fix quantize_squash_pass segfault when no tensor linked to Bias (#17292) · bccb0ba4

Committed by Sylwester Fraczek on May 24, 2019

* fix quantize_squash_pass segfault when there is no tensor linked to Bias input

test=develop

* add googlenet test

test=develop

* fix concat CreateKey not using input format

test=develop

bccb0ba4

polish_executor_and_add_ctx_cache (#17536) · 7f8bc49d
Committed by guru4elephant on May 24, 2019
```
* polish_executor_and_add_ctx_cache
```
7f8bc49d
