Commits · c5548178b0a7dc428d545a532bf2bfcc74ffde3d · BaiXuePrincess / Paddle

03 Sep, 2019 3 commits

Add a pass to enable the use of cudnn (#19346) · c5548178

Committed by Yiqun Liu on Sep 03, 2019

* Add an interface to enable cudnn for inference.

* Add cudnn_placement_pass.
test=develop

* Set the default value of cudnn_enabled_op_types to null.
test=develop

* Write a common base class, placement_pass_base, to refine the code.
test=develop

* Call EnableCUDNN in unittest.
test=develop

* Refine cudnn_placement_pass tester.

* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop

* Add the check of op kernels.
test=develop

c5548178

using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568) · e94b26da
Committed by Adam on Sep 03, 2019
```
* using MKLDNNMemoryFormat = mkldnn::memory::format changes
test=develop

* PADDLE_ENFORCE update
test=develop
```
e94b26da
Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506) · abaf87be
Committed by gongweibao on Sep 03, 2019
```
Change backward_guard to optimize_guard to maximize the allreduce overlap
```
abaf87be

02 Sep, 2019 2 commits
- fix fast pe to run highest priority ops first, test=develop (#19575) · 19474019
  Committed by Zeng Jinle on Sep 02, 2019
  
  19474019
- fix seg fault of share lod, test=develop (#19573) · 0af85497
  Committed by Zeng Jinle on Sep 02, 2019
  
  0af85497
31 Aug, 2019 1 commit

Paddlebox Framework (#18982) · c756b5d2

Committed by hutuxian on Aug 31, 2019

* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op, for now this op is not exposed to users.
* Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
* Add 'BoxPSDataset' in python code.
* Add a compile option WITH_BOX_PS and a macro PADDLE_WITH_BOX_PS.
* Add UT.
* For more details, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982

c756b5d2

30 Aug, 2019 5 commits

[MKL-DNN] Fix to face model on AVX512 platforms (#19282) · ecd9f330

Committed by Jacek Czaja on Aug 30, 2019

- Refactor step 1

- Compilation fix

- Yet another compilation fix

- Even more compilation fix

- Lint fixes

test=develop

- Removed deprecated PADDLE_ENFORCE occurrence

test=develop

- Candidate fix to BN forward

- Lint fixes

test=develop

- Refactoring in data_layout_transform

- compilation fix

- Another compilation fix

- Step further into darkness

- Yet another compilation fix

- Yet another compilation fix

- missing header

- compilation fix

- Added MKLDNN -> Paddle conversion in fetch op

test=develop

- Compilation fix

test=develop

- Lint

test=develop

- Mul fix

- Fix to MKLDNN MUL op and Elementwise MUL UT

test=develop

- Workaround for the different representation of weights with groups in
  Paddle vs MKL-DNN.

test=develop

- Candidate fix for 5D convolution with groups

- Refactor of fix for conv3d and conv2d in fetch op

test=develop

- Compilation fix

- Still same compilation fix

- Compilation fix

- Compilation fix

- Reverted refactoring of fixes

- Adapted test_conv2d_int8_mkldnn so it expects data in NCHW format,
  not NHWC

test=develop

- minor fix in UT

test=develop

- Lint fixes

test=develop

ecd9f330

add thread scope stat accurate metrics test=develop (#19480) · 10ca3f96

Committed by yaoxuefeng on Aug 30, 2019

* add thread scope stat accurate metrics test=develop

* fix style

* fix style

* fix style

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix conflict

* fix style

* fix style test=develop

* fix error test=develop

* fix error test=develop

10ca3f96

remove unused assert.h (#19529) · 02270b3e
Committed by Tao Luo on Aug 30, 2019
```
test=develop
```
02270b3e
Support feed single persistable variable to PE (#19417) · e340df01
Committed by chengduo on Aug 30, 2019
```
* update executor feed
```
e340df01

Add a pass to replace dropout_op with scale_op when is_test is true (#19297) · fcec365d

Committed by Yiqun Liu on Aug 30, 2019

* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop

* Delete dropout_op directly when upscale_in_train is true.
test=develop

* Improve the debug string, adding the print of op_desc information.

* Fix the case when dropout's input x is reused as the next op's output.

* Add the pass to inference.
test=develop

* Change the log level.
test=develop

* Add unittest for inplace case.

* Add comment to explain the pass.

* Apply the pass for CPU inference.
test=develop

* Fix the typo.
test=develop

* Add the check of AttrType.
test=develop

fcec365d
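
The replacement described in the commit above rests on the fact that dropout is deterministic at inference time. The following is a minimal sketch in plain Python, not the actual pass: it assumes the two dropout modes named `downgrade_in_infer` and `upscale_in_train`, and the helpers `dropout_inference`, `simplified_inference_op` and `apply_op` are hypothetical names introduced only for illustration.

```python
def dropout_inference(x, dropout_prob, implementation="downgrade_in_infer"):
    """Reference behaviour of dropout when is_test is true (assumed semantics)."""
    if implementation == "upscale_in_train":
        # Scaling already happened during training, so inference is the identity.
        return list(x)
    # downgrade_in_infer: scale activations by the keep probability.
    return [v * (1.0 - dropout_prob) for v in x]


def simplified_inference_op(dropout_prob, implementation="downgrade_in_infer"):
    """Hypothetical helper: pick the cheaper op that can stand in for dropout.

    Returns None when the op can be deleted outright, otherwise a
    ("scale", factor) tuple describing the replacement.
    """
    if implementation == "upscale_in_train":
        return None
    return ("scale", 1.0 - dropout_prob)


def apply_op(op, x):
    """Apply the replacement op produced by simplified_inference_op."""
    if op is None:
        return list(x)
    _, factor = op
    return [v * factor for v in x]


x = [1.0, 2.0, 4.0]
for mode in ("downgrade_in_infer", "upscale_in_train"):
    replaced = apply_op(simplified_inference_op(0.25, mode), x)
    # The simplified graph must produce the same values as dropout itself.
    assert replaced == dropout_inference(x, 0.25, mode)
```

Under these assumptions a scale op, or no op at all, gives the same result as dropout at inference, which is why the pass can remove the dropout op from the graph.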

29 Aug, 2019 3 commits

support debug each output of each ins (#19004) · 1fe468d3

Committed by Thunderbrook on Aug 29, 2019

* dump slot

* test

* proto

* dump slot

* test

* proto

* code style

* code style

* code style

* style

* add delete after unseen days

* add unseen days

* code style

* conflict solve
test=develop

* add clear model

* code style
test=develop

* code style
test=develop

* support debug tensor of each ins
test=develop

* support debug tensor of each ins
test=develop

* learning rate

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style

* code style
test=develop

* code style
test=develop

* unittest

* style

* style

* multi phase

* add channel

* code style

* style

* style

* unittest

* style

* define

* define
test=develop

* style
test=develop

* rm define
test=develop

* linux

* linux
test=develop

* style
test=develop

* output format
test=develop

* windows ci
test=develop

1fe468d3

refine inplace inference registry, test=develop (#19032) · 5c8f210c
Committed by Zeng Jinle on Aug 29, 2019

5c8f210c

Increase num_iteration_per_drop_scope (#19075) · b6d1d890

Committed by chengduo on Aug 29, 2019

* increase num_iteration_per_drop_scope
test=develop

* Fix bug of while_op
test=develop

* fix bug of whileOp
test=develop

b6d1d890

28 Aug, 2019 1 commit

Fix the correctness of async mode at distributed training (#18863) · 65c73684

Committed by tangwei12 on Aug 28, 2019

* fix correctness of the communicator

* fix a bug in send thread when sending var context is empty, test=develop

* add lookup_table_prefetch_op and prefetch optimize, test=develop

* remove GPU support for remote prefetch

* word2vec force with CPU, test=develop

* test dist remote lookup table force with CPU, test=develop

65c73684

27 Aug, 2019 1 commit
- Add conv dequant squash for int8 (#18905) · 2e3ec66b
  Committed by joanna.wozna.intel on Aug 27, 2019
  
  2e3ec66b
23 Aug, 2019 1 commit
- remove unused conv_elementwise_add2_act_fuse.cc (#19344) · c82280e4
  Committed by Tao Luo on Aug 23, 2019
```
test=develop
```
  c82280e4
22 Aug, 2019 2 commits

Enhance OpTest to check the consistency of operators when using and not using inplace (#19101) · a9d5fc51

Committed by Leo Chen on Aug 22, 2019

* add pybind interface to get all inplace ops, test=develop

* enhance OpTest to check the consistency of operators when using and not using inplace, test=develop

* handle corner cases in op_test, test=develop

* support outputs without tensor holder_, like XShape in reshape_op, test=develop

* fix bug, some op has GradOpMaker, but actually no grad_op in OpInfoMap, test=develop

* use reshape_grad instead of reshape in FlattenGradOp, test=develop

* fix error debug dims info for variables like XShape, test=develop

* change computational order in sum_op to relieve computation difference using inplace, test=develop

* add inplace_atol to check group_norm, and skip inplace_grad for mkldnn, test=develop

* follow sneaxiy's comments, test=develop

* remove unused DefaultGradOpDescMaker in mkldnn op, test=develop

a9d5fc51

strengthen the error message of tensor's mutable_data (#19303) · e3c68bde
Committed by Tao Luo on Aug 22, 2019
```
* strengthen the error message of tensor's mutable_data

test=develop

* update error message

test=develop
```
e3c68bde

21 Aug, 2019 1 commit

Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237) · 97d1db18

Committed by Adam on Aug 21, 2019

* Add generalized Conv+Activation MKLDNN fuse pass creation Part2
test=develop

* Fix undefined behaviour of GetAttrIfExists<>
test=develop

97d1db18

19 Aug, 2019 5 commits
- Fix BUG: Mask RCNN inference diff when using AnalysisPredictor. (#19213) · 76c95af0
  Committed by Zhaolong Xing on Aug 19, 2019
```
* fix mask rcnn bug:
1. affine channel fuse (diff)
2. condition block op (memory leak)
3. merge lod tensor op (diff)
4. memory optim (diff)
test=develop

* fix ci about PADDLE_ENFORCE
fix merge lod infer op ut
test=develop
```
  76c95af0
- merge develop to solve conflict, also fix API doc, test=develop (#18823) · 5b6673c4
  Committed by Zeng Jinle on Aug 19, 2019
  
  5b6673c4
- fix compilation issue in windows vs2017 (#19183) · 50582071
  Committed by liuwei1031 on Aug 19, 2019
```
* fix compilation issue in windows vs2017, test=develop

* fix gtest lib not found issue, test=develop
```
  50582071
- remove the warning for reminding user to avoid using the OriginProgram method,... · 5368b365
  Committed by juncaipeng on Aug 19, 2019
```
remove the warning for reminding user to avoid using the OriginProgram method, test=develop (#19244)

This log information may annoy users who don't need to care about it.
```
  5368b365
- Fix REGISTER_OP_WITHOUT_GRADIENT (#19251) · 8a89ca94
  Committed by chengduo on Aug 19, 2019
```
* fix REGISTER_OP_WITHOUT_GRADIENT
test=develop
```
  8a89ca94
16 Aug, 2019 1 commit
- move_flags_to_unified_files_for_management, test=develop (#19224) · 708bd979
  Committed by Zeng Jinle on Aug 16, 2019
  
  708bd979
15 Aug, 2019 2 commits

Add generalized Conv+Activation MKLDNN fuse pass creation (#19072) · b837689e
Committed by Adam on Aug 15, 2019
```
test=develop
```
b837689e

Enhance the error message when GradOpMaker is null. (#19070) · 77572b70

Committed by Yiqun Liu on Aug 15, 2019

* Enhance the error message when GradOpMaker is null.
test=develop

* Call Proto() instead of directly using proto_ pointer.
test=develop

* Roll back to using proto_ directly, because some special grad ops, such as some double grad ops, do not have a proto.
test=develop

77572b70

14 Aug, 2019 2 commits
- Use CUDAPinnedPlace in buffered_reader (#19112) · c70a97f4
  Committed by chengduo on Aug 14, 2019
```
Use CUDAPinnedPlace in buffered_reader
```
  c70a97f4
- add get_last_save_xbox_base/get_last_save_xbox (#19122) · b104ea06
  Committed by jiaqi on Aug 14, 2019
```
* add get_last_save_xbox_base/get_last_save_xbox
* fix fleet_util bug of load paddle model
* add doc string in fleet api
```
  b104ea06
13 Aug, 2019 1 commit

Add conv requantize squash (#18754) · 492a00f5

Committed by joanna.wozna.intel on Aug 13, 2019

* Add requantize squash

test=develop

* Add more precise tests
test=develop

* Rename and refactor tester

test=develop

492a00f5

12 Aug, 2019 3 commits
- Replace Relu with bounded Relu in MobileNetV2 quantization (#18988) · bce72c7f
  Committed by joanna.wozna.intel on Aug 12, 2019
```
test=develop
```
  bce72c7f
- open fuse_all_optimizer_ops (#19087) · e044e842
  Committed by chengduo on Aug 12, 2019
```
test=develop
```
  e044e842
- Polish fleet API to support cuda collective mode and nccl2 mode. (#18966) · 29d87812
  Committed by gongweibao on Aug 12, 2019
```
Polish fleet API to support cuda collective mode and nccl2 mode
```
  29d87812
11 Aug, 2019 1 commit

add save cache model api in fleet & add slots shuffle in dataset module & add... · 9150cf50

Committed by yaoxuefeng on Aug 11, 2019

add save cache model api in fleet & add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871)

* add ctr related metric layer test=develop

* add save cache and slots shuffle test=develop

* add save cache and slots shuffle test=develop

* fix error

* fix error

* fix style for ci

* fix for comments

* change SlotsShuffle input to std::string for generality

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* change non-const reference to pointer

* fix style

* fix style

* fix style test=develop

* fix style  test=develop

* add return ins num in ctr metric op

* change dtype to float in metric_op.py

* fix error test=develop

* fix style test=develop

* fix API spec

* fix API spec

* fix API spec test=develop

* add UT test=develop

9150cf50

10 Aug, 2019 1 commit

Datafeed support reading to cuda place directly. (#19071) · 5a80cc84

Committed by hutuxian on Aug 10, 2019

* add a place field in DataFeed to denote which place it will feed data to.
* abstract the copy process in CopyToFeedTensor function
* add UT for float32 type and for CUDAPlace

5a80cc84

09 Aug, 2019 2 commits

Enhance fuse optimization op pass (#19010) · 17d62ab2
Committed by chengduo on Aug 09, 2019
```
* Enhance fuse optimization op pass
test=develop
```
17d62ab2

Add call stack info during compile time (#19067) · 21440b4d

Committed by chengduo on Aug 09, 2019

* Add call stack info during runtime and compile time
test=develop

* Rename operator_call_stack
test=develop

* Add unit test
test=develop

* follow comment
test=develop

21440b4d

08 Aug, 2019 2 commits

fix QueueDataset queue size (#19016) · fc038da7

Committed by jiaqi on Aug 08, 2019

* fix QueueDataset queue size: set queue size = batch size * 100 to avoid too many instances in the channel when training is much slower than reading data.

fc038da7
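
The sizing rule in the commit above amounts to a bounded producer/consumer channel. The sketch below uses Python's standard `queue.Queue` to show the idea only; it is not the QueueDataset implementation, and `make_instance_queue` and its `factor` parameter are hypothetical names.

```python
import queue


def make_instance_queue(batch_size, factor=100):
    """Create a bounded queue holding at most batch_size * factor instances.

    A bounded queue makes the reader wait when the trainer falls behind,
    so memory held by unread instances cannot grow without limit.
    """
    return queue.Queue(maxsize=batch_size * factor)


q = make_instance_queue(batch_size=2)

# Fill the queue up to its capacity of 2 * 100 instances.
for i in range(q.maxsize):
    q.put_nowait(i)

# One more instance is rejected instead of being buffered.
try:
    q.put_nowait("overflow")
    overflowed = False
except queue.Full:
    overflowed = True
```

A blocking `put` in place of `put_nowait` would make the reader thread pause until the trainer consumes an instance, which is the backpressure the commit message describes.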

Fix memory overwriting of tensors returned by executor (#19030) · 8f537354

Committed by Leo Chen on Aug 08, 2019

* fix memory overlapping of fetch var (return of executor.run), test=develop

* fix wrong usage of ParallelExecutor in op_test, test=develop

* remove useless parameter and simplify code

* avoid untimely destruction of tensors, test=develop

* add testcase independent of OpTest, test=develop

8f537354

BaiXuePrincess / Paddle is in sync with the upstream project it was forked from
