Commits · 37f76407b0263593d086e1970f3ddb893375e319 · Crayon鑫 / Paddle

24 Sep, 2019 · 8 commits

Add float16 support to `sync_batch_norm_op` (#19681) · ebff68fa

Committed by Yang Zhang on Sep 24, 2019

* Add float16 support to `sync_batch_norm_op`

test=develop

* Add test for sync_bn with FP16 input

test=develop

ebff68fa

Remove constraint that last dimension is forced to be 1 by adding lookup_table_v2 (#19735) · 039b9710

Committed by Aurelius84 on Sep 24, 2019

* Remove constraint that last dimension is forced to be 1 by adding
lookup_table_v2 test=develop

* modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop

* Revert "modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop"

This reverts commit 8a960bfc61e51aa27c3c529df8fb90b93ebd19f9.

* move api into fluid.embedding test=develop

* fix example code test=develop

* move one_hot into fluid.one_hot

* modify api.spec test=develop

* fix loss shape test=develop

039b9710
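
The shape change described in this entry can be illustrated with a minimal pure-Python sketch. It is not Paddle's kernel: `lookup_v1` and `lookup_v2` are hypothetical names, and the shape conventions (old op expects ids of shape [N, 1], new op accepts ids of any shape and appends the embedding dimension) are an assumption based on the commit title.

```python
def lookup_v1(table, ids):
    """Old convention (assumed): each id is wrapped in a trailing dimension of size 1."""
    for wrapped in ids:
        if len(wrapped) != 1:
            raise ValueError("last dimension of ids must be 1")
    return [list(table[wrapped[0]]) for wrapped in ids]


def lookup_v2(table, ids):
    """Relaxed convention (assumed): ids may have any nesting; rows are appended."""
    if isinstance(ids, int):
        return list(table[ids])
    return [lookup_v2(table, i) for i in ids]


table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]  # 3 rows, embedding size 2

out_v1 = lookup_v1(table, [[2], [0]])  # ids of shape [2, 1]
out_v2 = lookup_v2(table, [2, 0])      # ids of shape [2]
```

Both calls fetch the same rows; the second simply no longer needs the dummy trailing dimension.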

[PaddleSlim] Enhance compressor API in PaddleSlim (#19894) · bdb3e376

Committed by whs on Sep 24, 2019

1. Support a custom eval function instead of an eval program.
2. Fix loading checkpoint in quantization strategy.
3. Support saving eval model when saving a checkpoint.
4. Fix decoder of loading context in PaddleSlim.
5. Fix restoring from the checkpoint of uniform prune strategy.
6. Support saving eval model and infer model during training.
7. Add unit tests for saving the eval model, saving the infer model, and restoring uniform pruning from the checkpoint.
8. Fix pruning of depthwise_conv_grad op by updating the groups.

bdb3e376

support change shuffle and train thread num (#19841) · cedc0477

Committed by xujiaqi01 on Sep 24, 2019

* support change shuffle thread num
* support change train thread num
* fix receive shuffle data of each channel
* data norm stop gradient
* add check thread_tensor type and root_tensor type when merge metric
* remove sleep in shuffle, add config
* add config of pslib client to client communication
* fix xbox str
* add data norm op testcase
* add flush in trainer finalize

cedc0477

add elementwise mod support float/double. test=develop (#19570) · 14625ffe
Committed by Kaipeng Deng on Sep 24, 2019

14625ffe
give warnings when saving a model without any parameters (#19931) · 790d5226
Committed by Ghost Under Moon on Sep 24, 2019
```
* give warnings when saving a model without any parameters test=develop

* delete one line comment test=develop
```
790d5226
Add py_reader combination unittest (#19923) · f254b477
Committed by Zeng Jinle on Sep 24, 2019
```
* add py_reader combination unittest,test=develop

* follow huihuang's comments, test=develop
```
f254b477

Make OpTest check grad inplace even if forward has no inplace (#19847) · 57606205

Committed by Leo Chen on Sep 24, 2019

* make OpTest check grad inplace even if forward has no inplace, test=develop

* do not run PE when enable_inplace is False, test=develop

* add conv3d cuda kernel for float16 type, test=develop

* refactor OpTest for inplace, test=develop

* add comments, test=develop

57606205

23 Sep, 2019 · 10 commits

add fake_quant_dequant_op for average pool2d, test=develop (#19880) · b0ceed6f
Committed by juncaipeng on Sep 23, 2019
```
* add fake_quant_dequant_op for average pool2d
* add test
```
b0ceed6f
resize Ops support data_layout:channel_last, test=develop, test=document_preview (#19914) · cb8f3c03
Committed by Zhang Ting on Sep 23, 2019

cb8f3c03

Forward recompute3 (#19913) · 9901f696

Committed by mapingshuo on Sep 23, 2019

* add recompute based checkpoints methods for large batch training
test=develop

* add append_backward_with_forward_recomputation
test=develop

* refine optimizer
test=develop

* update backward and optimizer
test=develop

* make Variable usable
test=develop

* add recompute code

* refine optimizer
test=develop

* refine addup _append_backward_ops_with_checkpoints_
1) for recompute part, just cache the grad_op_desc without appending to block
2) before appending grad_op_desc to backward part, addup_repetitive_vars, remove unused branch
test=develop

* make method private

* add recompute strategy into DistributedStrategy
test=develop

* checkpoint version3
test=develop

* remove some print information
test=develop

* remove unused sumop
test=develop

* try to fix recompute with graph building modules

* add input names to vars should be held

* add memory debug tool

* backup backward

* Fix bugs

* add backward desc for op not in any segments

* add exception info for sub_block

test=develop

* modify code style

test=develop

* modify code style

test=develop

* remove print functions

test=develop

* add API spec

test=develop
test=document_preview

* make Recompute a child class of Optimizer

test=develop
test=document_preview

* add API spec

test=develop
test=document_preview

* modify API spec

test=develop
test=document_preview

* add document for Recompute

test=develop
test=document_preview

* change API doc of Recompute

test=develop
test=document_preview

* code cleaning

test=develop
test=document_preview

* modify API spec

* fix bugs when segments hold no element

* add testcase for Recompute Optimizer

test=develop
test=document_preview

* add test for apply_gradient, and code cleaning

test=develop
test=document_preview

* add test case for load function

* enable CI

test=develop
test=document

* add test case

test=develop
test=document_preview

* add sample code for 4 functions of recompute optimizer

test=develop
test=document_preview

9901f696
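
The idea behind recompute-based checkpointing in this entry can be sketched in plain Python. This is a toy illustration under stated assumptions, not the code added by the commit: the layer list, `grad_full`, and `grad_recompute` are hypothetical, and the "network" is just a chain of scalar functions with known derivatives.

```python
import math

# Each layer is a pair: (forward function, derivative with respect to its input).
LAYERS = [
    (math.sin, math.cos),
    (lambda x: x * x, lambda x: 2.0 * x),
    (math.tanh, lambda x: 1.0 - math.tanh(x) ** 2),
    (lambda x: 3.0 * x + 1.0, lambda x: 3.0),
]


def grad_full(x):
    """Baseline: keep every layer input in memory, then backpropagate."""
    inputs = []
    for f, _ in LAYERS:
        inputs.append(x)
        x = f(x)
    grad = 1.0
    for (_, df), inp in zip(reversed(LAYERS), reversed(inputs)):
        grad *= df(inp)
    return grad


def grad_recompute(x, segment=2):
    """Keep only the input of each segment; rebuild the rest during backward."""
    checkpoints = {}
    for i, (f, _) in enumerate(LAYERS):
        if i % segment == 0:
            checkpoints[i] = x  # the only activations that stay in memory
        x = f(x)
    grad = 1.0
    for start in sorted(checkpoints, reverse=True):
        seg = LAYERS[start:start + segment]
        # Re-run the segment forward from its checkpoint to recover its inputs.
        inputs, h = [], checkpoints[start]
        for f, _ in seg:
            inputs.append(h)
            h = f(h)
        for (_, df), inp in zip(reversed(seg), reversed(inputs)):
            grad *= df(inp)
    return grad
```

Both functions return the same gradient; the second trades an extra forward pass per segment for holding fewer activations at once, which is what makes larger batches fit.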

Delete local execution scopes (#19749) · d7251a8e
Committed by chengduo on Sep 23, 2019
```
* Add RecordHistoryLocalExecScopes
test=develop
```
d7251a8e
warn when the user saves an inference model which contains an auc op test=develop (#19838) · 4836ee68
Committed by Ghost Under Moon on Sep 23, 2019

4836ee68
optimize the error information when the input for while op has a wron… (#19872) · e606b175
Committed by wopeizl on Sep 23, 2019
```
* optimize the error information when the input for while op has a wrong shape test=develop
```
e606b175
add mse_loss (#19759) · d31c92a2
Committed by ruri on Sep 23, 2019
```
* add mse_loss op
```
d31c92a2
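
For reference, mean squared error is the average of the squared differences between predictions and labels. The sketch below is a plain-Python illustration of that formula, assuming the new op computes the usual mean reduction; `mse_loss` here is a stand-in function, not the Paddle API.

```python
def mse_loss(pred, label):
    """Mean of squared differences over two equal-length sequences."""
    if len(pred) != len(label):
        raise ValueError("pred and label must have the same length")
    return sum((p - l) ** 2 for p, l in zip(pred, label)) / len(pred)


loss = mse_loss([0.5, 1.5], [1.0, 1.0])  # (0.25 + 0.25) / 2 = 0.25
```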

move tree_conv to fluid.contrib.layers (#19918) · a4919d36

Committed by Tao Luo on Sep 23, 2019

* move tree_conv to fluid.contrib.layers

test=develop

* update API.spec for tree_conv

test=develop

* update tree_conv api to increase unit coverage

test=develop

a4919d36

Unify DataLoader APIs (#19305) · 0436efd6

Committed by Zeng Jinle on Sep 23, 2019

* unify DataLoader APIs, test=develop

* integrate iterable CPU Dataset, test=develop
add GPU dataset support, test=develop

* add unittests for dataset, test=develop

* add more docs to dataloader apis, test=develop, test=document_preview

* refine doc, test=develop

* refine doc again, test=develop

* increase coverage, test=develop

0436efd6

paddle cloud role maker fix (#19646) · 278dd003
Committed by tangwei12 on Sep 23, 2019
```
* optimize cloud rolemaker, test=develop
```
278dd003

22 Sep, 2019 · 1 commit

add instance norm (#19500) · 4155e625
Committed by lvmengsi on Sep 22, 2019
```
* add instance norm op
```
4155e625

21 Sep, 2019 · 4 commits

Add support for other axes in MKLDNN softmax op (#19907) · cb65439d
Committed by Adam on Sep 21, 2019
```
* Initial, functional commit

* Clean commit related files
test=develop
```
cb65439d
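
What "other axes" means for softmax can be shown with a small pure-Python sketch on a 2-D list. It is only an illustration of the math, not the MKLDNN kernel; `softmax_1d` and `softmax_2d` are hypothetical helper names.

```python
import math


def softmax_1d(values):
    """Numerically stable softmax of a flat sequence."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_2d(matrix, axis=-1):
    """Softmax along rows (axis 1 or -1) or along columns (axis 0 or -2)."""
    if axis in (1, -1):
        return [softmax_1d(row) for row in matrix]
    if axis in (0, -2):
        cols = [softmax_1d(list(col)) for col in zip(*matrix)]
        return [list(row) for row in zip(*cols)]  # transpose back
    raise ValueError("axis out of range for a 2-D input")


x = [[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]]
by_row = softmax_2d(x, axis=-1)  # each row sums to 1
by_col = softmax_2d(x, axis=0)   # each column sums to 1
```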

Feature/auto prune in dygraph (#19757) · 45425411

Committed by Jiabin Yang on Sep 21, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix compile error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* test=develop, refactor name to make it easier to understand

* test=develop, refactor name to make it easier to understand

* test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ

* test=develop, fix ut failed on parallel se-resnext

* test=develop, change one more PADDLE_ENFORCE

* support auto prune in dygraph mode

* test=develop, support auto prune

* test=develop, merge develop conflict

* test=develop, fix test_layer and test_tracer ut

* test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs

45425411

move match_matrix, var_conv2d, et al. APIs into fluid.contrib test=develop (#19859) · 418a0967
Committed by Aurelius84 on Sep 21, 2019

418a0967
add py_reader may be deprecated msg, test=develop (#19891) · e2372750
Committed by Zeng Jinle on Sep 21, 2019

e2372750

20 Sep, 2019 · 8 commits

fix readers bug, test=develop (#19868) · cee0079a
Committed by Zeng Jinle on Sep 20, 2019

cee0079a
support 2-level lod of input in sequence_pool (#19839) · fcf53e55
Committed by Aurelius84 on Sep 20, 2019
```
* support 2-level lod of input in sequence_pool test=develop

* fix lod level bug in .cu test=develop
```
fcf53e55
refine optimizer function (#19886) · ae31faaa
Committed by chengduo on Sep 20, 2019
```
test=developt
```
ae31faaa
group_norm support data_layout:NHWC, test=develop, test=document_preview (#19614) · 93364b45
Committed by Zhang Ting on Sep 20, 2019
```
1. group_norm support data_layout=NHWC
2. modified doc of group_norm
```
93364b45

modified interpolate op to support tensor attribute, test=develop, test=document_preview (#19287) · 439d95e1

Committed by Zhang Ting on Sep 20, 2019

Modified interpolate_op to support tensor attributes:

1. The parameter out_shape of image_resize, resize_nearest, resize_bilinear and resize_trilinear can be a list or a 1-D tensor variable. If it is a list, each element can be an integer or a tensor variable with shape [1].

2. The parameter scale of the above Ops can be a 1-D tensor variable.

Also modified the documentation of image_resize, resize_nearest, resize_bilinear and resize_trilinear, and added some code examples.

439d95e1
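
The two ways of specifying the output size can be sketched with a toy nearest-neighbour resize in plain Python. This is an illustration only: `resize_nearest` below is a hypothetical stand-in, ordinary Python numbers play the role of the shape-[1] tensors, the index rounding (floor of the scaled coordinate) is an assumed convention, and giving out_shape priority over scale is likewise an assumption.

```python
def resize_nearest(image, out_shape=None, scale=None):
    """Nearest-neighbour resize of a 2-D list, sized by out_shape or by scale."""
    in_h, in_w = len(image), len(image[0])
    if out_shape is not None:        # assumed to take priority over scale
        out_h, out_w = out_shape
    elif scale is not None:
        out_h, out_w = int(in_h * scale), int(in_w * scale)
    else:
        raise ValueError("one of out_shape or scale is required")
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


image = [[1, 2], [3, 4]]
target_h = len(image) * 2                      # a size only known at run time
by_shape = resize_nearest(image, out_shape=[target_h, 4])
by_scale = resize_nearest(image, scale=2.0)
```

Because the size is read when the function runs, it can differ from one call to the next, which is the point of allowing tensor-valued attributes.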

add crop_tensor_op, test=develop, test=document_preview (#19314) · b3888941

Committed by Zhang Ting on Sep 20, 2019

Add the crop_tensor op. The main differences from crop are:

1. If the argument shape is a list, each element is an integer or a tensor variable with shape [1]. This suits cases where the shape may change on each iteration.

2. If the argument shape is a variable, its rank must be 1. In the crop op, the rank of shape must be the same as that of x.

offsets can be a list in which each element is an integer or a tensor variable with shape [1].

b3888941
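
A minimal sketch of cropping with a per-iteration shape, in plain Python on a 2-D list. `crop_2d` is a hypothetical helper, not the Paddle op, and plain integers stand in for the shape-[1] tensor elements described above.

```python
def crop_2d(x, shape, offsets=None):
    """Return the shape[0] x shape[1] window of x starting at offsets."""
    top, left = offsets if offsets is not None else (0, 0)
    height, width = shape
    if top + height > len(x) or left + width > len(x[0]):
        raise ValueError("crop window exceeds the input bounds")
    return [row[left:left + width] for row in x[top:top + height]]


x = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

window = crop_2d(x, shape=[2, 2], offsets=[1, 1])

# The shape is an ordinary list, so it can change on every iteration.
growing = [crop_2d(x, shape=[step + 1, 2]) for step in range(3)]
```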

fix download.py empty line, test=develop (#19870) · bf836736
Committed by Zeng Jinle on Sep 20, 2019

bf836736
refine executor bug info (#19887) · 1f686744
Committed by chengduo on Sep 20, 2019
```
test=develop
```
1f686744

19 Sep, 2019 · 9 commits

hide with inference optim API (#17355) · fe18cfdb
Committed by flame on Sep 19, 2019

fe18cfdb
Remove constraint that last dimension is forced to be 1 in cross_entropy (#19606) · b125e327
Committed by Aurelius84 on Sep 19, 2019
```
* Remove constraint that last dimension is forced to be 1 in cross_entropy
test=develop

* modify labels last dims test=develop
```
b125e327
change _origin_program test=develop (#19863) · e8d3745c
Committed by gongweibao on Sep 19, 2019
```
change _origin_program test=develop
```
e8d3745c

add precise roi pooling op test=develop (#18960) · a7c440d3

Committed by wopeizl on Sep 19, 2019

* add precise roi pooling op test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* detail the description test=develop

* test=develop

* elaborate the doc for return type test=develop

* test=develop

a7c440d3

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6

distribute.launch use poll to query subprocess (#19853) · 8c2c8dc6
Committed by WangXi on Sep 18, 2019
```
distribute.launch use poll to query subprocess
```
8c2c8dc6
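
Polling child processes, as opposed to blocking on each one in turn, can be sketched with the standard library. This is not the launcher's actual code: `launch` and `wait_all` are hypothetical helpers, and the children are trivial Python one-liners that exit with a chosen code.

```python
import subprocess
import sys
import time


def launch(exit_codes):
    """Start one child per requested exit code (stand-ins for trainer processes)."""
    return [
        subprocess.Popen([sys.executable, "-c", f"import sys; sys.exit({code})"])
        for code in exit_codes
    ]


def wait_all(procs, interval=0.05):
    """Poll every child until all have exited; return exit codes in launch order."""
    codes = [None] * len(procs)
    while any(code is None for code in codes):
        for i, proc in enumerate(procs):
            if codes[i] is None:
                codes[i] = proc.poll()  # None while the child is still running
        time.sleep(interval)
    return codes
```

Calling `wait_all(launch([0, 3]))` returns the exit code of each child. Because `poll()` never blocks, the loop can notice a failed worker even while earlier workers are still running.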

Disable test_dygraph_mnist_fp16.py (#19844) · 8e927327

Committed by chengduo on Sep 19, 2019

* Fix std::ostream& operator<<(std::ostream& os, const Tensor& t)
test=develop

* Fix test_dygraph_mnist_fp16
test=develop

* disable test_dygraph_mnist_fp16
test=develop

* revert tensor_util.cc fix
test=develop

8e927327

Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus. (#19714) · d9db94d7
Committed by Jie Fang on Sep 19, 2019
```
Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
```
d9db94d7

Strided slice (#19642) · 47af618f

Committed by wangchaochaohu on Sep 19, 2019

* strided_slice op basic function test=develop

* test=develop rewrite and fix

* fix bug test=develop

* fix for the PADDLE_ENFORCE usage

* add some unit tests

* fix for the API test and copyright and fix test=develop

* fix API.spec test=develop

* fix API.spec test=develop

* add axis parameter test=develop

* fix for the build error test=develop

* fix python api  test=develop

* fix the build test=develop

* fix build test=develop

* fix API spec test=develop

* test=develop add some comment and single op test

* fix API spec test=develop

* fix test=develop

* fix test=develop

* fix api test=develop

* fix api test=develop

* fix API.spec test=develop

* fix typo test=develop

* fix API.spec test=develop

* fix API typo test=develop

* fix doc and API.spec test=develop

47af618f
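
A strided slice selects every stride-th element between a start and an end along chosen axes. The plain-Python sketch below illustrates this on a 2-D list; `strided_slice_2d` is a hypothetical helper, and the parameter names `axes`, `starts`, `ends` and `strides` are assumed for illustration rather than taken from the commit.

```python
def strided_slice_2d(x, axes, starts, ends, strides):
    """Apply start:end:stride along each listed axis of a 2-D list."""
    slices = [slice(None), slice(None)]  # untouched axes keep everything
    for axis, start, end, stride in zip(axes, starts, ends, strides):
        slices[axis] = slice(start, end, stride)
    return [row[slices[1]] for row in x[slices[0]]]


x = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

every_other = strided_slice_2d(x, axes=[0, 1], starts=[0, 1], ends=[3, 4], strides=[2, 2])
reversed_cols = strided_slice_2d(x, axes=[1], starts=[3], ends=[0], strides=[-1])
```

The first call keeps rows 0 and 2 and columns 1 and 3; the second walks the columns backwards from index 3 down to, but not including, index 0.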
