Commits · 3cd985a66910599d3c911a0552fd4bd4485323a2 · Crayon鑫 / Paddle

19 Sep, 2019 5 commits

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6
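
For readers unfamiliar with the pattern named in #19776, the following is a minimal pure-Python sketch of the unfused computation that an fc + elementwise_add + layer_norm fusion would collapse into one op. It is an illustration only, not Paddle's kernel: the function names, the nested-list layout, and the `eps` default are all assumptions made for this example.

```python
import math


def fc(x, w, b):
    """Fully connected layer: x is [m][k], w is [k][n], b is [n]."""
    return [[sum(row[t] * w[t][j] for t in range(len(w))) + b[j]
             for j in range(len(b))]
            for row in x]


def elementwise_add(a, y):
    """Add two same-shaped 2-D nested lists element by element."""
    return [[p + q for p, q in zip(ra, ry)] for ra, ry in zip(a, y)]


def layer_norm(x, scale, bias, eps=1e-5):
    """Normalize each row to zero mean and unit variance, then scale and shift."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std * s + b
                    for v, s, b in zip(row, scale, bias)])
    return out


def fc_elementwise_layernorm(x, w, b, y, scale, bias, eps=1e-5):
    """Reference for the three-op chain; a fused op would do this in one pass."""
    return layer_norm(elementwise_add(fc(x, w, b), y), scale, bias, eps)
```

A fused kernel would presumably avoid writing the two intermediate results back to memory, which is the usual motivation for this kind of pass.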

distribute.launch use poll to query subprocess (#19853) · 8c2c8dc6
Committed by WangXi on Sep 18, 2019
```
distribute.launch use poll to query subprocess
```
8c2c8dc6
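
As a rough illustration of what "use poll to query subprocess" can mean, the sketch below launches several child processes and checks them with the non-blocking `subprocess.Popen.poll()` instead of a blocking `wait()`. This is one plausible reading of the commit title, not the actual `distribute.launch` code; `wait_all`, `launch`, and the polling interval are hypothetical names and values.

```python
import subprocess
import time


def wait_all(procs, interval=0.05):
    """Poll every child until all have exited; return exit codes in order."""
    codes = [None] * len(procs)
    while any(code is None for code in codes):
        for i, proc in enumerate(procs):
            if codes[i] is None:
                codes[i] = proc.poll()  # None while the child is still running
        if any(code is None for code in codes):
            time.sleep(interval)
    return codes


def launch(commands):
    """Start one child per command line and collect their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return wait_all(procs)
```

Polling lets the launcher notice a failed worker without being blocked on an earlier worker that is still running.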

Disable test_dygraph_mnist_fp16.py (#19844) · 8e927327

Committed by chengduo on Sep 19, 2019

* Fix std::ostream& operator<<(std::ostream& os, const Tensor& t)
test=develop

* Fix test_dygraph_mnist_fp16
test=develop

* disable test_dygraph_mnist_fp16
test=develop

* revert tensor_util.cc fix
test=develop

8e927327

Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus. (#19714) · d9db94d7
Committed by Jie Fang on Sep 19, 2019
```
Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
```
d9db94d7

Strided slice (#19642) · 47af618f

Committed by wangchaochaohu on Sep 19, 2019

* strided_slice op basic function test=develop

* test=develop rewrite and fix

* fix bug test=develop

* fix for the PADDLE_ENFORCE usage

* add some unit tests

* fix for the API test and copyright and fix test=develop

* fix API.spec test=develop

* fix API.spec test=develop

* add axis parameter test=develop

* fix for the build error test=develop

* fix python api  test=develop

* fix the build test=develop

* fix build test=develop

* fix API spec test=develop

* test=develop add some comment and single op test

* fix API spec test=develop

* fix test=develop

* fix test=develop

* fix api test=develop

* fix api test=develop

* fix API.spec test=develop

* fix typo test=develop

* fix API.spec test=develop

* fix API typo test=develop

* fix doc and API.spec test=develop

47af618f
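
To show what a strided slice over several axes looks like, here is a small pure-Python sketch on nested lists. It assumes ordinary Python `slice` semantics for `starts`, `ends`, and `strides`; the real `strided_slice` op may treat out-of-range or negative values differently, and the function signature below is hypothetical.

```python
def strided_slice(data, axes, starts, ends, strides):
    """Slice a nested list along the given axes using Python slice semantics."""
    spec = {axis: slice(start, end, stride)
            for axis, start, end, stride in zip(axes, starts, ends, strides)}

    def recurse(node, depth):
        if not isinstance(node, list):
            return node  # reached a scalar element
        # Axes that are not listed are taken whole.
        selected = node[spec.get(depth, slice(None))]
        return [recurse(child, depth + 1) for child in selected]

    return recurse(data, 0)


matrix = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11]]
# Every second row, and columns 3, 2, 1 in reverse order.
result = strided_slice(matrix, axes=[0, 1], starts=[0, 3], ends=[3, 0], strides=[2, -1])
```

Here `result` is `[[3, 2, 1], [11, 10, 9]]`.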

18 Sep, 2019 11 commits
- remove some flags and add comments to some flags, test=develop (#19813) · 13ca364c
  Committed by Zeng Jinle on Sep 18, 2019
  
  13ca364c
- Return correct current block of a var (#19850) · 3e1e1fee
  Committed by Huihuang Zheng on Sep 18, 2019
  
  3e1e1fee
- add retry function to try to solve grpc error code 14 (#19661) · 1bc285a5
  Committed by 123malin on Sep 18, 2019
```
* rpc retry for asyncsend/get/prefetch

* test=develop, change retry vlog level to 3

* test=develop, set default grpc_retry_times is 3
```
  1bc285a5
- refine reallocate of workspace size, test=develop (#19843) · 5eb381a3
  Committed by Zeng Jinle on Sep 18, 2019
  
  5eb381a3
- support MLU nums, test=develop (#19372) · 71b2ed61
  Committed by 石晓伟 on Sep 18, 2019
  
  71b2ed61
- Support dispensable student_loss in PaddleSlim distillation (#19824) · e2c6bada
  Committed by Bai Yifan on Sep 18, 2019
```
* support_dispensable_student_loss, test=develop

* add distillation test, test=develop

* fix distillation test non convergence problem, test=develop

* fix test_distillation fail problem, test=develop
```
  e2c6bada
- refine executor_gc_helper codes, test=develop (#19814) · 3f87464e
  Committed by Zeng Jinle on Sep 18, 2019
  
  3f87464e
- fix_roi_transform_bug (#19785) · 6d72a86b
  Committed by LielinJiang on Sep 18, 2019
  
  6d72a86b
- fix gc bug in controlflow ops, test=develop (#19827) · 3fd3b663
  Committed by Zeng Jinle on Sep 18, 2019
  
  3fd3b663
- Update elementwise double grad to save gpu memory (#19509) · 982e61f5
  Committed by Leo Chen on Sep 18, 2019
```
* update elementwise double grad to save gpu memory, test=develop

* update elementwise_mul/div_grad_grad to save memory, test=develop

* remove eval function in eigen statement to save memory, test=develop

* add unittest for elementwise_div_grad_grad without dout, test=develop

* add unittest for elementwise_add_grad_grad without ddx, test=develop

* add float16 cuda kernel for elementwise double grad op, test=develop
```
  982e61f5
- [Bug fix] Disable memory reuse on fed variables (#19835) · db26de83
  Committed by Zeng Jinle on Sep 18, 2019
```
* fix memory reuse bug on feeding variables, test=develop

* add comments to reference count members, test=develop
```
  db26de83
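
The gRPC retry change listed above (#19661) targets status code 14, which gRPC defines as UNAVAILABLE, and sets the default retry count to 3. A minimal sketch of that idea follows. It uses a stand-in exception rather than the real gRPC library; `RpcError`, `call_with_retry`, and `backoff_s` are hypothetical names, and the actual Paddle implementation may differ.

```python
import time

UNAVAILABLE = 14  # gRPC status code 14 (UNAVAILABLE)


class RpcError(Exception):
    """Stand-in for an RPC failure carrying a numeric status code (hypothetical)."""

    def __init__(self, code):
        super().__init__(f"rpc failed with code {code}")
        self.code = code


def call_with_retry(rpc, retry_times=3, backoff_s=0.0):
    """Call rpc(); on UNAVAILABLE, retry up to retry_times more times."""
    attempt = 0
    while True:
        try:
            return rpc()
        except RpcError as err:
            # Other status codes and exhausted retries are re-raised unchanged.
            if err.code != UNAVAILABLE or attempt >= retry_times:
                raise
            attempt += 1
            time.sleep(backoff_s)
```

With the default of 3 retries, a call that keeps failing with code 14 is attempted four times in total before the error propagates.
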
17 Sep, 2019 20 commits

Add MKLDNNhandlerT templatized class (#19801) · dfdd73cb
Committed by Adam on Sep 17, 2019
```
test=develop
```
dfdd73cb
fix leaky_relu op when alpha is zero, test=develop (#19833) · cabb9501
Committed by Zeng Jinle on Sep 17, 2019

cabb9501

zerocopytensor support uint8, analysis config support profile, analysis... · 9cbc1eff

Committed by Pei Yang on Sep 17, 2019

zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)

9cbc1eff

add deformable conv v1 op and cpu version of deformable conv v2 (#18500) · 00efd1d8
Committed by chengjuntao on Sep 17, 2019
```
* add deformable conv v1 op, test=develop
```
00efd1d8
rm return in vfork (#19734) · 40c66f8d
Committed by Thunderbrook on Sep 17, 2019
```
* rm return in vfork

* rm return in vfork
test=develop
```
40c66f8d
Add fp16 support for dygraph (#19828) · b99fc38c
Committed by chengduo on Sep 17, 2019
```
* Add fp16 support for dygraph
test=develop

* Add unit test
test=develop
```
b99fc38c
fix memory optimization type (#19781) · 110be57c
Committed by Zhaolong Xing on Sep 17, 2019
```
test=develop
```
110be57c

Enhance OpTest to support double grad inplace check (#19826) · 5fbf03d6

Committed by Leo Chen on Sep 17, 2019

* update OpTest to support double grad inplace check, test=develop

* keep consistency of _calc_output function, test=develop

5fbf03d6

fix libps.so path problem (#19768) · 6045541e
Committed by xujiaqi01 on Sep 17, 2019
```
* fix libps.so path problem of  1/2/3 dir and third_party
* test = develop
```
6045541e

fix pow op, support tensor for argument factor. (#19313) · 677e7144

Committed by liym27 on Sep 17, 2019

improve pow op according to reviews:
1. Delete unnecessary judgement statements in PowGradOpDescMaker;
2. Improve test of test_api;

overload GetKernelTypeForVar

add stop_gradient=True when attr(factor) is tensor Variable, change examples in API pow.
test=develop,test=document_preview

677e7144

add tensor support for argument shape in reshape op; (#19268) · bd89a273

Committed by liym27 on Sep 17, 2019

add support parameter inference when argument shape is a list containing integer and tensor variable;
test=develop

fix reshape op according to reviews:
1. improve error message;
2. improve test of test_api.
test=develop,test=document_preview

fix reshape op: Add error message in nn.py, test=develop

add stop_gradient=True when attr(shape) is tensor Variable.
change examples in API reshape.
test=develop,test=document_preview

bd89a273

add tensor(tensor and tensor in list) support for argument starts and ends in slice op; (#19208) · 88628016

Committed by liym27 on Sep 17, 2019

add support parameter inference when arguments starts or ends is a list containing integer and tensor variable;
test=develop,test=document_preview

improve slice op according to review(from hongyu). test=develop

fix slice op according to review: infer_flags, test=develop

fix slice op: improve overload operator __getitem__ to support attrs(starts and ends) are Variable.
test=develop,test=document_preview

fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop

add stop_gradient=True when attr(starts) or attr(ends) is tensor Variable.
test=develop,test=document_preview

88628016

fix expand op: (#19302) · e9e3c087

Committed by liym27 on Sep 17, 2019

1. add tensor support for argument expand_times in expand op;
2. add support parameter inference when argument expand_times is a list containing integer and tensor variable;

improve expand op according to reviews:
1. add doc of ExpandTimes in expand_op.cc;
2. improve the test of test_api.

add stop_gradient=True when attr(expand_times) is tensor Variable, change code examples.
test=develop,test=document_preview

e9e3c087

support preload thread, optimize hdfs log, fix master+patch bug (#19695) · 6bf298bf
Committed by xujiaqi01 on Sep 17, 2019
```
* support preload thread
* sleep before fleet wrapper exit for pslib core dump
* optimize hdfs log
* fix master+patch bug
```
6bf298bf
Add comments for CUDA Device Context Allocator related stuff (#19809) · a0d80754
Committed by Huihuang Zheng on Sep 17, 2019

a0d80754

Feature/add transform data dygraph (#19707) · cc311bdf

Committed by Jiabin Yang on Sep 17, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix compile error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* add transform_data to dygraph

* test=develop, refactor name to make it easier to understand

* test=develop, refactor name to make it easier to understand

* add test and change input to const ref for safety

* test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ

* add ut for data transform

* refine ut for data_transform

* test=develop, fix ut failed on parallel se-resnext

* test=develop, change one more PADDLE_ENFORCE

* add test_tracer on multiple devices

* test=develop, change place to mutable for data transform

* test=develop, add transform data on same place test and remove useless log

* test=develop, add TODO for data layout and UT for conv2d with no bias

cc311bdf

cpu Conv double grad (#19672) · b76343c3
Committed by lvmengsi on Sep 17, 2019
```
* cpu conv_grad_grad
```
b76343c3
disable memory optimization passes when FLAGS_use_ngraph=True, test=develop (#19778) · 754fd57e
Committed by Zeng Jinle on Sep 17, 2019

754fd57e

Implement FusedEmbeddingSeqPoolGradKernel with cblas_saxpy (#19770) · 93c85c93

Committed by 翟飞跃 on Sep 17, 2019

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable MKLML implementation when not on Linux.

test=develop

* optimize bp with mkl sparse matrix
test=develop

* tmp add fused_emb_seq layer

* Add the support of padding_idx attribute.

test=develop

* add padding_idx support
test=develop

* implement grad refer lego
test=develop

93c85c93

Fix example error of Variable and Operator (#19821) · 2729c174
Committed by chengduo on Sep 17, 2019
```
* fix example error
test=develop

* Remove set_desc
test=develop
```
2729c174

16 Sep, 2019 4 commits

Fix warning info of build_strategy (#19805) · 82814970
Committed by chengduo on Sep 16, 2019
```
* fix warning info
test=develop

* fix bug of all_reduce_deps_pass
test=develop
```
82814970
add unittest for square error cost op (#19746) · a0e9b7b9
Committed by ruri on Sep 16, 2019
```
* add unit test for square error cost op
```
a0e9b7b9
fix retry allocator bug, test=develop (#19794) · b34933d9
Committed by Zeng Jinle on Sep 16, 2019

b34933d9

Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758

Committed by Yiqun Liu on Sep 16, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

* Enhance fc_fuse_pass to enable fusing relu.

* Allow print the shapes of var_desc in graph.
test=develop

* Enhance fc_fuse_pass_tester.

* Remove the use of PADDLE_ENFORCE.
test=develop

* Correct the number of ops after fusing.
test=develop

* Fix a typo.
test=develop

* Set activation_type to null when there is no relu in fc.
test=develop

* Refine fc_fuse_pass's codes.

* Enable the set of shape for tensor.

* Refine repeated_fc_relu_pass and add unittest.
test=develop

c67c8758

Crayon鑫 / Paddle is in sync with the fork's source project