- 20 9月, 2019 1 次提交
-
-
由 Zhang Ting 提交于
add crop_tensor op. The main difference with crop is : 1. If the argument shape is a list, each element is an integer or a tensor variable with shape: [1]. This way is suitable for the case that the shape may be changed each iteration. 2. If the argument shape is a variable. Its rank must be 1. In crop op, the rank of shape must be the same as x offsets can be a list, in which each element is an integer or a tensor variavle with shape: [1].
-
- 19 9月, 2019 11 次提交
-
-
由 lidanqing 提交于
* fix conflicts test=develop * change mask_bias_reorder test=develop * add ComputeMask function to make code clear test=develop * change according to reviews test=develop * change according to reviews test=develop
-
由 joanna.wozna.intel 提交于
* Fix conv2d+dequantize squash for residual fusion test=develop * Change condition test=develop
-
由 Huihuang Zheng 提交于
Add boost as dependency of prune fix #19862
-
由 Adam 提交于
* Add template functions for Acquire primitive/primitive_desc test=develop * Move acquire primitive descriptor to protected section test=develop
-
由 flame 提交于
-
由 Leo Chen 提交于
-
由 Aurelius84 提交于
* Remove constraint that last dimension is forced to be 1 in cross_entropy test=develop * modify labels last dims test=develop
-
由 wopeizl 提交于
* add precise roi pooling op test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * detail the description test=develop * test=develop * elaborate the doc for return type test=develop * test=develop
-
由 Yiqun Liu 提交于
* Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop
-
由 Jie Fang 提交于
Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
-
由 wangchaochaohu 提交于
* strided_slice op basic function test=develop * test=develop rewrite and fix * fix bug test=develop * fix for the PADDLE_ENFORCE usage * add some unit testw * fix for the aip test and copright and fix test=develop * fix API.spec test=develop * fix API.spec test=develop * add axis parameter test=develop * fix for the build error test=develop * fix python api test=develop * fix the build test=develop * fix build test=develop * fix API spec test=develop * test=develop add some comment and single op test * fix API spece test=develop * fix test=develop * fix test=develop * fix api test=develop * fix api test=develop * fix API.spec test=develop * fix typo test=develop * fix API.spec test=develop * fix API typo test=develop * fix doc and API.spec test=develop
-
- 18 9月, 2019 9 次提交
-
-
由 Zeng Jinle 提交于
-
由 123malin 提交于
* rpc retry for asycsend/get/prefetch * test=develop, change retry vlog level to 3 * test=develop, set default grpc_retry_times is 3
-
由 Zeng Jinle 提交于
-
由 石晓伟 提交于
-
由 Zeng Jinle 提交于
-
由 LielinJiang 提交于
-
由 Zeng Jinle 提交于
-
由 Leo Chen 提交于
* update elementwise double grad to save gpu memory, test=develop * update elementwise_mul/div_grad_grad to save memory, test=develop * remove eval function in eigen statement to save memory, test=develop * add unittest for elementwise_div_grad_grad without dout, test=develop * add unittest for elementwise_add_grad_grad without ddx, test=develop * add float16 cuda kernel for elementwise double grad op, test=develop
-
由 Zeng Jinle 提交于
* fix memory reuse bug on feeding variables, test=develop * add comments to reference count members, test=develop
-
- 17 9月, 2019 16 次提交
-
-
由 Adam 提交于
test=develop
-
由 Zeng Jinle 提交于
-
由 Pei Yang 提交于
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
-
由 chengjuntao 提交于
* add deformable conv v1 op, test=develop
-
由 Thunderbrook 提交于
* rm return in vfork * rm return in vfork test=develop
-
由 Zhaolong Xing 提交于
test=develop
-
由 liym27 提交于
improve pow op according to reviews: 1. Delete unnecessary judgement statements in PowGradOpDescMaker; 2. Improve test of test_api; overload GetKernelTypeForVar add stop_gradient=True when attr(factor) is tensor Variable, change examples in API pow. test=develop,test=document_preview
-
由 liym27 提交于
add support parameter inference when argument shape is a list containing integer and tensor variable; test=develop fix reshape op according to reviews: 1. improve or message; 2. improve test of test_api. test=develop,test=document_preview fix reshape op: Add error message in nn.py, test=develop add stop_gradient=True when attr(shape) is tensor Variable. change examples in API reshape. test=develop,test=document_preview
-
由 liym27 提交于
add support parameter inference when arguments starts or ends is a list containing integer and tensor variable; test=develop,test=document_preview improve slice op according to review(from hongyu). test=develop fix slice op according to review: infer_flags, test=develop fix slice op: improve overload operator __getitem__ to support attrs(starts and ends) are Variable. test=develop,test=document_preview fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop add stop_gradient=True when attr(starts) or attr(ends) is tensor Variable. test=develop,test=document_preview
-
由 liym27 提交于
1. add tensor support for argument expand_times in expand op; 2. add support parameter inference when argument expand_times is a list containing integer and tensor variable; improve expand op according to reviews: 1. add doc of ExpandTimes in expand_op.cc; 2. improve the test of test_api. add stop_gradient=True when attr(expand_times) is tensor Variable, change code examples. test=develop,test=document_preview
-
由 xujiaqi01 提交于
* support preload thread * sleep before fleet wrapper exit for pslib core dump * optimize hdfs log * fix master+patch bug
-
由 Huihuang Zheng 提交于
-
由 Jiabin Yang 提交于
* refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * add transform_data to dygraph * test=develop, refoctor name to make it easier to understand * test=develop, refoctor name to make it easier to understand * add test and change input to const ref for safety * test=develop, fix multi-gpu failed problem , add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * add ut for data transform * refine ut for data_transform * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE * add test_tracer on multiple devices * test=develop, change place to mutable for data transform * test=develop, add transform data on same place test and remove useless log * test=develop, Add to do for data layout and and ut for conv2d with no bias
-
由 lvmengsi 提交于
* cpu conv_grad_grad
-
由 Zeng Jinle 提交于
-
由 翟飞跃 提交于
* Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * optimize bp with mkl sparse matrix test=develop * tmp add fused_emb_seq layer * Add the support of padding_idx attribute. test=develop * add padding_idx support test=develop * implement grad refer lego test=develop
-
- 16 9月, 2019 3 次提交
-
-
由 chengduo 提交于
* fix warning info test=develop * fix bug of all_reduce_deps_pass test=develop
-
由 Zeng Jinle 提交于
-
由 Yiqun Liu 提交于
* Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop * Enhance fc_fuse_pass to enable fusing relu. * Allow print the shapes of var_desc in graph. test=develop * Enhance fc_fuse_pass_tester. * Remove the use of PADDLE_ENFORCE. test=develop * Correct the number of ops after fusing. test=develop * Fix a typo. test=develop * Set activation_type to null when there is no relu in fc. test=develop * Refine fc_fuse_pass's codes. * Enable the set of shape for tensor. * Refine repeated_fc_relu_pass and add unittest. test=develop
-