- 20 Sep, 2019 3 commits
-
-
Committed by Zhang Ting
Add crop_tensor op. The main differences from crop are: 1. If the argument shape is a list, each element is an integer or a tensor variable with shape [1]. This suits cases where the shape may change on each iteration. 2. If the argument shape is a variable, its rank must be 1, whereas in the crop op the rank of shape must be the same as that of x. offsets can also be a list, in which each element is an integer or a tensor variable with shape [1].
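The cropping rule described in this commit message can be illustrated with a minimal pure-Python sketch. It is not the Paddle implementation: `crop_sketch` and `as_int` are hypothetical names, nested lists stand in for tensors, and a one-element list such as `[3]` stands in for a tensor variable with shape [1].

```python
def as_int(value):
    # Accept a plain integer, or a one-element list standing in for a
    # tensor variable with shape [1] (an assumption made for this sketch).
    if isinstance(value, (list, tuple)):
        return int(value[0])
    return int(value)


def crop_sketch(x, shape, offsets):
    """Crop nested list `x` to `shape`, starting at `offsets` in each dimension."""
    if len(shape) != len(offsets):
        raise ValueError("shape and offsets need one entry per dimension")
    if not shape:
        # No dimensions left to crop: x is a single element.
        return x
    start, size = as_int(offsets[0]), as_int(shape[0])
    # Keep `size` entries of the leading dimension, then crop the rest.
    return [crop_sketch(row, shape[1:], offsets[1:])
            for row in x[start:start + size]]


x = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]
# Mix plain integers with "tensor-like" one-element lists, as the message allows.
out = crop_sketch(x, shape=[2, [3]], offsets=[1, [0]])
print(out)
```

Because each entry is resolved at call time, the same call works when a shape entry changes between iterations.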
-
Committed by Zeng Jinle
-
Committed by chengduo
test=develop
-
- 19 Sep, 2019 9 commits
-
-
Committed by flame
-
Committed by Aurelius84
* Remove constraint that last dimension is forced to be 1 in cross_entropy test=develop * modify labels last dims test=develop
-
Committed by gongweibao
change _origin_program test=develop
-
Committed by wopeizl
* add precise roi pooling op test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * detail the description test=develop * test=develop * elaborate the doc for return type test=develop * test=develop
-
Committed by Yiqun Liu
* Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop
-
Committed by WangXi
distribute.launch use poll to query subprocess
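As a rough illustration of polling child processes instead of blocking on each one in turn, here is a minimal sketch using only the standard library. It is not the code from this commit: `wait_all` is a hypothetical helper, and the two child commands are placeholders that simply exit with a given code.

```python
import subprocess
import sys
import time


def wait_all(procs, interval=0.05):
    """Poll every child until all have exited; return exit codes in order."""
    codes = [None] * len(procs)
    while any(code is None for code in codes):
        for i, proc in enumerate(procs):
            if codes[i] is None:
                # poll() returns None while the child is still running,
                # otherwise its exit code. It never blocks.
                codes[i] = proc.poll()
        time.sleep(interval)
    return codes


# Placeholder children: each one exits immediately with the given code.
procs = [
    subprocess.Popen([sys.executable, "-c", f"import sys; sys.exit({rc})"])
    for rc in (0, 3)
]
exit_codes = wait_all(procs)
print(exit_codes)
```

Polling lets the launcher notice a failed worker while the others are still running, which a sequential `wait()` on the first process would not.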
-
Committed by chengduo
* Fix std::ostream& operator<<(std::ostream& os, const Tensor& t) test=develop * Fix test_dygraph_mnist_fp16 test=develop * disable test_dygraph_mnist_fp16 test=develop * revert tensor_util.cc fix test=develop
-
Committed by Jie Fang
Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
-
Committed by wangchaochaohu
* strided_slice op basic function test=develop * test=develop rewrite and fix * fix bug test=develop * fix for the PADDLE_ENFORCE usage * add some unit tests * fix for the api test and copyright and fix test=develop * fix API.spec test=develop * fix API.spec test=develop * add axis parameter test=develop * fix for the build error test=develop * fix python api test=develop * fix the build test=develop * fix build test=develop * fix API spec test=develop * test=develop add some comment and single op test * fix API spec test=develop * fix test=develop * fix test=develop * fix api test=develop * fix api test=develop * fix API.spec test=develop * fix typo test=develop * fix API.spec test=develop * fix API typo test=develop * fix doc and API.spec test=develop
-
- 18 Sep, 2019 6 commits
-
-
Committed by Zeng Jinle
-
Committed by Huihuang Zheng
-
Committed by 123malin
* rpc retry for async send/get/prefetch * test=develop, change retry vlog level to 3 * test=develop, set default grpc_retry_times to 3
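The retry behaviour mentioned here can be sketched generically as follows. This is only an illustration, not Paddle's RPC client: `call_with_retry` and `flaky_send` are hypothetical, the default of 3 mirrors the `grpc_retry_times` value in the message, and treating that number as the total attempt count is an assumption.

```python
def call_with_retry(send, retry_times=3):
    """Call `send` until it succeeds or `retry_times` attempts have failed."""
    last_error = None
    for _ in range(retry_times):
        try:
            return send()
        except ConnectionError as err:
            # Remember the failure and try again on the next loop iteration.
            last_error = err
    raise last_error


calls = {"n": 0}


def flaky_send():
    # Hypothetical RPC stand-in: fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = call_with_retry(flaky_send)
print(result, calls["n"])
```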
-
Committed by Bai Yifan
* support_dispensable_student_loss, test=develop * add distillation test, test=develop * fix distillation test non convergence problem, test=develop * fix test_distillation fail problem, test=develop
-
Committed by LielinJiang
-
Committed by Zeng Jinle
* fix memory reuse bug on feeding variables, test=develop * add comments to reference count members, test=develop
-
- 17 Sep, 2019 13 commits
-
-
Committed by chengjuntao
* add deformable conv v1 op, test=develop
-
Committed by chengduo
* Add fp16 support for dygraph test=develop * Add unit test test=develop
-
Committed by Leo Chen
* update OpTest to support double grad inplace check, test=develop * keep consistency of _calc_output function, test=develop
-
Committed by xujiaqi01
* fix libps.so path problem of 1/2/3 dir and third_party * test = develop
-
Committed by liym27
improve pow op according to reviews: 1. Delete unnecessary judgement statements in PowGradOpDescMaker; 2. Improve test of test_api; overload GetKernelTypeForVar add stop_gradient=True when attr(factor) is tensor Variable, change examples in API pow. test=develop,test=document_preview
-
Committed by liym27
add support parameter inference when argument shape is a list containing integer and tensor variable; test=develop fix reshape op according to reviews: 1. improve or message; 2. improve test of test_api. test=develop,test=document_preview fix reshape op: Add error message in nn.py, test=develop add stop_gradient=True when attr(shape) is tensor Variable. change examples in API reshape. test=develop,test=document_preview
-
Committed by liym27
add support parameter inference when arguments starts or ends is a list containing integer and tensor variable; test=develop,test=document_preview improve slice op according to review(from hongyu). test=develop fix slice op according to review: infer_flags, test=develop fix slice op: improve overload operator __getitem__ to support attrs(starts and ends) are Variable. test=develop,test=document_preview fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop add stop_gradient=True when attr(starts) or attr(ends) is tensor Variable. test=develop,test=document_preview
-
Committed by liym27
1. add tensor support for argument expand_times in expand op; 2. add support parameter inference when argument expand_times is a list containing integer and tensor variable; improve expand op according to reviews: 1. add doc of ExpandTimes in expand_op.cc; 2. improve the test of test_api. add stop_gradient=True when attr(expand_times) is tensor Variable, change code examples. test=develop,test=document_preview
-
Committed by xujiaqi01
* support preload thread * sleep before fleet wrapper exit for pslib core dump * optimize hdfs log * fix master+patch bug
-
Committed by Jiabin Yang
* refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix compile error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * add transform_data to dygraph * test=develop, refactor name to make it easier to understand * test=develop, refactor name to make it easier to understand * add test and change input to const ref for safety * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * add ut for data transform * refine ut for data_transform * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE * add test_tracer on multiple devices * test=develop, change place to mutable for data transform * test=develop, add transform data on same place test and remove useless log * test=develop, Add to do for data layout and ut for conv2d with no bias
-
Committed by lvmengsi
* cpu conv_grad_grad
-
Committed by 翟飞跃
* Implement the operator with sparse matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implementation on non-Linux platforms. test=develop * optimize bp with mkl sparse matrix test=develop * tmp add fused_emb_seq layer * Add the support of padding_idx attribute. test=develop * add padding_idx support test=develop * implement grad refer lego test=develop
-
Committed by chengduo
* fix example error test=develop * Remove set_desc test=develop
-
- 16 Sep, 2019 4 commits
-
-
Committed by ruri
* add unit test for square error cost op
-
Committed by zhongpu
* add kernel for squeeze_op, test=develop * delete comment, test=develop
-
Committed by Chen Weihang
-
Committed by tangwei12
fix wrong place with distributed_lookup_table
-
- 12 Sep, 2019 2 commits
-
-
Committed by Aurelius84
* add one_hot_v2_op to remove last_dims==1 test=develop * add api unittest code for CI_Coverage test=develop * improve CI_Coverage rate by adding test_with_depth test=develop
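To illustrate what dropping the last_dims==1 requirement means in practice, the following pure-Python sketch appends a new axis of size `depth` to labels of any nesting, so a flat list of N labels maps directly to N rows. It is an assumption-laden illustration, not the operator's kernel: `one_hot_v2_sketch` is a hypothetical name and nested lists stand in for tensors.

```python
def one_hot_v2_sketch(labels, depth):
    """Append a one-hot axis of size `depth` to an arbitrarily nested label list."""
    if isinstance(labels, list):
        # Recurse through the existing dimensions without requiring a
        # trailing dimension of size 1.
        return [one_hot_v2_sketch(item, depth) for item in labels]
    if not 0 <= labels < depth:
        raise ValueError(f"label {labels} is outside [0, {depth})")
    return [1.0 if i == labels else 0.0 for i in range(depth)]


encoded = one_hot_v2_sketch([1, 0, 3], depth=4)
print(encoded)
```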
-
Committed by JesseyXujin
-
- 11 Sep, 2019 3 commits
-
-
Committed by Huihuang Zheng
TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it improves memory performance. We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream and so avoid the singleton. Also add data_feed_proto to operator to fix CI for CPU compilation.
-
Committed by chengduo
Fix test_parallel_executor_test_while_train
-
Committed by Zeng Jinle
-