- 26 September 2019 (2 commits)

Committed by Zeng Jinle

Committed by qingqing01

- 25 September 2019 (8 commits)

Committed by zhongpu
* add kernel for fill_op, test=develop
* modify PADDLE_ENFORCE to PADDLE_ENFORCE_EQ, test=develop
* add op test for fill_op, test=develop
* REGISTER OP CUDA KERNEL, test=develop
* update test_fill_op.py, test=develop
* change FillConstantOpVarTypeInference to FillOpVarTypeInference, test=develop
* fix op test, test=develop
* add head file, test=develop

Committed by wangchaochaohu
* add support for tensor and tensorlist in strided_slice OP test=develop
* fix the comment test=develop
* fix test=develop
* fix the bug test=develop
* delete log test=develop
* fix API.spec test=develop
* fix test=develop

Committed by lvmengsi
* fix bn

Committed by ShenLiang
* treat broadcast as non-initial, test=develop
* rename the class name
* rename the class name, test=develop

Committed by Bob Zhu
* add support of matmul with multiple head even with different width and height. The original matmul with multiple head supports only mat_a.width == mat_b.height; in that case mat_b is split horizontally. This patch extends the support to mat_a.width != mat_b.height as long as mat_a.width / head_number == mat_b.height; in this case mat_b is split vertically. For example, if A is [3, 8], B is [2, 16] and head_number is 4, A is split into blocks of [3, 2] and B is (vertically) split into blocks of [2, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16]. test=develop
* refactor the code of matmul with multiple head even with different width and height test=develop
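
The split-multiply-concat logic described above can be illustrated with a small NumPy sketch; this is a hypothetical reference implementation of the vertical-split case, not the actual C++ kernel:

```python
import numpy as np

def multi_head_matmul(a, b, head_number):
    """Vertical-split case: a.shape[1] != b.shape[0],
    but a.shape[1] / head_number == b.shape[0]."""
    assert a.shape[1] // head_number == b.shape[0]
    a_parts = np.split(a, head_number, axis=1)   # head_number blocks of [3, 2]
    b_parts = np.split(b, head_number, axis=1)   # head_number blocks of [2, 4]
    outs = [a_i @ b_i for a_i, b_i in zip(a_parts, b_parts)]  # each block is [3, 4]
    return np.concatenate(outs, axis=1)          # concatenated result: [3, 16]

a = np.random.rand(3, 8)
b = np.random.rand(2, 16)
print(multi_head_matmul(a, b, head_number=4).shape)  # (3, 16)
```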

Committed by Liufang Sang
* refine ctc align op with padding
* refine api sample code

Committed by Tao Luo
* add input type and dtype check for softmax_op test=develop
* refine error message test=develop

Committed by Aurelius84
* Remove last-dims constraints of seq_pad and seq_unpad test=develop
* fix test_layer api code test=develop
* fix sequence_pad_op.cc conflict test=develop
* remove test_analyzer_mm_dnn test=develop
* fix vectorize bug test=develop
* fix vectorize<int> test=develop

- 24 September 2019 (8 commits)

Committed by jhjiangcs

Committed by Yang Zhang
* Add float16 support to `sync_batch_norm_op` test=develop
* Add test for sync_bn with FP16 input test=develop

Committed by Aurelius84
* Remove the constraint that the last dimension is forced to be 1 by adding lookup_table_v2 test=develop
* modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop
* Revert "modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop". This reverts commit 8a960bfc61e51aa27c3c529df8fb90b93ebd19f9.
* move api into fluid.embedding test=develop
* fix example code test=develop
* move one_hot into fluid.one_hot
* modify api.spec test=develop
* fix loss shape test=develop
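
As a rough illustration of the relocated APIs, here is a minimal sketch assuming the Fluid 1.6-style `fluid.embedding` and `fluid.one_hot` interfaces; the vocabulary size, embedding width, and variable names are made up:

```python
import paddle.fluid as fluid

# ids no longer need a trailing dimension of size 1
ids = fluid.layers.data(name='ids', shape=[10], dtype='int64')

# lookup table of 128 tokens with 16-dim embeddings (illustrative sizes)
emb = fluid.embedding(input=ids, size=[128, 16])

# one-hot encoding over the same vocabulary depth
one_hot = fluid.one_hot(input=ids, depth=128)
```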

Committed by xujiaqi01
* support changing shuffle thread num
* support changing train thread num
* fix receiving shuffle data of each channel
* data norm stop gradient
* add check of thread_tensor type and root_tensor type when merging metric
* remove sleep in shuffle, add config
* add config of pslib client-to-client communication
* fix xbox str
* add data norm op testcase
* add flush in trainer finalize

Committed by Kaipeng Deng

Committed by Ghost Under Moon
* give warnings when saving a model without any parameters test=develop
* delete one line comment test=develop

Committed by Zeng Jinle
* add py_reader combination unittest, test=develop
* follow huihuang's comments, test=develop

Committed by Leo Chen
* make OpTest check grad inplace even if forward has no inplace, test=develop
* do not run PE when enable_inplace is False, test=develop
* add conv3d cuda kernel for float16 type, test=develop
* refactor OpTest for inplace, test=develop
* add comments, test=develop

- 23 September 2019 (8 commits)

Committed by Zhang Ting

Committed by mapingshuo
* add recompute-based checkpoint methods for large batch training test=develop
* add append_backward_with_forward_recomputation test=develop
* refine optimizer test=develop
* update backward and optimizer test=develop
* make Variable usable test=develop
* add recompute code
* refine optimizer test=develop
* refine addup in _append_backward_ops_with_checkpoints_: 1) for the recompute part, just cache the grad_op_desc without appending it to the block; 2) before appending grad_op_desc to the backward part, addup_repetitive_vars and remove unused branches test=develop
* make method private
* add recompute strategy into DistributedStrategy test=develop
* checkpoint version3 test=develop
* remove some print information test=develop
* remove unused sumop test=develop
* try to fix recompute with graph building modules
* add input names to vars that should be held
* add memory debug tool
* backup backward
* Fix bugs
* add backward desc for ops not in any segment
* add exception info for sub_block test=develop
* modify code style test=develop
* modify code style test=develop
* remove print functions test=develop
* add API spec test=develop test=document_preview
* make Recompute a child class of Optimizer test=develop test=document_preview
* add API spec test=develop test=document_preview
* modify API spec test=develop test=document_preview
* add document for Recompute test=develop test=document_preview
* change API doc of Recompute test=develop test=document_preview
* code cleaning test=develop test=document_preview
* modify API spec
* fix bugs when segments hold no element
* add testcase for Recompute Optimizer test=develop test=document_preview
* add test for apply_gradient, and code cleaning test=develop test=document_preview
* add test case for load function
* enable CI test=develop test=document
* add test case test=develop test=document_preview
* add sample code for 4 functions of recompute optimizer test=develop test=document_preview
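
A rough usage sketch of the resulting optimizer wrapper follows; the network, the checkpoint variables, and the `_set_checkpoints` call are assumptions based on this commit message rather than a verified copy of the final API:

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[32], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='int64')
fc_1 = fluid.layers.fc(input=x, size=64, act='relu')
pred = fluid.layers.fc(input=fc_1, size=10, act='softmax')
loss = fluid.layers.mean(fluid.layers.cross_entropy(input=pred, label=y))

sgd = fluid.optimizer.SGD(learning_rate=0.01)
# Recompute is a child class of Optimizer wrapping another optimizer;
# activations between the given checkpoints are recomputed in backward
# instead of being kept in memory.
recompute = fluid.optimizer.RecomputeOptimizer(sgd)
recompute._set_checkpoints([fc_1, pred])   # assumed checkpoint-setting call
recompute.minimize(loss)
```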

Committed by Ghost Under Moon

Committed by wopeizl
* optimize the error information when the input for while op has a wrong shape test=develop

Committed by ruri
* add mse_loss op

Committed by Tao Luo
* move tree_conv to fluid.contrib.layers test=develop
* update API.spec for tree_conv test=develop
* update tree_conv api to increase unit coverage test=develop

Committed by Zeng Jinle
* unify DataLoader APIs, test=develop
* integrate iterable CPU Dataset, test=develop
* add GPU dataset support, test=develop
* add unittests for dataset, test=develop
* add more docs to dataloader apis, test=develop, test=document_preview
* refine doc, test=develop
* refine doc again, test=develop
* increase coverage, test=develop

Committed by tangwei12
* optimize cloud rolemaker, test=develop

- 22 September 2019 (1 commit)

Committed by lvmengsi
* add instance norm op

- 21 September 2019 (3 commits)

Committed by Adam
* Initial, functional commit
* Clean commit related files test=develop

Committed by Jiabin Yang
* refactor dygraph, test=develop
* fix failed unittest, test=develop
* polish code, test=develop
* check windows ci error, test=develop
* try to fix windows ci error by np.allclose, test=develop
* polish vlog and profiler, test=develop
* try to fix preceding ops order, test=develop
* test transformer in windows ci, test=develop
* use python c-api to speed up tracer.trace, test=develop
* test=develop, fix docker with paddle nccl problem
* test=develop, add ut for debug string and gradient_accumulator
* test=develop, add tests for layer/gradient_accumulator/prepared_op
* test=develop, fix compile error for test_prepared_op
* test=develop, add more ut for dygraph
* test=develop, create API.spec for dygraph api change
* test=develop, refactor name to make it easier to understand
* test=develop, refactor name to make it easier to understand
* test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
* test=develop, fix ut failed on parallel se-resnext
* test=develop, change one more PADDLE_ENFORCE
* support auto prune in dygraph mode
* test=develop, support auto prune
* test=develop, merge develop conflict
* test=develop, fix test_layer and test_tracer ut
* test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs

Committed by Aurelius84

- 20 September 2019 (5 commits)

Committed by Zeng Jinle

Committed by Aurelius84
* support 2-level lod of input in sequence_pool test=develop
* fix lod level bug in .cu test=develop

Committed by Zhang Ting
1. group_norm supports data_layout=NHWC
2. modified doc of group_norm
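
A minimal usage sketch of the new layout option, assuming the fluid.layers.group_norm interface; the input shape and group count are illustrative:

```python
import paddle.fluid as fluid

# NHWC input: the channel dimension comes last (here C = 32)
x = fluid.layers.data(name='x', shape=[64, 64, 32], dtype='float32')
out = fluid.layers.group_norm(input=x, groups=8, data_layout='NHWC')
```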

Committed by Zhang Ting
Modified interpolate_op to support tensor attributes:
1. the parameter out_shape of image_resize and resize_nearest/bilinear/trilinear can be a list or a 1-D tensor variable; if it is a list, each element can be an integer or a tensor variable of shape [1].
2. the parameter scale of the above ops can be a 1-D tensor variable.
Also modified the documents of image_resize, resize_nearest, resize_bilinear, resize_trilinear and added some code examples.
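
A minimal sketch of what this enables, assuming the 1.x fluid.layers.resize_nearest and resize_bilinear interfaces; the concrete sizes are made up:

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[3, 32, 32], dtype='float32')

# out_shape as a list mixing a Python int and a shape-[1] tensor
out_w = fluid.layers.fill_constant(shape=[1], dtype='int32', value=48)
y1 = fluid.layers.resize_nearest(input=x, out_shape=[64, out_w])

# scale given as a 1-D tensor variable
scale = fluid.layers.fill_constant(shape=[1], dtype='float32', value=2.0)
y2 = fluid.layers.resize_bilinear(input=x, scale=scale)
```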

Committed by Zhang Ting
Add crop_tensor op. The main differences from crop are:
1. if the argument shape is a list, each element is an integer or a tensor variable of shape [1]; this suits the case where the shape may change each iteration.
2. if the argument shape is a variable, its rank must be 1, whereas in crop op the rank of shape must be the same as x.
offsets can be a list in which each element is an integer or a tensor variable of shape [1].
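
A minimal sketch of the two shape forms, assuming a fluid.layers.crop_tensor(x, shape, offsets) interface as described here; the sizes and offsets are illustrative:

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[3, 8, 8], dtype='float32',
                      append_batch_size=False)

# shape as a list: each element is an int or a shape-[1] tensor
crop_w = fluid.layers.fill_constant(shape=[1], dtype='int32', value=5)
y1 = fluid.layers.crop_tensor(x, shape=[3, 5, crop_w], offsets=[0, 1, 1])

# shape as a rank-1 tensor variable (crop a [3, 3, 3] block)
shape_t = fluid.layers.fill_constant(shape=[3], dtype='int32', value=3)
y2 = fluid.layers.crop_tensor(x, shape=shape_t)
```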

- 19 September 2019 (5 commits)

Committed by Aurelius84
* Remove the constraint that the last dimension is forced to be 1 in cross_entropy test=develop
* modify labels last dims test=develop

Committed by wopeizl
* add precise roi pooling op test=develop
* detail the description test=develop
* elaborate the doc for return type test=develop
* test=develop (repeated several times)

Committed by Yiqun Liu
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op. test=develop
* Add comment.
* Implement the unittest. test=develop
* Change the unittest name of layer_norm. test=develop

Committed by WangXi
distribute.launch uses poll to query subprocesses
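
In rough terms, this kind of non-blocking child-process check works like the standalone sketch below; this is plain subprocess code for illustration, not the actual paddle.distributed.launch source, and the worker command is hypothetical:

```python
import subprocess
import time

# launch a few hypothetical worker processes
procs = [subprocess.Popen(["python", "-u", "train.py"]) for _ in range(4)]

while procs:
    for p in list(procs):
        ret = p.poll()          # non-blocking: None while the child is still running
        if ret is None:
            continue
        procs.remove(p)
        if ret != 0:            # a worker failed: stop the remaining ones
            for other in procs:
                other.terminate()
            raise RuntimeError("worker exited with code %d" % ret)
    time.sleep(1)
```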

Committed by chengduo
* Fix std::ostream& operator<<(std::ostream& os, const Tensor& t) test=develop
* Fix test_dygraph_mnist_fp16 test=develop
* disable test_dygraph_mnist_fp16 test=develop
* revert tensor_util.cc fix test=develop