1. 23 Sep, 2019 1 commit
  2. 22 Sep, 2019 1 commit
  3. 21 Sep, 2019 2 commits
  4. 20 Sep, 2019 5 commits
    • A
      support 2-level lod of input in sequence_pool (#19839) · fcf53e55
      Aurelius84 committed
      * support 2-level lod of input in sequence_pool test=develop
      
      * fix lod level bug in .cu test=develop
      fcf53e55
    • Z
      group_norm support data_layout:NHWC, test=develop, test=document_preview (#19614) · 93364b45
      Zhang Ting committed
      1. group_norm support data_layout=NHWC
      2. modified doc of group_norm
      93364b45
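To illustrate what a channels-last (NHWC) layout means for group normalization, here is a minimal pure-Python sketch. It is not the operator's kernel: the function name `group_norm_nhwc` is hypothetical, the learnable scale and bias are omitted, and the input is assumed to be a nested list indexed `[n][h][w][c]`.

```python
import math

def group_norm_nhwc(x, groups, eps=1e-5):
    """Normalize a nested list indexed [n][h][w][c] per (sample, channel group)."""
    channels = len(x[0][0][0])
    size = channels // groups  # channels per group (assumes exact division)
    out = []
    for sample in x:
        stats = []
        for g in range(groups):
            # Gather every value of this group across all spatial positions.
            vals = [px[c] for row in sample for px in row
                    for c in range(g * size, (g + 1) * size)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            stats.append((mean, math.sqrt(var + eps)))
        out.append([[[(px[c] - stats[c // size][0]) / stats[c // size][1]
                      for c in range(channels)]
                     for px in row]
                    for row in sample])
    return out

# One sample, H=1, W=2, C=4, split into 2 groups of 2 channels.
x = [[[[1.0, 2.0, 10.0, 20.0], [3.0, 4.0, 30.0, 40.0]]]]
y = group_norm_nhwc(x, groups=2)
```

With NHWC the channel index is the innermost one, so each group is a contiguous slice of the last axis at every spatial position.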
    • J
      [MKL-DNN] LRN refactoring (#19798) · 619c797a
      Jacek Czaja committed
      - LRN mkl-dnn kernel refactor
      
      test=develop
      
      - compilation fix
      
      - Another compilation fix
      
      - Compilation fix
      
      - another compilation fix
      
      - compilation fix
      
      - Crash fix
      
      - optional LRN mkldnn workspace
      
      - Added mid allocation
      
      - Workaround for tests
      
      - Removed gradient from is_test ut
      
      - Removed mid for inference
      
      - Reverted LRN mid removal for is_test
      
      - PADDLE_ENFORCE adjusted
      
      - Rebase to templatization commit
      
      - Compilation fix
      
      - compilation fix
      
      test=develop
      
      - lint
      
      test=develop
      
      - Fix to crash
      
      - Rebase to recent codebase
      
      - lint
      
      - lint
      
      - compilation fix
      619c797a
    • Z
      modified interpolate op to support tensor attribute, test=develop, test=document_preview (#19287) · 439d95e1
      Zhang Ting committed
      modified interpolate_op to support tensor attribute
      
      1. the parameter out_shape of image_resize, resize_nearest, resize_bilinear and resize_trilinear can be a list or a 1-D tensor variable. If it is a list, each element can be an integer or a tensor variable with shape [1].
      
      2. the parameter scale of the above ops can be a 1-D tensor variable.
      modified the documentation of image_resize, resize_nearest, resize_bilinear and resize_trilinear and added some code examples.
      439d95e1
    • Z
      add crop_tensor_op, test=develop, test=document_preview (#19314) · b3888941
      Zhang Ting committed
      add crop_tensor op. The main differences from crop are:
      
      1. If the argument shape is a list, each element is an integer or a tensor variable with shape [1]. This suits cases where the shape may change on each iteration.
      
      2. If the argument shape is a variable, its rank must be 1. In the crop op, the rank of shape must be the same as that of x.
      
      offsets can be a list, in which each element is an integer or a tensor variable with shape [1].
      b3888941
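The mixed "integer or shape-[1] tensor" convention described in this commit can be sketched in plain Python. This is an illustration only, not the op's implementation: `resolve_dims` and `crop_2d` are hypothetical helpers, and a one-element list stands in for a tensor variable with shape [1].

```python
def resolve_dims(spec):
    """Turn a list of ints or one-element lists (stand-ins for shape-[1] tensors) into ints."""
    dims = []
    for item in spec:
        if isinstance(item, int):
            dims.append(item)
        elif isinstance(item, (list, tuple)) and len(item) == 1:
            dims.append(int(item[0]))  # value only known at run time
        else:
            raise ValueError("each element must be an int or a one-element list")
    return dims

def crop_2d(x, shape, offsets):
    """Crop a 2-D nested list to `shape`, starting at `offsets`."""
    rows, cols = resolve_dims(shape)
    r0, c0 = resolve_dims(offsets)
    return [row[c0:c0 + cols] for row in x[r0:r0 + rows]]

x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# The second shape entry and the first offset play the role of tensors.
cropped = crop_2d(x, shape=[2, [2]], offsets=[[1], 0])
```

Because the tensor-like entries are resolved at call time, the same call site can produce a different crop on each iteration.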
  5. 19 Sep, 2019 6 commits
    • L
      Refactor conv computeINT8 (#19574) · 2c32c2d6
      lidanqing committed
      * fix conflicts
      test=develop
      
      * change mask_bias_reorder
      test=develop
      
      * add ComputeMask function to make code clear
      test=develop
      
      * change according to reviews
      test=develop
      
      * change according to reviews
      test=develop
      2c32c2d6
    • A
      Add template functions for Acquire primitive/primitive_desc (#19867) · c7e68892
      Adam committed
      * Add template functions for Acquire primitive/primitive_desc
      test=develop
      
      * Move acquire primitive descriptor to protected section
      test=develop
      c7e68892
    • A
      Remove constraint that last dimension is forced to be 1 in cross_entropy (#19606) · b125e327
      Aurelius84 committed
      * Remove constraint that last dimension is forced to be 1 in cross_entropy
      test=develop
      
      * modify labels last dims test=develop
      b125e327
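One reading of this commit's title is that hard labels no longer need a trailing dimension of size 1, so labels shaped [N] and [N, 1] are both accepted. The sketch below shows that idea in plain Python under that assumption. The function `cross_entropy` here is a hypothetical stand-in, not the library's op.

```python
import math

def cross_entropy(probs, labels):
    """Per-sample loss; each label may be a bare index or a one-element list."""
    losses = []
    for p, lab in zip(probs, labels):
        idx = lab[0] if isinstance(lab, (list, tuple)) else lab
        losses.append(-math.log(p[idx]))
    return losses

probs = [[0.5, 0.5], [0.25, 0.75]]
flat = cross_entropy(probs, [0, 1])          # labels shaped [N]
nested = cross_entropy(probs, [[0], [1]])    # labels shaped [N, 1]
```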
    • W
      add precise roi pooling op test=develop (#18960) · a7c440d3
      wopeizl committed
      * add precise roi pooling op test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * detail the description test=develop
      
      * test=develop
      
      * elaborate the doc for return type test=develop
      
      * test=develop
      a7c440d3
    • Y
      Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6
      Yiqun Liu committed
      * Add fc_elementwise_layernorm_fuse pass and unittest.
      
      * Add fused_fc_elementwise_layernorm op and its GPU kernel.
      test=develop
      
      * Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
      
      * Add the setting of attrs in the definition of binary_op.
      test=develop
      
      * Add comment.
      
      * Implement the unittest.
      test=develop
      
      * Change the unittest name of layer_norm.
      test=develop
      3cd985a6
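As a rough picture of what the fused pattern computes, the sketch below chains a fully connected layer, an elementwise add and a layer normalization in plain Python. It assumes the fused result equals running the three steps in sequence. The learnable layer-norm scale and bias are omitted, and all function names are hypothetical.

```python
import math

def fc(x, w, b):
    """x: [batch][in], w: [in][out], b: [out]."""
    return [[sum(xi * w[i][j] for i, xi in enumerate(row)) + b[j]
             for j in range(len(b))]
            for row in x]

def layer_norm(rows, eps=1e-5):
    """Normalize each row to zero mean and unit variance."""
    out = []
    for row in rows:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def fc_add_layernorm(x, w, b, y, eps=1e-5):
    """Reference for the unfused sequence: layer_norm(fc(x) + y)."""
    z = fc(x, w, b)
    added = [[a + c for a, c in zip(r1, r2)] for r1, r2 in zip(z, y)]
    return layer_norm(added, eps)

out = fc_add_layernorm([[1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [[1.0, 2.0]])
```

A fusion pass is useful here because the intermediate results of the first two steps never need to be written out as separate tensors.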
    • W
      Strided slice (#19642) · 47af618f
      wangchaochaohu committed
      * strided_slice op basic function test=develop
      
      * test=develop rewrite and fix
      
      * fix bug test=develop
      
      * fix for the PADDLE_ENFORCE usage
      
      * add some unit tests
      
      * fix for the API test and copyright and fix test=develop
      
      * fix API.spec test=develop
      
      * fix API.spec test=develop
      
      * add axis parameter test=develop
      
      * fix for the build error test=develop
      
      * fix python api  test=develop
      
      * fix the build test=develop
      
      * fix build test=develop
      
      * fix API spec test=develop
      
      * test=develop add some comment and single op test
      
      * fix API spec test=develop
      
      * fix test=develop
      
      * fix test=develop
      
      * fix api test=develop
      
      * fix api test=develop
      
      * fix API.spec test=develop
      
      * fix typo test=develop
      
      * fix API.spec test=develop
      
      * fix API typo test=develop
      
      * fix doc and API.spec test=develop
      47af618f
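For readers unfamiliar with strided slicing over selected axes, here is a small pure-Python sketch. It borrows Python's own `slice` semantics as a stand-in, so the op's handling of edge cases may differ. The function name and the parameter names `axes`, `starts`, `ends` and `strides` are assumptions for illustration.

```python
def strided_slice(x, axes, starts, ends, strides):
    """Apply start:end:stride on the listed axes of a nested list; other axes are kept whole."""
    slices = {a: slice(s, e, st)
              for a, s, e, st in zip(axes, starts, ends, strides)}

    def walk(node, depth):
        if not isinstance(node, list):
            return node  # reached a scalar
        picked = node[slices.get(depth, slice(None))]
        return [walk(child, depth + 1) for child in picked]

    return walk(x, 0)

x = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
# Every second row, and columns 3 down to 1 in reverse order.
result = strided_slice(x, axes=[0, 1], starts=[0, 3], ends=[3, 0], strides=[2, -1])
```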
  6. 18 Sep, 2019 4 commits
  7. 17 Sep, 2019 9 commits
    • A
      Add MKLDNNhandlerT templatized class (#19801) · dfdd73cb
      Adam committed
      test=develop
      dfdd73cb
    • Z
      cabb9501
    • C
      add deformable conv v1 op and cpu version of deformable conv v2 (#18500) · 00efd1d8
      chengjuntao committed
      * add deformable conv v1 op, test=develop
      00efd1d8
    • L
      fix pow op, support tensor for argument factor. (#19313) · 677e7144
      liym27 committed
      improve pow op according to reviews:
      1. Delete unnecessary judgement statements in PowGradOpDescMaker;
      2. Improve test of test_api;
      
      overload GetKernelTypeForVar
      
      add stop_gradient=True when attr(factor) is tensor Variable, change examples in API pow.
      test=develop,test=document_preview
      677e7144
    • L
      add tensor support for argument shape in reshape op; (#19268) · bd89a273
      liym27 committed
      support parameter inference when the argument shape is a list containing integers and tensor variables;
      test=develop
      
      fix reshape op according to reviews:
      1. improve error message;
      2. improve test of test_api.
      test=develop,test=document_preview
      
      fix reshape op: Add error message in nn.py, test=develop
      
      add stop_gradient=True when attr(shape) is tensor Variable.
      change examples in API reshape.
      test=develop,test=document_preview
      bd89a273
    • L
      add tensor(tensor and tensor in list) support for argument starts and ends in slice op; (#19208) · 88628016
      liym27 committed
      support parameter inference when the argument starts or ends is a list containing integers and tensor variables;
      test=develop,test=document_preview
      
      improve slice op according to review(from hongyu). test=develop
      
      fix slice op according to review: infer_flags, test=develop
      
      fix slice op: improve the overloaded operator __getitem__ to support attrs (starts and ends) that are Variables.
      test=develop,test=document_preview
      
      fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop
      
      add stop_gradient=True when attr(starts) or attr(ends) is tensor Variable.
      test=develop,test=document_preview
      88628016
    • L
      fix expand op: (#19302) · e9e3c087
      liym27 committed
      1. add tensor support for argument expand_times in expand op;
      2. support parameter inference when the argument expand_times is a list containing integers and tensor variables;
      
      improve expand op according to reviews:
      1. add doc of ExpandTimes in expand_op.cc;
      2. improve the test of test_api.
      
      add stop_gradient=True when attr(expand_times) is tensor Variable, change code examples.
      test=develop,test=document_preview
      e9e3c087
    • L
      cpu Conv double grad (#19672) · b76343c3
      lvmengsi committed
      * cpu conv_grad_grad
      b76343c3
    • Implement FusedEmbeddingSeqPoolGradKernel with cblas_saxpy (#19770) · 93c85c93
      翟飞跃 committed
      * Implement the operator with sparse matrix multiply
      
      * Update the URL of mklml library.
      
      test=develop
      
      * Disable the MKLML implementation on non-Linux platforms.
      
      test=develop
      
      * optimize bp with mkl sparse matrix
      test=develop
      
      * tmp add fused_emb_seq layer
      
      * Add the support of padding_idx attribute.
      
      test=develop
      
      * add padding_idx support
      test=develop
      
      * implement grad refer lego
      test=develop
      93c85c93
  8. 16 Sep, 2019 4 commits
    • Y
      Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Remove the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
      
      * Enhance fc_fuse_pass to enable fusing relu.
      
      * Allow printing the shapes of var_desc in graph.
      test=develop
      
      * Enhance fc_fuse_pass_tester.
      
      * Remove the use of PADDLE_ENFORCE.
      test=develop
      
      * Correct the number of ops after fusing.
      test=develop
      
      * Fix a typo.
      test=develop
      
      * Set activation_type to null when there is no relu in fc.
      test=develop
      
      * Refine fc_fuse_pass's codes.
      
      * Enable the set of shape for tensor.
      
      * Refine repeated_fc_relu_pass and add unittest.
      test=develop
      c67c8758
    • Z
      add kernel for squeeze_op, test=develop (#19656) · 52673956
      zhongpu committed
      * add kernel for squeeze_op, test=develop
      
      * delete comment, test=develop
      52673956
    • Z
      add kernel for unstack_op, test=develop (#19538) · 2a81c367
      zhongpu committed
      * add kernel for unstack_op, test=develop
      
      * add kernel for unstack_op, test=develop
      
      * add kernel for unstack_op, test=develop
      
      * adjust the code format, test=develop
      
      * modify some comment, test=develop
      2a81c367
    • K
      fix softmax axis!=-1. test=develop (#19800) · 99c78b77
      Kaipeng Deng committed
      99c78b77
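The commit title concerns softmax along an axis other than the last one. A common way to handle that, shown here as a plain-Python sketch for 2-D input, is to move the target axis to the last position, apply the usual last-axis softmax, and move it back. This illustrates the general technique and is not the kernel changed by this commit. All names are hypothetical.

```python
import math

def softmax_last(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def softmax_2d(x, axis=-1):
    """Softmax over a 2-D nested list along axis 0 or 1 (negative axes allowed)."""
    if axis in (1, -1):
        return [softmax_last(row) for row in x]
    if axis in (0, -2):
        # Bring axis 0 to the last position, normalize, then restore the layout.
        return transpose([softmax_last(row) for row in transpose(x)])
    raise ValueError("axis must be one of -2, -1, 0, 1 for 2-D input")

y = softmax_2d([[1.0, 2.0], [3.0, 4.0]], axis=0)
```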
  9. 14 Sep, 2019 1 commit
  10. 12 Sep, 2019 2 commits
  11. 11 Sep, 2019 5 commits
    • H
      Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Huihuang Zheng committed
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, we can remove it to improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CPU compilation in CI.
      12542320
    • Z
      Make leaky relu inplacable (#19676) · 0daa5c97
      Zeng Jinle committed
      * make leaky relu inplacable, test=develop
      
      * force add unittests to pass coverage, test=develop
      0daa5c97
    • Z
      refine math_op_patch, test=develop (#19727) · 078a6782
      Zeng Jinle committed
      078a6782
    • J
      - Softmax mkl-dnn refactoring (#19615) · 47f670d5
      Jacek Czaja committed
      test=develop
      
      - Cosmetic fixes
      
      test=develop
      47f670d5
    • Y
      Implement the GPU kernel of fc operator (#19687) · a65c728e
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Remove the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
      a65c728e