1. 11 Sep, 2019 1 commit
    • Y
      Implement the GPU kernel of fc operator (#19687) · a65c728e
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Move the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
  2. 05 Sep, 2019 3 commits
  3. 04 Sep, 2019 1 commit
  4. 03 Sep, 2019 2 commits
  5. 02 Sep, 2019 1 commit
  6. 29 Aug, 2019 1 commit
  7. 20 Aug, 2019 1 commit
  8. 19 Aug, 2019 1 commit
  9. 01 Aug, 2019 1 commit
  10. 24 Jul, 2019 1 commit
    • B
      Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60
      Bob Zhu committed
      * extend matmul op to support multiple head multiplication
      
      With multiple-head support, the multiplication of two big matrices is
      split into multiplications of several (head_number) small matrices. E.g. if
      Mat A is [3, 24] and Mat B is [24, 4], multiplying A and B with head_number
      set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of
      [6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].
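      The split described in the commit message above can be sketched in plain
      Python. This is only an illustration of the shape arithmetic, not the
      operator's actual kernel: the function names `matmul` and
      `multihead_matmul` are hypothetical, matrices are nested lists, and the
      sketch assumes the per-head results are concatenated along the column
      axis, which is what the [3, 16] result in the message implies.

      ```python
      def matmul(a, b):
          """Plain matrix product of two row-major nested lists."""
          return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                  for row in a]


      def multihead_matmul(a, b, head_number):
          """Hypothetical helper: split the shared dimension K into head_number
          chunks, multiply each [M, K/h] slice of A by the matching [K/h, N]
          slice of B, and concatenate the [M, N] results column-wise."""
          m, k = len(a), len(a[0])
          if len(b) != k:
              raise ValueError("inner dimensions of A and B must match")
          if k % head_number != 0:
              raise ValueError("K must be divisible by head_number")
          sub_k = k // head_number
          out = [[] for _ in range(m)]
          for h in range(head_number):
              lo, hi = h * sub_k, (h + 1) * sub_k
              a_h = [row[lo:hi] for row in a]  # column slice of A: [M, K/h]
              b_h = b[lo:hi]                   # row slice of B:    [K/h, N]
              c_h = matmul(a_h, b_h)           # per-head product:  [M, N]
              for i in range(m):
                  out[i].extend(c_h[i])        # append this head's columns
          return out


      # Shapes from the commit message: A is [3, 24], B is [24, 4], 4 heads.
      a = [[1] * 24 for _ in range(3)]
      b = [[1] * 4 for _ in range(24)]
      c = multihead_matmul(a, b, 4)
      print(len(c), len(c[0]))  # 3 16
      ```

      With head_number set to 1 the helper reduces to an ordinary matrix
      product, so the result is [3, 4] rather than [3, 16].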
  11. 28 Jun, 2019 1 commit
  12. 25 Jun, 2019 1 commit
    • H
      Sequence mask support tensor (#18249) · df2eee71
      Hongyu Liu committed
      * sequence mask support max length tensor input; test=develop
      
      * add rnn_impl.py; test=develop
      
      * add basic gru lstm unittest; test=develop
      
      * fix api spec; test=develop
      
      * fix sequence_mask op bug;
      test=develop
      test=document_preview
      
      * change +-*x to elementwise_op; test=develop
      
      * add mkl flag; test=develop
      
      * fix rnn impl bug; test=develop
      
      * update api spec; test=develop
      
      * fix doc bug; test=develop
      
      * fix lstm bugs; test=develop
  13. 14 Jun, 2019 1 commit
  14. 12 Jun, 2019 1 commit
  15. 10 Jun, 2019 1 commit
    • Y
      Enable seq_pool op to accept len 0 input (#17284) · 33d1e565
      Yibing Liu committed
      * Enable seq_pool op to accept len 0 input
      
      test=develop
      
      * Update sequence_pool's api
      
      test=develop
      
      * Add more unittest cases for seq_pool op
      
      test=develop
      
      * Remove legacy comments
      
      test=develop
      
      * Don't use template in op maker
      
      test=develop
  16. 30 May, 2019 1 commit
  17. 29 May, 2019 1 commit
    • Y
      Optimize the concat and split kernel for special cases when the number of... · 5782ddda
      Yiqun Liu committed
      Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415)
      
      * Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2.
      test=develop
      
      * Refine codes.
      test=develop
      
      * Correct the condition.
      test=develop
      
      * Move the define of tmp_data outside the if statement.
      
      * Print the cudnn minor version.
      test=develop
      
      * Fix the case when in_num/o_num is 1 in concat/split op.
      test=develop
      
      * Remove const_cast.
      test=develop
  18. 24 May, 2019 1 commit
    • T
      [CPU] refine cpu softmax bwd (#17534) · 7ae461eb
      tensor-tang committed
      * refine softmax fwd
      
      test=develop
      
      * refine cpu softmax bwd
      
      test=develop
      
      * fix batch size
      
      test=develop
      
      * fix compile issue with gpu
      
      test=develop
      
      * add value clip
  19. 23 May, 2019 1 commit
  20. 21 May, 2019 1 commit
  21. 16 May, 2019 1 commit
  22. 15 May, 2019 1 commit
  23. 10 May, 2019 1 commit
  24. 07 May, 2019 1 commit
    • K
      Softmax_cross_entropy op add axis (#16806) · a71d8fdb
      Kaipeng Deng committed
      * add attr axis infershape. test=develop
      
      * add CUDA kernel. test=develop
      
      * fix unittest. test=develop
      
      * fix unittest for soft_label. test=develop
      
      * fix fp16 unittest. test=develop
      
      * remove comment code. test=develop
      
      * refine test for axis. test=develop
      
      * add python api. test=develop
      
      * fix doc. test=develop
      
      * fix fp16 unittest. test=develop
      
      * fix ngraph test. test=develop
      
      * fix ENFORCE for test_imperative_transformer. test=develop
      
      * fit for ngraph test. test=develop
      
      * fix after rebase develop. test=develop
      
      * fix doc. test=develop
      
      * fix API.spec. test=develop
      
      * fix test_layers. test=develop
      
      * fix format. test=develop
  25. 20 Apr, 2019 1 commit
  26. 17 Apr, 2019 1 commit
    • K
      fix overflow by int32 mul test=develop (#16794) · c474e7dd
      Kevin committed
      * fix overflow by int32 mul test=develop
      
      * fix reference nullptr
      
      * fix codestyle test=develop
      
      * modify to point in ContextProjectFunctor test=develop
      
      * modify to point in ContextProjectFunctor test=develop
      
      * modify . to -> test=develop
  27. 12 Apr, 2019 3 commits
  28. 25 Mar, 2019 1 commit
  29. 20 Mar, 2019 2 commits
  30. 18 Mar, 2019 2 commits
  31. 14 Mar, 2019 2 commits
  32. 12 Mar, 2019 1 commit