1. 24 Jul, 2019 1 commit
    • Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60
      Committed by Bob Zhu
      * Extend the matmul op to support multi-head multiplication
      
      With multi-head support, the multiplication of two large matrices is
      split into the multiplication of several (head_number) smaller matrices.
      For example, if Mat A is [3, 24] and Mat B is [24, 4], multiplying A and B
      with head_number set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B
      into 4 matrices of [6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].
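The shape arithmetic in the commit message above can be checked with a minimal pure-Python sketch. It is not the op's actual kernel: the helper names `matmul` and `multi_head_matmul` are hypothetical, and the sketch assumes that A is split along its columns, B along its rows, and that the per-head results are concatenated along the column axis.

```python
def matmul(a, b):
    """Plain matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def multi_head_matmul(a, b, head_number):
    """Multiply A [M, K] by B [K, N] head by head (assumed semantics).

    A is cut into head_number column blocks of shape [M, K / head_number],
    B into head_number row blocks of shape [K / head_number, N]; the
    per-head products [M, N] are concatenated into [M, head_number * N].
    """
    k = len(b)
    assert len(a[0]) == k, "inner dimensions must match"
    assert k % head_number == 0, "K must be divisible by head_number"
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # [M, K / head]
        b_h = b[h * step:(h + 1) * step]                   # [K / head, N]
        for out_row, res_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(res_row)
    return out


# Shapes from the commit message: A is [3, 24], B is [24, 4], 4 heads.
a = [[float(i * 24 + j) for j in range(24)] for i in range(3)]
b = [[float(i * 4 + j) for j in range(4)] for i in range(24)]
c = multi_head_matmul(a, b, head_number=4)
print(len(c), len(c[0]))  # 3 16
```

Under these assumptions, summing the four [3, 4] blocks of the result reproduces the ordinary [3, 4] product of A and B, which is a convenient sanity check.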
  2. 28 Jun, 2019 1 commit
  3. 25 Jun, 2019 1 commit
    • Sequence mask support tensor (#18249) · df2eee71
      Committed by Hongyu Liu
      * sequence mask support max length tensor input; test=develop
      
      * add rnn_impl.py; test=develop
      
      * add basic gru lstm unittest; test=develop
      
      * fix api spec; test=develop
      
      * fix sequence_mask op bug;
      test=develop
      test=document_preview
      
      * change +-*x to elementwise_op; test=develop
      
      * add mkl flag; test=develop
      
      * fix rnn impl bug; test=develop
      
      * update api spec; test=develop
      
      * fix doc bug; test=develop
      
      * fix lstm bugs; test=develop
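For readers unfamiliar with the operator, a sequence mask turns a list of sequence lengths into a 0/1 matrix. The sketch below shows the usual semantics in plain Python; the function signature is hypothetical, and the maximum length is passed as an ordinary runtime value to mirror the "max length tensor input" mentioned in the commit, which this sketch does not reproduce literally.

```python
def sequence_mask(lengths, maxlen=None):
    """Return a 0/1 mask where mask[i][j] is 1 when j < lengths[i].

    maxlen may be supplied at run time; when omitted it defaults to the
    longest sequence (0 for an empty batch).
    """
    if maxlen is None:
        maxlen = max(lengths, default=0)
    return [[1 if j < n else 0 for j in range(maxlen)] for n in lengths]


mask = sequence_mask([3, 1, 2], maxlen=4)
for row in mask:
    print(row)
```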
  4. 14 Jun, 2019 1 commit
  5. 12 Jun, 2019 1 commit
  6. 10 Jun, 2019 1 commit
    • Enable seq_pool op to accept len 0 input (#17284) · 33d1e565
      Committed by Yibing Liu
      * Enable seq_pool op to accept len 0 input
      
      test=develop
      
      * Update sequence_pool's api
      
      test=develop
      
      * Add more unittest cases for seq_pool op
      
      test=develop
      
      * Remove legacy comments
      
      test=develop
      
      * Don't use template in op maker
      
      test=develop
  7. 30 May, 2019 1 commit
  8. 29 May, 2019 1 commit
    • Optimize the concat and split kernel for special cases when the number of... · 5782ddda
      Committed by Yiqun Liu
      Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415)
      
      * Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2.
      test=develop
      
      * Refine codes.
      test=develop
      
      * Correct the condition.
      test=develop
      
      * Move the definition of tmp_data outside the if statement.
      
      * Print the cudnn minor version.
      test=develop
      
      * Fix the case when in_num/o_num is 1 in concat/split op.
      test=develop
      
      * Remove const_cast.
      test=develop
  9. 24 May, 2019 1 commit
    • [CPU] refine cpu softmax bwd (#17534) · 7ae461eb
      Committed by tensor-tang
      * refine softmax fwd
      
      test=develop
      
      * refine cpu softmax bwd
      
      test=develop
      
      * fix batch size
      
      test=develop
      
      * fix compile issue with gpu
      
      test=develop
      
      * add value clip
  10. 23 May, 2019 1 commit
  11. 21 May, 2019 1 commit
  12. 16 May, 2019 1 commit
  13. 15 May, 2019 1 commit
  14. 10 May, 2019 1 commit
  15. 07 May, 2019 1 commit
    • Softmax_cross_entropy op add axis (#16806) · a71d8fdb
      Committed by Kaipeng Deng
      * add attr axis infershape. test=develop
      
      * add CUDA kernel. test=develop
      
      * fix unittest. test=develop
      
      * fix unittest for soft_label. test=develop
      
      * fix fp16 unittest. test=develop
      
      * remove comment code. test=develop
      
      * refine test for axis. test=develop
      
      * add python api. test=develop
      
      * fix doc. test=develop
      
      * fix fp16 unittest. test=develop
      
      * fix ngraph test. test=develop
      
      * fix ENFORCE for test_imperative_transformer. test=develop
      
      * fit for ngraph test. test=develop
      
      * fix after rebase develop. test=develop
      
      * fix doc. test=develop
      
      * fix API.spec. test=develop
      
      * fix test_layers. test=develop
      
      * fix format. test=develop
  16. 20 Apr, 2019 1 commit
  17. 17 Apr, 2019 1 commit
    • fix overflow by int32 mul test=develop (#16794) · c474e7dd
      Committed by Kevin
      * fix overflow by int32 mul test=develop
      
      * fix reference nullptr
      
      * fix codestyle test=develop
      
      * modify to point in ContextProjectFunctor test=develop
      
      * modify to point in ContextProjectFunctor test=develop
      
      * modify . to -> test=develop
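The class of bug named in the commit title above (a product computed in 32-bit integers) can be illustrated generically. The commit does not say which operands were involved, so the sizes and helper names below are hypothetical; `ctypes` is used only to emulate fixed-width wraparound, since Python integers do not overflow on their own.

```python
import ctypes


def mul_int32(a, b):
    """Multiply as a signed 32-bit integer would, wrapping on overflow."""
    return ctypes.c_int32(a * b).value


def mul_int64(a, b):
    """Same product held in a signed 64-bit integer."""
    return ctypes.c_int64(a * b).value


# Hypothetical dimensions: 50000 * 50000 exceeds 2**31 - 1.
rows, cols = 50000, 50000
print(mul_int32(rows, cols))  # -1794967296 (wrapped, wrong element count)
print(mul_int64(rows, cols))  # 2500000000
```

Widening one operand to a 64-bit type before multiplying is the usual remedy in C++ code, because the product is then computed at the wider width.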
  18. 12 Apr, 2019 3 commits
  19. 25 Mar, 2019 1 commit
  20. 20 Mar, 2019 2 commits
  21. 18 Mar, 2019 2 commits
  22. 14 Mar, 2019 2 commits
  23. 12 Mar, 2019 1 commit
  24. 08 Mar, 2019 3 commits
  25. 07 Mar, 2019 1 commit
  26. 04 Mar, 2019 3 commits
  27. 28 Feb, 2019 1 commit
  28. 26 Feb, 2019 1 commit
  29. 22 Feb, 2019 2 commits
    • Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de
      Committed by tensor-tang
      * Revert "Optimze Gelu with MKL Erf function (#15770)"
      
      This reverts commit 676995c8.
      
      * test=develop
    • Optimize Gelu with MKL Erf function (#15770) · 676995c8
      Committed by Yihua Xu
      * Optimize for gelu operator
      
      * Set up the low accuracy mode of MKL ERF function.
      
      test=develop
      
      * Only enable MKLML ERF when OS is linux
      
      * Use the special mklml version that includes the vmsErf function to verify the gelu mkl kernel.
      
      test=develop
      
      * Add the CUDA macro to avoid NVCC's compile issue.
      
      test=develop
      
      * Add the TODO comments for mklml library modification.
      
      test=develop
      
      * Clean Code
      
      test=develop
      
      * Add a comment on the macro for the NVCC compiler.
      
      test=develop
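As background for the commit above, the exact GELU activation is defined through the error function, which is why a fast vectorized erf routine matters for its performance. The sketch below is a scalar reference using Python's `math.erf`, not the MKL-based kernel the commit describes; the function name `gelu` is chosen here for illustration.

```python
import math


def gelu(x):
    """Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for value in (-1.0, 0.0, 1.0):
    print(value, gelu(value))
```

A low-accuracy erf mode, as mentioned in the commit body, would trade some precision in this expression for speed.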
  30. 19 Feb, 2019 1 commit