1. 07 Jan 2020, 8 commits
  2. 06 Jan 2020, 8 commits
  3. 05 Jan 2020, 1 commit
  4. 04 Jan 2020, 1 commit
  5. 03 Jan 2020, 5 commits
    • register int/int64_t/float16 in pow/square kernel, test=develop (#22023) · 7f4abaf2
      Committed by SunAhong1993
      * register int/int64_t/float16 in pow/square kernel, test=develop

      * add abs/square/exp type, test=develop
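
      This commit widens the element types the pow/square (and related) kernels accept. In Paddle fluid of this era, dtype coverage is declared by listing one kernel instantiation per element type in the op's registration macro; the sketch below illustrates that pattern for pow on CUDA. The kernel and functor class names are illustrative assumptions, not copied from the patch.

      ```cpp
      // Illustrative sketch (not the exact Paddle source): each extra template
      // instantiation listed here lets the pow op run on one more element type.
      // This fragment would live in an operator .cu file and needs Paddle headers.
      namespace ops = paddle::operators;
      namespace plat = paddle::platform;

      REGISTER_OP_CUDA_KERNEL(
          pow,
          ops::PowKernel<plat::CUDADeviceContext, ops::PowFunctor<float>>,
          ops::PowKernel<plat::CUDADeviceContext, ops::PowFunctor<double>>,
          ops::PowKernel<plat::CUDADeviceContext, ops::PowFunctor<int>>,
          ops::PowKernel<plat::CUDADeviceContext, ops::PowFunctor<int64_t>>,
          ops::PowKernel<plat::CUDADeviceContext, ops::PowFunctor<plat::float16>>);
      ```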
    • register NoNeedBufferVarsInference for max_pool_grad_op, test=develop (#22055) · 3f653c83
      Committed by Leo Chen
      * fix test_conv2d_ngraph for grad diff, test=develop
      
      * register NoNeedBufferVarsInference for max_pool_grad_op, test=develop
      
      * refine error message, test=develop
      
      * fix numpy, test=develop
      
      * disable test conv2d_ngraph_op, test=develop
      Co-authored-by: Zhang Ting <709968123@qq.com>
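
      For context: when max pooling records an index/mask in the forward pass, the gradient can be scattered using that mask alone, so the forward input's buffer is only needed for its shape and can be declared unneeded, letting memory-reuse passes free it early. The sketch below shows roughly what such a registration looks like; the macro name, class names, and the exact grad op are assumptions inferred from the commit title (later Paddle versions renamed the macro to DECLARE_NO_NEED_BUFFER_VARS_INFERER).

      ```cpp
      // Sketch only (names assumed): declare that the buffer of input "X" is
      // not needed by the pooling grad op, then attach the inferer when the
      // grad operator is registered so memory passes can drop "X" early.
      DECLARE_NO_NEED_BUFFER_VARS_INFERENCE(MaxPoolGradNoNeedBufferVarsInference,
                                            "X");

      REGISTER_OPERATOR(max_pool2d_with_index_grad, ops::MaxPoolWithIndexOpGrad,
                        ops::MaxPoolGradNoNeedBufferVarsInference);
      ```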
    • Add the first implementation of fusion_group op (#19621) · d4832077
      Committed by Yiqun Liu
      * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
      test=develop
      
      * Call CUDA driver api to launch the kernel compiled by nvrtc.
      test=develop
      
      * Disable for mac and windows.
      test=develop
      
      * Refine the codes to support manually specified num_threads and workload_per_thread.
      test=develop
      
      * Refine the CUDA kernel to support large dims.
      test=develop
      
      * Add DeviceCodePool to manage all device codes.
      
      * Add the first implementation of fusion_group op.
      
      * Add unit-test for fusion_group op.
      
      * Add the check of result.
      
      * Add the check of nvrtc in unit-test.
      test=develop
      
      * Add comment to explain the inputs, outputs and features of fusion_group op.
      test=develop
      
      * Disable fusion_group op for mac and windows.
      test=develop
      
      * Make the compiling of device code return status instead of hanging up.
      test=develop
      
      * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
      
      * Unify fusion_group_op's input and output names.
      test=develop
      
      * Add the check of CUDA driver library in unittest.
      test=develop
      
      * Refine the calling of PADDLE_ENFORCE.
      test=develop
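
      The first two bullets describe the mechanism the rest of the commit builds on: CUDA C source produced at runtime is compiled with NVRTC, and the resulting PTX is loaded and launched through the CUDA driver API. The following minimal, self-contained sketch shows that generic flow with a toy "scale" kernel (abbreviated error handling, link with -lnvrtc -lcuda); it illustrates the technique only and is not the fusion_group implementation.

      ```cpp
      #include <cuda.h>
      #include <nvrtc.h>
      #include <cstdio>
      #include <string>
      #include <vector>

      int main() {
        // CUDA C source assembled at runtime (fusion_group would generate this).
        const char* src = R"(
          extern "C" __global__ void scale(float* x, float a, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) x[i] *= a;
          })";

        // 1. Compile to PTX with NVRTC; a failure returns a status and a log
        //    instead of aborting the process.
        nvrtcProgram prog;
        nvrtcCreateProgram(&prog, src, "scale.cu", 0, nullptr, nullptr);
        if (nvrtcCompileProgram(prog, 0, nullptr) != NVRTC_SUCCESS) {
          size_t log_size = 0;
          nvrtcGetProgramLogSize(prog, &log_size);
          std::string log(log_size, '\0');
          nvrtcGetProgramLog(prog, &log[0]);
          std::fprintf(stderr, "nvrtc compile failed:\n%s\n", log.c_str());
          return 1;
        }
        size_t ptx_size = 0;
        nvrtcGetPTXSize(prog, &ptx_size);
        std::vector<char> ptx(ptx_size);
        nvrtcGetPTX(prog, ptx.data());
        nvrtcDestroyProgram(&prog);

        // 2. Load the PTX and launch the kernel via the CUDA driver API.
        cuInit(0);
        CUdevice dev;  cuDeviceGet(&dev, 0);
        CUcontext ctx; cuCtxCreate(&ctx, 0, dev);
        CUmodule mod;  cuModuleLoadData(&mod, ptx.data());
        CUfunction fn; cuModuleGetFunction(&fn, mod, "scale");

        int n = 1024;
        float a = 2.0f;
        CUdeviceptr d_x;
        cuMemAlloc(&d_x, n * sizeof(float));
        void* args[] = {&d_x, &a, &n};
        cuLaunchKernel(fn, (n + 255) / 256, 1, 1, 256, 1, 1, 0, nullptr, args,
                       nullptr);
        cuCtxSynchronize();

        cuMemFree(d_x);
        cuModuleUnload(mod);
        cuCtxDestroy(ctx);
        return 0;
      }
      ```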
    • [DNNL] 3D Fully-Connected (#21746) · 61921084
      Committed by Michał Gallus
    • fix generate_proposal_labesl op (#21793) · aa2ed0dc
      Committed by FDInSky
      * test=develop fix generate_proposal_labesl op
  6. 02 Jan 2020, 2 commits
  7. 01 Jan 2020, 1 commit
  8. 31 Dec 2019, 1 commit
  9. 30 Dec 2019, 4 commits
  10. 29 Dec 2019, 1 commit
    • Fix multi-threads memory out of bounds error for passes (#21920) · 196e20df
      Committed by liu zhengxi
      * fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
      
      * fix attention_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix fc_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
  11. 27 Dec 2019, 5 commits
  12. 26 Dec 2019, 3 commits