1. 03 Jan, 2020 3 commits
    • Add the first implementation of fusion_group op (#19621) · d4832077
      Yiqun Liu committed
      * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
      test=develop
      
      * Call CUDA driver api to launch the kernel compiled by nvrtc.
      test=develop
      
      * Disable for mac and windows.
      test=develop
      
      * Refine the codes to support manually specified num_threads and workload_per_thread.
      test=develop
      
      * Refine the CUDA kernel to support large dims.
      test=develop
      
      * Add DeviceCodePool to manage all device codes.
      
      * Add the first implementation of fusion_group op.
      
      * Add unit-test for fusion_group op.
      
      * Add the check of result.
      
      * Add the check of nvrtc in unit-test.
      test=develop
      
      * Add comment to explain the inputs, outputs and features of fusion_group op.
      test=develop
      
      * Disable fusion_group op for mac and windows.
      test=develop
      
      * Make the compiling of device code return status instead of hanging up.
      test=develop
      
      * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
      
      * Unify fusion_group_op's input and output names.
      test=develop
      
      * Add the check of CUDA driver library in unittest.
      test=develop
      
      * Refine the calling of PADDLE_ENFORCE.
      test=develop
    • [DNNL] 3D Fully-Connected (#21746) · 61921084
      Michał Gallus committed
    • fix generate_proposal_labels op (#21793) · aa2ed0dc
      FDInSky committed
      * test=develop fix generate_proposal_labels op
  2. 02 Jan, 2020 2 commits
  3. 31 Dec, 2019 1 commit
  4. 30 Dec, 2019 1 commit
  5. 27 Dec, 2019 3 commits
  6. 26 Dec, 2019 2 commits
  7. 25 Dec, 2019 3 commits
  8. 24 Dec, 2019 3 commits
    • Optimize adam speed (#21777) · 51a86d2b
      Aurelius84 committed
      * optimize adam speed by removing _finish_update test=develop
      
      * fix SparseAdamFunctor param list test=develop
      
      * Remove scale_op in expect_list of adam_op test=develop
      
      * fix test optimizer loss assert error test=develop
      
      * fix test optimizer loss assert error test=develop
      
      * modify PADDLE_ENFORCE usage test=develop
      
      * fix op_type in lamb_op.cc test=develop
      
      * fix errors ostream format bug test=develop
      
      * add betaPowOut in ngraph op test=develop
      
      * fix ngraph::op api for gcc8 test=develop
      
      * clean code test=develop
      
      * modify struct into class test=develop
      
      * remove code of beta1Tensor in lamb_op test=develop
    • Update iou_similarity op to support non-normalized bbox (#21671) · 6b9fbcf3
      FDInSky committed
      Update iou_similarity op to support non-normalized bbox
    • Modify the while_loop API (#21844) · 46f9184a
      guofei committed
  9. 23 Dec, 2019 2 commits
  10. 20 Dec, 2019 1 commit
  11. 19 Dec, 2019 4 commits
  12. 17 Dec, 2019 1 commit
  13. 16 Dec, 2019 3 commits
  14. 15 Dec, 2019 1 commit
  15. 12 Dec, 2019 2 commits
    • Add reshape int8 mkldnn op (#21428) · d419b859
      joanna.wozna.intel committed
      * Add reshape int8 op
      
      test=develop
      
      * Change test to CPUPlace
      
      test=develop
      
      * Correct tests
      
      test=develop
    • memory leak for cpu (#21174) · 9ad940fd
      tangwei12 committed
      * add fake init for the trainer, fix large memory hold in the trainer
      * do not merge recv vars from a remote endpoint, test=develop
      * add recv and save op, merge slice var in one op, save memory
      * remove hsigmoid with pull sparse, test=develop
  16. 11 Dec, 2019 1 commit
  17. 10 Dec, 2019 5 commits
    • refine some grad op makers, test=develop (#21629) · 29f64c8c
      Zeng Jinle committed
    • Dropout with seed (#21590) · e2d849b9
      mapingshuo committed
      * add seed op
    • MKL-DNN 1.0 Update (#20162) · e81f0228
      Adam committed
      * MKLDNN v1.0 rebase to Paddle 1.6
      test=develop
      
      * Add hacky paddle::string::to_string() implementation
      
      * vectorize<int64_t>() -> vectorize() cleanup
      test=develop
      
      * PADDLE_ENFORCE and void_cast fixes
      test=develop
      
      * Rebase changes
      test=develop
      
      * Cosmetics
      test=develop
      
      * Delete MKL from mkldnn.cmake
      test=develop
      
      * CMake debug commands
      test=develop
      
      * Delete MKLDNN_VERBOSE and rebase fixes
      test=develop
      
      * Rebase fixes
      test=develop
      
      * Temporarily disable int8 resnet101 vgg16 and vgg19 tests
      test=develop
      
      * Add libmkldnn.so.1 to python setup
      test=develop
      
      * Add libmkldnn.so.1 to inference_lib cmake after rebase
      test=develop
      
      * Post rebase fixes + FC int8 changes
      test=develop
      
      * Fix LRN NHWC
      test=develop
      
      * Fix NHWC conv3d
      test=develop
      
      * Windows build fix + next conv3d fix
      test=develop
      
      * Fix conv2d on AVX2 machines
      test=develop
    • Mean gpu optimize (#21643) · 95b95a28
      wangchaochaohu committed
      * accelerate mean op test=develop
  18. 06 Dec, 2019 2 commits