1. 07 Feb, 2020 1 commit
  2. 06 Feb, 2020 1 commit
  3. 05 Feb, 2020 1 commit
  4. 04 Feb, 2020 2 commits
  5. 02 Feb, 2020 1 commit
  6. 31 Jan, 2020 1 commit
    • [DNNL] Fix accuracy in INT8 FC (#22404) · 269db0d1
      Michał Gallus committed
      * Enable quantize to reorder to nchw as well
      
      * Correct FC MKL-DNN input dim requirements to accept 3D
      
      * Improve DNNL FC format, error and 3D input handling
      
      test=develop
      
      * Improve error checking in FC
      
      test=develop
      
      * Improve PADDLE_ENFORCE messages in fc-related files
      
      * Remove data layout attribute from obligatory pass args
      
      test=develop
      
      * Fix message in fc_mkldnn_pass to be logically correct
      
      test=develop
  7. 25 Jan, 2020 1 commit
  8. 19 Jan, 2020 1 commit
  9. 17 Jan, 2020 2 commits
    • Implement a common python unittest to test the ir passes. (#22209) · b7cac50b
      Yiqun Liu committed
      * Implement a common python unittest to test the ir passes.
      test=develop
      
      * Save the results in np.array and support to startup on CPU.
      test=develop
      
      * Fix the unittest.
      test=develop
      
      * Add check_program to check whether the optimized program is different from the origin one.
      test=develop
      
      * Remove the interface all_ops.
      test=develop
      
      * Add exception test in pass_test.
      test=develop
    • integrated HALF_ASYNC to communicator (#21869) · 82bc814a
      tangwei12 committed
      * add half_async in the communicator
      * fix DistributedStrategy
  10. 16 Jan, 2020 2 commits
  11. 15 Jan, 2020 1 commit
  12. 14 Jan, 2020 4 commits
  13. 13 Jan, 2020 1 commit
  14. 10 Jan, 2020 2 commits
    • fix the bug of profile update (#22207) · 621d3e0b
      wangchaochaohu committed
      * fix the bug of profile update test=develop
    • Add bn and relu fuse pass (#22048) · 46189b16
      Zhen Wang committed
      * add bn and relu fuse pass
      
      * add op attr assert and dtype assert
      
      * fix some inputs&&outputs bugs for the fused op and pattern.
      
      * add the unittest for fuse_bn_act_pass. test=develop
      
      * use normative enforce statements. test=develop
      
      * add the cpu test. test=develop
      
      * add the support of batch_size=1 for the bn with relu op. test=develop
      
      * add the error type for paddle throws. test=develop
      
      * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
  15. 09 Jan, 2020 4 commits
  16. 07 Jan, 2020 3 commits
  17. 06 Jan, 2020 3 commits
  18. 05 Jan, 2020 1 commit
  19. 03 Jan, 2020 2 commits
    • Add the first implementation of fusion_group op (#19621) · d4832077
      Yiqun Liu committed
      * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
      test=develop
      
      * Call CUDA driver api to launch the kernel compiled by nvrtc.
      test=develop
      
      * Disable for mac and windows.
      test=develop
      
      * Refine the codes to support manually specified num_threads and workload_per_thread.
      test=develop
      
      * Refine the CUDA kernel to support large dims.
      test=develop
      
      * Add DeviceCodePool to manage all device codes.
      
      * Add the first implementation of fusion_group op.
      
      * Add unit-test for fusion_group op.
      
      * Add the check of result.
      
      * Add the check of nvrtc in unit-test.
      test=develop
      
      * Add comment to explain the inputs, outputs and features of fusion_group op.
      test=develop
      
      * Disable fusion_group op for mac and windows.
      test=develop
      
      * Make the compiling of device code return status instead of hanging up.
      test=develop
      
      * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
      
      * Unify fusion_group_op's input and output names.
      test=develop
      
      * Add the check of CUDA driver library in unittest.
      test=develop
      
      * Refine the calling of PADDLE_ENFORCE.
      test=develop
    • [DNNL] 3D Fully-Connected (#21746) · 61921084
      Michał Gallus committed
  20. 29 Dec, 2019 1 commit
    • Fix multi-threads memory out of bounds error for passes (#21920) · 196e20df
      liu zhengxi committed
      * fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
      
      * fix attention_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix fc_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
  21. 27 Dec, 2019 1 commit
  22. 25 Dec, 2019 2 commits
  23. 24 Dec, 2019 1 commit
    • Optimize adam speed (#21777) · 51a86d2b
      Aurelius84 committed
      * optimize adam speed by removing _finish_update test=develop
      
      * fix SparseAdamFunctor param list test=develop
      
      * Remove scale_op in expect_list of adam_op test=develop
      
      * fix test optimizer loss assert error test=develop
      
      * fix test optimizer loss assert error test=develop
      
      * modify PADDLE_ENFORCE usage test=develop
      
      * fix op_type in lamb_op.cc test=develop
      
      * fix errors ostream format bug test=develop
      
      * add betaPowOut in ngraph op test=develop
      
      * fix ngraph::op api for gcc8 test=develop
      
      * clean code test=develop
      
      * modify struct into class test=develop
      
      * remove code of beta1Tensor in lamb_op test=develop
  24. 20 Dec, 2019 1 commit
    • add table id in cache shuffle (#21585) · c3cf42d0
      Thunderbrook committed
      * general table
      
      * add sparse table
      test=develop
      
      * no cvm
      test=develop
      
      * add no_cvm
      test=develop
      
      * add note
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * code style
      test=develop
      
      * add key of optimizer
      test=develop
      
      * solve pslib stop core
      test=develop
      
      * barrier
      test=develop
      
      * add notes
      test=develop
      
      * add table id in cache shuffle
      test=develop
      
      * table id
      test=develop
      
      * code style
      test=develop