1. 04 Mar, 2019 4 commits
  2. 27 Feb, 2019 1 commit
  3. 26 Feb, 2019 3 commits
    • Optimize the CUDA implementation of sequence_expand op by reducing the times of... · f4634d76
      Yiqun Liu committed
      Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU. (#15493)

      * Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU.
      test=develop
      
      * Refine the op benchmark to support setting lod in config.
      test=develop
    • This PR improves the performance of the prior_box op, making it about 1.25x faster on CPU. (#15909) · 630c1e83
      guomingz committed
      * This PR improves the performance of the prior_box op, making it about 1.25x faster on CPU.

      * Test env: SKX 8180 with fake data on 28 threads (bs=1).
      * The table below shows the ~25% improvement; the numbers were generated by [eval_tp_fake_data.py](https://github.com/PaddlePaddle/Paddle/issues/15618#issuecomment-464613976).

      | Type | Event | Calls | Total | Min. | Max. | Ave. | Ratio |
      | ---------------- | ------------------ | ---- | ------- | -------- | -------- | ------------ | -------- |
      | w/ optimization  | thread0::prior_box | 6000 | 921.201 | 0.110572 | 0.383402 | **0.153533** | 0.084585 |
      | w/o optimization | thread0::prior_box | 6000 | 1151.85 | 0.102276 | 0.426702 | **0.191976** | 0.103337 |
      
      test=develop
      
      * Fix the style issue.
      
      test=develop
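The "1.25x" and "~25%" figures in the prior_box commit can be checked from the Ave. column of its table. The sketch below is only an arithmetic check on the two averages copied from that table; the helper names `speedup` and `time_reduction` are illustrative and are not part of Paddle or of the benchmark script.

```python
# Average per-call times copied from the prior_box table (Ave. column).
# Units are whatever the profiler reported; only the ratio matters here.
AVE_WITH_OPT = 0.153533
AVE_WITHOUT_OPT = 0.191976


def speedup(baseline: float, optimized: float) -> float:
    """How many times faster the optimized run is than the baseline."""
    return baseline / optimized


def time_reduction(baseline: float, optimized: float) -> float:
    """Fraction of the baseline time saved by the optimized run."""
    return 1.0 - optimized / baseline


if __name__ == "__main__":
    print(f"speedup: {speedup(AVE_WITHOUT_OPT, AVE_WITH_OPT):.2f}x")
    print(f"time reduction: {time_reduction(AVE_WITHOUT_OPT, AVE_WITH_OPT):.1%}")
```

Read this way, the "~25%" in the commit message is the throughput gain (baseline time divided by optimized time); the same numbers correspond to roughly 20% less time per call.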
    • Add alloc_continuous_space_op (#15900) · 7ca8553d
      chengduo committed
      * add alloc_continuous_space_op
      test=develop
      
      * Polish code
      test=develop
      
      * follow comment
      test=develop
  4. 25 Feb, 2019 3 commits
    • Improve code reuse at MKL-DNN sum · 6ebe9877
      Michal Gallus committed
      test=develop
    • [MKL-DNN] MKL-DNN specific Tensor modification (#15429) · dec9cf53
      Jacek Czaja committed
      * - Implemented draft of primitive desc keeping in Tensor
      
      test=develop
      
      - TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented
      
      - Added nchw and nc formats setting for the sake of compatibility
      
      Fixed unit tests
      
      - Workaround for a problem with 5D data in conv
      
      - Added 3D and 1D MKL-DNN formats for name handles for tensor
      
      test=develop
      
      - Fix to UTs
      
      test=develop
      
      - Conv fp32 op was updated
      
      Cosmetic fixes
      
      test=develop
      
      - tensor mkldnn cosmetics
      
      test=develop
      
      - Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils
      
      * - Lint fixes
      
      test=develop
      
      * - Setting prim desc in Tensor also sets layout to kMKLDNN
      
      test=develop
      
      * - Moved creation of prim desc totally out of Tensor
      
      test=develop
      
      * - Cosmetic fixes after review
      
      test=develop
    • polish · 5dd281f7
      Xin Pan committed
      test=develop
  5. 24 Feb, 2019 2 commits
  6. 22 Feb, 2019 11 commits
  7. 21 Feb, 2019 3 commits
    • Add new ut and remove unnecessary code · 1578c60b
      Krzysztof Binias committed
      test=develop
    • add per kernel config and remove const_cast. · 5eb87506
      Xin Pan committed
      test=develop
    • Profiler refine and add CUDA runtime api tracer (#15301) · a83e4704
      Dun committed
      * refine profiler && add runtime tracer
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix bug && test=develop
      
      * add thread id map && test=develop
      
      * test=develop
      
      * testing
      
      * bug fix
      
      * remove cuda event && refine code && test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * fix windows temp file && test=develop
      
      * test=develop
      
      * fix windows bug && test=develop
      
      * fix start up issue && test=develop
      
      * code polish && test=develop
      
      * remove unused code && test=develop
      
      * add some cupti cbid && test=develop
      
      * add FLAGS_multiple_of_cupti_buffer_size && test=develop
      
      * fix compile error && test=develop
      
      * add keyword && test=develop
      
      * fix && test=develop
      
      * code polish && test=develop
  8. 20 Feb, 2019 2 commits
  9. 19 Feb, 2019 6 commits
    • fix warnings (#15790) · e1c707fe
      tensor-tang committed
      * fix warnings
      
      test=develop
      
      * fix enforce test
      
      test=develop
    • update comment · f2262d73
      xuezhong committed
      test=develop
    • refine code · c5360a3f
      xuezhong committed
    • Enable cross_entropy operator for the ngraph engine (#15674) · df23a6f8
      mozga-intel committed
      * Enable cross_entropy operator for the ngraph engine
      test=develop
      
      * Update tests
      test=develop
      
      * Added PADDLE_ENFORCE for the batch_norm operator
      test=develop
      
      * Update the message about which formats are supported right now
      test=develop
    • Correct the doc in Python API (#15725) · 56a5039e
      Yiqun Liu committed
      * Correct the comment in control_flow.py.
      
      * Correct the argument list of ops.
      test=develop
      
      * Update API.spec.
      test=develop
      
      * Skip op_callstack attr for all op apis.
      test=develop
      
      * Remove use_mkldnn and is_test from python api.
      test=develop
      
      * Remove use_mkldnn and is_test from op_proto_maker and hard-code them in Python when generating the doc string.
      test=develop
    • Add ngraph op coverage (#15721) · 72061b0a
      baojun committed
  10. 18 Feb, 2019 5 commits