1. 26 Feb, 2019 6 commits
    • J
      - MKL-DNN pooling updated to set_prim_desc · c63f6b20
      Jacek Czaja committed
      - MKLDNN ops revisited
      
      - disabled softmax modifications
      
      - disabled elementwise_add
      
      - reverted LRN modifications
      
      - reverted SUM primitive
      
      - Partial reviving of softmax
      
      - Enable softmax
      
      - Softmax changes
      
      - LRN is back
      
      - LRN partially disabled
      
      - LRN is back
      
      - LRN fix
      
      - compilation fixes
      
      - Sum fixed (hopefully)
      
      - Enabling (partially) elementwise_add
      
      - Fixes to elementwise_add
      
      - Lint fixes
      
      quantize fix
      
      - compilation fix
      
      test=develop
      
      Disabling pooling
      
      - Disabled quantize op
      
      test=develop
      c63f6b20
    • Q
      8e439ccf
    • Q
      Loosely check in the InferShape of cross_entropy_op. (#15863) · f4846bf3
      qingqing01 committed
      * Loosely check in cross_entropy_op when soft_label is True
      * Add a runtime assertion in the backward infer_shape check.
      * Skip the InferShape check when the input dimensions are unknown
      f4846bf3
    • Y
      Optimize the CUDA implementation of the sequence_expand op by reducing the number of... · f4634d76
      Yiqun Liu committed
      Optimize the CUDA implementation of the sequence_expand op by reducing the number of times LoD data is copied from CPU to GPU. (#15493)
      
      * Optimize the CUDA implementation of the sequence_expand op by reducing the number of times LoD data is copied from CPU to GPU.
      test=develop
      
      * Refine the op benchmark to support setting lod in config.
      test=develop
      f4634d76
    • G
      This PR improves the performance of the prior_box op by about 1.25x on CPU. (#15909) · 630c1e83
      guomingz committed
      * This PR improves the performance of the prior_box op by about 1.25x on CPU.
      
      * Test env: SKX 8180 with fake data on 28 threads (bs=1).
      * The table below shows the ~25% improvement, as measured by [eval_tp_fake_data.py](https://github.com/PaddlePaddle/Paddle/issues/15618#issuecomment-464613976).
      
      | Type             | Event              | Calls | Total   | Min.     | Max.     | Ave.         | Ratio    |
      | ---------------- | ------------------ | ----- | ------- | -------- | -------- | ------------ | -------- |
      | w/ optimization  | thread0::prior_box | 6000  | 921.201 | 0.110572 | 0.383402 | **0.153533** | 0.084585 |
      | w/o optimization | thread0::prior_box | 6000  | 1151.85 | 0.102276 | 0.426702 | **0.191976** | 0.103337 |
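
As a quick sanity check on the figures in the table, the following minimal Python sketch recomputes the speedup from the total and per-call average timings. The variable names are illustrative and not taken from the PR; the numbers are copied from the table, whose time unit is not stated in the commit message.

```python
# Timings copied from the profiler table in the commit message.
# The time unit is not given there, so only ratios are computed.
total_with_opt = 921.201      # Total, w/ optimization
total_without_opt = 1151.85   # Total, w/o optimization
avg_with_opt = 0.153533       # Average per call, w/ optimization
avg_without_opt = 0.191976    # Average per call, w/o optimization

# Speedup = old time / new time.
speedup_total = total_without_opt / total_with_opt
speedup_avg = avg_without_opt / avg_with_opt

# Fraction of the original time that was saved.
time_saved = 1.0 - total_with_opt / total_without_opt

print(f"speedup (total):   {speedup_total:.3f}x")
print(f"speedup (average): {speedup_avg:.3f}x")
print(f"time saved:        {time_saved:.1%}")
```

Both ratios come out at roughly 1.25, which matches the "about 1.25x" in the commit title. Note that a 1.25x speedup corresponds to about 20% less time per call, so the "~25%" reads most naturally as a throughput gain.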
      
      test=develop
      
      * Fix the style issue.
      
      test=develop
      630c1e83
    • C
      Add alloc_continuous_space_op (#15900) · 7ca8553d
      chengduo committed
      * add alloc_continuous_space_op
      test=develop
      
      * Polish code
      test=develop
      
      * follow comment
      test=develop
      7ca8553d
  2. 25 Feb, 2019 12 commits
  3. 24 Feb, 2019 3 commits
  4. 23 Feb, 2019 2 commits
  5. 22 Feb, 2019 17 commits