1. 08 Oct, 2021 1 commit
  2. 07 Oct, 2021 1 commit
    • Merge pull request #20725 from mologie:fix-dnn-tf-on-arm · a3d7811f
      Oliver Kuckertz committed
      * dnn: fix unaligned memory access crash on armv7
      
      The getTensorContent function would return a Mat pointing to some
      member of a Protobuf-encoded message. Protobuf does not make any
      alignment guarantees, which results in a crash on armv7 when loading
      models while bit 2 is set in /proc/cpu/alignment (or the relevant
      kernel feature for alignment compatibility is disabled). Any read
      attempt from the previously unaligned data member would raise SIGBUS.
      
      As a workaround, this commit makes an aligned copy via the existing clone
      functionality in getTensorContent. The unsafe copy=false option is
      removed. Unfortunately, a rather crude hack in PReLUSubgraph in fact
      writes(!) to the Protobuf message. We limit ourselves to fixing the
      alignment issues in this commit, and add getTensorContentRefUnaligned
      to cover the write case with a safe memcpy. A FIXME marks the issue.
      
      * dnn: reduce amount of .clone() calls
      
      * dnn: update FIXME comment
      Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
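The alignment problem described in this commit can be illustrated outside of OpenCV. The following is a minimal sketch, not the code from the pull request: `copyFloatsAligned` and `writeFloatUnaligned` are hypothetical helper names, and a plain byte buffer stands in for the Protobuf-encoded tensor content. It shows the general technique of using `memcpy` to move values out of (or into) storage with no alignment guarantee, rather than dereferencing a `float*` that points into it.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not an OpenCV function): copies `count` floats out of
// a byte buffer that may start at any address. The destination vector is
// properly aligned for float, so later reads from it are safe on
// strict-alignment CPUs.
std::vector<float> copyFloatsAligned(const char* bytes, std::size_t count)
{
    std::vector<float> out(count);
    if (count > 0)
        std::memcpy(out.data(), bytes, count * sizeof(float));
    return out;
}

// Hypothetical helper for the write case: stores one float into a possibly
// unaligned buffer without ever forming a float* into that buffer.
void writeFloatUnaligned(char* bytes, std::size_t index, float value)
{
    std::memcpy(bytes + index * sizeof(float), &value, sizeof(float));
}
```

The copy costs one extra pass over the data, which is presumably why the follow-up commit in this pull request reduces the number of `.clone()` calls.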
  3. 06 Oct, 2021 4 commits
  4. 05 Oct, 2021 7 commits
  5. 04 Oct, 2021 2 commits
  6. 03 Oct, 2021 2 commits
  7. 02 Oct, 2021 6 commits
  8. 01 Oct, 2021 3 commits
  9. 30 Sep, 2021 1 commit
  10. 29 Sep, 2021 3 commits
  11. 28 Sep, 2021 4 commits
  12. 27 Sep, 2021 1 commit
  13. 26 Sep, 2021 2 commits
  14. 25 Sep, 2021 1 commit
  15. 24 Sep, 2021 2 commits
    • Tile: · 91ff45fb
      easonycwang committed
      This submission improves the performance of the inpaint algorithm for 3-channel images (RGB or BGR).
      
      Reason:
      The original implementation did not take cache behavior into account.
      The channel loop sits outside the core loop, so performance suffers.
      Moving the channel loop inside the core loop significantly improves the cache hit rate and therefore performance.
      
      Performance:
      360p input, improvement of roughly 30% or more:
      iPhone 8P: 5.52 ms -> 3.75 ms
      iPhone 6s: 14.04 ms -> 9.15 ms
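The loop reordering described above can be sketched on a toy operation. This is an illustration under stated assumptions, not the inpaint code from the commit: the function names are hypothetical, the image is assumed to be stored interleaved (three bytes per pixel), and a simple "invert masked pixels" operation stands in for the real per-pixel work. Both versions produce the same result. The second one touches the three adjacent bytes of a pixel together and reads the mask only once per pixel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Channel loop outside: the whole image and the whole mask are traversed
// three times, with a stride of 3 bytes between consecutive accesses.
void invertMaskedChannelOuter(std::vector<std::uint8_t>& bgr,
                              const std::vector<std::uint8_t>& mask)
{
    for (int c = 0; c < 3; ++c)
        for (std::size_t i = 0; i < mask.size(); ++i)
            if (mask[i])
                bgr[3 * i + c] = static_cast<std::uint8_t>(255 - bgr[3 * i + c]);
}

// Channel loop inside: one pass over the image, and the three channels of a
// pixel (adjacent in memory) are processed back to back.
void invertMaskedChannelInner(std::vector<std::uint8_t>& bgr,
                              const std::vector<std::uint8_t>& mask)
{
    for (std::size_t i = 0; i < mask.size(); ++i)
    {
        if (!mask[i])
            continue;
        for (int c = 0; c < 3; ++c)
            bgr[3 * i + c] = static_cast<std::uint8_t>(255 - bgr[3 * i + c]);
    }
}
```

How much the second form gains depends on image size, cache sizes, and the real per-pixel work, so the timings quoted in the commit message should not be expected from this toy example.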
    • A