1. 28 Jul 2018, 1 commit
    • ARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size · eb277739
      Authored by Eugeniy Paltsev
      As of today we don't set up SMP_CACHE_BYTES and cache_line_size for
      ARC, so they default to L1_CACHE_BYTES. The L1 line length
      (L1_CACHE_BYTES) can easily be smaller than the L2 line (which is
      usually the case, BTW). This breaks code.
      
      For example, it breaks the ethernet infrastructure on HSDK/AXS103
      boards with IOC disabled, where cache flushes are done manually.
      Functions which allocate and manage the sk_buff packet data area rely
      on the SMP_CACHE_BYTES define. As a result, the last L2 cache line of
      the sk_buff linear packet data area can be shared between the DMA
      buffer and useful data in another structure, and that data is lost
      when we invalidate the DMA buffer.
      
         sk_buff linear packet data area
                      |
                      |
                      |         skb->end        skb->tail
                      V            |                |
                                   V                V
      ----------------------------------------------.
            packet data            | <tail padding> |  <useful data in other struct>
      ----------------------------------------------.
      
      ---------------------.--------------------------------------------------.
           SLC line        |             SLC (L2 cache) line (128B)           |
      ---------------------.--------------------------------------------------.
              ^                                     ^
              |                                     |
           These cache lines will be invalidated when we invalidate skb
           linear packet data area before DMA transaction starting.
      
      This leads to issues painful to debug, as they reproduce only if
      (sk_buff->end - sk_buff->tail) < SLC_LINE_SIZE and there is some
      useful data right after sk_buff->end.
      
      Fix that by hardcoding SMP_CACHE_BYTES to the maximum line length we
      may have.
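      The alignment arithmetic behind the bug can be modeled in plain C. Line sizes of 32B (L1) and 128B (SLC) and the requested buffer size are illustrative, and `ALIGN_UP` only mirrors the idea of the kernel's `SKB_DATA_ALIGN`, not its exact definition:

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* Illustrative line sizes: L1 = 32B, SLC (L2) = 128B */
      #define L1_CACHE_BYTES  32
      #define SLC_LINE_SIZE   128

      /* Round x up to a multiple of a (a must be a power of 2) */
      #define ALIGN_UP(x, a)  (((x) + (a) - 1) & ~((unsigned)(a) - 1))

      int main(void)
      {
          unsigned len = 1500;  /* requested sk_buff data size (arbitrary) */

          /* Broken setup: SMP_CACHE_BYTES falls back to L1_CACHE_BYTES, so
           * skb->end lands on a 32B boundary that may sit mid-SLC-line. */
          unsigned end_l1 = ALIGN_UP(len, L1_CACHE_BYTES);          /* 1504 */
          unsigned slc_end = ALIGN_UP(end_l1, SLC_LINE_SIZE);       /* 1536 */
          /* The covering SLC line extends past skb->end: invalidating the
           * buffer clobbers whatever else lives in those trailing bytes. */
          assert(slc_end > end_l1);

          /* Fixed setup: SMP_CACHE_BYTES hardcoded to the max line length,
           * so the buffer always ends exactly on an SLC line boundary. */
          unsigned end_max = ALIGN_UP(len, SLC_LINE_SIZE);          /* 1536 */
          assert(ALIGN_UP(end_max, SLC_LINE_SIZE) == end_max);

          printf("L1-aligned end: %u, covering SLC line ends at: %u\n",
                 end_l1, slc_end);
          return 0;
      }
      ```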
      Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  2. 31 Aug 2017, 2 commits
  3. 04 Aug 2017, 1 commit
  4. 03 May 2017, 2 commits
    • ARCv2: mm: micro-optimize region flush generated code · f734a310
      Authored by Vineet Gupta
      DC_CTRL.RGN_OP is 3 bits wide; however, only 1 bit is used in the
      current programming model (0: flush, 1: invalidate).
      
      The current code targeting all 3 bits leads to an additional 8-byte
      AND instruction, which can be elided given that only 1 bit is ever set
      by software and/or looked at by hardware.
      
      before
      ------
      
      | 80b63324 <__dma_cache_wback_inv_l1>:
      | 80b63324:	clri	r3
      | 80b63328:	lr	r2,[dc_ctrl]
      | 80b6332c:	and	r2,r2,0xfffff1ff	<--- 8 bytes insn
      | 80b63334:	or	r2,r2,576
      | 80b63338:	sr	r2,[dc_ctrl]
      | ...
      | ...
      | 80b63360 <__dma_cache_inv_l1>:
      | 80b63360:	clri	r3
      | 80b63364:	lr	r2,[dc_ctrl]
      | 80b63368:	and	r2,r2,0xfffff1ff	<--- 8 bytes insn
      | 80b63370:	bset_s	r2,r2,0x9
      | 80b63372:	sr	r2,[dc_ctrl]
      | ...
      | ...
      | 80b6338c <__dma_cache_wback_l1>:
      | 80b6338c:	clri	r3
      | 80b63390:	lr	r2,[dc_ctrl]
      | 80b63394:	and	r2,r2,0xfffff1ff	<--- 8 bytes insn
      | 80b6339c:	sr	r2,[dc_ctrl]
      
      after (AND elided entirely in 2 cases, replaced with a 2-byte BCLR in the 3rd)
      -----
      
      | 80b63324 <__dma_cache_wback_inv_l1>:
      | 80b63324:	clri	r3
      | 80b63328:	lr	r2,[dc_ctrl]
      | 80b6332c:	or	r2,r2,576
      | 80b63330:	sr	r2,[dc_ctrl]
      | ...
      | ...
      | 80b63358 <__dma_cache_inv_l1>:
      | 80b63358:	clri	r3
      | 80b6335c:	lr	r2,[dc_ctrl]
      | 80b63360:	bset_s	r2,r2,0x9
      | 80b63362:	sr	r2,[dc_ctrl]
      | ...
      | ...
      | 80b6337c <__dma_cache_wback_l1>:
      | 80b6337c:	clri	r3
      | 80b63380:	lr	r2,[dc_ctrl]
      | 80b63384:	bclr_s	r2,r2,0x9
      | 80b63386:	sr	r2,[dc_ctrl]
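      The transformation can be checked in plain C. The field placement (bits [11:9]) follows the commit's 0xfffff1ff mask; everything else here is a model, not real aux-register access:

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* DC_CTRL.RGN_OP occupies bits [11:9]; software only ever drives
       * bit 9 (0: flush, 1: invalidate). */
      #define RGN_OP_MASK  (0x7u << 9)   /* 0x00000E00 */
      #define RGN_OP_INV   (0x1u << 9)   /* 0x00000200 */

      int main(void)
      {
          /* Invariant the optimization relies on: software never sets
           * bits 10-11, so they are always 0 in the live DC_CTRL value. */
          uint32_t ctrl = 0x12345078u & ~RGN_OP_MASK;

          /* before: clear the whole 3-bit field, then set the op -- the
           * 0xfffff1ff constant forces a long-immediate 8-byte AND insn */
          uint32_t before_inv = (ctrl & ~RGN_OP_MASK) | RGN_OP_INV;

          /* after: a single-bit set (bset_s) or clear (bclr_s), each a
           * 2-byte insn, yields the same value under the invariant */
          uint32_t after_inv = ctrl | RGN_OP_INV;    /* bset_s r2,r2,0x9 */
          uint32_t after_wb  = ctrl & ~RGN_OP_INV;   /* bclr_s r2,r2,0x9 */

          assert(before_inv == after_inv);
          assert((after_wb & RGN_OP_MASK) == 0);
          printf("inv=0x%08x wb=0x%08x\n", after_inv, after_wb);
          return 0;
      }
      ```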
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARCv2: mm: Implement cache region flush operations · 0d77117f
      Authored by Vineet Gupta
      These are more efficient than the per-line ops.
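      A rough operation-count model shows why region ops win. The 64-byte line size and the "two register writes per region" protocol are assumptions for illustration, not the actual ARCv2 aux-register sequence:

      ```c
      #include <assert.h>
      #include <stdio.h>

      #define L1_LINE 64  /* illustrative cache line size */

      /* Per-line flush: one cache-maintenance op per line in the region */
      static unsigned per_line_ops(unsigned sz)
      {
          return (sz + L1_LINE - 1) / L1_LINE;
      }

      /* Region flush: program an end and a start address once and let the
       * hardware walk the region itself (2 writes, regardless of size) */
      static unsigned region_ops(unsigned sz)
      {
          (void)sz;
          return 2;
      }

      int main(void)
      {
          unsigned page = 4096;
          printf("4KB flush: %u per-line ops vs %u region ops\n",
                 per_line_ops(page), region_ops(page));
          assert(per_line_ops(page) == 64);
          assert(region_ops(page) < per_line_ops(page));
          return 0;
      }
      ```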
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  5. 19 Jan 2017, 2 commits
  6. 25 Oct 2016, 1 commit
    • ARCv2: IOC: use @ioc_enable not @ioc_exist where intended · cf986d47
      Authored by Vineet Gupta
      If the user disables IOC from the debugger at startup (by clearing
      @ioc_enable), @ioc_exists is cleared too. This means boot prints don't
      capture the fact that IOC was present but disabled, which could be
      misleading.
      
      So invert how we use @ioc_enable and @ioc_exists and make it more
      canonical. @ioc_exists represents whether the hardware is present or
      not and stays the same whether enabled or not. @ioc_enable is still
      user driven, but will be auto-disabled if IOC hardware is not present,
      i.e. if @ioc_exists=0. This is the opposite of what we were doing
      before, but much clearer.
      
      This means @ioc_enable is now the "exported" toggle in the rest of the
      code, such as the dma mapping API.
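      The inverted relationship can be sketched in a few lines of C. Variable names follow the commit text; the setup function and its arguments are illustrative, not the kernel's actual code:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* @ioc_exists reflects hardware and never changes at runtime;
       * @ioc_enable is the user toggle, auto-cleared when no hardware. */
      static bool ioc_exists;
      static bool ioc_enable;

      static void arc_ioc_setup(bool hw_present, bool user_wants)
      {
          ioc_exists = hw_present;
          ioc_enable = user_wants;
          if (!ioc_exists)           /* can't enable what isn't there */
              ioc_enable = false;
      }

      int main(void)
      {
          /* HW present, user disabled from debugger: boot prints can still
           * report "IOC present (disabled)" because ioc_exists survives. */
          arc_ioc_setup(true, false);
          assert(ioc_exists && !ioc_enable);

          /* No hardware: the user toggle is auto-disabled. */
          arc_ioc_setup(false, true);
          assert(!ioc_exists && !ioc_enable);

          /* DMA mapping code keys off ioc_enable, the exported toggle. */
          arc_ioc_setup(true, true);
          assert(ioc_enable);
          printf("ok\n");
          return 0;
      }
      ```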
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  7. 01 Oct 2016, 1 commit
  8. 19 Mar 2016, 1 commit
    • ARCv2: ioremap: Support dynamic peripheral address space · deaf7565
      Authored by Vineet Gupta
      The peripheral address space is an architectural address window which
      is uncached and typically used to wire up peripherals.
      
      For ARC700 cores (ARCompact ISA based) this was fixed to the 1GB
      region 0xC000_0000 - 0xFFFF_FFFF.
      
      For ARCv2 based HS38 cores the start address is flexible and can be
      0xC000_0000, 0xD000_0000, 0xE000_0000 or 0xF000_0000 by programming
      the AUX_NON_VOLATILE_LIMIT reg (typically done in the bootloader).
      
      Further, in case of PAE the physical address can extend beyond 4GB, so
      we need to confine this check; otherwise all pages beyond 4GB would be
      treated as uncached.
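      The confined check can be sketched as follows. The function name and the chosen base of 0xF000_0000 are illustrative; the point is that the window is bounded on both ends once physical addresses can exceed 32 bits:

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Base programmed via AUX_NON_VOLATILE_LIMIT (one of 0xC/0xD/0xE/
       * 0xF000_0000); 0xF000_0000 chosen here for illustration. */
      static uint64_t perip_base = 0xF0000000ull;

      /* An address is uncached peripheral space only if it lies inside
       * [perip_base, 4GB) -- PAE addresses above 4GB must not match. */
      static int is_uncached_peripheral(uint64_t paddr)
      {
          return paddr >= perip_base && paddr <= 0xFFFFFFFFull;
      }

      int main(void)
      {
          assert(is_uncached_peripheral(0xF0001000ull));   /* in the window */
          assert(!is_uncached_peripheral(0x80000000ull));  /* normal memory */
          assert(!is_uncached_peripheral(0x100000000ull)); /* >4GB PAE page */
          printf("ok\n");
          return 0;
      }
      ```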
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  9. 21 Dec 2015, 1 commit
    • ARC: mm: fix building for MMU v2 · 4b32e89a
      Authored by Alexey Brodkin
      ARC700 cores with MMU v2 don't have the IC_PTAG aux register, so we
      only define ARC_REG_IC_PTAG for MMU versions >= 3.
      
      But the current implementation of the cache_line_loop_vX() routines
      assumes availability of all of them (v2, v3 and v4) simultaneously.
      
      And given that ARC_REG_IC_PTAG is undefined if CONFIG_MMU_VER=2, we're
      seeing a compilation problem:
      ---------------------------------->8-------------------------------
        CC      arch/arc/mm/cache.o
      arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
      arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
         aux_tag = ARC_REG_IC_PTAG;
                   ^
      arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
      scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
      ---------------------------------->8-------------------------------
      
      The simplest fix is to have ARC_REG_IC_PTAG defined regardless of the
      MMU version being used.
      
      We don't use it in cache_line_loop_v2() anyway, so who cares.
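      The shape of the fix can be sketched in plain C (the register value and version macro here are illustrative): defining the macro unconditionally costs nothing for MMU v2, since the v2 loop never references it.

      ```c
      #include <assert.h>
      #include <stdio.h>

      #define CONFIG_ARC_MMU_VER 2   /* the configuration that failed */

      /* before:
       *   #if (CONFIG_ARC_MMU_VER >= 3)
       *   #define ARC_REG_IC_PTAG 0x1E
       *   #endif
       * leaves __cache_line_loop_v3() referencing an undeclared name,
       * since all loop variants are compiled regardless of MMU version.
       *
       * after: define it unconditionally (value illustrative): */
      #define ARC_REG_IC_PTAG 0x1E

      /* v3 loop now always compiles, even if never called on v2 */
      static int __cache_line_loop_v3(void) { return ARC_REG_IC_PTAG; }

      /* v2 loop simply doesn't use the register */
      static int __cache_line_loop_v2(void) { return 0; }

      int main(void)
      {
          assert(__cache_line_loop_v3() == ARC_REG_IC_PTAG);
          assert(__cache_line_loop_v2() == 0);
          printf("builds with CONFIG_ARC_MMU_VER=%d\n", CONFIG_ARC_MMU_VER);
          return 0;
      }
      ```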
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  10. 29 Oct 2015, 1 commit
  11. 20 Aug 2015, 1 commit
    • ARCv2: Support IO Coherency and permutations involving L1 and L2 caches · f2b0b25a
      Authored by Alexey Brodkin
      In case of an ARCv2 CPU there could be the following configurations
      that affect cache handling for data exchanged with peripherals
      via DMA:
       [1] Only L1 cache exists
       [2] Both L1 and L2 exist, but no IO coherency unit
       [3] L1, L2 caches and IO coherency unit exist
      
      The current implementation takes care of [1] and [2]. Moreover,
      support of [2] is implemented with a run-time check for SLC
      existence, which is not optimal.
      
      This patch introduces support of [3] and reworks DMA ops usage.
      Instead of doing a run-time check every time a particular DMA op is
      executed, we'll have 3 different implementations of the DMA ops and
      select the appropriate one during init.
      
      As for IOC support, we need to:
       [a] Implement empty DMA ops, because the IOC takes care of cache
           coherency with DMAed data
       [b] Route dma_alloc_coherent() via dma_alloc_noncoherent().
           This is required to make IOC work in the first place, and also
           serves as an optimization, as LD/ST to coherent buffers can be
           serviced from caches w/o going all the way to memory
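      The init-time selection among the three implementations can be sketched as follows. The struct, function names and selection helper are illustrative stand-ins, not the kernel's actual DMA-ops types:

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      struct dma_ops {
          const char *name;
          void (*sync_for_device)(void *buf, unsigned len);
      };

      static void sync_l1_only(void *buf, unsigned len)
      { (void)buf; (void)len; /* flush L1 lines for the buffer */ }
      static void sync_l1_slc(void *buf, unsigned len)
      { (void)buf; (void)len; /* flush L1, then the SLC region */ }
      static void sync_noop(void *buf, unsigned len)
      { (void)buf; (void)len; /* IOC keeps caches coherent: do nothing */ }

      static const struct dma_ops l1_ops  = { "l1",     sync_l1_only };
      static const struct dma_ops slc_ops = { "l1+slc", sync_l1_slc };
      static const struct dma_ops ioc_ops = { "ioc",    sync_noop };

      /* Decide once at init; no per-operation run-time checks remain. */
      static const struct dma_ops *select_dma_ops(int slc_exists, int ioc_enable)
      {
          if (ioc_enable)
              return &ioc_ops;   /* [3] empty ops: IOC handles coherency */
          if (slc_exists)
              return &slc_ops;   /* [2] both cache levels flushed */
          return &l1_ops;        /* [1] only L1 to maintain */
      }

      int main(void)
      {
          assert(strcmp(select_dma_ops(1, 1)->name, "ioc") == 0);
          assert(strcmp(select_dma_ops(1, 0)->name, "l1+slc") == 0);
          assert(strcmp(select_dma_ops(0, 0)->name, "l1") == 0);
          printf("ok\n");
          return 0;
      }
      ```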
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      [vgupta:
        -Added some comments about IOC gains
        -Marked dma ops as static,
        -Massaged changelog a bit]
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  12. 25 Jun 2015, 1 commit
  13. 22 Jun 2015, 2 commits
  14. 13 Oct 2014, 1 commit
  15. 16 Jun 2014, 1 commit
  16. 03 Jun 2014, 1 commit
  17. 06 Nov 2013, 1 commit
  18. 05 Sep 2013, 1 commit
    • ARC: fix new Section mismatches in build (post __cpuinit cleanup) · 07b9b651
      Authored by Vineet Gupta
      ------------>8--------------------
      WARNING: vmlinux.o(.text+0x708): Section mismatch in reference from the
      function read_arc_build_cfg_regs() to the function
      .init.text:read_decode_cache_bcr()
      
      WARNING: vmlinux.o(.text+0x702): Section mismatch in reference from the
      function read_arc_build_cfg_regs() to the function
      .init.text:read_decode_mmu_bcr()
      ------------>8--------------------
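      The mismatch and its usual cure can be modeled in user-space C. Here `__init` is expanded to the section attribute the kernel uses, and the function bodies are stand-ins; this shows the general pattern for such warnings, not necessarily this commit's exact change:

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* Model __init: place the function in the .init.text section. */
      #define __init __attribute__((__section__(".init.text")))

      /* These were already init-only: */
      static int __init read_decode_cache_bcr(void) { return 1; }
      static int __init read_decode_mmu_bcr(void)   { return 2; }

      /* modpost warns when this caller lives in plain .text; since it
       * only runs during boot, placing it in .init.text as well removes
       * the cross-section reference. */
      static int __init read_arc_build_cfg_regs(void)
      {
          return read_decode_cache_bcr() + read_decode_mmu_bcr();
      }

      int main(void)
      {
          assert(read_arc_build_cfg_regs() == 3);
          printf("ok\n");
          return 0;
      }
      ```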
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  19. 22 Jun 2013, 3 commits
  20. 10 May 2013, 1 commit
  21. 16 Feb 2013, 1 commit
  22. 11 Feb 2013, 1 commit
    • ARC: Fundamental ARCH data-types/defines · 3be80aae
      Authored by Vineet Gupta
      * L1_CACHE_SHIFT
      * PAGE_SIZE, PAGE_OFFSET
      * struct pt_regs, struct user_regs_struct
      * struct thread_struct, cpu_relax(), task_pt_regs(), start_thread(), ...
      * struct thread_info, THREAD_SIZE, INIT_THREAD_INFO(), TIF_*, ...
      * BUG()
      * ELF_*
      * Elf_*
      
      To disallow user-space visibility into some of the core kernel
      data-types such as struct pt_regs, these are wrapped in #ifdef
      __KERNEL__, which also makes the UAPI header split (a further patch in
      the series) NOT export them to asm/uapi/ptrace.h.
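      The #ifdef __KERNEL__ guard works roughly like this. Struct contents are illustrative, not the real ARC register layout:

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* --- sketch of the exported ptrace header --- */

      struct user_regs_struct {     /* exported: user space may see this */
          unsigned long r0, sp, ret;
      };

      #ifdef __KERNEL__
      struct pt_regs {              /* kernel-only: hidden from UAPI copy */
          unsigned long r0, sp, ret, status32;
      };
      #endif

      /* --- end of header sketch --- */

      int main(void)
      {
      #ifdef __KERNEL__
          printf("kernel build: pt_regs available\n");
      #else
          printf("user build: only user_regs_struct visible\n");
      #endif
          assert(sizeof(struct user_regs_struct) == 3 * sizeof(unsigned long));
          return 0;
      }
      ```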
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Cc: Jonas Bonn <jonas.bonn@gmail.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Acked-by: Arnd Bergmann <arnd@arndb.de>