1. 07 May, 2013 6 commits
    • ARC: [mm] Lazy D-cache flush (non aliasing VIPT) · eacd0e95
      Vineet Gupta authored
      flush_dcache_page() is the MM hook that ensures a page has consistent
      views between kernel and userspace. It is therefore called when:

      * the kernel writes to a page which at some later point could get mapped
        to userspace (so the kernel mapping needs to be flushed and invalidated)
      * the kernel is about to read from a page with possible userspace mappings
        (so the userspace mappings need to be made coherent with the kernel ones)

      However, for a non-aliasing VIPT dcache, any userspace mapping will always
      be congruent to the kernel mapping. Thus the d-cache need not be flushed
      at all (or the flush can be delayed indefinitely).

      The only case where it does need to be flushed is when mapping code pages.
      Since the icache doesn't snoop the dcache, the dirty dcache lines need to
      be written back to memory and the icache lines invalidated, so that
      subsequent icache line fetches get the right data.
      
      Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.
      
      (1) FPGA @ 80 MHz
      
      Processor, Processes - times in microseconds - smaller is better
      ------------------------------------------------------------------------------
      Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                   call  I/O stat clos TCP  inst hndl proc proc proc
      --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
      3.9-rc6-a Linux 3.9.0-r   80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
      3.9-rc6-b Linux 3.9.0-r   80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
      3.9-rc7-c Linux 3.9.0-r   80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
                                                                      ^^^^ ^^^^ ^^^
      
      File & VM system latencies in microseconds - smaller is better
      -------------------------------------------------------------------------------
      Host                 OS   0K File      10K File     Mmap    Prot   Page 100fd
                              Create Delete Create Delete Latency Fault  Fault selct
      --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
      3.9-rc6-a Linux 3.9.0-r  317.8  204.2 1122.3  375.1 3522.0 4.288     20.7 126.8
      3.9-rc6-b Linux 3.9.0-r  298.7  223.0 1141.6  367.8 3531.0 4.866     20.9 126.4
      3.9-rc7-c Linux 3.9.0-r  278.4  179.2  862.1  339.3 3705.0 3.223     20.3 126.6
                               ^^^^^  ^^^^^  ^^^^^  ^^^^
      
      (2) Customer Silicon @ 500 MHz (166 MHz mem)
      
      ------------------------------------------------------------------------------
      Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                   call  I/O stat clos TCP  inst hndl proc proc proc
      --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
      abilis-ba Linux 3.9.0-r  497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
      abilis-ca Linux 3.9.0-r  497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
                                                                      ^^^^ ^^^^ ^^^
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: [mm] micro-optimize page size icache invalidate · 764531cc
      Vineet Gupta authored
      The start address is already page aligned and the size is the constant
      PAGE_SIZE, so alignment fixups are not needed in the generated code.
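
To illustrate why the fixups can go, the sketch below contrasts a generic range helper, which has to round an arbitrary start and size to cache-line boundaries, with a page-sized helper that can use a compile-time line count. The names `sk_lines_range` and `sk_lines_page` and the 64-byte line and 8 KB page sizes are assumptions for illustration, not values from the commit.

```c
#include <assert.h>

#define SK_LINE 64ul    /* assumed cache line size in bytes */
#define SK_PAGE 8192ul  /* assumed page size in bytes */

/* Generic range: start and size are arbitrary, so both need fixing up. */
static unsigned long sk_lines_range(unsigned long start, unsigned long sz)
{
    sz += start & (SK_LINE - 1);           /* include bytes before start in its line */
    return (sz + SK_LINE - 1) / SK_LINE;   /* round up to whole lines */
}

/* Page-sized: start is page aligned and size is constant, so no fixups. */
static unsigned long sk_lines_page(unsigned long page_start)
{
    assert((page_start & (SK_PAGE - 1)) == 0);
    return SK_PAGE / SK_LINE;              /* folds to a constant at compile time */
}
```

For a page-aligned start both helpers agree on the number of lines; the page variant simply avoids the masking and rounding arithmetic.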
      
      bloat-o-meter vmlinux-mm5 vmlinux
      add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-32 (-32)
      function                                     old     new   delta
      __inv_icache_page                             82      50     -32
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: [mm] remove the pessimistic all-alias-invalidate icache helpers · 7f250a0f
      Vineet Gupta authored
      No users of this code remain, so RIP!
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: [mm] consolidate icache/dcache sync code · 94bad1af
      Vineet Gupta authored
      Now that the same helper is used for all icache invalidates (i.e.
      vaddr+paddr based exact line invalidate), consolidate the open-coded
      calls into one place.

      Also rename flush_icache_range_vaddr => __sync_icache_dcache.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: [mm] optimise icache flush for kernel mappings · 7586bf72
      Vineet Gupta authored
      This change continues the theme from the previous commit, this time for
      icache handling of the kernel's own code modification (vmalloc: loadable
      modules, breakpoints for kprobes/kgdb...).
      
      flush_icache_range() calls the CDU icache helper with vaddr to enable
      exact line invalidate.
      
      For a true kernel-virtual mapping, the vaddr is actually virtual and hence
      valid as an index into the cache. For a kprobes breakpoint, however, the
      vaddr arg is actually a paddr, since that's how the normal kernel is
      mapped in the ARC memory map. This implies that the CDU will use the same
      addr for indexing as for tag match, which is fine since kernel code would
      only have that "implicit" mapping and none other.
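
The two cases can be sketched as a small address-selection helper. Everything here is hypothetical: `SK_LINEAR_BASE` is an assumed boundary for the untranslated kernel region, and the translated physical address for a vmalloc mapping is passed in by the caller rather than looked up, since the lookup is outside what the commit message describes.

```c
#include <assert.h>

/* Assumed start of the untranslated kernel region (illustrative only). */
#define SK_LINEAR_BASE 0x80000000ul

struct sk_inv_args {
    unsigned long index_addr;  /* address used to select the cache set */
    unsigned long tag_addr;    /* physical address compared against the tag */
};

/*
 * Pick the index/tag pair for invalidating one kernel code address.
 * translated_paddr is only consulted for kernel-virtual (vmalloc) addresses.
 */
static struct sk_inv_args sk_kernel_inv_args(unsigned long kaddr,
                                             unsigned long translated_paddr)
{
    struct sk_inv_args a;

    assert(kaddr != 0);                 /* a null address is never valid code */
    a.index_addr = kaddr;               /* kaddr indexes the cache in both cases */
    if (kaddr >= SK_LINEAR_BASE)
        a.tag_addr = kaddr;             /* implicit mapping: vaddr is the paddr */
    else
        a.tag_addr = translated_paddr;  /* vmalloc: the tag needs the real paddr */
    return a;
}
```

Either way a single index is visited per line, which is what makes the invalidate exact.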
      
      This should speed up module loading significantly, especially on default
      ARC700 icache configurations (32k), which alias.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: [mm] optimise icache flush for user mappings · 24603fdd
      Vineet Gupta authored
      The ARC icache doesn't snoop the dcache, thus executable pages need to be
      made coherent before mapping into userspace in flush_icache_page().

      However, the ARC700 CDU (hardware cache flush module) requires both vaddr
      (index in cache) and paddr (tag match) to correctly identify a line in
      the VIPT cache. A typical ARC700 SoC has an aliasing icache, thus the
      paddr-only flush_icache_page() API couldn't be implemented efficiently.
      It had to loop through all possible alias indexes and perform the
      invalidate operation (of course the cache op would only succeed at the
      index(es) where the tag matches, typically only 1, but the cost of
      visiting all the cache bins needs to be paid nevertheless).

      It turns out, however, that the vaddr (along with the paddr) is available
      in update_mmu_cache(), which hence better suits ARC icache flush
      semantics. With both vaddr+paddr, exactly one flush operation per line is
      done.
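
A rough count of cache operations shows what the alias loop costs. This is a sketch under assumed geometry: the helper names are hypothetical, and the 2-way associativity, 8 KB page and 64-byte line used in the usage note are illustrative values rather than figures from the commit.

```c
#include <assert.h>

/* Number of alias indexes ("colours") a physical page can occupy. */
static unsigned sk_num_colours(unsigned cache_bytes, unsigned ways,
                               unsigned page_bytes)
{
    assert(ways != 0 && page_bytes != 0);
    unsigned way_bytes = cache_bytes / ways;
    /* The cache aliases only when one way spans more than one page. */
    return way_bytes > page_bytes ? way_bytes / page_bytes : 1;
}

/* paddr-only flush: every line must be tried at every possible colour. */
static unsigned sk_ops_paddr_only(unsigned cache_bytes, unsigned ways,
                                  unsigned page_bytes, unsigned line_bytes)
{
    assert(line_bytes != 0);
    return sk_num_colours(cache_bytes, ways, page_bytes) *
           (page_bytes / line_bytes);
}

/* vaddr+paddr flush: the index is known, so one op per line suffices. */
static unsigned sk_ops_vaddr_paddr(unsigned page_bytes, unsigned line_bytes)
{
    assert(line_bytes != 0);
    return page_bytes / line_bytes;
}
```

With a 32 KB, 2-way cache, 8 KB pages and 64-byte lines, the paddr-only variant issues 256 operations per page while the vaddr+paddr variant issues 128; larger ways or smaller pages widen the gap.
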
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  2. 09 Apr, 2013 1 commit
  3. 16 Feb, 2013 3 commits