1. 30 Oct 2014 (1 commit)
• ARM: 8180/1: mm: implement no-highmem fast path in kmap_atomic_pfn() · 9ff0bb5b
  Authored by Thomas Petazzoni
Since CONFIG_HIGHMEM was enabled on ARMv5 Kirkwood, we have noticed a
very significant drop in networking performance. The tests were
conducted on an OpenBlocks A7 board. Without this patch, the outgoing
throughput measured with iperf is:
      
       - highmem OFF, TSO OFF   544 Mbit/s
 - highmem OFF, TSO ON    942 Mbit/s
       - highmem ON,  TSO OFF   306 Mbit/s
       - highmem ON,  TSO ON    246 Mbit/s
      
      On this Kirkwood platform, the L2 cache is a Feroceon cache, and with
      this cache, all the range operations have to be done on virtual
      addresses and not physical addresses. Therefore, whenever
      CONFIG_HIGHMEM is enabled, the cache maintenance operations call
      kmap_atomic_pfn() and kunmap_atomic().
      
However, kmap_atomic_pfn() does not implement the same fast path for
non-highmem pages as the one implemented in kmap_atomic(), and this is
one of the reasons for the performance drop. While this patch does not
fully restore the original performance, it clearly improves it a lot:
      
                         without patch   with patch

 - highmem ON, TSO OFF   306 Mbit/s      387 Mbit/s
 - highmem ON, TSO ON    246 Mbit/s      434 Mbit/s
      
We're still far from the !CONFIG_HIGHMEM numbers, but this clearly
improves the situation.
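
For illustration, a minimal sketch of the kind of fast path described here
(it follows the shape of kmap_atomic(), not the literal patch;
__kmap_atomic_pfn_highmem() is a hypothetical stand-in for the existing
highmem slow path):

    void *kmap_atomic_pfn(unsigned long pfn)
    {
            struct page *page = pfn_to_page(pfn);

            pagefault_disable();
            /* Fast path: lowmem pages are permanently mapped, so their
             * linear-map address can be returned directly, without
             * touching the fixmap machinery at all. */
            if (!PageHighMem(page))
                    return page_address(page);

            /* Hypothetical stand-in for the existing highmem slow path. */
            return __kmap_atomic_pfn_highmem(pfn);
    }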
      
      Thanks a lot to Ezequiel Garcia and Gregory Clement for all the
      testing work around this topic.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2. 17 Oct 2014 (1 commit)
• ARM: expand fixmap region to 3MB · 836a2418
  Authored by Rob Herring
      With commit a05e54c1 ("ARM: 8031/2: change fixmap mapping region to
      support 32 CPUs"), the fixmap region was expanded to 2MB, but it
      precluded any other uses of the fixmap region. In order to support other
      uses the fixmap region needs to be expanded beyond 2MB. Fortunately, the
adjacent 1MB range 0xffe00000-0xfff00000 is available.
      
Remove the fixmap_page_table pointer and look up the page table via the
virtual address so that the fixmap region can span more than one pmd. The 2nd
      pmd is already created since it is shared with the vector page.
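
A sketch of that lookup, assuming the standard ARM page-table helpers
pmd_off_k() and pte_offset_kernel(); it follows the shape of the patch
rather than quoting it:

    /* Derive the fixmap pte from its virtual address each time, so the
     * region may span more than one pmd. */
    static inline pte_t *get_fixmap_pte(unsigned long vaddr)
    {
            pmd_t *pmd = pmd_off_k(vaddr);        /* kernel pmd for vaddr */

            return pte_offset_kernel(pmd, vaddr);
    }

    static inline void set_fixmap_pte(int idx, pte_t pte)
    {
            unsigned long vaddr = __fix_to_virt(idx);

            set_pte_ext(get_fixmap_pte(vaddr), pte, 0);
            local_flush_tlb_kernel_page(vaddr);   /* drop stale TLB entry */
    }
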
Signed-off-by: Rob Herring <robh@kernel.org>
      [kees: fixed CONFIG_DEBUG_HIGHMEM get_fixmap() calls]
      [kees: moved pte allocation outside of CONFIG_HIGHMEM]
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
3. 23 Apr 2014 (2 commits)
4. 20 Mar 2012 (1 commit)
5. 27 Jan 2012 (2 commits)
6. 20 Dec 2010 (1 commit)
• ARM: get rid of kmap_high_l1_vipt() · 39af22a7
  Authored by Nicolas Pitre
      Since commit 3e4d3af5 "mm: stack based kmap_atomic()", it is no longer
      necessary to carry an ad hoc version of kmap_atomic() added in commit
      7e5a69e8 "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope
      with reentrancy.
      
      In fact, it is now actively wrong to rely on fixed kmap type indices
      (namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a
      concurrent instance of it may reuse any slot for any purpose.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
7. 28 Oct 2010 (1 commit)
8. 27 Oct 2010 (1 commit)
• mm: stack based kmap_atomic() · 3e4d3af5
  Authored by Peter Zijlstra
Keep the current interface but ignore the KM_type argument and use a
stack-based approach.
      
      The advantage is that we get rid of crappy code like:
      
      	#define __KM_PTE			\
      		(in_nmi() ? KM_NMI_PTE : 	\
      		 in_irq() ? KM_IRQ_PTE :	\
      		 KM_PTE0)
      
      and in general can stop worrying about what context we're in and what kmap
      slots might be appropriate for that.
      
      The downside is that FRV kmap_atomic() gets more expensive.
      
      For now we use a CPP trick suggested by Andrew:
      
        #define kmap_atomic(page, args...) __kmap_atomic(page)
      
      to avoid having to touch all kmap_atomic() users in a single patch.
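
In practice the stack-based approach boils down to a per-CPU slot counter
that is pushed on map and popped on unmap, so the calling context stops
mattering as long as unmaps nest LIFO. A minimal sketch along the lines of
the kmap_atomic_idx helpers this series introduces:

    DEFINE_PER_CPU(int, __kmap_atomic_idx);

    static inline int kmap_atomic_idx_push(void)
    {
            /* Next free slot on this CPU; nested mappings (IRQ, NMI)
             * simply take the next one up. */
            return __this_cpu_inc_return(__kmap_atomic_idx) - 1;
    }

    static inline void kmap_atomic_idx_pop(void)
    {
            /* LIFO: callers must unmap in reverse order of mapping. */
            __this_cpu_dec(__kmap_atomic_idx);
    }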
      
      [ not compiled on:
        - mn10300: the arch doesn't actually build with highmem to begin with ]
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9. 10 Aug 2010 (1 commit)
• kmap_atomic: make kunmap_atomic() harder to misuse · 597781f3
  Authored by Cesar Eduardo Barros
      kunmap_atomic() is currently at level -4 on Rusty's "Hard To Misuse"
      list[1] ("Follow common convention and you'll get it wrong"), except in
      some architectures when CONFIG_DEBUG_HIGHMEM is set[2][3].
      
kunmap() takes a pointer to a struct page; kunmap_atomic(), however,
takes a pointer to within the page itself.  This trips people up once
in a while (the convention they are following is the one from
kunmap()).
      
      Make it much harder to misuse, by moving it to level 9 on Rusty's list[4]
      ("The compiler/linker won't let you get it wrong").  This is done by
      refusing to build if the type of its first argument is a pointer to a
      struct page.
      
      The real kunmap_atomic() is renamed to kunmap_atomic_notypecheck()
      (which is what you would call in case for some strange reason calling it
      with a pointer to a struct page is not incorrect in your code).
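
A sketch of that compile-time check, using the kernel's __same_type()
and BUILD_BUG_ON() helpers (the two-argument form matches the
kunmap_atomic() interface of this era):

    /* Refuse to build if addr is a struct page *, i.e. the caller
     * followed the kunmap() convention by mistake. */
    #define kunmap_atomic(addr, idx) do {                           \
            BUILD_BUG_ON(__same_type((addr), struct page *));       \
            kunmap_atomic_notypecheck((addr), (idx));               \
    } while (0)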
      
      The previous version of this patch was compile tested on x86-64.
      
      [1] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
      [2] In these cases, it is at level 5, "Do it right or it will always
          break at runtime."
      [3] At least mips and powerpc look very similar, and sparc also seems to
          share a common ancestor with both; there seems to be quite some
          degree of copy-and-paste coding here. The include/asm/highmem.h file
          for these three archs mention x86 CPUs at its top.
      [4] http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
      [5] As an aside, could someone tell me why mn10300 uses unsigned long as
          the first parameter of kunmap_atomic() instead of void *?
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
      Cc: Russell King <linux@arm.linux.org.uk> (arch/arm)
      Cc: Ralf Baechle <ralf@linux-mips.org> (arch/mips)
      Cc: David Howells <dhowells@redhat.com> (arch/frv, arch/mn10300)
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> (arch/mn10300)
      Cc: Kyle McMartin <kyle@mcmartin.ca> (arch/parisc)
      Cc: Helge Deller <deller@gmx.de> (arch/parisc)
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> (arch/parisc)
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> (arch/powerpc)
      Cc: Paul Mackerras <paulus@samba.org> (arch/powerpc)
      Cc: "David S. Miller" <davem@davemloft.net> (arch/sparc)
      Cc: Thomas Gleixner <tglx@linutronix.de> (arch/x86)
      Cc: Ingo Molnar <mingo@redhat.com> (arch/x86)
      Cc: "H. Peter Anvin" <hpa@zytor.com> (arch/x86)
      Cc: Arnd Bergmann <arnd@arndb.de> (include/asm-generic)
      Cc: Rusty Russell <rusty@rustcorp.com.au> ("Hard To Misuse" list)
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10. 31 Jul 2010 (1 commit)
11. 09 Jun 2010 (1 commit)
12. 14 Apr 2010 (1 commit)
• ARM: 6007/1: fix highmem with VIPT cache and DMA · 7e5a69e8
  Authored by Nicolas Pitre
      The VIVT cache of a highmem page is always flushed before the page
      is unmapped.  This cache flush is explicit through flush_cache_kmaps()
      in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in
kunmap_atomic().  There is also an implicit flush of those highmem pages
that belonged to a process that just terminated (making those pages free),
since the whole VIVT cache has to be flushed on every task switch.  Hence
unmapped highmem pages need no cache maintenance in that case.
      
      However unmapped pages may still be cached with a VIPT cache because the
      cache is tagged with physical addresses.  There is no need for a whole
      cache flush during task switching for that reason, and despite the
      explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(),
      some highmem pages that were mapped in user space end up still cached
      even when they become unmapped.
      
      So, we do have to perform cache maintenance on those unmapped highmem
      pages in the context of DMA when using a VIPT cache.  Unfortunately,
      it is not possible to perform that cache maintenance using physical
      addresses as all the L1 cache maintenance coprocessor functions accept
      virtual addresses only.  Therefore we have no choice but to set up a
      temporary virtual mapping for that purpose.
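
For illustration, a hedged sketch of that idea; cache_maint_highmem_page()
and dmac_op are hypothetical stand-ins (a real caller would pass something
like dmac_map_area), and the one-argument kmap_atomic() form is used for
brevity:

    /* Hypothetical helper: give an unmapped highmem page a temporary
     * virtual alias so the L1 range operations (which only accept
     * virtual addresses) can be applied to it. */
    static void cache_maint_highmem_page(struct page *page, size_t size,
                    void (*dmac_op)(const void *, size_t))
    {
            void *vaddr = kmap_atomic(page);  /* temporary mapping */

            dmac_op(vaddr, size);             /* operate on the alias */
            kunmap_atomic(vaddr);             /* tear it down again */
    }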
      
And of course the explicit cache flushing when unmapping a highmem page
on a system with a VIPT cache can now go away, which should increase
performance.
      
      While at it, because the code in __flush_dcache_page() has to be modified
      anyway, let's also make sure the mapped highmem pages are pinned with
      kmap_high_get() for the duration of the cache maintenance operation.
Because kunmap() unmaps highmem pages lazily, Gary King
<GKing@nvidia.com> reported that those pages ended up being unmapped
during cache maintenance on SMP, causing segmentation faults.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
13. 14 Dec 2009 (1 commit)
14. 11 Oct 2009 (1 commit)
15. 05 Sep 2009 (1 commit)
• ARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem · 7929eb9c
  Authored by Nicolas Pitre
      Let's suppose a highmem page is kmap'd with kmap().  A pkmap entry is
      used, the page mapped to it, and the virtual cache is dirtied.  Then
kunmap() is called, which does virtually nothing except decrement a
usage count.
      
      Then, let's suppose the _same_ page gets mapped using kmap_atomic().
      It is therefore mapped onto a fixmap entry instead, which has a
      different virtual address unaware of the dirty cache data for that page
      sitting in the pkmap mapping.
      
      Fortunately it is easy to know if a pkmap mapping still exists for that
      page and use it directly with kmap_atomic(), thanks to kmap_high_get().
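
A sketch of the resulting kmap_atomic() logic (the shape of the fix, not
the literal diff; kmap_atomic_fixmap_slot() is a hypothetical stand-in
for the usual fixmap path):

    void *kmap_atomic(struct page *page)
    {
            void *kmap;

            pagefault_disable();
            if (!PageHighMem(page))
                    return page_address(page);

            /* Reuse a live pkmap mapping if one exists; kmap_high_get()
             * also pins it, so the page keeps a single virtual alias and
             * the dirty VIVT cache lines stay coherent. */
            kmap = kmap_high_get(page);
            if (kmap)
                    return kmap;

            /* Hypothetical stand-in for the usual fixmap slot path. */
            return kmap_atomic_fixmap_slot(page);
    }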
      
      And actual testing with a printk in the added code path shows that this
      condition is actually met *extremely* frequently.  Seems that we've been
      quite lucky that things have worked so well with highmem so far.
      
      Cc: stable@kernel.org
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16. 16 Mar 2009 (1 commit)
• [ARM] kmap support · d73cd428
  Authored by Nicolas Pitre
      The kmap virtual area borrows a 2MB range at the top of the 16MB area
      below PAGE_OFFSET currently reserved for kernel modules and/or the
      XIP kernel.  This 2MB corresponds to the range covered by 2 consecutive
      second-level page tables, or a single pmd entry as seen by the Linux
      page table abstraction.  Because XIP kernels are unlikely to be seen
      on systems needing highmem support, there shouldn't be any shortage of
      VM space for modules (14 MB for modules is still way more than twice the
      typical usage).
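
In terms of addresses, the layout described above looks roughly like this
(assumed values, matching the ARM memory map of that era):

    #define PAGE_OFFSET   0xc0000000UL                  /* start of lowmem */
    #define PMD_SIZE      (2 * 1024 * 1024)             /* one Linux pmd: 2MB */
    #define PKMAP_BASE    (PAGE_OFFSET - PMD_SIZE)      /* kmap() window */
    #define MODULES_VADDR (PAGE_OFFSET - 16 * 1024 * 1024) /* modules below */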
      
      Because the virtual mapping of highmem pages can go away at any moment
      after kunmap() is called on them, we need to bypass the delayed cache
      flushing provided by flush_dcache_page() in that case.
      
      The atomic kmap versions are based on fixmaps, and
      __cpuc_flush_dcache_page() is used directly in that case.
Signed-off-by: Nicolas Pitre <nico@marvell.com>