1. 26 May 2014, 1 commit
• ARM: 8043/1: uprobes need icache flush after xol write · 72e6ae28
Victor Kamensky authored
After an instruction is written into the xol area, on the ARM v7
architecture the code needs to flush the dcache and icache to
sync them up for the given set of addresses. Having just a
'flush_dcache_page(page)' call is not enough: a stale instruction
may still be sitting in the icache for the given xol area slot
address.
      
Introduce a weak arch_uprobe_copy_ixol function that by default
calls the uprobes copy_to_page function and then
flush_dcache_page, and define a new version on ARM that handles
the xol slot copy in an ARM-specific way.
      
The flush_uprobe_xol_access function reuses the implementation of
flush_ptrace_access and takes care of making an instruction
written to a userland address visible across the variety of cache
types found on ARM CPUs. Because flush_uprobe_xol_access does not
have a vma at hand, flush_ptrace_access was split into two parts:
one that derives the set of conditions from the vma, and a common
part that receives those conditions as flags.
      
Note that the ARM cache flush functions need the kernel address
through which the instruction write happened, so instead of using
the uprobes copy_to_page function, the code now explicitly maps
the page and does a memcpy.
      
Note that the arch_uprobe_copy_ixol function, like
copy_to_user_page, brackets the copy and flush with
preempt_disable/preempt_enable.
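
A minimal sketch of the ARM-specific arch_uprobe_copy_ixol
described above (assuming the flush_uprobe_xol_access helper this
patch introduces; details may differ from the final code):

	void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
				   void *src, unsigned long len)
	{
		void *xol_page_kaddr = kmap_atomic(page);
		void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK);

		preempt_disable();

		/* Initialize the xol slot through the kernel mapping. */
		memcpy(dst, src, len);

		/* Flush dcache/icache via the kernel address of the write. */
		flush_uprobe_xol_access(page, vaddr, dst, len);

		preempt_enable();

		kunmap_atomic(xol_page_kaddr);
	}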
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2. 29 December 2013, 1 commit
3. 17 June 2013, 1 commit
4. 06 June 2013, 1 commit
• ARM: 7746/1: mm: lazy cache flushing on non-mapped pages · 81f28946
Ming Lei authored
Currently flush_dcache_page() treats a page as unmapped if
mapping_mapped(mapping) returns false. This approach is very
coarse:
	- an mmap covering part of a file causes every page backed
	by that file to be treated as mapped

	- file-backed pages aren't actually mapped into user space
	until the mmapped memory is accessed
      
      This patch uses page_mapped() to decide if the page has been
      mapped.
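
A sketch of the changed test in flush_dcache_page() (simplified
from the surrounding code in arch/arm/mm/flush.c; context may
differ):

	/* Before: any mmap of the file marks every cached page mapped. */
	if (mapping && !mapping_mapped(mapping))
		clear_bit(PG_dcache_clean, &page->flags);	/* defer flush */

	/* After: defer only when this particular page is unmapped. */
	if (mapping && !page_mapped(page))
		clear_bit(PG_dcache_clean, &page->flags);	/* defer flush */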
      
From the attached test code, I find a large performance
improvement (>25%) when the page cache is accessed via read() in
this situation, so memcpy benefits a lot from not flushing the
cache here.
      
Run	read time without the patch	read time with the patch
================================================================
0	22615636 us			22014717 us
1	 4387851 us			 3113184 us
2	 4276535 us			 3005244 us
3	 4259821 us			 3001565 us
4	 4263811 us			 3002748 us
5	 4258486 us			 3004104 us
6	 4253009 us			 3002188 us
7	 4262809 us			 2998196 us
8	 4264525 us			 3007255 us
9	 4267795 us			 3005094 us
      
1) Run 0 reads the file from the storage device; the other runs
read it essentially from the page cache.
2) The file is 512 MB, on ext4 over USB mass storage.
3) The test was run on a Pandaboard.
      
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
	return (tv2->tv_sec - tv1->tv_sec) * 1000000 +
		(tv2->tv_usec - tv1->tv_usec);
}

int main(int argc, char *argv[])
{
	char *mbuf;
	unsigned char *rbuf;
	int fd;
	unsigned long i;
	unsigned long page_size, size;
	struct stat stat;
	struct timeval t1, t2;

	page_size = getpagesize();

	rbuf = malloc(32 * page_size);
	if (!rbuf) {
		printf("\t%s\n", "malloc failed");
		exit(-1);
	}

	fd = open(argv[1], O_RDWR);
	assert(fd >= 0);

	fstat(fd, &stat);
	size = stat.st_size;
	printf("%s: file %s, size %lu, page size %lu\n",
		argv[0], argv[1], size, page_size);

	gettimeofday(&t1, NULL);

	/* Map the file so its page-cache pages become user-mapped. */
	mbuf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mbuf == MAP_FAILED) {
		printf("\t%s\n", "mmap failed");
		exit(-1);
	}

	/* Read the whole file with read(2), 32 pages at a time. */
	for (i = 0; i < size; i += page_size * 32) {
		long rcnt;

		lseek(fd, i, SEEK_SET);
		rcnt = read(fd, rbuf, page_size * 32);
		if (rcnt != (long)(page_size * 32)) {
			printf("%s: read failed\n", __func__);
			exit(-1);
		}
	}
	free(rbuf);
	munmap(mbuf, size);
	gettimeofday(&t2, NULL);
	printf("\tread mmapped time: %luus\n", tv_diff(&t1, &t2));

	close(fd);
	return 0;
}
      
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
5. 04 June 2013, 1 commit
• ARM: mm: Add support for flushing HugeTLB pages. · 0b19f933
Steve Capper authored
      On ARM we use the __flush_dcache_page function to flush the dcache
      of pages when needed; usually when the PG_dcache_clean bit is unset
      and we are setting a PTE.
      
      A HugeTLB page is represented as a compound page consisting of an
      array of pages. Thus to flush the dcache of a HugeTLB page, one must
      flush more than a single page.
      
      This patch modifies __flush_dcache_page such that all constituent
      pages of a HugeTLB page are flushed.
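
Roughly, the lowmem path now covers the whole compound page and
the highmem path flushes each constituent page in turn (a sketch;
the upstream code also distinguishes cache types):

	if (!PageHighMem(page)) {
		size_t sz = PAGE_SIZE << compound_order(page);

		__cpuc_flush_dcache_area(page_address(page), sz);
	} else {
		unsigned long i;

		/* kmap and flush each constituent page of the huge page. */
		for (i = 0; i < (1 << compound_order(page)); i++) {
			void *addr = kmap_atomic(page + i);

			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
			kunmap_atomic(addr);
		}
	}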
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
6. 17 April 2013, 1 commit
7. 09 October 2012, 1 commit
• mm: replace vma prio_tree with an interval tree · 6b2dbba8
Michel Lespinasse authored
      Implement an interval tree as a replacement for the VMA prio_tree.  The
      algorithms are similar to lib/interval_tree.c; however that code can't be
      directly reused as the interval endpoints are not explicitly stored in the
      VMA.  So instead, the common algorithm is moved into a template and the
      details (node type, how to get interval endpoints from the node, etc) are
      filled in using the C preprocessor.
      
      Once the interval tree functions are available, using them as a
      replacement to the VMA prio tree is a relatively simple, mechanical job.
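
For illustration, roughly how the template is instantiated for
VMAs in mm/interval_tree.c (field and helper names indicative of
the real instantiation):

	static inline unsigned long vma_start_pgoff(struct vm_area_struct *v)
	{
		return v->vm_pgoff;
	}

	static inline unsigned long vma_last_pgoff(struct vm_area_struct *v)
	{
		return v->vm_pgoff + ((v->vm_end - v->vm_start) >> PAGE_SHIFT) - 1;
	}

	/* Expands into the vma_interval_tree_{insert,remove,iter_*}()
	 * helpers; the empty argument means "emit non-static functions". */
	INTERVAL_TREE_DEFINE(struct vm_area_struct, shared.linear.rb,
			     unsigned long, shared.linear.rb_subtree_last,
			     vma_start_pgoff, vma_last_pgoff,
			     , vma_interval_tree)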
Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8. 11 August 2012, 1 commit
9. 29 March 2012, 1 commit
10. 27 January 2012, 2 commits
11. 21 May 2011, 1 commit
12. 16 May 2011, 1 commit
13. 20 December 2010, 1 commit
• ARM: get rid of kmap_high_l1_vipt() · 39af22a7
Nicolas Pitre authored
      Since commit 3e4d3af5 "mm: stack based kmap_atomic()", it is no longer
      necessary to carry an ad hoc version of kmap_atomic() added in commit
      7e5a69e8 "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope
      with reentrancy.
      
      In fact, it is now actively wrong to rely on fixed kmap type indices
      (namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a
      concurrent instance of it may reuse any slot for any purpose.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
14. 15 November 2010, 1 commit
15. 05 October 2010, 1 commit
• ARM: 6386/1: flush_ptrace_access: invalidate correct I-cache alias · c4e259c8
Will Deacon authored
copy_to_user_page can be used by access_process_vm to write to an
executable page of a process using a mapping acquired by kmap.
For systems with I-cache aliasing, flushing the I-cache via the
kernel mapping may leave stale data in the I-cache if the user
mapping is of a different colour.
      
      This patch introduces a flush_icache_alias function to flush.c,
      which calls flush_icache_range with a mapping of the specified
      colour. flush_ptrace_access is then modified to call this new
      function instead of coherent_kern_range in the case of an aliasing
      I-cache and a non-aliasing D-cache.
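
A sketch of the new helper, reusing the ALIAS_FLUSH_START/TOP_PTE
alias machinery already used by flush_pfn_alias() in flush.c
(simplified; details may differ):

	static void flush_icache_alias(unsigned long pfn, unsigned long vaddr,
				       unsigned long len)
	{
		unsigned long va = ALIAS_FLUSH_START +
				   (CACHE_COLOUR(vaddr) << PAGE_SHIFT);
		unsigned long offset = vaddr & (PAGE_SIZE - 1);
		unsigned long to;

		/* Map the page at a kernel alias of the user's colour... */
		set_pte_ext(TOP_PTE(va), pfn_pte(pfn, PAGE_KERNEL), 0);
		to = va + offset;
		flush_tlb_kernel_page(va);

		/* ...and invalidate the I-cache through that alias. */
		flush_icache_range(to, to + len);
	}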
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16. 19 September 2010, 4 commits
17. 14 April 2010, 1 commit
• ARM: 6007/1: fix highmem with VIPT cache and DMA · 7e5a69e8
Nicolas Pitre authored
      The VIVT cache of a highmem page is always flushed before the page
      is unmapped.  This cache flush is explicit through flush_cache_kmaps()
      in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in
kunmap_atomic().  There is also an implicit flush of those highmem
pages that were part of a process that just terminated, which makes
those pages free, as the whole VIVT cache has to be flushed on
every task switch. Hence unmapped highmem pages need no cache
maintenance in that case.
      
      However unmapped pages may still be cached with a VIPT cache because the
      cache is tagged with physical addresses.  There is no need for a whole
      cache flush during task switching for that reason, and despite the
      explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(),
      some highmem pages that were mapped in user space end up still cached
      even when they become unmapped.
      
      So, we do have to perform cache maintenance on those unmapped highmem
      pages in the context of DMA when using a VIPT cache.  Unfortunately,
      it is not possible to perform that cache maintenance using physical
      addresses as all the L1 cache maintenance coprocessor functions accept
      virtual addresses only.  Therefore we have no choice but to set up a
      temporary virtual mapping for that purpose.
      
And of course the explicit cache flushing when unmapping a highmem
page on a system with a VIPT cache can now go, which should
increase performance.
      
      While at it, because the code in __flush_dcache_page() has to be modified
      anyway, let's also make sure the mapped highmem pages are pinned with
      kmap_high_get() for the duration of the cache maintenance operation.
      Because kunmap() does unmap highmem pages lazily, it was reported by
      Gary King <GKing@nvidia.com> that those pages ended up being unmapped
      during cache maintenance on SMP causing segmentation faults.
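
The resulting __flush_dcache_page() shape, as a sketch of the
logic described above (simplified):

	if (!PageHighMem(page)) {
		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
	} else {
		/* Pin any existing permanent kmap of the page. */
		void *addr = kmap_high_get(page);

		if (addr) {
			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
			kunmap_high(page);
		} else if (cache_is_vipt()) {
			/* Unmapped VIPT page: temporary virtual mapping. */
			pte_t saved_pte;

			addr = kmap_high_l1_vipt(page, &saved_pte);
			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
			kunmap_high_l1_vipt(page, saved_pte);
		}
	}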
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
18. 14 December 2009, 2 commits
19. 04 December 2009, 5 commits
20. 02 December 2009, 3 commits
21. 30 October 2009, 1 commit
• ARM: Fix errata 411920 workarounds · df71dfd4
Russell King authored
Errata 411920 indicates that any "invalidate entire instruction cache"
operation can fail if the right conditions are present.  This is not
limited to the operations in flush.c; it applies elsewhere too.  Place
the workaround in the already existing __flush_icache_all() function
instead.
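
The shape of the centralised workaround, as a sketch (assuming
the errata-safe invalidate routine is provided by the V6 cache
code; names indicative):

	static inline void __flush_icache_all(void)
	{
	#ifdef CONFIG_ARM_ERRATA_411920
		extern void v6_icache_inval_all(void);

		v6_icache_inval_all();	/* errata-safe invalidate */
	#else
		asm("mcr	p15, 0, %0, c7, c5, 0	@ invalidate I-cache\n"
		    :
		    : "r" (0));
	#endif
	}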
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
22. 24 September 2009, 1 commit
23. 02 September 2009, 1 commit
• ARM: 5687/1: fix an oops with highmem · 13f96d8f
Nicolas Pitre authored
      In xdr_partial_copy_from_skb() there is that sequence:
      
      		kaddr = kmap_atomic(*ppage, KM_SKB_SUNRPC_DATA);
      		[...]
      		flush_dcache_page(*ppage);
      		kunmap_atomic(kaddr, KM_SKB_SUNRPC_DATA);
      
Mixing flush_dcache_page() and kmap_atomic() is a bit odd,
especially since kunmap_atomic() must deal with cache issues
already.  OTOH the non-highmem case must use flush_dcache_page()
as kunmap_atomic() becomes a no-op with no cache maintenance.
      
The problem is that with highmem the implementation of
kmap_atomic() doesn't set page->virtual, so page_address(page)
returns 0 in that case. Here flush_dcache_page() calls
__flush_dcache_page(), which calls
__cpuc_flush_dcache_page(page_address(page)), resulting in a
kernel oops.
      
None of the kmap_atomic() implementations uses set_page_address().
Hence we can assume page_address() is always expected to return 0
in that case. Let's conditionally call __cpuc_flush_dcache_page()
only when the page address is non-zero, and perform that test only
when highmem is configured.
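
The fix, in sketch form (the extra test compiles away when
highmem is not configured):

	void __flush_dcache_page(struct address_space *mapping, struct page *page)
	{
		void *addr = page_address(page);

	#ifdef CONFIG_HIGHMEM
		/*
		 * kmap_atomic() doesn't set page->virtual, and
		 * kunmap_atomic() handles cache maintenance already.
		 */
		if (addr)
	#endif
			__cpuc_flush_dcache_page(addr);
		/* ... */
	}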
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
24. 01 May 2009, 1 commit
25. 16 March 2009, 1 commit
• [ARM] kmap support · d73cd428
Nicolas Pitre authored
      The kmap virtual area borrows a 2MB range at the top of the 16MB area
      below PAGE_OFFSET currently reserved for kernel modules and/or the
      XIP kernel.  This 2MB corresponds to the range covered by 2 consecutive
      second-level page tables, or a single pmd entry as seen by the Linux
      page table abstraction.  Because XIP kernels are unlikely to be seen
      on systems needing highmem support, there shouldn't be any shortage of
      VM space for modules (14 MB for modules is still way more than twice the
      typical usage).
      
      Because the virtual mapping of highmem pages can go away at any moment
      after kunmap() is called on them, we need to bypass the delayed cache
      flushing provided by flush_dcache_page() in that case.
      
      The atomic kmap versions are based on fixmaps, and
      __cpuc_flush_dcache_page() is used directly in that case.
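
A sketch of the delayed-flush bypass in flush_dcache_page(),
assuming the PG_dcache_dirty deferral in use at the time (the
exact upstream test may differ):

	if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
		set_bit(PG_dcache_dirty, &page->flags);	/* safe to defer */
	else
		__flush_dcache_page(mapping, page);	/* flush now */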
Signed-off-by: Nicolas Pitre <nico@marvell.com>
26. 01 September 2008, 1 commit
27. 03 July 2008, 1 commit
28. 09 January 2007, 1 commit
29. 13 December 2006, 1 commit
• [ARM] Unuse another Linux PTE bit · ad1ae2fe
Russell King authored
      L_PTE_ASID is not really required to be stored in every PTE, since we
      can identify it via the address passed to set_pte_at().  So, create
      set_pte_ext() which takes the address of the PTE to set, the Linux
      PTE value, and the additional CPU PTE bits which aren't encoded in
      the Linux PTE value.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>