1. 17 11月, 2011 1 次提交
  2. 31 10月, 2011 1 次提交
  3. 26 7月, 2011 1 次提交
  4. 09 7月, 2011 1 次提交
    • B
      mm/nommu.c: fix remap_pfn_range() · 8f3b1327
      Bob Liu 提交于
      remap_pfn_range() means map physical address pfn<<PAGE_SHIFT to user addr.
      
      For nommu arch it's implemented by vma->vm_start = pfn << PAGE_SHIFT which
      is wrong acroding the original meaning of this function.  And some driver
      developer using remap_pfn_range() with correct parameter will get
      unexpected result because vm_start is changed.  It should be implementd
      like addr = pfn << PAGE_SHIFT but which is meanless on nommu arch, this
      patch just make it simply return.
      
      Parameter name and setting of vma->vm_flags also be fixed.
      Signed-off-by: NBob Liu <lliubbo@gmail.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: David Howells <dhowells@redhat.com>
      Acked-by: NGreg Ungerer <gerg@uclinux.org>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Bob Liu <lliubbo@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f3b1327
  5. 23 6月, 2011 1 次提交
    • T
      ptrace: kill trivial tracehooks · a288eecc
      Tejun Heo 提交于
      At this point, tracehooks aren't useful to mainline kernel and mostly
      just add an extra layer of obfuscation.  Although they have comments,
      without actual in-kernel users, it is difficult to tell what are their
      assumptions and they're actually trying to achieve.  To mainline
      kernel, they just aren't worth keeping around.
      
      This patch kills the following trivial tracehooks.
      
      * Ones testing whether task is ptraced.  Replace with ->ptrace test.
      
      	tracehook_expect_breakpoints()
      	tracehook_consider_ignored_signal()
      	tracehook_consider_fatal_signal()
      
      * ptrace_event() wrappers.  Call directly.
      
      	tracehook_report_exec()
      	tracehook_report_exit()
      	tracehook_report_vfork_done()
      
      * ptrace_release_task() wrapper.  Call directly.
      
      	tracehook_finish_release_task()
      
      * noop
      
      	tracehook_prepare_release_task()
      	tracehook_report_death()
      
      This doesn't introduce any behavior change.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      a288eecc
  6. 25 5月, 2011 8 次提交
  7. 29 3月, 2011 1 次提交
  8. 24 3月, 2011 1 次提交
  9. 10 3月, 2011 1 次提交
  10. 14 1月, 2011 1 次提交
    • M
      mlock: do not hold mmap_sem for extended periods of time · 53a7706d
      Michel Lespinasse 提交于
      __get_user_pages gets a new 'nonblocking' parameter to signal that the
      caller is prepared to re-acquire mmap_sem and retry the operation if
      needed.  This is used to split off long operations if they are going to
      block on a disk transfer, or when we detect contention on the mmap_sem.
      
      [akpm@linux-foundation.org: remove ref to rwsem_is_contended()]
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      53a7706d
  11. 24 12月, 2010 2 次提交
  12. 25 11月, 2010 1 次提交
  13. 30 10月, 2010 1 次提交
    • A
      audit mmap · 120a795d
      Al Viro 提交于
      Normal syscall audit doesn't catch 5th argument of syscall.  It also
      doesn't catch the contents of userland structures pointed to be
      syscall argument, so for both old and new mmap(2) ABI it doesn't
      record the descriptor we are mapping.  For old one it also misses
      flags.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      120a795d
  14. 27 10月, 2010 1 次提交
  15. 21 8月, 2010 1 次提交
  16. 14 8月, 2010 1 次提交
  17. 26 5月, 2010 1 次提交
  18. 26 3月, 2010 2 次提交
  19. 25 3月, 2010 1 次提交
  20. 13 3月, 2010 1 次提交
  21. 07 3月, 2010 2 次提交
    • S
      nommu: get_user_pages(): pin last page on non-page-aligned start · c08c6e1f
      Steven J. Magnani 提交于
      The noMMU version of get_user_pages() fails to pin the last page when the
      start address isn't page-aligned.  The patch fixes this in a way that
      makes find_extend_vma() congruent to its MMU cousin.
      Signed-off-by: NSteven J. Magnani <steve@digidescorp.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c08c6e1f
    • R
      mm: change anon_vma linking to fix multi-process server scalability issue · 5beb4930
      Rik van Riel 提交于
      The old anon_vma code can lead to scalability issues with heavily forking
      workloads.  Specifically, each anon_vma will be shared between the parent
      process and all its child processes.
      
      In a workload with 1000 child processes and a VMA with 1000 anonymous
      pages per process that get COWed, this leads to a system with a million
      anonymous pages in the same anon_vma, each of which is mapped in just one
      of the 1000 processes.  However, the current rmap code needs to walk them
      all, leading to O(N) scanning complexity for each page.
      
      This can result in systems where one CPU is walking the page tables of
      1000 processes in page_referenced_one, while all other CPUs are stuck on
      the anon_vma lock.  This leads to catastrophic failure for a benchmark
      like AIM7, where the total number of processes can reach in the tens of
      thousands.  Real workloads are still a factor 10 less process intensive
      than AIM7, but they are catching up.
      
      This patch changes the way anon_vmas and VMAs are linked, which allows us
      to associate multiple anon_vmas with a VMA.  At fork time, each child
      process gets its own anon_vmas, in which its COWed pages will be
      instantiated.  The parents' anon_vma is also linked to the VMA, because
      non-COWed pages could be present in any of the children.
      
      This reduces rmap scanning complexity to O(1) for the pages of the 1000
      child processes, with O(N) complexity for at most 1/N pages in the system.
       This reduces the average scanning cost in heavily forking workloads from
      O(N) to 2.
      
      The only real complexity in this patch stems from the fact that linking a
      VMA to anon_vmas now involves memory allocations.  This means vma_adjust
      can fail, if it needs to attach a VMA to anon_vma structures.  This in
      turn means error handling needs to be added to the calling functions.
      
      A second source of complexity is that, because there can be multiple
      anon_vmas, the anon_vma linking in vma_adjust can no longer be done under
      "the" anon_vma lock.  To prevent the rmap code from walking up an
      incomplete VMA, this patch introduces the VM_LOCK_RMAP VMA flag.  This bit
      flag uses the same slot as the NOMMU VM_MAPPED_COPY, with an ifdef in mm.h
      to make sure it is impossible to compile a kernel that needs both symbolic
      values for the same bitflag.
      
      Some test results:
      
      Without the anon_vma changes, when AIM7 hits around 9.7k users (on a test
      box with 16GB RAM and not quite enough IO), the system ends up running
      >99% in system time, with every CPU on the same anon_vma lock in the
      pageout code.
      
      With these changes, AIM7 hits the cross-over point around 29.7k users.
      This happens with ~99% IO wait time, there never seems to be any spike in
      system time.  The anon_vma lock contention appears to be resolved.
      
      [akpm@linux-foundation.org: cleanups]
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5beb4930
  22. 17 1月, 2010 4 次提交
  23. 07 1月, 2010 2 次提交
    • J
      NOMMU: Use copy_*_user_page() in access_process_vm() · 7959722b
      Jie Zhang 提交于
      The MMU code uses the copy_*_user_page() variants in access_process_vm()
      rather than copy_*_user() as the former includes an icache flush.  This
      is important when doing things like setting software breakpoints with
      gdb.  So switch the NOMMU code over to do the same.
      
      This patch makes the reasonable assumption that copy_from_user_page()
      won't fail - which is probably fine, as we've checked the VMA from which
      we're copying is usable, and the copy is not allowed to cross VMAs.  The
      one case where it might go wrong is if the VMA is a device rather than
      RAM, and that device returns an error which - in which case rubbish will
      be returned rather than EIO.
      Signed-off-by: NJie Zhang <jie.zhang@analog.com>
      Signed-off-by: NMike Frysinger <vapier@gentoo.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NDavid McCullough <david_mccullough@mcafee.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Acked-by: NGreg Ungerer <gerg@uclinux.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7959722b
    • M
      NOMMU: Avoiding duplicate icache flushes of shared maps · cfe79c00
      Mike Frysinger 提交于
      When working with FDPIC, there are many shared mappings of read-only
      code regions between applications (the C library, applet packages like
      busybox, etc.), but the current do_mmap_pgoff() function will issue an
      icache flush whenever a VMA is added to an MM instead of only doing it
      when the map is initially created.
      
      The flush can instead be done when a region is first mmapped PROT_EXEC.
      Note that we may not rely on the first mapping of a region being
      executable - it's possible for it to be PROT_READ only, so we have to
      remember whether we've flushed the region or not, and then flush the
      entire region when a bit of it is made executable.
      
      However, this also affects the brk area.  That will no longer be
      executable.  We can mprotect() it to PROT_EXEC on MPU-mode kernels, but
      for NOMMU mode kernels, when it increases the brk allocation, making
      sys_brk() flush the extra from the icache should suffice.  The brk area
      probably isn't used by NOMMU programs since the brk area can only use up
      the leavings from the stack allocation, where the stack allocation is
      larger than requested.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NMike Frysinger <vapier@gentoo.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cfe79c00
  24. 31 12月, 2009 1 次提交
  25. 16 12月, 2009 1 次提交
  26. 01 11月, 2009 1 次提交