1. 10 October 2014, 1 commit
    • mm: introduce a general RCU get_user_pages_fast() · 2667f50e
      Authored by Steve Capper
      This series implements general forms of get_user_pages_fast and
      __get_user_pages_fast in core code and activates them for arm and arm64.
      
      These are required for Transparent HugePages to function correctly, as a
      futex on a THP tail will otherwise result in an infinite loop (due to the
      core implementation of __get_user_pages_fast always returning 0).
      
      Unfortunately, a futex on THP tail can be quite common for certain
      workloads; thus THP is unreliable without a __get_user_pages_fast
      implementation.
      
      This series may also be beneficial for direct-IO heavy workloads and
      certain KVM workloads.
      
      This patch (of 6):
      
      get_user_pages_fast() attempts to pin user pages by walking the page
      tables directly and avoids taking locks.  Thus the walker needs to be
      protected from page table pages being freed from under it, and needs to
      block any THP splits.
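
      As a caller-side illustration only (a minimal sketch, not taken from
      this patch), pinning part of a user buffer with get_user_pages_fast()
      and dropping the page references afterwards looks roughly like this:

        #include <linux/mm.h>

        /* Illustrative only: pin up to 16 pages of a user buffer for
         * writing, use them (e.g. as a direct-IO target), then release
         * the page references. */
        static int pin_user_buffer_example(unsigned long user_addr)
        {
                struct page *pages[16];
                int i, nr;

                nr = get_user_pages_fast(user_addr, 16, 1 /* write */, pages);
                if (nr <= 0)
                        return nr ? nr : -EFAULT;

                /* ... access the pinned pages ... */

                for (i = 0; i < nr; i++)
                        put_page(pages[i]);
                return 0;
        }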
      
      One way to achieve this is to have the walker disable interrupts, and
      rely on the TLB flushing code broadcasting IPIs and waiting for them to
      complete before the page table pages are freed; a CPU walking with
      interrupts disabled cannot acknowledge the IPI, so the free is held off
      until the walk finishes.
      
      On some platforms we have hardware broadcast of TLB invalidations, thus
      the TLB flushing code doesn't necessarily need to broadcast IPIs; and
      spuriously broadcasting IPIs can hurt system performance if done too
      often.
      
      This problem has been solved on PowerPC and Sparc by batching up page
      table pages belonging to more than one mm_user, then scheduling an
      rcu_sched callback to free the pages.  This RCU page table free logic has
      been promoted to core code and is activated when one enables
      HAVE_RCU_TABLE_FREE.  Unfortunately, these architectures implement their
      own get_user_pages_fast routines.
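
      For illustration, the essence of this approach is to defer the actual
      free of a page table page through an rcu_sched callback, so a walker
      that keeps interrupts disabled on its CPU (and therefore holds up the
      rcu_sched grace period) cannot have the page freed underneath it.  A
      stripped-down conceptual sketch, not the kernel's actual
      tlb_remove_table() machinery, using the call_rcu_sched() API as it
      existed at the time:

        #include <linux/gfp.h>
        #include <linux/kernel.h>
        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        /* Conceptual sketch only: free a page table page after an
         * rcu_sched grace period, i.e. once every CPU has passed through
         * a point where interrupts (and preemption) were enabled. */
        struct deferred_table {
                struct rcu_head rcu;
                unsigned long table;
        };

        static void free_table_rcu(struct rcu_head *head)
        {
                struct deferred_table *d =
                        container_of(head, struct deferred_table, rcu);

                free_page(d->table);
                kfree(d);
        }

        static void defer_table_free(unsigned long table)
        {
                struct deferred_table *d = kmalloc(sizeof(*d), GFP_ATOMIC);

                if (!d)
                        return; /* real code falls back to an IPI broadcast */
                d->table = table;
                call_rcu_sched(&d->rcu, free_table_rcu);
        }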
      
      The RCU page table free logic, coupled with an IPI broadcast on THP
      split (which is a rare event), allows one to protect a page table
      walker by merely disabling interrupts during the walk.
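
      With that in place, the fast walker only needs to bracket its lockless
      walk with interrupt disabling.  A simplified sketch of the resulting
      __get_user_pages_fast() shape (abridged and illustrative; helpers such
      as gup_pud_range() follow the generic implementation and their lower
      levels are not shown):

        int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
                                  struct page **pages)
        {
                struct mm_struct *mm = current->mm;
                unsigned long addr = start & PAGE_MASK;
                unsigned long end = addr + ((unsigned long)nr_pages << PAGE_SHIFT);
                unsigned long next, flags;
                pgd_t *pgdp;
                int nr = 0;

                /*
                 * Disabling interrupts blocks the IPI that a THP split (or
                 * an IPI-based TLB flush) waits on, and also holds off the
                 * rcu_sched callback that frees batched page table pages.
                 */
                local_irq_save(flags);
                pgdp = pgd_offset(mm, addr);
                do {
                        next = pgd_addr_end(addr, end);
                        if (pgd_none(*pgdp))
                                break;
                        if (!gup_pud_range(pgdp, addr, next, write, pages, &nr))
                                break;
                } while (pgdp++, addr = next, addr != end);
                local_irq_restore(flags);

                return nr;
        }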
      
      This patch provides a general RCU implementation of get_user_pages_fast
      that can be used by architectures that perform hardware broadcast of TLB
      invalidations.
      
      It is based heavily on the PowerPC implementation by Nick Piggin.
      
      [akpm@linux-foundation.org: various comment fixes]
      Signed-off-by: Steve Capper <steve.capper@linaro.org>
      Tested-by: Dann Frazier <dann.frazier@canonical.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 24 September 2014, 1 commit
    • kvm: Faults which trigger IO release the mmap_sem · 234b239b
      Authored by Andres Lagar-Cavilla
      When KVM handles a tdp fault it uses FOLL_NOWAIT. If the guest memory
      has been swapped out or is behind a filemap, this will trigger async
      readahead and return immediately. The rationale is that KVM will kick
      back the guest with an "async page fault" and allow for some other
      guest process to take over.
      
      If async PFs are enabled, the fault is retried as soon as possible from
      an async workqueue; if not, it is retried immediately in the same code
      path. In either case the retry does not relinquish the mmap semaphore
      and blocks on the IO. This is a bad thing, as other mmap semaphore
      users now stall as a function of swap or filemap latency.
      
      This patch ensures that both the regular and the async PF paths
      re-enter the fault, allowing the mmap semaphore to be relinquished
      while waiting on the IO.
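
      The shape of the fix, as a simplified sketch rather than the patch's
      verbatim helper (the helper name below is illustrative): a first
      __get_user_pages() attempt is allowed to drop mmap_sem and wait for
      the IO with the semaphore released; if that happened, the semaphore is
      retaken and the fault retried with FOLL_TRIED so that it does not
      schedule yet another asynchronous retry.

        #include <linux/mm.h>
        #include <linux/sched.h>

        /* Illustrative helper: pin one page, letting the fault path wait
         * for IO with mmap_sem released instead of blocking under it. */
        static long get_user_page_allow_io_wait(struct task_struct *tsk,
                                                struct mm_struct *mm,
                                                unsigned long addr, bool write,
                                                struct page **pagep)
        {
                int flags = FOLL_TOUCH | FOLL_GET | (write ? FOLL_WRITE : 0);
                int locked = 1;
                long npages;

                down_read(&mm->mmap_sem);
                /*
                 * If the page needs IO, the fault path drops mmap_sem,
                 * waits for the IO, and comes back with locked == 0 and
                 * no page pinned.
                 */
                npages = __get_user_pages(tsk, mm, addr, 1, flags, pagep,
                                          NULL, &locked);
                if (!locked) {
                        /*
                         * Retake mmap_sem and retry; FOLL_TRIED tells the
                         * fault path not to kick off another async retry.
                         */
                        down_read(&mm->mmap_sem);
                        npages = __get_user_pages(tsk, mm, addr, 1,
                                                  flags | FOLL_TRIED, pagep,
                                                  NULL, NULL);
                }
                up_read(&mm->mmap_sem);
                return npages;
        }
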
      Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Andres Lagar-Cavilla <andreslc@google.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 07 August 2014, 1 commit
  4. 05 June 2014, 5 commits