1. 23 10月, 2008 1 次提交
  2. 27 7月, 2008 1 次提交
    • N
      x86: lockless get_user_pages_fast() · 8174c430
      Nick Piggin 提交于
      Implement get_user_pages_fast without locking in the fastpath on x86.
      
      Do an optimistic lockless pagetable walk, without taking mmap_sem or any
      page table locks or even mmap_sem.  Page table existence is guaranteed by
      turning interrupts off (combined with the fact that we're always looking
      up the current mm, means we can do the lockless page table walk within the
      constraints of the TLB shootdown design).  Basically we can do this
      lockless pagetable walk in a similar manner to the way the CPU's pagetable
      walker does not have to take any locks to find present ptes.
      
      This patch (combined with the subsequent ones to convert direct IO to use
      it) was found to give about 10% performance improvement on a 2 socket 8
      core Intel Xeon system running an OLTP workload on DB2 v9.5
      
       "To test the effects of the patch, an OLTP workload was run on an IBM
        x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
        2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel.  Comparing
        runs with and without the patch resulted in an overall performance
        benefit of ~9.8%.  Correspondingly, oprofiles showed that samples from
        __up_read and __down_read routines that is seen during thread contention
        for system resources was reduced from 2.8% down to .05%.  Monitoring the
        /proc/vmstat output from the patched run showed that the counter for
        fast_gup contained a very high number while the fast_gup_slow value was
        zero."
      
      (fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
      counter we had for the number of times the slowpath was invoked).
      
      The main reason for the improvement is that DB2 has multiple threads each
      issuing direct-IO.  Direct-IO uses get_user_pages, and thus the threads
      contend the mmap_sem cacheline, and can also contend on page table locks.
      
      I would anticipate larger performance gains on larger systems, however I
      think DB2 uses an adaptive mix of threads and processes, so it could be
      that thread contention remains pretty constant as machine size increases.
      In which case, we stuck with "only" a 10% gain.
      
      The downside of using get_user_pages_fast is that if there is not a pte
      with the correct permissions for the access, we end up falling back to
      get_user_pages and so the get_user_pages_fast is a bit of extra work.
      However this should not be the common case in most performance critical
      code.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: Kconfig fix]
      [akpm@linux-foundation.org: Makefile fix/cleanup]
      [akpm@linux-foundation.org: warning fix]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Reviewed-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8174c430
  3. 23 7月, 2008 1 次提交
    • V
      x86: consolidate header guards · 77ef50a5
      Vegard Nossum 提交于
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers are turned into single
         underscores.
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      77ef50a5
  4. 09 7月, 2008 9 次提交
  5. 11 10月, 2007 1 次提交