1. 25 1月, 2012 1 次提交
    • D
      x86: xen: size struct xen_spinlock to always fit in arch_spinlock_t · 7a7546b3
      David Vrabel 提交于
      If NR_CPUS < 256 then arch_spinlock_t is only 16 bits wide but struct
      xen_spinlock is 32 bits.  When a spin lock is contended and
      xl->spinners is modified the two bytes immediately after the spin lock
      would be corrupted.
      
      This is a regression caused by 84eb950d
      (x86, ticketlock: Clean up types and accessors) which reduced the size
      of arch_spinlock_t.
      
      Fix this by making xl->spinners a u8 if NR_CPUS < 256.  A
      BUILD_BUG_ON() is also added to check the sizes of the two structures
      are compatible.
      
      In many cases this was not noticable as there would often be padding
      bytes after the lock (e.g., if any of CONFIG_GENERIC_LOCKBREAK,
      CONFIG_DEBUG_SPINLOCK, or CONFIG_DEBUG_LOCK_ALLOC were enabled).
      
      The bnx2 driver is affected. In struct bnx2, phy_lock and
      indirect_lock may have no padding after them.  Contention on phy_lock
      would corrupt indirect_lock making it appear locked and the driver
      would deadlock.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy@goop.org>
      Acked-by: NIan Campbell <ian.campbell@citrix.com>
      CC: stable@kernel.org #only 3.2
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      7a7546b3
  2. 10 1月, 2012 1 次提交
  3. 22 11月, 2011 1 次提交
  4. 20 11月, 2011 1 次提交
  5. 17 11月, 2011 5 次提交
  6. 14 11月, 2011 1 次提交
  7. 12 11月, 2011 3 次提交
  8. 11 11月, 2011 1 次提交
  9. 10 11月, 2011 5 次提交
  10. 08 11月, 2011 1 次提交
  11. 07 11月, 2011 1 次提交
  12. 03 11月, 2011 2 次提交
    • A
      thp: share get_huge_page_tail() · b35a35b5
      Andrea Arcangeli 提交于
      This avoids duplicating the function in every arch gup_fast.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b35a35b5
    • A
      mm: thp: tail page refcounting fix · 70b50f94
      Andrea Arcangeli 提交于
      Michel while working on the working set estimation code, noticed that
      calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
      wasn't safe, if the pfn ended up being a tail page of a transparent
      hugepage under splitting by __split_huge_page_refcount().
      
      He then found the problem could also theoretically materialize with
      page_cache_get_speculative() during the speculative radix tree lookups
      that uses get_page_unless_zero() in SMP if the radix tree page is freed
      and reallocated and get_user_pages is called on it before
      page_cache_get_speculative has a chance to call get_page_unless_zero().
      
      So the best way to fix the problem is to keep page_tail->_count zero at
      all times.  This will guarantee that get_page_unless_zero() can never
      succeed on any tail page.  page_tail->_mapcount is guaranteed zero and
      is unused for all tail pages of a compound page, so we can simply
      account the tail page references there and transfer them to
      tail_page->_count in __split_huge_page_refcount() (in addition to the
      head_page->_mapcount).
      
      While debugging this s/_count/_mapcount/ change I also noticed get_page is
      called by direct-io.c on pages returned by get_user_pages.  That wasn't
      entirely safe because the two atomic_inc in get_page weren't atomic.  As
      opposed to other get_user_page users like secondary-MMU page fault to
      establish the shadow pagetables would never call any superflous get_page
      after get_user_page returns.  It's safer to make get_page universally safe
      for tail pages and to use get_page_foll() within follow_page (inside
      get_user_pages()).  get_page_foll() is safe to do the refcounting for tail
      pages without taking any locks because it is run within PT lock protected
      critical sections (PT lock for pte and page_table_lock for
      pmd_trans_huge).
      
      The standard get_page() as invoked by direct-io instead will now take
      the compound_lock but still only for tail pages.  The direct-io paths
      are usually I/O bound and the compound_lock is per THP so very
      finegrined, so there's no risk of scalability issues with it.  A simple
      direct-io benchmarks with all lockdep prove locking and spinlock
      debugging infrastructure enabled shows identical performance and no
      overhead.  So it's worth it.  Ideally direct-io should stop calling
      get_page() on pages returned by get_user_pages().  The spinlock in
      get_page() is already optimized away for no-THP builds but doing
      get_page() on tail pages returned by GUP is generally a rare operation
      and usually only run in I/O paths.
      
      This new refcounting on page_tail->_mapcount in addition to avoiding new
      RCU critical sections will also allow the working set estimation code to
      work without any further complexity associated to the tail page
      refcounting with THP.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: NMichel Lespinasse <walken@google.com>
      Reviewed-by: NMichel Lespinasse <walken@google.com>
      Reviewed-by: NMinchan Kim <minchan.kim@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: <stable@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      70b50f94
  13. 02 11月, 2011 17 次提交