1. 08 Oct, 2010 7 commits
  2. 07 Oct, 2010 2 commits
  3. 05 Oct, 2010 2 commits
    • ksm: fix bad user data when swapping · 4e31635c
      Committed by Hugh Dickins
      Building under memory pressure, with KSM on 2.6.36-rc5, collapsed with
      an internal compiler error: typically indicating an error in swapping.
      
      Perhaps there's a timing issue that now makes it more likely, or perhaps
      it's simply that I hadn't tested this hard for this long in a while:
      either way, this bug goes back to KSM swapping in 2.6.33.
      
      Notice how reuse_swap_page() allows an exclusive page to be reused, but
      only does SetPageDirty if it can delete it from swap cache right then -
      if it's currently under Writeback, it has to be left in cache and we
      don't SetPageDirty, but the page can be reused.  Fine, the dirty bit
      will get set in the pte; but notice how zap_pte_range() does not bother
      to transfer pte_dirty to page_dirty when unmapping a PageAnon.
      
      If KSM chooses to share such a page, it will look like a clean copy of
      swapcache, and not be written out to swap when its memory is needed;
      stale data is then read back from swap when the page is needed again.
      
      We could fix this in reuse_swap_page() (or even refuse to reuse a
      page under writeback), but it's more honest to fix my oversight in
      KSM's write_protect_page(); a hedged sketch of that fix follows this
      entry.  Several days of testing on three machines confirms that this
      fixes the issue they showed.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e31635c
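      The heart of the fix, as a hedged sketch rather than the exact upstream
      diff: while write_protect_page() in mm/ksm.c cleans and write-protects a
      candidate pte, any dirty bit carried by that pte must first be
      transferred to the struct page, so the swapcache copy cannot masquerade
      as clean.  The helpers and the mm/vma/addr/ptep/page variables are
      assumed from the surrounding function; the mapcount checks are elided.

          /* Sketch: inside write_protect_page(), pte already located and mapped. */
          if (pte_write(*ptep) || pte_dirty(*ptep)) {
                  pte_t entry;

                  flush_cache_page(vma, addr, page_to_pfn(page));
                  entry = ptep_clear_flush(vma, addr, ptep);
                  /*
                   * The crucial addition: carry the pte's dirty bit over to
                   * the page before cleaning the pte, so reclaim writes the
                   * page out to swap instead of treating it as clean.
                   */
                  if (pte_dirty(entry))
                          set_page_dirty(page);
                  entry = pte_mkclean(pte_wrprotect(entry));
                  set_pte_at_notify(mm, addr, ptep, entry);
          }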
    • ksm: fix page_address_in_vma anon_vma oops · 4829b906
      Committed by Hugh Dickins
      2.6.36-rc1 commit 21d0d443 "rmap:
      resurrect page_address_in_vma anon_vma check" was right to resurrect
      that check; but now that it's comparing anon_vma->roots instead of
      just anon_vmas, there's a danger of oopsing on a NULL anon_vma.
      
      In most cases no NULL anon_vma ever gets here; but it turns out that
      occasionally KSM, when enabled on a forked or forking process, will
      itself call page_address_in_vma() on a "half-KSM" page left over from
      an earlier failed attempt to merge - whose page_anon_vma() is NULL.
      
      It's my bug that such pages get here at all: I thought they were
      already dealt with, but this oops proves me wrong, and I'll fix it in
      the next release - such pages are effectively pinned until their
      process exits, since rmap cannot find their ptes (though swapoff can).
      
      For now just work around it by making page_address_in_vma() safe (and
      add a comment on why that check is wanted anyway); a hedged sketch of
      that workaround follows this entry.  A similar check in
      __page_check_anon_rmap() is safe because do_page_add_anon_rmap()
      already excluded KSM pages.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4829b906
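      A hedged sketch of the workaround in page_address_in_vma() (mm/rmap.c),
      assuming the 2.6.36-era helpers page_anon_vma() and vma_address(): bail
      out whenever either anon_vma is NULL instead of blindly dereferencing
      ->root.  Only the anon branch of the function is shown.

          /* Sketch: the PageAnon branch of page_address_in_vma(). */
          if (PageAnon(page)) {
                  struct anon_vma *page__anon_vma = page_anon_vma(page);
                  /*
                   * A "half-KSM" leftover can have a NULL page_anon_vma(),
                   * and a not-yet-faulted vma a NULL vma->anon_vma: check
                   * both before comparing roots, so we never oops here.
                   */
                  if (!vma->anon_vma || !page__anon_vma ||
                      vma->anon_vma->root != page__anon_vma->root)
                          return -EFAULT;
          }
          return vma_address(page, vma);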
  4. 26 Sep, 2010 1 commit
  5. 25 Sep, 2010 1 commit
  6. 24 Sep, 2010 4 commits
  7. 23 Sep, 2010 4 commits
  8. 22 Sep, 2010 1 commit
  9. 21 Sep, 2010 2 commits
    • percpu: fix pcpu_last_unit_cpu · 46b30ea9
      Committed by Tejun Heo
      pcpu_first/last_unit_cpu are used to track which cpu has the first and
      last units assigned.  This in turn is used to determine the span of a
      chunk for map/unmap cache flushes and whether an address belongs to
      the first chunk or not in per_cpu_ptr_to_phys().
      
      When the number of possible CPUs isn't a power of two, a chunk may
      contain unassigned units towards the end of the chunk.  The logic that
      determines pcpu_last_unit_cpu was incorrect when there was an unused
      unit at the end of a chunk: it failed to ignore the unused unit and
      assigned the unused marker NR_CPUS to pcpu_last_unit_cpu (a hedged
      sketch of the corrected loop follows this entry).
      
      This was discovered by CAI Qian through a kdump failure caused by a
      malfunctioning per_cpu_ptr_to_phys() on a kvm setup with 50 possible
      CPUs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: CAI Qian <caiqian@redhat.com>
      Cc: stable@kernel.org
      46b30ea9
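      A hedged sketch of the corrected accounting in pcpu_setup_first_chunk()
      (mm/percpu.c), assuming the group-info names used there (gi, cpu_map,
      unit_map, unit_off, ai): pcpu_last_unit_cpu is only updated from units
      that are actually assigned, i.e. after the NR_CPUS "unused" entries
      have been skipped, rather than from whatever cpu the loop ended on.

          /* Sketch: walking one group's cpu_map while mapping units to cpus. */
          for (i = 0; i < gi->nr_units; i++) {
                  unsigned int cpu = gi->cpu_map[i];

                  if (cpu == NR_CPUS)     /* unused unit at the end of the chunk */
                          continue;

                  unit_map[cpu] = unit + i;
                  unit_off[cpu] = gi->base_offset + i * ai->unit_size;

                  if (pcpu_first_unit_cpu == NR_CPUS)
                          pcpu_first_unit_cpu = cpu;
                  pcpu_last_unit_cpu = cpu;   /* only assigned units land here */
          }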
    • mm: further fix swapin race condition · 31c4a3d3
      Committed by Hugh Dickins
      Commit 4969c119 ("mm: fix swapin race condition") is now agreed to
      be incomplete.  There's a race, not very much less likely than the
      original race envisaged, in which it is further necessary to check that
      the swapcache page's swap has not changed.
      
      Here's the reasoning: cast in terms of reuse_swap_page(), but probably
      could be reformulated to rely on try_to_free_swap() instead, or on
      swapoff+swapon.
      
      A, faults into do_swap_page(): does page1 = lookup_swap_cache(swap1) and
      comes through the lock_page(page1).
      
      B, a racing thread of the same process, faults on the same address: does
      page1 = lookup_swap_cache(swap1) and now waits in lock_page(page1), but
      for whatever reason is unlucky not to get the lock any time soon.
      
      A carries on through do_swap_page(), a write fault, but cannot reuse the
      swap page1 (another reference to swap1).  Unlocks the page1 (but B
      doesn't get it yet), does COW in do_wp_page(), page2 now in that pte.
      
      C, perhaps the parent of A+B, comes in and write faults the same swap
      page1 into its mm, reuse_swap_page() succeeds this time, swap1 is freed.
      
      kswapd comes in after some time (B still unlucky) and swaps out some
      pages from A+B and C: it allocates the original swap1 to page2 in A+B,
      and some other swap2 to the original page1 now in C.  But does not
      immediately free page1 (actually it couldn't: B holds a reference),
      leaving it in swap cache for now.
      
      B at last gets the lock on page1, hooray! Is PageSwapCache(page1)? Yes.
      Is pte_same(*page_table, orig_pte)? Yes, because page2 has now been
      given the swap1 which page1 used to have.  So B proceeds to insert page1
      into A+B's page_table, though its content now belongs to C, quite
      different from what A wrote there.
      
      B ought to have checked that page1's swap was still swap1; a hedged
      sketch of that check follows this entry.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      31c4a3d3
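      The closing of the race, as a hedged sketch of the check in
      do_swap_page() (mm/memory.c) right after the page lock is finally
      obtained: besides PageSwapCache and the later pte_same test, B must
      verify that the page still belongs to the swap entry it looked up.

          /* Sketch: in do_swap_page(), immediately after lock_page(page). */
          /*
           * Make sure try_to_free_swap or reuse_swap_page or swapoff did not
           * release the swapcache from under us.  The page pin and pte_same
           * test below are not enough to exclude that.  Even if it is still
           * swapcache, we need to check that the page's swap has not changed.
           */
          if (unlikely(!PageSwapCache(page) || page_private(page) != entry.val))
                  goto out_page;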
  10. 10 Sep, 2010 14 commits
  11. 29 Aug, 2010 1 commit
    • mm: fix hang on anon_vma->root->lock · f1819427
      Committed by Hugh Dickins
      After several hours, kbuild tests hang with anon_vma_prepare() spinning on
      a newly allocated anon_vma's lock - on a box with CONFIG_TREE_PREEMPT_RCU=y
      (which makes this very much more likely, but it could happen without).
      
      The ever-subtle page_lock_anon_vma() now needs a further twist: since
      anon_vma_prepare() and anon_vma_fork() are liable to change the ->root
      of a reused anon_vma structure at any moment, page_lock_anon_vma()
      needs to check page_mapped() again before succeeding, otherwise
      page_unlock_anon_vma() might address a different root->lock.  A hedged
      sketch follows this entry.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f1819427
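      A hedged sketch of the extra twist in page_lock_anon_vma() (mm/rmap.c):
      take the root lock through whatever anon_vma the page currently points
      at, then re-check page_mapped(); if the page has been unmapped in the
      meantime, the anon_vma's ->root may already have been redirected, so
      back out rather than hand back a lock that page_unlock_anon_vma() could
      release through a different root.  Helper names are the 2.6.36-era ones.

          /* Sketch of page_lock_anon_vma(); rcu_read_lock() stays held on the
           * success path and is dropped later by page_unlock_anon_vma(). */
          struct anon_vma *page_lock_anon_vma(struct page *page)
          {
                  struct anon_vma *anon_vma;
                  unsigned long anon_mapping;

                  rcu_read_lock();
                  anon_mapping = (unsigned long)ACCESS_ONCE(page->mapping);
                  if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
                          goto out;
                  if (!page_mapped(page))
                          goto out;

                  anon_vma = (struct anon_vma *)(anon_mapping - PAGE_MAPPING_ANON);
                  anon_vma_lock(anon_vma);        /* takes anon_vma->root->lock */

                  /*
                   * If the page is still mapped, its anon_vma cannot have been
                   * freed or had its ->root switched, so the lock we hold is
                   * the one page_unlock_anon_vma() will drop.  Otherwise bail.
                   */
                  if (page_mapped(page))
                          return anon_vma;

                  anon_vma_unlock(anon_vma);
          out:
                  rcu_read_unlock();
                  return NULL;
          }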
  12. 27 Aug, 2010 1 commit