1. 23 6月, 2006 2 次提交
    • C
      [PATCH] Swapless page migration: add R/W migration entries · 0697212a
      Christoph Lameter 提交于
      Implement read/write migration ptes
      
      We take the upper two swapfiles for the two types of migration ptes and define
      a series of macros in swapops.h.
      
      The VM is modified to handle the migration entries.  migration entries can
      only be encountered when the page they are pointing to is locked.  This limits
      the number of places one has to fix.  We also check in copy_pte_range and in
      mprotect_pte_range() for migration ptes.
      
      We check for migration ptes in do_swap_cache and call a function that will
      then wait on the page lock.  This allows us to effectively stop all accesses
      to apge.
      
      Migration entries are created by try_to_unmap if called for migration and
      removed by local functions in migrate.c
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Several times while testing swapless page migration (I've no NUMA, just
        hacking it up to migrate recklessly while running load), I've hit the
        BUG_ON(!PageLocked(p)) in migration_entry_to_page.
      
        This comes from an orphaned migration entry, unrelated to the current
        correctly locked migration, but hit by remove_anon_migration_ptes as it
        checks an address in each vma of the anon_vma list.
      
        Such an orphan may be left behind if an earlier migration raced with fork:
        copy_one_pte can duplicate a migration entry from parent to child, after
        remove_anon_migration_ptes has checked the child vma, but before it has
        removed it from the parent vma.  (If the process were later to fault on this
        orphaned entry, it would hit the same BUG from migration_entry_wait.)
      
        This could be fixed by locking anon_vma in copy_one_pte, but we'd rather
        not.  There's no such problem with file pages, because vma_prio_tree_add
        adds child vma after parent vma, and the page table locking at each end is
        enough to serialize.  Follow that example with anon_vma: add new vmas to the
        tail instead of the head.
      
        (There's no corresponding problem when inserting migration entries,
        because a missed pte will leave the page count and mapcount high, which is
        allowed for.  And there's no corresponding problem when migrating via swap,
        because a leftover swap entry will be correctly faulted.  But the swapless
        method has no refcounting of its entries.)
      
      From: Ingo Molnar <mingo@elte.hu>
      
        pte_unmap_unlock() takes the pte pointer as an argument.
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Several times while testing swapless page migration, gcc has tried to exec
        a pointer instead of a string: smells like COW mappings are not being
        properly write-protected on fork.
      
        The protection in copy_one_pte looks very convincing, until at last you
        realize that the second arg to make_migration_entry is a boolean "write",
        and SWP_MIGRATION_READ is 30.
      
        Anyway, it's better done like in change_pte_range, using
        is_write_migration_entry and make_migration_entry_read.
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Remove unnecessary obfuscation from sys_swapon's range check on swap type,
        which blew up causing memory corruption once swapless migration made
        MAX_SWAPFILES no longer 2 ^ MAX_SWAPFILES_SHIFT.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NChristoph Lameter <clameter@engr.sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      From: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0697212a
    • H
      [PATCH] likely cleanup: remove unlikely in sys_mprotect() · b344e05c
      Hua Zhong 提交于
      With likely/unlikely profiling on my not-so-busy-typical-developmentsystem
      there are 5k misses vs 2k hits.  So I guess we should remove the unlikely.
      Signed-off-by: NHua Zhong <hzhong@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b344e05c
  2. 22 3月, 2006 1 次提交
    • Z
      [PATCH] Enable mprotect on huge pages · 8f860591
      Zhang, Yanmin 提交于
      2.6.16-rc3 uses hugetlb on-demand paging, but it doesn_t support hugetlb
      mprotect.
      
      From: David Gibson <david@gibson.dropbear.id.au>
      
        Remove a test from the mprotect() path which checks that the mprotect()ed
        range on a hugepage VMA is hugepage aligned (yes, really, the sense of
        is_aligned_hugepage_range() is the opposite of what you'd guess :-/).
      
        In fact, we don't need this test.  If the given addresses match the
        beginning/end of a hugepage VMA they must already be suitably aligned.  If
        they don't, then mprotect_fixup() will attempt to split the VMA.  The very
        first test in split_vma() will check for a badly aligned address on a
        hugepage VMA and return -EINVAL if necessary.
      
      From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
      
        On i386 and x86-64, pte flag _PAGE_PSE collides with _PAGE_PROTNONE.  The
        identify of hugetlb pte is lost when changing page protection via mprotect.
        A page fault occurs later will trigger a bug check in huge_pte_alloc().
      
        The fix is to always make new pte a hugetlb pte and also to clean up
        legacy code where _PAGE_PRESENT is forced on in the pre-faulting day.
      Signed-off-by: NZhang Yanmin <yanmin.zhang@intel.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8f860591
  3. 23 11月, 2005 1 次提交
    • H
      [PATCH] unpaged: private write VM_RESERVED · 83e9b7e9
      Hugh Dickins 提交于
      The PageReserved removal in 2.6.15-rc1 issued a "deprecated" message when you
      tried to mmap or mprotect MAP_PRIVATE PROT_WRITE a VM_RESERVED, and failed
      with -EACCES: because do_wp_page lacks the refinement to COW pages in those
      areas, nor do we expect to find anonymous pages in them; and it seemed just
      bloat to add code for handling such a peculiar case.  But immediately it
      caused vbetool and ddcprobe (using lrmi) to fail.
      
      So revert the "deprecated" messages, letting mmap and mprotect succeed.  But
      leave do_wp_page's BUG_ON(vma->vm_flags & VM_RESERVED) in place until we've
      added the code to do it right: so this particular patch is only good if the
      app doesn't really need to write to that private area.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      83e9b7e9
  4. 30 10月, 2005 3 次提交
    • H
      [PATCH] mm: pte_offset_map_lock loops · 705e87c0
      Hugh Dickins 提交于
      Convert those common loops using page_table_lock on the outside and
      pte_offset_map within to use just pte_offset_map_lock within instead.
      
      These all hold mmap_sem (some exclusively, some not), so at no level can a
      page table be whipped away from beneath them.  But whereas pte_alloc loops
      tested with the "atomic" pmd_present, these loops are testing with pmd_none,
      which on i386 PAE tests both lower and upper halves.
      
      That's now unsafe, so add a cast into pmd_none to test only the vital lower
      half: we lose a little sensitivity to a corrupt middle directory, but not
      enough to worry about.  It appears that i386 and UML were the only
      architectures vulnerable in this way, and pgd and pud no problem.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      705e87c0
    • N
      [PATCH] core remove PageReserved · b5810039
      Nick Piggin 提交于
      Remove PageReserved() calls from core code by tightening VM_RESERVED
      handling in mm/ to cover PageReserved functionality.
      
      PageReserved special casing is removed from get_page and put_page.
      
      All setting and clearing of PageReserved is retained, and it is now flagged
      in the page_alloc checks to help ensure we don't introduce any refcount
      based freeing of Reserved pages.
      
      MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
      deprecated.  We never completely handled it correctly anyway, and is be
      reintroduced in future if required (Hugh has a proof of concept).
      
      Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
      arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
      be trivially removed.
      
      Last real user of PageReserved is swsusp, which uses PageReserved to
      determine whether a struct page points to valid memory or not.  This still
      needs to be addressed (a generic page_is_ram() should work).
      
      A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
      thus mapcounted and count towards shared rss).  These writes to the struct
      page could cause excessive cacheline bouncing on big systems.  There are a
      number of ways this could be addressed if it is an issue.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      
      Refcount bug fix for filemap_xip.c
      Signed-off-by: NCarsten Otte <cotte@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b5810039
    • H
      [PATCH] mm: vm_stat_account unshackled · ab50b8ed
      Hugh Dickins 提交于
      The original vm_stat_account has fallen into disuse, with only one user, and
      only one user of vm_stat_unaccount.  It's easier to keep track if we convert
      them all to __vm_stat_account, then free it from its __shackles.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ab50b8ed
  5. 22 9月, 2005 1 次提交
  6. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4