1. 01 Jul 2006, 2 commits
    • [PATCH] zoned vm counters: split NR_ANON_PAGES off from NR_FILE_MAPPED · f3dbd344
      Committed by Christoph Lameter
      The current NR_FILE_MAPPED is used by zone reclaim and the dirty load
      calculation as the number of mapped pagecache pages.  However, that is not
      true.  NR_FILE_MAPPED includes the mapped anonymous pages.  This patch
      separates those and therefore allows an accurate tracking of the anonymous
      pages per zone.
      
      It then becomes possible to determine the number of unmapped pages per zone
      and we can avoid scanning for unmapped pages if there are none.
      
      Also it may now be possible to determine the mapped/unmapped ratio in
      get_dirty_limit.  Isn't the number of anonymous pages irrelevant in that
      calculation?
      
      Note that this will change the meaning of the number of mapped pages reported
      in /proc/vmstat, /proc/meminfo and in the per-node statistics.  This may affect
      user space tools that monitor these counters!  NR_FILE_MAPPED works like
      NR_FILE_DIRTY.  It is only valid for pagecache pages.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f3dbd344
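
      The effect of the split shows up in the rmap accounting paths: anonymous and file-backed mappings now feed different counters, so "mapped pagecache pages" can be read directly.  A minimal sketch of that accounting (the counter names are the ones this patch introduces; the helper names and bodies below are simplified illustrations, not the literal diff):

      #include <linux/mm.h>

      /* Illustrative only: roughly what the rmap add paths account after the split. */
      static void account_anon_mapping(struct page *page)
      {
              if (atomic_inc_and_test(&page->_mapcount))      /* first mapping */
                      __inc_zone_page_state(page, NR_ANON_PAGES);
      }

      static void account_file_mapping(struct page *page)
      {
              if (atomic_inc_and_test(&page->_mapcount))      /* first mapping */
                      __inc_zone_page_state(page, NR_FILE_MAPPED);    /* pagecache only */
      }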
    • [PATCH] zoned vm counters: convert nr_mapped to per zone counter · 65ba55f5
      Committed by Christoph Lameter
      nr_mapped is important because it allows a determination of how many pages of
      a zone are not mapped, which would allow a more efficient means of determining
      when we need to reclaim memory in a zone.
      
      We take the nr_mapped field out of the page state structure and define a new
      per zone counter named NR_FILE_MAPPED (the anonymous pages will be split off
      from NR_MAPPED in the next patch).
      
      We replace the use of nr_mapped in various kernel locations.  This avoids
      looping over all processors in try_to_free_pages(), writeback and reclaim (swap +
      zone reclaim).
      
      [akpm@osdl.org: bugfix]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      65ba55f5
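
      Before this series, reading nr_mapped meant summing a per-cpu page_state field over every processor; as a per-zone counter it can be read per zone or globally without touching each cpu.  A rough sketch of the reader side (the accessors are the generic zoned-vm-counter helpers from this series; the wrapper functions and the include are assumptions for illustration):

      #include <linux/mm.h>   /* assumed to pull in the zone counter helpers */

      static unsigned long mapped_pagecache_in_zone(struct zone *zone)
      {
              return zone_page_state(zone, NR_FILE_MAPPED);   /* one zone, no cpu loop */
      }

      static unsigned long mapped_pagecache_global(void)
      {
              return global_page_state(NR_FILE_MAPPED);       /* sum over zones only */
      }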
  2. 26 Jun 2006, 1 commit
  3. 23 Jun 2006, 5 commits
    • [PATCH] More page migration: use migration entries for file pages · 04e62a29
      Committed by Christoph Lameter
      This implements the use of migration entries to preserve ptes of file backed
      pages during migration.  Processes can therefore be migrated back and forth
      without losing their connection to pagecache pages.
      
      Note that we implement the migration entries only for linear mappings.
      Nonlinear mappings still require the unmapping of the ptes for migration.
      
      And another writepage() ugliness shows up.  writepage() can drop the page
      lock.  Therefore we have to remove migration ptes before calling writepages()
      in order to avoid having migration entries point to unlocked pages.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      04e62a29
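
      The ordering constraint in the last paragraph comes out roughly as below in the migration writeout path (a sketch of the control flow only; remove_migration_ptes() is the restore helper this series uses in migrate.c, and the writeback setup is abbreviated):

      #include <linux/writeback.h>

      /* writepage() may drop the page lock, so the migration ptes must be put
       * back first -- otherwise they could end up pointing at an unlocked page. */
      static int writeout_sketch(struct address_space *mapping, struct page *page)
      {
              struct writeback_control wbc = {
                      .sync_mode   = WB_SYNC_NONE,
                      .nr_to_write = 1,
              };

              remove_migration_ptes(page, page);      /* restore the real ptes */
              return mapping->a_ops->writepage(page, &wbc);
      }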
    • [PATCH] More page migration: do not inc/dec rss counters · 442c9137
      Committed by Christoph Lameter
      If we install a migration entry then the rss does not really decrease, since the
      page is just moved somewhere else.  We can save ourselves the work of
      decrementing and later incrementing which will just eventually cause cacheline
      bouncing.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      442c9137
    • [PATCH] Swapless page migration: rip out swap based logic · d75a0fcd
      Committed by Christoph Lameter
      Rip the swap-based page migration logic out.
      
      Remove all code that has to do with swapping during page migration.
      
      This also guts the ability to migrate pages to swap.  No one used that, so let's
      let it go for good.
      
      Page migration should be a bit broken after this patch.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d75a0fcd
    • [PATCH] Swapless page migration: add R/W migration entries · 0697212a
      Committed by Christoph Lameter
      Implement read/write migration ptes
      
      We take the upper two swapfiles for the two types of migration ptes and define
      a series of macros in swapops.h.
      
      The VM is modified to handle the migration entries.  Migration entries can
      only be encountered when the page they are pointing to is locked.  This limits
      the number of places one has to fix.  We also check in copy_pte_range and in
      mprotect_pte_range() for migration ptes.
      
      We check for migration ptes in do_swap_page and call a function that will
      then wait on the page lock.  This allows us to effectively stop all accesses
      to the page.
      
      Migration entries are created by try_to_unmap if called for migration and
      removed by local functions in migrate.c
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Several times while testing swapless page migration (I've no NUMA, just
        hacking it up to migrate recklessly while running load), I've hit the
        BUG_ON(!PageLocked(p)) in migration_entry_to_page.
      
        This comes from an orphaned migration entry, unrelated to the current
        correctly locked migration, but hit by remove_anon_migration_ptes as it
        checks an address in each vma of the anon_vma list.
      
        Such an orphan may be left behind if an earlier migration raced with fork:
        copy_one_pte can duplicate a migration entry from parent to child, after
        remove_anon_migration_ptes has checked the child vma, but before it has
        removed it from the parent vma.  (If the process were later to fault on this
        orphaned entry, it would hit the same BUG from migration_entry_wait.)
      
        This could be fixed by locking anon_vma in copy_one_pte, but we'd rather
        not.  There's no such problem with file pages, because vma_prio_tree_add
        adds child vma after parent vma, and the page table locking at each end is
        enough to serialize.  Follow that example with anon_vma: add new vmas to the
        tail instead of the head.
      
        (There's no corresponding problem when inserting migration entries,
        because a missed pte will leave the page count and mapcount high, which is
        allowed for.  And there's no corresponding problem when migrating via swap,
        because a leftover swap entry will be correctly faulted.  But the swapless
        method has no refcounting of its entries.)
      
      From: Ingo Molnar <mingo@elte.hu>
      
        pte_unmap_unlock() takes the pte pointer as an argument.
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Several times while testing swapless page migration, gcc has tried to exec
        a pointer instead of a string: smells like COW mappings are not being
        properly write-protected on fork.
      
        The protection in copy_one_pte looks very convincing, until at last you
        realize that the second arg to make_migration_entry is a boolean "write",
        and SWP_MIGRATION_READ is 30.
      
        Anyway, it's better done like in change_pte_range, using
        is_write_migration_entry and make_migration_entry_read.
      
      From: Hugh Dickins <hugh@veritas.com>
      
        Remove unnecessary obfuscation from sys_swapon's range check on swap type,
        which blew up causing memory corruption once swapless migration made
        MAX_SWAPFILES no longer 2 ^ MAX_SWAPFILES_SHIFT.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      From: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0697212a
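
      The swapops.h helpers the message refers to look roughly like this, reconstructed from the description (the exact in-tree definitions, and how MAX_SWAPFILES is adjusted to make room, may differ in detail):

      #include <linux/swap.h>

      /* Two pseudo swap types above the real swapfiles encode migration entries:
       * one for ptes that were read-only, one for ptes that were writable. */
      #define SWP_MIGRATION_READ      (MAX_SWAPFILES)
      #define SWP_MIGRATION_WRITE     (MAX_SWAPFILES + 1)

      static inline swp_entry_t make_migration_entry(struct page *page, int write)
      {
              BUG_ON(!PageLocked(page));      /* only locked pages may be targets */
              return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
                               page_to_pfn(page));
      }

      static inline int is_migration_entry(swp_entry_t entry)
      {
              return swp_type(entry) == SWP_MIGRATION_READ ||
                     swp_type(entry) == SWP_MIGRATION_WRITE;
      }

      static inline int is_write_migration_entry(swp_entry_t entry)
      {
              return swp_type(entry) == SWP_MIGRATION_WRITE;
      }

      static inline void make_migration_entry_read(swp_entry_t *entry)
      {
              /* downgrade, as copy_one_pte()/change_pte_range() need to */
              *entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
      }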
    • [PATCH] page migration cleanup: rename "ignrefs" to "migration" · 7352349a
      Committed by Christoph Lameter
      "migration" is a better name since the flag is only used by page migration.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      7352349a
  4. 22 Mar 2006, 2 commits
  5. 10 Mar 2006, 1 commit
  6. 01 Mar 2006, 1 commit
  7. 02 Feb 2006, 3 commits
  8. 19 Jan 2006, 1 commit
  9. 10 Jan 2006, 1 commit
  10. 09 Jan 2006, 1 commit
  11. 07 Jan 2006, 2 commits
  12. 30 Nov 2005, 1 commit
  13. 29 Nov 2005, 2 commits
    • [PATCH] temporarily disable swap token on memory pressure · f7b7fd8f
      Committed by Rik van Riel
      Some users (hi Zwane) have seen a problem when running a workload that
      eats nearly all of physical memory - the system does an OOM kill, even
      when there is still a lot of swap free.
      
      The problem appears to be a very big task that is holding the swap
      token, and the VM has a very hard time finding any other page in the
      system that is swappable.
      
      Instead of ignoring the swap token when sc->priority reaches 0, we could
      simply take the swap token away from the memory hog and make sure we
      don't give it back to the memory hog for a few seconds.
      
      This patch resolves the problem Zwane ran into.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f7b7fd8f
    • mm: re-architect the VM_UNPAGED logic · 6aab341e
      Committed by Linus Torvalds
      This replaces the (in my opinion horrible) VM_UNPAGED logic with very
      explicit support for a "remapped page range" aka VM_PFNMAP.  It allows a
      VM area to contain an arbitrary range of page table entries that the VM
      never touches, and never considers to be normal pages.
      
      Any user of "remap_pfn_range()" automatically gets this new
      functionality, and doesn't even have to mark the pages reserved or
      indeed mark them any other way.  It just works.  As a side effect, doing
      mmap() on /dev/mem works for arbitrary ranges.
      
      Sparc update from David in the next commit.
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      6aab341e
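
      For driver code the interface stays the same: any mmap handler built on remap_pfn_range() picks the VM_PFNMAP behaviour up automatically.  A hedged example of such a handler (the device and its MYDEV_PHYS_BASE address are invented for illustration):

      #include <linux/fs.h>
      #include <linux/mm.h>

      #define MYDEV_PHYS_BASE 0xfd000000UL    /* fictional device memory */

      static int mydev_mmap(struct file *file, struct vm_area_struct *vma)
      {
              unsigned long size = vma->vm_end - vma->vm_start;
              unsigned long pfn  = (MYDEV_PHYS_BASE >> PAGE_SHIFT) + vma->vm_pgoff;

              /* remap_pfn_range() marks the vma VM_PFNMAP: the VM will never
               * treat these ptes as normal, refcounted pages. */
              return remap_pfn_range(vma, vma->vm_start, pfn, size,
                                     vma->vm_page_prot);
      }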
  14. 23 Nov 2005, 2 commits
    • [PATCH] unpaged: anon in VM_UNPAGED · ee498ed7
      Committed by Hugh Dickins
      copy_one_pte needs to copy the anonymous COWed pages in a VM_UNPAGED area,
      zap_pte_range needs to free them, do_wp_page needs to COW them: just like
      ordinary pages, not like the unpaged.
      
      But recognizing them is a little subtle: because PageReserved is no longer a
      condition for remap_pfn_range, we can now mmap all of /dev/mem (whether the
      distro permits, and whether it's advisable on this or that architecture, is
      another matter).  So if we can see a PageAnon, it may not be ours to mess with
      (or may be ours from elsewhere in the address space).  I suspect there's an
      entertaining insoluble self-referential problem here, but the page_is_anon
      function does a good practical job, and MAP_PRIVATE PROT_WRITE VM_UNPAGED will
      always be an odd choice.
      
      While updating the comment on page_address_in_vma, I noticed a potential NULL
      dereference in a path we don't actually take, but fixed it.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      ee498ed7
    • [PATCH] unpaged: VM_NONLINEAR VM_RESERVED · 101d2be7
      Committed by Hugh Dickins
      There's one peculiar use of VM_RESERVED which the previous patch left behind:
      because VM_NONLINEAR's try_to_unmap_cluster uses vm_private_data as a swapout
      cursor, but should never meet VM_RESERVED vmas, it was a way of extending
      VM_NONLINEAR to VM_RESERVED vmas using vm_private_data for some other purpose.
       But that's an empty set - they don't have the populate function required.  So
      just throw away those VM_RESERVED tests.
      
      But one more interesting test in rmap.c has to go too: try_to_unmap_one will want
      to swap out an anonymous page from a VM_RESERVED or VM_UNPAGED area.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      101d2be7
  15. 30 Oct 2005, 8 commits
    • [PATCH] mm: update comments to pte lock · b8072f09
      Committed by Hugh Dickins
      Updated several references to page_table_lock in common code comments.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b8072f09
    • [PATCH] mm: fix rss and mmlist locking · f412ac08
      Committed by Hugh Dickins
      A couple of oddities were guarded by page_table_lock, no longer properly
      guarded when that is split.
      
      The mm_counters of file_rss and anon_rss: make those an atomic_t (or an
      atomic64_t if the architecture supports it) in that case.  Definitions by
      courtesy of Christoph Lameter: who spent considerable effort on more scalable
      ways of counting, but found insufficient benefit in practice.
      
      And adding an mm with swap to the mmlist for swapoff: the list is well-
      guarded by its own lock, but the list_empty check now has to be repeated
      inside it.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f412ac08
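
      The counter wrappers described here end up looking roughly like the following (a paraphrase of the idea rather than the exact sched.h macros; which atomic type is used depends on the architecture):

      /* With the page table lock split, file_rss/anon_rss can no longer rely on
       * page_table_lock for their updates, so they become atomic counters. */
      typedef atomic_t mm_counter_t;          /* or atomic64_t where available */

      #define set_mm_counter(mm, member, value) atomic_set(&(mm)->_##member, value)
      #define get_mm_counter(mm, member) ((unsigned long)atomic_read(&(mm)->_##member))
      #define add_mm_counter(mm, member, value) atomic_add(value, &(mm)->_##member)
      #define inc_mm_counter(mm, member) atomic_inc(&(mm)->_##member)
      #define dec_mm_counter(mm, member) atomic_dec(&(mm)->_##member)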
    • [PATCH] mm: split page table lock · 4c21e2f2
      Committed by Hugh Dickins
      Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
      a many-threaded application which concurrently initializes different parts of
      a large anonymous area.
      
      This patch corrects that, by using a separate spinlock per page table page, to
      guard the page table entries in that page, instead of using the mm's single
      page_table_lock.  (But even then, page_table_lock is still used to guard page
      table allocation, and anon_vma allocation.)
      
      In this implementation, the spinlock is tucked inside the struct page of the
      page table page: with a BUILD_BUG_ON in case it overflows - which it would in
      the case of 32-bit PA-RISC with spinlock debugging enabled.
      
      Splitting the lock is not quite for free: another cacheline access.  Ideally,
      I suppose we would use split ptlock only for multi-threaded processes on
      multi-cpu machines; but deciding that dynamically would have its own costs.
      So for now enable it by config, at some number of cpus - since the Kconfig
      language doesn't support inequalities, let the preprocessor compare that with
      NR_CPUS.  But I don't think it's worth being user-configurable: for good
      testing of both split and unsplit configs, split now at 4 cpus, and perhaps
      change that to 8 later.
      
      There is a benefit even for singly threaded processes: kswapd can be attacking
      one part of the mm while another part is busy faulting.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4c21e2f2
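
      The shape of the result is roughly the following (sketched from the description above, not copied from mm.h; the real header also has init/deinit helpers and a BUILD_BUG_ON against the spinlock outgrowing struct page):

      #if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
      /* Split case: the lock lives in the struct page of the page table page. */
      #define pte_lockptr(mm, pmd)    ({ (void)(mm); &pmd_page(*(pmd))->ptl; })
      #else
      /* Unsplit case: fall back to the single mm-wide page_table_lock. */
      #define pte_lockptr(mm, pmd)    ({ (void)(pmd); &(mm)->page_table_lock; })
      #endif

      #define pte_offset_map_lock(mm, pmd, address, ptlp)     \
      ({                                                      \
              spinlock_t *__ptl = pte_lockptr(mm, pmd);       \
              pte_t *__pte = pte_offset_map(pmd, address);    \
              *(ptlp) = __ptl;                                \
              spin_lock(__ptl);                               \
              __pte;                                          \
      })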
    • [PATCH] mm: rmap with inner ptlock · c0718806
      Committed by Hugh Dickins
      rmap's page_check_address descends without page_table_lock.  First just
      pte_offset_map in case there's no pte present worth locking for, then take
      page_table_lock for the full check, and pass ptl back to caller in the same
      style as pte_offset_map_lock.  __xip_unmap, page_referenced_one and
      try_to_unmap_one use pte_unmap_unlock.  try_to_unmap_cluster also.
      
      page_check_address is reformatted to avoid progressive indentation.  No use is
      made of its one error code, so just return NULL when it fails.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c0718806
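
      The calling convention this sets up mirrors pte_offset_map_lock; a sketch of a page_referenced_one()-style caller (illustrative only, the real function does more than test the young bit):

      #include <linux/rmap.h>

      static int page_was_referenced(struct page *page, struct mm_struct *mm,
                                     unsigned long address)
      {
              spinlock_t *ptl;
              pte_t *pte;
              int referenced = 0;

              pte = page_check_address(page, mm, address, &ptl);
              if (!pte)
                      return 0;               /* no pte present worth locking for */

              if (pte_young(*pte))            /* examine under the inner lock */
                      referenced = 1;

              pte_unmap_unlock(pte, ptl);     /* unmap and drop the per-table lock */
              return referenced;
      }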
    • [PATCH] mm: update_hiwaters just in time · 365e9c87
      Committed by Hugh Dickins
      update_mem_hiwater has attracted various criticisms, in particular from those
      concerned with mm scalability.  Originally it was called whenever rss or
      total_vm got raised.  Then many of those callsites were replaced by a timer
      tick call from account_system_time.  Now Frank van Maarseveen reports that to
      be inadequate.  How about this?  Works for Frank.
      
      Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
      update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
      mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
      by 1): those are hot paths.  Do the opposite, update only when about to lower
      rss (usually by many), or just before final accounting in do_exit.  Handle
      mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
      that whoever collects these hiwater statistics do the work of taking the
      maximum with rss or total_vm.
      
      And there has been no collector of these hiwater statistics in the tree.  The
      new convention needs an example, so match Frank's usage by adding a VmPeak
      line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
      (High-Water-Mark or High-Water-Memory).
      
      There was a particular anomaly during mremap move, that hiwater_vm might be
      captured too high.  A fleeting such anomaly remains, but it's quickly
      corrected now, whereas before it would stick.
      
      What locking?  None: if the app is racy then these statistics will be racy,
      it's not worth any overhead to make them exact.  But whenever it suits,
      hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
      page_table_lock (for now) or with preemption disabled (later on): without
      going to any trouble, minimize the time between reading current values and
      updating, to minimize those occasions when a racing thread bumps a count up
      and back down in between.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      365e9c87
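
      The two macros that replace update_mem_hiwater are simple enough to reproduce from the description (this is the gist; the in-tree definitions may differ in small details):

      /* Call just before rss is about to drop (unmap, exit), not on every rise. */
      #define update_hiwater_rss(mm)  do {                    \
              unsigned long _rss = get_mm_rss(mm);            \
              if ((mm)->hiwater_rss < _rss)                   \
                      (mm)->hiwater_rss = _rss;               \
      } while (0)

      /* Call just before total_vm is about to shrink (munmap, exit_mmap). */
      #define update_hiwater_vm(mm)   do {                    \
              if ((mm)->hiwater_vm < (mm)->total_vm)          \
                      (mm)->hiwater_vm = (mm)->total_vm;      \
      } while (0)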
    • [PATCH] core remove PageReserved · b5810039
      Committed by Nick Piggin
      Remove PageReserved() calls from core code by tightening VM_RESERVED
      handling in mm/ to cover PageReserved functionality.
      
      PageReserved special casing is removed from get_page and put_page.
      
      All setting and clearing of PageReserved is retained, and it is now flagged
      in the page_alloc checks to help ensure we don't introduce any refcount
      based freeing of Reserved pages.
      
      MAP_PRIVATE, PROT_WRITE mapping of VM_RESERVED regions is tentatively being
      deprecated.  We never completely handled it correctly anyway, and it can be
      reintroduced in future if required (Hugh has a proof of concept).
      
      Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
      arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
      be trivially removed.
      
      Last real user of PageReserved is swsusp, which uses PageReserved to
      determine whether a struct page points to valid memory or not.  This still
      needs to be addressed (a generic page_is_ram() should work).
      
      A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
      thus mapcounted and counted towards shared rss).  These writes to the struct
      page could cause excessive cacheline bouncing on big systems.  There are a
      number of ways this could be addressed if it is an issue.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      
      Refcount bug fix for filemap_xip.c
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b5810039
    • [PATCH] mm: rss = file_rss + anon_rss · 4294621f
      Committed by Hugh Dickins
      I was lazy when we added anon_rss, and chose to change as few places as
      possible.  So currently each anonymous page has to be counted twice, in rss
      and in anon_rss.  Which won't be so good if those are atomic counts in some
      configurations.
      
      Change that around: keep file_rss and anon_rss separately, and add them
      together (with get_mm_rss macro) when the total is needed - reading two
      atomics is much cheaper than updating two atomics.  And update anon_rss
      upfront, typically in memory.c, not tucked away in page_add_anon_rmap.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4294621f
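
      With the two counters kept separately, the total is only computed on demand; the get_mm_rss macro mentioned above amounts to the following (a sketch matching the description):

      /* Reading two atomics is much cheaper than keeping a third one up to date. */
      #define get_mm_rss(mm)  (get_mm_counter(mm, file_rss) + \
                               get_mm_counter(mm, anon_rss))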
    • [PATCH] swaptoken tuning · fcdae29a
      Committed by Rik Van Riel
      It turns out that the original swap token implementation, by Song Jiang, only
      enforced the swap token while the task holding the token is handling a page
      fault.  This patch approximates that, without adding an additional flag to the
      mm_struct, by checking whether the mm->mmap_sem is held for reading, like the
      page fault code does.
      
      This patch has the effect of automatically, and gradually, disabling the
      enforcement of the swap token when there is little or no paging going on, and
      "turning up" the intensity of the swap token code the more the task holding
      the token is thrashing.
      
      Thanks to Song Jiang for pointing out this aspect of the token based thrashing
      control concept.
      
      The new code shows a slight degradation over the old swap token code, but
      still a big win over running without the swap token.
      
      2.6.12+ swap token disabled
      
      $ for i in `seq 10` ; do /usr/bin/time ./qsbench -n 30000000 -p 3 ; done
      101.74user 23.13system 8:26.91elapsed 24%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (38597major+430315minor)pagefaults 0swaps
      101.98user 24.91system 8:03.06elapsed 26%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (33939major+430457minor)pagefaults 0swaps
      101.93user 22.12system 7:34.90elapsed 27%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (33166major+421267minor)pagefaults 0swaps
      101.82user 22.38system 8:31.40elapsed 24%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (39338major+433262minor)pagefaults 0swaps
      
      2.6.12+ swap token enabled, timeout 300 seconds
      
      $ for i in `seq 4` ; do /usr/bin/time ./qsbench -n 30000000 -p 3 ; done
      102.58user 16.08system 3:41.44elapsed 53%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (19707major+285786minor)pagefaults 0swaps
      102.07user 19.56system 4:00.64elapsed 50%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (19012major+299259minor)pagefaults 0swaps
      102.64user 18.25system 4:07.31elapsed 48%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (21990major+304831minor)pagefaults 0swaps
      101.39user 19.41system 5:15.81elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (24850major+323321minor)pagefaults 0swaps
      
      2.6.12+ with new swap token code, timeout 300 seconds
      
      $ for i in `seq 4` ; do /usr/bin/time ./qsbench -n 30000000 -p 3 ; done
      101.87user 24.66system 5:53.20elapsed 35%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (26848major+363497minor)pagefaults 0swaps
      102.83user 19.95system 4:17.25elapsed 47%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (19946major+305722minor)pagefaults 0swaps
      102.09user 19.46system 5:12.57elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (25461major+334994minor)pagefaults 0swaps
      101.67user 20.61system 4:52.97elapsed 41%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (22190major+329508minor)pagefaults 0swaps
      Signed-off-by: Rik Van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fcdae29a
  16. 05 Sep 2005, 5 commits
  17. 24 Jun 2005, 1 commit
    • [PATCH] xip: fs/mm: execute in place · ceffc078
      Committed by Carsten Otte
      - generic_file* file operations no longer have an xip/non-xip split
      - filemap_xip.c implements a new set of fops that require the get_xip_page
        aop to work properly.  All new fops are exported GPL-only (we'd rather not
        see anything but GPL modules use them)
      - __xip_unmap now uses page_check_address, which is no longer static
        in rmap.c and is declared in linux/rmap.h
      - mm/filemap.h is now much cleaner, plainly containing just Linus'
        inline funcs moved here from filemap.c
      - fix includes in filemap_xip so it builds cleanly on i386
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      ceffc078
  18. 22 Jun 2005, 1 commit
    • [PATCH] can_share_swap_page: use page_mapcount · c475a8ab
      Committed by Hugh Dickins
      Remember that ironic get_user_pages race?  When the raised page_count on a
      page being swapped out led do_wp_page to decide that it had to copy on write, it
      substituted a different page into userspace.  2.6.7 onwards have Andrea's
      solution, where try_to_unmap_one backs out if it finds page_count raised.
      
      Which works, but is unsatisfying (rmap.c has no other page_count heuristics),
      and was found a few months ago to hang an intensive page migration test.  A
      year ago I was hesitant to engage page_mapcount; now it seems the right fix.
      
      So remove the page_count hack from try_to_unmap_one; and use activate_page in
      unuse_mm when dropping lock, to replace its secondary effect of helping
      swapoff to make progress in that case.
      
      Simplify can_share_swap_page (now called only on anonymous pages) to check
      page_mapcount + page_swapcount == 1: still needs the page lock to stabilize
      their (pessimistic) sum, but does not need swapper_space.tree_lock for that.
      
      In do_swap_page, move swap_free and unlock_page below page_add_anon_rmap, to
      keep the sum on the high side, and correct when can_share_swap_page is called.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c475a8ab
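
      The simplified check described above comes down to something like this (a sketch reconstructed from the message; the caller holds the page lock to stabilise the pessimistic sum):

      #include <linux/swap.h>

      /* Can this anonymous page be written in place, without a COW copy?
       * Only if we hold the sole reference: one pte mapping it plus, if it sits
       * in swap cache, no other swap references. */
      static int can_share_swap_page_sketch(struct page *page)
      {
              int count;

              BUG_ON(!PageLocked(page));
              count = page_mapcount(page);
              if (count <= 1 && PageSwapCache(page))
                      count += page_swapcount(page);
              return count == 1;
      }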