1. 23 6月, 2006 1 次提交
    • A
      [PATCH] zone handle unaligned zone boundaries · cb2b95e1
      Andy Whitcroft 提交于
      The buddy allocator has a requirement that boundaries between contigious
      zones occur aligned with the the MAX_ORDER ranges.  Where they do not we
      will incorrectly merge pages cross zone boundaries.  This can lead to pages
      from the wrong zone being handed out.
      
      Originally the buddy allocator would check that buddies were in the same
      zone by referencing the zone start and end page frame numbers.  This was
      removed as it became very expensive and the buddy allocator already made
      the assumption that zones boundaries were aligned.
      
      It is clear that not all configurations and architectures are honouring
      this alignment requirement.  Therefore it seems safest to reintroduce
      support for non-aligned zone boundaries.  This patch introduces a new check
      when considering a page a buddy it compares the zone_table index for the
      two pages and refuses to merge the pages where they do not match.  The
      zone_table index is unique for each node/zone combination when
      FLATMEM/DISCONTIGMEM is enabled and for each section/zone combination when
      SPARSEMEM is enabled (a SPARSEMEM section is at least a MAX_ORDER size).
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      cb2b95e1
  2. 26 4月, 2006 1 次提交
  3. 11 4月, 2006 1 次提交
  4. 22 3月, 2006 8 次提交
  5. 21 2月, 2006 1 次提交
  6. 18 2月, 2006 1 次提交
  7. 08 2月, 2006 1 次提交
  8. 15 1月, 2006 1 次提交
  9. 12 1月, 2006 2 次提交
  10. 10 1月, 2006 1 次提交
  11. 09 1月, 2006 3 次提交
  12. 07 1月, 2006 5 次提交
    • D
      [PATCH] FRV: Make futex code compilable on nommu [try #2] · 7ee1dd3f
      David Howells 提交于
      Make the futex code compilable and usable on NOMMU by making the attempt to
      handle page faults conditional on CONFIG_MMU.  If this is not enabled, then
      we can assume that EFAULT returned from futex_atomic_op_inuser() is not
      recoverable, and that the address lies outside of valid memory.
      
      handle_mm_fault() is made to BUG if called on NOMMU without attempting to
      invoke the actual handler (__handle_mm_fault).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7ee1dd3f
    • D
      [PATCH] NOMMU: Make SYSV IPC SHM use ramfs facilities on NOMMU · b0e15190
      David Howells 提交于
      The attached patch makes the SYSV IPC shared memory facilities use the new
      ramfs facilities on a no-MMU kernel.
      
      The following changes are made:
      
       (1) There are now shmem_mmap() and shmem_get_unmapped_area() functions to
           allow the IPC SHM facilities to commune with the tiny-shmem and shmem
           code.
      
       (2) ramfs files now need resizing using do_truncate() rather than by modifying
           the inode size directly (see shmem_file_setup()). This causes ramfs to
           attempt to bind a block of pages of sufficient size to the inode.
      
       (3) CONFIG_SYSVIPC is no longer contingent on CONFIG_MMU.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      b0e15190
    • R
      [PATCH] Shut up warnings in ipc/shm.c · 03b00ebc
      Russell King 提交于
      Fix two warnings in ipc/shm.c
      
      ipc/shm.c:122: warning: statement with no effect
      ipc/shm.c:560: warning: statement with no effect
      
      by converting the macros to empty inline functions.  For safety, let's do
      all three.  This also has the advantage that typechecking gets performed
      even without CONFIG_SHMEM enabled.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      03b00ebc
    • B
      [PATCH] madvise(MADV_REMOVE): remove pages from tmpfs shm backing store · f6b3ec23
      Badari Pulavarty 提交于
      Here is the patch to implement madvise(MADV_REMOVE) - which frees up a
      given range of pages & its associated backing store.  Current
      implementation supports only shmfs/tmpfs and other filesystems return
      -ENOSYS.
      
      "Some app allocates large tmpfs files, then when some task quits and some
      client disconnect, some memory can be released.  However the only way to
      release tmpfs-swap is to MADV_REMOVE". - Andrea Arcangeli
      
      Databases want to use this feature to drop a section of their bufferpool
      (shared memory segments) - without writing back to disk/swap space.
      
      This feature is also useful for supporting hot-plug memory on UML.
      
      Concerns raised by Andrew Morton:
      
      - "We have no plan for holepunching!  If we _do_ have such a plan (or
        might in the future) then what would the API look like?  I think
        sys_holepunch(fd, start, len), so we should start out with that."
      
      - Using madvise is very weird, because people will ask "why do I need to
        mmap my file before I can stick a hole in it?"
      
      - None of the other madvise operations call into the filesystem in this
        manner.  A broad question is: is this capability an MM operation or a
        filesytem operation?  truncate, for example, is a filesystem operation
        which sometimes has MM side-effects.  madvise is an mm operation and with
        this patch, it gains FS side-effects, only they're really, really
        significant ones."
      
      Comments:
      
      - Andrea suggested the fs operation too but then it's more efficient to
        have it as a mm operation with fs side effects, because they don't
        immediatly know fd and physical offset of the range.  It's possible to
        fixup in userland and to use the fs operation but it's more expensive,
        the vmas are already in the kernel and we can use them.
      
      Short term plan &  Future Direction:
      
      - We seem to need this interface only for shmfs/tmpfs files in the short
        term.  We have to add hooks into the filesystem for correctness and
        completeness.  This is what this patch does.
      
      - In the future, plan is to support both fs and mmap apis also.  This
        also involves (other) filesystem specific functions to be implemented.
      
      - Current patch doesn't support VM_NONLINEAR - which can be addressed in
        the future.
      Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Andrea Arcangeli <andrea@suse.de>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      f6b3ec23
    • H
      [PATCH] reiser4: vfs: add truncate_inode_pages_range() · d7339071
      Hans Reiser 提交于
      This patch makes truncate_inode_pages_range from truncate_inode_pages.
      truncate_inode_pages became a one-liner call to truncate_inode_pages_range.
      
      Reiser4 needs truncate_inode_pages_ranges because it tries to keep
      correspondence between existences of metadata pointing to data pages and pages
      to which those metadata point to.  So, when metadata of certain part of file
      is removed from filesystem tree, only pages of corresponding range are to be
      truncated.
      
      (Needed by the madvise(MADV_REMOVE) patch)
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d7339071
  13. 17 12月, 2005 1 次提交
  14. 12 12月, 2005 1 次提交
  15. 01 12月, 2005 1 次提交
    • L
      VM: add "vm_insert_page()" function · a145dd41
      Linus Torvalds 提交于
      This is what a lot of drivers will actually want to use to insert
      individual pages into a user VMA.  It doesn't have the old PageReserved
      restrictions of remap_pfn_range(), and it doesn't complain about partial
      remappings.
      
      The page you insert needs to be a nice clean kernel allocation, so you
      can't insert arbitrary page mappings with this, but that's not what
      people want.
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a145dd41
  16. 30 11月, 2005 2 次提交
  17. 29 11月, 2005 1 次提交
    • L
      mm: re-architect the VM_UNPAGED logic · 6aab341e
      Linus Torvalds 提交于
      This replaces the (in my opinion horrible) VM_UNMAPPED logic with very
      explicit support for a "remapped page range" aka VM_PFNMAP.  It allows a
      VM area to contain an arbitrary range of page table entries that the VM
      never touches, and never considers to be normal pages.
      
      Any user of "remap_pfn_range()" automatically gets this new
      functionality, and doesn't even have to mark the pages reserved or
      indeed mark them any other way.  It just works.  As a side effect, doing
      mmap() on /dev/mem works for arbitrary ranges.
      
      Sparc update from David in the next commit.
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6aab341e
  18. 23 11月, 2005 2 次提交
    • H
      [PATCH] unpaged: VM_UNPAGED · 0b14c179
      Hugh Dickins 提交于
      Although we tend to associate VM_RESERVED with remap_pfn_range, quite a few
      drivers set VM_RESERVED on areas which are then populated by nopage.  The
      PageReserved removal in 2.6.15-rc1 changed VM_RESERVED not to free pages in
      zap_pte_range, without changing those drivers not to set it: so their pages
      just leak away.
      
      Let's not change miscellaneous drivers now: introduce VM_UNPAGED at the core,
      to flag the special areas where the ptes may have no struct page, or if they
      have then it's not to be touched.  Replace most instances of VM_RESERVED in
      core mm by VM_UNPAGED.  Force it on in remap_pfn_range, and the sparc and
      sparc64 io_remap_pfn_range.
      
      Revert addition of VM_RESERVED to powerpc vdso, it's not needed there.  Is it
      needed anywhere?  It still governs the mm->reserved_vm statistic, and special
      vmas not to be merged, and areas not to be core dumped; but could probably be
      eliminated later (the drivers are probably specifying it because in 2.4 it
      kept swapout off the vma, but in 2.6 we work from the LRU, which these pages
      don't get on).
      
      Use the VM_SHM slot for VM_UNPAGED, and define VM_SHM to 0: it serves no
      purpose whatsoever, and should be removed from drivers when we clean up.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Acked-by: NWilliam Irwin <wli@holomorphy.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0b14c179
    • H
      [PATCH] unpaged: unifdefed PageCompound · 664beed0
      Hugh Dickins 提交于
      It looks like snd_xxx is not the only nopage to be using PageReserved as a way
      of holding a high-order page together: which no longer works, but is masked by
      our failure to free from VM_RESERVED areas.  We cannot fix that bug without
      first substituting another way to hold the high-order page together, while
      farming out the 0-order pages from within it.
      
      That's just what PageCompound is designed for, but it's been kept under
      CONFIG_HUGETLB_PAGE.  Remove the #ifdefs: which saves some space (out- of-line
      put_page), doesn't slow down what most needs to be fast (already using
      hugetlb), and unifies the way we handle high-order pages.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      664beed0
  19. 19 11月, 2005 1 次提交
  20. 15 11月, 2005 1 次提交
  21. 07 11月, 2005 1 次提交
  22. 30 10月, 2005 3 次提交
    • D
      [PATCH] memory hotplug: sysfs and add/remove functions · 3947be19
      Dave Hansen 提交于
      This adds generic memory add/remove and supporting functions for memory
      hotplug into a new file as well as a memory hotplug kernel config option.
      
      Individual architecture patches will follow.
      
      For now, disable memory hotplug when swsusp is enabled.  There's a lot of
      churn there right now.  We'll fix it up properly once it calms down.
      Signed-off-by: NMatt Tolentino <matthew.e.tolentino@intel.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3947be19
    • H
      [PATCH] mm: split page table lock · 4c21e2f2
      Hugh Dickins 提交于
      Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
      a many-threaded application which concurrently initializes different parts of
      a large anonymous area.
      
      This patch corrects that, by using a separate spinlock per page table page, to
      guard the page table entries in that page, instead of using the mm's single
      page_table_lock.  (But even then, page_table_lock is still used to guard page
      table allocation, and anon_vma allocation.)
      
      In this implementation, the spinlock is tucked inside the struct page of the
      page table page: with a BUILD_BUG_ON in case it overflows - which it would in
      the case of 32-bit PA-RISC with spinlock debugging enabled.
      
      Splitting the lock is not quite for free: another cacheline access.  Ideally,
      I suppose we would use split ptlock only for multi-threaded processes on
      multi-cpu machines; but deciding that dynamically would have its own costs.
      So for now enable it by config, at some number of cpus - since the Kconfig
      language doesn't support inequalities, let preprocessor compare that with
      NR_CPUS.  But I don't think it's worth being user-configurable: for good
      testing of both split and unsplit configs, split now at 4 cpus, and perhaps
      change that to 8 later.
      
      There is a benefit even for singly threaded processes: kswapd can be attacking
      one part of the mm while another part is busy faulting.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      4c21e2f2
    • H
      [PATCH] mm: follow_page with inner ptlock · deceb6cd
      Hugh Dickins 提交于
      Final step in pushing down common core's page_table_lock.  follow_page no
      longer wants caller to hold page_table_lock, uses pte_offset_map_lock itself;
      and so no page_table_lock is taken in get_user_pages itself.
      
      But get_user_pages (and get_futex_key) do then need follow_page to pin the
      page for them: take Daniel's suggestion of bitflags to follow_page.
      
      Need one for WRITE, another for TOUCH (it was the accessed flag before:
      vanished along with check_user_page_readable, but surely get_numa_maps is
      wrong to mark every page it finds as accessed), another for GET.
      
      And another, ANON to dispose of untouched_anonymous_page: it seems silly for
      that to descend a second time, let follow_page observe if there was no page
      table and return ZERO_PAGE if so.  Fix minor bug in that: check VM_LOCKED -
      make_pages_present ought to make readonly anonymous present.
      
      Give get_numa_maps a cond_resched while we're there.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      deceb6cd