1. 18 7月, 2007 12 次提交
    • C
      CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic · 2a7326b5
      Christoph Lameter 提交于
      The bounce buffer logic is included on systems that do not need it.  If a
      system does not have zones like ZONE_DMA and ZONE_HIGHMEM that can lead to
      the use of bounce buffers then there is no need to reserve memory pools etc
      etc.  This is true f.e.  for SGI Altix.
      
      Also nicifies the Makefile and gets rid of the tricky "and" there.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Acked-by: NJens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2a7326b5
    • R
      Freezer: make kernel threads nonfreezable by default · 83144186
      Rafael J. Wysocki 提交于
      Currently, the freezer treats all tasks as freezable, except for the kernel
      threads that explicitly set the PF_NOFREEZE flag for themselves.  This
      approach is problematic, since it requires every kernel thread to either
      set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
      care for the freezing of tasks at all.
      
      It seems better to only require the kernel threads that want to or need to
      be frozen to use some freezer-related code and to remove any
      freezer-related code from the other (nonfreezable) kernel threads, which is
      done in this patch.
      
      The patch causes all kernel threads to be nonfreezable by default (ie.  to
      have PF_NOFREEZE set by default) and introduces the set_freezable()
      function that should be called by the freezable kernel threads in order to
      unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
      threads call set_freezable(), so it shouldn't cause any (intentional)
      change of behaviour to appear.  Additionally, it updates documentation to
      describe the freezing of tasks more accurately.
      
      [akpm@linux-foundation.org: build fixes]
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NNigel Cunningham <nigel@nigel.suspend2.net>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83144186
    • C
      Add VM_BUG_ON in case someone uses page_mapping on a slab page · b5fab14e
      Christoph Lameter 提交于
      Detect slab objects being passed to the page oriented functions of the VM.
      
      It is not sufficient to simply return NULL because the functions calling
      page_mapping may depend on other items of the page_struct also to be setup
      properly.  Moreover slab object may not be properly aligned.  The page
      oriented functions of the VM expect to operate on page aligned, page sized
      objects.  Operations on object straddling page boundaries may only affect the
      objects partially which may lead to surprising results.
      
      It is better to detect eventually remaining uses and eliminate them.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b5fab14e
    • C
      Slab allocators: Cleanup zeroing allocations · 81cda662
      Christoph Lameter 提交于
      It becomes now easy to support the zeroing allocs with generic inline
      functions in slab.h.  Provide inline definitions to allow the continued use of
      kzalloc, kmem_cache_zalloc etc but remove other definitions of zeroing
      functions from the slab allocators and util.c.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      81cda662
    • C
      SLUB: add some more inlines and #ifdef CONFIG_SLUB_DEBUG · 0c710013
      Christoph Lameter 提交于
      Add #ifdefs around data structures only needed if debugging is compiled into
      SLUB.
      
      Add inlines to small functions to reduce code size.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0c710013
    • C
      Slab allocators: consistent ZERO_SIZE_PTR support and NULL result semantics · 6cb8f913
      Christoph Lameter 提交于
      Define ZERO_OR_NULL_PTR macro to be able to remove the checks from the
      allocators.  Move ZERO_SIZE_PTR related stuff into slab.h.
      
      Make ZERO_SIZE_PTR work for all slab allocators and get rid of the
      WARN_ON_ONCE(size == 0) that is still remaining in SLAB.
      
      Make slub return NULL like the other allocators if a too large memory segment
      is requested via __kmalloc.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6cb8f913
    • R
      mm: clean up and kernelify shrinker registration · 8e1f936b
      Rusty Russell 提交于
      I can never remember what the function to register to receive VM pressure
      is called.  I have to trace down from __alloc_pages() to find it.
      
      It's called "set_shrinker()", and it needs Your Help.
      
      1) Don't hide struct shrinker.  It contains no magic.
      2) Don't allocate "struct shrinker".  It's not helpful.
      3) Call them "register_shrinker" and "unregister_shrinker".
      4) Call the function "shrink" not "shrinker".
      5) Reduce the 17 lines of waffly comments to 13, but document it properly.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: David Chinner <dgc@sgi.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8e1f936b
    • A
      Lumpy Reclaim V4 · 5ad333eb
      Andy Whitcroft 提交于
      When we are out of memory of a suitable size we enter reclaim.  The current
      reclaim algorithm targets pages in LRU order, which is great for fairness at
      order-0 but highly unsuitable if you desire pages at higher orders.  To get
      pages of higher order we must shoot down a very high proportion of memory;
      >95% in a lot of cases.
      
      This patch set adds a lumpy reclaim algorithm to the allocator.  It targets
      groups of pages at the specified order anchored at the end of the active and
      inactive lists.  This encourages groups of pages at the requested orders to
      move from active to inactive, and active to free lists.  This behaviour is
      only triggered out of direct reclaim when higher order pages have been
      requested.
      
      This patch set is particularly effective when utilised with an
      anti-fragmentation scheme which groups pages of similar reclaimability
      together.
      
      This patch set is based on Peter Zijlstra's lumpy reclaim V2 patch which forms
      the foundation.  Credit to Mel Gorman for sanitity checking.
      
      Mel said:
      
        The patches have an application with hugepage pool resizing.
      
        When lumpy-reclaim is used used with ZONE_MOVABLE, the hugepages pool can
        be resized with greater reliability.  Testing on a desktop machine with 2GB
        of RAM showed that growing the hugepage pool with ZONE_MOVABLE on it's own
        was very slow as the success rate was quite low.  Without lumpy-reclaim,
        each attempt to grow the pool by 100 pages would yield 1 or 2 hugepages.
        With lumpy-reclaim, getting 40 to 70 hugepages on each attempt was typical.
      
      [akpm@osdl.org: ia64 pfn_to_nid fixes and loop cleanup]
      [bunk@stusta.de: static declarations for internal functions]
      [a.p.zijlstra@chello.nl: initial lumpy V2 implementation]
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NMel Gorman <mel@csn.ul.ie>
      Acked-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Bob Picco <bob.picco@hp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5ad333eb
    • M
      handle kernelcore=: generic · ed7ed365
      Mel Gorman 提交于
      This patch adds the kernelcore= parameter for x86.
      
      Once all patches are applied, a new command-line parameter exist and a new
      sysctl.  This patch adds the necessary documentation.
      
      From: Yasunori Goto <y-goto@jp.fujitsu.com>
      
        When "kernelcore" boot option is specified, kernel can't boot up on ia64
        because of an infinite loop.  In addition, the parsing code can be handled
        in an architecture-independent manner.
      
        This patch uses common code to handle the kernelcore= parameter.  It is
        only available to architectures that support arch-independent zone-sizing
        (i.e.  define CONFIG_ARCH_POPULATES_NODE_MAP).  Other architectures will
        ignore the boot parameter.
      
      [bunk@stusta.de: make cmdline_parse_kernelcore() static]
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Acked-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed7ed365
    • M
      Allow huge page allocations to use GFP_HIGH_MOVABLE · 396faf03
      Mel Gorman 提交于
      Huge pages are not movable so are not allocated from ZONE_MOVABLE.  However,
      as ZONE_MOVABLE will always have pages that can be migrated or reclaimed, it
      can be used to satisfy hugepage allocations even when the system has been
      running a long time.  This allows an administrator to resize the hugepage pool
      at runtime depending on the size of ZONE_MOVABLE.
      
      This patch adds a new sysctl called hugepages_treat_as_movable.  When a
      non-zero value is written to it, future allocations for the huge page pool
      will use ZONE_MOVABLE.  Despite huge pages being non-movable, we do not
      introduce additional external fragmentation of note as huge pages are always
      the largest contiguous block we care about.
      
      [akpm@linux-foundation.org: various fixes]
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      396faf03
    • M
      Create the ZONE_MOVABLE zone · 2a1e274a
      Mel Gorman 提交于
      The following 8 patches against 2.6.20-mm2 create a zone called ZONE_MOVABLE
      that is only usable by allocations that specify both __GFP_HIGHMEM and
      __GFP_MOVABLE.  This has the effect of keeping all non-movable pages within a
      single memory partition while allowing movable allocations to be satisfied
      from either partition.  The patches may be applied with the list-based
      anti-fragmentation patches that groups pages together based on mobility.
      
      The size of the zone is determined by a kernelcore= parameter specified at
      boot-time.  This specifies how much memory is usable by non-movable
      allocations and the remainder is used for ZONE_MOVABLE.  Any range of pages
      within ZONE_MOVABLE can be released by migrating the pages or by reclaiming.
      
      When selecting a zone to take pages from for ZONE_MOVABLE, there are two
      things to consider.  First, only memory from the highest populated zone is
      used for ZONE_MOVABLE.  On the x86, this is probably going to be ZONE_HIGHMEM
      but it would be ZONE_DMA on ppc64 or possibly ZONE_DMA32 on x86_64.  Second,
      the amount of memory usable by the kernel will be spread evenly throughout
      NUMA nodes where possible.  If the nodes are not of equal size, the amount of
      memory usable by the kernel on some nodes may be greater than others.
      
      By default, the zone is not as useful for hugetlb allocations because they are
      pinned and non-migratable (currently at least).  A sysctl is provided that
      allows huge pages to be allocated from that zone.  This means that the huge
      page pool can be resized to the size of ZONE_MOVABLE during the lifetime of
      the system assuming that pages are not mlocked.  Despite huge pages being
      non-movable, we do not introduce additional external fragmentation of note as
      huge pages are always the largest contiguous block we care about.
      
      Credit goes to Andy Whitcroft for catching a large variety of problems during
      review of the patches.
      
      This patch creates an additional zone, ZONE_MOVABLE.  This zone is only usable
      by allocations which specify both __GFP_HIGHMEM and __GFP_MOVABLE.  Hot-added
      memory continues to be placed in their existing destination as there is no
      mechanism to redirect them to a specific zone.
      
      [y-goto@jp.fujitsu.com: Fix section mismatch of memory hotplug related code]
      [akpm@linux-foundation.org: various fixes]
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2a1e274a
    • M
      Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated · 769848c0
      Mel Gorman 提交于
      It is often known at allocation time whether a page may be migrated or not.
      This patch adds a flag called __GFP_MOVABLE and a new mask called
      GFP_HIGH_MOVABLE.  Allocations using the __GFP_MOVABLE can be either migrated
      using the page migration mechanism or reclaimed by syncing with backing
      storage and discarding.
      
      An API function very similar to alloc_zeroed_user_highpage() is added for
      __GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable().  The
      flags used by alloc_zeroed_user_highpage() are not changed because it would
      change the semantics of an existing API.  After this patch is applied there
      are no in-kernel users of alloc_zeroed_user_highpage() so it probably should
      be marked deprecated if this patch is merged.
      
      Note that this patch includes a minor cleanup to the use of __GFP_ZERO in
      shmem.c to keep all flag modifications to inode->mapping in the
      shmem_dir_alloc() helper function.  This clean-up suggestion is courtesy of
      Hugh Dickens.
      
      Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the
      concept.  Credit to Hugh Dickens for catching issues with shmem swap vector
      and ramfs allocations.
      
      [akpm@linux-foundation.org: build fix]
      [hugh@veritas.com: __GFP_ZERO cleanup]
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      769848c0
  2. 17 7月, 2007 28 次提交