1. 17 10月, 2007 7 次提交
    • M
      Group short-lived and reclaimable kernel allocations · e12ba74d
      Mel Gorman 提交于
      This patch marks a number of allocations that are either short-lived such as
      network buffers or are reclaimable such as inode allocations.  When something
      like updatedb is called, long-lived and unmovable kernel allocations tend to
      be spread throughout the address space which increases fragmentation.
      
      This patch groups these allocations together as much as possible by adding a
      new MIGRATE_TYPE.  The MIGRATE_RECLAIMABLE type is for allocations that can be
      reclaimed on demand, but not moved.  i.e.  they can be migrated by deleting
      them and re-reading the information from elsewhere.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e12ba74d
    • C
      Categorize GFP flags · 6cb06229
      Christoph Lameter 提交于
      The function of GFP_LEVEL_MASK seems to be unclear.  In order to clear up
      the mystery we get rid of it and replace GFP_LEVEL_MASK with 3 sets of GFP
      flags:
      
      GFP_RECLAIM_MASK	Flags used to control page allocator reclaim behavior.
      
      GFP_CONSTRAINT_MASK	Flags used to limit where allocations can occur.
      
      GFP_SLAB_BUG_MASK	Flags that the slab allocator BUG()s on.
      
      These replace the uses of GFP_LEVEL mask in the slab allocators and in
      vmalloc.c.
      
      The use of the flags not included in these sets may occur as a result of a
      slab allocation standing in for a page allocation when constructing scatter
      gather lists.  Extraneous flags are cleared and not passed through to the
      page allocator.  __GFP_MOVABLE/RECLAIMABLE, __GFP_COLD and __GFP_COMP will
      now be ignored if passed to a slab allocator.
      
      Change the allocation of allocator meta data in SLAB and vmalloc to not
      pass through flags listed in GFP_CONSTRAINT_MASK.  SLAB already removes the
      __GFP_THISNODE flag for such allocations.  Generalize that to also cover
      vmalloc.  The use of GFP_CONSTRAINT_MASK also includes __GFP_HARDWALL.
      
      The impact of allocator metadata placement on access latency to the
      cachelines of the object itself is minimal since metadata is only
      referenced on alloc and free.  The attempt is still made to place the meta
      data optimally but we consistently allow fallback both in SLAB and vmalloc
      (SLUB does not need to allocate metadata like that).
      
      Allocator metadata may serve multiple in kernel users and thus should not
      be subject to the limitations arising from a single allocation context.
      
      [akpm@linux-foundation.org: fix fallback_alloc()]
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6cb06229
    • C
      Memoryless nodes: SLUB support · f64dc58c
      Christoph Lameter 提交于
      Simply switch all for_each_online_node to for_each_node_state(NORMAL_MEMORY).
      That way SLUB only operates on nodes with regular memory.  Any allocation
      attempt on a memoryless node or a node with just highmem will fall whereupon
      SLUB will fetch memory from a nearby node (depending on how memory policies
      and cpuset describe fallback).
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Tested-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: NBob Picco <bob.picco@hp.com>
      Cc: Nishanth Aravamudan <nacc@us.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@skynet.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f64dc58c
    • C
      Slab allocators: fail if ksize is called with a NULL parameter · ef8b4520
      Christoph Lameter 提交于
      A NULL pointer means that the object was not allocated.  One cannot
      determine the size of an object that has not been allocated.  Currently we
      return 0 but we really should BUG() on attempts to determine the size of
      something nonexistent.
      
      krealloc() interprets NULL to mean a zero sized object.  Handle that
      separately in krealloc().
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef8b4520
    • S
      {slub, slob}: use unlikely() for kfree(ZERO_OR_NULL_PTR) check · 2408c550
      Satyam Sharma 提交于
      Considering kfree(NULL) would normally occur only in error paths and
      kfree(ZERO_SIZE_PTR) is uncommon as well, so let's use unlikely() for the
      condition check in SLUB's and SLOB's kfree() to optimize for the common
      case.  SLAB has this already.
      Signed-off-by: NSatyam Sharma <satyam@infradead.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2408c550
    • C
      SLUB: direct pass through of page size or higher kmalloc requests · aadb4bc4
      Christoph Lameter 提交于
      This gets rid of all kmalloc caches larger than page size.  A kmalloc
      request larger than PAGE_SIZE > 2 is going to be passed through to the page
      allocator.  This works both inline where we will call __get_free_pages
      instead of kmem_cache_alloc and in __kmalloc.
      
      kfree is modified to check if the object is in a slab page. If not then
      the page is freed via the page allocator instead. Roughly similar to what
      SLOB does.
      
      Advantages:
      - Reduces memory overhead for kmalloc array
      - Large kmalloc operations are faster since they do not
        need to pass through the slab allocator to get to the
        page allocator.
      - Performance increase of 10%-20% on alloc and 50% on free for
        PAGE_SIZEd allocations.
        SLUB must call page allocator for each alloc anyways since
        the higher order pages which that allowed avoiding the page alloc calls
        are not available in a reliable way anymore. So we are basically removing
        useless slab allocator overhead.
      - Large kmallocs yields page aligned object which is what
        SLAB did. Bad things like using page sized kmalloc allocations to
        stand in for page allocate allocs can be transparently handled and are not
        distinguishable from page allocator uses.
      - Checking for too large objects can be removed since
        it is done by the page allocator.
      
      Drawbacks:
      - No accounting for large kmalloc slab allocations anymore
      - No debugging of large kmalloc slab allocations.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      aadb4bc4
    • A
      slub.c:early_kmem_cache_node_alloc() shouldn't be __init · 1cd7daa5
      Adrian Bunk 提交于
      WARNING: mm/built-in.o(.text+0x24bd3): Section mismatch: reference to .init.text:early_kmem_cache_node_alloc (between 'init_kmem_cache_nodes' and 'calculate_sizes')
      ...
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Acked-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1cd7daa5
  2. 12 9月, 2007 1 次提交
  3. 31 8月, 2007 1 次提交
  4. 23 8月, 2007 2 次提交
  5. 10 8月, 2007 2 次提交
    • C
      SLUB: Fix dynamic dma kmalloc cache creation · 1ceef402
      Christoph Lameter 提交于
      The dynamic dma kmalloc creation can run into trouble if a
      GFP_ATOMIC allocation is the first one performed for a certain size
      of dma kmalloc slab.
      
      - Move the adding of the slab to sysfs into a workqueue
        (sysfs does GFP_KERNEL allocations)
      - Do not call kmem_cache_destroy() (uses slub_lock)
      - Only acquire the slub_lock once and--if we cannot wait--do a trylock.
      
        This introduces a slight risk of the first kmalloc(x, GFP_DMA|GFP_ATOMIC)
        for a range of sizes failing due to another process holding the slub_lock.
        However, we only need to acquire the spinlock once in order to establish
        each power of two DMA kmalloc cache. The possible conflict is with the
        slub_lock taken during slab management actions (create / remove slab cache).
      
        It is rather typical that a driver will first fill its buffers using
        GFP_KERNEL allocations which will wait until the slub_lock can be acquired.
        Drivers will also create its slab caches first outside of an atomic
        context before starting to use atomic kmalloc from an interrupt context.
      
        If there are any failures then they will occur early after boot or when
        loading of multiple drivers concurrently. Drivers can already accomodate
        failures of GFP_ATOMIC for other reasons. Retries will then create the slab.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      1ceef402
    • C
      SLUB: Remove checks for MAX_PARTIAL from kmem_cache_shrink · fcda3d89
      Christoph Lameter 提交于
      The MAX_PARTIAL checks were supposed to be an optimization. However, slab
      shrinking is a manually triggered process either through running slabinfo
      or by the kernel calling kmem_cache_shrink.
      
      If one really wants to shrink a slab then all operations should be done
      regardless of the size of the partial list. This also fixes an issue that
      could surface if the number of partial slabs was initially above MAX_PARTIAL
      in kmem_cache_shrink and later drops below MAX_PARTIAL through the
      elimination of empty slabs on the partial list (rare). In that case a few
      slabs may be left off the partial list (and only be put back when they
      are empty).
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      fcda3d89
  6. 31 7月, 2007 2 次提交
  7. 20 7月, 2007 2 次提交
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt 提交于
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      20c2df83
    • L
      slub: fix ksize() for zero-sized pointers · 9550b105
      Linus Torvalds 提交于
      The slab and slob allocators already did this right, but slub would call
      "get_object_page()" on the magic ZERO_SIZE_PTR, with all kinds of nasty
      end results.
      
      Noted by Ingo Molnar.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9550b105
  8. 18 7月, 2007 20 次提交
  9. 17 7月, 2007 1 次提交
  10. 07 7月, 2007 1 次提交
  11. 04 7月, 2007 1 次提交