  1. 18 Jul, 2007 12 commits
  2. 17 Jul, 2007 1 commit
  3. 07 Jul, 2007 1 commit
  4. 04 Jul, 2007 1 commit
  5. 24 Jun, 2007 1 commit
  6. 17 Jun, 2007 2 commits
  7. 09 Jun, 2007 1 commit
  8. 01 Jun, 2007 1 commit
  9. 31 May, 2007 1 commit
    • SLUB: Fix NUMA / SYSFS bootstrap issue · 8ffa6875
      Committed by Christoph Lameter
      We need this patch merged ASAP.  The patch fixes the mysterious hang that
      remained on some particular configurations with lockdep on after the first
      fix that moved the #ifdef CONFIG_SLUB_DEBUG to the right location.  See
      http://marc.info/?t=117963072300001&r=1&w=2
      
      The kmem_cache_node cache is very special because it is needed for NUMA
      bootstrap.  Under certain conditions (for example, if lockdep is enabled
      and significantly increases the size of spinlock_t) the structure may
      become exactly the same size as one of the larger caches in the kmalloc
      array.
      
      That early during bootstrap we cannot perform merging properly.  The unique
      id for the kmem_cache_node cache will match that of one of the kmalloc caches.
      Sysfs will complain about a duplicate directory entry.  All of this occurs
      while the console is not yet fully operational.  Thus boot may appear to be
      silently failing.
      
      The kmem_cache_node cache is very special.  During early bootstrap the main
      allocation function is not operational yet and so we have to run our own
      small special alloc function during early boot.  It is also special in that
      it is never freed.
      
      We really do not want any merging on that cache.  Set the refcount to -1 and
      forbid merging of slabs that have a negative refcount.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
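The merge restriction described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct cache_model`, `cache_unmergeable()` and `find_mergeable()` are hypothetical names, and the real SLUB merge logic compares more than the object size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a slab cache descriptor. */
struct cache_model {
    const char *name;
    size_t size;    /* object size in bytes */
    int refcount;   /* negative means "never merge with this cache" */
};

/* A cache with a negative refcount must not be used as a merge target. */
static bool cache_unmergeable(const struct cache_model *c)
{
    return c->refcount < 0;
}

/* Return the first existing cache a new cache of `size` could alias, or NULL. */
static struct cache_model *find_mergeable(struct cache_model *caches,
                                          size_t n, size_t size)
{
    for (size_t i = 0; i < n; i++) {
        if (cache_unmergeable(&caches[i]))
            continue;               /* skip bootstrap-only caches */
        if (caches[i].size == size)
            return &caches[i];
    }
    return NULL;
}
```

With this check in place, a bootstrap cache whose size happens to equal a kmalloc cache size is simply skipped, so no second cache ends up sharing its identity.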
  10. 24 May, 2007 2 commits
  11. 17 May, 2007 6 commits
  12. 11 May, 2007 2 commits
    • SLUB: remove nr_cpu_ids hack · bcf889f9
      Committed by Christoph Lameter
      This was in SLUB in order to head off trouble while the nr_cpu_ids
      functionality was not merged.  It is merged now, so there is no need to keep this.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: support concurrent local and remote frees and allocs on a slab · 894b8788
      Committed by Christoph Lameter
      Avoid atomic overhead in slab_alloc and slab_free
      
      SLUB needs to use the slab_lock for the per cpu slabs to synchronize with
      potential kfree operations.  This patch avoids that need by moving all free
      objects onto a lockless_freelist.  The regular freelist continues to exist
      and will be used to free objects.  So while we consume the
      lockless_freelist the regular freelist may build up objects.
      
      If we are out of objects on the lockless_freelist then we may check the
      regular freelist.  If it has objects then we move those over to the
      lockless_freelist and try again.  This yields significant savings in
      terms of the atomic operations that have to be performed.
      
      We can even free directly to the lockless_freelist if we know that we are
      running on the same processor.  So this speeds up short lived objects.
      They may be allocated and freed without taking the slab_lock.  This is
      particularly good for netperf.
      
      In order to maximize the effect of the new faster hotpath we extract the
      hottest performance pieces into inlined functions.  These are then inlined
      into kmem_cache_alloc and kmem_cache_free.  So hotpath allocation and
      freeing no longer requires a subroutine call within SLUB.
      
      [I am not sure that it is worth doing this because it changes the easy to
      read structure of slub just to reduce atomic ops.  However, there is
      someone out there with a benchmark on 4 way and 8 way processor systems
      that seems to show a 5% regression vs.  SLAB.  It seems that the regression
      is due to increased use of atomic operations in SLUB vs.  SLAB.  I wonder if
      this is applicable or discernible at all in a real workload?]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
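The two-list scheme in this commit message can be sketched as a single-threaded user-space model. It is not the kernel code: `struct slab_model`, `model_alloc()` and `model_free()` are hypothetical names, the `same_cpu` flag stands in for the kernel's own check, and the locking that would guard the regular freelist is only indicated in comments.

```c
#include <stdbool.h>
#include <stddef.h>

/* A free object stores the link to the next free object in its own memory. */
struct object {
    struct object *next;
};

struct slab_model {
    struct object *lockless_freelist; /* touched only by the owning CPU */
    struct object *freelist;          /* regular list, filled by remote frees */
};

/* Allocate from the lockless list; refill it from the regular list if empty. */
static void *model_alloc(struct slab_model *s)
{
    if (!s->lockless_freelist) {
        /* Slow path: the kernel would take the slab_lock around this move. */
        s->lockless_freelist = s->freelist;
        s->freelist = NULL;
    }
    struct object *obj = s->lockless_freelist;
    if (!obj)
        return NULL;                  /* slab has no free objects left */
    s->lockless_freelist = obj->next;
    return obj;
}

/* Free locally without locking, or remotely onto the regular freelist. */
static void model_free(struct slab_model *s, void *p, bool same_cpu)
{
    struct object *obj = p;

    if (same_cpu) {
        obj->next = s->lockless_freelist;
        s->lockless_freelist = obj;
    } else {
        /* Remote free: the kernel would hold the slab_lock here. */
        obj->next = s->freelist;
        s->freelist = obj;
    }
}
```

In this model a short-lived object that is allocated and freed on the same processor never touches the regular freelist, which is the case the message singles out as the fast path.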
  13. 10 May, 2007 9 commits
    • Move remote node draining out of slab allocators · 4037d452
      Committed by Christoph Lameter
      Currently the slab allocators contain callbacks into the page allocator to
      perform the draining of pagesets on remote nodes.  This requires SLUB to have
      a whole subsystem in order to be compatible with SLAB.  Moving node draining
      out of the slab allocators avoids a section of code in SLUB.
      
      Move the node draining so that it is done when the vm statistics are updated.
      At that point we are already touching all the cachelines with the pagesets of
      a processor.
      
      Add an expire counter there.  If we have to update per zone or global vm
      statistics then assume that the pageset will require subsequent draining.
      
      The expire counter will be decremented on each vm stats update pass until it
      reaches zero.  Then we will drain one batch from the pageset.  The draining
      will cause vm counter updates which will then cause another expiration until
      the pcp is empty.  So we will drain a batch every 3 seconds.
      
      Note that remote node draining is a somewhat esoteric feature that is required
      on large NUMA systems because otherwise significant portions of system memory
      can become trapped in pcp queues.  The number of pcp is determined by the
      number of processors and nodes in a system.  A system with 4 processors and 2
      nodes has 8 pcps which is okay.  But a system with 1024 processors and 512
      nodes has 512k pcps with a high potential for a large amount of memory being
      caught in them.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
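The expire counter described above can be modelled in a few lines of user-space C. This is a sketch, not the kernel implementation: `struct pageset_model` and `stats_pass()` are hypothetical names, and `EXPIRE_PASSES = 3` is an assumption inferred from the message's "a batch every 3 seconds" combined with one stats pass per second.

```c
#include <stdbool.h>

#define EXPIRE_PASSES 3 /* assumed: three passes between drains */

/* Hypothetical model of one remote per-cpu pageset (pcp). */
struct pageset_model {
    int count;  /* pages currently held in the pcp queue */
    int batch;  /* pages released per drain */
    int expire; /* stats passes left before the next drain, 0 = idle */
};

/*
 * One vm statistics update pass.  `counters_changed` says whether this pass
 * had to fold counter updates.  Returns the number of pages drained.
 */
static int stats_pass(struct pageset_model *p, bool counters_changed)
{
    if (counters_changed)
        p->expire = EXPIRE_PASSES;  /* assume draining will be needed */

    if (p->expire == 0 || p->count == 0)
        return 0;                   /* nothing pending or nothing to drain */

    if (--p->expire > 0)
        return 0;                   /* not expired yet */

    int drained = p->count < p->batch ? p->count : p->batch;
    p->count -= drained;
    return drained;
}
```

Because a drain itself changes vm counters, the caller would pass `counters_changed = true` on the pass following a drain, which re-arms the counter until the queue is empty.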
    • vmstat: use our own timer events · d1187ed2
      Committed by Christoph Lameter
      vmstat is currently using the cache reaper to periodically bring the
      statistics up to date.  The cache reaper exists in SLUB only as a way to
      provide compatibility with SLAB.  This patch removes the vmstat calls from the
      slab allocators and provides its own handling.
      
      The advantage is also that we can use a different frequency for the updates.
      Refreshing vm stats is a pretty fast job so we can run this every second and
      stagger this by only one tick.  This will lead to some overlap in large
      systems.  For example, a system running at 250 HZ with 1024 processors will
      have about 4 vm updates occurring at once.
      
      However, the vm stats update only accesses per node information.  It is only
      necessary to stagger the vm statistics updates per processor in each node.  Vm
      counter updates occurring on distant nodes will not cause cacheline
      contention.
      
      We could implement an alternate approach that runs the first processor on each
      node at the second and then each of the other processors on a node on a
      subsequent tick.  That may be useful to keep a large amount of the second free
      of timer activity.  Maybe the timer folks will have some feedback on this one?
      
      [jirislaby@gmail.com: add missing break]
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
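The overlap figure in this message (250 HZ, 1024 processors) can be checked with a little arithmetic. The sketch below assumes that processor n runs its once-per-second update at a tick offset of n modulo HZ; that placement rule and the function name `cpus_on_tick()` are assumptions made for illustration.

```c
/*
 * Number of processors whose per-second vm stats update lands on a given
 * tick (0 <= tick < hz), assuming processor n is offset by n % hz ticks.
 */
static unsigned int cpus_on_tick(unsigned int hz, unsigned int cpus,
                                 unsigned int tick)
{
    unsigned int base = cpus / hz;                     /* full rounds */
    unsigned int extra = (tick < cpus % hz) ? 1u : 0u; /* leftover cpus */

    return base + extra;
}
```

Under that assumption, 1024 processors at 250 HZ give 4 updates on most ticks and 5 on the first 24 ticks of each second, since 1024 = 4 * 250 + 24.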
    • Add suspend-related notifications for CPU hotplug · 8bb78442
      Committed by Rafael J. Wysocki
      Since nonboot CPUs are now disabled after tasks and devices have been
      frozen and the CPU hotplug infrastructure is used for this purpose, we need
      special CPU hotplug notifications that will help the CPU-hotplug-aware
      subsystems distinguish normal CPU hotplug events from CPU hotplug events
      related to a system-wide suspend or resume operation in progress.  This
      patch introduces such notifications and causes them to be used during
      suspend and resume transitions.  It also changes all of the
      CPU-hotplug-aware subsystems to take these notifications into consideration
      (for now they are handled in the same way as the corresponding "normal"
      ones).
      
      [oleg@tv-sign.ru: cleanups]
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
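One way a subsystem could tell suspend-related hotplug events from normal ones, while still handling both identically, is sketched below. Everything here is an assumption made for illustration: the flag-bit encoding, the `MODEL_*` names and their values are hypothetical and are not taken from the commit message.

```c
#include <stdbool.h>

/* Hypothetical encoding: one flag bit marks the suspend/resume variants. */
#define MODEL_TASKS_FROZEN 0x10UL

enum model_cpu_event {
    MODEL_CPU_ONLINE = 0x02,
    MODEL_CPU_DEAD   = 0x07,
};

/* True if the event was raised as part of a system-wide suspend or resume. */
static bool event_during_suspend(unsigned long action)
{
    return (action & MODEL_TASKS_FROZEN) != 0;
}

/*
 * Handle the event the same way for the normal and the suspend variant.
 * Returns true if the event was recognised.
 */
static bool handle_cpu_event(unsigned long action, int *online_cpus)
{
    switch (action & ~MODEL_TASKS_FROZEN) {
    case MODEL_CPU_ONLINE:
        (*online_cpus)++;
        return true;
    case MODEL_CPU_DEAD:
        (*online_cpus)--;
        return true;
    default:
        return false;
    }
}
```

A subsystem that later needs different behaviour during suspend could branch on `event_during_suspend()` before the switch, without changing how normal events are handled.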
    • krealloc: fix kerneldoc comments · 7ae439ce
      Committed by Pekka J Enberg
      No "blank" (or "*") line is allowed between the function name and the lines
      for its parameter(s).
      
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • SLUB: rework slab order determination · 5e6d444e
      Committed by Christoph Lameter
      In some cases SLUB needlessly creates slabs that are larger than
      slub_max_order.  Also the layout of some of the slabs was not satisfactory.
      
      Go to an iterative approach.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • SLUB: include lifetime stats and sets of cpus / nodes in tracking output · 45edfa58
      Committed by Christoph Lameter
      We have information about how long an object existed and about the nodes and
      cpus where the allocations and frees took place.  Add that information to the
      tracking output in /sys/slab/xx/alloc_calls and /sys/slab/xx/free_calls.
      
      This will then enable slabinfo to output nice reports like this:
      
        christoph@qirst:~/slub$ ./slabinfo kmalloc-128
      
        Slabcache: kmalloc-128           Aliases:  0 Order :  0
      
        Sizes (bytes)     Slabs              Debug                Memory
        ------------------------------------------------------------------------
        Object :     128  Total  :      12   Sanity Checks : On   Total:   49152
        SlabObj:     200  Full   :       7   Redzoning     : On   Used :   24832
        SlabSiz:    4096  Partial:       4   Poisoning     : On   Loss :   24320
        Loss   :      72  CpuSlab:       1   Tracking      : On   Lalig:   13968
        Align  :       8  Objects:      20   Tracing       : Off  Lpadd:    1152
      
        kmalloc-128 has no kmem_cache operations
      
        kmalloc-128: Kernel object allocation
        -----------------------------------------------------------------------
              6 param_sysfs_setup+0x71/0x130 age=284512/284512/284512 pid=1 nodes=0-1,3
             11 percpu_populate+0x39/0x80 age=283914/284428/284512 pid=1 nodes=0
             21 __register_chrdev_region+0x31/0x170 age=282896/284347/284473 pid=1-1705 nodes=0-2
              1 sys_inotify_init+0x76/0x1c0 age=283423 pid=1004 nodes=0
             19 as_get_io_context+0x32/0xd0 age=6/247567/283988 pid=1-11782 nodes=0,2
             10 ida_pre_get+0x4a/0x80 age=277666/283773/284526 pid=0-2177 nodes=0,2
             24 kobject_kset_add_dir+0x37/0xb0 age=282727/283860/284472 pid=1-1723 nodes=0-2
              1 acpi_ds_build_internal_buffer_obj+0xd3/0x11d age=284508 pid=1 nodes=0
             24 con_insert_unipair+0xd7/0x110 age=284438/284438/284438 pid=1 nodes=0,2
              1 uart_open+0x2d2/0x4b0 age=283896 pid=1 nodes=0
             26 dma_pool_create+0x73/0x1a0 age=282762/282833/282916 pid=1705-1723 nodes=0
              1 neigh_table_init_no_netlink+0xd2/0x210 age=284461 pid=1 nodes=0
              2 neigh_parms_alloc+0x2b/0xe0 age=284410/284411/284412 pid=1 nodes=2
              2 neigh_resolve_output+0x1e1/0x280 age=276289/276291/276293 pid=0-2443 nodes=0
              1 netlink_kernel_create+0x90/0x170 age=284472 pid=1 nodes=0
              4 xt_alloc_table_info+0x39/0xf0 age=283958/283958/283959 pid=1 nodes=1
              3 fn_hash_insert+0x473/0x720 age=277653/277661/277666 pid=2177-2185 nodes=0
              1 get_mtrr_state+0x285/0x2a0 age=284526 pid=0 nodes=0
              1 cacheinfo_cpu_callback+0x26d/0x3e0 age=284458 pid=1 nodes=0
             29 kernel_param_sysfs_setup+0x25/0x90 age=284511/284511/284512 pid=1 nodes=0-1,3
              5 process_zones+0x5e/0x170 age=284546/284546/284546 pid=0 nodes=0
              1 drm_core_init+0x48/0x160 age=284421 pid=1 nodes=2
      
        kmalloc-128: Kernel object freeing
        ------------------------------------------------------------------------
            163 <not-available> age=4295176847 pid=0 nodes=0-3
              1 __vunmap+0x6e/0xf0 age=282907 pid=1723 nodes=0
             28 free_as_io_context+0x12/0x90 age=9243/262197/283474 pid=42-11754 nodes=0
              1 acpi_get_object_info+0x1b7/0x1d4 age=284475 pid=1 nodes=0
              1 do_acpi_find_child+0x45/0x4e age=284475 pid=1 nodes=0
      
        NUMA nodes           :    0    1    2    3
        ------------------------------------------
        All slabs                 7    2    2    1
        Partial slabs             2    2    0    0
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
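The `age=` fields in the report above appear either as a single value or as three slash-separated values. A plausible reading, assumed here rather than stated in the commit message, is minimum/average/maximum object age per call site. The sketch below accumulates and formats such a field; `struct age_stats`, `age_add()` and `age_format()` are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-call-site lifetime accumulator. */
struct age_stats {
    unsigned long count;     /* number of objects recorded */
    unsigned long min;       /* smallest age seen */
    unsigned long max;       /* largest age seen */
    unsigned long long sum;  /* sum of all ages, for the average */
};

/* Record the age of one more object. */
static void age_add(struct age_stats *s, unsigned long age)
{
    if (s->count == 0 || age < s->min)
        s->min = age;
    if (s->count == 0 || age > s->max)
        s->max = age;
    s->sum += age;
    s->count++;
}

/*
 * Format as "age=min/avg/max" when more than one object was recorded,
 * otherwise as a single "age=value" (an empty accumulator prints age=0).
 */
static int age_format(const struct age_stats *s, char *buf, size_t len)
{
    if (s->count > 1)
        return snprintf(buf, len, "age=%lu/%lu/%lu", s->min,
                        (unsigned long)(s->sum / s->count), s->max);
    return snprintf(buf, len, "age=%lu", s->min);
}
```

Under this reading, a call site whose objects were all allocated at the same time still prints three identical values, which matches lines in the report such as `age=284512/284512/284512`.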
    • SLUB: add CONFIG_SLUB_DEBUG · 41ecc55b
      Committed by Christoph Lameter
      CONFIG_SLUB_DEBUG can be used to switch off the debugging and sysfs components
      of SLUB.  Thus SLUB will be able to replace SLOB.  SLUB can arrange objects in
      a denser way than SLOB and the code size should be minimal without debugging
      and sysfs support.
      
      Note that CONFIG_SLUB_DEBUG is materially different from CONFIG_SLAB_DEBUG.
      CONFIG_SLAB_DEBUG is used to enable slab debugging in SLAB.  SLUB enables
      debugging via a boot parameter.  SLUB debug code should always be present.
      
      CONFIG_SLUB_DEBUG can be modified in the embedded config section.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
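Compiling debugging support out with a configuration symbol typically follows the pattern sketched below. This is a generic illustration, not code from the patch: `model_check_object()` and `POISON_BYTE` are hypothetical, and only the symbol name `CONFIG_SLUB_DEBUG` comes from the commit message.

```c
#include <stddef.h>

#define POISON_BYTE 0x6b /* illustrative fill pattern for free objects */

#ifdef CONFIG_SLUB_DEBUG
/* Debug build: verify that a free object still carries the poison pattern. */
static int model_check_object(const unsigned char *obj, size_t size)
{
    for (size_t i = 0; i < size; i++)
        if (obj[i] != POISON_BYTE)
            return 0;   /* corruption detected */
    return 1;
}
#else
/* Non-debug build: the check collapses to a constant and costs no code. */
static inline int model_check_object(const unsigned char *obj, size_t size)
{
    (void)obj;
    (void)size;
    return 1;
}
#endif
```

Callers use `model_check_object()` unconditionally. When the symbol is not defined, the compiler can drop the call entirely, which is how the code size stays small without the debugging support.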
    • SLUB: move tracking definitions and check_valid_pointer() away from debug code · 02cbc874
      Committed by Christoph Lameter
      Move the tracking definitions and the check_valid_pointer() function away from
      the debugging related functions.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • SLUB: consolidate trace code · 636f0d7d
      Committed by Christoph Lameter
      Trace in both slab_alloc and slab_free has a lot of common code.  Use a single
      function for both.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>