1. 09 May 2007, 2 commits
  2. 08 May 2007, 14 commits
  3. 03 May 2007, 1 commit
  4. 04 Apr 2007, 1 commit
  5. 02 Mar 2007, 1 commit
  6. 21 Feb 2007, 1 commit
  7. 12 Feb 2007, 6 commits
  8. 06 Jan 2007, 1 commit
  9. 23 Dec 2006, 2 commits
  10. 14 Dec 2006, 4 commits
    •
      [PATCH] SLAB: use a multiply instead of a divide in obj_to_index() · 6a2d7a95
      Committed by Eric Dumazet
      When some objects are allocated by one CPU but freed by another CPU, we
      can consume a lot of cycles doing divides in obj_to_index().
      
      (A typical load: a dual-processor machine where network interrupts are
      handled by one particular CPU (allocating skbufs), while the other CPU
      runs the application (consuming and freeing skbufs).)
      
      On one production server (a dual-core AMD Opteron 285), I noticed this
      divide took 1.20% of CPU_CLK_UNHALTED events in the kernel.  But
      Opterons are quite modern CPUs, and the divide is much more expensive
      on older architectures:
      
      On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1
      cycle for a multiply.
      
      Doing some math, we can use a reciprocal multiplication instead of a divide.
      
      If we want to compute V = (A / B) (A and B being u32 quantities), we
      can instead use:
      
      V = ((u64)A * RECIPROCAL(B)) >> 32;
      
      where RECIPROCAL(B) is precalculated as ((1LL << 32) + (B - 1)) / B.
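      
      As a standalone illustration, here is a minimal userspace sketch of the
      trick; the reciprocal_value()/reciprocal_divide() names follow the
      helpers this patch adds, but the program itself is illustrative:
      
      #include <stdint.h>
      #include <stdio.h>
      
      /* RECIPROCAL(B): precomputed once, e.g. at cache creation time */
      static uint32_t reciprocal_value(uint32_t b)
      {
              return (uint32_t)(((1ULL << 32) + b - 1) / b);
      }
      
      /* V = ((u64)A * RECIPROCAL(B)) >> 32 -- one multiply, no divide */
      static uint32_t reciprocal_divide(uint32_t a, uint32_t r)
      {
              return (uint32_t)(((uint64_t)a * r) >> 32);
      }
      
      int main(void)
      {
              uint32_t size = 192;                 /* e.g. a slab object size */
              uint32_t r = reciprocal_value(size); /* done once, up front */
              uint32_t offset = 7 * size;          /* offset of the 8th object */
      
              /* prints 7, matching offset / size */
              printf("%u\n", reciprocal_divide(offset, r));
              return 0;
      }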
      
      Note:
      
      I wrote pure C code for clarity.  The gcc output for i386 is not
      optimal, but acceptable:
      
      mull   0x14(%ebx)
      mov    %edx,%eax      // part of the >> 32
      xor    %edx,%edx      // useless
      mov    %eax,(%esp)    // could be avoided
      mov    %edx,0x4(%esp) // useless
      mov    (%esp),%ebx
      
      [akpm@osdl.org: small cleanups]
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    •
      [PATCH] cpuset: rework cpuset_zone_allowed api · 02a0e53d
      Committed by Paul Jackson
      Elaborate the API for calling cpuset_zone_allowed(), so that users have to
      explicitly choose between the two variants:
      
        cpuset_zone_allowed_hardwall()
        cpuset_zone_allowed_softwall()
      
      Until now, whether or not you got the hardwall flavor depended solely
      on whether or not you OR'ed the __GFP_HARDWALL gfp flag into the
      gfp_mask argument.
      
      If you didn't specify __GFP_HARDWALL, you implicitly got the softwall
      version.
      
      Unfortunately, this meant that users would end up with the softwall
      version without thinking about it.  Since only the softwall version
      might sleep, this led to bugs with possible sleeping in interrupt
      context on more than one occasion.
      
      The hardwall version requires that the current task's mems_allowed
      allows the node of the specified zone (or that you're in interrupt, or
      that __GFP_THISNODE is set, or that you're on a one-cpuset system).
      
      The softwall version, depending on the gfp_mask, might allow a node if
      it was allowed in the nearest enclosing cpuset marked mem_exclusive
      (which requires taking the cpuset lock 'callback_mutex' to evaluate).
      
      This patch removes the cpuset_zone_allowed() call, and forces the caller to
      explicitly choose between the hardwall and the softwall case.
      
      If the caller wants the gfp_mask to determine this choice, they should (1)
      be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the
      cpuset_zone_allowed_softwall() routine.
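      
      A hedged sketch of the resulting calling convention (this wrapper is
      illustrative, not code from the patch):
      
      /*
       * The hardwall check never sleeps and is safe in interrupt context;
       * the softwall check may take callback_mutex and therefore sleep.
       */
      static int example_zone_allowed(struct zone *zone, gfp_t gfp_mask)
      {
              if (in_interrupt() || !(gfp_mask & __GFP_WAIT))
                      return cpuset_zone_allowed_hardwall(zone, gfp_mask);
              return cpuset_zone_allowed_softwall(zone, gfp_mask);
      }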
      
      This adds another 100 or 200 bytes to the kernel text space, due to
      the few lines of nearly duplicate code at the top of both
      cpuset_zone_allowed_* routines.  It should save a few instructions
      for the calls that became cpuset_zone_allowed_hardwall() calls, thanks
      to not having to set (before the call) and then check (within the
      call) the __GFP_HARDWALL flag.
      
      For the most critical call, from get_page_from_freelist(), the same
      instructions are executed as before -- the old cpuset_zone_allowed()
      routine it used to call is the same code as the
      cpuset_zone_allowed_softwall() routine that it calls now.
      
      Not a perfect win, but it seems worth it to reduce the chance of
      hitting a sleeping-with-irqs-off complaint again.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    •
      [PATCH] More slab.h cleanups · 55935a34
      Committed by Christoph Lameter
      More cleanups for slab.h
      
      1. Remove tabs from weird locations, as suggested by Pekka.
      
      2. Drop the check for NUMA and SLAB_DEBUG from the fallback section,
         as suggested by Pekka.
      
      3. Use static inline for the fallback defs, as also suggested by Pekka
         (sketched below).
      
      4. Make kmem_ptr_valid take a const * argument.
      
      5. Separate the NUMA fallback definitions from the kmalloc_track
         fallback definitions.
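      
      A minimal sketch of items 3 and 4 (the exact names and signatures in
      slab.h may differ; this only illustrates the pattern):
      
      #ifndef CONFIG_NUMA
      /* NUMA fallback: ignore the node hint, use the regular allocator */
      static inline void *kmem_cache_alloc_node(struct kmem_cache *cachep,
                                                gfp_t flags, int node)
      {
              return kmem_cache_alloc(cachep, flags);
      }
      #endif
      
      /* The validated pointer is only read, so it can be const-qualified. */
      int kmem_ptr_validate(struct kmem_cache *cachep, const void *ptr);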
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    •
      [PATCH] slab: fix sleeping in atomic bug · dd47ea75
      Committed by Christoph Lameter
      fallback_alloc() does not do the check for __GFP_WAIT that is done in
      cache_grow().  Thus interrupts are still disabled when we call
      kmem_getpages(), which results in the sleeping-in-atomic failure.
      
      Duplicate the __GFP_WAIT handling of cache_grow().
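      
      A hedged sketch of the duplicated pattern (a fragment, simplified; the
      masking and helper names follow the slab code of this era):
      
      /* If the caller may sleep, re-enable interrupts around the page
       * allocation, exactly as cache_grow() does. */
      local_flags = (flags & GFP_LEVEL_MASK);
      if (local_flags & __GFP_WAIT)
              local_irq_enable();
      obj = kmem_getpages(cache, flags, -1);
      if (local_flags & __GFP_WAIT)
              local_irq_disable();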
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Jay Cliburn <jacliburn@bellsouth.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  11. 11 Dec 2006, 1 commit
  12. 09 Dec 2006, 3 commits
  13. 08 Dec 2006, 3 commits
    •
      [PATCH] struct seq_operations and struct file_operations constification · 15ad7cdc
      Committed by Helge Deller
       - move some file_operations structs into the .rodata section
      
       - move static strings from policy_types[] array into the .rodata section
      
       - fix generic seq_operations usages, so that those structs may be defined
         as "const" as well
      
      [akpm@osdl.org: couple of fixes]
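      
      A minimal sketch of the constification pattern (the example_* handler
      names are illustrative):
      
      /* 'const' lets these ops tables live in .rodata instead of .data */
      static const struct seq_operations example_seq_ops = {
              .start = example_start,
              .next  = example_next,
              .stop  = example_stop,
              .show  = example_show,
      };
      
      static const struct file_operations example_fops = {
              .open    = example_open,
              .read    = seq_read,
              .llseek  = seq_lseek,
              .release = seq_release,
      };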
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    •
      [PATCH] slab: use probe_kernel_address() · 138ae663
      Committed by Andrew Morton
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    •
      [PATCH] slab: better fallback allocation behavior · 3c517a61
      Committed by Christoph Lameter
      Currently we simply attempt to allocate from all allowed nodes using
      GFP_THISNODE.  However, GFP_THISNODE does not do reclaim (it won't do
      any at all if the recent GFP_THISNODE patch is accepted).  If we truly
      run out of memory in the whole system, then fallback_alloc() may
      return NULL although memory may still be available if we performed
      more thorough reclaim.
      
      This patch changes fallback_alloc() so that we first inspect only the
      per-node queues for available slabs.  If we find any, then we allocate
      from those.  This avoids slab fragmentation by first getting rid of
      all partially allocated slabs on every node before allocating new
      memory.
      
      If we cannot satisfy the allocation from any per-node queue, then we
      extend a slab.  We now call into the page allocator without specifying
      GFP_THISNODE.  The page allocator will then implement its own fallback
      (in the given cpuset context), perform necessary reclaim (again
      considering not a single node but the whole set of allowed nodes), and
      then return pages for a new slab.
      
      We identify from which node the pages were allocated and then insert
      the pages into the corresponding per-node structure.  In order to do
      so we need to modify cache_grow() to take a parameter that specifies
      the new slab.  kmem_getpages() can no longer set the GFP_THISNODE flag
      since we need to be able to use kmem_getpages() to allocate from an
      arbitrary node.  GFP_THISNODE needs to be specified when calling
      cache_grow().
      
      One key advantage is that the decision of which node to allocate new
      memory from is removed from slab fallback processing.  The patch
      allows us to go back to using the page allocator's fallback/reclaim
      logic.
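      
      In outline, the reworked fallback looks roughly like this (a hedged
      sketch; for_each_allowed_node() stands in for the real zonelist walk
      in mm/slab.c):
      
      /* Phase 1: only drain existing per-node queues/partial slabs. */
      for_each_allowed_node(nid) {
              obj = ____cache_alloc_node(cache, flags | GFP_THISNODE, nid);
              if (obj)
                      return obj;
      }
      
      /* Phase 2: let the page allocator pick a node (no GFP_THISNODE),
       * then grow the cache on the node the pages actually came from. */
      obj = kmem_getpages(cache, flags, -1);
      if (obj) {
              nid = page_to_nid(virt_to_page(obj));
              if (cache_grow(cache, flags | GFP_THISNODE, nid, obj))
                      obj = ____cache_alloc_node(cache, flags | GFP_THISNODE, nid);
      }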
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>