1. 11 Nov, 2011 (1 commit)
  2. 28 Sep, 2011 (1 commit)
    • mm: restrict access to slab files under procfs and sysfs · ab067e99
      Committed by Vasiliy Kulikov
      Historically, /proc/slabinfo and the files under /sys/kernel/slab/* have
      had world-readable permissions.  slabinfo contains rather private
      information about both the kernel and userspace tasks.  Depending on the
      situation, it can reveal private information directly or information
      useful for mounting a further targeted attack.  Some examples of what can
      be learned by reading or watching /proc/slabinfo entries:
      
      1) The dentry (and various *inode*) counts might reveal other processes'
      fs activity.  The number of dentry "active objects" doesn't strictly equal
      the number of files opened/touched by a process, but there is a good
      correlation between them.  The patch "proc: force dcache drop on
      unauthorized access" relies on the privacy of the dentry count.
      
      2) The various inode entries might reveal the same information as (1), but
      these are finer-grained counters.  If a filesystem is mounted at a private
      mount point (or even in a private namespace) and its fs type differs from
      the other mounted fs types, the fs activity in that mount point/namespace
      is revealed.  If there is a single ecryptfs mount point, the entire fs
      activity of a single user is revealed.  The number of files in an ecryptfs
      mount point is private information in itself.
      
      3) fuse_* reveals the number of files / fs activity of a user in a private
      mount point.  It is of approximately the same severity as the ecryptfs
      infoleak in (2).
      
      4) sysfs_dir_cache, similar to (2), reveals device addition/removal, which
      could otherwise be hidden by "chmod 0700 /sys/".  With a 0444 slabinfo,
      the precise number of sysfs files is known to the world.
      
      5) buffer_head might reveal some kernel activity.  Combined with other
      information leaks, an attacker might identify which specific kernel
      routines generate buffer_head activity.
      
      6) *kmalloc* infoleaks are very situational.  The attacker has to watch
      the specific kmalloc size entry and filter out the noise from unrelated
      kernel activity.  On a relatively quiet victim system, the counters can be
      read rather precisely.
      
      Additional information sources can significantly increase the value of the
      slabinfo infoleak.  E.g. if an attacker knows that process activity on the
      system is very low (only core daemons like syslog and cron), he may run
      setxid binaries, trigger local daemon activity, trigger network service
      activity, await sporadic cron job activity, etc., and obtain rather
      precise counters for the fs and network activity of these privileged
      tasks, which would otherwise be unknown.
      
      Hiding slabinfo and /sys/kernel/slab/* is also one step toward
      complicating the exploitation of kernel heap overflows (and possibly other
      bugs).  The related discussion:
      
      http://thread.gmane.org/gmane.linux.kernel/1108378
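      For illustration only, a minimal sketch of what restricting the procfs
      entry can look like (the call site and fops name are assumptions, not the
      actual diff):
      
      	/* Create /proc/slabinfo readable by root only instead of
      	 * world-readable (S_IRUGO). */
      	proc_create("slabinfo", S_IWUSR | S_IRUSR, NULL,
      		    &proc_slabinfo_operations);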
      
      To keep compatibility with the old permission model, where a non-root
      monitoring daemon could watch for kernel memory leaks through slabinfo,
      one should do:
      
          groupadd slabinfo
          usermod -a -G slabinfo $MONITOR_USER
      
      And add the following commands to init scripts (to mountall.conf in
      Ubuntu's upstart case):
      
          chmod g+r /proc/slabinfo /sys/kernel/slab/*/*
          chgrp slabinfo /proc/slabinfo /sys/kernel/slab/*/*
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Reviewed-by: Kees Cook <kees@ubuntu.com>
      Reviewed-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Acked-by: Christoph Lameter <cl@gentwo.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Valdis.Kletnieks@vt.edu
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alan Cox <alan@linux.intel.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      ab067e99
  3. 04 Aug, 2011 (2 commits)
  4. 01 Aug, 2011 (1 commit)
    • slab: use print_hex_dump · fdde6abb
      Committed by Sebastian Andrzej Siewior
      Less code, with the added advantage of an ASCII dump.
      
      before:
      | Slab corruption: names_cache start=c5788000, len=4096
      | 000: 6b 6b 01 00 00 00 56 00 00 00 24 00 00 00 2a 00
      | 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      | 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff
      | 030: ff ff ff ff e2 b4 17 18 c7 e4 08 06 00 01 08 00
      | 040: 06 04 00 01 e2 b4 17 18 c7 e4 0a 00 00 01 00 00
      | 050: 00 00 00 00 0a 00 00 02 6b 6b 6b 6b 6b 6b 6b 6b
      
      after:
      | Slab corruption: size-4096 start=c38a9000, len=4096
      | 000: 6b 6b 01 00 00 00 56 00 00 00 24 00 00 00 2a 00  kk....V...$...*.
      | 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      | 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff  ................
      | 030: ff ff ff ff d2 56 5f aa db 9c 08 06 00 01 08 00  .....V_.........
      | 040: 06 04 00 01 d2 56 5f aa db 9c 0a 00 00 01 00 00  .....V_.........
      | 050: 00 00 00 00 0a 00 00 02 6b 6b 6b 6b 6b 6b 6b 6b  ........kkkkkkkk
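      For reference, a hedged sketch of the kind of call the patch switches to
      (the exact arguments and variable names are assumptions, not the actual
      diff):
      
      	/* 16 bytes per line, offset prefixes, with an ASCII column. */
      	print_hex_dump(KERN_ERR, "", DUMP_PREFIX_OFFSET, 16, 1,
      		       realobj, size, true);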
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      fdde6abb
  5. 31 Jul, 2011 (1 commit)
  6. 28 Jul, 2011 (1 commit)
  7. 22 Jul, 2011 (1 commit)
  8. 21 Jul, 2011 (1 commit)
  9. 18 Jul, 2011 (1 commit)
    • slab: fix DEBUG_SLAB build · c225150b
      Committed by Hugh Dickins
      Fix CONFIG_SLAB=y CONFIG_DEBUG_SLAB=y build error and warnings.
      
      Now that ARCH_SLAB_MINALIGN defaults to __alignof__(unsigned long long),
      it is always defined (whenever slab.h is included), but cannot be used in #if:
      mm/slab.c: In function `cache_alloc_debugcheck_after':
      mm/slab.c:3156:5: warning: "__alignof__" is not defined
      mm/slab.c:3156:5: error: missing binary operator before token "("
      make[1]: *** [mm/slab.o] Error 1
      
      So just remove the #if and #endif lines, but then 64-bit build warns:
      mm/slab.c: In function `cache_alloc_debugcheck_after':
      mm/slab.c:3156:6: warning: cast from pointer to integer of different size
      mm/slab.c:3158:10: warning: format `%d' expects type `int', but argument
                                  3 has type `long unsigned int'
      Fix those with casts, whatever the actual type of ARCH_SLAB_MINALIGN.
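      A hedged sketch of the shape of the fix (not the actual diff): drop the
      preprocessor test and add casts so the check compiles cleanly on 64-bit:
      
      	/* objp and ARCH_SLAB_MINALIGN may be wider than int on 64-bit,
      	 * so cast before testing and before printing. */
      	if ((unsigned long)objp & (ARCH_SLAB_MINALIGN - 1))
      		printk(KERN_ERR "0x%p: not aligned to ARCH_SLAB_MINALIGN=%lu\n",
      		       objp, (unsigned long)ARCH_SLAB_MINALIGN);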
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      c225150b
  10. 04 Jun, 2011 (1 commit)
  11. 21 May, 2011 (1 commit)
    • sanitize <linux/prefetch.h> usage · 268bb0ce
      Committed by Linus Torvalds
      Commit e66eed65 ("list: remove prefetching from regular list
      iterators") removed the include of prefetch.h from list.h, which
      uncovered several cases that had apparently relied on that rather
      obscure header file dependency.
      
      So this fixes things up a bit, using
      
         grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
         grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
      
      to guide us in finding files that either need <linux/prefetch.h>
      inclusion, or have it despite not needing it.
      
      There are more of them around (mostly network drivers), but this gets
      many core ones.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      268bb0ce
  12. 31 Mar, 2011 (1 commit)
  13. 23 Mar, 2011 (1 commit)
  14. 12 Mar, 2011 (1 commit)
  15. 14 Feb, 2011 (1 commit)
    • Revert "slab: Fix missing DEBUG_SLAB last user" · 3ff84a7f
      Committed by Pekka Enberg
      This reverts commit 5c5e3b33.
      
      The commit breaks ARM thusly:
      
      | Mount-cache hash table entries: 512
      | slab error in verify_redzone_free(): cache `idr_layer_cache': memory outside object was overwritten
      | Backtrace:
      | [<c0227088>] (dump_backtrace+0x0/0x110) from [<c0431afc>] (dump_stack+0x18/0x1c)
      | [<c0431ae4>] (dump_stack+0x0/0x1c) from [<c0293304>] (__slab_error+0x28/0x30)
      | [<c02932dc>] (__slab_error+0x0/0x30) from [<c0293a74>] (cache_free_debugcheck+0x1c0/0x2b8)
      | [<c02938b4>] (cache_free_debugcheck+0x0/0x2b8) from [<c0293f78>] (kmem_cache_free+0x3c/0xc0)
      | [<c0293f3c>] (kmem_cache_free+0x0/0xc0) from [<c032b1c8>] (ida_get_new_above+0x19c/0x1c0)
      | [<c032b02c>] (ida_get_new_above+0x0/0x1c0) from [<c02af7ec>] (alloc_vfsmnt+0x54/0x144)
      | [<c02af798>] (alloc_vfsmnt+0x0/0x144) from [<c0299830>] (vfs_kern_mount+0x30/0xec)
      | [<c0299800>] (vfs_kern_mount+0x0/0xec) from [<c0299908>] (kern_mount_data+0x1c/0x20)
      | [<c02998ec>] (kern_mount_data+0x0/0x20) from [<c02146c4>] (sysfs_init+0x68/0xc8)
      | [<c021465c>] (sysfs_init+0x0/0xc8) from [<c02137d4>] (mnt_init+0x90/0x1b0)
      | [<c0213744>] (mnt_init+0x0/0x1b0) from [<c0213388>] (vfs_caches_init+0x100/0x140)
      | [<c0213288>] (vfs_caches_init+0x0/0x140) from [<c0208c0c>] (start_kernel+0x2e8/0x368)
      | [<c0208924>] (start_kernel+0x0/0x368) from [<c0208034>] (__enable_mmu+0x0/0x2c)
      | c0113268: redzone 1:0xd84156c5c032b3ac, redzone 2:0xd84156c5635688c0.
      | slab error in cache_alloc_debugcheck_after(): cache `idr_layer_cache': double free, or memory outside object was overwritten
      | ...
      | c011307c: redzone 1:0x9f91102ffffffff, redzone 2:0x9f911029d74e35b
      | slab: Internal list corruption detected in cache 'idr_layer_cache'(24), slabp c0113000(16). Hexdump:
      |
      | 000: 20 4f 10 c0 20 4f 10 c0 7c 00 00 00 7c 30 11 c0
      | 010: 10 00 00 00 10 00 00 00 00 00 c9 17 fe ff ff ff
      | 020: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
      | 030: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
      | 040: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
      | 050: fe ff ff ff fe ff ff ff fe ff ff ff 11 00 00 00
      | 060: 12 00 00 00 13 00 00 00 14 00 00 00 15 00 00 00
      | 070: 16 00 00 00 17 00 00 00 c0 88 56 63
      | kernel BUG at /home/rmk/git/linux-2.6-rmk/mm/slab.c:2928!
      
      Reference: https://lkml.org/lkml/2011/2/7/238
      Cc: <stable@kernel.org> # 2.6.35.y and later
      Reported-and-analyzed-by: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      3ff84a7f
  16. 24 Jan, 2011 (1 commit)
  17. 15 Jan, 2011 (1 commit)
  18. 07 Jan, 2011 (1 commit)
  19. 17 Dec, 2010 (1 commit)
  20. 15 Dec, 2010 (1 commit)
    • workqueue: convert cancel_rearming_delayed_work[queue]() users to cancel_delayed_work_sync() · afe2c511
      Committed by Tejun Heo
      cancel_rearming_delayed_work[queue]() has been superseded by
      cancel_delayed_work_sync() quite some time ago.  Convert all the
      in-kernel users.  The conversions are completely equivalent and
      trivial.
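      An illustrative before/after (the work item name here is made up):
      
      	/* before */
      	cancel_rearming_delayed_work(&dev->poll_work);
      	/* after: equivalent, using the current interface */
      	cancel_delayed_work_sync(&dev->poll_work);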
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: netdev@vger.kernel.org
      Cc: Anton Vorontsov <cbou@mail.ru>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: xfs-masters@oss.sgi.com
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: netfilter-devel@vger.kernel.org
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: linux-nfs@vger.kernel.org
      afe2c511
  21. 29 Nov, 2010 (1 commit)
    • tracing/slab: Move kmalloc tracepoint out of inline code · 85beb586
      Committed by Steven Rostedt
      The tracepoint for kmalloc is in the inlined slab code, which causes
      every instance of kmalloc to carry the tracepoint.
      
      This patch moves the tracepoint out of the inline code into the
      slab C file, which removes a large number of inlined tracepoints.
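      A hedged sketch of the approach (the function name and arguments are
      assumptions, not necessarily the actual patch): the inline kmalloc() in
      slab.h calls one out-of-line helper in mm/slab.c, and only that helper
      fires the tracepoint:
      
      	void *kmem_cache_alloc_trace(size_t size, struct kmem_cache *cachep,
      				     gfp_t flags)
      	{
      		void *ret = kmem_cache_alloc(cachep, flags);
      
      		/* One tracepoint call site instead of one per inlined kmalloc(). */
      		trace_kmalloc(_RET_IP_, ret, size, cachep->buffer_size, flags);
      		return ret;
      	}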
      
        objdump -dr vmlinux.slab| grep 'jmpq.*<trace_kmalloc' |wc -l
      213
        objdump -dr vmlinux.slab.patched| grep 'jmpq.*<trace_kmalloc' |wc -l
      1
      
      This also has a nice impact on size.
      
         text	   data	    bss	    dec	    hex	filename
      7023060	2121564	2482432	11627056	 b16a30	vmlinux.slab
      6970579	2109772	2482432	11562783	 b06f1f	vmlinux.slab.patched
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      85beb586
  22. 27 Oct, 2010 (1 commit)
  23. 10 Aug, 2010 (1 commit)
  24. 09 Aug, 2010 (1 commit)
    • slab: fix object alignment · 1ab335d8
      Committed by Carsten Otte
      This patch fixes alignment of slab objects in case CONFIG_DEBUG_PAGEALLOC is
      active.
      Before this spot in kmem_cache_create, we have this situation:
      - align contains the required alignment of the object
      - cachep->obj_offset is 0 or equals align in case of CONFIG_DEBUG_SLAB
      - size equals the size of the object, or object plus trailing redzone in case
        of CONFIG_DEBUG_SLAB
      
      This spot tries to fill one page per object if the object is within
      certain size limits; however, setting obj_offset to PAGE_SIZE - size
      breaks the object alignment, since size may not be a multiple of the
      required alignment.  This patch simply adds an ALIGN(size, align) to the
      equation and fixes the object size detection accordingly.
      
      This code in drivers/s390/cio/qdio_setup_init has led to incorrectly aligned
      slab objects (sizeof(struct qdio_q) equals 1792):
      	qdio_q_cache = kmem_cache_create("qdio_q", sizeof(struct qdio_q),
      					 256, 0, NULL);
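      A hedged sketch of the described adjustment (not the actual diff):
      
      	/* Push the object toward the end of its page, but round its size up
      	 * to the required alignment first so the object stays aligned. */
      	cachep->obj_offset += PAGE_SIZE - ALIGN(size, align);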
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      1ab335d8
  25. 20 Jul, 2010 (1 commit)
    • slab: use deferable timers for its periodic housekeeping · 78b43536
      Committed by Arjan van de Ven
      slab has a "once every 2 seconds" timer for its housekeeping.
      As the number of logical processors grows, it is more and more
      common for this 2-second timer to become the primary wakeup source.
      
      This patch turns the housekeeping timer into a deferrable timer,
      which means that the timer does not interrupt idle, but simply runs
      at the next event that wakes the cpu up.
      
      The impact is that the timer likely runs a bit later, but since no
      code runs during the delay, there is little reason for the
      housekeeping behaviour to differ because of it.
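      A hedged sketch of the change (an assumption, not the actual diff): the
      periodic cache_reap work is set up as deferrable so it no longer wakes an
      idle cpu on its own:
      
      	/* was: INIT_DELAYED_WORK(reap_work, cache_reap); */
      	INIT_DELAYED_WORK_DEFERRABLE(reap_work, cache_reap);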
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      78b43536
  26. 09 Jun, 2010 (1 commit)
  27. 28 May, 2010 (3 commits)
    • numa: slab: use numa_mem_id() for slab local memory node · 7d6e6d09
      Committed by Lee Schermerhorn
      Example usage of generic "numa_mem_id()":
      
      The mainline slab code, since ~2.6.19, does not handle memoryless nodes
      well.  Specifically, the "fast path"--____cache_alloc()--will never
      succeed, as slab doesn't cache off-node objects on the per-cpu queues,
      and for memoryless nodes all memory is "off node" relative to
      numa_node_id().  This adds significant overhead to all kmem cache
      allocations, incurring a significant regression relative to earlier
      kernels [from before slab.c was reorganized].
      
      This patch uses the generic topology function "numa_mem_id()" to return
      the "effective local memory node" for the calling context.  This is the
      first node in the local node's generic fallback zonelist-- the same node
      that "local" mempolicy-based allocations would use.  This lets slab cache
      these "local" allocations and avoid fallback/refill on every allocation.
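      A hedged one-line illustration of the substitution (the variable name is
      made up, not from the actual diff):
      
      	int node = numa_mem_id();  /* effective local memory node; was numa_node_id() */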
      
      N.B.: Slab will need to handle node and memory hotplug events that could
      change the value returned by numa_mem_id() for any given node if recent
      changes to address memory hotplug don't already address this.  E.g., flush
      all per cpu slab queues before rebuilding the zonelists while the
      "machine" is held in the stopped state.
      
      Performance impact on "hackbench 400 process 200"
      
      2.6.34-rc3-mmotm-100405-1609		no-patch	this-patch
      ia64 no memoryless nodes [avg of 10]:     11.713       11.637  ~0.65 diff
      ia64 cpus all on memless nodes  [10]:    228.259       26.484  ~8.6x speedup
      
      The slowdown of the patched kernel from ~12 sec to ~28 seconds when
      configured with memoryless nodes is the result of all cpus allocating from
      a single node's mm pagepool.  The cache lines of the single node are
      distributed/interleaved over the memory of the real physical nodes, but
      the zone lock, list heads, ...  of the single node with memory still each
      live in a single cache line that is accessed from all processors.
      
      x86_64 [8x6 AMD] [avg of 40]:		2.883	   2.845
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7d6e6d09
    • slab: convert cpu notifier to return encapsulate errno value · eac40680
      Committed by Akinobu Mita
      With the previous modification, the cpu notifier can return an encapsulated
      errno value.  This converts the cpu notifiers for slab.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eac40680
    • cpusets: new round-robin rotor for SLAB allocations · 6adef3eb
      Committed by Jack Steiner
      We have observed several workloads running on multi-node systems where
      memory is assigned unevenly across the nodes in the system.  There are
      numerous reasons for this but one is the round-robin rotor in
      cpuset_mem_spread_node().
      
      For example, a simple test that writes a multi-page file will allocate
      pages on nodes 0 2 4 6 ...  Odd nodes are skipped.  (Sometimes it
      allocates on odd nodes & skips even nodes).
      
      An example is shown below.  The program "lfile" writes a file consisting
      of 10 pages.  The program then mmaps the file & uses get_mempolicy(...,
      MPOL_F_NODE) to determine the nodes where the file pages were allocated.
      The output is shown below:
      
      	# ./lfile
      	 allocated on nodes: 2 4 6 0 1 2 6 0 2
      
      There is a single rotor that is used for allocating both file pages & slab
      pages.  Writing the file allocates both a data page & a slab page
      (buffer_head).  This advances the RR rotor 2 nodes for each page
      allocated.
      
      A quick test seems to confirm that this is the cause of the uneven
      allocation:
      
      	# echo 0 >/dev/cpuset/memory_spread_slab
      	# ./lfile
      	 allocated on nodes: 6 7 8 9 0 1 2 3 4 5
      
      This patch introduces a second rotor that is used for slab allocations.
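      A hedged sketch of the idea (helper and field names are assumptions, not
      necessarily the actual patch): page cache spreading and slab spreading
      each advance their own per-task rotor:
      
      	static int cpuset_spread_node(int *rotor)
      	{
      		int node = next_node(*rotor, current->mems_allowed);
      
      		if (node == MAX_NUMNODES)
      			node = first_node(current->mems_allowed);
      		*rotor = node;
      		return node;
      	}
      
      	int cpuset_mem_spread_node(void)   /* page cache pages */
      	{
      		return cpuset_spread_node(&current->cpuset_mem_spread_rotor);
      	}
      
      	int cpuset_slab_spread_node(void)  /* slab pages: the new, separate rotor */
      	{
      		return cpuset_spread_node(&current->cpuset_slab_spread_rotor);
      	}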
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Paul Menage <menage@google.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6adef3eb
  28. 25 May, 2010 (1 commit)
    • cpuset,mm: fix no node to alloc memory when changing cpuset's mems · c0ff7453
      Committed by Miao Xie
      Before applying this patch, cpuset updates task->mems_allowed and
      mempolicy by setting all the new bits in the nodemask first, and
      clearing all the old disallowed bits later.  But in the meantime, the
      allocator may find that there is no node from which to allocate memory.
      
      The reason is that, while cpuset rebinds the task's mempolicy, it clears
      the nodes the allocator can allocate pages from; for example:
      
      (mpol: mempolicy)
      	task1			task1's mpol	task2
      	alloc page		1
      	  alloc on node0? NO	1
      				1		change mems from 1 to 0
      				1		rebind task1's mpol
      				0-1		  set new bits
      				0	  	  clear disallowed bits
      	  alloc on node1? NO	0
      	  ...
      	can't alloc page
      	  goto oom
      
      This patch fixes the problem by expanding the node range first (setting
      the newly allowed bits) and shrinking it lazily (clearing the newly
      disallowed bits).  We use a variable to tell the write-side task that a
      read-side task is reading the nodemask, and the write-side task clears
      the newly disallowed nodes only after the read-side task finishes its
      current memory allocation.
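      A hedged sketch of the ordering only (synchronization with readers is
      omitted; variable names are illustrative, not the actual diff):
      
      	/* Step 1: grow the mask so a concurrent allocation always sees
      	 * at least one allowed node. */
      	nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
      	/* Step 2: after readers finish their current allocation, shrink
      	 * to the final mask, dropping the newly disallowed nodes. */
      	tsk->mems_allowed = *newmems;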
      
      [akpm@linux-foundation.org: fix spello]
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Paul Menage <menage@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Ravikiran Thirumalai <kiran@scalex86.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c0ff7453
  29. 20 May, 2010 (1 commit)
  30. 15 Apr, 2010 (1 commit)
    • slab: Fix missing DEBUG_SLAB last user · 5c5e3b33
      Committed by Shiyong Li
      Even with SLAB_RED_ZONE and SLAB_STORE_USER enabled, the kernel would NOT
      store redzone and last-user data around the allocated memory space if the
      arch cache line size is greater than sizeof(unsigned long long).  As a
      result, last-user information is unexpectedly missing when dumping the
      slab corruption log.
      
      This fix makes sure that the redzone and last-user tags get stored unless
      the required alignment breaks the redzone.
      Signed-off-by: Shiyong Li <shi-yong.li@motorola.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      5c5e3b33
  31. 10 Apr, 2010 (1 commit)
    • slab: Generify kernel pointer validation · fc1c1833
      Committed by Pekka Enberg
      As suggested by Linus, introduce a kern_ptr_validate() helper that does some
      sanity checks to make sure a pointer is a valid kernel pointer.  This is a
      preparatory step for fixing SLUB's kmem_ptr_validate().
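      A rough, simplified sketch of the kind of checks such a helper performs
      (an approximation for illustration, not the exact implementation):
      
      	int kern_ptr_validate(const void *ptr, unsigned long size)
      	{
      		unsigned long addr = (unsigned long)ptr;
      
      		if (addr < PAGE_OFFSET)
      			return 0;   /* not a kernel address */
      		if (addr > (unsigned long)high_memory - size)
      			return 0;   /* object would run past lowmem */
      		if (!kern_addr_valid(addr))
      			return 0;
      		return 1;           /* plausible kernel pointer */
      	}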
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fc1c1833
  32. 08 Apr, 2010 (1 commit)
    • slab: add memory hotplug support · 8f9f8d9e
      Committed by David Rientjes
      Slab lacks any memory hotplug support for nodes that are hotplugged
      without cpus being hotplugged.  This is possible at least on x86
      CONFIG_MEMORY_HOTPLUG_SPARSE kernels where SRAT entries are marked
      ACPI_SRAT_MEM_HOT_PLUGGABLE and the regions of RAM represent a separate
      node.  It can also be done manually by writing the start address to
      /sys/devices/system/memory/probe for kernels that have
      CONFIG_ARCH_MEMORY_PROBE set, which is how this patch was tested, and
      then onlining the new memory region.
      
      When a node is hotadded, a nodelist for that node is allocated and
      initialized for each slab cache.  If this isn't completed due to a lack
      of memory, the hotadd is aborted: we have a reasonable expectation that
      kmalloc_node(nid) will work for all caches if nid is online and memory is
      available.
      
      Since nodelists must be allocated and initialized prior to the new node's
      memory actually being online, the struct kmem_list3 is allocated off-node
      due to kmalloc_node()'s fallback.
      
      When an entire node would be offlined, its nodelists are subsequently
      drained.  If slab objects still exist and cannot be freed, the offline is
      aborted.  Objects may be allocated between this drain and page isolation,
      however, so the offline can still fail.
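      A hedged sketch of the flow (helper names are assumptions, not necessarily
      the actual patch): a memory hotplug notifier allocates the nodelists on
      hot-add and drains them on hot-remove:
      
      	static int slab_memory_callback(struct notifier_block *self,
      					unsigned long action, void *arg)
      	{
      		struct memory_notify *mnb = arg;
      		int nid = mnb->status_change_nid;
      		int ret = 0;
      
      		switch (action) {
      		case MEM_GOING_ONLINE:
      			ret = init_cache_nodelists_node(nid);   /* allocate nodelists */
      			break;
      		case MEM_GOING_OFFLINE:
      			ret = drain_cache_nodelists_node(nid);  /* fails if objects remain */
      			break;
      		}
      		return notifier_from_errno(ret);
      	}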
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      8f9f8d9e
  33. 29 Mar, 2010 (1 commit)
  34. 27 Feb, 2010 (1 commit)
  35. 30 Jan, 2010 (1 commit)
    • slab: fix regression in touched logic · 44b57f1c
      Committed by Nick Piggin
      When factoring common code into transfer_objects in commit 3ded175a ("slab: add
      transfer_objects() function"), the 'touched' logic got a bit broken. When
      refilling from the shared array (taking objects from the shared array), we are
      making use of the shared array so it should be marked as touched.
      
      Subsequently pulling an element from the cpu array and allocating it should
      also touch the cpu array, but that is taken care of after the alloc_done label.
      (So yes, the cpu array was getting touched = 1 twice).
      
      So revert this logic to how it worked in earlier kernels.
      
      This also affects the behaviour in __drain_alien_cache, which would previously
      'touch' the shared array and now does not. I think it is more logical not to
      touch there, because we are pushing objects into the shared array rather than
      pulling them off. So there is no good reason to postpone reaping them -- if the
      shared array is getting utilized, then it will get 'touched' in the alloc path
      (where this patch now restores the touch).
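      A hedged sketch of the restored behaviour in the refill path (simplified,
      with illustrative variable names, not the actual diff):
      
      	/* In cache_alloc_refill(): refilling from the shared array counts
      	 * as using it, so mark it recently touched again. */
      	if (shared_array && transfer_objects(ac, shared_array, batchcount)) {
      		shared_array->touched = 1;
      		goto alloc_done;
      	}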
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      44b57f1c
  36. 12 Jan, 2010 (1 commit)
  37. 29 Dec, 2009 (1 commit)