1. 14 Sep 2009 (1 commit)
  2. 04 Sep 2009 (2 commits)
  3. 30 Aug 2009 (1 commit)
  4. 20 Aug 2009 (1 commit)
  5. 19 Aug 2009 (1 commit)
  6. 01 Aug 2009 (1 commit)
  7. 28 Jul 2009 (1 commit)
    • slub: use size and objsize orders to disable debug flags · 3de47213
      Committed by David Rientjes
      When `slub_debug=O' is used, debugging flags that increase a cache's
      min order due to metadata are masked off.  This patch moves that
      masking from kmem_cache_flags() to kmem_cache_open().
      
      Instead of defining the maximum metadata size increase in a preprocessor
      macro, this approach uses the cache's ->size and ->objsize members to
      determine if the min order increased due to debugging options.  If so,
      the flags specified in the more appropriately named DEBUG_METADATA_FLAGS
      are masked off.
      
      This approach was suggested by Christoph Lameter
      <cl@linux-foundation.org>.
      
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
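      As a rough illustration of the check, here is a minimal userspace
      model (a sketch under assumptions, not the kernel code): get_order()
      is re-implemented locally, and the 128-byte metadata overhead is a
      hypothetical figure standing in for red zones and user tracking.

      	#include <stdio.h>

      	#define PAGE_SIZE 4096UL

      	/* Model of the kernel's get_order(): smallest order such that
      	 * (PAGE_SIZE << order) holds 'size' bytes. */
      	static int get_order(unsigned long size)
      	{
      		int order = 0;

      		while ((PAGE_SIZE << order) < size)
      			order++;
      		return order;
      	}

      	int main(void)
      	{
      		unsigned long objsize = 4096;      /* bare object (kmalloc-4096) */
      		unsigned long size = 4096 + 128;   /* object + hypothetical debug metadata */

      		/* The check the patch performs in kmem_cache_open(): if the
      		 * metadata raised the min order, mask DEBUG_METADATA_FLAGS. */
      		if (get_order(size) > get_order(objsize))
      			printf("min order raised: mask off DEBUG_METADATA_FLAGS\n");
      		return 0;
      	}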
  8. 10 Jul 2009 (1 commit)
    • slub: add option to disable higher order debugging slabs · fa5ec8a1
      Committed by David Rientjes
      When debugging is enabled, slub requires that additional metadata be
      stored in slabs for certain options: SLAB_RED_ZONE, SLAB_POISON, and
      SLAB_STORE_USER.
      
      Consequently, the minimum slab order needed to allocate a single
      object may be greater when these options are enabled.  The most
      notable example is objects that are PAGE_SIZE bytes in size.
      
      Higher minimum slab orders may cause page allocation failures when the
      system is oom or under heavy fragmentation.
      
      This patch adds a new slub_debug option, which disables debugging by
      default for caches that would have resulted in higher minimum orders:
      
      	slub_debug=O
      
      When this option is used on systems with 4K pages, kmalloc-4096, for
      example, will not have debugging enabled by default even if
      CONFIG_SLUB_DEBUG_ON is defined, because debugging would have resulted
      in an order-1 minimum slab order.
      Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
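      A minimal userspace model of the option parsing (an assumed sketch;
      the real parser is slub's setup_slub_debug() and handles many more
      option letters than shown here):

      	#include <ctype.h>
      	#include <stdio.h>

      	static int disable_higher_order_debug;

      	/* Model of slub_debug= parsing: only the 'O' letter added by
      	 * this patch is handled; everything else is ignored here. */
      	static void setup_slub_debug(const char *str)
      	{
      		for (; *str && *str != ','; str++) {
      			if (tolower((unsigned char)*str) == 'o')
      				disable_higher_order_debug = 1;
      		}
      	}

      	int main(void)
      	{
      		setup_slub_debug("O");
      		printf("disable_higher_order_debug = %d\n",
      		       disable_higher_order_debug);
      		return 0;
      	}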
  9. 08 Jul 2009 (1 commit)
  10. 26 Jun 2009 (1 commit)
  11. 25 Jun 2009 (1 commit)
  12. 19 Jun 2009 (1 commit)
  13. 17 Jun 2009 (1 commit)
  14. 15 Jun 2009 (3 commits)
  15. 14 Jun 2009 (2 commits)
  16. 12 Jun 2009 (3 commits)
    • slab,slub: don't enable interrupts during early boot · 7e85ee0c
      Committed by Pekka Enberg
      As explained by Benjamin Herrenschmidt:
      
        Oh and btw, your patch alone doesn't fix powerpc, because it's missing
        a whole bunch of GFP_KERNEL's in the arch code... You would have to
        grep the entire kernel for things that check slab_is_available() and
        even then you'll be missing some.
      
        For example, slab_is_available() didn't always exist, and so in the
        early days on powerpc, we used a mem_init_done global that is set from
        mem_init() (not perfect but works in practice). And we still have code
        using that to do the test.
      
      Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators
      in early boot code to avoid enabling interrupts.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
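      The fix boils down to stripping a few bits from the gfp mask while the
      kernel is still booting.  Below is a self-contained userspace model (a
      sketch under assumptions: the flag values mirror the kernel's of that
      era, but the helper name slab_fix_gfp() is invented for illustration):

      	#include <stdio.h>

      	typedef unsigned int gfp_t;

      	#define __GFP_WAIT	0x10u	/* allocation may sleep */
      	#define __GFP_IO	0x40u	/* may start physical I/O */
      	#define __GFP_FS	0x80u	/* may call into the filesystem */
      	#define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)

      	static int slab_early_boot = 1;	/* cleared once boot can sleep safely */

      	/* Strip the flags that could sleep (and thereby re-enable
      	 * interrupts) or recurse into I/O and filesystem code. */
      	static gfp_t slab_fix_gfp(gfp_t flags)
      	{
      		if (slab_early_boot)
      			flags &= ~(__GFP_WAIT | __GFP_IO | __GFP_FS);
      		return flags;
      	}

      	int main(void)
      	{
      		printf("GFP_KERNEL during early boot -> %#x\n",
      		       slab_fix_gfp(GFP_KERNEL));
      		return 0;
      	}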
    • slab: setup allocators earlier in the boot sequence · 83b519e8
      Committed by Pekka Enberg
      This patch makes kmalloc() available earlier in the boot sequence so we
      can get rid of some bootmem allocations. The bulk of the changes are due
      to kmem_cache_init() being called with interrupts disabled, which
      requires some changes to the allocator bootstrap code.

      Note: 32-bit x86 performs its WP-protect test in mem_init(), so we must
      set up traps before we call mem_init() during boot, as reported by Ingo
      Molnar:
      
        We have a hard crash in the WP-protect code:
      
        [    0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000
        [    0.000000]      EDI 00000188  ESI 00000ac7  EBP c17eaf9c  ESP c17eaf8c
        [    0.000000]      EBX 000014e0  EDX 0000000e  ECX 01856067  EAX 00000001
        [    0.000000]      err 00000003  EIP c10135b1   CS 00000060  flg 00010002
        [    0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc
        [    0.000000]        00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0
        [    0.000000]        c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003
        [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203
        [    0.000000] Call Trace:
        [    0.000000]  [<c15357c2>] ? printk+0x14/0x16
        [    0.000000]  [<c10135b1>] ? do_test_wp_bit+0x19/0x23
        [    0.000000]  [<c17fd410>] ? test_wp_bit+0x26/0x64
        [    0.000000]  [<c17fd7e5>] ? mem_init+0x1ba/0x1d8
        [    0.000000]  [<c17f1668>] ? start_kernel+0x164/0x2f7
        [    0.000000]  [<c17f1322>] ? unknown_bootoption+0x0/0x19c
        [    0.000000]  [<c17f106a>] ? __init_begin+0x6a/0x6f
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
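      The resulting ordering in init/main.c looks roughly like the sketch
      below (an approximation with most boot steps elided, not the verbatim
      patch): traps must be installed before mem_init() runs its WP test,
      and the slab bootstrap now runs with interrupts still disabled.

      	asmlinkage void __init start_kernel(void)
      	{
      		/* ...early setup, interrupts disabled... */
      		trap_init();	/* 32-bit x86: the WP test below can fault */
      		mm_init();	/* page allocator, then kmem_cache_init() */
      		sched_init();
      		/* ...kmalloc() now works, long before interrupts are enabled... */
      	}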
    • kmemleak: Add the slub memory allocation/freeing hooks · 06f22f13
      Committed by Catalin Marinas
      This patch adds calls to the kmemleak_(alloc|free) hooks from the slub
      allocator.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
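      The hook placement is roughly as sketched below.  kmemleak_alloc() and
      kmemleak_free() are real kmemleak entry points; the surrounding
      function bodies are condensed, and take_object() is a hypothetical
      stand-in for slub's freelist handling.

      	#include <linux/kmemleak.h>

      	static void *slab_alloc(struct kmem_cache *s, gfp_t gfpflags)
      	{
      		void *object = take_object(s);	/* hypothetical freelist pop */

      		/* Register the object so kmemleak can scan and track it. */
      		kmemleak_alloc(object, s->objsize, 1, gfpflags);
      		return object;
      	}

      	static void slab_free(struct kmem_cache *s, void *x)
      	{
      		/* Forget the object before it returns to the freelist. */
      		kmemleak_free(x);
      		/* ...put x back on the freelist... */
      	}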
  17. 11 Jun 2009 (1 commit)
  18. 06 May 2009 (1 commit)
  19. 23 Apr 2009 (1 commit)
  20. 12 Apr 2009 (1 commit)
  21. 03 Apr 2009 (2 commits)
  22. 23 Mar 2009 (1 commit)
  23. 25 Feb 2009 (1 commit)
  24. 23 Feb 2009 (2 commits)
    • slub: add min_partial sysfs tunable · 73d342b1
      Committed by David Rientjes
      Now that a cache's min_partial has been moved to struct kmem_cache, it's
      possible to easily tune it from userspace by adding a sysfs attribute.
      
      It may not be desirable to keep a large number of partial slabs around
      if a cache is used infrequently and memory is scarce, especially when
      constrained by a cgroup.  It's better to allow userspace to set the
      minimum policy per cache than to rely solely on kmem_cache_shrink().
      
      The memory savings from simply moving min_partial from struct
      kmem_cache_node to struct kmem_cache is admittedly not significant
      (unless maybe you're from SGI or something); at the largest it's

      	# allocated caches * (MAX_NUMNODES - 1) * sizeof(unsigned long)

      The true savings occur when userspace reduces the number of partial
      slabs that would otherwise be wasted, especially on machines with a
      large number of nodes (ia64 with CONFIG_NODES_SHIFT at 10 by default?).
      While the kernel estimates ideal values for n->min_partial and keeps
      them within a sane range, userspace previously had no input other than
      writing to /sys/kernel/slab/cache/shrink.
      
      There simply isn't a better heuristic for calculating the partial
      values that works well for all possible caches.  And since min_partial
      is currently a static value, the user has no way of reclaiming that
      wasted space, which can be significant when constrained by a cgroup
      (either cpusets or, later, memory controller slab limits), short of
      shrinking the cache entirely.
      
      This also allows the user to specify that increased fragmentation and
      more partial slabs are actually desired to avoid the cost of allocating
      new slabs at runtime for specific caches.
      
      There's also no reason why this should be a per-kmem_cache_node value
      in the first place.  You could argue that a machine might have node
      size asymmetries that warrant per-node tuning, but we know nobody is
      doing that right now: the value is purely static at the moment and
      there's no convenient way to tune it via slub's sysfs interface.
      
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
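      A sketch of what the tunable looks like in slub's sysfs attribute
      machinery (modeled on mm/slub.c conventions of the time; error handling
      is condensed, and set_min_partial() is an assumed helper that clamps
      and stores the value):

      	static ssize_t min_partial_show(struct kmem_cache *s, char *buf)
      	{
      		return sprintf(buf, "%lu\n", s->min_partial);
      	}

      	static ssize_t min_partial_store(struct kmem_cache *s, const char *buf,
      					 size_t length)
      	{
      		unsigned long min;
      		int err;

      		err = strict_strtoul(buf, 10, &min);	/* era-appropriate parser */
      		if (err)
      			return err;

      		set_min_partial(s, min);	/* assumed: clamp to a sane range */
      		return length;
      	}
      	SLAB_ATTR(min_partial);	/* exposes /sys/kernel/slab/<cache>/min_partial */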
    • slub: move min_partial to struct kmem_cache · 3b89d7d8
      Committed by David Rientjes
      Although it allows for better cacheline use, it is unnecessary to save a
      copy of the cache's min_partial value in each kmem_cache_node.
      
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
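      In outline (fields abridged, a sketch rather than the actual diff),
      the move is:

      	/* Before: one copy per node. */
      	struct kmem_cache_node {
      		unsigned long min_partial;	/* removed by this patch */
      		/* ... */
      	};

      	/* After: a single per-cache copy. */
      	struct kmem_cache {
      		unsigned long min_partial;	/* min partial slabs to keep */
      		/* ... */
      	};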
  25. 20 Feb 2009 (3 commits)
  26. 15 Feb 2009 (1 commit)
    • lockdep: annotate reclaim context (__GFP_NOFS) · cf40bd16
      Committed by Nick Piggin
      Here is another version, with the incremental patch rolled up,
      reclaim-context annotation added to kswapd, and allocation tracing
      added to the slab allocators (which may only ever reach the page
      allocator in rare cases, so it is good to put annotations here too).

      Haven't tested this version as such, but it should be getting closer
      to merge-worthy ;)
      
      --
      After noticing some code in mm/filemap.c accidentally performing a
      __GFP_FS allocation when it should not have, I thought it might be a
      good idea to try to catch this kind of thing with lockdep.
      
      I coded up a little idea that seems to work.  Unfortunately the system
      has to actually be in __GFP_FS page reclaim and then take the lock
      before lockdep will mark it.  But at least that might still be some
      orders of magnitude more common (and more debuggable) than an actual
      deadlock condition, so we have some improvement, I hope (the concept
      is no less complete than discovery of a lock's interrupt contexts).
      
      I guess we could even do the same thing with __GFP_IO (normal reclaim), and
      even GFP_NOIO locks too... but filesystems will have the most locks and fiddly
      code paths, so let's start there and see how it goes.
      
      It *seems* to work. I did a quick test.
      
      =================================
      [ INFO: inconsistent lock state ]
      2.6.28-rc6-00007-ged313489-dirty #26
      ---------------------------------
      inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage.
      modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
      {in-reclaim-W} state was registered at:
        [<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60
        [<ffffffff80268f71>] lock_acquire+0x91/0xc0
        [<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310
        [<ffffffffa002002b>] brd_init+0x2b/0x216 [brd]
        [<ffffffff8020903b>] _stext+0x3b/0x170
        [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
        [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
        [<ffffffffffffffff>] 0xffffffffffffffff
      irq event stamp: 3929
      hardirqs last  enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310
      hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310
      softirqs last  enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0
      softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0
      
      other info that might help us debug this:
      1 lock held by modprobe/8526:
       #0:  (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
      
      stack backtrace:
      Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged313489-dirty #26
      Call Trace:
       [<ffffffff80265483>] print_usage_bug+0x193/0x1d0
       [<ffffffff80266530>] mark_lock+0xaf0/0xca0
       [<ffffffff80266735>] mark_held_locks+0x55/0xc0
       [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
       [<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60
       [<ffffffff80285005>] __alloc_pages_internal+0x475/0x580
       [<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310
       [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
       [<ffffffffa002006a>] brd_init+0x6a/0x216 [brd]
       [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
       [<ffffffff8020903b>] _stext+0x3b/0x170
       [<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10
       [<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180
       [<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190
       [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
       [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
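      The annotation points land roughly as sketched below.
      lockdep_trace_alloc(), lockdep_set_current_reclaim_state(), and
      lockdep_clear_current_reclaim_state() are the interfaces this work
      introduced (later reworked into fs_reclaim_acquire()/release()); the
      surrounding allocator and reclaim code is condensed, and the _sketch
      function names are invented for illustration.

      	/* In the allocator entry path: tell lockdep this allocation
      	 * could enter reclaim, so held locks are checked against it. */
      	static struct page *alloc_pages_sketch(gfp_t gfp_mask, unsigned int order)
      	{
      		lockdep_trace_alloc(gfp_mask);
      		/* ...fast path; may fall into direct reclaim below... */
      		return NULL;
      	}

      	/* Around direct reclaim (and for kswapd's lifetime): mark the
      	 * task as being in reclaim for this gfp mask. */
      	static void direct_reclaim_sketch(gfp_t gfp_mask, int order)
      	{
      		lockdep_set_current_reclaim_state(gfp_mask);
      		/* ...try_to_free_pages(zonelist, order, gfp_mask)... */
      		lockdep_clear_current_reclaim_state();
      	}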
  27. 12 Feb 2009 (1 commit)
  28. 28 Jan 2009 (1 commit)
  29. 14 Jan 2009 (1 commit)
  30. 06 Jan 2009 (1 commit)