提交 · ab4d5ed5eeda4f57c50d14131ce1b1da75d0c938 · openeuler / Kernel

06 10月, 2010 1 次提交

slub: Enable sysfs support for !CONFIG_SLUB_DEBUG · ab4d5ed5

由 Christoph Lameter 提交于 10月 05, 2010

Currently disabling CONFIG_SLUB_DEBUG also disabled SYSFS support meaning
that the slabs cannot be tuned without DEBUG.

Make SYSFS support independent of CONFIG_SLUB_DEBUG
Signed-off-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NPekka Enberg <penberg@kernel.org>

ab4d5ed5

02 10月, 2010 2 次提交

slub: reduce differences between SMP and NUMA · 7340cc84

由 Christoph Lameter 提交于 9月 28, 2010

Reduce the #ifdefs and simplify bootstrap by making SMP and NUMA as much alike
as possible. This means that there will be an additional indirection to get to
the kmem_cache_node field under SMP.
Acked-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NChristoph Lameter <cl@linux.com>
Signed-off-by: NPekka Enberg <penberg@kernel.org>

7340cc84

slub: Dynamically size kmalloc cache allocations · 51df1142

由 Christoph Lameter 提交于 8月 20, 2010

kmalloc caches are statically defined and may take up a lot of space just
because the sizes of the node array has to be dimensioned for the largest
node count supported.

This patch makes the size of the kmem_cache structure dynamic throughout by
creating a kmem_cache slab cache for the kmem_cache objects. The bootstrap
occurs by allocating the initial one or two kmem_cache objects from the
page allocator.

C2->C3
	- Fix various issues indicated by David
	- Make create kmalloc_cache return a kmem_cache * pointer.
Acked-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@kernel.org>

51df1142

11 8月, 2010 1 次提交

dma-mapping: rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN · a6eb9fe1

由 FUJITA Tomonori 提交于 8月 10, 2010

Now each architecture has the own dma_get_cache_alignment implementation.

dma_get_cache_alignment returns the minimum DMA alignment.  Architectures
define it as ARCH_KMALLOC_MINALIGN (it's used to make sure that malloc'ed
buffer is DMA-safe; the buffer doesn't share a cache with the others).  So
we can unify dma_get_cache_alignment implementations.

This patch:

dma_get_cache_alignment() needs to know if an architecture defines
ARCH_KMALLOC_MINALIGN or not (needs to know if architecture has DMA
alignment restriction).  However, slab.h define ARCH_KMALLOC_MINALIGN if
architectures doesn't define it.

Let's rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN.
ARCH_KMALLOC_MINALIGN is used only in the internals of slab/slob/slub
(except for crypto).
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a6eb9fe1

09 8月, 2010 1 次提交

slub: add missing __percpu markup in mm/slub_def.h · 1b5ad248

由 Namhyung Kim 提交于 8月 07, 2010

kmem_cache->cpu_slab is a percpu pointer but was missing __percpu
markup.  Add it.
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NPekka Enberg <penberg@kernel.org>

1b5ad248

09 6月, 2010 1 次提交

tracing: Remove kmemtrace ftrace plugin · 039ca4e7

由 Li Zefan 提交于 5月 26, 2010

We have been resisting new ftrace plugins and removing existing
ones, and kmemtrace has been superseded by kmem trace events
and perf-kmem, so we remove it.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NEduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ remove kmemtrace from the makefile, handle slob too ]
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>

039ca4e7

30 5月, 2010 1 次提交

SLUB: Allow full duplication of kmalloc array for 390 · 0f1f6942

由 Christoph Lameter 提交于 5月 27, 2010

Commit 756dee75 ("SLUB: Get rid of dynamic DMA
kmalloc cache allocation") makes S390 run out of kmalloc caches.  Increase the
number of kmalloc caches to a safe size.

Cc: <stable@kernel.org> [ .33 and .34 ]
Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

0f1f6942

25 5月, 2010 1 次提交

slub: move kmem_cache_node into it's own cacheline · 73367bd8

由 Alexander Duyck 提交于 5月 21, 2010

This patch is meant to improve the performance of SLUB by moving the local
kmem_cache_node lock into it's own cacheline separate from kmem_cache.
This is accomplished by simply removing the local_node when NUMA is enabled.

On my system with 2 nodes I saw around a 5% performance increase w/
hackbench times dropping from 6.2 seconds to 5.9 seconds on average.  I
suspect the performance gain would increase as the number of nodes
increases, but I do not have the data to currently back that up.

Bugzilla-Reference: http://bugzilla.kernel.org/show_bug.cgi?id=15713
Cc: <stable@kernel.org>
Reported-by: NAlex Shi <alex.shi@intel.com>
Tested-by: NAlex Shi <alex.shi@intel.com>
Acked-by: NYanmin Zhang <yanmin_zhang@linux.intel.com>
Acked-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

73367bd8

20 5月, 2010 1 次提交

mm: Move ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN to <linux/slub_def.h> · 4581ced3

由 David Woodhouse 提交于 5月 19, 2010

Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

4581ced3

20 12月, 2009 3 次提交

SLUB: this_cpu: Remove slub kmem_cache fields · ff12059e

由 Christoph Lameter 提交于 12月 18, 2009

Remove the fields in struct kmem_cache_cpu that were used to cache data from
struct kmem_cache when they were in different cachelines. The cacheline that
holds the per cpu array pointer now also holds these values. We can cut down
the struct kmem_cache_cpu size to almost half.

The get_freepointer() and set_freepointer() functions that used to be only
intended for the slow path now are also useful for the hot path since access
to the size field does not require accessing an additional cacheline anymore.
This results in consistent use of functions for setting the freepointer of
objects throughout SLUB.

Also we initialize all possible kmem_cache_cpu structures when a slab is
created. No need to initialize them when a processor or node comes online.
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

ff12059e

SLUB: Get rid of dynamic DMA kmalloc cache allocation · 756dee75

由 Christoph Lameter 提交于 12月 18, 2009

Dynamic DMA kmalloc cache allocation is troublesome since the
new percpu allocator does not support allocations in atomic contexts.
Reserve some statically allocated kmalloc_cpu structures instead.
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

756dee75

SLUB: Use this_cpu operations in slub · 9dfc6e68

由 Christoph Lameter 提交于 12月 18, 2009

Using per cpu allocations removes the needs for the per cpu arrays in the
kmem_cache struct. These could get quite big if we have to support systems
with thousands of cpus. The use of this_cpu_xx operations results in:

1. The size of kmem_cache for SMP configuration shrinks since we will only
   need 1 pointer instead of NR_CPUS. The same pointer can be used by all
   processors. Reduces cache footprint of the allocator.

2. We can dynamically size kmem_cache according to the actual nodes in the
   system meaning less memory overhead for configurations that may potentially
   support up to 1k NUMA nodes / 4k cpus.

3. We can remove the diddle widdle with allocating and releasing of
   kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu
   alloc logic will do it all for us. Removes some portions of the cpu hotplug
   functionality.

4. Fastpath performance increases since per cpu pointer lookups and
   address calculations are avoided.

V7-V8
- Convert missed get_cpu_slab() under CONFIG_SLUB_STATS
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

9dfc6e68

11 12月, 2009 1 次提交

tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING · 0f24f128

由 Li Zefan 提交于 12月 11, 2009

Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is
enabled, otherwise perf-kmem will show wrong stats ifndef
CONFIG_KMEM_TRACE, because a kmalloc() memory allocation may
be traced by both trace_kmalloc() and trace_kmem_cache_alloc().
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <4B21F89A.7000801@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0f24f128

30 8月, 2009 1 次提交

SLUB: fix ARCH_KMALLOC_MINALIGN cases 64 and 256 · acdfcd04

由 Aaro Koskinen 提交于 8月 28, 2009

If the minalign is 64 bytes, then the 96 byte cache should not be created
because it would conflict with the 128 byte cache.

If the minalign is 256 bytes, patching the size_index table should not
result in a buffer overrun.

The calculation "(i - 1) / 8" used to access size_index[] is moved to
a separate function as suggested by Christoph Lameter.
Acked-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NAaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

acdfcd04

06 8月, 2009 1 次提交

slab: remove duplicate kmem_cache_init_late() declarations · bbff2e43

由 Wu Fengguang 提交于 8月 06, 2009

kmem_cache_init_late() has been declared in slab.h

CC: Nick Piggin <npiggin@suse.de>
CC: Matt Mackall <mpm@selenic.com>
CC: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

bbff2e43

08 7月, 2009 1 次提交

kmemleak: Trace the kmalloc_large* functions in slub · e4f7c0b4

由 Catalin Marinas 提交于 7月 07, 2009

The kmalloc_large() and kmalloc_large_node() functions were missed when
adding the kmemleak hooks to the slub allocator. However, they should be
traced to avoid false positives.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>

e4f7c0b4

12 6月, 2009 1 次提交

slab,slub: don't enable interrupts during early boot · 7e85ee0c

由 Pekka Enberg 提交于 6月 12, 2009

As explained by Benjamin Herrenschmidt:

  Oh and btw, your patch alone doesn't fix powerpc, because it's missing
  a whole bunch of GFP_KERNEL's in the arch code... You would have to
  grep the entire kernel for things that check slab_is_available() and
  even then you'll be missing some.

  For example, slab_is_available() didn't always exist, and so in the
  early days on powerpc, we used a mem_init_done global that is set form
  mem_init() (not perfect but works in practice). And we still have code
  using that to do the test.

Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators
in early boot code to avoid enabling interrupts.
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

7e85ee0c

12 4月, 2009 1 次提交

tracing, kmemtrace: Separate include/trace/kmemtrace.h to kmemtrace part and tracepoint part · 02af61bb

由 Zhaolei 提交于 4月 10, 2009

Impact: refactor code for future changes

Current kmemtrace.h is used both as header file of kmemtrace and kmem's
tracepoints definition.

Tracepoints' definition file may be used by other code, and should only have
definition of tracepoint.

We can separate include/trace/kmemtrace.h into 2 files:

  include/linux/kmemtrace.h: header file for kmemtrace
  include/trace/kmem.h:      definition of kmem tracepoints
Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: NEduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

02af61bb

03 4月, 2009 1 次提交

kmemtrace: use tracepoints · ca2b84cb

由 Eduard - Gabriel Munteanu 提交于 3月 23, 2009

kmemtrace now uses tracepoints instead of markers. We no longer need to
use format specifiers to pass arguments.
Signed-off-by: NEduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
[ folded: Use the new TP_PROTO and TP_ARGS to fix the build.     ]
[ folded: fix build when CONFIG_KMEMTRACE is disabled.           ]
[ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ]
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <ae61c0f37156db8ec8dc0d5778018edde60a92e3.1237813499.git.eduard.munteanu@linux360.ro>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ca2b84cb

23 2月, 2009 1 次提交

slub: move min_partial to struct kmem_cache · 3b89d7d8

由 David Rientjes 提交于 2月 22, 2009

Although it allows for better cacheline use, it is unnecessary to save a
copy of the cache's min_partial value in each kmem_cache_node.

Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

3b89d7d8

20 2月, 2009 3 次提交

SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constants · fe1200b6

由 Christoph Lameter 提交于 2月 17, 2009

As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.
Reported-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

fe1200b6

SLUB: Do not pass 8k objects through to the page allocator · 51735a7c

由 Pekka Enberg 提交于 2月 20, 2009

Increase the maximum object size in SLUB so that 8k objects are not
passed through to the page allocator anymore. The network stack uses 8k
objects for performance critical operations.

The patch is motivated by a SLAB vs. SLUB regression in the netperf
benchmark. The problem is that the kfree(skb->head) call in
skb_release_data() that is subject to page allocator pass-through as the
size passed to __alloc_skb() is larger than 4 KB in this test.

As explained by Yanmin Zhang:

  I use 2.6.29-rc2 kernel to run netperf UDP-U-4k CPU_NUM client/server
  pair loopback testing on x86-64 machines. Comparing with SLUB, SLAB's
  result is about 2.3 times of SLUB's. After applying the reverting patch,
  the result difference between SLUB and SLAB becomes 1% which we might
  consider as fluctuation.

[ penberg@cs.helsinki.fi: fix oops in kmalloc() ]
Reported-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

51735a7c

SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constants · ffadd4d0

由 Christoph Lameter 提交于 2月 17, 2009

As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.
Reported-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

ffadd4d0

30 12月, 2008 1 次提交

tracing/kmemtrace: normalize the raw tracer event to the unified tracing API · 36994e58

由 Frederic Weisbecker 提交于 12月 29, 2008

Impact: new tracer plugin

This patch adapts kmemtrace raw events tracing to the unified tracing API.

To enable and use this tracer, just do the following:

 echo kmemtrace > /debugfs/tracing/current_tracer
 cat /debugfs/tracing/trace

You will have the following output:

 # tracer: kmemtrace
 #
 #
 # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
 # FREE   |      |     |       |              |   |            |        |
 # |

type_id 1 call_site 18446744071565527833 ptr 18446612134395152256
type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1
type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
type_id 1 call_site 18446744071565585534 ptr 18446612134405955584

That was to stay backward compatible with the format output produced in
inux/tracepoint.h.

This is the default ouput, but note that I tried something else.

If you change an option:

echo kmem_minimalistic > /debugfs/trace_options

and then cat /debugfs/trace, you will have the following output:

 # tracer: kmemtrace
 #
 #
 # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
 # FREE   |      |     |       |              |   |            |        |
 # |

   -      C                            0xffff88007c088780          file_free_rcu
   +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
   -      C                            0xffff88007cad6000          putname
   +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
   +      K    240    240   000000d0   0xffff8800790dc780     -1   d_alloc
   -      C                            0xffff88007cad6000          putname
   +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
   +      K    240    240   000000d0   0xffff8800790dc870     -1   d_alloc
   -      C                            0xffff88007cad6000          putname
   +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
   +      K    240    240   000000d0   0xffff8800790dc960     -1   d_alloc
   +      K   1304   1312   000000d0   0xffff8800791d7340     -1   reiserfs_alloc_inode
   -      C                            0xffff88007cad6000          putname
   +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
   -      C                            0xffff88007cad6000          putname
   +      K    992   1000   000000d0   0xffff880079045b58     -1   alloc_inode
   +      K    768   1024   000080d0   0xffff88007c096400     -1   alloc_pipe_info
   +      K    240    240   000000d0   0xffff8800790dca50     -1   d_alloc
   +      K    272    320   000080d0   0xffff88007c088780     -1   get_empty_filp
   +      K    272    320   000080d0   0xffff88007c088000     -1   get_empty_filp

Yeah I shall confess kmem_minimalistic should be: kmem_alternative.

Whatever, I find it more readable but this a personal opinion of course.
We can drop it if you want.

On the ALLOC/FREE column, + means an allocation and - a free.

On the type column, you have K = kmalloc, C = cache, P = page

I would like the flags to be GFP_* strings but that would not be easy to not
break the column with strings....

About the node...it seems to always be -1. I don't know why but that shouldn't
be difficult to find.

I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would
be more easy to find the tracer headers if they are all in their common
directory.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

36994e58

29 12月, 2008 1 次提交

kmemtrace: SLUB hooks. · 5b882be4

由 Eduard - Gabriel Munteanu 提交于 8月 19, 2008

This adds hooks for the SLUB allocator, to allow tracing with kmemtrace.
Signed-off-by: NEduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

5b882be4

05 8月, 2008 1 次提交

SLUB: dynamic per-cache MIN_PARTIAL · 5595cffc

由 Pekka Enberg 提交于 8月 05, 2008

This patch changes the static MIN_PARTIAL to a dynamic per-cache ->min_partial
value that is calculated from object size. The bigger the object size, the more
pages we keep on the partial list.

I tested SLAB, SLUB, and SLUB with this patch on Jens Axboe's 'netio' example
script of the fio benchmarking tool. The script stresses the networking
subsystem which should also give a fairly good beating of kmalloc() et al.

To run the test yourself, first clone the fio repository:

  git clone git://git.kernel.dk/fio.git

and then run the following command n times on your machine:

  time ./fio examples/netio

The results on my 2-way 64-bit x86 machine are as follows:

  [ the minimum, maximum, and average are captured from 50 individual runs ]

                 real time (seconds)
                 min      max      avg      sd
  SLAB           22.76    23.38    22.98    0.17
  SLUB           22.80    25.78    23.46    0.72
  SLUB (dynamic) 22.74    23.54    23.00    0.20

                 sys time (seconds)
                 min      max      avg      sd
  SLAB           6.90     8.28     7.70     0.28
  SLUB           7.42     16.95    8.89     2.28
  SLUB (dynamic) 7.17     8.64     7.73     0.29

                 user time (seconds)
                 min      max      avg      sd
  SLAB           36.89    38.11    37.50    0.29
  SLUB           30.85    37.99    37.06    1.67
  SLUB (dynamic) 36.75    38.07    37.59    0.32

As you can see from the above numbers, this patch brings SLUB to the same level
as SLAB for this particular workload fixing a ~2% regression. I'd expect this
change to help similar workloads that allocate a lot of objects that are close
to the size of a page.

Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

5595cffc

27 7月, 2008 1 次提交

SL*B: drop kmem cache argument from constructor · 51cc5068

由 Alexey Dobriyan 提交于 7月 25, 2008

Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres.  Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.

Non-trivial places are:
	arch/powerpc/mm/init_64.c
	arch/powerpc/mm/hugetlbpage.c

This is flag day, yes.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

51cc5068

05 7月, 2008 1 次提交

Christoph has moved · cde53535

由 Christoph Lameter 提交于 7月 04, 2008

Remove all clameter@sgi.com addresses from the kernel tree since they will
become invalid on June 27th.  Change my maintainer email address for the
slab allocators to cl@linux-foundation.org (which will be the new email
address for the future).
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cde53535

04 7月, 2008 1 次提交

slub: Do not use 192 byte sized cache if minimum alignment is 128 byte · 41d54d3b

由 Christoph Lameter 提交于 7月 03, 2008

The 192 byte cache is not necessary if we have a basic alignment of 128
byte. If it would be used then the 192 would be aligned to the next 128 byte
boundary which would result in another 256 byte cache. Two 256 kmalloc caches
cause sysfs to complain about a duplicate entry.

MIPS needs 128 byte aligned kmalloc caches and spits out warnings on boot without
this patch.
Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

41d54d3b

27 4月, 2008 3 次提交

slub: Fallback to minimal order during slab page allocation · 65c3376a

由 Christoph Lameter 提交于 4月 14, 2008

If any higher order allocation fails then fall back the smallest order
necessary to contain at least one object. This enables fallback for all
allocations to order 0 pages. The fallback will waste more memory (objects
will not fit neatly) and the fallback slabs will be not as efficient as larger
slabs since they contain less objects.

Note that SLAB also depends on order 1 allocations for some slabs that waste
too much memory if forced into PAGE_SIZE'd page. SLUB now can now deal with
failing order 1 allocs which SLAB cannot do.

Add a new field min that will contain the objects for the smallest possible order
for a slab cache.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

65c3376a

slub: Update statistics handling for variable order slabs · 205ab99d

由 Christoph Lameter 提交于 4月 14, 2008

Change the statistics to consider that slabs of the same slabcache
can have different number of objects in them since they may be of
different order.

Provide a new sysfs field

	total_objects

which shows the total objects that the allocated slabs of a slabcache
could hold.

Add a max field that holds the largest slab order that was ever used
for a slab cache.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

205ab99d

slub: Add kmem_cache_order_objects struct · 834f3d11

由 Christoph Lameter 提交于 4月 14, 2008

Pack the order and the number of objects into a single word.
This saves some memory in the kmem_cache_structure and more importantly
allows us to fetch both values atomically.

Later the slab orders become runtime configurable and we need to fetch these
two items together in order to properly allocate a slab and initialize its
objects.

Fix the race by fetching the order and the number of objects in one word.

[penberg@cs.helsinki.fi: fix memset() page order in new_slab()]
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

834f3d11

14 4月, 2008 1 次提交

slub: No need for per node slab counters if !SLUB_DEBUG · 0f389ec6

由 Christoph Lameter 提交于 4月 14, 2008

The per node counters are used mainly for showing data through the sysfs API.
If that API is not compiled in then there is no point in keeping track of this
data. Disable counters for the number of slabs and the number of total slabs
if !SLUB_DEBUG. Incrementing the per node counters is also accessing a
potentially contended cacheline so this could actually be a performance
benefit to embedded systems.

SLABINFO support is also affected. It now must depends on SLUB_DEBUG (which
is on by default).

Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
if the system is not compiled with NUMA support.

[penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

0f389ec6

04 3月, 2008 1 次提交

slub: Fix up comments · 6446faa2

由 Christoph Lameter 提交于 2月 15, 2008

Provide comments and fix up various spelling / style issues.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>

6446faa2

15 2月, 2008 3 次提交

slub: Support 4k kmallocs again to compensate for page allocator slowness · 331dc558

由 Christoph Lameter 提交于 2月 14, 2008

Currently we hand off PAGE_SIZEd kmallocs to the page allocator in the
mistaken belief that the page allocator can handle these allocations
effectively. However, measurements indicate a minimum slowdown by the
factor of 8 (and that is only SMP, NUMA is much worse) vs the slub fastpath
which causes regressions in tbench.

Increase the number of kmalloc caches by one so that we again handle 4k
kmallocs directly from slub. 4k page buffering for the page allocator
will be performed by slub like done by slab.

At some point the page allocator fastpath should be fixed. A lot of the kernel
would benefit from a faster ability to allocate a single page. If that is
done then the 4k allocs may again be forwarded to the page allocator and this
patch could be reverted.
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NMel Gorman <mel@csn.ul.ie>
Signed-off-by: NChristoph Lameter <clameter@sgi.com>

331dc558

slub: Determine gfpflags once and not every time a slab is allocated · b7a49f0d

由 Christoph Lameter 提交于 2月 14, 2008

Currently we determine the gfp flags to pass to the page allocator
each time a slab is being allocated.

Determine the bits to be set at the time the slab is created. Store
in a new allocflags field and add the flags in allocate_slab().
Acked-by: NMel Gorman <mel@csn.ul.ie>
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NChristoph Lameter <clameter@sgi.com>

b7a49f0d

slub: kmalloc page allocator pass-through cleanup · eada35ef

由 Pekka Enberg 提交于 2月 11, 2008

This adds a proper function for kmalloc page allocator pass-through. While it
simplifies any code that does slab tracing code a lot, I think it's a
worthwhile cleanup in itself.
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NChristoph Lameter <clameter@sgi.com>

eada35ef

08 2月, 2008 1 次提交

SLUB: Support for performance statistics · 8ff12cfc

由 Christoph Lameter 提交于 2月 07, 2008

The statistics provided here allow the monitoring of allocator behavior but
at the cost of some (minimal) loss of performance. Counters are placed in
SLUB's per cpu data structure. The per cpu structure may be extended by the
statistics to grow larger than one cacheline which will increase the cache
footprint of SLUB.

There is a compile option to enable/disable the inclusion of the runtime
statistics and its off by default.

The slabinfo tool is enhanced to support these statistics via two options:

-D 	Switches the line of information displayed for a slab from size
	mode to activity mode.

-A	Sorts the slabs displayed by activity. This allows the display of
	the slabs most important to the performance of a certain load.

-r	Report option will report detailed statistics on

Example (tbench load):

slabinfo -AD		->Shows the most active slabs

Name                   Objects    Alloc     Free   %Fast
skbuff_fclone_cache         33 111953835 111953835  99  99
:0000192                  2666  5283688  5281047  99  99
:0001024                   849  5247230  5246389  83  83
vm_area_struct            1349   119642   118355  91  22
:0004096                    15    66753    66751  98  98
:0000064                  2067    25297    23383  98  78
dentry                   10259    28635    18464  91  45
:0000080                 11004    18950     8089  98  98
:0000096                  1703    12358    10784  99  98
:0000128                   762    10582     9875  94  18
:0000512                   184     9807     9647  95  81
:0002048                   479     9669     9195  83  65
anon_vma                   777     9461     9002  99  71
kmalloc-8                 6492     9981     5624  99  97
:0000768                   258     7174     6931  58  15

So the skbuff_fclone_cache is of highest importance for the tbench load.
Pretty high load on the 192 sized slab. Look for the aliases

slabinfo -a | grep 000192
:0000192     <- xfs_btree_cur filp kmalloc-192 uid_cache tw_sock_TCP
	request_sock_TCPv6 tw_sock_TCPv6 skbuff_head_cache xfs_ili

Likely skbuff_head_cache.


Looking into the statistics of the skbuff_fclone_cache is possible through

slabinfo skbuff_fclone_cache	->-r option implied if cache name is mentioned


.... Usual output ...

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath             111953360 111946981  99  99
Slowpath                 1044     7423   0   0
Page Alloc                272      264   0   0
Add partial                25      325   0   0
Remove partial             86      264   0   0
RemoteObj/SlabFrozen      350     4832   0   0
Total                111954404 111954404

Flushes       49 Refill        0
Deactivate Full=325(92%) Empty=0(0%) ToHead=24(6%) ToTail=1(0%)

Looks good because the fastpath is overwhelmingly taken.


skbuff_head_cache:

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath              5297262  5259882  99  99
Slowpath                 4477    39586   0   0
Page Alloc                937      824   0   0
Add partial                 0     2515   0   0
Remove partial           1691      824   0   0
RemoteObj/SlabFrozen     2621     9684   0   0
Total                 5301739  5299468

Deactivate Full=2620(100%) Empty=0(0%) ToHead=0(0%) ToTail=0(0%)


Descriptions of the output:

Total:		The total number of allocation and frees that occurred for a
		slab

Fastpath:	The number of allocations/frees that used the fastpath.

Slowpath:	Other allocations

Page Alloc:	Number of calls to the page allocator as a result of slowpath
		processing

Add Partial:	Number of slabs added to the partial list through free or
		alloc (occurs during cpuslab flushes)

Remove Partial:	Number of slabs removed from the partial list as a result of
		allocations retrieving a partial slab or by a free freeing
		the last object of a slab.

RemoteObj/Froz:	How many times were remotely freed object encountered when a
		slab was about to be deactivated. Frozen: How many times was
		free able to skip list processing because the slab was in use
		as the cpuslab of another processor.

Flushes:	Number of times the cpuslab was flushed on request
		(kmem_cache_shrink, may result from races in __slab_alloc)

Refill:		Number of times we were able to refill the cpuslab from
		remotely freed objects for the same slab.

Deactivate:	Statistics how slabs were deactivated. Shows how they were
		put onto the partial list.

In general fastpath is very good. Slowpath without partial list processing is
also desirable. Any touching of partial list uses node specific locks which
may potentially cause list lock contention.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>

8ff12cfc

05 2月, 2008 2 次提交

Explain kmem_cache_cpu fields · da89b79e

由 Christoph Lameter 提交于 1月 07, 2008

Add some comments explaining the fields of the kmem_cache_cpu structure.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

da89b79e

SLUB: rename defrag to remote_node_defrag_ratio · 9824601e

由 Christoph Lameter 提交于 1月 07, 2008

The NUMA defrag works by allocating objects from partial slabs on remote
nodes.  Rename it to

	remote_node_defrag_ratio

to be clear about this.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

9824601e

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功