1. 27 October 2005, 1 commit
  2. 21 October 2005, 1 commit
  3. 20 October 2005, 3 commits
    • [PATCH] swiotlb: make sure initial DMA allocations really are in DMA memory · 281dd25c
      Committed by Yasunori Goto
      This introduces a limit parameter to the core bootmem allocator; the
      new parameter indicates that physical memory allocated by the bootmem
      allocator should be within the requested limit.
      
      We also introduce the alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
      and alloc_bootmem_low_pages_node_limit APIs, but alloc_bootmem_low_pages_limit
      is the only API used for swiotlb.
      
      The existing alloc_bootmem_low_pages() API could instead have been
      changed to pass the right limit to the core allocator.  But that would
      make the patch more intrusive for 2.6.14, as other arches use
      alloc_bootmem_low_pages().  We may do that as a cleanup after 2.6.14.
      
      With this, swiotlb gets memory within 4G for both x86_64 and ia64
      arches.
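
      A minimal sketch of what the changelog describes, assuming a
      __alloc_bootmem_limit() core entry point and treating the 4GB ceiling
      value and swiotlb variable names as illustrative:

          /* the new wrapper just forwards a physical-address ceiling
           * to the core bootmem allocator (0 goal, page alignment) */
          #define alloc_bootmem_low_pages_limit(x, limit) \
                  __alloc_bootmem_limit((x), PAGE_SIZE, 0, (limit))

          /* swiotlb bootstrap: bounce buffers must be DMA-addressable,
           * so cap the allocation below the 4GB boundary */
          io_tlb_start = alloc_bootmem_low_pages_limit(bytes, 0x100000000UL);
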
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: hugetlb truncation fixes · 1c59827d
      Committed by Hugh Dickins
      hugetlbfs allows truncation of its files (should it?), but hugetlb.c often
      forgets that: crashes and misaccounting ensue.
      
      copy_hugetlb_page_range had better grab the src page_table_lock, since
      we don't want to guess what happens if the mapping is concurrently
      truncated.  The rss accounting in unmap_hugepage_range must not assume
      the full range was mapped.  follow_hugetlb_page must guard with
      page_table_lock and be prepared to exit early.
      
      Restyle copy_hugetlb_page_range with a for loop like the others there.
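
      A sketch of the restyled copy loop under those rules (this follows the
      function shape described above; treat it as illustrative rather than
      the exact committed code):

          int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
                                      struct vm_area_struct *vma)
          {
                  pte_t *src_pte, *dst_pte, entry;
                  unsigned long addr;

                  for (addr = vma->vm_start; addr < vma->vm_end; addr += HPAGE_SIZE) {
                          dst_pte = huge_pte_alloc(dst, addr);
                          if (!dst_pte)
                                  return -ENOMEM;
                          /* hold the src lock so a concurrent truncation
                           * cannot free the page under us */
                          spin_lock(&src->page_table_lock);
                          src_pte = huge_pte_offset(src, addr);
                          if (src_pte && !pte_none(*src_pte)) {
                                  entry = *src_pte;
                                  get_page(pte_page(entry));
                                  add_mm_counter(dst, rss, HPAGE_SIZE / PAGE_SIZE);
                                  set_huge_pte_at(dst, addr, dst_pte, entry);
                          }
                          spin_unlock(&src->page_table_lock);
                  }
                  return 0;
          }
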
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Handle spurious page fault for hugetlb region · 3359b54c
      Committed by Rohit Seth
      The hugetlb pages are currently pre-faulted.  At the time of mmap of
      hugepages, we populate the new PTEs.  It is possible that HW has already
      cached some of the unused PTEs internally.  These stale entries never
      get a chance to be purged in existing control flow.
      
      This patch extends the check in the page fault code for hugepages:
      check whether a faulting address falls within the size of the hugetlb
      file backing it.  We return VM_FAULT_MINOR for these cases (assuming
      that the arch-specific page-faulting code purges the stale entry on
      the archs that need it).
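
      A sketch of the check being described (the index arithmetic is
      illustrative; VM_FAULT_MINOR tells the arch code that purging the
      stale entry is all that is needed):

          /* in the fault path, once the VMA is known to be a hugepage area */
          if (is_vm_hugetlb_page(vma)) {
                  struct inode *inode = vma->vm_file->f_dentry->d_inode;
                  unsigned long idx;

                  idx = ((address - vma->vm_start) >> HPAGE_SHIFT)
                          + (vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT));

                  /* PTEs were prefaulted at mmap() time, so a fault inside
                   * the backing file can only come from a stale hw-cached
                   * entry: just report VM_FAULT_MINOR */
                  if (idx < (i_size_read(inode) >> HPAGE_SHIFT))
                          return VM_FAULT_MINOR;
                  return VM_FAULT_SIGBUS;
          }
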
      Signed-off-by: Rohit Seth <rohit.seth@intel.com>
      
      [ This is apparently arguably an ia64 port bug. But the code won't
        hurt, and for now it fixes a real problem on some ia64 machines ]
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  4. 17 October 2005, 1 commit
    • Fix memory ordering bug in page reclaim · 3d80636a
      Committed by Linus Torvalds
      As noticed by Nick Piggin, we need to make sure that we check the page
      count before we check for PageDirty, since the dirty check is only valid
      if the count implies that we're the only possible ones holding the page.
      
      We always did do this, but the code needs a read memory barrier to make
      sure that the ordering is also honored by the CPU.
      
      (The writer side is ordered due to the atomic decrement and test on the
      page count, see the discussion on linux-kernel)
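
      A sketch of the ordering requirement in the reclaim path (labels are
      illustrative; the count of 2 assumes our reference plus the page
      cache's):

          write_lock_irq(&mapping->tree_lock);
          /* the dirty test below is only meaningful if nobody else
           * can hold a reference */
          if (unlikely(page_count(page) != 2))
                  goto cannot_free;
          smp_rmb();      /* order the count read before the dirty read;
                           * the freeing side is ordered by its atomic
                           * decrement-and-test on the count */
          if (unlikely(PageDirty(page)))
                  goto cannot_free;
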
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  5. 12 October 2005, 2 commits
  6. 09 October 2005, 1 commit
  7. 01 October 2005, 1 commit
  8. 28 September 2005, 2 commits
  9. 24 September 2005, 1 commit
  10. 23 September 2005, 4 commits
    • [PATCH] Fix bd_claim() error code. · f7b3a435
      Committed by Rob Landley
      Problem: In some circumstances, bd_claim() is returning the wrong error
      code.
      
      If we try to swapon an unused block device that isn't swap formatted, we
      get -EINVAL.  But if that same block device is already mounted, we instead
      get -EBUSY, even though it still isn't a valid swap device.
      
      This issue came up on the busybox list trying to get the error message
      from "swapon -a" right.  If a swap device is already enabled, we get -EBUSY,
      and we shouldn't report this as an error.  But we can't distinguish the two
      -EBUSY conditions, which are very different errors.
      
      In the code, bd_claim() returns either 0 or -EBUSY, but in this case busy
      means "somebody other than sys_swapon has already claimed this", and
      _that_ means this block device can't be a valid swap device.  So return
      -EINVAL there.
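
      A sketch of the fix in the sys_swapon() error path (the bad_swap label
      is illustrative of the surrounding cleanup code):

          error = bd_claim(bdev, sys_swapon);
          if (error < 0) {
                  /* bd_claim() reports -EBUSY, but "claimed by someone
                   * other than sys_swapon" means this cannot be a valid
                   * swap device, so report -EINVAL instead */
                  bdev = NULL;
                  error = -EINVAL;
                  goto bad_swap;
          }
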
      Signed-off-by: Rob Landley <rob@landley.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] __kmalloc: Generate BUG if size requested is too large. · eafb4270
      Committed by Christoph Lameter
      I had an issue on ia64 where I got a bug in kernel/workqueue because
      kzalloc returned a NULL pointer due to the task structure getting too big
      for the slab allocator.  Usually these cases are caught by the kmalloc
      macro in include/linux/slab.h.
      
      Compilation will fail if too large a value is passed to kmalloc.
      
      However, kzalloc uses __kmalloc, which has no such check.  This patch
      makes __kmalloc BUG() if too large an allocation is requested.
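
      A sketch of the change, assuming the 2.6.14-era shape of __kmalloc()
      (flag type and helper names as in that era's mm/slab.c):

          void *__kmalloc(size_t size, unsigned int __nocast flags)
          {
                  kmem_cache_t *cachep;

                  /* an oversized request has no general cache to back it;
                   * BUG() instead of silently returning NULL */
                  cachep = __find_general_cachep(size, flags);
                  if (unlikely(cachep == NULL))
                          BUG();
                  return __cache_alloc(cachep, flags);
          }
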
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] slab: fix handling of pages from foreign NUMA nodes · ff69416e
      Committed by Christoph Lameter
      The NUMA slab allocator may allocate pages from foreign nodes onto the
      lists for a particular node if a node runs out of memory.  Inspecting
      the slab->nodeid field will then not reveal that the page is now in use
      by the slabs of another node.
      
      This patch fixes that issue by adding a node field to free_block so that
      the caller can indicate which node currently uses a slab.
      
      This also removes the check for the current node from kmalloc_cache_node,
      since the process may later shift to another node, which could lead to
      an allocation on a different node than intended.
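
      A sketch of the interface change (2.6.14-era types; the call site shown
      is illustrative):

          /* the caller now states which node's lists the objects belong
           * to, instead of free_block() inferring it from the slab page */
          static void free_block(kmem_cache_t *cachep, void **objpp,
                                 int nr_objects, int node);

          /* e.g. draining a CPU's array cache back to the local node */
          free_block(cachep, ac->entry, ac->avail, numa_node_id());
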
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] slab: alpha inlining fix · 7243cc05
      Committed by Ivan Kokshaysky
      It is essential that index_of() be inlined.  But alpha undoes the gcc
      inlining hackery and index_of() ends up out-of-line.  So fiddle with things
      to make that function inline again.
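
      A sketch of the forced-inline shape (assuming the __always_inline
      attribute; the CACHE() x-macro is how mm/slab.c enumerates its cache
      sizes):

          /* must collapse to a compile-time constant, even on alpha */
          static __always_inline int index_of(const size_t size)
          {
                  int i = 0;

          #define CACHE(x) \
                  if (size <= x) \
                          return i; \
                  else \
                          i++;
          #include <linux/kmalloc_sizes.h>
          #undef CACHE
                  {
                          /* deliberately unresolved symbol: a link-time
                           * error for sizes no cache can satisfy */
                          extern void __bad_size(void);
                          __bad_size();
                  }
                  return 0;
          }
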
      
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  11. 22 September 2005, 2 commits
  12. 18 September 2005, 1 commit
  13. 15 September 2005, 2 commits
    • [PATCH] Fix slab BUG_ON() triggered by change in array cache size · c7e43c78
      Committed by Alok Kataria
      With the new changes that we made in the initialization of the slab
      allocator, we first set up the cache from which array caches are
      allocated, and then the cache from which kmem_list3's are allocated.

      Now, if the array cache comes from a cache whose objsize > 32 (in this
      instance size-64), then the size-64 cache will be allocated first, and
      then the size-128 cache (if that is the cache from which kmem_list3's
      are going to be allocated).
      
      So with these new changes, we are not guaranteed that we will be
      initializing the malloc_sizes array in a serialized order.  Thus there
      is a bug in __find_general_cachep, as it checks whether the first
      cache_sizes ptr is NULL.

      This is replaced by checking whether the array-cache cache is
      initialized.  Attached is a patch which does that.  Boots fine on an
      x86-64, with DEBUG_SPIN, DEBUG_SLAB, and preempt.  Thanks & Regards,
      Alok
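
      A sketch of the corrected check (INDEX_AC is the malloc_sizes index of
      the cache backing array caches; the rest follows the 2.6.14-era
      helper):

          static inline kmem_cache_t *__find_general_cachep(size_t size,
                                          unsigned int __nocast gfpflags)
          {
                  struct cache_sizes *csizep = malloc_sizes;

                  /* caches are no longer bootstrapped in index order, so
                   * test the array-cache cache, which is set up first */
                  BUG_ON(malloc_sizes[INDEX_AC].cs_cachep == NULL);

                  while (size > csizep->cs_size)
                          csizep++;
                  if (unlikely(gfpflags & GFP_DMA))
                          return csizep->cs_dmacachep;
                  return csizep->cs_cachep;
          }
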
      Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Shobhit Dayal <shobhitdayal.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] error path in setup_arg_pages() misses vm_unacct_memory() · 2fd4ef85
      Committed by Hugh Dickins
      Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
      security_vm_enough_memory tend to forget to vm_unacct_memory when a
      failure occurs further down (typically in setup_arg_pages variants).
      
      These are all users of insert_vm_struct, and that reservation will only
      be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
      cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.
      
      So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
      Committed_AS each time they're run.  But don't add VM_ACCOUNT to them;
      it's inappropriate to reserve against the very unlikely case that gdb
      is used to COW a vDSO page - we ought to do something about that in
      do_wp_page, but there are yet other inconsistencies to be resolved.
      
      The safe and economical way to fix this is to let insert_vm_struct do
      the security_vm_enough_memory check when it finds VM_ACCOUNT is set.
      
      And the MIPS irix_brk has been calling security_vm_enough_memory before
      calling do_brk, which repeats it: doubly accounting and so also leaking.
      Remove that, and all the fs and arch calls to security_vm_enough_memory;
      give it a less misleading name later on.
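
      A sketch of the centralized check inside insert_vm_struct() (the
      charged calculation is spelled out for clarity):

          /* before linking the vma: reserve here, so every VM_ACCOUNT
           * vma is charged exactly once and unaccounted on exit */
          if (vma->vm_flags & VM_ACCOUNT) {
                  unsigned long charged =
                          (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;

                  if (security_vm_enough_memory(charged))
                          return -ENOMEM;
          }
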
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Kirill Korotaev <dev@sw.ru>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. 13 September 2005, 4 commits
  15. 12 September 2005, 1 commit
    • [PATCH] uclinux: add NULL check, 0 end valid check and some more exports to nommu.c · 66aa2b4b
      Committed by Greg Ungerer
      Move the call to get_mm_counter() in update_mem_hiwater() inside the
      check for tsk->mm being NULL; otherwise we can end up following a NULL
      pointer here.  This fix was submitted by
      Javier Herrero <jherrero@hvsistemas.es>.
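
      A sketch of the guarded form (field names follow the 2.6.14-era
      mm_struct hiwater counters):

          void update_mem_hiwater(struct task_struct *tsk)
          {
                  unsigned long rss;

                  /* kernel threads have no mm: check before dereferencing */
                  if (likely(tsk->mm)) {
                          rss = get_mm_counter(tsk->mm, rss);
                          if (tsk->mm->hiwater_rss < rss)
                                  tsk->mm->hiwater_rss = rss;
                          if (tsk->mm->hiwater_vm < tsk->mm->total_vm)
                                  tsk->mm->hiwater_vm = tsk->mm->total_vm;
                  }
          }
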
      
      Modify the end check for munmap regions to allow for the legacy
      behavior of 0 being valid.  Pretty much all current uClinux system
      libc mallocs pass in 0 as the end point.  A hard check fails on
      these, so change the check so that a non-zero end must be valid,
      while a zero end always succeeds (as it used to).
      
      Also export a few more mm system functions - to be consistent
      with the VM code exports.
      Signed-off-by: Greg Ungerer <gerg@uclinux.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 11 September 2005, 5 commits
  17. 10 September 2005, 5 commits
    • [PATCH] timer initialization cleanup: DEFINE_TIMER · 8d06afab
      Committed by Ingo Molnar
      Clean up timer initialization by introducing DEFINE_TIMER, a la
      DEFINE_SPINLOCK.  Build- and boot-tested on x86.  A similar patch has
      been in the -RT tree for some time.
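
      A sketch of the macro and its use (this mirrors DEFINE_SPINLOCK; the
      my_timer/my_timeout names are illustrative):

          #define DEFINE_TIMER(_name, _function, _expires, _data)        \
                  struct timer_list _name =                              \
                          TIMER_INITIALIZER(_function, _expires, _data)

          /* before: a global struct plus a runtime init_timer() call;
           * after: one static definition */
          static void my_timeout(unsigned long data);
          static DEFINE_TIMER(my_timer, my_timeout, 0, 0);
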
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] update kfree, vfree, and vunmap kerneldoc · 80e93eff
      Committed by Pekka Enberg
      This patch clarifies the NULL handling of kfree() and vfree().  In
      addition, the wording of the calling-context restriction for vfree()
      and vunmap() is changed from "may not" to "must not."
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Numa-aware slab allocator V5 · e498be7d
      Committed by Christoph Lameter
      The NUMA API change that introduced kmalloc_node was accepted for
      2.6.12-rc3.  It is now possible to do slab allocations on a node to
      localize memory structures.  This API was used by the pageset
      localization patch and the block layer localization patch now in -mm.
      The existing kmalloc_node is slow, since it simply searches through all
      pages of the slab to find a page that is on the requested node.  The
      two patches do a one-time allocation of slab structures at
      initialization, and therefore the speed of kmalloc_node does not
      matter to them.
      
      This patch allows kmalloc_node to be as fast as kmalloc by introducing
      node-specific page lists for partial, free and full slabs.  Slab
      allocation improves on a NUMA system, so that we see a performance
      gain in AIM7 of about 5% with this patch alone.

      More NUMA localizations are possible if kmalloc_node operates as fast
      as kmalloc.
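
      A sketch of the per-node bookkeeping this introduces (abbreviated from
      the patch; reap timing and limit fields are omitted):

          /* one of these per node per cache, instead of one global set
           * of slab lists per cache */
          struct kmem_list3 {
                  struct list_head slabs_partial; /* allocate from these first */
                  struct list_head slabs_full;
                  struct list_head slabs_free;
                  unsigned long free_objects;
                  spinlock_t list_lock;
                  struct array_cache *shared;     /* per-node shared cache */
                  struct array_cache **alien;     /* batched wrong-node frees */
          };

          /* each kmem cache then points at one kmem_list3 per node */
          struct kmem_list3 *nodelists[MAX_NUMNODES];
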
      
      Test run on a 32p system with 32G RAM.
      
      w/o patch
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      485.36  100       485.3640     11.99      1.91   Sat Apr 30 14:01:51 2005
        100    26582.63   88       265.8263     21.89    144.96   Sat Apr 30 14:02:14 2005
        200    29866.83   81       149.3342     38.97    286.08   Sat Apr 30 14:02:53 2005
        300    33127.16   78       110.4239     52.71    426.54   Sat Apr 30 14:03:46 2005
        400    34889.47   80        87.2237     66.72    568.90   Sat Apr 30 14:04:53 2005
        500    35654.34   76        71.3087     81.62    714.55   Sat Apr 30 14:06:15 2005
        600    36460.83   75        60.7681     95.77    853.42   Sat Apr 30 14:07:51 2005
        700    35957.00   75        51.3671    113.30    990.67   Sat Apr 30 14:09:45 2005
        800    33380.65   73        41.7258    139.48   1140.86   Sat Apr 30 14:12:05 2005
        900    35095.01   76        38.9945    149.25   1281.30   Sat Apr 30 14:14:35 2005
       1000    36094.37   74        36.0944    161.24   1419.66   Sat Apr 30 14:17:17 2005
      
      w/patch
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      484.27  100       484.2736     12.02      1.93   Sat Apr 30 15:59:45 2005
        100    28262.03   90       282.6203     20.59    143.57   Sat Apr 30 16:00:06 2005
        200    32246.45   82       161.2322     36.10    282.89   Sat Apr 30 16:00:42 2005
        300    37945.80   83       126.4860     46.01    418.75   Sat Apr 30 16:01:28 2005
        400    40000.69   81       100.0017     58.20    561.48   Sat Apr 30 16:02:27 2005
        500    40976.10   78        81.9522     71.02    696.95   Sat Apr 30 16:03:38 2005
        600    41121.54   78        68.5359     84.92    834.86   Sat Apr 30 16:05:04 2005
        700    44052.77   78        62.9325     92.48    971.53   Sat Apr 30 16:06:37 2005
        800    41066.89   79        51.3336    113.38   1111.15   Sat Apr 30 16:08:31 2005
        900    38918.77   79        43.2431    134.59   1252.57   Sat Apr 30 16:10:46 2005
       1000    41842.21   76        41.8422    139.09   1392.33   Sat Apr 30 16:13:05 2005
      
      These are measurements taken directly after boot and show a greater
      improvement than 5%.  However, the performance improvement becomes
      smaller as the AIM7 runs are repeated, settling down at around 5%.
      
      Links to earlier discussions:
      http://marc.theaimsgroup.com/?t=111094594500003&r=1&w=2
      http://marc.theaimsgroup.com/?t=111603406600002&r=1&w=2
      
      Changelog V4-V5:
      - alloc_arraycache and alloc_aliencache take node parameter instead of cpu
      - fix initialization so that nodes without cpus are properly handled.
      - simplify code in kmem_cache_init
      - Patch against Andrew's temp mm3 release
      - Add Shai to credits
      - fallback to __cache_alloc from __cache_alloc_node if the node's cache
        is not available yet.
      
      Changelog V3-V4:
      - Patch against 2.6.12-rc5-mm1
      - Cleanup patch integrated
      - More and better use of for_each_node and for_each_cpu
      - GCC 2.95 fix (do not use [] use [0])
      - Correct determination of INDEX_AC
      - Remove hack to cause an error on platforms that have no CONFIG_NUMA but nodes.
      - Remove list3_data and list3_data_ptr macros for better readability
      
      Changelog V2-V3:
      - Made to patch against 2.6.12-rc4-mm1
      - Revised bootstrap mechanism so that larger size kmem_list3 structs can be
        supported. Do a generic solution so that the right slab can be found
        for the internal structs.
      - use for_each_online_node
      
      Changelog V1-V2:
      - Batching for freeing of wrong-node objects (alien caches)
      - Locking changes and NUMA #ifdefs as requested by Manfred
      Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
      Signed-off-by: Shai Fultheim <Shai@Scalex86.org>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] tmpfs: Enable atomic inode security labeling · 570bc1c2
      Committed by Stephen Smalley
      This patch modifies tmpfs to call the inode_init_security LSM hook to set
      up the incore inode security state for new inodes before the inode becomes
      accessible via the dcache.
      
      As there is no underlying storage of security xattrs in this case, it is
      not necessary for the hook to return the (name, value, len) triple to the
      tmpfs code, so this patch also modifies the SELinux hook function to
      correctly handle the case where the (name, value, len) pointers are NULL.
      
      The hook call is needed in tmpfs in order to support proper security
      labeling of tmpfs inodes (e.g.  for udev with tmpfs /dev in Fedora).  With
      this change in place, we should then be able to remove the
      security_inode_post_create/mkdir/...  hooks safely.
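
      A sketch of the hook call as described (its placement inside shmem's
      inode setup and the error-handling shape are illustrative; the NULL
      triple is the tmpfs case from the text):

          /* label the inode before it becomes reachable via the dcache */
          error = security_inode_init_security(inode, dir, NULL, NULL, NULL);
          if (error && error != -EOPNOTSUPP) {
                  iput(inode);
                  return error;
          }
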
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] update filesystems for new delete_inode behavior · fef26658
      Committed by Mark Fasheh
      Update the filesystems in fs/ implementing a delete_inode() callback to
      call truncate_inode_pages().  One implementation note: in developing
      this patch I put the calls to truncate_inode_pages() at the very top of
      those filesystems' delete_inode() callbacks in order to retain the
      previous behavior.  I'm guessing that some of those could probably be
      optimized.
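
      A sketch of the pattern applied (myfs_delete_inode is a hypothetical
      callback standing in for the patched filesystems):

          static void myfs_delete_inode(struct inode *inode)
          {
                  /* flush cached pages first, retaining the behavior the
                   * VFS previously provided before calling us */
                  truncate_inode_pages(&inode->i_data, 0);

                  /* filesystem-specific teardown continues as before */
                  clear_inode(inode);
          }
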
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Acked-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 09 September 2005, 1 commit
  19. 08 September 2005, 2 commits