提交 · 6ae11b278bca1cd41651bae49a8c69de2f6a6262 · openeuler / raspberrypi-kernel

16 12月, 2009 5 次提交

hugetlb: add nodemask arg to huge page alloc, free and surplus adjust functions · 6ae11b27

由 Lee Schermerhorn 提交于 12月 14, 2009

In preparation for constraining huge page allocation and freeing by the
controlling task's numa mempolicy, add a "nodes_allowed" nodemask pointer
to the allocate, free and surplus adjustment functions.  For now, pass
NULL to indicate default behavior--i.e., use node_online_map.  A
subsqeuent patch will derive a non-default mask from the controlling
task's numa mempolicy.

Note that this method of updating the global hstate nr_hugepages under the
constraint of a nodemask simplifies keeping the global state
consistent--especially the number of persistent and surplus pages relative
to reservations and overcommit limits.  There are undoubtedly other ways
to do this, but this works for both interfaces: mempolicy and per node
attributes.

[rientjes@google.com: fix HIGHMEM compile error]
Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: NMel Gorman <mel@csn.ul.ie>
Acked-by: NDavid Rientjes <rientjes@google.com>
Reviewed-by: NAndi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6ae11b27

hugetlb: rework hstate_next_node_* functions · 9a76db09

由 Lee Schermerhorn 提交于 12月 14, 2009

Modify the hstate_next_node* functions to allow them to be called to
obtain the "start_nid".  Then, whereas prior to this patch we
unconditionally called hstate_next_node_to_{alloc|free}(), whether or not
we successfully allocated/freed a huge page on the node, now we only call
these functions on failure to alloc/free to advance to next allowed node.

Factor out the next_node_allowed() function to handle wrap at end of
node_online_map.  In this version, the allowed nodes include all of the
online nodes.
Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: NMel Gorman <mel@csn.ul.ie>
Acked-by: NDavid Rientjes <rientjes@google.com>
Reviewed-by: NAndi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9a76db09

mm: move inc_zone_page_state(NR_ISOLATED) to just isolated place · 6d9c285a

由 KOSAKI Motohiro 提交于 12月 14, 2009

Christoph pointed out inc_zone_page_state(NR_ISOLATED) should be placed
in right after isolate_page().

This patch does it.
Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6d9c285a

mmap: don't return ENOMEM when mapcount is temporarily exceeded in munmap() · 659ace58

由 KOSAKI Motohiro 提交于 12月 14, 2009

On ia64, the following test program exit abnormally, because glibc thread
library called abort().

 ========================================================
 (gdb) bt
 #0  0xa000000000010620 in __kernel_syscall_via_break ()
 #1  0x20000000003208e0 in raise () from /lib/libc.so.6.1
 #2  0x2000000000324090 in abort () from /lib/libc.so.6.1
 #3  0x200000000027c3e0 in __deallocate_stack () from /lib/libpthread.so.0
 #4  0x200000000027f7c0 in start_thread () from /lib/libpthread.so.0
 #5  0x200000000047ef60 in __clone2 () from /lib/libc.so.6.1
 ========================================================

The fact is, glibc call munmap() when thread exitng time for freeing
stack, and it assume munlock() never fail.  However, munmap() often make
vma splitting and it with many mapcount make -ENOMEM.

Oh well, that's crazy, because stack unmapping never increase mapcount.
The maxcount exceeding is only temporary.  internal temporary exceeding
shouldn't make ENOMEM.

This patch does it.

 test_max_mapcount.c
 ==================================================================
  #include<stdio.h>
  #include<stdlib.h>
  #include<string.h>
  #include<pthread.h>
  #include<errno.h>
  #include<unistd.h>

  #define THREAD_NUM 30000
  #define MAL_SIZE (8*1024*1024)

 void *wait_thread(void *args)
 {
 	void *addr;

 	addr = malloc(MAL_SIZE);
 	sleep(10);

 	return NULL;
 }

 void *wait_thread2(void *args)
 {
 	sleep(60);

 	return NULL;
 }

 int main(int argc, char *argv[])
 {
 	int i;
 	pthread_t thread[THREAD_NUM], th;
 	int ret, count = 0;
 	pthread_attr_t attr;

 	ret = pthread_attr_init(&attr);
 	if(ret) {
 		perror("pthread_attr_init");
 	}

 	ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
 	if(ret) {
 		perror("pthread_attr_setdetachstate");
 	}

 	for (i = 0; i < THREAD_NUM; i++) {
 		ret = pthread_create(&th, &attr, wait_thread, NULL);
 		if(ret) {
 			fprintf(stderr, "[%d] ", count);
 			perror("pthread_create");
 		} else {
 			printf("[%d] create OK.\n", count);
 		}
 		count++;

 		ret = pthread_create(&thread[i], &attr, wait_thread2, NULL);
 		if(ret) {
 			fprintf(stderr, "[%d] ", count);
 			perror("pthread_create");
 		} else {
 			printf("[%d] create OK.\n", count);
 		}
 		count++;
 	}

 	sleep(3600);
 	return 0;
 }
 ==================================================================

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

659ace58

oom: dump stack and VM state when oom killer panics · 1b604d75

由 David Rientjes 提交于 12月 14, 2009

The oom killer header, including information such as the allocation order
and gfp mask, current's cpuset and memory controller, call trace, and VM
state information is currently only shown when the oom killer has selected
a task to kill.

This information is omitted, however, when the oom killer panics either
because of panic_on_oom sysctl settings or when no killable task was
found.  It is still relevant to know crucial pieces of information such as
the allocation order and VM state when diagnosing such issues, especially
at boot.

This patch displays the oom killer header whenever it panics so that bug
reports can include pertinent information to debug the issue, if possible.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1b604d75

12 12月, 2009 1 次提交

mm: Adjust do_pages_stat() so gcc can see copy_from_user() is safe · b9255850

由 H. Peter Anvin 提交于 12月 08, 2009

Slightly adjust the logic for determining the size of the
copy_form_user() in do_pages_stat(); with this change, gcc can see
that the copying is safe.

Without this, we get a build error for i386 allyesconfig:

/home/hpa/kernel/linux-2.6-tip.urgent/arch/x86/include/asm/uaccess_32.h:213:
error: call to ‘copy_from_user_overflow’ declared with attribute
error: copy_from_user() buffer size is not provably correct

Unlike an earlier patch from Arjan, this doesn't introduce new
variables; merely reshuffles the compare so that gcc can see that an
overflow cannot happen.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
LKML-Reference: <20090926205406.30d55b08@infradead.org>

b9255850

11 12月, 2009 12 次提交

A
switch do_brk() to get_unmapped_area() · 2c6a1016
由 Al Viro 提交于 12月 03, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
2c6a1016

Take arch_mmap_check() into get_unmapped_area() · 9206de95

由 Al Viro 提交于 12月 03, 2009

Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9206de95

A
fix a struct file leak in do_mmap_pgoff() · 8c7b49b3
由 Al Viro 提交于 11月 30, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
8c7b49b3

Unify sys_mmap* · f8b72560

由 Al Viro 提交于 11月 30, 2009

New helper - sys_mmap_pgoff(); switch syscalls to using it.
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

f8b72560

fix pgoff in "have to relocate" case of mremap() · 93587414

由 Al Viro 提交于 11月 24, 2009

Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

93587414

fix the arch checks in MREMAP_FIXED case · 097eed10

由 Al Viro 提交于 11月 24, 2009

Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

097eed10

fix checks for expand-in-place mremap · f106af4e

由 Al Viro 提交于 11月 24, 2009

Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

f106af4e

do_mremap() untangling, part 3 · 1a0ef85f

由 Al Viro 提交于 11月 24, 2009

Take the check for being able to expand vma in place into a separate
helper.
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1a0ef85f

do_mremap() untangling, part 2 · ecc1a899

由 Al Viro 提交于 11月 24, 2009

Take the MREMAP_FIXED into a separate helper, simplify the living
hell out of conditions in both cases.
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ecc1a899

untangling do_mremap(), part 1 · 54f5de70

由 Al Viro 提交于 11月 24, 2009

Take locating vma and checks on it to a separate helper (it will be
shared between MREMAP_FIXED/non-MREMAP_FIXED cases when we split
them in the next patch)
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

54f5de70

tracing, slab: Fix no callsite ifndef CONFIG_KMEMTRACE · 0bb38a5c

由 Li Zefan 提交于 12月 11, 2009

For slab, if CONFIG_KMEMTRACE and CONFIG_DEBUG_SLAB are not set,
__do_kmalloc() will not track callers:

 # ./perf record -f -a -R -e kmem:kmalloc
 ^C
 # ./perf trace
 ...
          perf-2204  [000]   147.376774: kmalloc: call_site=c0529d2d ...
          perf-2204  [000]   147.400997: kmalloc: call_site=c0529d2d ...
          Xorg-1461  [001]   147.405413: kmalloc: call_site=0 ...
          Xorg-1461  [001]   147.405609: kmalloc: call_site=0 ...
       konsole-1776  [001]   147.405786: kmalloc: call_site=0 ...
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <4B21F8AE.6020804@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0bb38a5c

tracing, slab: Define kmem_cache_alloc_notrace ifdef CONFIG_TRACING · 0f24f128

由 Li Zefan 提交于 12月 11, 2009

Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is
enabled, otherwise perf-kmem will show wrong stats ifndef
CONFIG_KMEM_TRACE, because a kmalloc() memory allocation may
be traced by both trace_kmalloc() and trace_kmem_cache_alloc().
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <4B21F89A.7000801@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0f24f128

10 12月, 2009 1 次提交

kill wait_on_page_writeback_range · 94004ed7

由 Christoph Hellwig 提交于 9月 30, 2009

All callers really want the more logical filemap_fdatawait_range interface,
so convert them to use it and merge wait_on_page_writeback_range into
filemap_fdatawait_range.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

94004ed7

06 12月, 2009 3 次提交

slab, kmemleak: pass the correct pointer to kmemleak_erase() · ddbf2e83

由 J. R. Okajima 提交于 12月 02, 2009

In ____cache_alloc(), the variable 'ac' may be changed after
cache_alloc_refill() and the following kmemleak_erase() may get an incorrect
pointer. Update 'ac' after cache_alloc_refill() unconditionally.

See the following URL for the discussion of this patch:

http://marc.info/?l=linux-kernel&m=125873373124187&w=2Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NJ. R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

ddbf2e83

slab, kmemleak: stop calling kmemleak_erase() unconditionally · f3d8b53a

由 J. R. Okajima 提交于 12月 02, 2009

When the gotten object is NULL (probably due to ENOMEM), kmemleak_erase() is
unnecessary here, It just sets NULL to where already is NULL. Add a condition.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NJ. R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

f3d8b53a

SLAB: Fix unlikely() annotation in __cache_alloc_node() · 8e15b79c

由 Tim Blechmann 提交于 11月 30, 2009

Branch profiling on my nehalem machine showed 99% incorrect branch hints:

   28459  7678524  99 __cache_alloc_node             slab.c               3551

Discussion on lkml [1] led to the solution to remove this hint.

[1] http://patchwork.kernel.org/patch/63517/Signed-off-by: NTim Blechmann <tim@klingt.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

8e15b79c

04 12月, 2009 2 次提交

tree-wide: fix assorted typos all over the place · af901ca1

由 André Goddard Rosa 提交于 11月 14, 2009

That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.
Signed-off-by: NAndré Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

af901ca1

mm: fix comments for invalidate_inode_pages2() · e9de25dd

由 Peng Tao 提交于 10月 19, 2009

invalidate_inode_pages2() returns -EBUSY *NOT* -EIO if any pages could not be
invalidated.
Signed-off-by: NPeng Tao <bergwolf@gmail.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

e9de25dd

03 12月, 2009 2 次提交

writeback: remove unused nonblocking and congestion checks · 0d99519e

由 Wu Fengguang 提交于 12月 03, 2009

- no one is calling wb_writeback and write_cache_pages with
  wbc.nonblocking=1 any more
- lumpy pageout will want to do nonblocking writeback without the
  congestion wait

So remove the congestion checks as suggested by Chris.
Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Alex Elder <aelder@sgi.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

0d99519e

flusher: Fix PF_FROZEN race · bf7ec5bb

由 OGAWA Hirofumi 提交于 12月 03, 2009

To touch task->flags directly is racy. thaw_process() still has race
(changing non_current->flags, but this is another issue) though, I think
it's much better off.

So, use thaw_process() instead.
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

bf7ec5bb

01 12月, 2009 1 次提交

SLAB: Fix lockdep annotations for CPU hotplug · ce79ddc8

由 Pekka Enberg 提交于 11月 23, 2009

As reported by Paul McKenney:

  I am seeing some lockdep complaints in rcutorture runs that include
  frequent CPU-hotplug operations.  The tests are otherwise successful.
  My first thought was to send a patch that gave each array_cache
  structure's ->lock field its own struct lock_class_key, but you already
  have a init_lock_keys() that seems to be intended to deal with this.

  ------------------------------------------------------------------------

  =============================================
  [ INFO: possible recursive locking detected ]
  2.6.32-rc4-autokern1 #1
  ---------------------------------------------
  syslogd/2908 is trying to acquire lock:
   (&nc->lock){..-...}, at: [<c0000000001407f4>] .kmem_cache_free+0x118/0x2d4

  but task is already holding lock:
   (&nc->lock){..-...}, at: [<c0000000001411bc>] .kfree+0x1f0/0x324

  other info that might help us debug this:
  3 locks held by syslogd/2908:
   #0:  (&u->readlock){+.+.+.}, at: [<c0000000004556f8>] .unix_dgram_recvmsg+0x70/0x338
   #1:  (&nc->lock){..-...}, at: [<c0000000001411bc>] .kfree+0x1f0/0x324
   #2:  (&parent->list_lock){-.-...}, at: [<c000000000140f64>] .__drain_alien_cache+0x50/0xb8

  stack backtrace:
  Call Trace:
  [c0000000e8ccafc0] [c0000000000101e4] .show_stack+0x70/0x184 (unreliable)
  [c0000000e8ccb070] [c0000000000afebc] .validate_chain+0x6ec/0xf58
  [c0000000e8ccb180] [c0000000000b0ff0] .__lock_acquire+0x8c8/0x974
  [c0000000e8ccb280] [c0000000000b2290] .lock_acquire+0x140/0x18c
  [c0000000e8ccb350] [c000000000468df0] ._spin_lock+0x48/0x70
  [c0000000e8ccb3e0] [c0000000001407f4] .kmem_cache_free+0x118/0x2d4
  [c0000000e8ccb4a0] [c000000000140b90] .free_block+0x130/0x1a8
  [c0000000e8ccb540] [c000000000140f94] .__drain_alien_cache+0x80/0xb8
  [c0000000e8ccb5e0] [c0000000001411e0] .kfree+0x214/0x324
  [c0000000e8ccb6a0] [c0000000003ca860] .skb_release_data+0xe8/0x104
  [c0000000e8ccb730] [c0000000003ca2ec] .__kfree_skb+0x20/0xd4
  [c0000000e8ccb7b0] [c0000000003cf2c8] .skb_free_datagram+0x1c/0x5c
  [c0000000e8ccb830] [c00000000045597c] .unix_dgram_recvmsg+0x2f4/0x338
  [c0000000e8ccb920] [c0000000003c0f14] .sock_recvmsg+0xf4/0x13c
  [c0000000e8ccbb30] [c0000000003c28ec] .SyS_recvfrom+0xb4/0x130
  [c0000000e8ccbcb0] [c0000000003bfb78] .sys_recv+0x18/0x2c
  [c0000000e8ccbd20] [c0000000003ed388] .compat_sys_recv+0x14/0x28
  [c0000000e8ccbd90] [c0000000003ee1bc] .compat_sys_socketcall+0x178/0x220
  [c0000000e8ccbe30] [c0000000000085d4] syscall_exit+0x0/0x40

This patch fixes the issue by setting up lockdep annotations during CPU
hotplug.
Reported-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

ce79ddc8

29 11月, 2009 1 次提交

SLUB: Fix __GFP_ZERO unlikely() annotation · 74e2134f

由 Pekka Enberg 提交于 11月 25, 2009

The unlikely() annotation in slab_alloc() covers too much of the expression.
It's actually very likely that the object is not NULL so use unlikely() only
for the __GFP_ZERO expression like SLAB does.

The patch reduces kernel text by 29 bytes on x86-64:

   text	   data	    bss	    dec	    hex	filename
  24185	   8560	    176	  32921	   8099	mm/slub.o.orig
  24156	   8560	    176	  32892	   807c	mm/slub.o
Acked-by: NChristoph Lameter <cl@linux-foundation.org>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>

74e2134f

26 11月, 2009 2 次提交

tracing: Fix kmem event exports · 4d795fb1

由 Ingo Molnar 提交于 11月 26, 2009

Commit 53d0422c ("tracing: Convert some kmem events to DEFINE_EVENT")
moved the kmem tracepoint creation from util.c to page_alloc.c,
but forgot to move the exports.

Move them back.

Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4d795fb1

tracing: Convert some kmem events to DEFINE_EVENT · 53d0422c

由 Li Zefan 提交于 11月 26, 2009

Use DECLARE_EVENT_CLASS to remove duplicate code:

   text    data     bss     dec     hex filename
 333987   69800   27228  431015   693a7 mm/built-in.o.old
 330030   69800   27228  427058   68432 mm/built-in.o

8 events are converted:

  kmem_alloc: kmalloc, kmem_cache_alloc
  kmem_alloc_node: kmalloc_node, kmem_cache_alloc_node
  kmem_free: kfree, kmem_cache_free
  mm_page: mm_page_alloc_zone_locked, mm_page_pcpu_drain

No change in functionality.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

53d0422c

25 11月, 2009 1 次提交

percpu: Fix kdump failure if booted with percpu_alloc=page · 3b034b0d

由 Vivek Goyal 提交于 11月 24, 2009

o kdump functionality reserves a per cpu area at boot time and exports the
  physical address of that area to user space through sys interface. This
  area stores some dump related information like cpu register states etc
  at the time of crash.

o We were assuming that per cpu area always come from linearly mapped meory
  region and using __pa() to determine physical address.
  With percpu_alloc=page, per cpu area can come from vmalloc region also and
  __pa() breaks.

o This patch implments a new function to convert per cpu address to
  physical address.

Before the patch, crash_notes addresses looked as follows.

cpu0 60fffff49800
cpu1 60fffff60800
cpu2 60fffff77800

These are bogus phsyical addresses.

After the patch, address are following.

cpu0 13eb44000
cpu1 13eb43000
cpu2 13eb42000
cpu3 13eb41000

These look fine. I got 4G of memory and /proc/iomem tell me following.

100000000-13fffffff : System RAM

tj: * added missing asm/io.h include reported by Stephen Rothwell
    * repositioned per_cpu_ptr_phys() in percpu.c and added comment.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>

3b034b0d

18 11月, 2009 2 次提交

mm: allow memory hotplug and hibernation in the same kernel · 6ad696d2

由 Andi Kleen 提交于 11月 17, 2009

Allow memory hotplug and hibernation in the same kernel

Memory hotplug and hibernation were exclusive in Kconfig.  This is
obviously a problem for distribution kernels who want to support both in
the same image.

After some discussions with Rafael and others the only problem is with
parallel memory hotadd or removal while a hibernation operation is in
process.  It was also working for s390 before.

This patch removes the Kconfig level exclusion, and simply makes the
memory add / remove functions grab the pm_mutex to exclude against
hibernation.

Fixes a regression - old kernels didn't exclude memory hotadd and
hibernation.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6ad696d2

mm/memory_hotplug: fix section mismatch · e1319331

由 Hidetoshi Seto 提交于 11月 17, 2009

With CONFIG_MEMORY_HOTPLUG I got following warning:

WARNING: vmlinux.o(.text+0x1276b0): Section mismatch in reference from
the function hotadd_new_pgdat() to the function
.meminit.text:free_area_init_node()
The function hotadd_new_pgdat() references
the function __meminit free_area_init_node().
This is often because hotadd_new_pgdat lacks a __meminit
annotation or the annotation of free_area_init_node is wrong.

Use __ref to fix this.
Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e1319331

12 11月, 2009 5 次提交

percpu: restructure pcpu_extend_area_map() to fix bugs and improve readability · 833af842

由 Tejun Heo 提交于 11月 11, 2009

pcpu_extend_area_map() had the following two bugs.

* It should return 1 if pcpu_lock was dropped and reacquired but it
  returned 0.  This could lead to oops if free_percpu() races with
  area map extension.

* pcpu_mem_free() was called under pcpu_lock.  pcpu_mem_free() might
  end up calling vfree() which isn't IRQ safe.  This could lead to
  deadlock through lock order inversion via IRQ.

In addition, Linus pointed out that the temporary lock dropping and
subtle three-way return value of pcpu_extend_area_map() was very ugly
and suggested to split the function into two - pcpu_need_to_extend()
and pcpu_extend_area_map().

This patch restructures pcpu_extend_area_map() as suggested and fixes
the two bugs.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>

833af842

memcg: fix wrong pointer initialization at page migration when memcg is disabled. · e00e4316

由 KAMEZAWA Hiroyuki 提交于 11月 11, 2009

Lee Schermerhorn reported that he saw bad pointer dereference in
mem_cgroup_end_migration() when he disabled memcg by boot option.

memcg's page migration logic works as

	mem_cgroup_prepare_migration(page, &ptr);
	do page migration
	mem_cgroup_end_migration(page, ptr);

Now, ptr is not initialized in prepare_migration when memcg is disabled
by boot option. This causes panic in end_migration. This patch fixes it.
Reported-by: NLee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e00e4316

page allocator: Do not allow interrupts to use ALLOC_HARDER · 9d0ed60f

由 Mel Gorman 提交于 11月 11, 2009

Commit 341ce06f ("page allocator:
calculate the alloc_flags for allocation only once") altered watermark
logic slightly by allowing rt_tasks that are handling an interrupt to set
ALLOC_HARDER.  This patch brings the watermark logic more in line with
2.6.30.

This change results in a reduction of the number high-order GFP_ATOMIC
allocation failures reported.  See
http://www.gossamer-threads.com/lists/linux/kernel/1144153

[rientjes@google.com: Spotted the problem]
Signed-off-by: NMel Gorman <mel@csn.ul.ie>
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: NRik van Riel <riel@redhat.com>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9d0ed60f

page allocator: always wake kswapd when restarting an allocation attempt after... · cc4a6851

由 Mel Gorman 提交于 11月 11, 2009

page allocator: always wake kswapd when restarting an allocation attempt after direct reclaim failed

If a direct reclaim makes no forward progress, it considers whether it
should go OOM or not.  Whether OOM is triggered or not, it may retry the
allocation afterwards.  In times past, this would always wake kswapd as
well but currently, kswapd is not woken up after direct reclaim fails.
For order-0 allocations, this makes little difference but if there is a
heavy mix of higher-order allocations that direct reclaim is failing for,
it might mean that kswapd is not rewoken for higher orders as much as it
did previously.

This patch wakes up kswapd when an allocation is being retried after a
direct reclaim failure.  It would be expected that kswapd is already
awake, but this has the effect of telling kswapd to reclaim at the higher
order as well.
Signed-off-by: NMel Gorman <mel@csn.ul.ie>
Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
Reviewed-by: NPekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cc4a6851

Thaw refrigerated bdi flusher threads before invoking kthread_stop on them · c62b17a5

由 Romit Dasgupta 提交于 11月 12, 2009

Unfreezes the bdi flusher task when the said task needs to exit.

Steps to reproduce this.
1) Mount a file system from MMC/SD card.
2) Unmount the file system. This creates a flusher task.
3) Attempt suspend to RAM. System is unresponsive.

This is because the bdi flusher thread is already in the refrigerator and will
remain so until it is thawed. The MMC driver suspend routine call stack will
ultimately issue a 'kthread_stop' on the bdi flusher thread and will block
until the flusher thread is exited. Since the bdi flusher thread is in the
refrigerator it never cleans up until thawed.
Signed-off-by: NRomit Dasgupta <romit@ti.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

c62b17a5

10 11月, 2009 2 次提交

bootmem: Add free_bootmem_late() · 9f993ac3

由 FUJITA Tomonori 提交于 11月 10, 2009

Add a new function for freeing bootmem after the bootmem
allocator has been released and the unreserved pages given to
the page allocator.

This allows us to reserve bootmem and then release it if we
later discover it was not needed.

( This new API will be used by the swiotlb code to recover
  a significant amount of RAM (64MB). )
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
Cc: hannes@cmpxchg.org
Cc: tj@kernel.org
Cc: akpm@linux-foundation.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1257849980-22640-7-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9f993ac3

highmem: Fix debug_kmap_atomic() to also handle KM_IRQ_PTE, KM_NMI, and KM_NMI_PTE · d4515646

由 Soeren Sandmann 提交于 10月 28, 2009

Previously calling debug_kmap_atomic() with these types would
cause spurious warnings.

(triggered by SysProf using perf events)
Signed-off-by: NSoeren Sandmann Pedersen <sandmann@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: a.p.zijlstra@chello.nl
Cc: <stable@kernel.org> # .31.x
LKML-Reference: <ye8vdhz8krw.fsf@camel23.daimi.au.dk>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d4515646