1. 18 5月, 2009 1 次提交
    • Y
      mm, x86: remove MEMORY_HOTPLUG_RESERVE related code · 888a589f
      Yinghai Lu 提交于
      after:
      
       | commit b263295d
       | Author: Christoph Lameter <clameter@sgi.com>
       | Date:   Wed Jan 30 13:30:47 2008 +0100
       |
       |    x86: 64-bit, make sparsemem vmemmap the only memory model
      
      we don't have MEMORY_HOTPLUG_RESERVE anymore.
      
      Historically, x86-64 had an architecture-specific method for memory hotplug
      whereby it scanned the SRAT for physical memory ranges that could be
      potentially used for memory hot-add later. By reserving those ranges
      without physical memory, the memmap would be allocated and left dormant
      until needed. This depended on the DISCONTIG memory model which has been
      removed so the code implementing HOTPLUG_RESERVE is now dead.
      
      This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE.
      
      (Changelog authored by Mel.)
      
      v2: updated changelog, and remove hotadd= in doc
      
      [ Impact: remove dead code ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
      Reviewed-by: NMel Gorman <mel@csn.ul.ie>
      Workflow-found-OK-by: NAndrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A0C4910.7090508@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      888a589f
  2. 07 5月, 2009 2 次提交
  3. 01 4月, 2009 2 次提交
  4. 30 3月, 2009 1 次提交
  5. 13 3月, 2009 1 次提交
  6. 19 2月, 2009 2 次提交
    • K
      mm: fix memmap init for handling memory hole · cc2559bc
      KAMEZAWA Hiroyuki 提交于
      Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
      and memmap initialization was not done. This was a trouble for
      sparc boot.
      
      To fix this, the PFN should be initialized and marked as PG_reserved.
      This patch changes early_pfn_in_nid() return true if PFN is a hole.
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Reported-by: NDavid Miller <davem@davemlloft.net>
      Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cc2559bc
    • K
      mm: clean up for early_pfn_to_nid() · f2dbcfa7
      KAMEZAWA Hiroyuki 提交于
      What's happening is that the assertion in mm/page_alloc.c:move_freepages()
      is triggering:
      
      	BUG_ON(page_zone(start_page) != page_zone(end_page));
      
      Once I knew this is what was happening, I added some annotations:
      
      	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
      		printk(KERN_ERR "move_freepages: Bogus zones: "
      		       "start_page[%p] end_page[%p] zone[%p]\n",
      		       start_page, end_page, zone);
      		printk(KERN_ERR "move_freepages: "
      		       "start_zone[%p] end_zone[%p]\n",
      		       page_zone(start_page), page_zone(end_page));
      		printk(KERN_ERR "move_freepages: "
      		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
      		       page_to_pfn(start_page), page_to_pfn(end_page));
      		printk(KERN_ERR "move_freepages: "
      		       "start_nid[%d] end_nid[%d]\n",
      		       page_to_nid(start_page), page_to_nid(end_page));
       ...
      
      And here's what I got:
      
      	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
      	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
      	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
      	move_freepages: start_nid[1] end_nid[0]
      
      My memory layout on this box is:
      
      [    0.000000] Zone PFN ranges:
      [    0.000000]   Normal   0x00000000 -> 0x0081ff5d
      [    0.000000] Movable zone start PFN for each node
      [    0.000000] early_node_map[8] active PFN ranges
      [    0.000000]     0: 0x00000000 -> 0x00020000
      [    0.000000]     1: 0x00800000 -> 0x0081f7ff
      [    0.000000]     1: 0x0081f800 -> 0x0081fe50
      [    0.000000]     1: 0x0081fed1 -> 0x0081fed8
      [    0.000000]     1: 0x0081feda -> 0x0081fedb
      [    0.000000]     1: 0x0081fedd -> 0x0081fee5
      [    0.000000]     1: 0x0081fee7 -> 0x0081ff51
      [    0.000000]     1: 0x0081ff59 -> 0x0081ff5d
      
      So it's a block move in that 0x81f600-->0x81f7ff region which triggers
      the problem.
      
      This patch:
      
      Declaration of early_pfn_to_nid() is scattered over per-arch include
      files, and it seems it's complicated to know when the declaration is used.
       I think it makes fix-for-memmap-init not easy.
      
      This patch moves all declaration to include/linux/mm.h
      
      After this,
        if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
           -> Use static definition in include/linux/mm.h
        else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
           -> Use generic definition in mm/page_alloc.c
        else
           -> per-arch back end function will be called.
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reported-by: NDavid Miller <davem@davemlloft.net>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f2dbcfa7
  7. 15 2月, 2009 1 次提交
    • N
      lockdep: annotate reclaim context (__GFP_NOFS) · cf40bd16
      Nick Piggin 提交于
      Here is another version, with the incremental patch rolled up, and
      added reclaim context annotation to kswapd, and allocation tracing
      to slab allocators (which may only ever reach the page allocator
      in rare cases, so it is good to put annotations here too).
      
      Haven't tested this version as such, but it should be getting closer
      to merge worthy ;)
      
      --
      After noticing some code in mm/filemap.c accidentally perform a __GFP_FS
      allocation when it should not have been, I thought it might be a good idea to
      try to catch this kind of thing with lockdep.
      
      I coded up a little idea that seems to work. Unfortunately the system has to
      actually be in __GFP_FS page reclaim, then take the lock, before it will mark
      it. But at least that might still be some orders of magnitude more common
      (and more debuggable) than an actual deadlock condition, so we have some
      improvement I hope (the concept is no less complete than discovery of a lock's
      interrupt contexts).
      
      I guess we could even do the same thing with __GFP_IO (normal reclaim), and
      even GFP_NOIO locks too... but filesystems will have the most locks and fiddly
      code paths, so let's start there and see how it goes.
      
      It *seems* to work. I did a quick test.
      
      =================================
      [ INFO: inconsistent lock state ]
      2.6.28-rc6-00007-ged313489-dirty #26
      ---------------------------------
      inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage.
      modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
      {in-reclaim-W} state was registered at:
        [<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60
        [<ffffffff80268f71>] lock_acquire+0x91/0xc0
        [<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310
        [<ffffffffa002002b>] brd_init+0x2b/0x216 [brd]
        [<ffffffff8020903b>] _stext+0x3b/0x170
        [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
        [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
        [<ffffffffffffffff>] 0xffffffffffffffff
      irq event stamp: 3929
      hardirqs last  enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310
      hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310
      softirqs last  enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0
      softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0
      
      other info that might help us debug this:
      1 lock held by modprobe/8526:
       #0:  (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
      
      stack backtrace:
      Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged313489-dirty #26
      Call Trace:
       [<ffffffff80265483>] print_usage_bug+0x193/0x1d0
       [<ffffffff80266530>] mark_lock+0xaf0/0xca0
       [<ffffffff80266735>] mark_held_locks+0x55/0xc0
       [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
       [<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60
       [<ffffffff80285005>] __alloc_pages_internal+0x475/0x580
       [<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310
       [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
       [<ffffffffa002006a>] brd_init+0x6a/0x216 [brd]
       [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
       [<ffffffff8020903b>] _stext+0x3b/0x170
       [<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10
       [<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180
       [<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190
       [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
       [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cf40bd16
  8. 09 1月, 2009 1 次提交
  9. 07 1月, 2009 11 次提交
  10. 13 11月, 2008 1 次提交
    • D
      cpusets: update mems allowed in page allocator · e33c3b5e
      David Rientjes 提交于
      If all allowable memory is unreclaimable, it is possible to loop forever
      in the page allocator for ~__GFP_NORETRY allocations.
      
      During this time, it is also possible for a task's cpuset to expand its
      set of allowable nodes so that it now includes free memory.  The cached
      copy of this set, current->mems_allowed, is stale, however, since there
      has not been a subsequent call to cpuset_update_task_memory_state().
      
      The cached copy of the set of allowable nodes is now updated in the page
      allocator's slow path so the additional memory is available to
      get_page_from_freelist().
      
      [akpm@linux-foundation.org: add comment]
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e33c3b5e
  11. 07 11月, 2008 1 次提交
  12. 20 10月, 2008 10 次提交
  13. 17 10月, 2008 1 次提交
  14. 03 10月, 2008 1 次提交
    • A
      mm: handle initialising compound pages at orders greater than MAX_ORDER · 6babc32c
      Andy Whitcroft 提交于
      When we initialise a compound page we initialise the page flags and head
      page pointer for all base pages spanned by that page.  When we initialise
      a gigantic page (a page of order greater than or equal to MAX_ORDER) we
      have to initialise more than MAX_ORDER_NR_PAGES pages.  Currently we
      assume that all elements of the mem_map in this page are contigious in
      memory.  However this is only guarenteed out to MAX_ORDER_NR_PAGES pages,
      and with SPARSEMEM enabled they will not be contigious.  This leads us to
      walk off the end of the first section and scribble on everything which
      follows, BAD.
      
      When we reach a MAX_ORDER_NR_PAGES boundary we much locate the next
      section of the mem_map.  As gigantic pages can only be maximally aligned
      we know this will occur at exact multiple of MAX_ORDER_NR_PAGES pages from
      the start of the page.
      
      This is a bug fix for the gigantic page support in hugetlbfs.
      
      Credit to Mel Gorman for spotting the issue.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6babc32c
  15. 03 9月, 2008 2 次提交
  16. 13 8月, 2008 1 次提交
  17. 31 7月, 2008 1 次提交