1. 17 1月, 2010 1 次提交
  2. 24 12月, 2009 1 次提交
  3. 18 12月, 2009 1 次提交
    • R
      mm: Add notifier in pageblock isolation for balloon drivers · 925cc71e
      Robert Jennings 提交于
      Memory balloon drivers can allocate a large amount of memory which is not
      movable but could be freed to accomodate memory hotplug remove.
      
      Prior to calling the memory hotplug notifier chain the memory in the
      pageblock is isolated.  Currently, if the migrate type is not
      MIGRATE_MOVABLE the isolation will not proceed, causing the memory removal
      for that page range to fail.
      
      Rather than failing pageblock isolation if the migrateteype is not
      MIGRATE_MOVABLE, this patch checks if all of the pages in the pageblock,
      and not on the LRU, are owned by a registered balloon driver (or other
      entity) using a notifier chain.  If all of the non-movable pages are owned
      by a balloon, they can be freed later through the memory notifier chain
      and the range can still be isolated in set_migratetype_isolate().
      Signed-off-by: NRobert Jennings <rcj@linux.vnet.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Brian King <brking@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Gerald Schaefer <geralds@linux.vnet.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      925cc71e
  4. 17 12月, 2009 1 次提交
    • Y
      x86: Fix checking of SRAT when node 0 ram is not from 0 · 32996250
      Yinghai Lu 提交于
      Found one system that boot from socket1 instead of socket0, SRAT get rejected...
      
      [    0.000000] SRAT: Node 1 PXM 0 0-a0000
      [    0.000000] SRAT: Node 1 PXM 0 100000-80000000
      [    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
      [    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
      [    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
      [    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
      [    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
      [    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
      [    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
      [    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
      ...
      [    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
      [    0.000000] NUMA: Using 20 for the hash shift.
      [    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
      [    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
      [    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
      [    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
      [    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
      [    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
      [    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
      [    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
      [    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
      [    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
      [    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
      [    0.000000] SRAT: SRAT not used.
      
      the early_node_map is not sorted because node0 with non zero start come first.
      
      so try to sort it right away after all regions are registered.
      
      also fixs refression by 8716273c (x86: Export srat physical topology)
      
      -v2: make it more solid to handle cross node case like node0 [0,4g), [8,12g) and node1 [4g, 8g), [12g, 16g)
      -v3: update comments.
      Reported-and-tested-by: NJens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B2579D2.3010201@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      32996250
  5. 16 12月, 2009 3 次提交
    • K
      oom-kill: fix NUMA constraint check with nodemask · 4365a567
      KAMEZAWA Hiroyuki 提交于
      Fix node-oriented allocation handling in oom-kill.c I myself think of this
      as a bugfix not as an ehnancement.
      
      In these days, things are changed as
        - alloc_pages() eats nodemask as its arguments, __alloc_pages_nodemask().
        - mempolicy don't maintain its own private zonelists.
        (And cpuset doesn't use nodemask for __alloc_pages_nodemask())
      
      So, current oom-killer's check function is wrong.
      
      This patch does
        - check nodemask, if nodemask && nodemask doesn't cover all
          node_states[N_HIGH_MEMORY], this is CONSTRAINT_MEMORY_POLICY.
        - Scan all zonelist under nodemask, if it hits cpuset's wall
          this faiulre is from cpuset.
      And
        - modifies the caller of out_of_memory not to call oom if __GFP_THISNODE.
          This doesn't change "current" behavior. If callers use __GFP_THISNODE
          it should handle "page allocation failure" by itself.
      
        - handle __GFP_NOFAIL+__GFP_THISNODE path.
          This is something like a FIXME but this gfpmask is not used now.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hioryu@jp.fujitsu.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4365a567
    • W
      HWPOISON: detect free buddy pages explicitly · 8d22ba1b
      Wu Fengguang 提交于
      Most free pages in the buddy system have no PG_buddy set.
      Introduce is_free_buddy_page() for detecting them reliably.
      
      CC: Nick Piggin <npiggin@suse.de>
      CC: Mel Gorman <mel@linux.vnet.ibm.com>
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      8d22ba1b
    • H
      mm: CONFIG_MMU for PG_mlocked · af8e3354
      Hugh Dickins 提交于
      Remove three degrees of obfuscation, left over from when we had
      CONFIG_UNEVICTABLE_LRU.  MLOCK_PAGES is CONFIG_HAVE_MLOCKED_PAGE_BIT is
      CONFIG_HAVE_MLOCK is CONFIG_MMU.  rmap.o (and memory-failure.o) are only
      built when CONFIG_MMU, so don't need such conditions at all.
      
      Somehow, I feel no compulsion to remove the CONFIG_HAVE_MLOCK* lines from
      169 defconfigs: leave those to evolve in due course.
      Signed-off-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Izik Eidus <ieidus@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      af8e3354
  6. 26 11月, 2009 2 次提交
    • I
      tracing: Fix kmem event exports · 4d795fb1
      Ingo Molnar 提交于
      Commit 53d0422c ("tracing: Convert some kmem events to DEFINE_EVENT")
      moved the kmem tracepoint creation from util.c to page_alloc.c,
      but forgot to move the exports.
      
      Move them back.
      
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4d795fb1
    • L
      tracing: Convert some kmem events to DEFINE_EVENT · 53d0422c
      Li Zefan 提交于
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
       333987   69800   27228  431015   693a7 mm/built-in.o.old
       330030   69800   27228  427058   68432 mm/built-in.o
      
      8 events are converted:
      
        kmem_alloc: kmalloc, kmem_cache_alloc
        kmem_alloc_node: kmalloc_node, kmem_cache_alloc_node
        kmem_free: kfree, kmem_cache_free
        mm_page: mm_page_alloc_zone_locked, mm_page_pcpu_drain
      
      No change in functionality.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      53d0422c
  7. 12 11月, 2009 2 次提交
  8. 29 10月, 2009 1 次提交
    • A
      revert "mm: oom analysis: add buffer cache information to show_free_areas()" · b76146ed
      Andrew Morton 提交于
      Revert
      
          commit 71de1ccb
          Author:     KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
          AuthorDate: Mon Sep 21 17:01:31 2009 -0700
          Commit:     Linus Torvalds <torvalds@linux-foundation.org>
          CommitDate: Tue Sep 22 07:17:27 2009 -0700
      
              mm: oom analysis: add buffer cache information to show_free_areas()
      
      show_free_areas() is called during page allocation failures, and page
      allocation failures can occur in any calling context.
      
      But nr_blockdev_pages() takes VFS locks which should not be taken from
      hard IRQ context (at least).  The result is lockdep warnings (and
      deadlockability) during page allocation failures.
      
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b76146ed
  9. 24 9月, 2009 1 次提交
  10. 22 9月, 2009 25 次提交
  11. 16 9月, 2009 1 次提交
    • W
      HWPOISON: check and isolate corrupted free pages v2 · 2a7684a2
      Wu Fengguang 提交于
      If memory corruption hits the free buddy pages, we can safely ignore them.
      No one will access them until page allocation time, then prep_new_page()
      will automatically check and isolate PG_hwpoison page for us (for 0-order
      allocation).
      
      This patch expands prep_new_page() to check every component page in a high
      order page allocation, in order to completely stop PG_hwpoison pages from
      being recirculated.
      
      Note that the common case -- only allocating a single page, doesn't
      do any more work than before. Allocating > order 0 does a bit more work,
      but that's relatively uncommon.
      
      This simple implementation may drop some innocent neighbor pages, hopefully
      it is not a big problem because the event should be rare enough.
      
      This patch adds some runtime costs to high order page users.
      
      [AK: Improved description]
      
      v2: Andi Kleen:
      Port to -mm code
      Move check into separate function.
      Don't dump stack in bad_pages for hwpoisoned pages.
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      2a7684a2
  12. 06 9月, 2009 1 次提交
    • M
      page-allocator: always change pageblock ownership when anti-fragmentation is disabled · dd5d241e
      Mel Gorman 提交于
      On low-memory systems, anti-fragmentation gets disabled as fragmentation
      cannot be avoided on a sufficiently large boundary to be worthwhile.  Once
      disabled, there is a period of time when all the pageblocks are marked
      MOVABLE and the expectation is that they get marked UNMOVABLE at each call
      to __rmqueue_fallback().
      
      However, when MAX_ORDER is large the pageblocks do not change ownership
      because the normal criteria are not met.  This has the effect of
      prematurely breaking up too many large contiguous blocks.  This is most
      serious on NOMMU systems which depend on high-order allocations to boot.
      This patch causes pageblocks to change ownership on every fallback when
      anti-fragmentation is disabled.  This prevents the large blocks being
      prematurely broken up.
      
      This is a fix to commit 49255c61 [page
      allocator: move check for disabled anti-fragmentation out of fastpath] and
      the problem affects 2.6.31-rc8.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Tested-by: NPaul Mundt <lethal@linux-sh.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: NGreg Ungerer <gerg@snapgear.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd5d241e