1. 15 June 2012 (1 commit)
  2. 29 March 2012 (1 commit)
  3. 09 December 2011 (1 commit)
    • ia64: Use HAVE_MEMBLOCK_NODE_MAP · 98e4ae8a
      Committed by Tejun Heo
      ia64 used early_node_map[] just to prime free_area_init_nodes().  Now
      memblock can be used for the same purpose and early_node_map[] is
      scheduled to be dropped.  Use memblock instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: linux-ia64@vger.kernel.org
      98e4ae8a
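      A minimal sketch of the pattern this commit adopts, assuming the
      memblock interfaces of that era (memblock_add_node() and
      free_area_init_nodes() are real kernel functions; the ranges and node
      IDs below are illustrative, not taken from the patch):

        /* Register each early memory range together with its node, then let
         * the generic code derive the node map -- this replaces priming
         * early_node_map[] by hand. */
        static void __init sketch_register_memory(void)
        {
                unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0 };

                /* illustrative ranges; ia64 really walks the EFI memory map */
                memblock_add_node(0x00000000UL, 0x04000000UL, 0); /* node 0 */
                memblock_add_node(0x10000000UL, 0x04000000UL, 1); /* node 1 */

                max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
                free_area_init_nodes(max_zone_pfns);
        }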
  4. 25 May 2011 (2 commits)
    • mm: now that all old mmu_gather code is gone, remove the storage · 1c395176
      Committed by Peter Zijlstra
      Fold all the mmu_gather rework patches into one for submission.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Reported-by: Hugh Dickins <hughd@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1c395176
    • arch, mm: filter disallowed nodes from arch specific show_mem functions · 7bf02ea2
      Committed by David Rientjes
      Architectures that implement their own show_mem() did not pass the
      filter argument to show_free_areas(), so they could not avoid emitting
      the state of nodes that are disallowed in the current context.  This
      patch passes the filter argument through so those nodes are skipped.
      
      This patch also removes the show_free_areas() wrapper around
      __show_free_areas() and converts existing callers to pass an empty filter.
      
      ia64 emits additional information for each node, so skip_free_areas_zone()
      must be made global to filter disallowed nodes; it is converted to take
      a nid argument rather than a zone for this use case (a sketch of the
      resulting helper follows this entry).
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Bottomley <jejb@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7bf02ea2
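      The helper referenced above comes down to something like this sketch
      (SHOW_MEM_FILTER_NODES and cpuset_current_mems_allowed are the real
      symbols; the body is paraphrased from the commit description, not
      copied from the patch):

        /* Return true if this node's state should be suppressed. */
        static bool skip_free_areas_node(unsigned int flags, int nid)
        {
                if (!(flags & SHOW_MEM_FILTER_NODES))
                        return false;
                /* hide nodes the current task's cpuset does not allow */
                return !node_isset(nid, cpuset_current_mems_allowed);
        }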
  5. 21 May 2011 (1 commit)
    • sanitize <linux/prefetch.h> usage · 268bb0ce
      Committed by Linus Torvalds
      Commit e66eed65 ("list: remove prefetching from regular list
      iterators") removed the include of prefetch.h from list.h, which
      uncovered several cases that had apparently relied on that rather
      obscure header file dependency.
      
      So this fixes things up a bit, using
      
         grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
         grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
      
      to guide us in finding files that either need <linux/prefetch.h>
      inclusion, or have it despite not needing it.
      
      There are more of them around (mostly network drivers), but this gets
      many core ones.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      268bb0ce
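      The failure mode is plain header hygiene; a file like this hypothetical
      one compiled only because list.h used to pull in prefetch.h for it:

        #include <linux/list.h>
        #include <linux/prefetch.h>     /* now required explicitly */

        static void warm_entry(struct list_head *p)
        {
                prefetchw(p);           /* prefetch the line for write */
        }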
  6. 25 March 2011 (1 commit)
  7. 14 January 2011 (1 commit)
  8. 01 July 2010 (1 commit)
    • [IA64] Fix spinaphore down_spin() · b70f4e85
      Committed by Tony Luck
      A typo in down_spin() meant it read only the low 32 bits of the
      "serve" value instead of the full 64 bits.  This makes the system hang
      once the values in ticket/serve grow past 32 bits; a big enough system
      running the right test can hit this in just a few hours.
      
      Broken since commit 883a3acf ("[IA64] Re-implement spinaphores using
      ticket lock concepts").
      
      Reported via IRC by Bjorn Helgaas
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      b70f4e85
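      A runnable userspace model of the bug, assuming nothing about the real
      code beyond what the message says (the kernel fix changed an ia64 ld4
      load to ld8; the 32-bit load is modeled here by a uint32_t cast):

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
                /* counters past 32 bits, as after enough lock traffic */
                uint64_t serve  = 0x100000000ULL;
                uint64_t ticket = 0x100000000ULL;  /* should be served now */

                /* buggy path: only the low 32 bits of serve are read */
                uint32_t low = (uint32_t)serve;    /* == 0 */

                printf("64-bit compare: %s\n",
                       ticket <= serve ? "proceed" : "spin");
                printf("32-bit compare: %s\n",
                       ticket <= low ? "proceed" : "spin forever");
                return 0;
        }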
  9. 31 May 2010 (1 commit)
  10. 28 May 2010 (1 commit)
    • [IA64] Fix build breakage · 4ec37de8
      Committed by Tony Luck
      In commit 0ac0c0d0 ("cpusets: randomize node rotor used in
      cpuset_mem_spread_node()"), Jack Steiner fixed a problem with too many
      small tasks being assigned to node 0.  Copy his code to ia64 to avoid
      the build error:
      
          arch/ia64/kernel/smpboot.c:641: error: ‘cpu_to_node_map’ undeclared (first use in this function)
      
      In commit 3bccd996 ("numa: ia64: use generic percpu var numa_node_id()
      implementation"), Lee Schermerhorn added some set_numa_node() calls,
      but these only build on CONFIG_NUMA=y configurations.  Surround the
      calls with #ifdef CONFIG_NUMA, as sketched after this entry.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      4ec37de8
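      The second half of the fix is the routine guard pattern (a sketch; the
      call site shown is illustrative):

        #ifdef CONFIG_NUMA
                /* set_numa_node() only exists on CONFIG_NUMA=y builds */
                set_numa_node(cpu_to_node(cpu));
        #endif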
  11. 19 May 2010 (1 commit)
  12. 30 March 2010 (1 commit)
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Committed by Tejun Heo
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the following
      script was used as the basis of the conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following:
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there, i.e. gfp.h if only gfp is
        used, slab.h if slab is used.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition, and for others adding it to an
         implementation .h or embedding .c file was more appropriate.  This
         step added inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them, as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored, as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each slab.h
         inclusion directive was examined and added manually as necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few more
         options had to be turned off depending on the arch to make things
         build (like ipr on powerpc/64, which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as a bisection point.
      
      Given that I had only a couple of failures from the tests in step 6,
      I'm fairly confident about the coverage of this conversion patch.  If
      there is a breakage, it's likely to be something in one of the arch
      headers, which should be easily discoverable on most builds of the
      specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
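      The rule after this series, in sketch form: each file names what it
      actually uses instead of leaning on what percpu.h happened to drag in
      (the helper below is illustrative):

        #include <linux/gfp.h>    /* GFP_KERNEL */
        #include <linux/slab.h>   /* kmalloc() */

        static int *alloc_counter(void)
        {
                return kmalloc(sizeof(int), GFP_KERNEL);
        }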
  13. 07 March 2010 (1 commit)
    • mm: change anon_vma linking to fix multi-process server scalability issue · 5beb4930
      Committed by Rik van Riel
      The old anon_vma code can lead to scalability issues with heavily forking
      workloads.  Specifically, each anon_vma will be shared between the parent
      process and all its child processes.
      
      In a workload with 1000 child processes and a VMA with 1000 anonymous
      pages per process that get COWed, this leads to a system with a million
      anonymous pages in the same anon_vma, each of which is mapped in just one
      of the 1000 processes.  However, the current rmap code needs to walk them
      all, leading to O(N) scanning complexity for each page.
      
      This can result in systems where one CPU is walking the page tables of
      1000 processes in page_referenced_one, while all other CPUs are stuck on
      the anon_vma lock.  This leads to catastrophic failure for a benchmark
      like AIM7, where the total number of processes can reach into the tens
      of thousands.  Real workloads are still a factor 10 less process-intensive
      than AIM7, but they are catching up.
      
      This patch changes the way anon_vmas and VMAs are linked, which allows us
      to associate multiple anon_vmas with a VMA.  At fork time, each child
      process gets its own anon_vmas, in which its COWed pages will be
      instantiated.  The parents' anon_vma is also linked to the VMA, because
      non-COWed pages could be present in any of the children.
      
      This reduces rmap scanning complexity to O(1) for the pages of the
      1000 child processes, with O(N) complexity for at most 1/N pages in
      the system.  This reduces the average scanning cost in heavily forking
      workloads from O(N) to 2.
      
      The only real complexity in this patch stems from the fact that linking a
      VMA to anon_vmas now involves memory allocations.  This means vma_adjust
      can fail, if it needs to attach a VMA to anon_vma structures.  This in
      turn means error handling needs to be added to the calling functions.
      
      A second source of complexity is that, because there can be multiple
      anon_vmas, the anon_vma linking in vma_adjust can no longer be done under
      "the" anon_vma lock.  To prevent the rmap code from walking up an
      incomplete VMA, this patch introduces the VM_LOCK_RMAP VMA flag.  This bit
      flag uses the same slot as the NOMMU VM_MAPPED_COPY, with an ifdef in mm.h
      to make sure it is impossible to compile a kernel that needs both symbolic
      values for the same bitflag.
      
      Some test results:
      
      Without the anon_vma changes, when AIM7 hits around 9.7k users (on a test
      box with 16GB RAM and not quite enough IO), the system ends up running
      >99% in system time, with every CPU on the same anon_vma lock in the
      pageout code.
      
      With these changes, AIM7 hits the cross-over point around 29.7k users.
      This happens with ~99% IO wait time; there never seems to be any spike
      in system time.  The anon_vma lock contention appears to be resolved.
      
      [akpm@linux-foundation.org: cleanups]
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5beb4930
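      The new linkage is many-to-many, held together by a small chain object
      per (VMA, anon_vma) pair; this sketch follows the structure the patch
      introduces (field names as in mainline of that era, comments
      paraphrased):

        /* One link in the VMA <-> anon_vma many-to-many relation. */
        struct anon_vma_chain {
                struct vm_area_struct *vma;     /* the VMA of this link */
                struct anon_vma *anon_vma;      /* one of its anon_vmas */
                struct list_head same_vma;      /* all links of one VMA */
                struct list_head same_anon_vma; /* all links of one anon_vma */
        };

      rmap then walks same_anon_vma for a page's anon_vma and only visits
      VMAs that can actually contain the page.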
  14. 09 February 2010 (1 commit)
    • [IA64] Remove COMPAT_IA32 support · 32974ad4
      Committed by Tony Luck
      This has been broken since May 2008, when Al Viro killed altroot
      support.  Since nobody has complained, it would appear that there are
      no users of this code (a plausible theory, since the main OSVs that
      support ia64 prefer to use the IA32-EL software emulation).
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      32974ad4
  15. 08 January 2010 (1 commit)
    • [IA64] __per_cpu_idtrs[] is a memory hog · 6c57a332
      Committed by Tony Luck
      __per_cpu_idtrs is statically allocated ... on CONFIG_NR_CPUS=4096
      systems it hogs 16MB of memory. This is way too much for a quite
      probably unused facility (only KVM uses dynamic TR registers).
      
      Change to an array of pointers, and allocate entries as needed on a
      per-cpu basis.  Change the name too, as the __per_cpu_ prefix is
      confusing (this isn't a classic <linux/percpu.h> type object).
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      6c57a332
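      The shape of the fix, sketched (ia64_idtrs is the renamed array as I
      recall it from mainline; the allocation site below is a paraphrased
      fragment, not the literal patch):

        /* before: ~16MB up front with CONFIG_NR_CPUS=4096
         * struct ia64_tr_entry __per_cpu_idtrs[NR_CPUS][2][IA64_TR_ALLOC_MAX];
         */

        /* after: one pointer per cpu, backing store on first use */
        struct ia64_tr_entry *ia64_idtrs[NR_CPUS];

        if (!ia64_idtrs[cpu]) {
                ia64_idtrs[cpu] = kmalloc(2 * IA64_TR_ALLOC_MAX *
                                          sizeof(struct ia64_tr_entry),
                                          GFP_KERNEL);
                if (!ia64_idtrs[cpu])
                        return -ENOMEM;
        }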
  16. 07 January 2010 (1 commit)
  17. 16 December 2009 (1 commit)
  18. 15 December 2009 (1 commit)
  19. 29 October 2009 (2 commits)
    • percpu: remove per_cpu__ prefix. · dd17c8f7
      Committed by Rusty Russell
      Now that the return from alloc_percpu is compatible with the address
      of per-cpu vars, it makes sense to hand around the address of per-cpu
      variables.  To make this sane, we remove the per_cpu__ prefix we
      created to stop people accidentally using these vars directly.
      
      Now that we have sparse, we can use that (next patch).
      
      tj: * Updated to convert stuff which were missed by or added after the
            original patch.
      
          * Kill per_cpu_var() macro.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      dd17c8f7
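      Before and after, sketched (a kernel-context fragment, not standalone
      code; DEFINE_PER_CPU and per_cpu() are the real macros):

        /* before: DEFINE_PER_CPU(int, counter) emitted a symbol named
         * per_cpu__counter, so a stray direct use would fail to link */
        DEFINE_PER_CPU(int, counter);

        /* after: the symbol is plain 'counter'; accidental direct access is
         * caught by sparse address-space annotations (next patch) instead */
        int v = per_cpu(counter, smp_processor_id());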
    • percpu: make percpu symbols in ia64 unique · 877105cc
      Committed by Tejun Heo
      This patch updates percpu related symbols in ia64 such that percpu
      symbols are unique and don't clash with local symbols.  This serves
      two purposes of decreasing the possibility of global percpu symbol
      collision and allowing dropping per_cpu__ prefix from percpu symbols.
      
      * arch/ia64/kernel/setup.c: s/cpu_info/ia64_cpu_info/
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: linux-ia64@vger.kernel.org
      877105cc
  20. 10 October 2009 (1 commit)
  21. 02 October 2009 (3 commits)
    • ia64: convert to dynamic percpu allocator · 52594762
      Committed by Tejun Heo
      Unlike other archs, ia64 reserves space for percpu areas during early
      memory initialization.  These areas occupy a contiguous region indexed
      by cpu number on contiguous memory model or are grouped by node on
      discontiguous memory model.
      
      As allocation and initialization are done by the arch code, all that
      setup_per_cpu_areas() needs to do is communicate the determined layout
      to the percpu allocator.  This patch implements setup_per_cpu_areas()
      for both contig and discontig memory models and drops
      HAVE_LEGACY_PER_CPU_AREA.
      
      Please note that for the contig model, the allocation itself is
      modified only to allocate for possible cpus instead of NR_CPUS.  As the
      dynamic percpu allocator can handle a non-direct mapping, there's no
      reason to allocate memory for cpus which aren't possible.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: linux-ia64 <linux-ia64@vger.kernel.org>
      52594762
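      The contig-model sizing change mentioned above, in sketch form (the
      variable name is illustrative; PERCPU_PAGE_SIZE is ia64's real percpu
      page constant):

        /* before: space = PERCPU_PAGE_SIZE * NR_CPUS; */
        space = PERCPU_PAGE_SIZE * num_possible_cpus();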
    • ia64: allocate percpu area for cpu0 like percpu areas for other cpus · 36886478
      Committed by Tejun Heo
      cpu0 used a special percpu area reserved by the linker, __cpu0_per_cpu,
      which is set up early in boot by head.S.  However, this doesn't
      guarantee that the area will be on the same node as cpu0, and the
      percpu area for cpu0 ends up very far away from the percpu areas for
      other cpus, which causes problems for the congruent percpu allocator.
      
      This patch makes percpu area initialization allocate a percpu area for
      cpu0 like any other cpu and copy it from __cpu0_per_cpu, which now
      resides in the __init area.  This means that for cpu0 the percpu area
      is first set up at __cpu0_per_cpu early by head.S and then moved to an
      area in the linear mapping during memory initialization; it's not
      allowed to take a pointer to percpu variables between head.S and
      memory initialization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: linux-ia64 <linux-ia64@vger.kernel.org>
      36886478
    • ia64: don't alias VMALLOC_END to vmalloc_end · 126b3fcd
      Committed by Tejun Heo
      If CONFIG_VIRTUAL_MEM_MAP is enabled, ia64 defines the macro
      VMALLOC_END as the unsigned long variable vmalloc_end, which is
      adjusted to make room for the vmemmap.  This becomes problematic if a
      local variable vmalloc_end is defined in some function (not very
      unlikely) and VMALLOC_END is used in that function - the function
      thinks it's referencing the global VMALLOC_END value but is actually
      referencing its own local vmalloc_end variable.
      
      There's no reason VMALLOC_END should be a macro.  Just define it as an
      unsigned long variable if CONFIG_VIRTUAL_MEM_MAP is set to avoid nasty
      surprises.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: linux-ia64 <linux-ia64@vger.kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      126b3fcd
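      A runnable miniature of the hazard (the two names match the commit;
      the values are made up):

        #include <stdio.h>

        unsigned long vmalloc_end = 0xdeadbeefUL;  /* the global */
        #define VMALLOC_END vmalloc_end            /* the old-style alias */

        static void some_function(void)
        {
                unsigned long vmalloc_end = 0;  /* innocent local... */
                /* ...silently captured by the macro: this prints 0, not the
                 * global value the author intended */
                printf("VMALLOC_END = %#lx\n", VMALLOC_END);
        }

        int main(void) { some_function(); return 0; }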
  22. 23 September 2009 (4 commits)
  23. 22 September 2009 (1 commit)
  24. 22 June 2009 (1 commit)
  25. 18 June 2009 (1 commit)
    • [IA64] Convert ia64 to use int-ll64.h · e088a4ad
      Committed by Matthew Wilcox
      It is generally agreed that it would be beneficial for u64 to be an
      unsigned long long on all architectures.  ia64 (in common with several
      other 64-bit architectures) currently uses unsigned long.  Migrating
      piecemeal is too painful; this giant patch fixes all compilation warnings
      and errors that come as a result of switching to use int-ll64.h.
      
      Note that userspace will still see __u64 defined as unsigned long.  This
      is important as it affects C++ name mangling.
      
      [Updated by Tony Luck to change efi.h:efi_freemem_callback_t to use
       u64 for start/end rather than unsigned long]
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      e088a4ad
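      The practical symptom, runnable in userspace (the kernel's u64 is
      modeled by a typedef here):

        #include <stdio.h>

        typedef unsigned long long u64;  /* the int-ll64.h convention */

        int main(void)
        {
                u64 v = 1ULL << 40;
                /* with u64 = unsigned long this wanted %lx on ia64 but %llx
                 * on 32-bit arches; with int-ll64.h one format fits all */
                printf("%llx\n", v);
                return 0;
        }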
  26. 16 June 2009 (1 commit)
  27. 12 June 2009 (1 commit)
    • module: trim exception table on init free. · ad6561df
      Committed by Rusty Russell
      It's theoretically possible that there are exception table entries
      which point into the (freed) init text of modules.  These could cause
      future problems if other modules get loaded into that memory and cause
      an exception as we'd see the wrong fixup.  The only case I know of is
      kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).
      
      Amerigo fixed this long-standing FIXME in the x86 version, but this
      patch is more general.
      
      This implements trim_init_extable(); most archs are simple since they
      use the standard lib/extable.c sort code.  Alpha and IA64 use relative
      addresses in their fixups, so their trimming is a slight variation.
      
      Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
      yet it defines its own sort_extable() which overrides the one in lib.
      It doesn't sort, so we have to mark deleted entries instead of
      actually trimming them.
      Inspired-by: Amerigo Wang <amwang@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: linux-alpha@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      ad6561df
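      The generic version is short; this sketch follows the lib/extable.c
      logic as I recall it (within_module_init() is a real predicate; verify
      the details against the tree before relying on them):

        /* Drop exception-table entries whose address sits in the module's
         * soon-to-be-freed init text.  The table is sorted, so such entries
         * cluster at the ends. */
        void trim_init_extable(struct module *m)
        {
                while (m->num_exentries &&
                       within_module_init(m->extable[0].insn, m)) {
                        m->extable++;
                        m->num_exentries--;
                }
                while (m->num_exentries &&
                       within_module_init(m->extable[m->num_exentries - 1].insn,
                                          m)) {
                        m->num_exentries--;
                }
        }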
  28. 02 April 2009 (1 commit)
  29. 27 March 2009 (2 commits)
  30. 16 March 2009 (1 commit)
  31. 19 February 2009 (2 commits)
    • mm: fix memmap init for handling memory hole · cc2559bc
      Committed by KAMEZAWA Hiroyuki
      Currently, early_pfn_in_nid(PFN, NID) may return false if PFN is a
      hole, and memmap initialization is then not done.  This was causing
      trouble for sparc boot.

      To fix this, the PFN should be initialized and marked as PG_reserved.
      This patch changes early_pfn_in_nid() to return true if PFN is a hole;
      a sketch of the resulting check follows this entry.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Reported-by: David Miller <davem@davemloft.net>
      Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cc2559bc
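      Paraphrased, the fix makes the nid check permissive for holes,
      assuming the backend returns a negative nid for a pfn with no
      early_node_map entry (a sketch, not the literal patch):

        static inline int early_pfn_in_nid(unsigned long pfn, int node)
        {
                int nid = __early_pfn_to_nid(pfn);

                if (nid >= 0 && nid != node)
                        return 0;   /* belongs to a different node */
                return 1;           /* matches, or is a hole: init it and
                                       let it be marked PG_reserved */
        }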
    • mm: clean up for early_pfn_to_nid() · f2dbcfa7
      Committed by KAMEZAWA Hiroyuki
      What's happening is that the assertion in mm/page_alloc.c:move_freepages()
      is triggering:
      
      	BUG_ON(page_zone(start_page) != page_zone(end_page));
      
      Once I knew this is what was happening, I added some annotations:
      
      	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
      		printk(KERN_ERR "move_freepages: Bogus zones: "
      		       "start_page[%p] end_page[%p] zone[%p]\n",
      		       start_page, end_page, zone);
      		printk(KERN_ERR "move_freepages: "
      		       "start_zone[%p] end_zone[%p]\n",
      		       page_zone(start_page), page_zone(end_page));
      		printk(KERN_ERR "move_freepages: "
      		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
      		       page_to_pfn(start_page), page_to_pfn(end_page));
      		printk(KERN_ERR "move_freepages: "
      		       "start_nid[%d] end_nid[%d]\n",
      		       page_to_nid(start_page), page_to_nid(end_page));
       ...
      
      And here's what I got:
      
      	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
      	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
      	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
      	move_freepages: start_nid[1] end_nid[0]
      
      My memory layout on this box is:
      
      [    0.000000] Zone PFN ranges:
      [    0.000000]   Normal   0x00000000 -> 0x0081ff5d
      [    0.000000] Movable zone start PFN for each node
      [    0.000000] early_node_map[8] active PFN ranges
      [    0.000000]     0: 0x00000000 -> 0x00020000
      [    0.000000]     1: 0x00800000 -> 0x0081f7ff
      [    0.000000]     1: 0x0081f800 -> 0x0081fe50
      [    0.000000]     1: 0x0081fed1 -> 0x0081fed8
      [    0.000000]     1: 0x0081feda -> 0x0081fedb
      [    0.000000]     1: 0x0081fedd -> 0x0081fee5
      [    0.000000]     1: 0x0081fee7 -> 0x0081ff51
      [    0.000000]     1: 0x0081ff59 -> 0x0081ff5d
      
      So it's a block move in that 0x81f600-->0x81f7ff region which triggers
      the problem.
      
      This patch:
      
      The declaration of early_pfn_to_nid() is scattered over per-arch
      include files, and it is complicated to know which declaration is
      actually used; that makes the fix for memmap init harder than it
      needs to be.

      This patch moves all declarations to include/linux/mm.h.
      
      After this:
        if !CONFIG_ARCH_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
           -> use the static definition in include/linux/mm.h
        else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
           -> use the generic definition in mm/page_alloc.c
        else
           -> the per-arch back end function will be called.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reported-by: David Miller <davem@davemloft.net>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f2dbcfa7
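      The resulting dispatch in include/linux/mm.h, sketched (config names
      as in mainline of that era; the exact guards are paraphrased):

        #if !defined(CONFIG_ARCH_POPULATES_NODE_MAP) && \
            !defined(CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID)
        static inline int early_pfn_to_nid(unsigned long pfn)
        {
                return 0;       /* single-node fallback */
        }
        #else
        /* generic definition lives in mm/page_alloc.c; with
         * CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID it calls a per-arch backend */
        extern int early_pfn_to_nid(unsigned long pfn);
        #endif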