1. 07 5月, 2007 1 次提交
  2. 08 12月, 2006 2 次提交
    • A
      [PATCH] numa node ids are int, page_to_nid and zone_to_nid should return int · 25ba77c1
      Andy Whitcroft 提交于
      NUMA node ids are passed as either int or unsigned int almost exclusivly
      page_to_nid and zone_to_nid both return unsigned long.  This is a throw
      back to when page_to_nid was a #define and was thus exposing the real type
      of the page flags field.
      
      In addition to fixing up the definitions of page_to_nid and zone_to_nid I
      audited the users of these functions identifying the following incorrect
      uses:
      
      1) mm/page_alloc.c show_node() -- printk dumping the node id,
      2) include/asm-ia64/pgalloc.h pgtable_quicklist_free() -- comparison
         against numa_node_id() which returns an int from cpu_to_node(), and
      3) mm/mpolicy.c check_pte_range -- used as an index in node_isset which
         uses bit_set which in generic code takes an int.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Cc: Christoph Lameter <clameter@engr.sgi.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      25ba77c1
    • C
      [PATCH] Get rid of zone_table[] · 89689ae7
      Christoph Lameter 提交于
      The zone table is mostly not needed.  If we have a node in the page flags
      then we can get to the zone via NODE_DATA() which is much more likely to be
      already in the cpu cache.
      
      In case of SMP and UP NODE_DATA() is a constant pointer which allows us to
      access an exact replica of zonetable in the node_zones field.  In all of
      the above cases there will be no need at all for the zone table.
      
      The only remaining case is if in a NUMA system the node numbers do not fit
      into the page flags.  In that case we make sparse generate a table that
      maps sections to nodes and use that table to to figure out the node number.
       This table is sized to fit in a single cache line for the known 32 bit
      NUMA platform which makes it very likely that the information can be
      obtained without a cache miss.
      
      For sparsemem the zone table seems to be have been fairly large based on
      the maximum possible number of sections and the number of zones per node.
      There is some memory saving by removing zone_table.  The main benefit is to
      reduce the cache foootprint of the VM from the frequent lookups of zones.
      Plus it simplifies the page allocator.
      
      [akpm@osdl.org: build fix]
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      89689ae7
  3. 29 10月, 2006 1 次提交
    • Y
      [PATCH] memory hotplug: __GFP_NOWARN is better for __kmalloc_section_memmap() · f2d0aa5b
      Yasunori Goto 提交于
      Add __GFP_NOWARN flag to calling of __alloc_pages() in
      __kmalloc_section_memmap().  It can reduce noisy failure message.
      
      In ia64, section size is 1 GB, this means that order 8 pages are necessary
      for each section's memmap.  It is often very hard requirement under heavy
      memory pressure as you know.  So, __alloc_pages() gives up allocation and
      shows many noisy stack traces which means no page for each sections.
      (Current my environment shows 32 times of stack trace....)
      
      But, __kmalloc_section_memmap() calls vmalloc() after failure of it, and it
      can succeed allocation of memmap.  So, its stack trace warning becomes just
      noisy.  I suppose it shouldn't be shown.
      Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      f2d0aa5b
  4. 01 7月, 2006 1 次提交
  5. 28 6月, 2006 1 次提交
  6. 23 6月, 2006 1 次提交
  7. 22 5月, 2006 1 次提交
    • M
      [PATCH] SPARSEMEM incorrectly calculates section number · 12783b00
      Mike Kravetz 提交于
      A bad calculation/loop in __section_nr() could result in incorrect section
      information being put into sysfs memory entries.  This primarily impacts
      memory add operations as the sysfs information is used while onlining new
      memory.
      
      Fix suggested by Dave Hansen.
      
      Note that the bug may not be obvious from the patch.  It actually occurs in
      the function's return statement:
      
      	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
      
      In the existing code, root_nr has already been multiplied by
      SECTIONS_PER_ROOT.
      Signed-off-by: NMike Kravetz <kravetz@us.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      12783b00
  8. 16 5月, 2006 1 次提交
  9. 02 5月, 2006 1 次提交
  10. 09 1月, 2006 1 次提交
  11. 30 10月, 2005 2 次提交
  12. 05 9月, 2005 3 次提交
    • D
      [PATCH] sparsemem extreme: hotplug preparation · 28ae55c9
      Dave Hansen 提交于
      This splits up sparse_index_alloc() into two pieces.  This is needed
      because we'll allocate the memory for the second level in a different place
      from where we actually consume it to keep the allocation from happening
      underneath a lock
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      28ae55c9
    • B
      [PATCH] sparsemem extreme implementation · 3e347261
      Bob Picco 提交于
      With cleanups from Dave Hansen <haveblue@us.ibm.com>
      
      SPARSEMEM_EXTREME makes mem_section a one dimensional array of pointers to
      mem_sections.  This two level layout scheme is able to achieve smaller
      memory requirements for SPARSEMEM with the tradeoff of an additional shift
      and load when fetching the memory section.  The current SPARSEMEM
      implementation is a one dimensional array of mem_sections which is the
      default SPARSEMEM configuration.  The patch attempts isolates the
      implementation details of the physical layout of the sparsemem section
      array.
      
      SPARSEMEM_EXTREME requires bootmem to be functioning at the time of
      memory_present() calls.  This is not always feasible, so architectures
      which do not need it may allocate everything statically by using
      SPARSEMEM_STATIC.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3e347261
    • B
      [PATCH] SPARSEMEM EXTREME · 802f192e
      Bob Picco 提交于
      A new option for SPARSEMEM is ARCH_SPARSEMEM_EXTREME.  Architecture
      platforms with a very sparse physical address space would likely want to
      select this option.  For those architecture platforms that don't select the
      option, the code generated is equivalent to SPARSEMEM currently in -mm.
      I'll be posting a patch on ia64 ml which uses this new SPARSEMEM feature.
      
      ARCH_SPARSEMEM_EXTREME makes mem_section a one dimensional array of
      pointers to mem_sections.  This two level layout scheme is able to achieve
      smaller memory requirements for SPARSEMEM with the tradeoff of an
      additional shift and load when fetching the memory section.  The current
      SPARSEMEM -mm implementation is a one dimensional array of mem_sections
      which is the default SPARSEMEM configuration.  The patch attempts isolates
      the implementation details of the physical layout of the sparsemem section
      array.
      
      ARCH_SPARSEMEM_EXTREME depends on 64BIT and is by default boolean false.
      
      I've boot tested under aim load ia64 configured for ARCH_SPARSEMEM_EXTREME.
       I've also boot tested a 4 way Opteron machine with !ARCH_SPARSEMEM_EXTREME
      and tested with aim.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      802f192e
  13. 24 6月, 2005 2 次提交
    • A
      [PATCH] sparsemem hotplug base · 29751f69
      Andy Whitcroft 提交于
      Make sparse's initalization be accessible at runtime.  This allows sparse
      mappings to be created after boot in a hotplug situation.
      
      This patch is separated from the previous one just to give an indication how
      much of the sparse infrastructure is *just* for hotplug memory.
      
      The section_mem_map doesn't really store a pointer.  It stores something that
      is convenient to do some math against to get a pointer.  It isn't valid to
      just do *section_mem_map, so I don't think it should be stored as a pointer.
      
      There are a couple of things I'd like to store about a section.  First of all,
      the fact that it is !NULL does not mean that it is present.  There could be
      such a combination where section_mem_map *is* NULL, but the math gets you
      properly to a real mem_map.  So, I don't think that check is safe.
      
      Since we're storing 32-bit-aligned structures, we have a few bits in the
      bottom of the pointer to play with.  Use one bit to encode whether there's
      really a mem_map there, and the other one to tell whether there's a valid
      section there.  We need to distinguish between the two because sometimes
      there's a gap between when a section is discovered to be present and when we
      can get the mem_map for it.
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      29751f69
    • A
      [PATCH] sparsemem memory model · d41dee36
      Andy Whitcroft 提交于
      Sparsemem abstracts the use of discontiguous mem_maps[].  This kind of
      mem_map[] is needed by discontiguous memory machines (like in the old
      CONFIG_DISCONTIGMEM case) as well as memory hotplug systems.  Sparsemem
      replaces DISCONTIGMEM when enabled, and it is hoped that it can eventually
      become a complete replacement.
      
      A significant advantage over DISCONTIGMEM is that it's completely separated
      from CONFIG_NUMA.  When producing this patch, it became apparent in that NUMA
      and DISCONTIG are often confused.
      
      Another advantage is that sparse doesn't require each NUMA node's ranges to be
      contiguous.  It can handle overlapping ranges between nodes with no problems,
      where DISCONTIGMEM currently throws away that memory.
      
      Sparsemem uses an array to provide different pfn_to_page() translations for
      each SECTION_SIZE area of physical memory.  This is what allows the mem_map[]
      to be chopped up.
      
      In order to do quick pfn_to_page() operations, the section number of the page
      is encoded in page->flags.  Part of the sparsemem infrastructure enables
      sharing of these bits more dynamically (at compile-time) between the
      page_zone() and sparsemem operations.  However, on 32-bit architectures, the
      number of bits is quite limited, and may require growing the size of the
      page->flags type in certain conditions.  Several things might force this to
      occur: a decrease in the SECTION_SIZE (if you want to hotplug smaller areas of
      memory), an increase in the physical address space, or an increase in the
      number of used page->flags.
      
      One thing to note is that, once sparsemem is present, the NUMA node
      information no longer needs to be stored in the page->flags.  It might provide
      speed increases on certain platforms and will be stored there if there is
      room.  But, if out of room, an alternate (theoretically slower) mechanism is
      used.
      
      This patch introduces CONFIG_FLATMEM.  It is used in almost all cases where
      there used to be an #ifndef DISCONTIG, because SPARSEMEM and DISCONTIGMEM
      often have to compile out the same areas of code.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NMartin Bligh <mbligh@aracnet.com>
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d41dee36