1. 01 Sep, 2007 1 commit
  2. 12 May, 2007 2 commits
  3. 09 May, 2007 1 commit
  4. 21 Mar, 2007 1 commit
    • Z
      [IA64] min_low_pfn and max_low_pfn calculation fix · a3f5c338
      Committed by Zou Nan hai
      We have seen bad_pte_print when testing crashdump on an SN machine with a
      recent 2.6.20 kernel.  There are tons of bad pte print (pfn < max_low_pfn)
      reports when the crash kernel boots up, and all of the reported bad pages
      are inside the initmem range. This happens when the crash kernel code and
      data are located at the beginning of the 1st node: build_node_maps in
      discontig.c bypasses reserved regions with filter_rsvd_memory, and since
      min_low_pfn is calculated in build_node_maps, min_low_pfn ends up
      greater than the page numbers of the kernel code and data.
      
      Because pages inside initmem are freed and reused later, we saw
      pfn_valid check fail on those pages.
      
      I think this could theoretically happen on a normal kernel as well. When I
      checked the min_low_pfn and max_low_pfn calculation in contig.c and
      discontig.c, I found more issues than this one.
      
      1. The min_low_pfn and max_low_pfn calculation is inconsistent between
      contig.c and discontig.c.
      In contig.c, min_low_pfn is calculated as the first page number of the
      boot memmap (Why? Though this may work most of the time, I don't
      think it is the right logic). In discontig.c, it is calculated as the
      lowest physical memory page number, bypassing reserved regions.
      max_low_pfn is calculated including reserved regions in contig.c, but
      excluding reserved regions in discontig.c.
      
      2. If the kernel code and data region happens to be at the beginning or
      the end of physical memory, and the min_low_pfn and max_low_pfn
      calculation bypasses kernel code and data, pages in initmem will be
      reported as bad.
      
      3. The initrd is also in a reserved region. If it is at the beginning or
      at the end of physical memory, the kernel will refuse to reuse the memory
      because of the virt_addr_valid check in free_initrd_mem.
      
      So it is better to fix and clean up those issues.
      Calculate min_low_pfn and max_low_pfn in a consistent way.
      Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
      Acked-by: Jay Lan <jlan@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
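      The inconsistency described in point 1 can be illustrated with a small,
      self-contained sketch. It is not the kernel code: `struct mem_range`,
      `struct pfn_limits` and `pfn_limits_of` are hypothetical names, and the
      `skip_reserved` flag only models the effect that filtering reserved
      regions has on the resulting limits.

```c
#include <stddef.h>

/* Hypothetical model of one physical memory range, in page frame numbers. */
struct mem_range {
    unsigned long start_pfn;  /* first pfn in the range */
    unsigned long end_pfn;    /* one past the last pfn in the range */
    int reserved;             /* nonzero for kernel code/data, initrd, ... */
};

struct pfn_limits {
    unsigned long min_low_pfn;
    unsigned long max_low_pfn;
};

/*
 * Compute the lowest and highest pfn over all ranges. When skip_reserved
 * is nonzero, reserved ranges are ignored, which models the filtering
 * that the commit message identifies as the source of the problem.
 */
struct pfn_limits pfn_limits_of(const struct mem_range *r, size_t n,
                                int skip_reserved)
{
    struct pfn_limits lim = { ~0UL, 0 };

    for (size_t i = 0; i < n; i++) {
        if (skip_reserved && r[i].reserved)
            continue;
        if (r[i].start_pfn < lim.min_low_pfn)
            lim.min_low_pfn = r[i].start_pfn;
        if (r[i].end_pfn > lim.max_low_pfn)
            lim.max_low_pfn = r[i].end_pfn;
    }
    return lim;
}
```

      With a reserved range at pfn 0x100 to 0x180 followed by ordinary memory
      up to 0x1000, filtering gives a min_low_pfn of 0x180, so the pages of
      the reserved range fall below the limit. Without filtering the minimum
      is 0x100 and those pages stay inside the valid window.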
  5. 07 Mar, 2007 1 commit
  6. 12 Feb, 2007 1 commit
  7. 06 Feb, 2007 3 commits
  8. 12 Oct, 2006 1 commit
    • M
      [PATCH] mm: use symbolic names instead of indices for zone initialisation · 6391af17
      Committed by Mel Gorman
      Arch-independent zone-sizing is using indices instead of symbolic names to
      offset within an array related to zones (max_zone_pfns).  The unintended
      impact is that ZONE_DMA and ZONE_NORMAL are initialised on powerpc instead
      of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set.  As a result, the
      machine fails to boot but will boot with CONFIG_HIGHMEM turned off.
      
      The following patch properly initialises the max_zone_pfns[] array and uses
      symbolic names instead of indices in each architecture using
      arch-independent zone-sizing.  Two users have successfully booted their
      powerpcs with it (one an ibook G4).  It has also been boot tested on x86,
      x86_64, ppc64 and ia64.  Please merge for 2.6.19-rc2.
      
      Credit to Benjamin Herrenschmidt for identifying the bug and rolling the
      first fix.  Additional credit to Johannes Berg and Andreas Schwab for
      reporting the problem and testing on powerpc.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
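      A minimal sketch of the idea, assuming a three-zone layout: the enum,
      the function name `init_zone_limits` and its parameters are illustrative
      stand-ins, not the code from the patch. Indexing by name keeps the
      assignment correct even when a zone such as ZONE_NORMAL sits between the
      two zones being filled in.

```c
#include <string.h>

/* Hypothetical zone layout; the real set of zones depends on the config. */
enum zone_type {
    ZONE_DMA,
    ZONE_NORMAL,
    ZONE_HIGHMEM,   /* assumed present, as with CONFIG_HIGHMEM set */
    MAX_NR_ZONES
};

/*
 * Fill max_zone_pfns[] using symbolic names. Writing indices 0 and 1
 * instead would set ZONE_DMA and ZONE_NORMAL and leave ZONE_HIGHMEM
 * empty, which is the failure the commit message describes.
 */
void init_zone_limits(unsigned long max_zone_pfns[MAX_NR_ZONES],
                      unsigned long lowmem_end_pfn,
                      unsigned long highmem_end_pfn)
{
    /* Start from a zeroed array so zones that are not set stay empty. */
    memset(max_zone_pfns, 0, MAX_NR_ZONES * sizeof(max_zone_pfns[0]));
    max_zone_pfns[ZONE_DMA] = lowmem_end_pfn;
    max_zone_pfns[ZONE_HIGHMEM] = highmem_end_pfn;
}
```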
  9. 27 Sep, 2006 3 commits
  10. 04 Aug, 2006 2 commits
    • B
      [IA64] fix show_mem for VIRTUAL_MEM_MAP+FLATMEM · e44e41d0
      Committed by Bob Picco
      contig.c (FLATMEM) requires the same optimization as in discontig.c for show_mem
      when VIRTUAL_MEM_MAP is in use. Otherwise FLATMEM has softlockup timeouts.
      This was boot tested for memory configuration: SPARSEMEM,
      DISCONTIG+VIRTUAL_MEM_MAP, FLATMEM, FLATMEM+VIRTUAL_MEM_MAP and
      FLATMEM+VIRTUAL_MEM_MAP with largest memory gap less than LARGE_GAP by
      using boot parameter "mem=".
      
      This was boot tested and "echo m >/proc/sysrq-trigger" output evaluated
      for: FLATMEM, FLATMEM+VIRTUAL_MEM_MAP, DISCONTIGMEM+VIRTUAL_MEM_MAP and
      SPARSEMEM.
      Signed-off-by: Bob Picco <bob.picco@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • B
      [IA64] align high endpoint of VIRTUAL_MEM_MAP · 921eea1c
      Committed by Bob Picco
      Ensure that vmem_map's high endpoint is MAX_ORDER aligned. Not doing so violates
      the buddy allocator algorithm. Also, anyone using mem=XXX on the boot line with
      a value not aligned to MAX_ORDER requires this patch in order to satisfy the buddy
      allocator. vmem_map always starts at pfn 0. The potentially large MAX_ORDER
      on ia64 (due to hugetlbfs) requires that the end of vmem_map be aligned
      to MAX_ORDER_NR_PAGES.
      
      This was boot tested for: FLATMEM, FLATMEM+VIRTUAL_MEM_MAP,
      DISCONTIGMEM+VIRTUAL_MEM_MAP and SPARSEMEM.
      Signed-off-by: Bob Picco <bob.picco@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
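      The alignment step can be sketched as a round-up to a power-of-two
      boundary. The MAX_ORDER value below is an assumption chosen for
      illustration (the real value is configuration dependent), and
      `pfn_align_up` and `vmem_map_end` are hypothetical helper names.

```c
/* Assumed values for illustration only. */
#define MAX_ORDER 11
#define MAX_ORDER_NR_PAGES (1UL << (MAX_ORDER - 1))

/* Round pfn up to the next multiple of align; align must be a power of two. */
unsigned long pfn_align_up(unsigned long pfn, unsigned long align)
{
    return (pfn + align - 1) & ~(align - 1);
}

/* End of the virtual mem_map, rounded up to a MAX_ORDER_NR_PAGES boundary. */
unsigned long vmem_map_end(unsigned long max_pfn)
{
    return pfn_align_up(max_pfn, MAX_ORDER_NR_PAGES);
}
```

      Because vmem_map starts at pfn 0, rounding only the high endpoint is
      enough to make the whole range a multiple of MAX_ORDER_NR_PAGES.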
  11. 05 Jul, 2006 1 commit
    • Y
      [PATCH] Fix copying of pgdat array on each node for ia64 memory hotplug · dd8041f1
      Committed by Yasunori Goto
      I found a bug in memory hot-add code for ia64.
      
      IA64's code keeps a copy of the pgdat array on each node to reduce
      cross-node memory accesses.  This array is used by the NODE_DATA() macro.
      When a new node is hot-added, the pgdat array should be updated and copied
      to the new node too.
      
      However, I used for_each_online_node() in scatter_node_data() to copy
      it, which meant the array was not copied to the new node:
      initialization of the structures for the new node was only halfway done,
      so online_node_map couldn't be set at this time.
      
      To copy the arrays to the new node, I changed it to check the value of
      pgdat_list[], which is the source array of the copies.  I tested this patch
      with my Memory Hotadd emulation on Tiger4.  This patch is for 2.6.17-git20.
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
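      The change in iteration can be modelled in user space. This is only a
      sketch of the approach described above: `struct pgdat`,
      `struct node_local`, `MAX_NUMNODES` and `scatter_node_data_sketch` are
      hypothetical stand-ins. The loop selects destination nodes by checking
      the source array for a non-NULL entry, so a node that has a pgdat but is
      not yet marked online still receives its copy.

```c
#include <stddef.h>

#define MAX_NUMNODES 4            /* hypothetical node limit */

struct pgdat { int node_id; };    /* stand-in for pg_data_t */

/* Per-node local copy of the pgdat pointer array (hypothetical layout). */
struct node_local {
    struct pgdat *pg_data_ptr[MAX_NUMNODES];
};

/*
 * Copy the master pgdat_list[] into the local array of every node that
 * has an entry in pgdat_list[], regardless of any online state.
 * Returns the number of nodes whose local copy was refreshed.
 */
int scatter_node_data_sketch(struct pgdat *pgdat_list[MAX_NUMNODES],
                             struct node_local *local[MAX_NUMNODES])
{
    int updated = 0;

    for (int dst = 0; dst < MAX_NUMNODES; dst++) {
        if (pgdat_list[dst] == NULL || local[dst] == NULL)
            continue;             /* no pgdat for this node yet */
        for (int i = 0; i < MAX_NUMNODES; i++)
            local[dst]->pg_data_ptr[i] = pgdat_list[i];
        updated++;
    }
    return updated;
}
```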
  12. 28 Jun, 2006 3 commits
  13. 14 Apr, 2006 1 commit
    • R
      [IA64] Make show_mem() skip holes in a pgdat · ace1d816
      Committed by Robin Holt
      This patch modifies ia64's show_mem() to walk the vmem_map page tables and
      rapidly skip forward across regions where the page tables are missing.
      This prevents the pfn_valid() check from causing numerous unnecessary
      page faults.
      
      Without this patch on a 512 node 512 cpu system where every node has four
      memory holes, the show_mem() call takes 1 hour 18 minutes.  With this
      patch, it takes less than 3 seconds.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
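      The skipping strategy can be sketched with a simplified two-level
      table. This is a user-space model, not the ia64 page table walk: a NULL
      directory entry stands for a missing page table, and `PFNS_PER_BLOCK`
      and `count_valid_pfns` are hypothetical names. The point is that a
      missing entry lets the loop jump a whole block instead of testing every
      pfn inside the hole.

```c
#include <stddef.h>

#define PFNS_PER_BLOCK 512UL   /* hypothetical pfns covered per table */

/*
 * dir[i] == NULL means block i is entirely a hole. Otherwise dir[i][j]
 * is nonzero when pfn i * PFNS_PER_BLOCK + j is valid.
 * *checks receives the number of per-pfn tests actually performed.
 */
unsigned long count_valid_pfns(const unsigned char *const *dir,
                               size_t nr_blocks, unsigned long *checks)
{
    unsigned long valid = 0, pfn = 0;
    unsigned long end = nr_blocks * PFNS_PER_BLOCK;

    *checks = 0;
    while (pfn < end) {
        const unsigned char *blk = dir[pfn / PFNS_PER_BLOCK];

        if (blk == NULL) {
            /* Missing table: skip forward to the start of the next block. */
            pfn = (pfn / PFNS_PER_BLOCK + 1) * PFNS_PER_BLOCK;
            continue;
        }
        (*checks)++;
        if (blk[pfn % PFNS_PER_BLOCK])
            valid++;
        pfn++;
    }
    return valid;
}
```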
  14. 28 Mar, 2006 2 commits
  15. 23 Mar, 2006 1 commit
  16. 06 Jan, 2006 1 commit
    • A
      [IA64] support for cpu0 removal · ff741906
      Committed by Ashok Raj
      Here is the BSP removal support for IA64. It's pretty much the same thing that
      was released a while back, but has your feedback incorporated.
      
      - Removed CONFIG_BSP_REMOVE_WORKAROUND and associated cmdline param
      - Fixed compile issue with sn2/zx1 due to an undefined fix_b0_for_bsp
      - Some formatting nits (whitespace etc)
      
      This has been tested on tiger and, a long time back, by alex on hp systems as well.
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  17. 07 Dec, 2005 1 commit
  18. 09 Nov, 2005 1 commit
    • B
      [IA64] fix memory less node allocation · 97835245
      Committed by Bob Picco
      The original memory less node allocation attempted to use NODEDATA_ALIGN for
      alignment.  The bootmem allocator only allows power-of-two alignments, so this
      caused a BUG_ON for some nodes. For cpu only nodes, just allocate with a
      PERCPU_PAGE_SIZE alignment.
      
      Some older firmware reports SLIT distances of 0xff and results in bestnode
      not being computed. This is now treated correctly.
      
      The failed allocation check was removed because it's redundant.  The
      bootmem allocator already makes this check.
      
      This fix has been boot tested on a 4 node machine which has 4 cpu only nodes
      and 1 memory node.  Thanks to Pete Keilty for reporting this and helping me
      test it.
      Signed-off-by: Bob Picco <bob.picco@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
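      The power-of-two constraint mentioned above can be checked with a
      single bit test. The sketch below is illustrative only:
      `is_power_of_two` is a hypothetical helper, and the value given to
      `PERCPU_PAGE_SIZE_SKETCH` is an assumption, not taken from the patch.

```c
#include <stdbool.h>

/* Assumed value for illustration only. */
#define PERCPU_PAGE_SIZE_SKETCH (64UL * 1024)

/*
 * A nonzero value is a power of two exactly when it has one bit set,
 * in which case clearing the lowest set bit leaves zero.
 */
bool is_power_of_two(unsigned long x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```

      An alignment such as three times the per-cpu page size fails this test,
      which is the kind of value an allocator restricted to power-of-two
      alignments has to reject.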
  19. 30 Oct, 2005 1 commit
    • D
      [PATCH] memory hotplug locking: node_size_lock · 208d54e5
      Committed by Dave Hansen
      pgdat->node_size_lock is basically only needed in one place in the normal
      code: show_mem(), which is the arch-specific sysrq-m printing function.
      
      Strictly speaking, the architectures not doing memory hotplug do not need this
      locking in show_mem().  However, they are all included for completeness.  This
      should also make any future consolidation of all of the implementations a
      little more straightforward.
      
      This lock is also held in the sparsemem code during a memory removal, as
      sections are invalidated.  This is the place where pfn_valid() is made false
      for a memory area that's being removed.  The lock is only required when doing
      pfn_valid() operations on memory for which the user does not already hold a
      reference on the page, such as in show_mem().
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  20. 05 Oct, 2005 1 commit
  21. 07 Jul, 2005 2 commits
    • T
      [IA64] fix generic/up builds · 8d7e3517
      Committed by Tony Luck
      Jesse Barnes provided the original version of this patch months ago, but
      other changes kept conflicting with it, so it got deferred.  Greg Edwards
      dug it out of obscurity just over a week ago, and almost immediately
      another conflicting patch appeared (Bob Picco's memory-less nodes).
      
      I've resolved the conflicts and got it running again.  CONFIG_SGI_TIOCX
      is set to "y" in defconfig, which causes a Tiger to not boot (oops in
      tiocx_init).  But that can be resolved later ... get this in now before it
      gets stale again.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • B
      [IA64] memory-less-nodes repost · 564601a5
      Committed by bob.picco
      I reworked how nodes with only CPUs are treated.  The patch below seems
      simpler to me and has eliminated the complicated routine
      reassign_cpu_only_nodes.  There is no longer a requirement
      to modify ACPI NUMA information, which was in large part the
      complexity introduced in reassign_cpu_only_nodes.
      
      This patch will produce a different number of nodes. For example,
      reassign_cpu_only_nodes would reduce a configuration of two CPU-only nodes
      and one memory node to a single memory+CPUs node.  This patch
      doesn't change the number of nodes, which means the user will see three: two
      nodes without memory and one node with all the memory.
      
      While doing this patch, I noticed that early_nr_phys_cpus_node isn't serving
      any useful purpose.  It is called once in find_pernode_space but the value
      isn't used to compute pernode space.
      Signed-off-by: bob.picco <bob.picco@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  22. 24 Jun, 2005 1 commit
    • D
      [PATCH] remove non-DISCONTIG use of pgdat->node_mem_map · 408fde81
      Committed by Dave Hansen
      This patch effectively eliminates direct use of pgdat->node_mem_map outside
      of the DISCONTIG code.  On a flat memory system, these fields aren't
      currently used, neither are they on a sparsemem system.
      
      There was also a node_mem_map(nid) macro on many architectures.  Its use
      along with the use of ->node_mem_map itself was not consistent.  It has
      been removed in favor of two new, more explicit, arch-independent macros:
      
      	pgdat_page_nr(pgdat, pagenr)
      	nid_page_nr(nid, pagenr)
      
      I called them "pgdat" and "nid" because we overload the term "node" to mean
      "NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways.  I
      believe the newer names are much clearer.
      
      These macros can be overridden in the sparsemem case with a theoretically
      slower operation using node_start_pfn and pfn_to_page(), instead.  We could
      make this the only behavior if people want, but I don't want to change too
      much at once.  One thing at a time.
      
      This patch removes more code than it adds.
      
      Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386
      generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64.  Full list
      here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/
      
      Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Martin J. Bligh <mbligh@aracnet.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
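      For readers unfamiliar with the two macros named above, the following
      user-space sketch shows one plausible shape for them. The message only
      gives their names and arguments, so the macro bodies, the trimmed-down
      `struct page` and `struct pglist_data`, and the `node_data[]` array
      behind NODE_DATA() are assumptions made for illustration.

```c
/* Minimal stand-ins for the kernel structures (assumed layout). */
struct page { unsigned long flags; };
struct pglist_data { struct page *node_mem_map; };

#define MAX_NUMNODES 2                       /* hypothetical node limit */
struct pglist_data *node_data[MAX_NUMNODES]; /* hypothetical lookup table */
#define NODE_DATA(nid) (node_data[(nid)])

/* Page number pagenr, counted from the start of this pgdat's mem_map. */
#define pgdat_page_nr(pgdat, pagenr) ((pgdat)->node_mem_map + (pagenr))

/* Same lookup, starting from a node id instead of a pgdat pointer. */
#define nid_page_nr(nid, pagenr) pgdat_page_nr(NODE_DATA(nid), (pagenr))
```

      Both macros take an explicit first argument, which makes it visible at
      the call site whether the caller holds a pgdat pointer or a node id.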
  23. 26 Apr, 2005 1 commit
    • R
      [IA64] Percpu quicklist for combined allocator for pgd/pmd/pte. · fde740e4
      Committed by Robin Holt
      This patch introduces using the quicklists for pgd, pmd, and pte levels
      by combining the alloc and free functions into a common set of routines.
      This greatly simplifies the reading of this header file.
      
      This patch is simple but necessary for large numa configurations.
      It simply ensures that only pages from the local node are added to a
      cpu's quicklist.  This prevents the trapping of pages on a remote node's
      quicklist by starting a process, touching a large number of pages to
      fill pmd and pte entries, migrating to another node, and then unmapping
      or exiting.  Under those conditions the pages get trapped, and if the
      machine has more than 100 nodes of the same size, the calculated
      pgtable high water mark will be larger than any single node, so page
      table cache flushing will never occur.
      
      I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without
      this patch and did not notice any change.
      
      On an sn2 machine, there was a slight improvement which is possibly
      due to pages from other nodes trapped on the test node before starting
      the run.  I did not investigate further.
      
      This patch shrinks the quicklist based upon free memory on the node
      instead of the high/low water marks.  I have written it to enable
      preemption periodically and recalculate the amount to shrink every time
      we have freed enough pages that the quicklist size should have grown.
      I rescan the node's zones each pass because other processes may be
      draining node memory at the same time as we are adding.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  24. 17 Apr, 2005 1 commit
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!