1. 01 Jun, 2007 1 commit
  2. 09 May, 2007 1 commit
  3. 07 May, 2007 1 commit
    • Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Linus Torvalds committed
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function, that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
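The unified translation described in this message can be sketched in user space roughly as follows. This is only an illustration: the two layout constants are modelled on the x86-64 layout of that era, and `model_phys_base`, `model_phys_addr()`, `model_pa()` and `model_pa_symbol()` are hypothetical names, not the kernel's code. The `assert()` stands in for the extra argument sanity-checking the message says an out-of-line function makes possible.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants, modelled on 2007-era x86-64. */
#define MODEL_START_KERNEL_MAP 0xffffffff80000000ULL /* kernel image mapping */
#define MODEL_PAGE_OFFSET      0xffff810000000000ULL /* direct mapping of RAM */

/* Physical load offset of a relocated kernel; 0 when not relocated. */
static uint64_t model_phys_base = 0;

/*
 * A single out-of-line helper serves both macros: addresses inside the
 * kernel image range are translated relative to the image mapping,
 * everything else relative to the direct mapping.
 */
uint64_t model_phys_addr(uint64_t vaddr)
{
    /* Sanity check: anything below the direct mapping is not a kernel address. */
    assert(vaddr >= MODEL_PAGE_OFFSET);

    if (vaddr >= MODEL_START_KERNEL_MAP)
        return vaddr - MODEL_START_KERNEL_MAP + model_phys_base;
    return vaddr - MODEL_PAGE_OFFSET;
}

/* After the revert both macros are the exact same thing. */
#define model_pa(x)        model_phys_addr((uint64_t)(x))
#define model_pa_symbol(x) model_phys_addr((uint64_t)(x))
```

Because the helper is a real function rather than a macro expanding to constant arithmetic, the compiler cannot fold symbol addresses at build time, which is the compile/link-time issue the message refers to.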
  4. 03 May, 2007 6 commits
    • [PATCH] x86-64: Inhibit machine from asserting an NMI when doing Alt-SysRq-M operation. · ae32b129
      Konrad Rzeszutek committed
      This patch touches the NMI watchdog every MAX_ORDER_NR_PAGES
      to inhibit the machine from triggering an NMI while the CPUs
      are locked. This situation is happening on boxes with more
      than 64 CPUs and 128GB of RAM when Alt-SysRq-M is performed.
      
      It has been successfully tested for regression on uni, 2, 4, 8,
      32, and 64 CPU boxes with various memory configurations.
      Signed-off-by: Andi Kleen <ak@suse.de>
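The pattern in this patch, touching the watchdog at a fixed interval during a long page walk, might look like the sketch below. It is a user-space model under stated assumptions: `MODEL_MAX_ORDER_NR_PAGES` is given an assumed value of 1024, and `model_touch_nmi_watchdog()` and `model_walk_pages()` are hypothetical stand-ins that only count calls.

```c
#include <assert.h>

/* Assumed value for illustration: 1 << (MAX_ORDER - 1) with MAX_ORDER = 11. */
#define MODEL_MAX_ORDER_NR_PAGES 1024UL

/* Counts how often the (simulated) watchdog was touched. */
unsigned long model_watchdog_touches;

/* Stand-in for the real watchdog reset; here it only records the call. */
void model_touch_nmi_watchdog(void)
{
    model_watchdog_touches++;
}

/*
 * Walk the page frame numbers in [start_pfn, end_pfn) and touch the
 * watchdog once per MODEL_MAX_ORDER_NR_PAGES block, so a very long
 * walk never looks like a hung CPU. Returns the number of pages visited.
 */
unsigned long model_walk_pages(unsigned long start_pfn, unsigned long end_pfn)
{
    unsigned long pfn, visited = 0;

    assert(start_pfn <= end_pfn);
    for (pfn = start_pfn; pfn < end_pfn; pfn++) {
        if (pfn % MODEL_MAX_ORDER_NR_PAGES == 0)
            model_touch_nmi_watchdog();
        visited++; /* the real walk would inspect the page here */
    }
    return visited;
}
```

Touching once per block rather than once per page keeps the overhead of the walk negligible while still bounding the time between touches.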
    • [PATCH] x86: tighten kernel image page access rights · 6fb14755
      Jan Beulich committed
      On x86-64, kernel memory freed after init can be entirely unmapped instead
      of just getting 'poisoned' by overwriting with a debug pattern.
      
      On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table
      can also be write-protected.
      
      Compared to the first version, this one prevents re-creating deleted
      mappings in the kernel image range on x86-64, if those got removed
      previously. This, together with the original changes, prevents temporarily
      having inconsistent mappings when cacheability attributes are being
      changed on such pages (e.g. from AGP code). While on i386 such duplicate
      mappings don't exist, the same change is done there, too, both for
      consistency and because checking pte_present() before using various other
      pte_XXX functions is a requirement anyway. At the same time, i386 code gets
      adjusted to use pte_huge() instead of open coding this.
      
      AK: split out cpa() changes
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86: __pa and __pa_symbol address space separation · 0dbf7028
      Vivek Goyal committed
      Currently __pa_symbol is for use with symbols in the kernel address
      map and __pa is for use with pointers into the physical memory map.
      But the code is implemented so you can usually interchange the two.
      
      __pa, which is much more common, can be implemented much more cheaply
      if it doesn't have to worry about any other kernel address
      spaces.  This is especially true with a relocatable kernel, as
      __pa_symbol needs to perform an extra variable read to resolve
      the address.
      
      A third macro, __pa_vsymbol, is added for the vsyscall data, for
      finding the physical addresses of vsyscall pages.
      
      Most of this patch is simply sorting through the references to
      __pa or __pa_symbol and using the proper one.  A little of
      it is continuing to use a physical address when we have it
      instead of recalculating it several times.
      
      swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
      and init_mm.pgd is initialized at boot (instead of compile time)
      to the physmem virtual mapping of init_level4_pgd.  The
      physical address changed.
      
      Except for the EMPTY_ZERO page, all of the remaining references
      to __pa_symbol appear to be during kernel initialization.  So this
      should reduce the cost of __pa in the common case, even on a relocated
      kernel.
      
      As this is technically a semantic change we need to be on the lookout
      for anything I missed.  But it works for me (tm).
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Remove the identity mapping as early as possible · cfd243d4
      Vivek Goyal committed
      With the rewrite of the SMP trampoline and the early page
      allocator there is nothing that needs identity mapped pages,
      once we start executing C code.
      
      So add zap_identity_mappings into head64.c and remove
      zap_low_mappings() from much later in the code.  The functions
      are subtly different, thus the name change.
      
      This also kills boot_level4_pgt which was from an earlier
      attempt to move the identity mappings as early as possible,
      and is now no longer needed.  Essentially I have replaced
      boot_level4_pgt with trampoline_level4_pgt in trampoline.S.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Kill temp boot pmds · dafe41ee
      Vivek Goyal committed
      Early in the boot process we need the ability to set
      up temporary mappings, before our normal mechanisms are
      initialized.  Currently this is used to map pages that
      are part of the page tables we are building and pages
      during the dmi scan.
      
      The core problem is that we are using the user portion of
      the page tables to implement this, which means that while
      this mechanism is active we cannot catch NULL pointer dereferences
      and we deviate from the normal ways of handling things.
      
      In this patch I modify early_ioremap to map pages into
      the kernel portion of the address space, roughly where
      we will later put modules, and I make the discovery of
      which addresses we can use dynamic, which removes all
      kinds of static limits and removes the dependencies
      on implementation details between different parts of the code.
      
      Now alloc_low_page() and unmap_low_page() use
      early_ioremap() and early_iounmap() to allocate/map and
      unmap a page.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: dma_ops as const · e6584504
      Stephen Hemminger committed
      The dma_ops structure can be const since it never changes
      after boot.
      Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
  5. 15 Feb, 2007 3 commits
  6. 07 Dec, 2006 1 commit
    • [PATCH] x86-64: fix perms/range of vsyscall vma in /proc/*/maps · 103efcd9
      Ernie Petrides committed
      The final line of /proc/<pid>/maps on x86_64 for native 64-bit
      tasks shows an incorrect ending address and incorrect permissions.  There
      is only a single page mapped in this vsyscall region, and it is accessible
      for both read and execute.
      
      The patch below fixes this.  (Since 32-bit-compat tasks have a real vma
      with correct perms/range, no change is necessary for that scenario.)
      
      Before the patch, a "cat /proc/self/maps | tail -1" shows this:
      
              ffffffffff600000-ffffffffffe00000 ---p 00000000 [...]
      
      After the patch, this is the output:
      
              ffffffffff600000-ffffffffff601000 r-xp 00000000 [...]
      Signed-off-by: Ernie Petrides <petrides@redhat.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
  7. 21 Nov, 2006 1 commit
  8. 14 Nov, 2006 1 commit
  9. 12 Oct, 2006 1 commit
    • [PATCH] mm: use symbolic names instead of indices for zone initialisation · 6391af17
      Mel Gorman committed
      Arch-independent zone-sizing is using indices instead of symbolic names to
      offset within an array related to zones (max_zone_pfns).  The unintended
      impact is that ZONE_DMA and ZONE_NORMAL are initialised on powerpc instead
      of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set.  As a result,
      the machine fails to boot but will boot with CONFIG_HIGHMEM turned off.
      
      The following patch properly initialises the max_zone_pfns[] array and uses
      symbolic names instead of indices in each architecture using
      arch-independent zone-sizing.  Two users have successfully booted their
      powerpcs with it (one an ibook G4).  It has also been boot tested on x86,
      x86_64, ppc64 and ia64.  Please merge for 2.6.19-rc2.
      
      Credit to Benjamin Herrenschmidt for identifying the bug and rolling the
      first fix.  Additional credit to Johannes Berg and Andreas Schwab for
      reporting the problem and testing on powerpc.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
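The difference between index-based and name-based initialisation can be illustrated with a small sketch. The three-zone enum below is an assumed layout chosen to match the failure described (index 1 landing on the normal zone instead of highmem); `sketch_init_zone_limits()` and its parameters are hypothetical, not the kernel's interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed zone layout for a powerpc-like configuration with highmem. */
enum sketch_zone {
    SKETCH_ZONE_DMA,
    SKETCH_ZONE_NORMAL,
    SKETCH_ZONE_HIGHMEM,
    SKETCH_MAX_NR_ZONES
};

/*
 * Fill the per-zone upper limits by symbolic name. Writing to
 * max_zone_pfns[1] instead would silently hit SKETCH_ZONE_NORMAL
 * here, which is the kind of mistake the commit message describes.
 */
void sketch_init_zone_limits(unsigned long *max_zone_pfns,
                             unsigned long lowmem_end_pfn,
                             unsigned long top_of_ram_pfn)
{
    assert(max_zone_pfns != NULL);

    /* Start from a fully zeroed array so unused zones are well defined. */
    memset(max_zone_pfns, 0, SKETCH_MAX_NR_ZONES * sizeof(*max_zone_pfns));
    max_zone_pfns[SKETCH_ZONE_DMA] = lowmem_end_pfn;
    max_zone_pfns[SKETCH_ZONE_HIGHMEM] = top_of_ram_pfn;
}
```

With symbolic names the code stays correct when zones are added to or removed from the enum, since no caller depends on a zone's numeric position.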
  10. 01 Oct, 2006 3 commits
  11. 27 Sep, 2006 2 commits
    • [PATCH] Account for memmap and optionally the kernel image as holes · 0e0b864e
      Mel Gorman committed
      The x86_64 code accounted for memmap and some portions of the DMA zone as
      holes.  This was because those areas would never be reclaimed and accounting
      for them as memory affects min watermarks.  This patch will account for the
      memmap as a memory hole.  Architectures may optionally use set_dma_reserve()
      if they wish to account for a portion of memory in ZONE_DMA as a hole.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Keith Mannthey" <kmannth@gmail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
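The accounting idea can be sketched as simple arithmetic: pages that hold the memmap, plus any reserved DMA pages, are subtracted from the pages counted as usable before watermarks are derived. The helper names, the 4096-byte page size and the notion of passing the `struct page` size as a parameter are all assumptions made for this illustration.

```c
#include <assert.h>

#define HOLE_PAGE_SIZE 4096UL /* assumed page size */

/* Pages needed to store one page descriptor per spanned page, rounded up. */
unsigned long hole_memmap_pages(unsigned long spanned_pages,
                                unsigned long page_struct_size)
{
    return (spanned_pages * page_struct_size + HOLE_PAGE_SIZE - 1)
           / HOLE_PAGE_SIZE;
}

/*
 * Pages that may ever be reclaimed: the spanned range minus real holes,
 * minus the memmap, minus whatever the architecture reserved in the DMA
 * zone. Clamped at zero for very small zones.
 */
unsigned long hole_usable_pages(unsigned long spanned_pages,
                                unsigned long absent_pages,
                                unsigned long page_struct_size,
                                unsigned long dma_reserve)
{
    unsigned long usable, overhead;

    assert(absent_pages <= spanned_pages);
    usable = spanned_pages - absent_pages;
    overhead = hole_memmap_pages(spanned_pages, page_struct_size) + dma_reserve;
    return usable > overhead ? usable - overhead : 0;
}
```

For a 4096-page zone with an assumed 64-byte descriptor, the memmap takes 64 pages, so those never inflate the figure that the minimum watermarks are computed from.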
    • [PATCH] Have x86_64 use add_active_range() and free_area_init_nodes · 5cb248ab
      Mel Gorman committed
      Size zones and holes in an architecture independent manner for x86_64.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Keith Mannthey" <kmannth@gmail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  12. 26 Sep, 2006 4 commits
    • [PATCH] reduce MAX_NR_ZONES: remove two strange uses of MAX_NR_ZONES · 776ed98b
      Christoph Lameter committed
      I keep seeing zones on various platforms that are never used and wonder why we
      compile support for them into the kernel.  Counters show up for HIGHMEM and
      DMA32 that are always zero.
      
      This patch allows the removal of ZONE_DMA32 for non x86_64 architectures and
      it will get rid of ZONE_HIGHMEM for arches not using highmem (like 64 bit
      architectures).  If an arch does not define CONFIG_HIGHMEM then ZONE_HIGHMEM
      will not be defined.  Similarly if an arch does not define CONFIG_ZONE_DMA32
      then ZONE_DMA32 will not be defined.
      
      No current architecture uses all the 4 zones (DMA,DMA32,NORMAL,HIGH) that we
      have now.  The patchset will reduce the number of zones for all platforms.
      
      On many platforms that do not have DMA32 or HIGHMEM this will reduce the
      number of zones by 50%.  For example, ia64 only uses DMA and NORMAL.
      
      Large amounts of memory can be saved for larger systems that may have a few
      hundred NUMA nodes.
      
      With ZONE_DMA32 and ZONE_HIGHMEM support optional MAX_NR_ZONES will be 2 for
      many non i386 platforms and even for i386 without CONFIG_HIGHMEM set.
      
      Tested on ia64, x86_64 and on i386 with and without highmem.
      
      The patchset consists of 11 patches that are following this message.
      
      One could go even further than this patchset and also make ZONE_DMA optional
      because some platforms do not need a separate DMA zone and can do DMA to all
      of memory.  This could reduce MAX_NR_ZONES to 1.  Such a patchset will
      hopefully follow soon.
      
      This patch:
      
      Fix strange uses of MAX_NR_ZONES
      
      Sometimes we use MAX_NR_ZONES - x to refer to a zone.  Make that explicit.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
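One way to picture optional zones is an enum whose members are compiled in only when the matching option is set, so the zone count shrinks automatically. The sketch below uses hypothetical `OPT_CONFIG_*` macros and `opt_*` names rather than the kernel's own; with neither macro defined it yields two zones, as the message predicts for many platforms.

```c
#include <assert.h>

/*
 * Zones exist only when their (hypothetical) config macro is defined,
 * so OPT_MAX_NR_ZONES always equals the number of zones actually built.
 */
enum opt_zone_type {
    OPT_ZONE_DMA,
#ifdef OPT_CONFIG_ZONE_DMA32
    OPT_ZONE_DMA32,
#endif
    OPT_ZONE_NORMAL,
#ifdef OPT_CONFIG_HIGHMEM
    OPT_ZONE_HIGHMEM,
#endif
    OPT_MAX_NR_ZONES
};

/* Look a zone up by its symbolic name instead of "OPT_MAX_NR_ZONES - x". */
const char *opt_zone_name(enum opt_zone_type zone)
{
    static const char *const names[OPT_MAX_NR_ZONES] = {
        [OPT_ZONE_DMA] = "DMA",
#ifdef OPT_CONFIG_ZONE_DMA32
        [OPT_ZONE_DMA32] = "DMA32",
#endif
        [OPT_ZONE_NORMAL] = "Normal",
#ifdef OPT_CONFIG_HIGHMEM
        [OPT_ZONE_HIGHMEM] = "HighMem",
#endif
    };

    assert((unsigned int)zone < OPT_MAX_NR_ZONES);
    return names[zone];
}
```

Arithmetic such as `OPT_MAX_NR_ZONES - 1` would name a different zone depending on the configuration, which is why referring to zones explicitly matters once the count is variable.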
    • [PATCH] Remove bogus warning from early_ioremap · d3cf7f06
      Andi Kleen committed
      It is correct for its only caller right now, but not for possible
      future others.
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86_64 kernel mapping fix · 6ad91658
      Keith Mannthey committed
      Fix for the x86_64 kernel mapping code.  Without this patch the update path
      only inits one pmd_page worth of memory and tramples any entries on it.  Now
      the calling convention to phys_pmd_init and phys_init is to always pass a
      [pmd/pud] page, not an offset within a page.
      
      Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • [PATCH] initialize end of memory variables as early as possible · caff0710
      Jan Beulich committed
      While an earlier patch already took a small step in that direction,
      this patch moves initialization of all memory end variables to as
      early as possible, so that dependent code doesn't need to check
      whether these variables have already been set.
      
      Also, remove a misleading (perhaps just outdated) comment, and make
      static a variable only used in a single file.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
  13. 02 Jul, 2006 1 commit
  14. 01 Jul, 2006 1 commit
  15. 28 Jun, 2006 2 commits
  16. 27 Jun, 2006 4 commits
    • [PATCH] x86_64: miscellaneous mm/init.c fixes · 5f51e139
      Jan Beulich committed
       - fix an off-by-one error in phys_pmd_init()
       - prevent phys_pmd_init() from removing mappings established earlier
       - fix the direct mapping early printk to in fact show the end of the range
       - remove an apparently orphan comment
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Calgary IOMMU - IOMMU abstractions · 0dc243ae
      Jon Mason committed
      This patch creates a new interface for IOMMUs by adding a centralized
      location for IOMMU allocation (for translation tables/apertures) and
      IOMMU initialization.  In creating these, code was moved around for
      abstraction, uniformity, and conciseness.
      
      Take note of the move of the iommu_setup bootarg parsing code to
      __setup.  This is enabled by moving back the location of the aperture
      allocation/detection to mem init (which while ugly, was already the
      location of the swiotlb_init).
      
      While a slight departure from the previous patch, I believe this provides
      the true intention of the previous versions of the patch which changed
      this code.  It also makes the addition of the upcoming calgary code much
      cleaner than previous patches.
      
      [AK: Removed one broken change. iommu_setup still has to be called
      early]
      Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
      Signed-off-by: Jon Mason <jdmason@us.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Get rid of pud_offset_k / __pud_offset_k · d2ae5b5f
      Andi Kleen committed
      pud_offset_k() is equivalent to pud_offset() now, as pointed out by Jan Beulich.
      The same goes for __pud_offset_k, which needs a small change in the callers.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: x86_64 version of the smp alternative patch. · d167a518
      Gerd Hoffmann committed
      Changes are largely identical to the i386 version:
      
       * alternative #defines are moved to the new alternative.h file.
       * one new elf section with pointers to the lock prefixes which can be
         nop'ed out for non-smp.
       * two new elf sections similar to the "classic" alternatives to
         replace SMP code with simpler UP code.
       * fixup headers to use alternative.h instead of defining their own
         LOCK / LOCK_PREFIX macros.
      
      The patch reuses the i386 version of the alternatives code to avoid code
      duplication.  The code in alternatives.c was shuffled around a bit to
      reduce the number of #ifdefs needed.  It also got some tweaks needed for
      x86_64 (vsyscall page handling) and new features (noreplacement option
      which was x86_64 only up to now).  Debug printk's are changed from
      compile-time to runtime.
      
      Loosely based on an early version from Bastian Blank <waldi@debian.org>
      Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
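The lock-prefix section mentioned in the list above can be modelled as a table of pointers into the instruction stream. The sketch below patches a plain byte buffer in user space; `alt_patch_for_up()` and `alt_patch_for_smp()` are hypothetical names, and only the opcode values (0xf0 for the x86 LOCK prefix, 0x90 for a one-byte NOP) are taken as given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALT_LOCK_PREFIX 0xf0u /* x86 LOCK prefix byte */
#define ALT_NOP         0x90u /* x86 one-byte NOP */

/*
 * lock_sites plays the role of the new ELF section: one pointer per
 * LOCK prefix in the code. On a uniprocessor each prefix is replaced
 * by a NOP, leaving the instruction that follows untouched.
 */
void alt_patch_for_up(uint8_t **lock_sites, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        assert(*lock_sites[i] == ALT_LOCK_PREFIX);
        *lock_sites[i] = ALT_NOP;
    }
}

/* Reverse operation: restore the LOCK prefixes when going back to SMP. */
void alt_patch_for_smp(uint8_t **lock_sites, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        assert(*lock_sites[i] == ALT_NOP);
        *lock_sites[i] = ALT_LOCK_PREFIX;
    }
}
```

Keeping the sites in a table means the patching code never has to decode instructions; it only rewrites single bytes at known addresses.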
  17. 10 Apr, 2006 3 commits
    • [PATCH] x86_64: Rename e820_mapped to e820_any_mapped · eee5a9fa
      Arjan van de Ven committed
      Rename e820_mapped to e820_any_mapped since it tests if any part of the
      range is mapped according to the type.
      
      Later steps will introduce e820_all_mapped which will check if the
      entire range is mapped with the type.  Both have their merit.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
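The distinction between "any part mapped" and "entire range mapped" can be sketched over a simplified memory map. The struct and both functions below are hypothetical models, not the kernel's e820 code; the "all" variant additionally assumes the entries are sorted by address and do not overlap, and it is only a guess at how the later function might behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified memory-map entry: [addr, addr + size) has the given type. */
struct model_e820_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* True if at least one byte of [start, end) lies in an entry of this type. */
int model_any_mapped(const struct model_e820_entry *map, size_t n,
                     uint64_t start, uint64_t end, uint32_t type)
{
    size_t i;

    assert(start < end);
    for (i = 0; i < n; i++) {
        if (map[i].type != type)
            continue;
        if (map[i].addr >= end || map[i].addr + map[i].size <= start)
            continue; /* no overlap with the queried range */
        return 1;
    }
    return 0;
}

/*
 * True only if every byte of [start, end) is covered by entries of this
 * type. Assumes map[] is sorted by address: each matching entry that
 * reaches back to the current start pushes start forward to its end.
 */
int model_all_mapped(const struct model_e820_entry *map, size_t n,
                     uint64_t start, uint64_t end, uint32_t type)
{
    size_t i;

    assert(start < end);
    for (i = 0; i < n; i++) {
        if (map[i].type != type)
            continue;
        if (map[i].addr >= end || map[i].addr + map[i].size <= start)
            continue;
        if (map[i].addr <= start)
            start = map[i].addr + map[i].size;
        if (start >= end)
            return 1;
    }
    return 0;
}
```

A range that straddles a gap between two entries satisfies the first test but not the second, which is why the message says both have their merit.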
    • [PATCH] x86_64: Reserve SRAT hotadd memory on x86-64 · 68a3a7fe
      Andi Kleen committed
      From: Keith Mannthey, Andi Kleen
      
      Implement memory hotadd without sparsemem. The memory in the SRAT
      hotadd area is just preserved instead and can be activated later.
      
      There are a few restrictions:
      - Only one continuous hotadd area allowed per node
      
      The main problem is dealing with the many buggy SRAT tables
      that are out there. The strategy here is to reject anything
      suspicious.
      
      Originally from Keith Mannthey, with several hacks and changes by AK
      and also contributions from Andrew Morton
      
      [ TBD: Problems pointed out by KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>:
      
       1) Goto's rebuild_zonelist patch will not work if CONFIG_MEMORY_HOTPLUG=n.
      
          Rebuilding zonelist is necessary when the system has just memory <
          4G at boot, and hot adds memory > 4G.  Because x86_64 has DMA32,
          ZONE_NORMAL is not included in the zonelist at boot time if the system
          doesn't have memory >4G at boot.
      
          [AK: should just force the higher zones at boot time when SRAT tells us]
      
       2) zone and node's spanned_pages and present_pages are not incremented.
          They should be.
      
          For example, our server (ia64/Fujitsu PrimeQuest) can equip memory
          from 4G to 1T (maybe 2T in future), and SRAT will *always* say we have
          possible 1T memory.  (Microsoft requires "write all possible memory
          in SRAT") When we reserve memmap for possible 1T memory, Linux will
          not work well in minimum 4G configuration ;)
      
          [AK: needs limiting to 5-10% of max memory]
       ]
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Support memory hotadd without sparsemem · 9d99aaa3
      Andi Kleen committed
      Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating
      mem_maps. This only needs some untangling of ifdefs to enable the necessary
      code even without SPARSEMEM.
      
      Originally from Keith Mannthey, hacked by AK.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 28 Mar, 2006 1 commit
  19. 26 Mar, 2006 3 commits