1. 30 11月, 2007 1 次提交
  2. 30 10月, 2007 1 次提交
  3. 17 10月, 2007 2 次提交
  4. 11 10月, 2007 2 次提交
  5. 27 7月, 2007 1 次提交
  6. 23 7月, 2007 4 次提交
    • S
      x86_64: fix section mismatch warning in init.c · dec2e6b7
      Sam Ravnborg 提交于
      Fix following warning:
      WARNING: vmlinux.o(.text+0x188ea): Section mismatch: reference to .init.text:__alloc_bootmem_core (between 'alloc_bootmem_high_node' and 'get_gate_vma')
      
      alloc_bootmem_high_node() is only used from __init scope so declare it __init.
      And in addition declare the weak variant __init too.
      Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dec2e6b7
    • A
      x86: Fix alternatives and kprobes to remap write-protected kernel text · 19d36ccd
      Andi Kleen 提交于
      Reenable kprobes and alternative patching when the kernel text is write
      protected by DEBUG_RODATA
      
      Add a general utility function to change write protected text.  The new
      function remaps the code using vmap to write it and takes care of CPU
      synchronization.  It also does CLFLUSH to make icache recovery faster.
      
      There are some limitations on when the function can be used, see the
      comment.
      
      This is a newer version that also changes the paravirt_ops code.
      text_poke also supports multi byte patching now.
      
      Contains bug fixes from Zach Amsden and suggestions from Mathieu
      Desnoyers.
      
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Zach Amsden <zach@vmware.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      19d36ccd
    • G
      x86_64: Use read and write crX in .c files · f51c9452
      Glauber de Oliveira Costa 提交于
      This patch uses the read and write functions provided at system.h
      for control registers instead of writting raw assembly over and
      over again in .c files. Functions to manipulate cr2 and cr8 were
      provided, as they were lacking.
      
      Also, removed some extra space after closing brackets
      Signed-off-by: NGlauber de Oliveira Costa <gcosta@redhat.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f51c9452
    • M
      x86: i386-show-unhandled-signals-v3 · abd4f750
      Masoud Asgharifard Sharbiani 提交于
      This patch makes the i386 behave the same way that x86_64 does when a
      segfault happens.  A line gets printed to the kernel log so that tools
      that need to check for failures can behave more uniformly between
      debug.show_unhandled_signals sysctl variable to 0 (or by doing echo 0 >
      /proc/sys/debug/exception-trace)
      
      Also, all of the lines being printed are now using printk_ratelimit() to
      deny the ability of DoS from a local user with a program like the
      following:
      
      main()
      {
             while (1)
                     if (!fork()) *(int *)0 = 0;
      }
      
      This new revision also includes the fix that Andrew did which got rid of
      new sysctl that was added to the system in earlier versions of this.
      Also, 'show-unhandled-signals' sysctl has been renamed back to the old
      'exception-trace' to avoid breakage of people's scripts.
      
      AK: Enabling by default for i386 will be likely controversal, but let's see what happens
      AK: Really folks, before complaining just fix your segfaults
      AK: I bet this will find a lot of silent issues
      Signed-off-by: NMasoud Sharbiani <masouds@google.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      [ Personally, I've found the complaints useful on x86-64, so I'm all for
        this. That said, I wonder if we could do it more prettily..   -Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      abd4f750
  7. 22 7月, 2007 2 次提交
    • J
      74a1ddc5
    • A
      x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu · 2aae950b
      Andi Kleen 提交于
      This implements new vDSO for x86-64.  The concept is similar
      to the existing vDSOs on i386 and PPC.  x86-64 has had static
      vsyscalls before,  but these are not flexible enough anymore.
      
      A vDSO is a ELF shared library supplied by the kernel that is mapped into
      user address space.  The vDSO mapping is randomized for each process
      for security reasons.
      
      Doing this was needed for clock_gettime, because clock_gettime
      always needs a syscall fallback and having one at a fixed
      address would have made buffer overflow exploits too easy to write.
      
      The vdso can be disabled with vdso=0
      
      It currently includes a new gettimeofday implemention and optimized
      clock_gettime(). The gettimeofday implementation is slightly faster
      than the one in the old vsyscall.  clock_gettime is significantly faster
      than the syscall for CLOCK_MONOTONIC and CLOCK_REALTIME.
      
      The new calls are generally faster than the old vsyscall.
      
      Advantages over the old x86-64 vsyscalls:
      - Extensible
      - Randomized
      - Cleaner
      - Easier to virtualize (the old static address range previously causes
      overhead e.g. for Xen because it has to create special page tables for it)
      
      Weak points:
      - glibc support still to be written
      
      The VM interface is partly based on Ingo Molnar's i386 version.
      
      Includes compile fix from Joachim Deguara
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2aae950b
  8. 22 6月, 2007 1 次提交
  9. 09 6月, 2007 1 次提交
    • B
      fix sysrq-m oops · 12710a56
      Bob Picco 提交于
      We aren't sampling for holes in memory.  Thus we encounter a section hole
      with empty section map pointer for SPARSEMEM and OOPs for show_mem.  This
      issue has been seen in 2.6.21, current git and current mm.  The patch below
      is for mainline and mm.  It was boot tested for SPARSEMEM, current VMEMMAP
      of Andy's in mm ml and DISCONTIGMEM.  A slightly different patch will be
      posted to stable for 2.6.21.
      
      Previous to commit f0a5a58a memory_present
      was called for node_start_pfn to node_end_pfn.  This would cover the
      hole(s) with reserved pages and valid sections.  Most SPARSEMEM supported
      arches do a pfn_valid check in show_mem before computing the page structure
      address.
      
      This issue was brought to my attention on IRC by Arnaldo Carvalho de Melo.
      Thanks to Arnaldo for testing.
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12710a56
  10. 01 6月, 2007 1 次提交
  11. 09 5月, 2007 1 次提交
  12. 07 5月, 2007 1 次提交
    • L
      Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Linus Torvalds 提交于
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function, that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e3ebadd9
  13. 03 5月, 2007 6 次提交
    • K
      [PATCH] x86-64: Inhibit machine from asserting an NMI when doing Alt-SysRq-M operation. · ae32b129
      Konrad Rzeszutek 提交于
      This patch touches the NMI watchdog every MAX_ORDER_NR_PAGES
      to inhibit the machine from triggering an NMI while the CPUs
      are locked. This situation is happening on boxes with more
      than 64CPUs and 128GB of RAM when Alt-SysRq-m is performed.
      
      It has been succesfully tested for regression on uni, 2, 4, 8
      32, and 64 CPU boxes with various memory configuration.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      ae32b129
    • J
      [PATCH] x86: tighten kernel image page access rights · 6fb14755
      Jan Beulich 提交于
      On x86-64, kernel memory freed after init can be entirely unmapped instead
      of just getting 'poisoned' by overwriting with a debug pattern.
      
      On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table
      can also be write-protected.
      
      Compared to the first version, this one prevents re-creating deleted
      mappings in the kernel image range on x86-64, if those got removed
      previously. This, together with the original changes, prevents temporarily
      having inconsistent mappings when cacheability attributes are being
      changed on such pages (e.g. from AGP code). While on i386 such duplicate
      mappings don't exist, the same change is done there, too, both for
      consistency and because checking pte_present() before using various other
      pte_XXX functions is a requirement anyway. At once, i386 code gets
      adjusted to use pte_huge() instead of open coding this.
      
      AK: split out cpa() changes
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      6fb14755
    • V
      [PATCH] x86: __pa and __pa_symbol address space separation · 0dbf7028
      Vivek Goyal 提交于
      Currently __pa_symbol is for use with symbols in the kernel address
      map and __pa is for use with pointers into the physical memory map.
      But the code is implemented so you can usually interchange the two.
      
      __pa which is much more common can be implemented much more cheaply
      if it is it doesn't have to worry about any other kernel address
      spaces.  This is especially true with a relocatable kernel as
      __pa_symbol needs to peform an extra variable read to resolve
      the address.
      
      There is a third macro that is added for the vsyscall data
      __pa_vsymbol for finding the physical addesses of vsyscall pages.
      
      Most of this patch is simply sorting through the references to
      __pa or __pa_symbol and using the proper one.  A little of
      it is continuing to use a physical address when we have it
      instead of recalculating it several times.
      
      swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
      and init_mm.pgd is initialized at boot (instead of compile time)
      to the physmem virtual mapping of init_level4_pgd.  The
      physical address changed.
      
      Except for the for EMPTY_ZERO page all of the remaining references
      to __pa_symbol appear to be during kernel initialization.  So this
      should reduce the cost of __pa in the common case, even on a relocated
      kernel.
      
      As this is technically a semantic change we need to be on the lookout
      for anything I missed.  But it works for me (tm).
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      0dbf7028
    • V
      [PATCH] x86-64: Remove the identity mapping as early as possible · cfd243d4
      Vivek Goyal 提交于
      With the rewrite of the SMP trampoline and the early page
      allocator there is nothing that needs identity mapped pages,
      once we start executing C code.
      
      So add zap_identity_mappings into head64.c and remove
      zap_low_mappings() from much later in the code.  The functions
       are subtly different thus the name change.
      
      This also kills boot_level4_pgt which was from an earlier
      attempt to move the identity mappings as early as possible,
      and is now no longer needed.  Essentially I have replaced
      boot_level4_pgt with trampoline_level4_pgt in trampoline.S
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      cfd243d4
    • V
      [PATCH] x86-64: Kill temp boot pmds · dafe41ee
      Vivek Goyal 提交于
      Early in the boot process we need the ability to set
      up temporary mappings, before our normal mechanisms are
      initialized.  Currently this is used to map pages that
      are part of the page tables we are building and pages
      during the dmi scan.
      
      The core problem is that we are using the user portion of
      the page tables to implement this.  Which means that while
      this mechanism is active we cannot catch NULL pointer dereferences
      and we deviate from the normal ways of handling things.
      
      In this patch I modify early_ioremap to map pages into
      the kernel portion of address space, roughly where
      we will later put modules, and I make the discovery of
      which addresses we can use dynamic which removes all
      kinds of static limits and remove the dependencies
      on implementation details between different parts of the code.
      
      Now alloc_low_page() and unmap_low_page() use
      early_iomap() and early_iounmap() to allocate/map and
      unmap a page.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      dafe41ee
    • S
      [PATCH] x86-64: dma_ops as const · e6584504
      Stephen Hemminger 提交于
      The dma_ops structure can be const since it never changes
      after boot.
      Signed-off-by: NStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      e6584504
  14. 15 2月, 2007 3 次提交
  15. 07 12月, 2006 1 次提交
    • E
      [PATCH] x86-64: fix perms/range of vsyscall vma in /proc/*/maps · 103efcd9
      Ernie Petrides 提交于
      The final line of /proc/<pid>/maps on x86_64 for native 64-bit
      tasks shows an incorrect ending address and incorrect permissions.  There
      is only a single page mapped in this vsyscall region, and it is accessible
      for both read and execute.
      
      The patch below fixes this.  (Since 32-bit-compat tasks have a real vma
      with correct perms/range, no change is necessary for that scenario.)
      
      Before the patch, a "cat /proc/self/maps | tail -1" shows this:
      
              ffffffffff600000-ffffffffffe00000 ---p 00000000 [...]
      
      After the patch, this is the output:
      
              ffffffffff600000-ffffffffff601000 r-xp 00000000 [...]
      Signed-off-by: NErnie Petrides <petrides@redhat.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      103efcd9
  16. 21 11月, 2006 1 次提交
  17. 14 11月, 2006 1 次提交
  18. 12 10月, 2006 1 次提交
    • M
      [PATCH] mm: use symbolic names instead of indices for zone initialisation · 6391af17
      Mel Gorman 提交于
      Arch-independent zone-sizing is using indices instead of symbolic names to
      offset within an array related to zones (max_zone_pfns).  The unintended
      impact is that ZONE_DMA and ZONE_NORMAL is initialised on powerpc instead
      of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set.  As a result, the
      the machine fails to boot but will boot with CONFIG_HIGHMEM turned off.
      
      The following patch properly initialises the max_zone_pfns[] array and uses
      symbolic names instead of indices in each architecture using
      arch-independent zone-sizing.  Two users have successfully booted their
      powerpcs with it (one an ibook G4).  It has also been boot tested on x86,
      x86_64, ppc64 and ia64.  Please merge for 2.6.19-rc2.
      
      Credit to Benjamin Herrenschmidt for identifying the bug and rolling the
      first fix.  Additional credit to Johannes Berg and Andreas Schwab for
      reporting the problem and testing on powerpc.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6391af17
  19. 01 10月, 2006 3 次提交
  20. 27 9月, 2006 2 次提交
    • M
      [PATCH] Account for memmap and optionally the kernel image as holes · 0e0b864e
      Mel Gorman 提交于
      The x86_64 code accounted for memmap and some portions of the the DMA zone as
      holes.  This was because those areas would never be reclaimed and accounting
      for them as memory affects min watermarks.  This patch will account for the
      memmap as a memory hole.  Architectures may optionally use set_dma_reserve()
      if they wish to account for a portion of memory in ZONE_DMA as a hole.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Keith Mannthey" <kmannth@gmail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0e0b864e
    • M
      [PATCH] Have x86_64 use add_active_range() and free_area_init_nodes · 5cb248ab
      Mel Gorman 提交于
      Size zones and holes in an architecture independent manner for x86_64.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Keith Mannthey" <kmannth@gmail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5cb248ab
  21. 26 9月, 2006 4 次提交
    • C
      [PATCH] reduce MAX_NR_ZONES: remove two strange uses of MAX_NR_ZONES · 776ed98b
      Christoph Lameter 提交于
      I keep seeing zones on various platforms that are never used and wonder why we
      compile support for them into the kernel.  Counters show up for HIGHMEM and
      DMA32 that are alway zero.
      
      This patch allows the removal of ZONE_DMA32 for non x86_64 architectures and
      it will get rid of ZONE_HIGHMEM for arches not using highmem (like 64 bit
      architectures).  If an arch does not define CONFIG_HIGHMEM then ZONE_HIGHMEM
      will not be defined.  Similarly if an arch does not define CONFIG_ZONE_DMA32
      then ZONE_DMA32 will not be defined.
      
      No current architecture uses all the 4 zones (DMA,DMA32,NORMAL,HIGH) that we
      have now.  The patchset will reduce the number of zones for all platforms.
      
      On many platforms that do not have DMA32 or HIGHMEM this will reduce the
      number of zones by 50%.  F.e.  ia64 only uses DMA and NORMAL.
      
      Large amounts of memory can be saved for larger systemss that may have a few
      hundred NUMA nodes.
      
      With ZONE_DMA32 and ZONE_HIGHMEM support optional MAX_NR_ZONES will be 2 for
      many non i386 platforms and even for i386 without CONFIG_HIGHMEM set.
      
      Tested on ia64, x86_64 and on i386 with and without highmem.
      
      The patchset consists of 11 patches that are following this message.
      
      One could go even further than this patchset and also make ZONE_DMA optional
      because some platforms do not need a separate DMA zone and can do DMA to all
      of memory.  This could reduce MAX_NR_ZONES to 1.  Such a patchset will
      hopefully follow soon.
      
      This patch:
      
      Fix strange uses of MAX_NR_ZONES
      
      Sometimes we use MAX_NR_ZONES - x to refer to a zone.  Make that explicit.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      776ed98b
    • A
      [PATCH] Remove bogus warning from early_ioremap · d3cf7f06
      Andi Kleen 提交于
      It is correct for its only caller right now, but not for possible
      future others.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      d3cf7f06
    • K
      [PATCH] x86_64 kernel mapping fix · 6ad91658
      Keith Mannthey 提交于
      Fix for the x86_64 kernel mapping code.  Without this patch the update path
      only inits one pmd_page worth of memory and tramples any entries on it.  now
      the calling convention to phys_pmd_init and phys_init is to always pass a
      [pmd/pud] page not an offset within a page.
      
      Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      6ad91658
    • J
      [PATCH] initialize end of memory variables as early as possible · caff0710
      Jan Beulich 提交于
      While an earlier patch already did a small step into that direction,
      this patch moves initialization of all memory end variables to as
      early as possible, so that dependent code doesn't need to check
      whether these variables have already been set.
      
      Also, remove a misleading (perhaps just outdated) comment, and make
      static a variable only used in a single file.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      caff0710