1. 09 5月, 2007 5 次提交
    • C
      move die notifier handling to common code · 1eeb66a1
      Christoph Hellwig 提交于
      This patch moves the die notifier handling to common code.  Previous
      various architectures had exactly the same code for it.  Note that the new
      code is compiled unconditionally, this should be understood as an appel to
      the other architecture maintainer to implement support for it aswell (aka
      sprinkling a notify_die or two in the proper place)
      
      arm had a notifiy_die that did something totally different, I renamed it to
      arm_notify_die as part of the patch and made it static to the file it's
      declared and used at.  avr32 used to pass slightly less information through
      this interface and I brought it into line with the other architectures.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
      [bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: NBryan Wu <bryan.wu@analog.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1eeb66a1
    • G
      Fixes and cleanups for earlyprintk aka boot console · 69331af7
      Gerd Hoffmann 提交于
      The console subsystem already has an idea of a boot console, using the
      CON_BOOT flag.  The implementation has some flaws though.  The major
      problem is that presence of a boot console makes register_console() ignore
      any other console devices (unless explicitly specified on the kernel
      command line).
      
      This patch fixes the console selection code to *not* consider a boot
      console a full-featured one, so the first non-boot console registering will
      become the default console instead.  This way the unregister call for the
      boot console in the register_console() function actually triggers and the
      handover from the boot console to the real console device works smoothly.
      Added a printk for the handover, so you know which console device the
      output goes to when the boot console stops printing messages.
      
      The disable_early_printk() call is obsolete with that patch, explicitly
      disabling the early console isn't needed any more as it works automagically
      with that patch.
      
      I've walked through the tree, dropped all disable_early_printk() instances
      found below arch/ and tagged the consoles with CON_BOOT if needed.  The
      code is tested on x86, sh (thanks to Paul) and mips (thanks to Ralf).
      
      Changes to last version: Rediffed against -rc3, adapted to mips cleanups by
      Ralf, fixed "udbg-immortal" cmd line arg on powerpc.
      Signed-off-by: NGerd Hoffmann <kraxel@exsuse.de>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      69331af7
    • C
      ipmi: add new IPMI nmi watchdog handling · f64da958
      Corey Minyard 提交于
      Convert over to the new NMI handling for getting IPMI watchdog timeouts via an
      NMI.  This add config options to know if there is the ability to receive NMIs
      and if it has an NMI post processing call.  Then it modifies the IPMI watchdog
      to take advantage of this so that it can know if an NMI comes in.
      
      It also adds testing that the IPMI NMI watchdog works.
      Signed-off-by: NCorey Minyard <minyard@acm.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f64da958
    • C
      simplify the stacktrace code · ab1b6f03
      Christoph Hellwig 提交于
      Simplify the stacktrace code:
      
       - remove the unused task argument to save_stack_trace, it's always
         current
       - remove the all_contexts flag, it's alwasy 0
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Akinobu Mita <akinobu.mita@gmail.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ab1b6f03
    • Y
      Fix section mismatch of memory hotplug related code. · a3142c8e
      Yasunori Goto 提交于
      This is to fix many section mismatches of code related to memory hotplug.
      I checked compile with memory hotplug on/off on ia64 and x86-64 box.
      Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a3142c8e
  2. 08 5月, 2007 2 次提交
  3. 07 5月, 2007 1 次提交
    • L
      Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Linus Torvalds 提交于
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function, that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e3ebadd9
  4. 03 5月, 2007 32 次提交
    • M
      MSI: arch must connect the irq and the msi_desc · 7fe3730d
      Michael Ellerman 提交于
      set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
      it at some point in their setup routine, and then the generic code sets up the
      reverse mapping from the msi_desc back to the irq.
      
      set_irq_msi() should do both connections, making it the one and only call
      required to connect an irq with it's MSI desc and vice versa.
      
      The arch code MUST call set_irq_msi(), and it must do so only once it's sure
      it's not going to fail the irq allocation.
      
      Given that there's no need for the arch to return the irq anymore, the return
      value from the arch setup routine just becomes 0 for success and anything else
      for failure.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      7fe3730d
    • D
      msi: introduce ARCH_SUPPORTS_MSI Kconfig option (rev2) · f282b970
      Dan Williams 提交于
      Allows architectures to advertise that they support MSI rather than listing
      each architecture as a PCI_MSI dependency.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      f282b970
    • A
      [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c · f19cccf3
      Andi Kleen 提交于
      Fix:
      
      In file included from include2/asm/apic.h:5,
                       from include2/asm/smp.h:15,
                       from linux/arch/x86_64/kernel/genapic_flat.c:18:
      linux/include/linux/pm.h: In function ‘call_platform_enable_wakeup’:
      linux/include/linux/pm.h:331: error: ‘EIO’ undeclared (first use in this function)
      linux/include/linux/pm.h:331: error: (Each undeclared identifier is reported only once
      linux/include/linux/pm.h:331: error: for each function it appears in.)
      Signed-off-by: NAndi Kleen <ak@suse.de>
      f19cccf3
    • A
      fac15a8e
    • A
      [PATCH] x86-64: Remove CONFIG_REORDER · 2136220d
      Andi Kleen 提交于
      The option never worked well and functionlist wasn't well maintained.
      Also it made the build very slow on many binutils version.
      
      So just remove it.
      
      Cc: arjan@linux.intel.com
      Signed-off-by: NAndi Kleen <ak@suse.de>
      2136220d
    • A
      [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning · 3bea9c97
      Andi Kleen 提交于
      This was supposed to see the full memory on a ASUS A8SX motherboard
      with 4GB RAM where the northbridge reports less memory, but it didn't
      help there. But it's a reasonable change so let's include it anyways.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      3bea9c97
    • A
      [PATCH] x86-64: Drop -traditional for arch/x86_64/boot · fa0a0091
      Andi Kleen 提交于
      Follows i386 and useful cleanup.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      fa0a0091
    • A
      [PATCH] x86-64: Use symbolic CPU features in early CPUID check · 72b1b1d0
      Andi Kleen 提交于
      Dead to magic numbers!
      
      Generated code is the same.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      72b1b1d0
    • D
      [PATCH] x86-64: Avoid overflows during apic timer calibration · 4637a74c
      David P. Reed 提交于
      - Use 64bit TSC calculations to avoid handling overflow
      - Use 32bit unsigned arithmetic for the APIC timer. This
      way overflows are handled correctly.
      - Fix exit check of loop to account for apic timer counting down
      
      Signed-off-by: dpreed@reed.com
      Signed-off-by: NAndi Kleen <ak@suse.de>
      4637a74c
    • A
      [PATCH] x86-64: Use the 32bit wd_ops for 64bit too. · 05cb007d
      Andi Kleen 提交于
      This mainly removes a lot of code, replacing it with calls into the new 32bit
      perfctr-watchdog.c
      Signed-off-by: NAndi Kleen <ak@suse.de>
      05cb007d
    • S
      [PATCH] x86-64: set node_possible_map at runtime - try 2 · e3f1caee
      Suresh Siddha 提交于
      Set the node_possible_map at runtime on x86_64.  On a non NUMA system,
      num_possible_nodes() will now say '1'.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <clameter@engr.sgi.com>
      e3f1caee
    • T
      [PATCH] x86-64: Dynamically adjust machine check interval · 8a336b0a
      Tim Hockin 提交于
      Background:
       We've found that MCEs (specifically DRAM SBEs) tend to come in bunches,
       especially when we are trying really hard to stress the system out.  The
       current MCE poller uses a static interval which does not care whether it
       has or has not found MCEs recently.
      
      Description:
       This patch makes the MCE poller adjust the polling interval dynamically.
       If we find an MCE, poll 2x faster (down to 10 ms).  When we stop finding
       MCEs, poll 2x slower (up to check_interval seconds).  The check_interval
       tunable becomes the max polling interval.  The "Machine check events
       logged" printk() is rate limited to the check_interval, which should be
       identical behavior to the old functionality.
      
      Result:
       If you start to take a lot of correctable errors (not exceptions), you
       log them faster and more accurately (less chance of overflowing the MCA
       registers).  If you don't take a lot of errors, you will see no change.
      
      Alternatives:
       I considered simply reducing the polling interval to 10 ms immediately
       and keeping it there as long as we continue to find errors.  This felt a
       bit heavy handed, but does perform significantly better for the default
       check_interval of 5 minutes (we're using a few seconds when testing for
       DRAM errors).  I could be convinced to go with this, if anyone felt it
       was not too aggressive.
      
      Testing:
       I used an error-injecting DIMM to create lots of correctable DRAM errors
       and verified that the polling interval accelerates.  The printk() only
       happens once per check_interval seconds.
      
      Patch:
       This patch is against 2.6.21-rc7.
      Signed-Off-By: NTim Hockin <thockin@google.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      8a336b0a
    • A
      [PATCH] x86-64: unexport cpu_llc_id · 425001fe
      Andrew Morton 提交于
      WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data:cpu_llc_id from __ksymtab between '__ksymtab_cpu_llc_id' (at offset 0x4a0) and '__ksymtab_smp_num_siblings'
      
      It is strange to export a __cpuinitdata symbols to modules, and no module
      appears to use it anyway.
      
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      425001fe
    • A
      [PATCH] x86-64: Auto compute __NR_syscall_max at compile time · 57a4f91a
      Andi Kleen 提交于
      No need to maintain it anymore
      Signed-off-by: NAndi Kleen <ak@suse.de>
      57a4f91a
    • E
      [PATCH] x86-64: move __vgetcpu_mode & __jiffies to the vsyscall_2 zone · 141a892f
      Eric Dumazet 提交于
      We apparently hit the 1024 limit of vsyscall_0 zone when some debugging
      options are set, or if __vsyscall_gtod_data is 64 bytes larger.
      
      In order to save 128 bytes from the vsyscall_0 zone, we move __vgetcpu_mode
      & __jiffies to vsyscall_2 zone where they really belong, since they are
      used only from vgetcpu() (which is in this vsyscall_2 area).
      
      After patch is applied, new layout is :
      
      ffffffffff600000 T vgettimeofday
      ffffffffff60004e t vsysc2
      ffffffffff600140 t vread_hpet
      ffffffffff600150 t vread_tsc
      ffffffffff600180 D __vsyscall_gtod_data
      ffffffffff600400 T vtime
      ffffffffff600413 t vsysc1
      ffffffffff600800 T vgetcpu
      ffffffffff600870 D __vgetcpu_mode
      ffffffffff600880 D __jiffies
      ffffffffff600c00 T venosys_1
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      141a892f
    • F
      [PATCH] x86-64: __send_IPI_dest_field - x86_64 · 9062d888
      Implement __send_IPI_dest_field which can be used to send IPIs when the
      "destination shorthand" field of the ICR is set to 00 (destination
      field). Use it whenever possible.
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      9062d888
    • F
      [PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64 · 3144c332
      Fernando Luis VazquezCao 提交于
      inquire_remote_apic is used for APIC debugging, so use
      safe_apic_wait_icr_idle  instead of apic_wait_icr_idle to avoid possible
      lockups when APIC delivery fails.
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      3144c332
    • F
      [PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64 · ea8c733b
      Fernando Luis VazquezCao 提交于
      The functionality provided by the new safe_apic_wait_icr_idle is being
      open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle
      instead to consolidate code and ease maintenance.
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      ea8c733b
    • F
      [PATCH] x86-64: safe_apic_wait_icr_idle - x86_64 · 8339e9fb
      Fernando Luis VazquezCao 提交于
      apic_wait_icr_idle looks like this:
      
      static __inline__ void apic_wait_icr_idle(void)
      {
        while (apic_read(APIC_ICR) & APIC_ICR_BUSY)
          cpu_relax();
      }
      
      The busy loop in this function would not be problematic if the
      corresponding status bit in the ICR were always updated, but that does
      not seem to be the case under certain crash scenarios. Kdump uses an IPI
      to stop the other CPUs in the event of a crash, but when any of the
      other CPUs are locked-up inside the NMI handler the CPU that sends the
      IPI will end up looping forever in the ICR check, effectively
      hard-locking the whole system.
      
      Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3:
      
      "A local APIC unit indicates successful dispatch of an IPI by
      resetting the Delivery Status bit in the Interrupt Command
      Register (ICR). The operating system polls the delivery status
      bit after sending an INIT or STARTUP IPI until the command has
      been dispatched.
      
      A period of 20 microseconds should be sufficient for IPI dispatch
      to complete under normal operating conditions. If the IPI is not
      successfully dispatched, the operating system can abort the
      command. Alternatively, the operating system can retry the IPI by
      writing the lower 32-bit double word of the ICR. This “time-out”
      mechanism can be implemented through an external interrupt, if
      interrupts are enabled on the processor, or through execution of
      an instruction or time-stamp counter spin loop."
      
      Intel's documentation suggests the implementation of a time-out
      mechanism, which, by the way, is already being open-coded in some parts
      of the kernel that tinker with ICR.
      
      Create a apic_wait_icr_idle replacement that implements the time-out
      mechanism and that can be used to solve the aforementioned problem.
      
      AK: moved both functions out of line
      AK: Added improved loop from Keith Owens
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      8339e9fb
    • B
      [PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when suspending · 3ebad590
      Bernhard Kaindl 提交于
      Note: This patch didn'nt need an update since it's initial post.
      
      Some BIOSes may modify fixed-range MTRRs in SMM, e.g. when they
      transition the system into ACPI mode, which is entered thru an SMI,
      triggered by Linux in acpi_enable().
      
      SMIs which cause that Linux is interrupted and BIOS code is
      executed (which may change e.g. fixed-range MTRRs) in SMM may
      be raised by an embedded system controller which is often found
      in notebooks also at other occasions.
      
      If we would not update our copy of the fixed-range MTRRs before
      suspending to RAM or to disk, restore_processor_state() would
      set the fixed-range MTRRs of the BSP using old backup values
      which may be outdated and this could cause the system to fail
      later during resume.
      
      This patch ensures that our copy of the fixed-range MTRRs
      is updated when saving the boot processor state on suspend
      to disk and suspend to RAM.
      
      In combination with other patches this allows to fix s2ram
      and s2disk on the Acer Ferrari 1000 notebook and at least
      s2disk on the Acer Ferrari 5000 notebook.
      Signed-off-by: NBernhard Kaindl <bk@suse.de>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      3ebad590
    • B
      [PATCH] x86: Save the MTRRs of the BSP before booting an AP · 2b1f6278
      Bernhard Kaindl 提交于
      Applied fix by Andew Morton:
      http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'.
      
      AMD and Intel x86 CPU manuals state that it is the responsibility of
      system software to initialize and maintain MTRR consistency across
      all processors in Multi-Processing Environments.
      
      Quote from page 188 of the AMD64 System Programming manual (Volume 2):
      
      7.6.5 MTRRs in Multi-Processing Environments
      
      "In multi-processing environments, the MTRRs located in all processors must
      characterize memory in the same way. Generally, this means that identical
      values are written to the MTRRs used by the processors." (short omission here)
      "Failure to do so may result in coherency violations or loss of atomicity.
      Processor implementations do not check the MTRR settings in other processors
      to ensure consistency. It is the responsibility of system software to
      initialize and maintain MTRR consistency across all processors."
      
      Current Linux MTRR code already implements the above in the case that the
      BIOS does not properly initialize MTRRs on the secondary processors,
      but the case where the fixed-range MTRRs of the boot processor are changed
      after Linux started to boot, before the initialsation of a secondary
      processor, is not handled yet.
      
      In this case, secondary processors are currently initialized by Linux
      with MTRRs which the boot processor had very early, when mtrr_bp_init()
      did run, but not with the MTRRs which the boot processor uses at the
      time when that secondary processors is actually booted,
      causing differing MTRR contents on the secondary processors.
      
      Such situation happens on Acer Ferrari 1000 and 5000 notebooks where the
      BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs
      of the boot processor when it transitions the system into ACPI mode.
      The SMI handler of the BIOS does this in SMM, entered while Linux ACPI
      code runs acpi_enable().
      
      Other occasions where the SMI handler of the BIOS may change bits in
      the MTRRs could occur as well. To initialize newly booted secodary
      processors with the fixed-range MTRRs which the boot processor uses
      at that time, this patch saves the fixed-range MTRRs of the boot
      processor before new secondary processors are started. When the
      secondary processors run their Linux initialisation code, their
      fixed-range MTRRs will be updated with the saved fixed-range MTRRs.
      
      If CONFIG_MTRR is not set, we define mtrr_save_state
      as an empty statement because there is nothing to do.
      
      Possible TODOs:
      
      *) CPU-hotplugging outside of SMP suspend/resume is not yet tested
         with this patch.
      
      *) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up,
         then the calls to mtrr_save_state() could be replaced by calls to
         mtrr_save_fixed_ranges(NULL) and  mtrr_save_state() would not be
         needed.
      
         That would need either verification of the CPU-hotplug code or
         at least a test on a >2 CPU machine.
      
      *) The MTRRs of other running processors are not yet checked at this
         time but it might be interesting to syncronize the MTTRs of all
         processors before booting. That would be an incremental patch,
         but of rather low priority since there is no machine known so
         far which would require this.
      
      AK: moved prototypes on x86-64 around to fix warnings
      Signed-off-by: NBernhard Kaindl <bk@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      2b1f6278
    • J
      [PATCH] x86: update for i386 and x86-64 check_bugs · 57decbda
      Jeremy Fitzhardinge 提交于
      Remove spurious comments, headers and keywords from x86-64 bugs.[ch].
      
      Use identify_boot_cpu()
      
      AK: merged with other patch
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      57decbda
    • J
      [PATCH] x86: deflate stack usage in lib/inflate.c · 35c74226
      Jeremy Fitzhardinge 提交于
      inflate_fixed and huft_build together use around 2.7k of stack.  When
      using 4k stacks, I saw stack overflows from interrupts arriving while
      unpacking the root initrd:
      
      do_IRQ: stack overflow: 384
       [<c0106b64>] show_trace_log_lvl+0x1a/0x30
       [<c01075e6>] show_trace+0x12/0x14
       [<c010763f>] dump_stack+0x16/0x18
       [<c0107ca4>] do_IRQ+0x6d/0xd9
       [<c010202b>] xen_evtchn_do_upcall+0x6e/0xa2
       [<c0106781>] xen_hypervisor_callback+0x25/0x2c
       [<c010116c>] xen_restore_fl+0x27/0x29
       [<c0330f63>] _spin_unlock_irqrestore+0x4a/0x50
       [<c0117aab>] change_page_attr+0x577/0x584
       [<c0117b45>] kernel_map_pages+0x8d/0xb4
       [<c016a314>] cache_alloc_refill+0x53f/0x632
       [<c016a6c2>] __kmalloc+0xc1/0x10d
       [<c0463d34>] malloc+0x10/0x12
       [<c04641c1>] huft_build+0x2a7/0x5fa
       [<c04645a5>] inflate_fixed+0x91/0x136
       [<c04657e2>] unpack_to_rootfs+0x5f2/0x8c1
       [<c0465acf>] populate_rootfs+0x1e/0xe4
      
      (This was under Xen, but there's no reason it couldn't happen on bare
        hardware.)
      
      This patch mallocs the local variables, thereby reducing the stack
      usage to sane levels.
      
      Also, up the heap size for the kernel decompressor to deal with the
      extra allocation.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Tim Yamin <plasmaroo@gentoo.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      35c74226
    • J
      [PATCH] x86-64: x86-64 system crashes when no memory populating Node 0 · 82d1bb72
      James Puthukattukaran 提交于
      I have a 4 socket AMD Operton system. The 2.6.18 kernel I have crashes
      when there is no memory in node0.
      
      AK: changed call to _nopanic
      
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      82d1bb72
    • G
      [PATCH] x86-64: Fix x86_64 compilation with DEBUG_SIG on · 1c3d99c1
      Glauber de Oliveira Costa 提交于
      Setting the DEBUG_SIG flag breaks compilation due to a wrong
      struct access. Aditionally, it raises two warnings. This is one
      patch to fix them all.
      Signed-off-by: NGlauber de Oliveira Costa <gcosta@redhat.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      1c3d99c1
    • J
      [PATCH] x86: Allow percpu variables to be page-aligned · b6e3590f
      Jeremy Fitzhardinge 提交于
      Let's allow page-alignment in general for per-cpu data (wanted by Xen, and
      Ingo suggested KVM as well).
      
      Because larger alignments can use more room, we increase the max per-cpu
      memory to 64k rather than 32k: it's getting a little tight.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      b6e3590f
    • A
      [PATCH] x86: Don't use MWAIT on AMD Family 10 · f039b754
      Andi Kleen 提交于
      It doesn't put the CPU into deeper sleep states, so it's better to use the standard
      idle loop to save power. But allow to reenable it anyways for benchmarking.
      
      I also removed the obsolete idle=halt on i386
      
      Cc: andreas.herrmann@amd.com
      Signed-off-by: NAndi Kleen <ak@suse.de>
      f039b754
    • J
      [PATCH] x86-64: Clean up asm-x86_64/bugs.h · c169859d
      Jeremy Fitzhardinge 提交于
      Most of asm-x86_64/bugs.h is code which should be in a C file, so put it there.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      c169859d
    • J
      [PATCH] x86: fix amd64-agp aperture validation · b92e9fac
      Jan Beulich 提交于
      Under CONFIG_DISCONTIGMEM, assuming that a !pfn_valid() implies all
      subsequent pfn-s are also invalid is wrong. Thus replace this by
      explicitly checking against the E820 map.
      
      AK: make e820 on x86-64 not initdata
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Acked-by: NMark Langsdorf <mark.langsdorf@amd.com>
      b92e9fac
    • B
      [PATCH] x86-64: Fix "Section mismatch" compile warning · 141f9cfe
      Bernhard Walle 提交于
      Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c
      Signed-off-by: NBernhard Walle <bwalle@suse.de>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      141f9cfe
    • J
      [PATCH] x86-64: adjust EDID retrieval · dd4ecfc2
      Jan Beulich 提交于
      commit 5e518d76 did the same change to
      i386's variant.
      
      With this change, i386's and x86-64's versions are identical, raising
      the question whether the x86-64 one should go (just like there's only
      one instance of edd.S).
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      dd4ecfc2
    • K
      [PATCH] x86-64: Inhibit machine from asserting an NMI when doing Alt-SysRq-M operation. · ae32b129
      Konrad Rzeszutek 提交于
      This patch touches the NMI watchdog every MAX_ORDER_NR_PAGES
      to inhibit the machine from triggering an NMI while the CPUs
      are locked. This situation is happening on boxes with more
      than 64CPUs and 128GB of RAM when Alt-SysRq-m is performed.
      
      It has been succesfully tested for regression on uni, 2, 4, 8
      32, and 64 CPU boxes with various memory configuration.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      ae32b129