1. 19 12月, 2007 9 次提交
    • I
      x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" · 4aae0702
      Ingo Molnar 提交于
      this is the tale of a full day spent debugging an ancient but elusive bug.
      
      after booting up thousands of random .config kernels, i finally happened
      to generate a .config that produced the following rare bootup failure
      on 32-bit x86:
      
      | ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
      | ..MP-BIOS bug: 8254 timer not connected to IO-APIC
      | ...trying to set up timer (IRQ0) through the 8259A ...  failed.
      | ...trying to set up timer as Virtual Wire IRQ... failed.
      | ...trying to set up timer as ExtINT IRQ... failed :(.
      | Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug
      | and send a report.  Then try booting with the 'noapic' option
      
      this bug has been reported many times during the years, but it was never
      reproduced nor fixed.
      
      the bug that i hit was extremely sensitive to .config details.
      
      First i did a .config-bisection - suspecting some .config detail.
      That led to CONFIG_X86_MCE: enabling X86_MCE magically made the bug disappear
      and the system would boot up just fine.
      
      Debugging my way through the MCE code ended up identifying two unlikely
      candidates: the thing that made a real difference to the hang was that
      X86_MCE did two printks:
      
       Intel machine check architecture supported.
       Intel machine check reporting enabled on CPU#1.
      
      Adding the same printks to a !CONFIG_X86_MCE kernel made the bug go away!
      
      this left timing as the main suspect: i experimented with adding various
      udelay()s to the arch/x86/kernel/io_apic_32.c:check_timer() function, and
      the race window turned out to be narrower than 30 microseconds (!).
      
      That made debugging especially funny, debugging without having printk
      ability before the bug hits is ... interesting ;-)
      
      eventually i started suspecting IRQ activities - those are pretty much the
      only thing that happen this early during bootup and have the timescale of
      a few dozen microseconds. Also, check_timer() changes the IRQ hardware
      in various creative ways, so the main candidate became IRQ0 interaction.
      
      i've added a counter to track timer irqs (on which core they arrived, at
      what exact time, etc.) and found that no timer IRQ would arrive after the
      bug condition hits - even if we re-enable IRQ0 and re-initialize the i8259A,
      but that we'd get a small number of timer irqs right around the time when we
      call the check_timer() function.
      
      Eventually i got the following backtrace triggered from debug code in the
      timer interrupt:
      
      ...trying to set up timer as Virtual Wire IRQ... failed.
      ...trying to set up timer as ExtINT IRQ...
      Pid: 1, comm: swapper Not tainted (2.6.24-rc5 #57)
      EIP: 0060:[<c044d57e>] EFLAGS: 00000246 CPU: 0
      EIP is at _spin_unlock_irqrestore+0x5/0x1c
      EAX: c0634178 EBX: 00000000 ECX: c4947d63 EDX: 00000246
      ESI: 00000002 EDI: 00010031 EBP: c04e0f2e ESP: f7c41df4
       DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
       CR0: 8005003b CR2: ffe04000 CR3: 00630000 CR4: 000006d0
       DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
       DR6: ffff0ff0 DR7: 00000400
        [<c05f5784>] setup_IO_APIC+0x9c3/0xc5c
      
      the spin_unlock() was called from init_8259A(). Wait ... we have an IRQ0
      entry while we are in the middle of setting up the local APIC, the i8259A
      and the PIT??
      
      That is certainly not how it's supposed to work! check_timer() was supposed
      to be called with irqs turned off - but this eroded away sometime in the
      past. This code would still work most of the time because this code runs
      very quickly, but just the right timing conditions are present and IRQ0
      hits in this small, ~30 usecs window, timer irqs stop and the system does
      not boot up. Also, given how early this is during bootup, the hang is
      very deterministic - but it would only occur on certain machines (and
      certain configs).
      
      The fix was quite simple: disable/restore interrupts properly in this
      function. With that in place the test-system now boots up just fine.
      
      (64-bit x86 io_apic_64.c had the same bug.)
      
      Phew! One down, only 1500 other kernel bugs are left ;-)
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      4aae0702
    • S
      genirq: revert lazy irq disable for simple irqs · 971e5b35
      Steven Rostedt 提交于
      In commit 76d21601 lazy irq disabling
      was implemented, and the simple irq handler had a masking set to it.
      
      Remy Bohmer discovered that some devices in the ARM architecture
      would trigger the mask, but never unmask it. His patch to do the
      unmasking was questioned by Russell King about masking simple irqs
      to begin with. Looking further, it was discovered that the problems
      Remy was seeing was due to improper use of the simple handler by
      devices, and he later submitted patches to fix those. But the issue
      that was uncovered was that the simple handler should never mask.
      
      This patch reverts the masking in the simple handler.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      971e5b35
    • J
      x86: also define AT_VECTOR_SIZE_ARCH · 213fde71
      Jan Beulich 提交于
      The patch introducing this left out 64-bit x86 despite it also having
      extra entries.
      
      this solves Xen guest troubles.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      213fde71
    • M
      x86: kprobes bugfix · 0b0122fa
      Masami Hiramatsu 提交于
      Kprobes for x86-64 may cause a kernel crash if it inserted on "iret"
      instruction. "call absolute" is invalid on x86-64, so we don't need
      treat it.
      
       - Change the processing order as same as x86-32.
       - Add "iret"(0xcf) case.
       - Remove next_rip local variable.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      0b0122fa
    • M
      x86: jprobe bugfix · 29b6cd79
      Masami Hiramatsu 提交于
      jprobe for x86-64 may cause kernel page fault when the jprobe_return()
      is called from incorrect function.
      
      - Use jprobe_saved_regs instead getting it from stack.
        (Especially on x86-64, it may get incorrect data, because
         pt_regs can not be get by using container_of(rsp))
      - Change the type of stack pointer to unsigned long *.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      29b6cd79
    • A
      timer: kernel/timer.c section fixes · b4be6258
      Adrian Bunk 提交于
      This patch fixes the following section mismatches with CONFIG_HOTPLUG=n,
      CONFIG_HOTPLUG_CPU=y:
      
      ...
      WARNING: vmlinux.o(.text+0x41cd3): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq')
      WARNING: vmlinux.o(.text+0x41d67): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq')
      ...
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b4be6258
    • K
      genirq: add unlocked version of set_irq_handler() · b019e573
      Kevin Hilman 提交于
      Add unlocked version for use by irq_chip.set_type handlers which may
      wish to change handler to level or edge handler when IRQ type is
      changed.
      
      The normal set_irq_handler() call cannot be used because it tries to
      take irq_desc.lock which is already held when the irq_chip.set_type
      hook is called.
      Signed-off-by: NKevin Hilman <khilman@mvista.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b019e573
    • T
      clockevents: fix reprogramming decision in oneshot broadcast · cdc6f27d
      Thomas Gleixner 提交于
      Resolve the following regression of a choppy, almost unusable laptop:
      
       http://lkml.org/lkml/2007/12/7/299
       http://bugzilla.kernel.org/show_bug.cgi?id=9525
      
      A previous version of the code did the reprogramming of the broadcast
      device in the return from idle code. This was removed, but the logic in
      tick_handle_oneshot_broadcast() was kept the same.
      
      When a broadcast interrupt happens we signal the expiry to all CPUs
      which have an expired event. If none of the CPUs has an expired event,
      which can happen in dyntick mode, then we reprogram the broadcast
      device. We do not reprogram otherwise, but this is only correct if all
      CPUs, which are in the idle broadcast state have been woken up.
      
      The code ignores, that there might be pending not yet expired events on
      other CPUs, which are in the idle broadcast state. So the delivery of
      those events can be delayed for quite a time.
      
      Change the tick_handle_oneshot_broadcast() function to check for CPUs,
      which are in broadcast state and are not woken up by the current event,
      and enforce the rearming of the broadcast device for those CPUs.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdc6f27d
    • B
      oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters · bd87f1f0
      Barry Kasindorf 提交于
      This patch is for controlling the upper 32bits of the event ctrl msrs.
      This includes the upper 4 bits of the event select and the Guest Only and
      Host Only bits
      
      This patch is necessary to make Event Based Profiling work reliably on a
      Family 10h processor
      
      [akpm@linux-foundation.org: checkpatch.pl fixes]
      Signed-off-by: NBarry Kasindorf <barry.kasindorf@amd.com>
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      bd87f1f0
  2. 18 12月, 2007 31 次提交