1. 23 Sep, 2008 2 commits
    • x86, NMI watchdog: setup before enabling NMI watchdog · b3e15bde
      Committed by Aristeu Rozanski
      There's a small window while the NMI watchdog is being set up in which,
      if any NMI is triggered, the NMI code will make use of uninitialized
      wd_ops elements:
      	void setup_apic_nmi_watchdog(void *unused)
      	{
      		if (__get_cpu_var(wd_enabled))
      			return;
      
      		/* cheap hack to support suspend/resume */
      		/* if cpu0 is not active neither should the other cpus */
      		if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
      			return;
      
      		switch (nmi_watchdog) {
      		case NMI_LOCAL_APIC:
      			/* enable it before to avoid race with handler */
      -->			__get_cpu_var(wd_enabled) = 1;
      -->			if (lapic_watchdog_init(nmi_hz) < 0) {
      (...)
      	asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
      	{
      	(...)
      			if (nmi_watchdog_tick(regs, reason))
      				return;
      (...)
      	notrace __kprobes int
      	nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
      	{
      	(...)
      		if (!__get_cpu_var(wd_enabled))
      			return rc;
      		switch (nmi_watchdog) {
      		case NMI_LOCAL_APIC:
      			rc |= lapic_wd_event(nmi_hz);
      (...)
      int lapic_wd_event(unsigned nmi_hz)
      {
      	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
      	u64 ctr;
      
      -->	rdmsrl(wd->perfctr_msr, ctr);
      
      and wd->*_msr is initialized in each processor-type-specific setup routine,
      after NMIs have been enabled for PMIs. Since the counter was just set, the
      chance of a performance-counter-generated NMI is minimal, but any other
      unknown NMI would trigger the problem. This patch fixes the problem by
      setting everything up before enabling performance-counter-generated NMIs
      and by setting wd_enabled from a callback function.
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Prarit Bhargava <prarit@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, NMI watchdog: when booting with reset_devices, clear the performance counters · 28b166a7
      Committed by Aristeu Rozanski
      P4s have a quirk that makes it necessary to clear the P4_CCCR_OVF bit in
      the CCCR every time the PMI is triggered. When booting the kernel with
      reset_devices (more specifically, the kdump case), the counters reach zero
      and the PMI is generated. This is not a problem on other processors, but
      on P4s it will keep generating NMIs until that bit is cleared. Since there
      may be other users of the performance counters, clear and disable all of
      them when booting with the reset_devices option.
      
      We have a P4 box here that crashes because of this problem. Since the kdump
      kernel usually boots with only one processor active, the second logical
      unit won't be set up; therefore MSR_P4_IQ_CCCR1 (and other performance
      counter registers) won't be cleared, and P4_CCCR_OVF may still be set
      because the previous kernel was using this register. An NMI is triggered
      because of MSR_P4_IQ_CCCR1 right after NMI delivery is enabled, triggering
      the race fixed in my previous email.
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Prarit Bhargava <prarit@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
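
The "clear and disable all of them" step from this commit message might look roughly like the following user-space sketch. It operates on a plain array that stands in for the CCCR registers rather than on real MSRs; `clear_counters` is a hypothetical helper, and the `CCCR_ENABLE` and `CCCR_OVF` bit positions are assumptions made for illustration, not values taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define CCCR_ENABLE (1u << 12)
#define CCCR_OVF    (1u << 31)

/*
 * Clear the enable and overflow bits in every simulated CCCR when
 * reset_devices is set. Returns the number of registers that had at
 * least one of those bits set. Other bits are left untouched.
 */
static size_t clear_counters(uint32_t *cccr, size_t n, int reset_devices)
{
    assert(cccr != NULL);
    size_t touched = 0;

    if (!reset_devices)
        return 0;   /* normal boot: leave the registers alone */

    for (size_t i = 0; i < n; i++) {
        if (cccr[i] & (CCCR_ENABLE | CCCR_OVF))
            touched++;
        cccr[i] &= ~(CCCR_ENABLE | CCCR_OVF);
    }
    return touched;
}
```

Walking every register, rather than only the one the watchdog uses, mirrors the reasoning in the message: a register left over from the previous kernel (such as the one belonging to an inactive logical unit) can still have its overflow bit set.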
  2. 22 Sep, 2008 7 commits
  3. 21 Sep, 2008 4 commits
  4. 20 Sep, 2008 27 commits