1. 23 Jan, 2017 13 commits
  2. 10 Jan, 2017 4 commits
  3. 06 Jan, 2017 1 commit
  4. 05 Jan, 2017 1 commit
  5. 27 Dec, 2016 1 commit
  6. 25 Dec, 2016 2 commits
  7. 21 Dec, 2016 1 commit
  8. 20 Dec, 2016 1 commit
    • x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic · 59107e2f
      Vitaly Kuznetsov committed
      There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
      which injects an NMI into the guest. We may want to crash the guest and do
      kdump on this NMI by enabling unknown_nmi_panic. To make kdump succeed we
      need to allow the kdump kernel to re-establish the VMBus connection so it
      will see VMBus devices (storage, network, ...).
      
      To properly unload VMBus making it possible to start over during kdump we
      need to do the following:
      
       - Send an 'unload' message to the hypervisor. This can be done on any CPU
         so we do this on the crashing CPU.
      
       - Receive the 'unload finished' reply message. WS2012R2 delivers this
         message to the CPU which was used to establish VMBus connection during
         module load and this CPU may differ from the CPU sending 'unload'.
      
      Receiving a VMBus message means the following:
      
       - There is a per-CPU slot in memory for one message. This slot can in
         theory be accessed by any CPU.
      
       - We get an interrupt on the CPU when a message was placed into the slot.
      
       - When we read the message we need to clear the slot and signal the fact
         to the hypervisor. In case there are more messages to this CPU pending
         the hypervisor will deliver the next message. The signaling is done by
         writing to an MSR so this can only be done on the appropriate CPU.
      
      To avoid doing cross-CPU work on crash we have the vmbus_wait_for_unload()
      function, which checks the message slots of all CPUs in a loop while waiting
      for the 'unload finished' message. However, there is an issue which arises
      when these conditions are met:
      
       - We're crashing on a CPU which is different from the one which was used
         to initially contact the hypervisor.
      
       - The CPU which was used for the initial contact is blocked with interrupts
         disabled and there is a message pending in the message slot.
      
      In this case we won't be able to read the 'unload finished' message on the
      crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
      simultaneously: the first CPU entering panic() will proceed to crash and
      all other CPUs will stop themselves with interrupts disabled.
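
The failure mode described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's vmbus_wait_for_unload(): the types, the names (cpu_slot, signal_end_of_message, scan_for_unload) and the single "pending" field standing in for the hypervisor's queue are all hypothetical. The one property taken from the text is that any CPU can read and clear a slot in memory, while only the owning CPU can signal the hypervisor to deliver the next message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message types for this model only. */
enum msg_type { MSG_NONE = 0, MSG_OTHER, MSG_UNLOAD_RESPONSE };

/* One per-CPU slot plus one message the hypervisor is still holding back. */
struct cpu_slot {
    enum msg_type slot;    /* message currently visible in memory */
    enum msg_type pending; /* delivered only after end-of-message is signaled */
};

/* Stands in for the MSR write, which only the owning CPU can perform. */
static void signal_end_of_message(struct cpu_slot *s)
{
    s->slot = s->pending;
    s->pending = MSG_NONE;
}

/*
 * Poll all slots from CPU 'self' for up to 'rounds' passes.
 * Returns the index of the CPU whose slot held the unload response,
 * or -1 if it never became visible.
 */
static int scan_for_unload(struct cpu_slot *cpus, size_t n, size_t self,
                           int rounds)
{
    assert(self < n);
    for (int r = 0; r < rounds; r++) {
        for (size_t cpu = 0; cpu < n; cpu++) {
            enum msg_type t = cpus[cpu].slot;

            if (t == MSG_NONE)
                continue;
            cpus[cpu].slot = MSG_NONE;      /* clearing memory works anywhere */
            if (cpu == self)
                signal_end_of_message(&cpus[cpu]); /* only on our own CPU */
            if (t == MSG_UNLOAD_RESPONSE)
                return (int)cpu;
        }
    }
    return -1;
}
```

In this model, a response already sitting in another CPU's slot is found, but a response queued behind an unrelated message on a CPU that is not the scanning one never becomes visible, which mirrors the stuck case in the commit message.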
      
      The suggested solution is to handle unknown NMIs for Hyper-V guests only on
      the first CPU which gets them. This will allow us to rely on the VMBus
      interrupt handler being able to receive the 'unload finished' message in
      case it is delivered to a different CPU.
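
The "first CPU only" idea can be sketched in userspace C11 with a compare-and-exchange on a shared owner variable. This is a minimal sketch, assuming a handler that returns one value to let the generic unknown-NMI path (and thus the panic) proceed and another to swallow the NMI; the names on_unknown_nmi, nmi_owner, panic_on_unknown_nmi, NMI_PASS_ON and NMI_SWALLOW are hypothetical and are not taken from the patch itself.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical return codes for this sketch. */
enum { NMI_PASS_ON = 0, NMI_SWALLOW = 1 };

/* -1 means no CPU has claimed the unknown NMI yet. */
static atomic_int nmi_owner = -1;

/* Stands in for the unknown_nmi_panic setting. */
static int panic_on_unknown_nmi = 1;

static int on_unknown_nmi(int cpu)
{
    int expected = -1;

    assert(cpu >= 0);

    /* Without panic-on-NMI there is nothing to serialize. */
    if (!panic_on_unknown_nmi)
        return NMI_PASS_ON;

    /* Exactly one caller wins the exchange and goes on to crash. */
    if (atomic_compare_exchange_strong(&nmi_owner, &expected, cpu))
        return NMI_PASS_ON;

    /* Everyone else returns with interrupts still usable afterwards. */
    return NMI_SWALLOW;
}
```

Because the losing CPUs simply return instead of entering the panic path, they are not parked with interrupts disabled, so in this model the CPU that owns the VMBus connection can still take the interrupt for the reply.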
      
      The issue is not reproducible on WS2016, as Debug-VM delivers the NMI to
      the boot CPU only; WS2012R2 and earlier Hyper-V versions are affected.

      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: K. Y. Srinivasan <kys@microsoft.com>
      Cc: devel@linuxdriverproject.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 19 Dec, 2016 6 commits
  10. 13 Dec, 2016 1 commit
    • x86/smpboot: Make logical package management more robust · 9d85eb91
      Thomas Gleixner committed
      The logical package management has several issues:
      
       - The APIC ids provided by ACPI are not required to be the same as the
         initial APIC id which can be retrieved by CPUID. The APIC ids provided
         by ACPI are those which are written by the BIOS into the APIC. The
         initial id is set by hardware and can not be changed. The hardware
         provided ids contain the real hardware package information.
      
         AMD in particular sets the effective APIC id to a value different from
         the hardware id, as it needs to reserve space for the IOAPIC ids
         starting at id 0.

         As a consequence those machines trigger the currently active firmware
         bug printouts in dmesg. These are obviously wrong.
      
       - Virtual machines have their own interesting ways of enumerating APICs
         and packages which are not reliably covered by the current
         implementation.
      
      The sizing of the mapping array has been tweaked to be generously large to
      handle systems which provide a wrong core count when HT is disabled, so the
      whole magic which checks for space in the physical hotplug case is no
      longer needed.
      
      Simplify the whole machinery and do the mapping when the CPU starts and the
      CPUID derived physical package information is available. This solves the
      observed problems on AMD machines and works for the virtualization issues
      as well.
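
As an illustration of mapping at CPU start, the following userspace sketch assigns a dense logical package id the first time a physical package id is seen. It is a sketch under stated assumptions, not the kernel implementation: the array size MAX_LOGICAL_PKGS and the names logical_to_phys and logical_package_id are hypothetical, and the physical id is simply passed in rather than derived from CPUID.

```c
#include <assert.h>

/* Hypothetical upper bound, chosen generously for this sketch. */
#define MAX_LOGICAL_PKGS 16

/* logical_to_phys[i] holds the physical package id mapped to logical id i. */
static int logical_to_phys[MAX_LOGICAL_PKGS];
static int nr_logical_pkgs;

/*
 * Called when a CPU comes up and its physical package id is known.
 * Returns the existing logical id for that package, allocates the next
 * free one on first sight, or returns -1 if the table is full.
 */
static int logical_package_id(int phys_pkg)
{
    assert(phys_pkg >= 0);

    for (int i = 0; i < nr_logical_pkgs; i++) {
        if (logical_to_phys[i] == phys_pkg)
            return i;
    }

    if (nr_logical_pkgs == MAX_LOGICAL_PKGS)
        return -1;

    logical_to_phys[nr_logical_pkgs] = phys_pkg;
    return nr_logical_pkgs++;
}
```

Searching by physical id rather than indexing with it keeps the sketch independent of how sparse or how large the hardware-provided ids are, which is the kind of irregular enumeration the commit message attributes to some firmware and virtual machines.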
      
      Remove the extra call from the XEN cpu bringup code as it is no longer
      required.
      
      Fixes: d49597fd ("x86/cpu: Deal with broken firmware (VMWare/XEN)")
      Reported-and-tested-by: Borislav Petkov <bp@suse.de>
      Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: M. Vefa Bicakci <m.v.b@runbox.com>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Cc: Charles (Chas) Williams <ciwillia@brocade.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  11. 10 Dec, 2016 3 commits
  12. 09 Dec, 2016 1 commit
  13. 02 Dec, 2016 1 commit
  14. 28 Nov, 2016 2 commits
  15. 23 Nov, 2016 2 commits