1. 29 Mar, 2018 3 commits
  2. 28 Mar, 2018 1 commit
    • KVM: x86: Fix perf timer mode IP reporting · dd60d217
      Committed by Andi Kleen
      KVM and perf have a special backdoor mechanism to report the IP for interrupts
      re-executed after vm exit. This works for the NMIs that perf normally uses.
      
      However, when perf is in timer mode it doesn't work because the timer interrupt
      doesn't get this special treatment. This is common when KVM is running
      nested in another hypervisor which may not implement the PMU, so only
      timer mode is available.
      
      Call the functions to set up the backdoor IP also for non-NMI interrupts.
      
      I renamed the functions to set up the backdoor IP reporting to be more
      appropriate for their new use.  The SVM change is only compile tested.
      
      v2: Moved the functions inline.
      For the normal interrupt case the before/after functions are now
      called from x86.c, not arch specific code.
      For the NMI case we still need to call it in the architecture
      specific code, because it's already needed in the low level *_run
      functions.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      [Removed unnecessary calls from arch handle_external_intr. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
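The mechanism described in this commit message can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct vcpu`, `current_vcpu`, `before_interrupt()`, `after_interrupt()`, `sample_ip()` and `handle_irq_after_exit()` are hypothetical names chosen for illustration and are not the kernel's actual API. The point it shows is that the hooks bracket the re-executed interrupt handler, so a sample taken inside the handler is attributed to the guest IP instead of the host IP.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; only the guest instruction pointer matters here. */
struct vcpu {
    unsigned long guest_ip;
};

/* Models the "which vCPU was interrupted" pointer that a profiler would consult. */
static struct vcpu *current_vcpu;

/* Hooks placed around the re-executed interrupt (illustrative names). */
static void before_interrupt(struct vcpu *v) { current_vcpu = v; }
static void after_interrupt(void) { current_vcpu = NULL; }

/* What a sampling callback would report: guest IP if a vCPU is marked, else host IP. */
static unsigned long sample_ip(unsigned long host_ip)
{
    return current_vcpu ? current_vcpu->guest_ip : host_ip;
}

/* Re-execute an interrupt after VM exit with the hooks around the handler. */
static unsigned long handle_irq_after_exit(struct vcpu *v, unsigned long host_ip)
{
    unsigned long ip;

    before_interrupt(v);
    ip = sample_ip(host_ip);   /* a timer-mode sample would be taken here */
    after_interrupt();
    return ip;
}
```

Without the hooks, `sample_ip()` in this model falls back to the host IP, which corresponds to the timer-mode behaviour the commit sets out to fix.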
  3. 24 Mar, 2018 1 commit
  4. 21 Mar, 2018 1 commit
    • KVM: nVMX: Do not load EOI-exitmap while running L2 · e40ff1d6
      Committed by Liran Alon
      When L1 IOAPIC redirection-table is written, a request of
      KVM_REQ_SCAN_IOAPIC is set on all vCPUs. This is done such that
      all vCPUs will now recalc their IOAPIC handled vectors and load
      it to their EOI-exitmap.
      
      However, it could be that one of the vCPUs is currently running
      L2. In this case, load_eoi_exitmap() will be called which would
      write to vmcs02->eoi_exit_bitmap, which is wrong because
      vmcs02->eoi_exit_bitmap should always be equal to
      vmcs12->eoi_exit_bitmap. Furthermore, at this point
      KVM_REQ_SCAN_IOAPIC was already consumed and therefore we will
      never update vmcs01->eoi_exit_bitmap. This could cause remote_irr
      of some IOAPIC level-triggered entry to remain set forever.
      
      Fix this issue by delaying the load of EOI-exitmap to when vCPU
      is running L1.
      
      One may wonder why the entire KVM_REQ_SCAN_IOAPIC processing is not
      simply delayed until the vCPU is running L1. The reason is to correctly
      handle the case where the LAPIC & IO-APIC of L1 are passed through into
      L2. In this case, vmcs12->virtual_interrupt_delivery should be 0. In
      the current nVMX implementation, that results in
      vmcs02->virtual_interrupt_delivery also being 0. Thus,
      vmcs02->eoi_exit_bitmap is not used. Therefore, every L2 EOI causes
      a #VMExit into L0 (either on MSR_WRITE to x2APIC MSR or
      APIC_ACCESS/APIC_WRITE/EPT_MISCONFIG to APIC MMIO page).
      In order for such an L2 EOI to be broadcast, if needed, from LAPIC
      to IO-APIC, vcpu->arch.ioapic_handled_vectors must be updated
      while L2 is running. Therefore, the patch delays only the
      loading of the EOI-exitmap but not the update of
      vcpu->arch.ioapic_handled_vectors.
      Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
      Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
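The deferral described in this commit message can be modelled in a few lines of plain C. This is a hedged sketch, not the kernel code: the struct fields and the functions `scan_ioapic()`, `load_eoi_exitmap()` and `nested_vmexit()` are simplified, hypothetical stand-ins, and the vector set is truncated to 64 bits for brevity. It shows the two halves of the fix: the handled-vector set is updated immediately even while L2 runs, while the write to the L1 bitmap is postponed until the vCPU is back in L1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical vCPU model. */
struct vcpu {
    bool in_l2;                   /* currently running the nested (L2) guest */
    bool load_pending;            /* EOI-exitmap load deferred until back in L1 */
    uint64_t handled_vectors;     /* models vcpu->arch.ioapic_handled_vectors */
    uint64_t vmcs01_eoi_exitmap;  /* models vmcs01->eoi_exit_bitmap */
};

/* Copy the handled vectors into the L1 bitmap; only valid while in L1. */
static void load_eoi_exitmap(struct vcpu *v)
{
    v->vmcs01_eoi_exitmap = v->handled_vectors;
    v->load_pending = false;
}

/* Models handling of a scan request after an IOAPIC redirection-table write. */
static void scan_ioapic(struct vcpu *v, uint64_t new_vectors)
{
    v->handled_vectors = new_vectors;  /* always updated right away */
    if (v->in_l2)
        v->load_pending = true;        /* do not touch the bitmap while in L2 */
    else
        load_eoi_exitmap(v);
}

/* Models the L2 -> L1 transition, where a deferred load is finally performed. */
static void nested_vmexit(struct vcpu *v)
{
    v->in_l2 = false;
    if (v->load_pending)
        load_eoi_exitmap(v);
}
```

In this model, a scan that arrives while `in_l2` is set leaves `vmcs01_eoi_exitmap` untouched until `nested_vmexit()` runs, yet `handled_vectors` is already current for any EOI that exits to L0 in the meantime.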
  5. 17 Mar, 2018 10 commits
  6. 07 Mar, 2018 4 commits
  7. 02 Mar, 2018 5 commits
  8. 24 Feb, 2018 2 commits
    • KVM/x86: remove WARN_ON() for when vm_munmap() fails · 103c763c
      Committed by Eric Biggers
      On x86, special KVM memslots such as the TSS region have anonymous
      memory mappings created on behalf of userspace, and these mappings are
      removed when the VM is destroyed.
      
      It is however possible for removing these mappings via vm_munmap() to
      fail.  This can most easily happen if the thread receives SIGKILL while
      it's waiting to acquire ->mmap_sem.   This triggers the 'WARN_ON(r < 0)'
      in __x86_set_memory_region().  syzkaller was able to hit this, using
      'exit()' to send the SIGKILL.  Note that while the vm_munmap() failure
      results in the mapping not being removed immediately, it is not leaked
      forever but rather will be freed when the process exits.
      
      It's not really possible to handle this failure properly, so almost
      every other caller of vm_munmap() doesn't check the return value.  It's
      a limitation of having the kernel manage these mappings rather than
      userspace.
      
      So just remove the WARN_ON() so that users can't spam the kernel log
      with this warning.
      
      Fixes: f0d648bd ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • KVM: x86: move LAPIC initialization after VMCS creation · 0b2e9904
      Committed by Paolo Bonzini
      The initial reset of the local APIC is performed before the VMCS has been
      created, but it tries to do a vmwrite:
      
       vmwrite error: reg 810 value 4a00 (err 18944)
       CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G        W I      4.16.0-0.rc2.git0.1.fc28.x86_64 #1
       Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
       Call Trace:
        vmx_set_rvi [kvm_intel]
        vmx_hwapic_irr_update [kvm_intel]
        kvm_lapic_reset [kvm]
        kvm_create_lapic [kvm]
        kvm_arch_vcpu_init [kvm]
        kvm_vcpu_init [kvm]
        vmx_create_vcpu [kvm_intel]
        kvm_vm_ioctl [kvm]
      
      Move it later, after the VMCS has been created.
      
      Fixes: 4191db26 ("KVM: x86: Update APICv on APIC reset")
      Cc: stable@vger.kernel.org
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
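The ordering problem in this commit message can be illustrated with a toy model. All names here (`struct vcpu`, `vmcs_write_rvi()`, `lapic_reset()`, `create_vmcs()` and the two `create_vcpu_*` functions) are hypothetical and only mirror the shape of the call trace above. The sketch counts writes that are attempted before the VMCS exists, which is the situation that produced the vmwrite error.

```c
#include <stdbool.h>

/* Hypothetical vCPU model used only to illustrate initialization order. */
struct vcpu {
    bool vmcs_ready;    /* models "the VMCS has been created" */
    int  write_errors;  /* counts writes attempted without a VMCS */
    int  rvi;           /* the field the reset path tries to write */
};

/* A write fails (and is counted) if no VMCS exists yet. */
static void vmcs_write_rvi(struct vcpu *v, int val)
{
    if (!v->vmcs_ready) {
        v->write_errors++;
        return;
    }
    v->rvi = val;
}

/* The APIC reset path ends in a VMCS write. */
static void lapic_reset(struct vcpu *v) { vmcs_write_rvi(v, 0); }
static void create_vmcs(struct vcpu *v) { v->vmcs_ready = true; }

/* Old order: reset first, so the write happens with no VMCS. */
static void create_vcpu_old_order(struct vcpu *v)
{
    lapic_reset(v);
    create_vmcs(v);
}

/* New order: create the VMCS first, then reset the APIC. */
static void create_vcpu_new_order(struct vcpu *v)
{
    create_vmcs(v);
    lapic_reset(v);
}
```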
  9. 04 Feb, 2018 2 commits
  10. 03 Feb, 2018 1 commit
  11. 01 Feb, 2018 2 commits
  12. 31 Jan, 2018 3 commits
    • x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n · 5fa4ec9c
      Committed by Thomas Gleixner
      The reenlightenment support for hyperv slapped a direct reference to
      x86_hyper_type into the kvm code which results in the following build
      failure when CONFIG_HYPERVISOR_GUEST=n:
      
      arch/x86/kvm/x86.c:6259:6: error: ‘x86_hyper_type’ undeclared (first use in this function)
      arch/x86/kvm/x86.c:6259:6: note: each undeclared identifier is reported only once for each function it appears in
      
      Use the proper helper function to cure that.
      
      The 32bit compile fails because of:
      
      arch/x86/kvm/x86.c:5936:13: warning: ‘kvm_hyperv_tsc_notifier’ defined but not used [-Wunused-function]
      
      which is a real trainwreck engineering artwork. The callsite is wrapped
      into #ifdef CONFIG_X86_64, but the function itself has the #ifdef inside
      the function body. Make the function itself wrapped into the ifdef to cure
      that.
      
      Qualiteee....
      
      Fixes: 0092e434 ("x86/kvm: Support Hyper-V reenlightenment")
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: devel@linuxdriverproject.org
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: Mohammed Gamal <mmorsy@redhat.com>
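The 32-bit fix described above comes down to where the `#ifdef` is placed. The following is a minimal sketch, assuming a hypothetical macro `SKETCH_64BIT` in place of CONFIG_X86_64 and hypothetical functions `tsc_notifier()` and `init_notifiers()`. When the guard wraps the whole function definition, a build without the macro has neither a call site nor a definition, so the compiler has no unused static function to warn about.

```c
/* Hypothetical stand-in for CONFIG_X86_64; left undefined to model a 32-bit build. */
/* #define SKETCH_64BIT 1 */

static int notifier_calls;

#ifdef SKETCH_64BIT
/*
 * The whole definition sits inside the guard, matching the guarded call site.
 * With the guard only inside the body, a 32-bit build would still define this
 * static function without ever calling it and trigger -Wunused-function.
 */
static void tsc_notifier(void)
{
    notifier_calls++;
}
#endif

/* The call site is guarded by the same macro as the definition. */
static int init_notifiers(void)
{
#ifdef SKETCH_64BIT
    tsc_notifier();
#endif
    return notifier_calls;
}
```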
    • x86/kvm: Support Hyper-V reenlightenment · 0092e434
      Committed by Vitaly Kuznetsov
      When running nested KVM on Hyper-V guests it's required to update
      masterclocks for all guests when L1 migrates to a host with a different
      TSC frequency.
      
      Implement the procedure in the following way:
        - Pause all guests.
        - Tell the host (Hyper-V) to stop emulating TSC accesses.
        - Update the gtod copy, recompute clocks.
        - Unpause all guests.
      
      This is somewhat similar to cpufreq but there are two important differences:
       - TSC emulation can only be disabled globally (on all CPUs)
       - The new TSC frequency is not known until emulation is turned off so
         there is no way to 'prepare' for the event upfront.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: devel@linuxdriverproject.org
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: Mohammed Gamal <mmorsy@redhat.com>
      Link: https://lkml.kernel.org/r/20180124132337.30138-8-vkuznets@redhat.com
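The four-step procedure in this commit message can be sketched as a small simulation. It rests on assumptions: `struct guest`, `struct host`, `stop_tsc_emulation()` and `reenlighten()` are hypothetical names, and the clock recomputation is reduced to storing the new frequency. The sketch keeps the two constraints the message calls out: emulation is turned off once, globally, and the new frequency is only read after that point.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest model: paused flag plus the frequency its clock was computed for. */
struct guest {
    bool paused;
    uint64_t tsc_khz;
};

/* Hypothetical host model: the real frequency is only usable once emulation is off. */
struct host {
    bool tsc_emulated;
    uint64_t real_tsc_khz;
};

/* Disable TSC emulation globally and return the frequency that is now visible. */
static uint64_t stop_tsc_emulation(struct host *h)
{
    h->tsc_emulated = false;
    return h->real_tsc_khz;
}

/* Pause all guests, stop emulation, recompute clocks, unpause all guests. */
static void reenlighten(struct host *h, struct guest *guests, size_t n)
{
    size_t i;
    uint64_t khz;

    for (i = 0; i < n; i++)
        guests[i].paused = true;

    khz = stop_tsc_emulation(h);  /* frequency is unknown before this point */

    for (i = 0; i < n; i++)
        guests[i].tsc_khz = khz;  /* stands in for "update gtod copy, recompute clocks" */

    for (i = 0; i < n; i++)
        guests[i].paused = false;
}
```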
    • x86/kvm: Pass stable clocksource to guests when running nested on Hyper-V · b0c39dc6
      Committed by Vitaly Kuznetsov
      Currently, KVM is able to work in 'masterclock' mode passing
      PVCLOCK_TSC_STABLE_BIT to guests when the clocksource which is used on the
      host is TSC.
      
      When running nested on Hyper-V the guest normally uses a different one: TSC
      page which is resistant to TSC frequency changes on events like L1
      migration. Add support for it in KVM.
      
      The only non-trivial change is in vgettsc(): when updating the gtod copy
      both the clock readout and tsc value have to be updated now.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: devel@linuxdriverproject.org
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: Mohammed Gamal <mmorsy@redhat.com>
      Link: https://lkml.kernel.org/r/20180124132337.30138-7-vkuznets@redhat.com
  13. 17 Jan, 2018 1 commit
  14. 16 Jan, 2018 4 commits