1. 30 11月, 2021 1 次提交
    • P
      KVM: x86: check PIR even for vCPUs with disabled APICv · 37c4dbf3
      Paolo Bonzini 提交于
      The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
      if APICv is disabled on the vCPU that receives it.  In that case, the
      interrupt will just cause a vmexit and leave the ON bit set together
      with the PIR bit corresponding to the interrupt.
      
      Right now, the interrupt would not be delivered until APICv is re-enabled.
      However, fixing this is just a matter of always doing the PIR->IRR
      synchronization, even if the vCPU has temporarily disabled APICv.
      
      This is not a problem for performance, or if anything it is an
      improvement.  First, in the common case where vcpu->arch.apicv_active is
      true, one fewer check has to be performed.  Second, static_call_cond will
      elide the function call if APICv is not present or disabled.  Finally,
      in the case for AMD hardware we can remove the sync_pir_to_irr callback:
      it is only needed for apic_has_interrupt_for_ppr, and that function
      already has a fallback for !APICv.
      
      Cc: stable@vger.kernel.org
      Co-developed-by: NSean Christopherson <seanjc@google.com>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      37c4dbf3
  2. 11 11月, 2021 2 次提交
  3. 19 10月, 2021 2 次提交
    • S
      KVM: x86: WARN if APIC HW/SW disable static keys are non-zero on unload · 9139a7a6
      Sean Christopherson 提交于
      WARN if the static keys used to track if any vCPU has disabled its APIC
      are left elevated at module exit.  Unlike the underflow case, nothing in
      the static key infrastructure will complain if a key is left elevated,
      and because an elevated key only affects performance, nothing in KVM will
      fail if either key is improperly incremented.
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20211013003554.47705-3-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9139a7a6
    • S
      Revert "KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET" · f7d8a19f
      Sean Christopherson 提交于
      Revert a change to open code bits of kvm_lapic_set_base() when emulating
      APIC RESET to fix an apic_hw_disabled underflow bug due to arch.apic_base
      and apic_hw_disabled being unsyncrhonized when the APIC is created.  If
      kvm_arch_vcpu_create() fails after creating the APIC, kvm_free_lapic()
      will see the initialized-to-zero vcpu->arch.apic_base and decrement
      apic_hw_disabled without KVM ever having incremented apic_hw_disabled.
      
      Using kvm_lapic_set_base() in kvm_lapic_reset() is also desirable for a
      potential future where KVM supports RESET outside of vCPU creation, in
      which case all the side effects of kvm_lapic_set_base() are needed, e.g.
      to handle the transition from x2APIC => xAPIC.
      
      Alternatively, KVM could temporarily increment apic_hw_disabled (and call
      kvm_lapic_set_base() at RESET), but that's a waste of cycles and would
      impact the performance of other vCPUs and VMs.  The other subtle side
      effect is that updating the xAPIC ID needs to be done at RESET regardless
      of whether the APIC was previously enabled, i.e. kvm_lapic_reset() needs
      an explicit call to kvm_apic_set_xapic_id() regardless of whether or not
      kvm_lapic_set_base() also performs the update.  That makes stuffing the
      enable bit at vCPU creation slightly more palatable, as doing so affects
      only the apic_hw_disabled key.
      
      Opportunistically tweak the comment to explicitly call out the connection
      between vcpu->arch.apic_base and apic_hw_disabled, and add a comment to
      call out the need to always do kvm_apic_set_xapic_id() at RESET.
      
      Underflow scenario:
      
        kvm_vm_ioctl() {
          kvm_vm_ioctl_create_vcpu() {
            kvm_arch_vcpu_create() {
              if (something_went_wrong)
                goto fail_free_lapic;
              /* vcpu->arch.apic_base is initialized when something_went_wrong is false. */
              kvm_vcpu_reset() {
                kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) {
                  vcpu->arch.apic_base = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
                }
              }
              return 0;
            fail_free_lapic:
              kvm_free_lapic() {
                /* vcpu->arch.apic_base is not yet initialized when something_went_wrong is true. */
                if (!(vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE))
                  static_branch_slow_dec_deferred(&apic_hw_disabled); // <= underflow bug.
              }
              return r;
            }
          }
        }
      
      This (mostly) reverts commit 42122123.
      
      Fixes: 42122123 ("KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU RESET")
      Reported-by: syzbot+9fc046ab2b0cf295a063@syzkaller.appspotmail.com
      Debugged-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20211013003554.47705-2-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f7d8a19f
  4. 02 8月, 2021 6 次提交
  5. 18 6月, 2021 2 次提交
  6. 10 6月, 2021 1 次提交
  7. 09 6月, 2021 1 次提交
  8. 27 5月, 2021 2 次提交
  9. 03 5月, 2021 1 次提交
  10. 26 4月, 2021 1 次提交
    • V
      KVM: x86: Properly handle APF vs disabled LAPIC situation · 2f15d027
      Vitaly Kuznetsov 提交于
      Async PF 'page ready' event may happen when LAPIC is (temporary) disabled.
      In particular, Sebastien reports that when Linux kernel is directly booted
      by Cloud Hypervisor, LAPIC is 'software disabled' when APF mechanism is
      initialized. On initialization KVM tries to inject 'wakeup all' event and
      puts the corresponding token to the slot. It is, however, failing to inject
      an interrupt (kvm_apic_set_irq() -> __apic_accept_irq() -> !apic_enabled())
      so the guest never gets notified and the whole APF mechanism gets stuck.
      The same issue is likely to happen if the guest temporary disables LAPIC
      and a previously unavailable page becomes available.
      
      Do two things to resolve the issue:
      - Avoid dequeuing 'page ready' events from APF queue when LAPIC is
        disabled.
      - Trigger an attempt to deliver pending 'page ready' events when LAPIC
        becomes enabled (SPIV or MSR_IA32_APICBASE).
      Reported-by: NSebastien Boeuf <sebastien.boeuf@intel.com>
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210422092948.568327-1-vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2f15d027
  11. 15 3月, 2021 1 次提交
    • S
      KVM: x86: Handle triple fault in L2 without killing L1 · cb6a32c2
      Sean Christopherson 提交于
      Synthesize a nested VM-Exit if L2 triggers an emulated triple fault
      instead of exiting to userspace, which likely will kill L1.  Any flow
      that does KVM_REQ_TRIPLE_FAULT is suspect, but the most common scenario
      for L2 killing L1 is if L0 (KVM) intercepts a contributory exception that
      is _not_intercepted by L1.  E.g. if KVM is intercepting #GPs for the
      VMware backdoor, a #GP that occurs in L2 while vectoring an injected #DF
      will cause KVM to emulate triple fault.
      
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20210302174515.2812275-2-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      cb6a32c2
  12. 13 3月, 2021 1 次提交
  13. 05 3月, 2021 1 次提交
  14. 09 2月, 2021 2 次提交
    • V
      KVM: x86: hyper-v: Prepare to meet unallocated Hyper-V context · f2bc14b6
      Vitaly Kuznetsov 提交于
      Currently, Hyper-V context is part of 'struct kvm_vcpu_arch' and is always
      available. As a preparation to allocating it dynamically, check that it is
      not NULL at call sites which can normally proceed without it i.e. the
      behavior is identical to the situation when Hyper-V emulation is not being
      used by the guest.
      
      When Hyper-V context for a particular vCPU is not allocated, we may still
      need to get 'vp_index' from there. E.g. in a hypothetical situation when
      Hyper-V emulation was enabled on one CPU and wasn't on another, Hyper-V
      style send-IPI hypercall may still be used. Luckily, vp_index is always
      initialized to kvm_vcpu_get_idx() and can only be changed when Hyper-V
      context is present. Introduce kvm_hv_get_vpindex() helper for
      simplification.
      
      No functional change intended.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210126134816.1880136-12-vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f2bc14b6
    • V
      KVM: x86: hyper-v: Rename vcpu_to_synic()/synic_to_vcpu() · e0121fa2
      Vitaly Kuznetsov 提交于
      vcpu_to_synic()'s argument is almost always 'vcpu' so there's no need to
      have an additional prefix. Also, as this is used outside of hyper-v
      emulation code, add '_hv_' part to make it clear what this s. This makes
      the naming more consistent with to_hv_vcpu().
      
      Rename synic_to_vcpu() to hv_synic_to_vcpu() for consistency.
      
      No functional change intended.
      Suggested-by: NSean Christopherson <seanjc@google.com>
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210126134816.1880136-6-vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e0121fa2
  15. 04 2月, 2021 2 次提交
    • J
      KVM: x86: use static calls to reduce kvm_x86_ops overhead · b3646477
      Jason Baron 提交于
      Convert kvm_x86_ops to use static calls. Note that all kvm_x86_ops are
      covered here except for 'pmu_ops and 'nested ops'.
      
      Here are some numbers running cpuid in a loop of 1 million calls averaged
      over 5 runs, measured in the vm (lower is better).
      
      Intel Xeon 3000MHz:
      
                 |default    |mitigations=off
      -------------------------------------
      vanilla    |.671s      |.486s
      static call|.573s(-15%)|.458s(-6%)
      
      AMD EPYC 2500MHz:
      
                 |default    |mitigations=off
      -------------------------------------
      vanilla    |.710s      |.609s
      static call|.664s(-6%) |.609s(0%)
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Signed-off-by: NJason Baron <jbaron@akamai.com>
      Message-Id: <e057bf1b8a7ad15652df6eeba3f907ae758d3399.1610680941.git.jbaron@akamai.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b3646477
    • C
      KVM: Stop using deprecated jump label APIs · 6e4e3b4d
      Cun Li 提交于
      The use of 'struct static_key' and 'static_key_false' is
      deprecated. Use the new API.
      Signed-off-by: NCun Li <cun.jia.li@gmail.com>
      Message-Id: <20210111152435.50275-1-cun.jia.li@gmail.com>
      [Make it compile.  While at it, rename kvm_no_apic_vcpu to
       kvm_has_noapic_vcpu; the former reads too much like "true if
       no vCPU has an APIC". - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6e4e3b4d
  16. 08 1月, 2021 2 次提交
    • T
      KVM: SVM: Add support for booting APs in an SEV-ES guest · 647daca2
      Tom Lendacky 提交于
      Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
      where the guest vCPU register state is updated and then the vCPU is VMRUN
      to begin execution of the AP. For an SEV-ES guest, this won't work because
      the guest register state is encrypted.
      
      Following the GHCB specification, the hypervisor must not alter the guest
      register state, so KVM must track an AP/vCPU boot. Should the guest want
      to park the AP, it must use the AP Reset Hold exit event in place of, for
      example, a HLT loop.
      
      First AP boot (first INIT-SIPI-SIPI sequence):
        Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
        support. It is up to the guest to transfer control of the AP to the
        proper location.
      
      Subsequent AP boot:
        KVM will expect to receive an AP Reset Hold exit event indicating that
        the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
        awaken it. When the AP Reset Hold exit event is received, KVM will place
        the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
        sequence, KVM will make the vCPU runnable. It is again up to the guest
        to then transfer control of the AP to the proper location.
      
        To differentiate between an actual HLT and an AP Reset Hold, a new MP
        state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
        placed in upon receiving the AP Reset Hold exit event. Additionally, to
        communicate the AP Reset Hold exit event up to userspace (if needed), a
        new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.
      
      A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
      to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
      original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
      a new function that, for non SEV-ES guests, invokes the original SIPI
      delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
      implements the logic above.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      647daca2
    • S
      KVM: x86: change in pv_eoi_get_pending() to make code more readable · de7860c8
      Stephen Zhang 提交于
      Signed-off-by: NStephen Zhang <stephenzhangzsd@gmail.com>
      Message-Id: <1608277897-1932-1-git-send-email-stephenzhangzsd@gmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      de7860c8
  17. 10 12月, 2020 1 次提交
  18. 27 11月, 2020 1 次提交
  19. 15 11月, 2020 1 次提交
    • P
      KVM: x86: fix apic_accept_events vs check_nested_events · 1c96dcce
      Paolo Bonzini 提交于
      vmx_apic_init_signal_blocked is buggy in that it returns true
      even in VMX non-root mode.  In non-root mode, however, INITs
      are not latched, they just cause a vmexit.  Previously,
      KVM was waiting for them to be processed when kvm_apic_accept_events
      and in the meanwhile it ate the SIPIs that the processor received.
      
      However, in order to implement the wait-for-SIPI activity state,
      KVM will have to process KVM_APIC_SIPI in vmx_check_nested_events,
      and it will not be possible anymore to disregard SIPIs in non-root
      mode as the code is currently doing.
      
      By calling kvm_x86_ops.nested_ops->check_events, we can force a vmexit
      (with the side-effect of latching INITs) before incorrectly injecting
      an INIT or SIPI in a guest, and therefore vmx_apic_init_signal_blocked
      can do the right thing.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      1c96dcce
  20. 28 9月, 2020 6 次提交
  21. 24 8月, 2020 1 次提交
  22. 31 7月, 2020 2 次提交