1. 08 December 2021, 23 commits
  2. 02 December 2021, 1 commit
  3. 30 November 2021, 1 commit
    • KVM: x86: check PIR even for vCPUs with disabled APICv · 37c4dbf3
      Committed by Paolo Bonzini
      The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
      if APICv is disabled on the vCPU that receives it.  In that case, the
      interrupt will just cause a vmexit and leave the ON bit set together
      with the PIR bit corresponding to the interrupt.
      
      Right now, the interrupt would not be delivered until APICv is re-enabled.
      However, fixing this is just a matter of always doing the PIR->IRR
      synchronization, even if the vCPU has temporarily disabled APICv.
      
      This is not a problem for performance; if anything, it is an
      improvement.  First, in the common case where vcpu->arch.apicv_active is
      true, one fewer check has to be performed.  Second, static_call_cond will
      elide the function call if APICv is not present or disabled.  Finally,
      on AMD hardware we can remove the sync_pir_to_irr callback:
      it is only needed for apic_has_interrupt_for_ppr, and that function
      already has a fallback for !APICv.
      
      Cc: stable@vger.kernel.org
      Co-developed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: David Matlack <dmatlack@google.com>
      Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      37c4dbf3
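
      A minimal, self-contained C model of the idea above (this is not the
      actual KVM code; the struct, fields and function names are stand-ins):
      always drain PIR into IRR through an optional callback, the way
      static_call_cond() lets KVM call sync_pir_to_irr unconditionally, instead
      of gating the sync on vcpu->arch.apicv_active.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      struct model_vcpu {
          bool apicv_active;                 /* may be temporarily false */
          uint64_t pir;                      /* posted-interrupt requests (model) */
          uint64_t irr;                      /* interrupt request register (model) */
          /* NULL when "APICv is not present"; static_call_cond would elide it. */
          void (*sync_pir_to_irr)(struct model_vcpu *vcpu);
      };

      static void vmx_sync_pir_to_irr(struct model_vcpu *vcpu)
      {
          vcpu->irr |= vcpu->pir;            /* drain PIR even if APICv is inhibited */
          vcpu->pir = 0;
      }

      static void sync_pir_to_irr_always(struct model_vcpu *vcpu)
      {
          if (vcpu->sync_pir_to_irr)         /* models static_call_cond() */
              vcpu->sync_pir_to_irr(vcpu);   /* no apicv_active check anymore */
      }

      int main(void)
      {
          struct model_vcpu vcpu = {
              .apicv_active = false,         /* APICv temporarily disabled */
              .pir = 1u << 5,                /* vector left pending by the IRTE */
              .sync_pir_to_irr = vmx_sync_pir_to_irr,
          };

          sync_pir_to_irr_always(&vcpu);
          printf("irr=%#llx pir=%#llx\n",
                 (unsigned long long)vcpu.irr, (unsigned long long)vcpu.pir);
          return 0;
      }
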
  4. 26 November 2021, 4 commits
  5. 18 November 2021, 3 commits
    • KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS · 2845e735
      Committed by Vitaly Kuznetsov
      It doesn't make sense to return the recommended maximum number of
      vCPUs which exceeds the maximum possible number of vCPUs.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211116163443.88707-7-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2845e735
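
      A hedged sketch of the capping described above (a standalone model, not
      the kernel code; the KVM_MAX_VCPUS value below is purely illustrative):

      #include <stdio.h>

      #define KVM_MAX_VCPUS 1024    /* illustrative; the real value is arch-specific */

      /* KVM_CAP_NR_VCPUS report: never recommend more than the hard maximum. */
      static int recommended_nr_vcpus(int num_online_cpus)
      {
          return num_online_cpus < KVM_MAX_VCPUS ? num_online_cpus : KVM_MAX_VCPUS;
      }

      int main(void)
      {
          printf("%d\n", recommended_nr_vcpus(8));     /* 8 */
          printf("%d\n", recommended_nr_vcpus(4096));  /* capped at 1024 */
          return 0;
      }
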
    • KVM: x86: Assume a 64-bit hypercall for guests with protected state · b5aead00
      Committed by Tom Lendacky
      When processing a hypercall for a guest with protected state, currently
      SEV-ES guests, the guest CS segment register can't be checked to
      determine if the guest is in 64-bit mode. For an SEV-ES guest, it is
      expected that communication between the guest and the hypervisor is
      performed via shared memory using the GHCB. In order to use the GHCB, the
      guest must have been in long mode; otherwise, writes by the guest to the
      GHCB would be encrypted and could not be read by the
      hypervisor.
      
      Create a new helper function, is_64_bit_hypercall(), that returns true
      when the guest has protected state, assuming 64-bit mode, and otherwise
      invokes is_64_bit_mode() to determine the mode. Update
      the hypercall-related routines to use is_64_bit_hypercall() instead of
      is_64_bit_mode().
      
      Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurrences of calls to
      this helper function for a guest running with protected state.
      
      Fixes: f1c6366e ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
      Reported-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e0b20c770c9d0d1403f23d83e785385104211f74.1621878537.git.thomas.lendacky@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b5aead00
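
      A minimal sketch of the helper's shape as described above (the struct and
      field names here are stand-ins, not the real kernel definitions):

      #include <stdbool.h>
      #include <stdio.h>

      struct vcpu_model {
          bool guest_state_protected;   /* e.g. SEV-ES: register state not visible */
          bool long_mode_64bit_cs;      /* what is_64_bit_mode() derives from CS/EFER */
      };

      static bool is_64_bit_mode(const struct vcpu_model *vcpu)
      {
          /* The real code reads CS.L/EFER.LMA; with protected state it cannot. */
          return vcpu->long_mode_64bit_cs;
      }

      /* Hypercall path: assume 64-bit mode when state is protected (GHCB needs it). */
      static bool is_64_bit_hypercall(const struct vcpu_model *vcpu)
      {
          return vcpu->guest_state_protected || is_64_bit_mode(vcpu);
      }

      int main(void)
      {
          struct vcpu_model sev_es = { .guest_state_protected = true };
          struct vcpu_model legacy = { .long_mode_64bit_cs = false };
          printf("%d %d\n", is_64_bit_hypercall(&sev_es), is_64_bit_hypercall(&legacy));
          return 0;
      }
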
    • KVM: Fix steal time asm constraints · 964b7aa0
      Committed by David Woodhouse
      In 64-bit mode, x86 instruction encoding allows us to use the low 8 bits
      of any GPR as an 8-bit operand. In 32-bit mode, however, we can only use
      the [abcd] registers. For that, GCC provides the "q" constraint instead of
      the less restrictive "r".
      
      Also fix st->preempted, which is an input/output operand rather than an
      input.
      
      Fixes: 7e2175eb ("KVM: x86: Fix recording of guest steal time / preempted status")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <89bf72db1b859990355f9c40713a34e0d2d86c98.camel@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      964b7aa0
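
      A standalone x86 illustration of the constraint fix (not the kernel code,
      which additionally needs an exception-table fixup because it targets a
      userspace address): the 8-bit register operand uses "q" so the compiler
      picks one of the a/b/c/d registers even on 32-bit x86, and "+" marks it
      as both input and output.

      #include <stdint.h>
      #include <stdio.h>

      static uint8_t xchg_u8(uint8_t *p, uint8_t val)
      {
          asm volatile("xchgb %0, %1"
                       : "+q" (val), "+m" (*p)
                       :
                       : "memory");
          return val;
      }

      int main(void)
      {
          uint8_t preempted = 1;
          uint8_t old = xchg_u8(&preempted, 0);   /* atomic read-and-clear */
          printf("old=%u new=%u\n", (unsigned)old, (unsigned)preempted);
          return 0;
      }
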
  6. 16 November 2021, 1 commit
    • KVM: x86: Fix uninitialized eoi_exit_bitmap usage in vcpu_load_eoi_exitmap() · c5adbb3a
      Committed by Huang Le (黄乐)
      In vcpu_load_eoi_exitmap(), the eoi_exit_bitmap[4] array is currently
      initialized only when a Hyper-V context is available; in the other path it
      is passed to kvm_x86_ops.load_eoi_exitmap() directly from the stack,
      uninitialized. This can cause unexpected interrupt delivery/handling
      issues, e.g. an *old* Linux kernel that relies on the PIT for clock
      calibration on KVM might randomly fail to boot.
      
      Fix it by passing ioapic_handled_vectors to load_eoi_exitmap() when Hyper-V
      context is not available.
      
      Fixes: f2bc14b6 ("KVM: x86: hyper-v: Prepare to meet unallocated Hyper-V context")
      Cc: stable@vger.kernel.org
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Huang Le <huangle1@jd.com>
      Message-Id: <62115b277dab49ea97da5633f8522daf@jd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c5adbb3a
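
      A simplified, self-contained model of the two paths described above (the
      function signatures and the merge with the Hyper-V auto-EOI bitmap are
      stand-ins for the real code):

      #include <stdint.h>
      #include <stdio.h>

      static void load_eoi_exitmap(const uint64_t bitmap[4])
      {
          printf("%llx %llx %llx %llx\n",
                 (unsigned long long)bitmap[0], (unsigned long long)bitmap[1],
                 (unsigned long long)bitmap[2], (unsigned long long)bitmap[3]);
      }

      static void vcpu_load_eoi_exitmap(int has_hyperv_ctx,
                                        const uint64_t ioapic_handled_vectors[4],
                                        const uint64_t hv_auto_eoi[4])
      {
          if (has_hyperv_ctx) {
              uint64_t eoi_exit_bitmap[4];   /* only initialized on this path */
              for (int i = 0; i < 4; i++)
                  eoi_exit_bitmap[i] = ioapic_handled_vectors[i] | hv_auto_eoi[i];
              load_eoi_exitmap(eoi_exit_bitmap);
              return;
          }
          /* The fix: no Hyper-V context, so pass the already-valid bitmap as-is. */
          load_eoi_exitmap(ioapic_handled_vectors);
      }

      int main(void)
      {
          uint64_t ioapic[4] = { 0x1 }, hv[4] = { 0x2 };
          vcpu_load_eoi_exitmap(0, ioapic, hv);
          vcpu_load_eoi_exitmap(1, ioapic, hv);
          return 0;
      }
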
  7. 12 November 2021, 1 commit
  8. 11 November 2021, 6 commits
    • KVM: x86: Drop arbitrary KVM_SOFT_MAX_VCPUS · da1bfd52
      Committed by Vitaly Kuznetsov
      KVM_CAP_NR_VCPUS is used to get the "recommended" maximum number of
      vCPUs: arm64/mips/riscv report num_online_cpus(), powerpc reports
      either num_online_cpus() or num_present_cpus(), and s390 has multiple
      constants depending on hardware features. On x86, KVM reports an
      arbitrary value of '710', which is supposed to be the maximum tested
      value, but it's possible to test all KVM_MAX_VCPUS even when there are
      fewer physical CPUs available.
      
      Drop the arbitrary '710' value and return num_online_cpus() on x86 as
      well. The recommendation will match other architectures and will mean
      'no CPU overcommit'.
      
      For reference, QEMU only queries KVM_CAP_NR_VCPUS to print a warning
      when the requested vCPU number exceeds it. The static limit of '710'
      is quite weird, as smaller systems with just a few physical CPUs should
      certainly "recommend" fewer.
      Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211111134733.86601-1-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      da1bfd52
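
      A hedged userspace sketch of the kind of check the message mentions QEMU
      doing: query KVM_CAP_NR_VCPUS (the recommendation, now num_online_cpus()
      on x86) and KVM_CAP_MAX_VCPUS (the hard limit), and warn on overcommit.
      Error handling is minimal and the warning text is made up here.

      #include <fcntl.h>
      #include <linux/kvm.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/ioctl.h>
      #include <unistd.h>

      int main(int argc, char **argv)
      {
          int requested = argc > 1 ? atoi(argv[1]) : 1;
          int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
          if (kvm < 0) {
              perror("open /dev/kvm");
              return 1;
          }

          int soft = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);   /* recommended */
          int hard = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);  /* possible */

          printf("recommended=%d max=%d\n", soft, hard);
          if (requested > soft)
              fprintf(stderr, "warning: %d vCPUs exceeds the recommended %d\n",
                      requested, soft);
          close(kvm);
          return 0;
      }
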
    • KVM: Move INVPCID type check from vmx and svm to the common kvm_handle_invpcid() · 796c83c5
      Committed by Vipin Sharma
      Handle #GP on INVPCID due to an invalid type in the common switch
      statement instead of relying on the callers (VMX and SVM) to manually
      validate the type.
      
      Unlike INVVPID and INVEPT, INVPCID is not explicitly documented to check
      the type before reading the operand from memory, so deferring the
      type validity check until after that point is architecturally allowed.
      Signed-off-by: Vipin Sharma <vipinsh@google.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109174426.2350547-3-vipinsh@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      796c83c5
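
      A stand-in sketch of what "handle #GP in the common switch" means (the
      real kvm_handle_invpcid() also reads and validates the memory operand and
      performs the actual flushes; this model only shows where the invalid-type
      check now lives):

      #include <stdio.h>

      enum invpcid_type {
          INVPCID_TYPE_INDIV_ADDR,
          INVPCID_TYPE_SINGLE_CTXT,
          INVPCID_TYPE_ALL_INCL_GLOBAL,
          INVPCID_TYPE_ALL_NON_GLOBAL,
      };

      static int inject_gp(void) { puts("#GP"); return 1; }

      static int kvm_handle_invpcid_model(unsigned long type)
      {
          switch (type) {
          case INVPCID_TYPE_INDIV_ADDR:
          case INVPCID_TYPE_SINGLE_CTXT:
          case INVPCID_TYPE_ALL_INCL_GLOBAL:
          case INVPCID_TYPE_ALL_NON_GLOBAL:
              puts("perform the requested flush");
              return 0;
          default:
              return inject_gp();   /* invalid type -> #GP, checked here once */
          }
      }

      int main(void)
      {
          kvm_handle_invpcid_model(INVPCID_TYPE_SINGLE_CTXT);
          kvm_handle_invpcid_model(7);
          return 0;
      }
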
    • KVM: x86: Rename kvm_lapic_enable_pv_eoi() · 77c3323f
      Committed by Vitaly Kuznetsov
      kvm_lapic_enable_pv_eoi() is a misnomer as the function is also
      used to disable PV EOI. Rename it to kvm_lapic_set_pv_eoi().
      
      No functional change intended.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20211108152819.12485-2-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      77c3323f
    • KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ active · cae72dcc
      Committed by Maxim Levitsky
      KVM_GUESTDBG_BLOCKIRQ relies on interrupts being injected using KVM's
      standard inject_pending_event path, and not via APICv/AVIC.
      
      Since this is a debug feature, just inhibit APICv/AVIC while
      KVM_GUESTDBG_BLOCKIRQ is in use on at least one vCPU.
      
      Fixes: 61e5f69e ("KVM: x86: implement KVM_GUESTDBG_BLOCKIRQ")
      Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Tested-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211108090245.166408-1-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cae72dcc
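
      A rough, self-contained model of the inhibit described above (the counter
      and function names are invented for illustration; the kernel tracks this
      through its APICv inhibit-reason machinery): while any vCPU has
      KVM_GUESTDBG_BLOCKIRQ set, treat APICv/AVIC as inhibited.

      #include <stdbool.h>
      #include <stdio.h>

      struct vm_model {
          int blockirq_vcpus;   /* vCPUs that currently have the debug flag set */
      };

      static bool apicv_inhibited(const struct vm_model *vm)
      {
          return vm->blockirq_vcpus > 0;
      }

      static void set_guest_debug(struct vm_model *vm, bool had, bool wants)
      {
          if (wants && !had)
              vm->blockirq_vcpus++;
          else if (!wants && had)
              vm->blockirq_vcpus--;
      }

      int main(void)
      {
          struct vm_model vm = { 0 };
          set_guest_debug(&vm, false, true);    /* KVM_GUESTDBG_BLOCKIRQ on */
          printf("inhibited: %d\n", apicv_inhibited(&vm));
          set_guest_debug(&vm, true, false);    /* debug session over */
          printf("inhibited: %d\n", apicv_inhibited(&vm));
          return 0;
      }
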
    • kvm: x86: Convert return type of *is_valid_rdpmc_ecx() to bool · e6cd31f1
      Committed by Jim Mattson
      These function names sound like predicates, and they have siblings,
      *is_valid_msr(), which _are_ predicates. Moreover, there are comments
      that essentially warn that these functions behave unexpectedly.
      
      Flip the polarity of the return values, so that they become
      predicates, and convert the boolean result to a success/failure code
      at the outer call site.
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211105202058.1048757-1-jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e6cd31f1
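
      An illustrative stand-in (not the kernel code) for the polarity flip: the
      inner check becomes a true/false predicate, and the outer call site turns
      that into the 0 / non-zero convention its own callers expect.

      #include <stdbool.h>
      #include <stdio.h>

      static bool is_valid_rdpmc_ecx(unsigned int ecx, unsigned int nr_counters)
      {
          return ecx < nr_counters;      /* predicate: true means valid */
      }

      static int emulate_rdpmc(unsigned int ecx)
      {
          if (!is_valid_rdpmc_ecx(ecx, 4))
              return 1;                  /* failure code for the emulator */
          printf("read counter %u\n", ecx);
          return 0;
      }

      int main(void)
      {
          printf("%d\n", emulate_rdpmc(2));   /* 0: success */
          printf("%d\n", emulate_rdpmc(9));   /* 1: invalid counter index */
          return 0;
      }
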
    • KVM: x86: Fix recording of guest steal time / preempted status · 7e2175eb
      Committed by David Woodhouse
      In commit b0431382 ("x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is
      not missed") we switched to using a gfn_to_pfn_cache for accessing the
      guest steal time structure in order to allow for an atomic xchg of the
      preempted field. This has a couple of problems.
      
      Firstly, kvm_map_gfn() doesn't work at all for IOMEM pages when the
      atomic flag is set, which it is in kvm_steal_time_set_preempted(). So a
      guest vCPU using an IOMEM page for its steal time would never have its
      preempted field set.
      
      Secondly, the gfn_to_pfn_cache is not invalidated in all cases where it
      should have been. There are two stages to the GFN->PFN conversion;
      first the GFN is converted to a userspace HVA, and then that HVA is
      looked up in the process page tables to find the underlying host PFN.
      Correct invalidation of the latter would require being hooked up to the
      MMU notifiers, but that doesn't happen---so it just keeps mapping and
      unmapping the *wrong* PFN after the userspace page tables change.
      
      In the !IOMEM case at least the stale page *is* pinned all the time it's
      cached, so it won't be freed and reused by anyone else while still
      receiving the steal time updates. The map/unmap dance only takes care
      of the KVM administrivia such as marking the page dirty.
      
      Until the gfn_to_pfn cache handles the remapping automatically by
      integrating with the MMU notifiers, we might as well not get a
      kernel mapping of it, and use the perfectly serviceable userspace HVA
      that we already have.  We just need to implement the atomic xchg on
      the userspace address with appropriate exception handling, which is
      fairly trivial.
      
      Cc: stable@vger.kernel.org
      Fixes: b0431382 ("x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missed")
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <3645b9b889dac6438394194bb5586a46b68d581f.camel@infradead.org>
      [I didn't entirely agree with David's assessment of the
       usefulness of the gfn_to_pfn cache, and integrated the outcome
       of the discussion in the above commit message. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7e2175eb
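
      A tiny userspace model of the handoff described above (the struct layout
      and names are stand-ins, not the real steal-time ABI, and this skips the
      uaccess/exception-handling details the kernel needs): the hypervisor side
      atomically exchanges the guest-visible byte through the mapping it
      already has, so a concurrent guest update cannot be lost.

      #include <stdint.h>
      #include <stdio.h>

      struct steal_time_model {
          uint8_t preempted;
          uint8_t pad[63];
      };

      static uint8_t set_preempted(struct steal_time_model *st)
      {
          /* Atomic xchg: returns the old value, stores 1. */
          return __atomic_exchange_n(&st->preempted, 1, __ATOMIC_ACQ_REL);
      }

      int main(void)
      {
          struct steal_time_model st = { 0 };
          uint8_t old = set_preempted(&st);
          printf("was %u, now %u\n", (unsigned)old, (unsigned)st.preempted);
          return 0;
      }
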