1. 08 Jun, 2022 (1 commit)
  2. 11 Feb, 2022 (1 commit)
    • KVM: x86/mmu: Split huge pages mapped by the TDP MMU during KVM_CLEAR_DIRTY_LOG · cb00a70b
      David Matlack authored
      When using KVM_DIRTY_LOG_INITIALLY_SET, huge pages are not
      write-protected when dirty logging is enabled on the memslot. Instead
      they are write-protected once userspace invokes KVM_CLEAR_DIRTY_LOG for
      the first time and only for the specific sub-region being cleared.
      
      Enhance KVM_CLEAR_DIRTY_LOG to also try to split huge pages prior to
      write-protecting to avoid causing write-protection faults on vCPU
      threads. This also allows userspace to smear the cost of huge page
      splitting across multiple ioctls, rather than splitting the entire
      memslot as is the case when initially-all-set is not used.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20220119230739.2234394-17-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cb00a70b
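
      A minimal sketch of the path this commit describes, in kernel C. The
      helper and constant names (kvm_mmu_try_split_huge_pages, PG_LEVEL_4K,
      tdp_mmu_enabled) are assumptions based on upstream KVM, not quoted
      from this log:

          void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
                                                       struct kvm_memory_slot *slot,
                                                       gfn_t gfn_offset,
                                                       unsigned long mask)
          {
                  if (kvm->arch.tdp_mmu_enabled) {
                          gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
                          gfn_t end = slot->base_gfn + gfn_offset + __fls(mask) + 1;

                          /*
                           * Split huge pages covering only the sub-region being
                           * cleared, so the later write-protection faults taken
                           * by vCPU threads are 4K-sized.
                           */
                          kvm_mmu_try_split_huge_pages(kvm, slot, start, end,
                                                       PG_LEVEL_4K);
                  }

                  /* ...existing write-protect / dirty-bit clearing logic... */
          }
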
  3. 02 Feb, 2022 (1 commit)
  4. 01 Feb, 2022 (1 commit)
    • kvm/x86: rework guest entry logic · b2d2af7e
      Mark Rutland authored
      For consistency and clarity, migrate x86 over to the generic helpers for
      guest timing and lockdep/RCU/tracing management, and remove the
      x86-specific helpers.
      
      Prior to this patch, the guest timing was entered in
      kvm_guest_enter_irqoff() (called by svm_vcpu_enter_exit() and
      vmx_vcpu_enter_exit()), and was exited by the call to
      vtime_account_guest_exit() within vcpu_enter_guest().
      
      To minimize duplication and to more clearly balance entry and exit, both
      entry and exit of guest timing are placed in vcpu_enter_guest(), using
      the new guest_timing_{enter,exit}_irqoff() helpers. When context
      tracking is used, a small amount of additional time will be accounted
      towards guests; tick-based accounting is unaffected as IRQs are
      disabled at this point and not enabled until after the return from the
      guest.
      
      This also corrects (benign) mis-balanced context tracking accounting
      introduced in commits:
      
        ae95f566 ("KVM: X86: TSCDEADLINE MSR emulation fastpath")
        26efe2fd ("KVM: VMX: Handle preemption timer fastpath")
      
      Where KVM can enter a guest multiple times, calling vtime_guest_enter()
      without a corresponding call to vtime_account_guest_exit(), and with
      vtime_account_system() called when vtime_account_guest() should be used.
      As account_system_time() checks PF_VCPU and calls account_guest_time(),
      this doesn't result in any functional problem, but is unnecessarily
      confusing.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <20220201132926.3301912-4-mark.rutland@arm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b2d2af7e
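
      A minimal sketch of the balanced pattern described above, under the
      assumption (not quoted from this log) that vcpu_enter_guest() invokes
      the vendor run path via a static call and may re-enter the guest on
      fastpath exits:

          local_irq_disable();
          guest_timing_enter_irqoff();            /* guest vtime starts here */

          for (;;) {
                  /*
                   * Enter the guest; fastpath exits may loop back in without
                   * touching the timing accounting again.
                   */
                  exit_fastpath = static_call(kvm_x86_vcpu_run)(vcpu);
                  if (exit_fastpath != EXIT_FASTPATH_REENTER_GUEST)
                          break;
          }

          guest_timing_exit_irqoff();             /* one matched exit */
          local_irq_enable();
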
  5. 18 Jan, 2022 (1 commit)
    • KVM: x86: Making the module parameter of vPMU more common · 4732f244
      Like Xu authored
      The new module parameter to control PMU virtualization should apply
      to Intel as well as AMD, for situations where userspace is not trusted.
      If the module parameter allows PMU virtualization, there could be a
      new KVM_CAP or guest CPUID bits whereby userspace can enable/disable
      PMU virtualization on a per-VM basis.
      
      If the module parameter does not allow PMU virtualization, there
      should be no userspace override, since we have no precedent for
      authorizing that kind of override. If it's false, other counter-based
      profiling features (such as LBR including the associated CPUID bits
      if any) will not be exposed.
      
      Change its name from "pmu" to "enable_pmu" as we have temporary
      variables with the same name in our code like "struct kvm_pmu *pmu".
      
      Fixes: b1d66dad ("KVM: x86/svm: Add module param to control PMU virtualization")
      Suggested-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220111073823.21885-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4732f244
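
      A sketch of the renamed, now-common parameter; the 0444 permissions
      and placement in arch/x86/kvm/x86.c are assumptions based on upstream
      KVM rather than details from this log:

          /* Read-only at runtime: PMU virtualization is fixed at load time. */
          bool __read_mostly enable_pmu = true;
          EXPORT_SYMBOL_GPL(enable_pmu);
          module_param(enable_pmu, bool, 0444);

      Both the Intel and AMD PMU code can then gate initialization on
      enable_pmu instead of a vendor-private "pmu" parameter, which also
      avoids clashing with local "struct kvm_pmu *pmu" variables.
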
  6. 07 Jan, 2022 (1 commit)
    • KVM: x86: Fix wall clock writes in Xen shared_info not to mark page dirty · 55749769
      David Woodhouse authored
      When dirty ring logging is enabled, any dirty logging without an active
      vCPU context will cause a kernel oops. But we've already declared that
      the shared_info page doesn't get dirty tracking anyway, since it would
      be kind of insane to mark it dirty every time we deliver an event channel
      interrupt. Userspace is supposed to just assume it's always dirty any
      time a vCPU can run or event channels are routed.
      
      So stop using the generic kvm_write_wall_clock() and just write directly
      through the gfn_to_pfn_cache that we already have set up.
      
      We can make kvm_write_wall_clock() static in x86.c again now, but let's
      not remove the 'sec_hi_ofs' argument even though it's not used yet. At
      some point we *will* want to use that for KVM guests too.
      
      Fixes: 629b5348 ("KVM: x86/xen: update wallclock region")
      Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-6-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      55749769
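
      A hedged sketch of writing the wall clock directly through the
      already-mapped gfn_to_pfn_cache, with no dirty marking; the cache,
      field and variable names are assumptions based on upstream KVM's Xen
      support, not quoted from this commit:

          struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
          struct shared_info *shinfo = gpc->khva;    /* kernel mapping */

          shinfo->wc_version = version + 1;   /* odd: update in progress */
          smp_wmb();
          shinfo->wc_sec = (u32)wall_sec;
          shinfo->wc_nsec = wall_nsec;
          smp_wmb();
          shinfo->wc_version = version + 2;   /* even again: update complete */

          /*
           * Deliberately no mark_page_dirty(): userspace must assume the
           * shared_info page is always dirty, and dirty logging without a
           * vCPU context would oops with the dirty ring enabled.
           */
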
  7. 26 Nov, 2021 (1 commit)
  8. 18 Nov, 2021 (1 commit)
    • KVM: x86: Assume a 64-bit hypercall for guests with protected state · b5aead00
      Tom Lendacky authored
      When processing a hypercall for a guest with protected state, currently
      SEV-ES guests, the guest CS segment register can't be checked to
      determine if the guest is in 64-bit mode. For an SEV-ES guest, it is
      expected that communication between the guest and the hypervisor is
      performed through shared memory using the GHCB. In order to use the
      GHCB, the guest must have been in long mode, otherwise writes by the
      guest to the GHCB would be encrypted and could not be comprehended by
      the hypervisor.
      
      Create a new helper function, is_64_bit_hypercall(), that assumes the
      guest is in 64-bit mode when the guest has protected state, and returns
      true, otherwise invoking is_64_bit_mode() to determine the mode. Update
      the hypercall related routines to use is_64_bit_hypercall() instead of
      is_64_bit_mode().
      
      Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurrences of calls
      to this helper function for a guest running with protected state.
      
      Fixes: f1c6366e ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
      Reported-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e0b20c770c9d0d1403f23d83e785385104211f74.1621878537.git.thomas.lendacky@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b5aead00
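
      A sketch of the helper as described above; the guest_state_protected
      field name is an assumption based on upstream KVM:

          static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu)
          {
                  /*
                   * Protected-state guests (SEV-ES) must use the GHCB, which
                   * requires long mode, so assume 64-bit mode: CS cannot be
                   * inspected for such guests.
                   */
                  return vcpu->arch.guest_state_protected ||
                         is_64_bit_mode(vcpu);
          }
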
  9. 17 Nov, 2021 (2 commits)
  10. 23 Oct, 2021 (1 commit)
  11. 13 Aug, 2021 (1 commit)
  12. 25 Jun, 2021 (1 commit)
    • KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper · 87e99d7d
      Sean Christopherson authored
      Get LA57 from the role_regs, which are initialized from the vCPU even
      though TDP is enabled, instead of pulling the value directly from the
      vCPU when computing the guest's root_level for TDP MMUs.  Note, the check
      is inside an is_long_mode() statement, so that requirement is not lost.
      
      Use role_regs even though the MMU's role is available and arguably
      "better".  A future commit will consolidate the guest root level logic,
      and it needs access to EFER.LMA, which is not tracked in the role (it
      can't be toggled on VM-Exit, unlike LA57).
      
      Drop is_la57_mode() as there are no remaining users, and to discourage
      pulling MMU state from the vCPU (in the future).
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210622175739.3610207-41-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      87e99d7d
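
      A hedged sketch of deriving the guest root level purely from role_regs;
      the ____is_* accessors and level constants are assumptions based on
      upstream KVM's MMU role plumbing, not quoted from this commit:

          static int guest_root_level(const struct kvm_mmu_role_regs *regs)
          {
                  if (!____is_cr0_pg(regs))
                          return 0;
                  if (____is_efer_lma(regs))    /* the is_long_mode() check */
                          return ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL
                                                       : PT64_ROOT_4LEVEL;
                  if (____is_cr4_pae(regs))
                          return PT32E_ROOT_LEVEL;
                  return PT32_ROOT_LEVEL;
          }
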
  13. 06 May, 2021 (1 commit)
  14. 26 Apr, 2021 (1 commit)
  15. 31 Mar, 2021 (1 commit)
  16. 15 Mar, 2021 (2 commits)
  17. 04 Feb, 2021 (5 commits)
  18. 02 Feb, 2021 (1 commit)
    • KVM: x86: Supplement __cr4_reserved_bits() with X86_FEATURE_PCID check · 4683d758
      Vitaly Kuznetsov authored
      Commit 7a873e45 ("KVM: selftests: Verify supported CR4 bits can be set
      before KVM_SET_CPUID2") reveals that KVM allows setting X86_CR4_PCIDE
      even when PCID support is missing:
      
      ==== Test Assertion Failure ====
        x86_64/set_sregs_test.c:41: rc
        pid=6956 tid=6956 - Invalid argument
           1	0x000000000040177d: test_cr4_feature_bit at set_sregs_test.c:41
           2	0x00000000004014fc: main at set_sregs_test.c:119
           3	0x00007f2d9346d041: ?? ??:0
           4	0x000000000040164d: _start at ??:?
        KVM allowed unsupported CR4 bit (0x20000)
      
      Add X86_FEATURE_PCID feature check to __cr4_reserved_bits() to make
      kvm_is_valid_cr4() fail.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210201142843.108190-1-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4683d758
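
      A hedged sketch of the added check inside the __cr4_reserved_bits()
      macro; the exact macro shape is an assumption based on upstream KVM,
      with the unrelated per-feature checks elided:

          #define __cr4_reserved_bits(__cpu_has, __c)             \
          ({                                                      \
                  u64 __reserved_bits = CR4_RESERVED_BITS;        \
                                                                  \
                  /* ...existing per-feature checks... */         \
                  if (!__cpu_has(__c, X86_FEATURE_PCID))          \
                          __reserved_bits |= X86_CR4_PCIDE;       \
                  __reserved_bits;                                \
          })

      With X86_CR4_PCIDE reserved whenever PCID is unsupported,
      kvm_is_valid_cr4() rejects the bit and the selftest's KVM_SET_SREGS
      call fails as expected.
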
  19. 15 Dec, 2020 (4 commits)
    • KVM: SVM: Provide support for SEV-ES vCPU loading · 86137773
      Tom Lendacky authored
      An SEV-ES vCPU requires additional VMCB vCPU load/put requirements. SEV-ES
      hardware will restore certain registers on VMEXIT, but not save them on
      VMRUN (see Table B-3 and Table B-4 of the AMD64 APM Volume 2), so make the
      following changes:
      
      General vCPU load changes:
        - During vCPU loading, perform a VMSAVE to the per-CPU SVM save area and
          save the current values of XCR0, XSS and PKRU to the per-CPU SVM save
          area as these registers will be restored on VMEXIT.
      
      General vCPU put changes:
        - Do not attempt to restore registers that SEV-ES hardware has already
          restored on VMEXIT.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <019390e9cb5e93cd73014fa5a040c17d42588733.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      86137773
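
      A hedged sketch of the load side; the vmsave() usage, the host
      save-area offset within the per-CPU page, and the field names are
      assumptions based on upstream KVM/SVM, not quoted from this log:

          static void sev_es_vcpu_load(struct vcpu_svm *svm, int cpu)
          {
                  struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
                  struct vmcb_save_area *hostsa;

                  /*
                   * VMSAVE host state that SEV-ES hardware restores on VMEXIT
                   * but does not save on VMRUN (APM vol. 2, Tables B-3/B-4).
                   */
                  vmsave(__sme_page_pa(sd->save_area));

                  /*
                   * XCR0, XSS and PKRU are also restored on VMEXIT, so stash
                   * the current host values in the per-CPU save area.
                   */
                  hostsa = (struct vmcb_save_area *)
                           (page_address(sd->save_area) + 0x400);
                  hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
                  hostsa->xss = host_xss;
                  hostsa->pkru = read_pkru();
          }

      On the put side, the corresponding change is simply to skip restoring
      those registers, since hardware has already restored them on VMEXIT.
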
    • KVM: SVM: Support string IO operations for an SEV-ES guest · 7ed9abfe
      Tom Lendacky authored
      For an SEV-ES guest, string-based port IO is performed to a shared
      (un-encrypted) page so that both the hypervisor and guest can read or
      write to it and each see the contents.
      
      For string-based port IO operations, invoke SEV-ES specific routines that
      can complete the operation using common KVM port IO support.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <9d61daf0ffda496703717218f415cdc8fd487100.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7ed9abfe
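
      A hedged sketch of the string IO path: data is staged in the shared
      GHCB scratch buffer rather than in guest memory. The helper names
      (setup_vmgexit_scratch, kvm_sev_es_string_io) and fields are
      assumptions based on upstream KVM:

          int sev_es_string_io(struct vcpu_svm *svm, int size,
                               unsigned int port, int in)
          {
                  /* Map and validate the shared buffer named by the GHCB. */
                  if (!setup_vmgexit_scratch(svm, in,
                                             svm->vmcb->control.exit_info_2))
                          return -EINVAL;

                  /* Reuse common KVM port IO emulation against that buffer. */
                  return kvm_sev_es_string_io(&svm->vcpu, size, port,
                                              svm->ghcb_sa, svm->ghcb_sa_len,
                                              in);
          }
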
    • KVM: SVM: Support MMIO for an SEV-ES guest · 8f423a80
      Tom Lendacky authored
      For an SEV-ES guest, MMIO is performed to a shared (un-encrypted) page
      so that both the hypervisor and guest can read or write to it and each
      see the contents.
      
      The GHCB specification provides software-defined VMGEXIT exit codes to
      indicate a request for an MMIO read or an MMIO write. Add support to
      recognize the MMIO requests and invoke SEV-ES specific routines that
      can complete the MMIO operation. These routines use common KVM support
      to complete the MMIO operation.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <af8de55127d5bcc3253d9b6084a0144c12307d4d.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8f423a80
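
      A hedged sketch of the VMGEXIT dispatch for the software-defined MMIO
      exit codes; the exit-code and helper names are assumptions based on
      upstream KVM and the GHCB specification:

          switch (ghcb_get_sw_exit_code(ghcb)) {
          case SVM_VMGEXIT_MMIO_READ:
                  ret = kvm_sev_es_mmio_read(&svm->vcpu,
                                             control->exit_info_1,  /* GPA  */
                                             control->exit_info_2,  /* size */
                                             svm->ghcb_sa);  /* shared buffer */
                  break;
          case SVM_VMGEXIT_MMIO_WRITE:
                  ret = kvm_sev_es_mmio_write(&svm->vcpu,
                                              control->exit_info_1,
                                              control->exit_info_2,
                                              svm->ghcb_sa);
                  break;
          }
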
    • KVM/VMX/SVM: Move kvm_machine_check function to x86.h · 3f1a18b9
      Uros Bizjak authored
      Move kvm_machine_check to x86.h to avoid two exact copies
      of the same function in vmx.c and svm.c.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Message-Id: <20201029135600.122392-1-ubizjak@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3f1a18b9
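
      A sketch of the shared helper, roughly as it exists upstream under
      CONFIG_X86_MCE (shown here for context; details assumed, not quoted
      from this log):

          static inline void kvm_machine_check(void)
          {
          #if defined(CONFIG_X86_MCE)
                  struct pt_regs regs = {
                          .cs = 3,    /* fake ring 3, whatever the guest ran on */
                          .flags = X86_EFLAGS_IF,
                  };

                  do_machine_check(&regs);
          #endif
          }
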
  20. 15 Nov, 2020 (1 commit)
  21. 08 Nov, 2020 (1 commit)
    • KVM: x86: use positive error values for msr emulation that causes #GP · cc4cb017
      Maxim Levitsky authored
      The recent introduction of userspace msr filtering added code that uses
      negative error codes for cases that result either in #GP delivery to
      the guest or in handling by the userspace msr filter.

      This breaks the assumption that a negative error code returned from the
      msr emulation code is a semi-fatal error which should be returned
      to userspace via the KVM_RUN ioctl and usually kills the guest.
      
      Fix this by reusing the already existing KVM_MSR_RET_INVALID error code,
      and by adding a new KVM_MSR_RET_FILTERED error code for the
      userspace filtered msrs.
      
      Fixes: 291f35fb2c1d1 ("KVM: x86: report negative values from wrmsr emulation to userspace")
      Reported-by: Qian Cai <cai@redhat.com>
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20201101115523.115780-1-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cc4cb017
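
      A hedged sketch of the resulting sign convention; the numeric values
      are assumptions based on upstream arch/x86/kvm/x86.h:

          /*
           * Negative values: semi-fatal, propagated to userspace via KVM_RUN.
           * Zero: success.
           * Positive values: handled in-kernel, e.g. by injecting #GP into
           * the guest or deferring the access to the userspace msr filter.
           */
          #define KVM_MSR_RET_INVALID     2    /* in-kernel #GP path */
          #define KVM_MSR_RET_FILTERED    3    /* blocked by userspace filter */
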
  22. 28 Sep, 2020 (4 commits)
  23. 11 Jul, 2020 (1 commit)
  24. 09 Jul, 2020 (5 commits)