1. 28 September 2020, 20 commits
  2. 12 September 2020, 2 commits
    • KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit · 99b82a14
      Committed by Wanpeng Li
      According to SDM 27.2.4, event delivery can cause an APIC-access VM exit.
      Don't report an internal error and freeze the guest when event delivery
      causes an APIC-access exit: the exit is handleable, and the pending event
      will be re-injected during the next VM entry (a minimal sketch follows
      this entry).
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1597827327-25055-2-git-send-email-wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      99b82a14
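      A minimal sketch of the check this change relaxes, assuming the kernel's
      VMX definitions (VECTORING_INFO_VALID_MASK, EXIT_REASON_*); the helper
      name and its exact shape are illustrative, not the upstream diff:

          /* Illustrative only: decide whether an exit that occurred while an
           * event was being delivered should be reported as an internal error. */
          static bool fatal_exit_during_event_delivery(u32 exit_reason, u32 idt_vectoring_info)
          {
                  if (!(idt_vectoring_info & VECTORING_INFO_VALID_MASK))
                          return false;   /* no event was being delivered */

                  switch (exit_reason) {
                  case EXIT_REASON_EXCEPTION_NMI:
                  case EXIT_REASON_EPT_MISCONFIG:
                  case EXIT_REASON_PML_FULL:
                  case EXIT_REASON_TASK_SWITCH:
                  case EXIT_REASON_APIC_ACCESS:   /* tolerated as of this commit */
                          return false;   /* handleable; the event is re-injected on the next VM entry */
                  default:
                          return true;    /* reported as an internal error to userspace */
                  }
          }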
    • KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected · 43fea4e4
      Committed by Peter Shier
      When L2 uses PAE, L0 intercepts of L2 writes to CR0/CR3/CR4 call
      load_pdptrs to read the possibly updated PDPTEs from the guest
      physical address referenced by CR3.  It loads them into
      vcpu->arch.walk_mmu->pdptrs and sets VCPU_EXREG_PDPTR in
      vcpu->arch.regs_dirty.
      
      At the subsequent assumed reentry into L2, the mmu will call
      vmx_load_mmu_pgd which calls ept_load_pdptrs. ept_load_pdptrs sees
      VCPU_EXREG_PDPTR set in vcpu->arch.regs_dirty and loads
      VMCS02.GUEST_PDPTRn from vcpu->arch.walk_mmu->pdptrs[]. This all works
      if the L2 CRn write intercept always resumes L2.
      
      The resume path calls vmx_check_nested_events which checks for
      exceptions, MTF, and expired VMX preemption timers. If
      vmx_check_nested_events finds any of these conditions pending it will
      reflect the corresponding exit into L1. Live migration at this point
      would also cause a missed immediate reentry into L2.
      
      After L1 exits, vmx_vcpu_run calls vmx_register_cache_reset which
      clears VCPU_EXREG_PDPTR in vcpu->arch.regs_dirty.  When L2 next
      resumes, ept_load_pdptrs finds VCPU_EXREG_PDPTR clear in
      vcpu->arch.regs_dirty and does not load VMCS02.GUEST_PDPTRn from
      vcpu->arch.walk_mmu->pdptrs[]. prepare_vmcs02 will then load
      VMCS02.GUEST_PDPTRn from vmcs12->pdptr0/1/2/3 which contain the stale
      values stored at last L2 exit. A repro of this bug showed L2 entering
      triple fault immediately due to the bad VMCS02.GUEST_PDPTRn values.
      
      When L2 is in PAE paging mode, add a call to ept_load_pdptrs before
      leaving L2 (see the sketch after this entry). This updates
      VMCS02.GUEST_PDPTRn if they are dirty in vcpu->arch.walk_mmu->pdptrs[].
      
      Tested:
      kvm-unit-tests with new directed test: vmx_mtf_pdpte_test.
      Verified that test fails without the fix.
      
      Also ran Google internal VMM with an Ubuntu 16.04 4.4.0-83 guest running a
      custom hypervisor with a 32-bit Windows XP L2 guest using PAE. Prior to fix
      would repro readily. Ran 14 simultaneous L2s for 140 iterations with no
      failures.
      Signed-off-by: Peter Shier <pshier@google.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Message-Id: <20200820230545.2411347-1-pshier@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      43fea4e4
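      A hedged sketch of the fix described above; the exact call site on the
      nested VM-exit path is an assumption, while enable_ept, is_pae_paging and
      ept_load_pdptrs come from the commit message and surrounding KVM code:

          /* Before switching from vmcs02 back to vmcs01, flush any dirty cached
           * PDPTEs into vmcs02 so a later prepare_vmcs02 cannot reload stale
           * vmcs12->pdptr0..3 values. */
          if (enable_ept && is_pae_paging(vcpu))
                  ept_load_pdptrs(vcpu);  /* writes GUEST_PDPTR0..3 when VCPU_EXREG_PDPTR is dirty */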
  3. 24 August 2020, 1 commit
  4. 31 July 2020, 5 commits
  5. 24 July 2020, 1 commit
  6. 11 July 2020, 5 commits
    • KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support · 3edd6839
      Committed by Mohammed Gamal
      This patch adds a new capability, KVM_CAP_SMALLER_MAXPHYADDR, which
      allows userspace to query whether the underlying architecture supports
      GUEST_MAXPHYADDR < HOST_MAXPHYADDR and act accordingly (e.g. QEMU can
      decide whether it should warn for -cpu ..,phys-bits=X); a userspace
      query sketch follows this entry.
      
      The complications in this patch are due to unexpected (but documented)
      behaviour we see with NPF vmexit handling on AMD processors.  If
      SVM is modified to add guest physical address checks in the NPF
      and guest #PF paths, we see the following error multiple times in
      the 'access' test in kvm-unit-tests:
      
                  test pte.p pte.36 pde.p: FAIL: pte 2000021 expected 2000001
                  Dump mapping: address: 0x123400000000
                  ------L4: 24c3027
                  ------L3: 24c4027
                  ------L2: 24c5021
                  ------L1: 1002000021
      
      This is because the PTE's accessed bit is set by the CPU hardware before
      the NPF vmexit. This is handled completely by hardware and cannot be fixed
      in software.
      
      Therefore, availability of the new capability depends on a boolean variable
      allow_smaller_maxphyaddr which is set individually by VMX and SVM init
      routines. On VMX it's always set to true, on SVM it's only set to true
      when NPT is not enabled.
      
      CC: Tom Lendacky <thomas.lendacky@amd.com>
      CC: Babu Moger <babu.moger@amd.com>
      Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
      Message-Id: <20200710154811.418214-10-mgamal@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3edd6839
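      A userspace sketch of querying the new capability, assuming a linux/kvm.h
      recent enough to define KVM_CAP_SMALLER_MAXPHYADDR; error handling is
      intentionally minimal:

          #include <fcntl.h>
          #include <stdio.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>

          int main(void)
          {
                  int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
                  if (kvm < 0) {
                          perror("open /dev/kvm");
                          return 1;
                  }
                  /* KVM_CHECK_EXTENSION returns > 0 if the capability is available. */
                  int supported = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_SMALLER_MAXPHYADDR);
                  printf("GUEST_MAXPHYADDR < HOST_MAXPHYADDR supported: %s\n",
                         supported > 0 ? "yes" : "no");
                  return 0;
          }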
    • KVM: VMX: optimize #PF injection when MAXPHYADDR does not match · 8c4182bd
      Committed by Paolo Bonzini
      Ignore non-present page faults, since those cannot have reserved
      bits set (a sketch of one way to do this follows this entry).
      
      When running access.flat with "-cpu Haswell,phys-bits=36", the
      number of trapped page faults goes down from 8872644 to 3978948.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <20200710154811.418214-9-mgamal@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8c4182bd
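      One plausible way to realize "ignore non-present page faults" (an assumed
      mechanism, not a quote of the upstream diff) is to program the VMCS
      page-fault error-code mask/match fields so that, with the #PF bit set in
      the exception bitmap, only faults whose error code has the P bit set
      cause a VM exit; vmx_need_pf_intercept is the helper introduced earlier
      in this series:

          /* Sketch: #PF exits only when (error_code & MASK) == MATCH, i.e. P=1. */
          if (vmx_need_pf_intercept(vcpu)) {
                  vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, PFERR_PRESENT_MASK);
                  vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, PFERR_PRESENT_MASK);
          } else {
                  vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
                  vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
          }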
    • KVM: VMX: Add guest physical address check in EPT violation and misconfig · 1dbf5d68
      Committed by Mohammed Gamal
      Check the guest physical address against its maximum, which depends on
      the guest MAXPHYADDR. If the guest physical address exceeds the
      maximum (i.e. has reserved bits set), inject a guest page fault with
      PFERR_RSVD_MASK set (a sketch follows this entry).
      
      This has to be done both in the EPT violation and page fault paths, as
      there are complications in both cases with respect to the computation
      of the correct error code.
      
      For EPT violations, unfortunately the only possibility is to emulate,
      because the access type in the exit qualification might refer to an
      access to a paging structure, rather than to the access performed by
      the program.
      
      Trapping page faults instead is needed in order to correct the error code,
      but the access type can be obtained from the original error code and
      passed to gva_to_gpa.  The corrections required in the error code are
      subtle. For example, imagine that a PTE for a supervisor page has a reserved
      bit set.  On a supervisor-mode access, the EPT violation path would trigger.
      However, on a user-mode access, the processor will not notice the reserved
      bit and not include PFERR_RSVD_MASK in the error code.
      Co-developed-by: Mohammed Gamal <mgamal@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <20200710154811.418214-8-mgamal@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1dbf5d68
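      A hedged sketch of the EPT-violation side of the check; the helper name is
      illustrative and the exact placement inside the EPT violation handler is an
      assumption, while cpuid_maxphyaddr, allow_smaller_maxphyaddr and
      kvm_emulate_instruction are existing KVM symbols:

          /* True if the guest physical address sets bits above the guest's MAXPHYADDR. */
          static bool gpa_exceeds_guest_maxphyaddr(struct kvm_vcpu *vcpu, gpa_t gpa)
          {
                  return (gpa >> cpuid_maxphyaddr(vcpu)) != 0;
          }

          /* In the EPT violation handler, before normal fault handling: emulating the
           * instruction lets the page-fault injection path set PFERR_RSVD_MASK. */
          if (allow_smaller_maxphyaddr && gpa_exceeds_guest_maxphyaddr(vcpu, gpa))
                  return kvm_emulate_instruction(vcpu, 0);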
    • KVM: VMX: introduce vmx_need_pf_intercept · a0c13434
      Committed by Paolo Bonzini
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <20200710154811.418214-7-mgamal@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a0c13434
    • KVM: x86: rename update_bp_intercept to update_exception_bitmap · 6986982f
      Committed by Paolo Bonzini
      We would like to introduce a callback to update the #PF intercept
      when CPUID changes.  Just reuse update_bp_intercept since VMX is
      already using update_exception_bitmap instead of a bespoke function.
      
      While at it, remove an unnecessary assignment in the SVM version,
      which is already done in the caller (kvm_arch_vcpu_ioctl_set_guest_debug)
      and has nothing to do with the exception bitmap.
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6986982f
  7. 09 July 2020, 6 commits