1. 16 May 2020, 3 commits
  2. 14 May 2020, 13 commits
  3. 08 May 2020, 2 commits
    • KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6 · d67668e9
      Committed by Paolo Bonzini
      There are two issues with KVM_EXIT_DEBUG on AMD, whose root cause is the
      different handling of DR6 on intercepted #DB exceptions on Intel and AMD.
      
      On Intel, #DB exceptions transmit the DR6 value via the exit qualification
      field of the VMCS, and the exit qualification only contains the description
      of the precise event that caused a vmexit.
      
      On AMD, instead, the DR6 field of the VMCB is filled in as if the #DB exception
      were to be injected into the guest.  This has two effects when guest debugging
      is in use:
      
      * the guest DR6 is clobbered
      
      * the kvm_run->debug.arch.dr6 field can accumulate more debug events, rather
      than just the last one that happened (the testcase in the next patch covers
      this issue).
      
      This patch fixes both issues by emulating, so to speak, the Intel behavior
      on AMD processors.  The important observation is that (after the previous
      patches) the VMCB value of DR6 is only ever observable from the guest if
      KVM_DEBUGREG_WONT_EXIT is set.  Therefore we can actually set vmcb->save.dr6
      to any value we want as long as KVM_DEBUGREG_WONT_EXIT is clear, which it
      will be if guest debugging is enabled.
      
      Therefore it is possible to enter the guest with an all-zero DR6,
      reconstruct the #DB payload from the DR6 we get at exit time, and let
      kvm_deliver_exception_payload move the newly set bits into vcpu->arch.dr6.
      Some extra bits may be included in the payload if KVM_DEBUGREG_WONT_EXIT
      is set, but this is harmless.
      
      This may not be the most optimized way to deal with this, but it is
      simple and, being confined within SVM code, it gets rid of the set_dr6
      callback and kvm_update_dr6.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d67668e9
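      A minimal standalone sketch of the payload idea described above, assuming
      the usual x86 DR6 encoding (fixed-1 bits 0xfffe0ff0, active-low RTM bit 16);
      the helper below is illustrative, not the actual KVM code:

          #include <stdint.h>
          #include <stdio.h>

          #define DR6_FIXED_1  0xfffe0ff0u              /* bits that always read as 1 */
          #define DR6_RTM      0x00010000u              /* active-low: 1 means "no RTM #DB" */
          #define DR6_CLEAN    (DR6_FIXED_1 | DR6_RTM)  /* "no debug event" value */

          /* Fold the bits newly reported at exit time into the tracked DR6,
           * a simplified version of how a #DB payload lands in vcpu->arch.dr6. */
          static uint32_t merge_db_payload(uint32_t tracked_dr6, uint32_t exit_dr6)
          {
              uint32_t payload = exit_dr6 ^ DR6_CLEAN;  /* bits that changed vs. the clean value */

              tracked_dr6 |= payload & ~DR6_RTM;        /* normal status bits accumulate as 1s */
              tracked_dr6 &= ~(payload & DR6_RTM);      /* RTM is active-low, so a hit clears it */
              return tracked_dr6;
          }

          int main(void)
          {
              /* guest entered with a clean DR6; hardware then reports breakpoint 0 */
              printf("dr6 = %#x\n", merge_db_payload(DR6_CLEAN, DR6_CLEAN | 0x1));
              return 0;   /* prints dr6 = 0xffff0ff1 */
          }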
    • KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6 · 5679b803
      Committed by Paolo Bonzini
      kvm_x86_ops.set_dr6 is only ever called with vcpu->arch.dr6 as the
      second argument.  Ensure that the VMCB value is synchronized to
      vcpu->arch.dr6 on #DB (both "normal" and nested) and nested vmentry, so
      that the current value of DR6 is always available in vcpu->arch.dr6.
      The get_dr6 callback can just access vcpu->arch.dr6 and becomes redundant.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5679b803
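      A hedged sketch of the synchronization point (field names follow KVM's SVM
      code of that era; this is not the literal diff):

          /* sketch: inside the SVM #DB intercept handler */
          static int db_interception_sketch(struct vcpu_svm *svm)
          {
              struct kvm_vcpu *vcpu = &svm->vcpu;

              /* keep the architectural view current before any guest-debug
               * or exception-payload handling looks at DR6 */
              vcpu->arch.dr6 = svm->vmcb->save.dr6;

              /* ... existing guest-debug vs. reinjection logic follows ... */
              return 1;
          }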
  4. 04 May 2020, 1 commit
  5. 23 April 2020, 1 commit
    • KVM: x86: move nested-related kvm_x86_ops to a separate struct · 33b22172
      Committed by Paolo Bonzini
      Clean up some of the patching of kvm_x86_ops by moving the ops related to
      nested virtualization into a separate struct.
      
      As a result, these ops will always be non-NULL on VMX.  This is not a problem:
      
      * check_nested_events is only called if is_guest_mode(vcpu) returns true
      
      * get_nested_state treats VMXOFF state the same as nested being disabled
      
      * set_nested_state fails if you attempt to set nested state while
        nesting is disabled
      
      * nested_enable_evmcs could already be called on a CPU without VMX enabled
        in CPUID.
      
      * nested_get_evmcs_version was fixed in the previous patch
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      33b22172
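      A rough sketch of the shape of the split; the member names are illustrative
      approximations rather than the exact fields introduced by the patch:

          /* sketch: nested-virtualization callbacks grouped into their own table */
          struct kvm_x86_nested_ops {
              int (*check_events)(struct kvm_vcpu *vcpu);
              int (*get_state)(struct kvm_vcpu *vcpu,
                               struct kvm_nested_state __user *user_state,
                               unsigned int size);
              int (*set_state)(struct kvm_vcpu *vcpu,
                               struct kvm_nested_state __user *user_state,
                               struct kvm_nested_state *kvm_state);
              int (*enable_evmcs)(struct kvm_vcpu *vcpu, u16 *vmcs_version);
              u16 (*get_evmcs_version)(struct kvm_vcpu *vcpu);
          };

          struct kvm_x86_ops {
              /* ... existing per-vendor callbacks ... */
              struct kvm_x86_nested_ops *nested_ops;
          };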
  6. 21 April 2020, 9 commits
    • KVM: SVM: avoid infinite loop on NPF from bad address · e72436bc
      Committed by Paolo Bonzini
      When a nested page fault is taken from an address that does not have
      a memslot associated to it, kvm_mmu_do_page_fault returns RET_PF_EMULATE
      (via mmu_set_spte) and kvm_mmu_page_fault then invokes svm_need_emulation_on_page_fault.
      
      The default answer there is to return false, but in this case this just
      causes the page fault to be retried ad libitum.  Since this is not a
      fast path, and the only other case where it is taken is an erratum,
      just stick a kvm_vcpu_gfn_to_memslot check in there to detect the
      common case where the erratum is not happening.
      
      This fixes an infinite loop in the new set_memory_region_test.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e72436bc
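      A hedged sketch of the check described above; the exact gfn the patch passes
      (and the surrounding erratum detection) are only summarized here:

          /* sketch: if the gfn KVM is looking at has no memslot, let the fault go
           * to the emulator (which will report an error) instead of being retried
           * forever as a nested page fault */
          static bool need_emulation_on_page_fault_sketch(struct kvm_vcpu *vcpu, gfn_t gfn)
          {
              if (!kvm_vcpu_gfn_to_memslot(vcpu, gfn))
                  return true;

              /* ... existing detection of the SVM erratum case ... */
              return false;
          }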
    • KVM: X86: Improve latency for single target IPI fastpath · a9ab13ff
      Committed by Wanpeng Li
      In cloud environments we observe that IPI and timer account for most of the
      MSR-write vmexits; let's optimize virtual IPI latency more aggressively and
      inject the target IPI as soon as possible.

      Running the kvm-unit-tests/vmexit.flat IPI test on an SKX server, with the
      adaptive advance lapic timer and adaptive halt-polling disabled to avoid
      interference, this patch gives another 7% improvement.
      
      w/o fastpath   -> x86.c fastpath      4238 -> 3543  16.4%
      x86.c fastpath -> vmx.c fastpath      3543 -> 3293     7%
      w/o fastpath   -> vmx.c fastpath      4238 -> 3293  22.3%
      
      Cc: Haiwei Li <lihaiwei@tencent.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200410174703.1138-3-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a9ab13ff
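      A hedged sketch of the fastpath dispatch this change builds on: right after
      VM-Exit, still with interrupts off, the exit reason is checked so a
      single-target x2APIC ICR write can be completed immediately (helper names
      approximate the fastpath code, not a verified hunk):

          /* sketch: called from the tail of vmx_vcpu_run() */
          static enum exit_fastpath_completion exit_fastpath_sketch(struct kvm_vcpu *vcpu)
          {
              switch (to_vmx(vcpu)->exit_reason) {
              case EXIT_REASON_MSR_WRITE:
                  /* completes a fixed-mode, single-target ICR write in place */
                  return handle_fastpath_set_msr_irqoff(vcpu);
              default:
                  return EXIT_FASTPATH_NONE;
              }
          }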
    • KVM: SVM: Use do_machine_check to pass MCE to the host · 1c164cb3
      Committed by Uros Bizjak
      Use do_machine_check instead of INT $12 to pass MCE to the host,
      the same approach VMX uses.
      
      On a related note, there is no reason to limit the use of do_machine_check
      to 64 bit targets, as is currently done for VMX. MCE handling works
      for both target families.
      
      The patch is only compile-tested, for both 64-bit and 32-bit targets; someone
      should test the passing of the exception by injecting some MCEs into the
      guest.

      For a future non-RFC patch, kvm_machine_check should be moved to some
      appropriate header file.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Message-Id: <20200411153627.3474710-1-ubizjak@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1c164cb3
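      A hedged sketch of the approach, mirroring the VMX-side helper the message
      refers to; the do_machine_check() signature has changed across kernel
      versions, so treat this as illustrative:

          /* sketch: forward a machine check that hit while the guest was running
           * to the host's #MC handler instead of executing "int $0x12" */
          static void kvm_machine_check_sketch(void)
          {
          #if defined(CONFIG_X86_MCE)
              struct pt_regs regs = {
                  .cs    = 3,              /* pretend it came from user mode */
                  .flags = X86_EFLAGS_IF,  /* interrupts were on in the guest */
              };

              do_machine_check(&regs);
          #endif
          }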
    • KVM: x86: Introduce KVM_REQ_TLB_FLUSH_CURRENT to flush current ASID · eeeb4f67
      Committed by Sean Christopherson
      Add KVM_REQ_TLB_FLUSH_CURRENT to allow optimized TLB flushing of VMX's
      EPTP/VPID contexts[*] from the KVM MMU and/or in a deferred manner, e.g.
      to flush L2's context during nested VM-Enter.
      
      Convert KVM_REQ_TLB_FLUSH to KVM_REQ_TLB_FLUSH_CURRENT in flows where
      the flush is directly associated with vCPU-scoped instruction emulation,
      i.e. MOV CR3 and INVPCID.
      
      Add a comment in vmx_vcpu_load_vmcs() above its KVM_REQ_TLB_FLUSH to
      make it clear that it deliberately requests a flush of all contexts.
      
      Service any pending flush request on nested VM-Exit as it's possible a
      nested VM-Exit could occur after requesting a flush for L2.  Add the
      same logic for nested VM-Enter even though it's _extremely_ unlikely
      for flush to be pending on nested VM-Enter, but theoretically possible
      (in the future) due to RSM (SMM) emulation.
      
      [*] Intel also has an Address Space Identifier (ASID) concept, e.g.
          EPTP+VPID+PCID == ASID, it's just not documented in the SDM because
          the rules of invalidation are different based on which piece of the
          ASID is being changed, i.e. whether the EPTP, VPID, or PCID context
          must be invalidated.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200320212833.3507-25-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      eeeb4f67
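      A hedged sketch of how the two request bits could be serviced in the run
      loop; the flush helpers named here are approximations of what the patch
      wires up:

          /* sketch: servicing pending TLB flush requests before reentering the guest */
          static void service_tlb_flush_requests_sketch(struct kvm_vcpu *vcpu)
          {
              if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
                  kvm_vcpu_flush_tlb_all(vcpu);      /* flush every ASID/context */

              if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
                  kvm_vcpu_flush_tlb_current(vcpu);  /* only the active EPTP/VPID context */
          }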
    • KVM: x86: Rename ->tlb_flush() to ->tlb_flush_all() · 7780938c
      Committed by Sean Christopherson
      Rename ->tlb_flush() to ->tlb_flush_all() in preparation for adding a
      new hook to flush only the current ASID/context.
      
      Opportunistically replace the comment in vmx_flush_tlb() that explains
      why it flushes all EPTP/VPID contexts with a comment explaining why it
      unconditionally uses INVEPT when EPT is enabled.  I.e. rely on the "all"
      part of the name to clarify why it does global INVEPT/INVVPID.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200320212833.3507-23-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7780938c
    • KVM: SVM: Document the ASID logic in svm_flush_tlb() · 4a41e43c
      Committed by Sean Christopherson
      Add a comment in svm_flush_tlb() to document why it flushes only the
      current ASID, even when it is invoked when flushing remote TLBs.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200320212833.3507-22-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4a41e43c
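      For reference, a hedged sketch of what the documented function amounts to,
      comment included (condensed from the SVM code of that era, not a verbatim
      copy):

          static void svm_flush_tlb_sketch(struct kvm_vcpu *vcpu)
          {
              struct vcpu_svm *svm = to_svm(vcpu);

              /*
               * Flushing only the current ASID is enough even for remote flushes:
               * KVM uses a single ASID for L1 and L2 and flushes on the nested
               * transitions, so every stale entry is tagged with this ASID.
               */
              if (static_cpu_has(X86_FEATURE_FLUSHBYASID))
                  svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ASID;
              else
                  svm->asid_generation--;
          }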
    • KVM: SVM: Wire up ->tlb_flush_guest() directly to svm_flush_tlb() · 72b38320
      Committed by Sean Christopherson
      Use svm_flush_tlb() directly for kvm_x86_ops->tlb_flush_guest() now that
      the @invalidate_gpa param to ->tlb_flush() is gone, i.e. the wrapper for
      ->tlb_flush_guest() is no longer necessary.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200320212833.3507-18-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      72b38320
    • KVM: x86: Drop @invalidate_gpa param from kvm_x86_ops' tlb_flush() · f55ac304
      Committed by Sean Christopherson
      Drop @invalidate_gpa from ->tlb_flush() and kvm_vcpu_flush_tlb() now
      that all callers pass %true for said param, or ignore the param (SVM has
      an internal call to svm_flush_tlb() in svm_flush_tlb_guest that somewhat
      arbitrarily passes %false).
      
      Remove __vmx_flush_tlb() as it is no longer used.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200320212833.3507-17-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f55ac304
    • KVM: x86: Move "flush guest's TLB" logic to separate kvm_x86_ops hook · e64419d9
      Committed by Sean Christopherson
      Add a dedicated hook to handle flushing TLB entries on behalf of the
      guest, i.e. for a paravirtualized TLB flush, and use it directly instead
      of bouncing through kvm_vcpu_flush_tlb().
      
      For VMX, change the effective implementation to never do INVEPT and flush
      only the current context, i.e. to always flush via INVVPID(SINGLE_CONTEXT).
      The INVEPT performed by __vmx_flush_tlb() when @invalidate_gpa=false and
      enable_vpid=0 is unnecessary, as it will only flush guest-physical mappings;
      linear and combined mappings are flushed by VM-Enter when VPID is disabled,
      and changes in the guest page tables do not affect guest-physical mappings.
      
      When EPT and VPID are enabled, doing INVVPID is not required (by Intel's
      architecture) to invalidate guest-physical mappings, i.e. TLB entries
      that cache guest-physical mappings can live across INVVPID as the
      mappings are associated with an EPTP, not a VPID.  The intent of
      @invalidate_gpa is to inform vmx_flush_tlb() that it must "invalidate
      gpa mappings", i.e. do INVEPT and not simply INVVPID.  Other than nested
      VPID handling, which now calls vpid_sync_context() directly, the only
      scenario where KVM can safely do INVVPID instead of INVEPT (when EPT is
      enabled) is if KVM is flushing TLB entries from the guest's perspective,
      i.e. is only required to invalidate linear mappings.
      
      For SVM, flushing TLB entries from the guest's perspective can be done
      by flushing the current ASID, as changes to the guest's page tables are
      associated only with the current ASID.
      
      Adding a dedicated ->tlb_flush_guest() paves the way toward removing
      @invalidate_gpa, which is a potentially dangerous control flag as its
      meaning is not exactly crystal clear, even for those who are familiar
      with the subtleties of what mappings Intel CPUs are/aren't allowed to
      keep across various invalidation scenarios.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200320212833.3507-15-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e64419d9
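      Hedged sketches of what the two vendor implementations amount to, per the
      reasoning above (condensed, not the literal hunks):

          /* VMX: only linear/combined mappings need to go; EPT (guest-physical)
           * entries are unaffected by changes to the guest's page tables. */
          static void vmx_flush_tlb_guest_sketch(struct kvm_vcpu *vcpu)
          {
              /* a nop when the vCPU has no VPID: VM-Enter already flushes
               * linear mappings in that case */
              vpid_sync_context(to_vmx(vcpu)->vpid);
          }

          /* SVM: the guest's mappings are all tagged with the current ASID, so a
           * current-ASID flush is exactly a "flush from the guest's perspective". */
          static void svm_flush_tlb_guest_sketch(struct kvm_vcpu *vcpu)
          {
              svm_flush_tlb(vcpu);
          }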
  7. 16 April 2020, 2 commits
  8. 14 April 2020, 1 commit
  9. 03 April 2020, 5 commits
  10. 31 March 2020, 3 commits