1. 01 Jun, 2020 28 commits
  2. 28 May, 2020 12 commits
    • KVM: nVMX: always update CR3 in VMCS · df7e0681
      Committed by Paolo Bonzini
      vmx_load_mmu_pgd is delaying the write of GUEST_CR3 to prepare_vmcs02 as
      an optimization, but this is only correct before the nested vmentry.
      If userspace is modifying CR3 with KVM_SET_SREGS after the VM has
      already been put in guest mode, the value of CR3 will not be updated.
      Remove the optimization, which almost never triggers anyway.
      
      Fixes: 04f11ef4 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
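The failure mode described in this commit message can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_cr3_vcpu`, `toy_load_pgd_deferred` and `toy_load_pgd_always` are hypothetical names, not KVM functions, and the single `guest_cr3` field merely stands in for the GUEST_CR3 field mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a vCPU; not a KVM structure. */
struct toy_cr3_vcpu {
    bool guest_mode;     /* true once the nested guest is running */
    uint64_t guest_cr3;  /* models the field the CPU will actually use */
};

/* Modelled old behaviour: skip the write while in guest mode, on the
 * assumption that the nested-entry path will fill the field in later. */
static void toy_load_pgd_deferred(struct toy_cr3_vcpu *v, uint64_t cr3)
{
    if (v->guest_mode)
        return;
    v->guest_cr3 = cr3;
}

/* Modelled new behaviour: always write the field. */
static void toy_load_pgd_always(struct toy_cr3_vcpu *v, uint64_t cr3)
{
    v->guest_cr3 = cr3;
}

/* Userspace changes CR3 after the vCPU is already in guest mode. */
static void toy_demo_cr3(void)
{
    struct toy_cr3_vcpu v = { .guest_mode = true, .guest_cr3 = 0x1000 };

    toy_load_pgd_deferred(&v, 0x2000);
    assert(v.guest_cr3 == 0x1000);  /* update lost: stale CR3 remains */

    toy_load_pgd_always(&v, 0x2000);
    assert(v.guest_cr3 == 0x2000);  /* update takes effect */
}
```

In this model the deferred variant is only safe if something later rewrites the field, which is no longer guaranteed once the nested entry has already happened.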
    • KVM: SVM: always update CR3 in VMCB · 978ce583
      Committed by Paolo Bonzini
      svm_load_mmu_pgd is delaying the write of GUEST_CR3 to prepare_vmcs02 as
      an optimization, but this is only correct before the nested vmentry.
      If userspace is modifying CR3 with KVM_SET_SREGS after the VM has
      already been put in guest mode, the value of CR3 will not be updated.
      Remove the optimization, which almost never triggers anyway.
      This was added in commit 689f3bf2 ("KVM: x86: unify callbacks
      to load paging root", 2020-03-16) just to keep the two vendor-specific
      modules closer, but we'll fix VMX too.
      
      Fixes: 689f3bf2 ("KVM: x86: unify callbacks to load paging root")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: correctly inject INIT vmexits · 5b672408
      Committed by Paolo Bonzini
      The usual drill at this point, except there is no code to remove because this
      case was not handled at all.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: remove exit_required · bd279629
      Committed by Paolo Bonzini
      All events now inject vmexits before vmentry rather than after vmexit.  Therefore,
      exit_required is not set anymore and we can remove it.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: inject exceptions via svm_check_nested_events · 7c86663b
      Committed by Paolo Bonzini
      This allows exceptions injected by the emulator to be properly delivered
      as vmexits.  The code also becomes simpler, because we can just let all
      L0-intercepted exceptions go through the usual path.  In particular, our
      emulation of the VMX #DB exit qualification is very much simplified,
      because the vmexit injection path can use kvm_deliver_exception_payload
      to update DR6.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: enable event window in inject_pending_event · c9d40913
      Committed by Paolo Bonzini
      In case an interrupt arrives after nested.check_events but before the
      call to kvm_cpu_has_injectable_intr, we could end up enabling the interrupt
      window even if the interrupt is actually going to be a vmexit.  This is
      useless rather than harmful, but it really complicates reasoning about
      SVM's handling of the VINTR intercept.  We'd like to never bother with
      the VINTR intercept if V_INTR_MASKING=1 && INTERCEPT_INTR=1, because in
      that case there is no interrupt window and we can just exit the nested
      guest whenever we want.
      
      This patch moves the opening of the interrupt window inside
      inject_pending_event.  This consolidates the check for pending
      interrupt/NMI/SMI in one place, and makes KVM's usage of immediate
      exits more consistent, extending it beyond just nested virtualization.
      
      There are two functional changes here.  They only affect corner cases,
      but overall they simplify inject_pending_event.
      
      - re-injection of still-pending events will also use req_immediate_exit
      instead of using interrupt-window intercepts.  This should have no impact
      on performance on Intel since it simply replaces an interrupt-window
      or NMI-window exit with a preemption-timer exit.  On AMD, which has no
      equivalent of the preemption timer, it may incur some overhead but an
      actual effect on performance should only be visible in pathological cases.
      
      - kvm_arch_interrupt_allowed and kvm_vcpu_has_events will return true
      if an interrupt, NMI or SMI is blocked by nested_run_pending.  This
      makes sense because entering the VM will allow it to make progress
      and deliver the event.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: track manually whether an event has been injected · c6b22f59
      Committed by Paolo Bonzini
      Instead of calling kvm_event_needs_reinjection, track its
      future return value in a variable.  This will be useful in
      the next patch.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: Preserve registers modifications done before nested_svm_vmexit() · b6162e82
      Committed by Vitaly Kuznetsov
      L2 guest hang is observed after 'exit_required' was dropped and nSVM
      switched to check_nested_events() completely. The hang is a busy loop when
      e.g. KVM is emulating an instruction (e.g. L2 is accessing MMIO space and
      we drop to userspace). After nested_svm_vmexit() and when L1 is doing VMRUN
      nested guest's RIP is not advanced so KVM goes into emulating the same
      instruction which caused nested_svm_vmexit() and the loop continues.
      
      nested_svm_vmexit() is not new, however, with check_nested_events() we're
      now calling it later than before. In case by that time KVM has modified
      register state we may pick stale values from VMCB when trying to save
      nested guest state to nested VMCB.
      
      nVMX code handles this case correctly: sync_vmcs02_to_vmcs12() called from
      nested_vmx_vmexit() does e.g. 'vmcs12->guest_rip = kvm_rip_read(vcpu)' and
      this ensures KVM-made modifications are preserved. Do the same for nSVM.
      
      Generally, nested_vmx_vmexit()/nested_svm_vmexit() need to pick up all
      nested guest state modifications done by KVM after vmexit. It would be
      great to express this in a way that does not require tracking these
      changes manually, e.g. nested_{vmcb,vmcs}_get_field().
      
      Co-debugged-with: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200527090102.220647-1-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
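The stale-RIP loop described in this commit message can be illustrated with a toy model. It is a sketch under stated assumptions, not the kernel code: `toy_rip_vcpu` and its helpers are hypothetical, with `vmcb_rip` standing in for the value last saved by hardware and `reg_rip` for the value the hypervisor has since modified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nested vCPU state; not a KVM structure. */
struct toy_rip_vcpu {
    uint64_t vmcb_rip;    /* RIP as saved at the last hardware vmexit */
    uint64_t reg_rip;     /* RIP as tracked by the hypervisor afterwards */
    uint64_t nested_rip;  /* RIP handed back to L1 on a nested vmexit */
};

/* The hypervisor emulates an instruction and advances its own RIP copy. */
static void toy_skip_emulated_insn(struct toy_rip_vcpu *v, uint64_t len)
{
    v->reg_rip += len;
}

/* Modelled bug: save the stale hardware-saved value. */
static void toy_nested_vmexit_stale(struct toy_rip_vcpu *v)
{
    v->nested_rip = v->vmcb_rip;
}

/* Modelled fix: save the value that includes hypervisor modifications. */
static void toy_nested_vmexit_fixed(struct toy_rip_vcpu *v)
{
    v->nested_rip = v->reg_rip;
}

static void toy_demo_rip(void)
{
    struct toy_rip_vcpu v = { .vmcb_rip = 0x400, .reg_rip = 0x400, .nested_rip = 0 };

    toy_skip_emulated_insn(&v, 3);

    toy_nested_vmexit_stale(&v);
    assert(v.nested_rip == 0x400);  /* L1 would re-run the same instruction */

    toy_nested_vmexit_fixed(&v);
    assert(v.nested_rip == 0x403);  /* emulated instruction is skipped */
}
```

With the stale variant, resuming the nested guest lands on the already-emulated instruction again, which matches the busy loop the message reports.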
    • KVM: x86: Initialize tdp_level during vCPU creation · 7d2e8748
      Committed by Sean Christopherson
      Initialize vcpu->arch.tdp_level during vCPU creation to avoid consuming
      garbage if userspace calls KVM_RUN without first calling KVM_SET_CPUID.
      
      Fixes: e93fd3b3 ("KVM: x86/mmu: Capture TDP level when updating CPUID")
      Reported-by: syzbot+904752567107eefb728c@syzkaller.appspotmail.com
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200527085400.23759-1-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
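The general pattern behind this fix, giving a field a sane value at creation time instead of relying on a later optional call, might look like the sketch below. Everything here is hypothetical: `toy_tdp_vcpu`, `toy_vcpu_create`, `toy_set_cpuid` and the default value of 4 are assumptions made for illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_DEFAULT_TDP_LEVEL 4  /* assumed default, for illustration only */

/* Hypothetical stand-in for a vCPU; not a KVM structure. */
struct toy_tdp_vcpu {
    int tdp_level;
};

/* Creation initializes the field, so it is valid even if the caller
 * never makes the optional configuration call below. */
static struct toy_tdp_vcpu *toy_vcpu_create(void)
{
    struct toy_tdp_vcpu *v = malloc(sizeof(*v));  /* contents indeterminate */

    if (!v)
        return NULL;
    v->tdp_level = TOY_DEFAULT_TDP_LEVEL;
    return v;
}

/* Optional later configuration, modelling a CPUID update. */
static void toy_set_cpuid(struct toy_tdp_vcpu *v, int level)
{
    v->tdp_level = level;
}

static void toy_demo_tdp(void)
{
    struct toy_tdp_vcpu *v = toy_vcpu_create();

    assert(v != NULL);
    assert(v->tdp_level == TOY_DEFAULT_TDP_LEVEL);  /* safe to run right away */

    toy_set_cpuid(v, 5);
    assert(v->tdp_level == 5);
    free(v);
}
```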
    • KVM: nSVM: leave ASID aside in copy_vmcb_control_area · 6c0238c4
      Committed by Paolo Bonzini
      Restoring the ASID from the hsave area on VMEXIT is wrong, because its
      value depends on the handling of TLB flushes.  Just skipping the field in
      copy_vmcb_control_area will do.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
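A field-by-field copy that deliberately leaves one field alone can be sketched as follows. This is a simplified model, not the kernel's copy_vmcb_control_area: `toy_control_area` has only three illustrative fields, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced control area; not the real VMCB layout. */
struct toy_control_area {
    uint32_t intercepts;
    uint32_t asid;        /* owned by TLB-flush handling, not by the copy */
    uint64_t nested_cr3;
};

/* Copy every field except asid, so the destination keeps whatever ASID
 * the TLB-flush logic last assigned to it. */
static void toy_copy_control_area(struct toy_control_area *dst,
                                  const struct toy_control_area *src)
{
    dst->intercepts = src->intercepts;
    /* dst->asid is intentionally left untouched */
    dst->nested_cr3 = src->nested_cr3;
}

static void toy_demo_asid(void)
{
    struct toy_control_area saved = { .intercepts = 0xf, .asid = 3, .nested_cr3 = 0x5000 };
    struct toy_control_area live  = { .intercepts = 0x1, .asid = 7, .nested_cr3 = 0x9000 };

    toy_copy_control_area(&live, &saved);

    assert(live.intercepts == 0xf);
    assert(live.nested_cr3 == 0x5000);
    assert(live.asid == 7);  /* not overwritten by the saved value */
}
```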
    • KVM: nSVM: fix condition for filtering async PF · a3535be7
      Committed by Paolo Bonzini
      Async page faults have to be trapped in the host (L1 in this case),
      since the APF reason was passed from L0 to L1 and stored in the L1 APF
      data page.  This was completely reversed: the page faults were passed
      to the guest, a L2 hypervisor.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm/x86: Remove redundant function implementations · 88197e6a
      Committed by 彭浩(Richard)
      pic_in_kernel(), ioapic_in_kernel() and irqchip_kernel() have the
      same implementation.
      Signed-off-by: Peng Hao <richard.peng@oppo.com>
      Message-Id: <HKAPR02MB4291D5926EA10B8BFE9EA0D3E0B70@HKAPR02MB4291.apcprd02.prod.outlook.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>