1. 16 Aug, 2021 1 commit
  2. 28 Jul, 2021 1 commit
  3. 26 Jul, 2021 2 commits
  4. 15 Jul, 2021 4 commits
  5. 25 Jun, 2021 4 commits
  6. 18 Jun, 2021 7 commits
  7. 07 May, 2021 2 commits
  8. 03 May, 2021 3 commits
  9. 22 Apr, 2021 1 commit
  10. 17 Apr, 2021 4 commits
    • KVM: x86: pending exceptions must not be blocked by an injected event · 4020da3b
      Committed by Maxim Levitsky
      Injected interrupts/NMIs should not block a pending exception.
      Instead, they should either be lost if the nested hypervisor doesn't
      intercept the pending exception (as in stock x86), or be delivered
      in the exitintinfo/IDT_VECTORING_INFO field as part of the VMexit
      that corresponds to the pending exception.
      
      The only reason for an exception to be blocked is a pending nested
      run (which can't really happen currently, but is still worth
      checking for).
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210401143817.1030695-2-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: call nested_svm_load_cr3 on nested state load · 232f75d3
      Committed by Maxim Levitsky
      While KVM's MMU should be fully reset by loading of nested CR0/CR3/CR4
      by KVM_SET_SREGS, we are not in nested mode yet when we do it and therefore
      only root_mmu is reset.
      
      On regular nested entries we call nested_svm_load_cr3, which updates
      the guest's CR3 in the MMU when needed and also reinitializes the MMU.
      That in turn initializes walk_mmu when nested paging is enabled in
      both host and guest.
      
      Since we don't call nested_svm_load_cr3 on nested state load,
      walk_mmu can be left uninitialized. If we get a nested page fault
      right after entering the nested guest for the first time after
      migration and decide to emulate it, the emulator tries to call
      walk_mmu->gva_to_gpa, which is NULL, causing a NULL pointer
      dereference.
      
      Therefore we should call this function on nested state load as well.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210401141814.1029036-3-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Account a variety of miscellaneous allocations · eba04b20
      Committed by Sean Christopherson
      Switch to GFP_KERNEL_ACCOUNT for a handful of allocations that are
      clearly associated with a single task/VM.
      
      Note, there are several SEV allocations that aren't accounted, but
      those can (hopefully) be fixed by using the local stack for memory.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210331023025.2485960-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: If VMRUN is single-stepped, queue the #DB intercept in nested_svm_vmexit() · 9a7de6ec
      Committed by Krish Sadhukhan
      According to APM, the #DB intercept for a single-stepped VMRUN must happen
      after the completion of that instruction, when the guest does #VMEXIT to
      the host. However, in the current implementation of KVM, the #DB intercept
      for a single-stepped VMRUN happens after the completion of the instruction
      that follows the VMRUN instruction. When the #DB intercept handler is
      invoked, it shows the RIP of the instruction that follows VMRUN, instead
      of VMRUN itself. This is an incorrect RIP as far as single-stepping VMRUN
      is concerned.
      
      This patch fixes the problem by checking, in nested_svm_vmexit(), for the
      condition that the VMRUN instruction is being single-stepped and if so,
      queues the pending #DB intercept so that the #DB is accounted for before
      we execute L1's next instruction.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Message-Id: <20210323175006.73249-2-krish.sadhukhan@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  11. 01 Apr, 2021 2 commits
    • KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit · 3c346c0c
      Committed by Paolo Bonzini
      Fixing nested_vmcb_check_save to avoid all TOC/TOU races
      is a bit harder in released kernels, so do the bare minimum
      by preventing EFER.SVME from being cleared.  Clearing it is
      problematic because svm_set_efer frees the data structures for
      nested virtualization if EFER.SVME is cleared.
      
      Also check that EFER.SVME remains set after a nested vmexit;
      clearing it could happen if the bit is zero in the save area
      that is passed to KVM_SET_NESTED_STATE (the save area of the
      nested state corresponds to the nested hypervisor's state
      and is restored on the next nested vmexit).
      
      Cc: stable@vger.kernel.org
      Fixes: 2fcf4876 ("KVM: nSVM: implement on demand allocation of the nested state")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: load control fields from VMCB12 before checking them · a58d9166
      Committed by Paolo Bonzini
      Avoid races between check and use of the nested VMCB controls.  This
      for example ensures that the VMRUN intercept is always reflected to the
      nested hypervisor, instead of being processed by the host.  Without this
      patch, it is possible to end up with svm->nested.hsave pointing to
      the MSR permission bitmap for nested guests.
      
      This bug is CVE-2021-29657.
      Reported-by: Felix Wilhelm <fwilhelm@google.com>
      Cc: stable@vger.kernel.org
      Fixes: 2fcf4876 ("KVM: nSVM: implement on demand allocation of the nested state")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
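The "load, then check" idea behind this fix can be illustrated with a toy model in Python. This is only a sketch, not kernel code. The function names, the `racing_write` hook, and the dict standing in for the guest-writable vmcb12 control area are all hypothetical and chosen for illustration.

```python
# Toy model of "load, then check": validate a private snapshot, never the
# guest-writable memory itself. All names here are hypothetical.

def load_controls(guest_memory):
    """Copy the control fields out of guest-writable memory exactly once."""
    return dict(guest_memory)

def controls_valid(cached):
    """Consistency check, run only on the cached copy."""
    return bool(cached.get("intercept_vmrun"))

def nested_vmrun(guest_memory, racing_write=None):
    cached = load_controls(guest_memory)
    if racing_write is not None:
        # Models another vCPU modifying guest memory after the copy was taken.
        racing_write(guest_memory)
    if not controls_valid(cached):
        return None  # the nested entry is rejected
    return cached    # later code uses the snapshot that was checked
```

Because both the check and the later use operate on the same snapshot, a write that lands in guest memory after the copy cannot make the checked and used values differ.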
  12. 15 Mar, 2021 9 commits
    • KVM: nSVM: Optimize vmcb12 to vmcb02 save area copies · 8173396e
      Committed by Cathy Avery
      Use the vmcb12 control clean field to determine which vmcb12.save
      registers were marked dirty in order to minimize register copies
      when switching from L1 to L2. Those vmcb12 registers marked as dirty need
      to be copied to L0's vmcb02 as they will be used to update the vmcb
      state cache for the L2 VMRUN.  In the case where we have a different
      vmcb12 from the last L2 VMRUN all vmcb12.save registers must be
      copied over to L2.save.
      
      Tested:
      kvm-unit-tests
      kvm selftests
      Fedora L1 L2
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Cathy Avery <cavery@redhat.com>
      Message-Id: <20210301200844.2000-1-cavery@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
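A rough Python model of the clean-bit optimization described above, under stated assumptions: the function name `sync_save_area`, the `GROUPS` table, and the dict-based save areas are hypothetical, and only two register groups are modelled. A set clean bit means the group is unchanged since the last VMRUN, and a different vmcb12 forces a full copy.

```python
# Toy model: copy only the vmcb12.save register groups that are marked dirty.
# The grouping below is illustrative and not taken from the patch itself.
CLEAN_CRX = 1 << 5  # CR0, CR3, CR4, EFER
CLEAN_DRX = 1 << 6  # DR6, DR7

GROUPS = {
    CLEAN_CRX: ("cr0", "cr3", "cr4", "efer"),
    CLEAN_DRX: ("dr6", "dr7"),
}

def sync_save_area(vmcb12_save, vmcb02_save, clean, same_vmcb12):
    """Copy dirty register groups from vmcb12 to vmcb02; return what was copied."""
    if not same_vmcb12:
        clean = 0  # a different vmcb12 than last time: treat everything as dirty
    copied = []
    for bit, regs in GROUPS.items():
        if clean & bit:
            continue  # group marked clean, the cached vmcb02 copy is still valid
        for reg in regs:
            vmcb02_save[reg] = vmcb12_save[reg]
            copied.append(reg)
    return copied
```

With `clean = CLEAN_CRX` and an unchanged vmcb12, only the debug registers are copied.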
    • KVM: SVM: Add support for Virtual SPEC_CTRL · d00b99c5
      Committed by Babu Moger
      Newer AMD processors have a feature to virtualize the use of the
      SPEC_CTRL MSR. Presence of this feature is indicated via CPUID
      function 0x8000000A_EDX[20]: GuestSpecCtrl. Hypervisors are not
      required to enable this feature since it is automatically enabled on
      processors that support it.
      
      A hypervisor may wish to impose speculation controls on guest
      execution or a guest may want to impose its own speculation controls.
      Therefore, the processor implements both host and guest
      versions of SPEC_CTRL.
      
      When in host mode, the host SPEC_CTRL value is in effect and writes
      update only the host version of SPEC_CTRL. On a VMRUN, the processor
      loads the guest version of SPEC_CTRL from the VMCB. When the guest
      writes SPEC_CTRL, only the guest version is updated. On a VMEXIT,
      the guest version is saved into the VMCB and the processor returns
      to only using the host SPEC_CTRL for speculation control. The guest
      SPEC_CTRL is located at offset 0x2E0 in the VMCB.
      
      The effective SPEC_CTRL setting is the guest SPEC_CTRL setting or'ed
      with the hypervisor SPEC_CTRL setting. This allows the hypervisor to
      ensure a minimum SPEC_CTRL if desired.
      
      This support also fixes an issue where a guest may sometimes see an
      inconsistent value for the SPEC_CTRL MSR on processors that support
      this feature. With the current SPEC_CTRL support, the first write to
      SPEC_CTRL is intercepted and the virtualized version of the SPEC_CTRL
      MSR is not updated. When the guest reads back the SPEC_CTRL MSR, it
      will be 0x0, instead of the actual expected value. There isn’t a
      security concern here, because the host SPEC_CTRL value is or’ed with
      the Guest SPEC_CTRL value to generate the effective SPEC_CTRL value.
      KVM writes the guest's virtualized SPEC_CTRL value to the SPEC_CTRL
      MSR just before the VMRUN, so it will always have the actual value
      even though it doesn’t appear that way in the guest. The guest will
      only see the proper value for the SPEC_CTRL register if the guest were
      to write to the SPEC_CTRL register again. With Virtual SPEC_CTRL
      support, the save area spec_ctrl is properly saved and restored,
      so the guest will always see the proper value when it is read back.
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Message-Id: <161188100955.28787.11816849358413330720.stgit@bmoger-ubuntu>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
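The host/guest split described in this commit message can be sketched as a small Python model. It is a simplification for illustration only: the class `VirtSpecCtrl` and its methods are hypothetical, and the model covers just the rules stated above (separate host and guest versions, with the effective value in guest mode being the OR of both).

```python
# Toy model of virtualized SPEC_CTRL. Bit positions follow the architectural
# SPEC_CTRL MSR layout; the class itself is hypothetical.
SPEC_CTRL_IBRS = 1 << 0
SPEC_CTRL_STIBP = 1 << 1
SPEC_CTRL_SSBD = 1 << 2

class VirtSpecCtrl:
    def __init__(self, host_value=0):
        self.host = host_value
        self.vmcb_spec_ctrl = 0  # stands in for the VMCB field at offset 0x2E0
        self.in_guest = False

    def vmrun(self):
        self.in_guest = True   # guest version is loaded from the VMCB

    def vmexit(self):
        self.in_guest = False  # guest version stays saved in the VMCB

    def write(self, value):
        if self.in_guest:
            self.vmcb_spec_ctrl = value  # only the guest version is updated
        else:
            self.host = value            # only the host version is updated

    def read(self):
        return self.vmcb_spec_ctrl if self.in_guest else self.host

    def effective(self):
        if self.in_guest:
            return self.host | self.vmcb_spec_ctrl
        return self.host
```

In this model a host that sets `SPEC_CTRL_IBRS` keeps it in effect while the guest runs, even if the guest writes only `SPEC_CTRL_SSBD`, and the guest still reads back exactly what it wrote.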
    • KVM: nSVM: always use vmcb01 for vmsave/vmload of guest state · cc3ed80a
      Committed by Maxim Levitsky
      This avoids copying these fields between vmcb01
      and vmcb02 on nested guest entry/exit.
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: Add helper to synthesize nested VM-Exit without collateral · 3a87c7e0
      Committed by Sean Christopherson
      Add a helper to consolidate boilerplate for nested VM-Exits that don't
      provide any data in exit_info_*.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210302174515.2812275-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Handle triple fault in L2 without killing L1 · cb6a32c2
      Committed by Sean Christopherson
      Synthesize a nested VM-Exit if L2 triggers an emulated triple fault
      instead of exiting to userspace, which likely will kill L1.  Any flow
      that does KVM_REQ_TRIPLE_FAULT is suspect, but the most common scenario
      for L2 killing L1 is if L0 (KVM) intercepts a contributory exception that
      is _not_ intercepted by L1.  E.g. if KVM is intercepting #GPs for the
      VMware backdoor, a #GP that occurs in L2 while vectoring an injected #DF
      will cause KVM to emulate triple fault.
      
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210302174515.2812275-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
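The decision this commit introduces can be summarized with a minimal Python sketch. It only models the routing choice. The function name and the returned strings are hypothetical labels, not kernel identifiers.

```python
# Toy model of where an emulated triple fault is routed. Hypothetical names.

def handle_triple_fault(is_guest_mode):
    """Return a label for the action taken on an emulated triple fault."""
    if is_guest_mode:
        # The vCPU is running L2: reflect the fault to L1 as a nested VM-Exit,
        # so that L1 decides what to do with its own guest.
        return "synthesize_nested_vmexit_to_l1"
    # The vCPU is running L1 (or a non-nested guest): exit to userspace,
    # which typically shuts down or resets the whole VM.
    return "exit_to_userspace"
```

The point of the split is that a fault caused by L2 affects only L2 from L1's point of view, instead of taking down L1 along with it.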
    • KVM: SVM: Pass struct kvm_vcpu to exit handlers (and many, many other places) · 63129754
      Committed by Paolo Bonzini
      Refactor the svm_exit_handlers API to pass @vcpu instead of @svm to
      allow directly invoking common x86 exit handlers (in a future patch).
      Opportunistically convert an absurd number of instances of 'svm->vcpu'
      to direct uses of 'vcpu' to avoid pointless casting.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210205005750.3841462-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: Trace VM-Enter consistency check failures · 11f0cbf0
      Committed by Sean Christopherson
      Use trace_kvm_nested_vmenter_failed() and its macro magic to trace
      consistency check failures on nested VMRUN.  Tracing such failures by
      running the buggy VMM as a KVM guest is often the only way to get a
      precise explanation of why VMRUN failed.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210204000117.3303214-13-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: Add missing checks for reserved bits to svm_set_nested_state() · 6906e06d
      Committed by Krish Sadhukhan
      The KVM_SET_NESTED_STATE path needs the same checks on the CPU
      registers as the VMRUN path for a nested guest. This patch adds
      those missing checks to svm_set_nested_state().
      Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Message-Id: <20201006190654.32305-3-krish.sadhukhan@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
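One way to picture "the same checks on both paths" is a single validator shared by the VMRUN path and the state-restore path. The Python sketch below is an illustration under assumptions: the commit message does not list the individual checks, so the three rules shown (EFER.SVME set, CR0[63:32] zero, no CR0.NW without CR0.CD) are examples drawn from the architectural VMRUN consistency rules, and all function names are hypothetical.

```python
# Toy model: one validator shared by both entry paths. Hypothetical names;
# the specific checks are examples, not a list taken from the patch.
EFER_SVME = 1 << 12
X86_CR0_NW = 1 << 29
X86_CR0_CD = 1 << 30

def save_area_valid(efer, cr0):
    if not efer & EFER_SVME:
        return False  # SVM must be enabled
    if cr0 >> 32:
        return False  # upper 32 bits of CR0 are reserved
    if (cr0 & X86_CR0_NW) and not (cr0 & X86_CR0_CD):
        return False  # CD=0 with NW=1 is an illegal combination
    return True

def vmrun_path(save):
    return save_area_valid(save["efer"], save["cr0"])

def set_nested_state_path(save):
    # Same validator as the VMRUN path, so userspace cannot restore a state
    # that a nested VMRUN would have rejected.
    return save_area_valid(save["efer"], save["cr0"])
```

Routing both paths through one function keeps the two sets of checks from drifting apart.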
    • KVM: nSVM: only copy L1 non-VMLOAD/VMSAVE data in svm_set_nested_state() · c08f390a
      Committed by Paolo Bonzini
      The VMLOAD/VMSAVE data is not taken from userspace, since it will
      not be restored on VMEXIT (it will be copied from VMCB02 to VMCB01).
      For clarity, replace the wholesale copy of the VMCB save area
      with a copy of that state only.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>