1. 06 January 2023 (1 commit)
  2. 24 November 2022 (3 commits)
• kvm: x86: Disable interception for IA32_XFD on demand · 7b32cbb5
  Committed by Kevin Tian
      mainline inclusion
      from mainline-v5.17-rc1
      commit b5274b1b
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RQLJ
      CVE: NA
      
      Intel-SIG: commit b5274b1b kvm: x86: Disable interception for IA32_XFD on demand.
      
      --------------------------------
      
      Always intercepting IA32_XFD causes non-negligible overhead when this
      register is updated frequently in the guest.
      
      Disable r/w emulation after intercepting the first WRMSR(IA32_XFD)
      with a non-zero value.
      
Disabling WRMSR emulation implies that IA32_XFD becomes out of sync
with the software state in fpstate and the per-CPU xfd cache. This
leads to two additional changes:
      
        - Call fpu_sync_guest_vmexit_xfd_state() after vm-exit to bring
          software states back in-sync with the MSR, before handle_exit_irqoff()
          is called.
      
        - Always trap #NM once write interception is disabled for IA32_XFD.
          The #NM exception is rare if the guest doesn't use dynamic
features. Otherwise, there is at most one #NM exception per guest
task for a given dynamic feature.
      
p.s. We have confirmed that the SDM is being revised to say that
      when setting IA32_XFD[18] the AMX register state is not guaranteed
to be preserved. This clarification avoids adding complexity for a creative
      guest which sets IA32_XFD[18]=1 before saving active AMX state to
      its own storage.
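
Below is a minimal, self-contained C sketch of the flow described above, for
illustration only: the struct and helper names (vcpu_sketch, emulate_wrmsr_xfd,
sync_xfd_after_vmexit) are hypothetical stand-ins, not the actual KVM symbols;
only fpu_sync_guest_vmexit_xfd_state(), mentioned in a comment, is the real
function named in this commit.

#include <stdbool.h>
#include <stdint.h>

struct vcpu_sketch {
    uint64_t guest_xfd;           /* software copy of guest IA32_XFD     */
    bool xfd_no_write_intercept;  /* r/w pass-through enabled for XFD?   */
    bool trap_nm;                 /* #NM kept in the exception bitmap?   */
};

/* First WRMSR(IA32_XFD) with a non-zero value switches the vCPU to
 * pass-through and forces #NM interception from then on. */
static void emulate_wrmsr_xfd(struct vcpu_sketch *v, uint64_t data)
{
    v->guest_xfd = data;
    if (data && !v->xfd_no_write_intercept) {
        v->xfd_no_write_intercept = true;  /* stop intercepting r/w   */
        v->trap_nm = true;                 /* #NM must now be trapped */
    }
}

/* After a VM-exit the MSR may have changed without a trap, so re-read it
 * before anything that consults the software copy (this stands in for
 * calling fpu_sync_guest_vmexit_xfd_state() before handle_exit_irqoff()). */
static void sync_xfd_after_vmexit(struct vcpu_sketch *v, uint64_t hw_xfd)
{
    if (v->xfd_no_write_intercept)
        v->guest_xfd = hw_xfd;
}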
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-22-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
      7b32cbb5
• kvm: x86: Disable RDMSR interception of IA32_XFD_ERR · 8f85b372
  Committed by Jing Liu
      mainline inclusion
      from mainline-v5.17-rc1
      commit 61f20813
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RQLJ
      CVE: NA
      
      Intel-SIG: commit 61f20813 kvm: x86: Disable RDMSR interception of IA32_XFD_ERR.
      
      --------------------------------
      
This saves one unnecessary VM-exit in the guest #NM handler, given that
the MSR is already restored with the guest value before the guest is resumed.
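
As an illustration of why the read pass-through is safe, here is a tiny
hypothetical sketch (not KVM code): the guest value is written into the
physical MSR before VM-entry, so an un-intercepted RDMSR in the guest #NM
handler already reads the correct value without a VM-exit.

#include <stdint.h>

struct xfd_err_sketch {
    uint64_t guest_xfd_err;  /* value kept in the guest FPU container */
    uint64_t hw_xfd_err;     /* value currently in the physical MSR   */
};

/* Right before VM-entry (interrupts disabled) the guest value is loaded
 * into the hardware MSR, so the guest's later RDMSR sees it directly. */
static void load_guest_xfd_err(struct xfd_err_sketch *s)
{
    s->hw_xfd_err = s->guest_xfd_err;
}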
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-15-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
      8f85b372
• kvm: x86: Intercept #NM for saving IA32_XFD_ERR · 4a642360
  Committed by Jing Liu
      mainline inclusion
      from mainline-v5.17-rc1
      commit ec5be88a
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RQLJ
      CVE: NA
      
      Intel-SIG: commit ec5be88a kvm: x86: Intercept #NM for saving IA32_XFD_ERR.
      
      --------------------------------
      
      Guest IA32_XFD_ERR is generally modified in two places:
      
        - Set by CPU when #NM is triggered;
        - Cleared by guest in its #NM handler;
      
      Intercept #NM for the first case when a nonzero value is written
      to IA32_XFD. Nonzero indicates that the guest is willing to do
      dynamic fpstate expansion for certain xfeatures, thus KVM needs to
      manage and virtualize guest XFD_ERR properly. The vcpu exception
      bitmap is updated in XFD write emulation according to guest_fpu::xfd.
      
      Save the current XFD_ERR value to the guest_fpu container in the #NM
      VM-exit handler. This must be done with interrupt disabled, otherwise
      the unsaved MSR value may be clobbered by host activity.
      
The saving operation is conducted conditionally, only when guest_fpu::xfd
contains a non-zero value. Doing so also avoids a misread on a platform
which doesn't support XFD but where #NM is triggered due to L1 interception.
      
      Queueing #NM to the guest is postponed to handle_exception_nmi(). This
      goes through the nested_vmx check so a virtual vmexit is queued instead
      when #NM is triggered in L2 but L1 wants to intercept it.
      
Restore the host value (always ZERO outside of the host #NM
handler) before enabling interrupts.

Restore the guest value from the guest_fpu container right before
entering the guest (with interrupts disabled).
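
A self-contained sketch of the ordering above, with hypothetical types and
helpers standing in for the real KVM/x86 code:

#include <stdint.h>

struct guest_fpu_sketch {
    uint64_t xfd;      /* non-zero => dynamic xfeatures enabled by guest */
    uint64_t xfd_err;  /* guest IA32_XFD_ERR, preserved across VM-exits  */
};

/* #NM VM-exit handler: runs with interrupts disabled so host activity
 * cannot clobber the MSR before it is captured. */
static void handle_nm_vmexit(struct guest_fpu_sketch *fpu, uint64_t *hw_xfd_err)
{
    if (fpu->xfd) {                  /* only when XFD is actually in use */
        fpu->xfd_err = *hw_xfd_err;  /* save the guest value             */
        *hw_xfd_err = 0;             /* restore the host value (zero)    */
    }
    /* Queueing #NM into the guest happens later, in the exception path,
     * so the nested-VMX check can turn it into a virtual VM-exit when
     * L1 wants to intercept it. */
}

/* Right before VM-entry, still with interrupts disabled. */
static void restore_guest_xfd_err(struct guest_fpu_sketch *fpu, uint64_t *hw_xfd_err)
{
    if (fpu->xfd)
        *hw_xfd_err = fpu->xfd_err;
}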
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-13-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
      4a642360
3. 18 November 2022 (4 commits)
  4. 03 November 2022 (5 commits)
  5. 27 October 2022 (1 commit)
  6. 08 October 2022 (4 commits)
  7. 20 September 2022 (6 commits)
  8. 08 July 2022 (4 commits)
• KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC · 5698b7e8
  Committed by Sean Christopherson
      mainline inclusion
      from mainline-5.13
      commit 72add915
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
      CVE: NA
      
      Intel-SIG: commit 72add915 KVM: VMX: Enable SGX virtualization for
      SGX1, SGX2 and LC.
      Backport for SGX virtualization support
      
      --------------------------------
      
      Enable SGX virtualization now that KVM has the VM-Exit handlers needed
      to trap-and-execute ENCLS to ensure correctness and/or enforce the CPU
      model exposed to the guest.  Add a KVM module param, "sgx", to allow an
      admin to disable SGX virtualization independent of the kernel.
      
      When supported in hardware and the kernel, advertise SGX1, SGX2 and SGX
      LC to userspace via CPUID and wire up the ENCLS_EXITING bitmap based on
      the guest's SGX capabilities, i.e. to allow ENCLS to be executed in an
      SGX-enabled guest.  With the exception of the provision key, all SGX
      attribute bits may be exposed to the guest.  Guest access to the
      provision key, which is controlled via securityfs, will be added in a
      future patch.
      
Note, KVM does not yet support exposing ENCLS_C leaves or ENCLV leaves.
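
For illustration, a minimal sketch of the enablement decision (hypothetical
names; the module-parameter wiring, the struct, and the leaf numbers in the
comments are assumptions based on the description above, not the actual
kvm-intel code):

#include <stdbool.h>
#include <stdint.h>

static bool sgx_param = true;  /* stands in for the "sgx" module param */

struct guest_sgx_caps {
    bool hw_and_kernel_sgx;  /* SGX usable on the host                 */
    bool sgx1;               /* SGX1 advertised to the guest via CPUID */
};

/* ENCLS is trap-and-executed only for the leaves KVM must inspect; when
 * the guest cannot use SGX at all, every leaf exits so #UD can be
 * injected instead. */
static uint64_t build_encls_exiting_bitmap(const struct guest_sgx_caps *c)
{
    if (!sgx_param || !c->hw_and_kernel_sgx || !c->sgx1)
        return ~0ull;            /* intercept every ENCLS leaf */

    uint64_t bitmap = 0;
    bitmap |= 1ull << 0x00;      /* ECREATE: enforce the exposed CPU model */
    bitmap |= 1ull << 0x02;      /* EINIT: apply the guest's LE hash MSRs  */
    return bitmap;
}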
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <a99e9c23310c79f2f4175c1af4c4cbcef913c3e5.1618196135.git.kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
      5698b7e8
• KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs · 1766b14e
  Committed by Sean Christopherson
      mainline inclusion
      from mainline-5.13
      commit 8f102445
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
      CVE: NA
      
      Intel-SIG: commit 8f102445 KVM: VMX: Add emulation of SGX Launch
      Control LE hash MSRs.
      Backport for SGX virtualization support
      
      --------------------------------
      
      Emulate the four Launch Enclave public key hash MSRs (LE hash MSRs) that
      exist on CPUs that support SGX Launch Control (LC).  SGX LC modifies the
      behavior of ENCLS[EINIT] to use the LE hash MSRs when verifying the key
      used to sign an enclave.  On CPUs without LC support, the LE hash is
      hardwired into the CPU to an Intel controlled key (the Intel key is also
      the reset value of the LE hash MSRs). Track the guest's desired hash so
      that a future patch can stuff the hash into the hardware MSRs when
      executing EINIT on behalf of the guest, when those MSRs are writable in
the host.
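
A short illustrative sketch of the bookkeeping (hypothetical handler, not
KVM's MSR code); 0x8C..0x8F are the architectural IA32_SGXLEPUBKEYHASH0..3
addresses:

#include <stdbool.h>
#include <stdint.h>

#define MSR_SGXLEPUBKEYHASH0 0x0000008c
#define MSR_SGXLEPUBKEYHASH3 0x0000008f

struct sgx_le_sketch {
    /* Guest's desired hash, one 64-bit chunk per MSR; on real hardware
     * the reset value is the hash of the Intel key. */
    uint64_t lepubkeyhash[4];
};

/* Returns true if the access was to one of the four LE hash MSRs. */
static bool emulate_le_hash_msr(struct sgx_le_sketch *s, uint32_t msr,
                                uint64_t *data, bool write)
{
    if (msr < MSR_SGXLEPUBKEYHASH0 || msr > MSR_SGXLEPUBKEYHASH3)
        return false;

    uint64_t *slot = &s->lepubkeyhash[msr - MSR_SGXLEPUBKEYHASH0];
    if (write)
        *slot = *data;   /* remembered so EINIT can use it later */
    else
        *data = *slot;
    return true;
}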
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <c58ef601ddf88f3a113add837969533099b1364a.1618196135.git.kai.huang@intel.com>
[Add a comment regarding the MSRs being available until SGX is locked.
 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
      1766b14e
• KVM: VMX: Frame in ENCLS handler for SGX virtualization · e4e22234
  Committed by Sean Christopherson
      mainline inclusion
      from mainline-5.13
      commit 9798adbc
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
      CVE: NA
      
      Intel-SIG: commit 9798adbc KVM: VMX: Frame in ENCLS handler for
      SGX virtualization.
      Backport for SGX virtualization support
      
      --------------------------------
      
      Introduce sgx.c and sgx.h, along with the framework for handling ENCLS
      VM-Exits.  Add a bool, enable_sgx, that will eventually be wired up to a
      module param to control whether or not SGX virtualization is enabled at
      runtime.
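
A bare-bones sketch of the framing this patch introduces (illustrative names,
not the actual sgx.c symbols): an enable flag plus an ENCLS exit handler that
does not yet handle any leaf.

#include <stdbool.h>

static bool enable_sgx;  /* later wired up to a module parameter */

/* Returns true if the ENCLS exit was handled; returning false models
 * "inject #UD into the guest". */
static bool handle_encls_vmexit(unsigned long leaf)
{
    (void)leaf;           /* leaf-specific handlers come in later patches */
    if (!enable_sgx)
        return false;     /* SGX virtualization disabled */
    return false;         /* nothing handled yet in this framing patch */
}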
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <1c782269608b2f5e1034be450f375a8432fb705d.1618196135.git.kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
      e4e22234
• KVM: VMX: Add basic handling of VM-Exit from SGX enclave · 08f41fc4
  Committed by Sean Christopherson
      mainline inclusion
      from mainline-5.13
      commit 3c0c2ad1
      category: feature
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5EZEK
      CVE: NA
      
      Intel-SIG: commit 3c0c2ad1 KVM: VMX: Add basic handling of VM-Exit
      from SGX enclave
      Backport for SGX virtualization support
      
      --------------------------------
      
      Add support for handling VM-Exits that originate from a guest SGX
      enclave.  In SGX, an "enclave" is a new CPL3-only execution environment,
      wherein the CPU and memory state is protected by hardware to make the
state inaccessible to code running outside of the enclave.  When exiting
      an enclave due to an asynchronous event (from the perspective of the
      enclave), e.g. exceptions, interrupts, and VM-Exits, the enclave's state
      is automatically saved and scrubbed (the CPU loads synthetic state), and
      then reloaded when re-entering the enclave.  E.g. after an instruction
      based VM-Exit from an enclave, vmcs.GUEST_RIP will not contain the RIP
of the enclave instruction that triggered the VM-Exit, but will instead point
      to a RIP in the enclave's untrusted runtime (the guest userspace code
      that coordinates entry/exit to/from the enclave).
      
      To help a VMM recognize and handle exits from enclaves, SGX adds bits to
      existing VMCS fields, VM_EXIT_REASON.VMX_EXIT_REASON_FROM_ENCLAVE and
      GUEST_INTERRUPTIBILITY_INFO.GUEST_INTR_STATE_ENCLAVE_INTR.  Define the
      new architectural bits, and add a boolean to struct vcpu_vmx to cache
      VMX_EXIT_REASON_FROM_ENCLAVE.  Clear the bit in exit_reason so that
      checks against exit_reason do not need to account for SGX, e.g.
      "if (exit_reason == EXIT_REASON_EXCEPTION_NMI)" continues to work.
      
KVM is largely a passive observer of the new bits, e.g. KVM needs to
      account for the bits when propagating information to a nested VMM, but
      otherwise doesn't need to act differently for the majority of VM-Exits
      from enclaves.
      
      The one scenario that is directly impacted is emulation, which is for
      all intents and purposes impossible[1] since KVM does not have access to
      the RIP or instruction stream that triggered the VM-Exit.  The inability
      to emulate is a non-issue for KVM, as most instructions that might
trigger VM-Exit unconditionally #UD in an enclave (before the VM-Exit
check).  For the few instructions that conditionally #UD, KVM either never
      sets the exiting control, e.g. PAUSE_EXITING[2], or sets it if and only
      if the feature is not exposed to the guest in order to inject a #UD,
      e.g. RDRAND_EXITING.
      
      But, because it is still possible for a guest to trigger emulation,
      e.g. MMIO, inject a #UD if KVM ever attempts emulation after a VM-Exit
      from an enclave.  This is architecturally accurate for instruction
      VM-Exits, and for MMIO it's the least bad choice, e.g. it's preferable
      to killing the VM.  In practice, only broken or particularly stupid
      guests should ever encounter this behavior.
      
      Add a WARN in skip_emulated_instruction to detect any attempt to
      modify the guest's RIP during an SGX enclave VM-Exit as all such flows
      should either be unreachable or must handle exits from enclaves before
      getting to skip_emulated_instruction.
      
      [1] Impossible for all practical purposes.  Not truly impossible
          since KVM could implement some form of para-virtualization scheme.
      
      [2] PAUSE_LOOP_EXITING only affects CPL0 and enclaves exist only at
          CPL3, so we also don't need to worry about that interaction.
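
For illustration, a self-contained sketch of the exit-reason bookkeeping
described above (hypothetical struct and helpers, not the kernel's
definitions; the enclave-mode flag is modeled as bit 27 of the raw exit
reason, corresponding to VMX_EXIT_REASON_FROM_ENCLAVE):

#include <stdbool.h>
#include <stdint.h>

#define EXIT_REASON_ENCLAVE_MODE (1u << 27)

struct enclave_exit_sketch {
    uint32_t exit_reason;    /* basic reason, enclave bit stripped */
    bool from_enclave;       /* cached copy of the stripped bit    */
};

static void record_exit_reason(struct enclave_exit_sketch *v, uint32_t raw)
{
    v->from_enclave = raw & EXIT_REASON_ENCLAVE_MODE;
    /* Strip the bit so existing "exit_reason == X" checks keep working. */
    v->exit_reason = raw & ~EXIT_REASON_ENCLAVE_MODE;
}

/* Emulation after an enclave exit is impossible (no usable RIP or
 * instruction bytes), so callers inject #UD instead of emulating. */
static bool emulation_allowed(const struct enclave_exit_sketch *v)
{
    return !v->from_enclave;
}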
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <315f54a8507d09c292463ef29104e1d4c62e9090.1618196135.git.kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
      08f41fc4
9. 06 July 2022 (2 commits)
  10. 19 May 2022 (1 commit)
• KVM: VMX: Set vmcs.PENDING_DBG.BS on #DB in STI/MOVSS blocking shadow · 617a368d
  Committed by Sean Christopherson
      stable inclusion
      from stable-v5.10.101
      commit 3aa5c8657292e05e6dfa8fe2316951001dab7e3a
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5669Z
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3aa5c8657292e05e6dfa8fe2316951001dab7e3a
      
      --------------------------------
      
      [ Upstream commit b9bed78e ]
      
      Set vmcs.GUEST_PENDING_DBG_EXCEPTIONS.BS, a.k.a. the pending single-step
      breakpoint flag, when re-injecting a #DB with RFLAGS.TF=1, and STI or
      MOVSS blocking is active.  Setting the flag is necessary to make VM-Entry
      consistency checks happy, as VMX has an invariant that if RFLAGS.TF is
      set and STI/MOVSS blocking is true, then the previous instruction must
have been STI or MOV/POP SS, and therefore a single-step #DB must be pending,
since RFLAGS.TF cannot have been set by the previous instruction,
      i.e. the one instruction delay after setting RFLAGS.TF must have already
      expired.
      
      Normally, the CPU sets vmcs.GUEST_PENDING_DBG_EXCEPTIONS.BS appropriately
      when recording guest state as part of a VM-Exit, but #DB VM-Exits
      intentionally do not treat the #DB as "guest state" as interception of
      the #DB effectively makes the #DB host-owned, thus KVM needs to manually
      set PENDING_DBG.BS when forwarding/re-injecting the #DB to the guest.
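
A minimal sketch of that rule (hypothetical names, not the actual vmx.c
code); RFLAGS.TF is bit 8 and the BS bit of the pending-debug-exceptions
field is bit 14:

#include <stdbool.h>
#include <stdint.h>

#define RFLAGS_TF       (1ull << 8)
#define PENDING_DBG_BS  (1ull << 14)

struct db_reinject_sketch {
    uint64_t rflags;
    bool sti_or_movss_blocking;  /* from GUEST_INTERRUPTIBILITY_INFO  */
    uint64_t pending_dbg;        /* vmcs.GUEST_PENDING_DBG_EXCEPTIONS */
};

/* When a #DB is re-injected while RFLAGS.TF=1 and STI/MOVSS blocking is
 * active, the pending single-step flag must be set by hand, otherwise
 * the VM-entry consistency checks fail. */
static void reinject_db(struct db_reinject_sketch *v)
{
    if ((v->rflags & RFLAGS_TF) && v->sti_or_movss_blocking)
        v->pending_dbg |= PENDING_DBG_BS;
}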
      
      Note, although this bug can be triggered by guest userspace, doing so
      requires IOPL=3, and guest userspace running with IOPL=3 has full access
      to all I/O ports (from the guest's perspective) and can crash/reboot the
      guest any number of ways.  IOPL=3 is required because STI blocking kicks
      in if and only if RFLAGS.IF is toggled 0=>1, and if CPL>IOPL, STI either
      takes a #GP or modifies RFLAGS.VIF, not RFLAGS.IF.
      
      MOVSS blocking can be initiated by userspace, but can be coincident with
      a #DB if and only if DR7.GD=1 (General Detect enabled) and a MOV DR is
      executed in the MOVSS shadow.  MOV DR #GPs at CPL>0, thus MOVSS blocking
      is problematic only for CPL0 (and only if the guest is crazy enough to
      access a DR in a MOVSS shadow).  All other sources of #DBs are either
      suppressed by MOVSS blocking (single-step, code fetch, data, and I/O),
      are mutually exclusive with MOVSS blocking (T-bit task switch), or are
      already handled by KVM (ICEBP, a.k.a. INT1).
      
      This bug was originally found by running tests[1] created for XSA-308[2].
      Note that Xen's userspace test emits ICEBP in the MOVSS shadow, which is
      presumably why the Xen bug was deemed to be an exploitable DOS from guest
      userspace.  KVM already handles ICEBP by skipping the ICEBP instruction
      and thus clears MOVSS blocking as a side effect of its "emulation".
      
      [1] http://xenbits.xenproject.org/docs/xtf/xsa-308_2main_8c_source.html
[2] https://xenbits.xen.org/xsa/advisory-308.html
Reported-by: David Woodhouse <dwmw2@infradead.org>
Reported-by: Alexander Graf <graf@amazon.de>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220120000624.655815-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      617a368d
11. 28 April 2022 (1 commit)
  12. 19 April 2022 (1 commit)
  13. 26 January 2022 (1 commit)
  14. 14 January 2022 (1 commit)
  15. 07 January 2022 (5 commits)