1. 17 10月, 2018 7 次提交
    • S
      KVM: nVMX: move vmcs12 EPTP consistency check to check_vmentry_prereqs() · 5b8ba41d
      Sean Christopherson 提交于
      An invalid EPTP causes a VMFail(VMXERR_ENTRY_INVALID_CONTROL_FIELD),
      not a VMExit.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5b8ba41d
    • S
      KVM: nVMX: move host EFER consistency checks to VMFail path · 64a919f7
      Sean Christopherson 提交于
      Invalid host state related to loading EFER on VMExit causes a
      VMFail(VMXERR_ENTRY_INVALID_HOST_STATE_FIELD), not a VMExit.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      64a919f7
    • J
      KVM: nVMX: Always reflect #NM VM-exits to L1 · 3c6e099f
      Jim Mattson 提交于
      When bit 3 (corresponding to CR0.TS) of the VMCS12 cr0_guest_host_mask
      field is clear, the VMCS12 guest_cr0 field does not necessarily hold
      the current value of the L2 CR0.TS bit, so the code that checked for
      L2's CR0.TS bit being set was incorrect. Moreover, I'm not sure that
      the CR0.TS check was adequate. (What if L2's CR0.EM was set, for
      instance?)
      
      Fortunately, lazy FPU has gone away, so L0 has lost all interest in
      intercepting #NM exceptions. See commit bd7e5b08 ("KVM: x86:
      remove code for lazy FPU handling"). Therefore, there is no longer any
      question of which hypervisor gets first dibs. The #NM VM-exit should
      always be reflected to L1. (Note that the corresponding bit must be
      set in the VMCS12 exception_bitmap field for there to be an #NM
      VM-exit at all.)
      
      Fixes: ccf9844e ("kvm, vmx: Really fix lazy FPU on nested guest")
      Reported-by: NAbhiroop Dabral <adabral@paloaltonetworks.com>
      Signed-off-by: NJim Mattson <jmattson@google.com>
      Reviewed-by: NPeter Shier <pshier@google.com>
      Tested-by: NAbhiroop Dabral <adabral@paloaltonetworks.com>
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3c6e099f
    • T
      KVM/VMX: Remve unused function is_external_interrupt(). · aaa45da2
      Tianyu Lan 提交于
      is_external_interrupt() is not used now and so remove it.
      Signed-off-by: NLan Tianyu <Tianyu.Lan@microsoft.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      aaa45da2
    • K
    • S
      KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail · bd18bffc
      Sean Christopherson 提交于
      A VMEnter that VMFails (as opposed to VMExits) does not touch host
      state beyond registers that are explicitly noted in the VMFail path,
      e.g. EFLAGS.  Host state does not need to be loaded because VMFail
      is only signaled for consistency checks that occur before the CPU
      starts to load guest state, i.e. there is no need to restore any
      state as nothing has been modified.  But in the case where a VMFail
      is detected by hardware and not by KVM (due to deferring consistency
      checks to hardware), KVM has already loaded some amount of guest
      state.  Luckily, "loaded" only means loaded to KVM's software model,
      i.e. vmcs01 has not been modified.  So, unwind our software model to
      the pre-VMEntry host state.
      
      Not restoring host state in this VMFail path leads to a variety of
      failures because we end up with stale data in vcpu->arch, e.g. CR0,
      CR4, EFER, etc... will all be out of sync relative to vmcs01.  Any
      significant delta in the stale data is all but guaranteed to crash
      L1, e.g. emulation of SMEP, SMAP, UMIP, WP, etc... will be wrong.
      
      An alternative to this "soft" reload would be to load host state from
      vmcs12 as if we triggered a VMExit (as opposed to VMFail), but that is
      wildly inconsistent with respect to the VMX architecture, e.g. an L1
      VMM with separate VMExit and VMFail paths would explode.
      
      Note that this approach does not mean KVM is 100% accurate with
      respect to VMX hardware behavior, even at an architectural level
      (the exact order of consistency checks is microarchitecture specific).
      But 100% emulation accuracy isn't the goal (with this patch), rather
      the goal is to be consistent in the information delivered to L1, e.g.
      a VMExit should not fall-through VMENTER, and a VMFail should not jump
      to HOST_RIP.
      
      This technically reverts commit "5af41573 (KVM: nVMX: Fix mmu
      context after VMLAUNCH/VMRESUME failure)", but retains the core
      aspects of that patch, just in an open coded form due to the need to
      pull state from vmcs01 instead of vmcs12.  Restoring host state
      resolves a variety of issues introduced by commit "4f350c6d
      (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly)",
      which remedied the incorrect behavior of treating VMFail like VMExit
      but in doing so neglected to restore arch state that had been modified
      prior to attempting nested VMEnter.
      
      A sample failure that occurs due to stale vcpu.arch state is a fault
      of some form while emulating an LGDT (due to emulated UMIP) from L1
      after a failed VMEntry to L3, in this case when running the KVM unit
      test test_tpr_threshold_values in L1.  L0 also hits a WARN in this
      case due to a stale arch.cr4.UMIP.
      
      L1:
        BUG: unable to handle kernel paging request at ffffc90000663b9e
        PGD 276512067 P4D 276512067 PUD 276513067 PMD 274efa067 PTE 8000000271de2163
        Oops: 0009 [#1] SMP
        CPU: 5 PID: 12495 Comm: qemu-system-x86 Tainted: G        W         4.18.0-rc2+ #2
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        RIP: 0010:native_load_gdt+0x0/0x10
      
        ...
      
        Call Trace:
         load_fixmap_gdt+0x22/0x30
         __vmx_load_host_state+0x10e/0x1c0 [kvm_intel]
         vmx_switch_vmcs+0x2d/0x50 [kvm_intel]
         nested_vmx_vmexit+0x222/0x9c0 [kvm_intel]
         vmx_handle_exit+0x246/0x15a0 [kvm_intel]
         kvm_arch_vcpu_ioctl_run+0x850/0x1830 [kvm]
         kvm_vcpu_ioctl+0x3a1/0x5c0 [kvm]
         do_vfs_ioctl+0x9f/0x600
         ksys_ioctl+0x66/0x70
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x4f/0x100
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      L0:
        WARNING: CPU: 2 PID: 3529 at arch/x86/kvm/vmx.c:6618 handle_desc+0x28/0x30 [kvm_intel]
        ...
        CPU: 2 PID: 3529 Comm: qemu-system-x86 Not tainted 4.17.2-coffee+ #76
        Hardware name: Intel Corporation Kabylake Client platform/KBL S
        RIP: 0010:handle_desc+0x28/0x30 [kvm_intel]
      
        ...
      
        Call Trace:
         kvm_arch_vcpu_ioctl_run+0x863/0x1840 [kvm]
         kvm_vcpu_ioctl+0x3a1/0x5c0 [kvm]
         do_vfs_ioctl+0x9f/0x5e0
         ksys_ioctl+0x66/0x70
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x49/0xf0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 5af41573 (KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure)
      Fixes: 4f350c6d (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly)
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim KrÄmář <rkrcmar@redhat.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      bd18bffc
    • J
      KVM: nVMX: Clear reserved bits of #DB exit qualification · cfb634fe
      Jim Mattson 提交于
      According to volume 3 of the SDM, bits 63:15 and 12:4 of the exit
      qualification field for debug exceptions are reserved (cleared to
      0). However, the SDM is incorrect about bit 16 (corresponding to
      DR6.RTM). This bit should be set if a debug exception (#DB) or a
      breakpoint exception (#BP) occurred inside an RTM region while
      advanced debugging of RTM transactional regions was enabled. Note that
      this is the opposite of DR6.RTM, which "indicates (when clear) that a
      debug exception (#DB) or breakpoint exception (#BP) occurred inside an
      RTM region while advanced debugging of RTM transactional regions was
      enabled."
      
      There is still an issue with stale DR6 bits potentially being
      misreported for the current debug exception.  DR6 should not have been
      modified before vectoring the #DB exception, and the "new DR6 bits"
      should be available somewhere, but it was and they aren't.
      
      Fixes: b96fb439 ("KVM: nVMX: fixes to nested virt interrupt injection")
      Signed-off-by: NJim Mattson <jmattson@google.com>
      Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      cfb634fe
  2. 13 10月, 2018 5 次提交
  3. 04 10月, 2018 3 次提交
    • P
      kvm: nVMX: fix entry with pending interrupt if APICv is enabled · 7e712684
      Paolo Bonzini 提交于
      Commit b5861e5c introduced a check on
      the interrupt-window and NMI-window CPU execution controls in order to
      inject an external interrupt vmexit before the first guest instruction
      executes.  However, when APIC virtualization is enabled the host does not
      need a vmexit in order to inject an interrupt at the next interrupt window;
      instead, it just places the interrupt vector in RVI and the processor will
      inject it as soon as possible.  Therefore, on machines with APICv it is
      not enough to check the CPU execution controls: the same scenario can also
      happen if RVI>vPPR.
      
      Fixes: b5861e5cReviewed-by: NNikita Leshchenko <nikita.leshchenko@oracle.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7e712684
    • P
      KVM: VMX: hide flexpriority from guest when disabled at the module level · 2cf7ea9f
      Paolo Bonzini 提交于
      As of commit 8d860bbe ("kvm: vmx: Basic APIC virtualization controls
      have three settings"), KVM will disable VIRTUALIZE_APIC_ACCESSES when
      a nested guest writes APIC_BASE MSR and kvm-intel.flexpriority=0,
      whereas previously KVM would allow a nested guest to enable
      VIRTUALIZE_APIC_ACCESSES so long as it's supported in hardware.  That is,
      KVM now advertises VIRTUALIZE_APIC_ACCESSES to a guest but doesn't
      (always) allow setting it when kvm-intel.flexpriority=0, and may even
      initially allow the control and then clear it when the nested guest
      writes APIC_BASE MSR, which is decidedly odd even if it doesn't cause
      functional issues.
      
      Hide the control completely when the module parameter is cleared.
      reported-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Fixes: 8d860bbe ("kvm: vmx: Basic APIC virtualization controls have three settings")
      Cc: Jim Mattson <jmattson@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2cf7ea9f
    • S
      KVM: VMX: check for existence of secondary exec controls before accessing · fd6b6d9b
      Sean Christopherson 提交于
      Return early from vmx_set_virtual_apic_mode() if the processor doesn't
      support VIRTUALIZE_APIC_ACCESSES or VIRTUALIZE_X2APIC_MODE, both of
      which reside in SECONDARY_VM_EXEC_CONTROL.  This eliminates warnings
      due to VMWRITEs to SECONDARY_VM_EXEC_CONTROL (VMCS field 401e) failing
      on processors without secondary exec controls.
      
      Remove the similar check for TPR shadowing as it is incorporated in the
      flexpriority_enabled check and the APIC-related code in
      vmx_update_msr_bitmap() is further gated by VIRTUALIZE_X2APIC_MODE.
      Reported-by: NGerhard Wiesinger <redhat@wiesinger.com>
      Fixes: 8d860bbe ("kvm: vmx: Basic APIC virtualization controls have three settings")
      Cc: Jim Mattson <jmattson@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fd6b6d9b
  4. 01 10月, 2018 3 次提交
  5. 25 9月, 2018 1 次提交
    • P
      KVM: x86: never trap MSR_KERNEL_GS_BASE · 4679b61f
      Paolo Bonzini 提交于
      KVM has an old optimization whereby accesses to the kernel GS base MSR
      are trapped when the guest is in 32-bit and not when it is in 64-bit mode.
      The idea is that swapgs is not available in 32-bit mode, thus the
      guest has no reason to access the MSR unless in 64-bit mode and
      32-bit applications need not pay the price of switching the kernel GS
      base between the host and the guest values.
      
      However, this optimization adds complexity to the code for little
      benefit (these days most guests are going to be 64-bit anyway) and in fact
      broke after commit 678e315e ("KVM: vmx: add dedicated utility to
      access guest's kernel_gs_base", 2018-08-06); the guest kernel GS base
      can be corrupted across SMIs and UEFI Secure Boot is therefore broken
      (a secure boot Linux guest, for example, fails to reach the login prompt
      about half the time).  This patch just removes the optimization; the
      kernel GS base MSR is now never trapped by KVM, similarly to the FS and
      GS base MSRs.
      
      Fixes: 678e315eReviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4679b61f
  6. 20 9月, 2018 7 次提交
    • K
      nVMX x86: Check VPID value on vmentry of L2 guests · ba8e23db
      Krish Sadhukhan 提交于
      According to section "Checks on VMX Controls" in Intel SDM vol 3C, the
      following check needs to be enforced on vmentry of L2 guests:
      
          If the 'enable VPID' VM-execution control is 1, the value of the
          of the VPID VM-execution control field must not be 0000H.
      Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Reviewed-by: NMark Kanda <mark.kanda@oracle.com>
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ba8e23db
    • K
      nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2 · 6de84e58
      Krish Sadhukhan 提交于
      According to section "Checks on VMX Controls" in Intel SDM vol 3C,
      the following check needs to be enforced on vmentry of L2 guests:
      
         - Bits 5:0 of the posted-interrupt descriptor address are all 0.
         - The posted-interrupt descriptor address does not set any bits
           beyond the processor's physical-address width.
      Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Reviewed-by: NMark Kanda <mark.kanda@oracle.com>
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Reviewed-by: NDarren Kenny <darren.kenny@oracle.com>
      Reviewed-by: NKarl Heubaum <karl.heubaum@oracle.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6de84e58
    • L
      KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv · e6c67d8c
      Liran Alon 提交于
      In case L1 do not intercept L2 HLT or enter L2 in HLT activity-state,
      it is possible for a vCPU to be blocked while it is in guest-mode.
      
      According to Intel SDM 26.6.5 Interrupt-Window Exiting and
      Virtual-Interrupt Delivery: "These events wake the logical processor
      if it just entered the HLT state because of a VM entry".
      Therefore, if L1 enters L2 in HLT activity-state and L2 has a pending
      deliverable interrupt in vmcs12->guest_intr_status.RVI, then the vCPU
      should be waken from the HLT state and injected with the interrupt.
      
      In addition, if while the vCPU is blocked (while it is in guest-mode),
      it receives a nested posted-interrupt, then the vCPU should also be
      waken and injected with the posted interrupt.
      
      To handle these cases, this patch enhances kvm_vcpu_has_events() to also
      check if there is a pending interrupt in L2 virtual APICv provided by
      L1. That is, it evaluates if there is a pending virtual interrupt for L2
      by checking RVI[7:4] > VPPR[7:4] as specified in Intel SDM 29.2.1
      Evaluation of Pending Interrupts.
      
      Note that this also handles the case of nested posted-interrupt by the
      fact RVI is updated in vmx_complete_nested_posted_interrupt() which is
      called from kvm_vcpu_check_block() -> kvm_arch_vcpu_runnable() ->
      kvm_vcpu_running() -> vmx_check_nested_events() ->
      vmx_complete_nested_posted_interrupt().
      Reviewed-by: NNikita Leshenko <nikita.leshchenko@oracle.com>
      Reviewed-by: NDarren Kenny <darren.kenny@oracle.com>
      Signed-off-by: NLiran Alon <liran.alon@oracle.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e6c67d8c
    • P
      KVM: VMX: check nested state and CR4.VMXE against SMM · 5bea5123
      Paolo Bonzini 提交于
      VMX cannot be enabled under SMM, check it when CR4 is set and when nested
      virtualization state is restored.
      
      This should fix some WARNs reported by syzkaller, mostly around
      alloc_shadow_vmcs.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5bea5123
    • S
      KVM: VMX: use preemption timer to force immediate VMExit · d264ee0c
      Sean Christopherson 提交于
      A VMX preemption timer value of '0' is guaranteed to cause a VMExit
      prior to the CPU executing any instructions in the guest.  Use the
      preemption timer (if it's supported) to trigger immediate VMExit
      in place of the current method of sending a self-IPI.  This ensures
      that pending VMExit injection to L1 occurs prior to executing any
      instructions in the guest (regardless of nesting level).
      
      When deferring VMExit injection, KVM generates an immediate VMExit
      from the (possibly nested) guest by sending itself an IPI.  Because
      hardware interrupts are blocked prior to VMEnter and are unblocked
      (in hardware) after VMEnter, this results in taking a VMExit(INTR)
      before any guest instruction is executed.  But, as this approach
      relies on the IPI being received before VMEnter executes, it only
      works as intended when KVM is running as L0.  Because there are no
      architectural guarantees regarding when IPIs are delivered, when
      running nested the INTR may "arrive" long after L2 is running e.g.
      L0 KVM doesn't force an immediate switch to L1 to deliver an INTR.
      
      For the most part, this unintended delay is not an issue since the
      events being injected to L1 also do not have architectural guarantees
      regarding their timing.  The notable exception is the VMX preemption
      timer[1], which is architecturally guaranteed to cause a VMExit prior
      to executing any instructions in the guest if the timer value is '0'
      at VMEnter.  Specifically, the delay in injecting the VMExit causes
      the preemption timer KVM unit test to fail when run in a nested guest.
      
      Note: this approach is viable even on CPUs with a broken preemption
      timer, as broken in this context only means the timer counts at the
      wrong rate.  There are no known errata affecting timer value of '0'.
      
      [1] I/O SMIs also have guarantees on when they arrive, but I have
          no idea if/how those are emulated in KVM.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      [Use a hook for SVM instead of leaving the default in x86.c - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d264ee0c
    • S
      KVM: VMX: modify preemption timer bit only when arming timer · f459a707
      Sean Christopherson 提交于
      Provide a singular location where the VMX preemption timer bit is
      set/cleared so that future usages of the preemption timer can ensure
      the VMCS bit is up-to-date without having to modify unrelated code
      paths.  For example, the preemption timer can be used to force an
      immediate VMExit.  Cache the status of the timer to avoid redundant
      VMREAD and VMWRITE, e.g. if the timer stays armed across multiple
      VMEnters/VMExits.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f459a707
    • S
      KVM: VMX: immediately mark preemption timer expired only for zero value · 4c008127
      Sean Christopherson 提交于
      A VMX preemption timer value of '0' at the time of VMEnter is
      architecturally guaranteed to cause a VMExit prior to the CPU
      executing any instructions in the guest.  This architectural
      definition is in place to ensure that a previously expired timer
      is correctly recognized by the CPU as it is possible for the timer
      to reach zero and not trigger a VMexit due to a higher priority
      VMExit being signalled instead, e.g. a pending #DB that morphs into
      a VMExit.
      
      Whether by design or coincidence, commit f4124500 ("KVM: nVMX:
      Fully emulate preemption timer") special cased timer values of '0'
      and '1' to ensure prompt delivery of the VMExit.  Unlike '0', a
      timer value of '1' has no has no architectural guarantees regarding
      when it is delivered.
      
      Modify the timer emulation to trigger immediate VMExit if and only
      if the timer value is '0', and document precisely why '0' is special.
      Do this even if calibration of the virtual TSC failed, i.e. VMExit
      will occur immediately regardless of the frequency of the timer.
      Making only '0' a special case gives KVM leeway to be more aggressive
      in ensuring the VMExit is injected prior to executing instructions in
      the nested guest, and also eliminates any ambiguity as to why '1' is
      a special case, e.g. why wasn't the threshold for a "short timeout"
      set to 10, 100, 1000, etc...
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4c008127
  7. 08 9月, 2018 1 次提交
  8. 30 8月, 2018 3 次提交
  9. 22 8月, 2018 3 次提交
    • P
      KVM: VMX: fixes for vmentry_l1d_flush module parameter · 0027ff2a
      Paolo Bonzini 提交于
      Two bug fixes:
      
      1) missing entries in the l1d_param array; this can cause a host crash
      if an access attempts to reach the missing entry. Future-proof the get
      function against any overflows as well.  However, the two entries
      VMENTER_L1D_FLUSH_EPT_DISABLED and VMENTER_L1D_FLUSH_NOT_REQUIRED must
      not be accepted by the parse function, so disable them there.
      
      2) invalid values must be rejected even if the CPU does not have the
      bug, so test for them before checking boot_cpu_has(X86_BUG_L1TF)
      
      ... and a small refactoring, since the .cmd field is redundant with
      the index in the array.
      Reported-by: NBandan Das <bsd@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: a7b9020bSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0027ff2a
    • S
      KVM: vmx: Inject #UD for SGX ENCLS instruction in guest · 0b665d30
      Sean Christopherson 提交于
      Virtualization of Intel SGX depends on Enclave Page Cache (EPC)
      management that is not yet available in the kernel, i.e. KVM support
      for exposing SGX to a guest cannot be added until basic support
      for SGX is upstreamed, which is a WIP[1].
      
      Until SGX is properly supported in KVM, ensure a guest sees expected
      behavior for ENCLS, i.e. all ENCLS #UD.  Because SGX does not have a
      true software enable bit, e.g. there is no CR4.SGXE bit, the ENCLS
      instruction can be executed[1] by the guest if SGX is supported by the
      system.  Intercept all ENCLS leafs (via the ENCLS- exiting control and
      field) and unconditionally inject #UD.
      
      [1] https://www.spinics.net/lists/kvm/msg171333.html or
          https://lkml.org/lkml/2018/7/3/879
      
      [2] A guest can execute ENCLS in the sense that ENCLS will not take
          an immediate #UD, but no ENCLS will ever succeed in a guest
          without explicit support from KVM (map EPC memory into the guest),
          unless KVM has a *very* egregious bug, e.g. accidentally mapped
          EPC memory into the guest SPTEs.  In other words this patch is
          needed only to prevent the guest from seeing inconsistent behavior,
          e.g. #GP (SGX not enabled in Feature Control MSR) or #PF (leaf
          operand(s) does not point at EPC memory) instead of #UD on ENCLS.
          Intercepting ENCLS is not required to prevent the guest from truly
          utilizing SGX.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20180814163334.25724-3-sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0b665d30
    • Y
      x86/kvm/vmx: Fix coding style in vmx_setup_l1d_flush() · d806afa4
      Yi Wang 提交于
      Substitute spaces with tab. No functional changes.
      Signed-off-by: NYi Wang <wang.yi59@zte.com.cn>
      Reviewed-by: NJiang Biao <jiang.biao2@zte.com.cn>
      Message-Id: <1534398159-48509-1-git-send-email-wang.yi59@zte.com.cn>
      Cc: stable@vger.kernel.org # L1TF
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d806afa4
  10. 21 8月, 2018 1 次提交
  11. 07 8月, 2018 1 次提交
  12. 06 8月, 2018 5 次提交