1. 18 Oct, 2019 (1 commit)
  2. 09 Oct, 2019 (1 commit)
  3. 08 Oct, 2019 (2 commits)
    • J
      x86/asm: Fix MWAITX C-state hint value · 454de1e7
      Authored by Janakarajan Natarajan
      As per "AMD64 Architecture Programmer's Manual Volume 3: General-Purpose
      and System Instructions", MWAITX EAX[7:4]+1 specifies the optional hint
      of the optimized C-state. For C0 state, EAX[7:4] should be set to 0xf.
      
      Currently, a value of 0xf is set for EAX[3:0] instead of EAX[7:4]. Fix
      this by changing MWAITX_DISABLE_CSTATES from 0xf to 0xf0.
      
      This hasn't had any implications so far because setting reserved bits in
      EAX is simply ignored by the CPU.
      
       [ bp: Fixup comment in delay_mwaitx() and massage. ]
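      
      A minimal sketch of the hint encoding described above; the constant name
      follows arch/x86/include/asm/mwait.h, while mwaitx_eax_hint() is purely
      illustrative and not part of the patch:
      
          /* MWAITX takes its C-state hint in EAX[7:4]; hint + 1 encodes the
           * requested C-state, and EAX[3:0] is a reserved field. */
          #define MWAITX_DISABLE_CSTATES  0xf0    /* was 0xf, which set EAX[3:0] */
      
          static inline unsigned int mwaitx_eax_hint(unsigned int cstate_hint)
          {
                  return (cstate_hint & 0xf) << 4;    /* place the hint in EAX[7:4] */
          }
      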
      Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "x86@kernel.org" <x86@kernel.org>
      Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20191007190011.4859-1-Janakarajan.Natarajan@amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      454de1e7
    • L
      uaccess: implement a proper unsafe_copy_to_user() and switch filldir over to it · c512c691
      Authored by Linus Torvalds
      In commit 9f79b78e ("Convert filldir[64]() from __put_user() to
      unsafe_put_user()") I made filldir() use unsafe_put_user(), which
      improves code generation on x86 enormously.
      
      But because we didn't have an "unsafe_copy_to_user()", the dirent name
      copy was also done by hand with unsafe_put_user() in a loop, and it
      turns out that a lot of other architectures didn't like that, because
      unlike x86, they have various alignment issues.
      
      Most non-x86 architectures trap and fix it up, and some (like xtensa)
      will just fail unaligned put_user() accesses unconditionally.  Which
      makes that "copy using put_user() in a loop" not work for them at all.
      
      I could make that code do explicit alignment etc, but the architectures
      that don't like unaligned accesses also don't really use the fancy
      "user_access_begin/end()" model, so they might just use the regular old
      __copy_to_user() interface.
      
      So this commit takes that looping implementation, turns it into the x86
      version of "unsafe_copy_to_user()", and makes other architectures
      implement the unsafe copy version as __copy_to_user() (the same way they
      do for the other unsafe_xyz() accessor functions).
      
      Note that it only does this for the copying _to_ user space, and we
      still don't have an unsafe version of copy_from_user().
      
      That's partly because we have no current users of it, but also partly
      because the copy_from_user() case is slightly different and cannot
      efficiently be implemented in terms of an unsafe_get_user() loop (because
      gcc can't do asm goto with outputs).
      
      It would be trivial to do this using "rep movsb", which would work
      really nicely on newer x86 cores, but really badly on some older ones.
      
      Al Viro is looking at cleaning up all our user copy routines to make
      this all a non-issue, but for now we have this simple-but-stupid version
      for x86 that works fine for the dirent name copy case because those
      names are short strings and we simply don't need anything fancier.
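      
      As a rough sketch of the generic fallback this commit describes (the
      exact wrapper in include/linux/uaccess.h may be spelled differently):
      
          /* Architectures without a native unsafe_copy_to_user() map it onto
           * __copy_to_user() and jump to the error label on failure. */
          #ifndef unsafe_copy_to_user
          #define unsafe_copy_to_user(dst, src, len, label)           \
          do {                                                        \
                  if (unlikely(__copy_to_user(dst, src, len)))        \
                          goto label;                                 \
          } while (0)
          #endif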
      
      Fixes: 9f79b78e ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Reported-and-tested-by: Tony Luck <tony.luck@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c512c691
  4. 02 Oct, 2019 (1 commit)
  5. 27 Sep, 2019 (1 commit)
    • P
      KVM: x86: assign two bits to track SPTE kinds · 6eeb4ef0
      Authored by Paolo Bonzini
      Currently, we are overloading SPTE_SPECIAL_MASK to mean both
      "A/D bits unavailable" and MMIO, where the difference between the
      two is determined by mmio_mask and mmio_value.
      
      However, the next patch will need two bits to distinguish
      availability of A/D bits from write protection.  So, while at it,
      give MMIO its own bit pattern, and move the two bits from
      bit 62 to bits 52..53 since Intel is allocating EPT page table
      bits from the top.
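      
      A hedged sketch of the resulting layout; the mask names are assumed from
      the description, and the write-protection pattern is left to the next patch:
      
          /* Two software-available bits, 52..53, classify the kind of SPTE. */
          #define SPTE_SPECIAL_MASK       (3ULL << 52)
          #define SPTE_AD_ENABLED_MASK    (0ULL << 52)    /* A/D bits usable   */
          #define SPTE_AD_DISABLED_MASK   (1ULL << 52)    /* A/D bits unusable */
          #define SPTE_MMIO_MASK          (3ULL << 52)    /* MMIO SPTE         */
      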
      Reviewed-by: Junaid Shahid <junaids@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6eeb4ef0
  6. 25 Sep, 2019 (5 commits)
  7. 24 Sep, 2019 (14 commits)
    • M
      kvm: nvmx: limit atomic switch MSRs · f0b5105a
      Authored by Marc Orr
      Allowing an unlimited number of MSRs to be specified via the VMX
      load/store MSR lists (e.g., vm-entry MSR load list) is bad for two
      reasons. First, a guest can specify an unreasonable number of MSRs,
      forcing KVM to process all of them in software. Second, the SDM bounds
      the number of MSRs allowed to be packed into the atomic switch MSR lists.
      Quoting the "Miscellaneous Data" section in the "VMX Capability
      Reporting Facility" appendix:
      
      "Bits 27:25 is used to compute the recommended maximum number of MSRs
      that should appear in the VM-exit MSR-store list, the VM-exit MSR-load
      list, or the VM-entry MSR-load list. Specifically, if the value bits
      27:25 of IA32_VMX_MISC is N, then 512 * (N + 1) is the recommended
      maximum number of MSRs to be included in each list. If the limit is
      exceeded, undefined processor behavior may result (including a machine
      check during the VMX transition)."
      
      Because KVM needs to protect itself and can't model "undefined processor
      behavior", arbitrarily force a VM-entry to fail due to MSR loading when
      the MSR load list is too large. Similarly, trigger an abort during a VM
      exit that encounters an MSR load list or MSR store list that is too large.
      
      The MSR list size is intentionally not pre-checked so as to maintain
      compatibility with hardware inasmuch as possible.
      
      Test these new checks with the kvm-unit-test "x86: nvmx: test max atomic
      switch MSRs".
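      
      A minimal sketch of the limit computation quoted above (the helper name
      is hypothetical):
      
          #include <linux/bits.h>
      
          /* Recommended maximum list size: 512 * (IA32_VMX_MISC[27:25] + 1). */
          static u32 nested_vmx_max_atomic_switch_msrs_sketch(u64 vmx_misc)
          {
                  u64 n = (vmx_misc & GENMASK_ULL(27, 25)) >> 25;
      
                  return 512 * (n + 1);
          }
      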
      Suggested-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Peter Shier <pshier@google.com>
      Signed-off-by: Marc Orr <marcorr@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f0b5105a
    • J
      kvm: svm: Intercept RDPRU · 0cb8410b
      Authored by Jim Mattson
      The RDPRU instruction gives the guest read access to the IA32_APERF
      MSR and the IA32_MPERF MSR. According to volume 3 of the APM, "When
      virtualization is enabled, this instruction can be intercepted by the
      Hypervisor. The intercept bit is at VMCB byte offset 10h, bit 14."
      Since we don't enumerate the instruction in KVM_SUPPORTED_CPUID,
      intercept it and synthesize #UD.
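      
      The handler described here amounts to a one-liner; a hedged sketch,
      assuming KVM's usual exception-queueing helper:
      
          /* Intercept RDPRU and synthesize #UD, since the instruction is not
           * enumerated to the guest. */
          static int rdpru_interception(struct vcpu_svm *svm)
          {
                  kvm_queue_exception(&svm->vcpu, UD_VECTOR);
                  return 1;       /* resume the guest */
          }
      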
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Drew Schmitt <dasch@google.com>
      Reviewed-by: Jacob Xu <jacobhxu@google.com>
      Reviewed-by: Peter Shier <pshier@google.com>
      Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      0cb8410b
    • S
      KVM: x86/mmu: Explicitly track only a single invalid mmu generation · ca333add
      Authored by Sean Christopherson
      Toggle mmu_valid_gen between '0' and '1' instead of blindly incrementing
      the generation.  Because slots_lock is held for the entire duration of
      zapping obsolete pages, it's impossible for there to be multiple invalid
      generations associated with shadow pages at any given time.
      
      Toggling between the two generations (valid vs. invalid) allows changing
      mmu_valid_gen from an unsigned long to a u8, which reduces the size of
      struct kvm_mmu_page from 160 to 152 bytes on 64-bit KVM, i.e. reduces
      KVM's memory footprint by 8 bytes per shadow page.
      
      Set sp->mmu_valid_gen before it is added to active_mmu_pages.
      Functionally this has no effect as kvm_mmu_alloc_page() has a single
      caller that sets sp->mmu_valid_gen soon thereafter, but visually it is
      jarring to see a shadow page being added to the list without its
      mmu_valid_gen first being set.
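      
      A hedged sketch of the toggle and the obsolescence check it enables; the
      wrapper function name is invented for illustration:
      
          /* Mark every existing shadow page obsolete by flipping the one-bit
           * generation (only ever done with slots_lock held). */
          static void kvm_mmu_mark_all_obsolete(struct kvm *kvm)
          {
                  kvm->arch.mmu_valid_gen = kvm->arch.mmu_valid_gen ? 0 : 1;
          }
      
          /* A shadow page is obsolete iff its generation no longer matches. */
          static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
          {
                  return unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
          }
      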
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ca333add
    • S
      KVM: x86/mmu: Revert "Revert "KVM: MMU: reclaim the zapped-obsolete page first"" · 31741eb1
      Authored by Sean Christopherson
      Now that the fast invalidate mechanism has been reintroduced, restore
      the performance tweaks for fast invalidation that existed prior to its
      removal.
      
      Paraphrasing the original changelog:
      
        Introduce a per-VM list to track obsolete shadow pages, i.e. pages
        which have been deleted from the mmu cache but haven't yet been freed.
        When page reclaiming is needed, zap/free the deleted pages first.
      
      This reverts commit 52d5dedc.
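      
      A hedged sketch of the reclaim preference being restored; the surrounding
      shrinker function is abridged and its name is illustrative:
      
          static void mmu_shrink_reclaim_sketch(struct kvm *kvm)
          {
                  /* Free pages that were already invalidated before zapping
                   * any live shadow pages. */
                  if (!list_empty(&kvm->arch.zapped_obsolete_pages)) {
                          kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
                          return;
                  }
                  /* ... otherwise fall back to zapping active pages ... */
          }
      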
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      31741eb1
    • T
      KVM: vmx: Introduce handle_unexpected_vmexit and handle WAITPKG vmexit · bf653b78
      Authored by Tao Xu
      Per the latest Intel 64 and IA-32 Architectures Software Developer's
      Manual, the UMWAIT and TPAUSE instructions cause a VM exit if the
      "RDTSC exiting" and "enable user wait and pause" VM-execution
      controls are both 1.
      
      Because KVM never enables RDTSC exiting, VM exits for UMWAIT and TPAUSE
      should never happen. EXIT_REASON_XSAVES and EXIT_REASON_XRSTORS are
      likewise unexpected VM exits for KVM. Introduce a common exit helper,
      handle_unexpected_vmexit(), to handle these unexpected VM exits.
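      
      A hedged sketch of the common helper's shape; the exact reporting in the
      real patch may differ:
      
          /* These exits should be unreachable: warn once and keep running. */
          static int handle_unexpected_vmexit(struct kvm_vcpu *vcpu)
          {
                  kvm_skip_emulated_instruction(vcpu);
                  WARN_ONCE(1, "Unexpected VM-Exit Reason = 0x%x",
                            vmcs_read32(VM_EXIT_REASON));
                  return 1;
          }
      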
      Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
      Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
      Signed-off-by: Tao Xu <tao3.xu@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bf653b78
    • T
      KVM: x86: Add support for user wait instructions · e69e72fa
      Authored by Tao Xu
      UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
      This patch adds support for user wait instructions in KVM. Availability
      of the user wait instructions is indicated by the presence of the CPUID
      feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
      be executed at any privilege level, and use the 32-bit IA32_UMWAIT_CONTROL MSR
      to set the maximum time.
      
      The behavior of user wait instructions in VMX non-root operation is
      determined first by the setting of the "enable user wait and pause"
      secondary processor-based VM-execution control bit 26.
      	If the VM-execution control is 0, UMONITOR/UMWAIT/TPAUSE cause
      an invalid-opcode exception (#UD).
      	If the VM-execution control is 1, treatment is based on the
      setting of the "RDTSC exiting" VM-execution control. Because KVM never
      enables RDTSC exiting, if the instruction causes a delay, the amount of
      time delayed is called here the physical delay. The physical delay is
      first computed by determining the virtual delay. If
      IA32_UMWAIT_CONTROL[31:2] is zero, the virtual delay is the value in
      EDX:EAX minus the value that RDTSC would return; if
      IA32_UMWAIT_CONTROL[31:2] is not zero, the virtual delay is the minimum
      of that difference and AND(IA32_UMWAIT_CONTROL,FFFFFFFCH).
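      
      The delay rule above, restated as a hedged sketch (not KVM code; the
      helper name is hypothetical):
      
          /* Virtual delay: EDX:EAX minus the current TSC, optionally clamped
           * by IA32_UMWAIT_CONTROL with its low two bits masked off. */
          static inline u64 umwait_virtual_delay(u64 deadline, u64 tsc_now,
                                                 u32 umwait_control)
          {
                  u64 delay = deadline - tsc_now;
                  u32 clamp = umwait_control & 0xfffffffcU;
      
                  return clamp ? min(delay, (u64)clamp) : delay;
          }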
      
      Because UMWAIT and TPAUSE can put a (physical) CPU into a power-saving
      state, by default we don't expose the feature to the guest and enable it
      only when the guest CPUID has it.
      
      Detailed information about user wait instructions can be found in the
      latest Intel 64 and IA-32 Architectures Software Developer's Manual.
      Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
      Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
      Signed-off-by: Tao Xu <tao3.xu@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e69e72fa
    • S
      KVM: x86: Add comments to document various emulation types · 41577ab8
      Authored by Sean Christopherson
      Document the intended usage of each emulation type as each exists to
      handle an edge case of one kind or another and can be easily
      misinterpreted at first glance.
      
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      41577ab8
    • S
      KVM: x86: Remove emulation_result enums, EMULATE_{DONE,FAIL,USER_EXIT} · 60fc3d02
      Authored by Sean Christopherson
      Deferring emulation failure handling (in some cases) to the caller of
      x86_emulate_instruction() has proven fragile, e.g. multiple instances of
      KVM not setting run->exit_reason on EMULATE_FAIL, largely due to it
      being difficult to discern what emulation types can return what result,
      and which combination of types and results are handled where.
      
      Now that x86_emulate_instruction() always handles emulation failure,
      i.e. EMULATE_FAIL is only referenced in callers, remove the
      emulation_result enums entirely.  Per KVM's existing exit handling
      conventions, return '0' and '1' for "exit to userspace" and "resume
      guest" respectively.  Doing so cleans up many callers, e.g. they can
      return kvm_emulate_instruction() directly instead of having to interpret
      its result.
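      
      A hedged sketch of what call sites look like under the new convention
      ('1' resumes the guest, '0' exits to userspace; the handler shown is
      illustrative):
      
          /* Callers can simply forward the emulator's return value. */
          static int handle_ud_sketch(struct kvm_vcpu *vcpu)
          {
                  return kvm_emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
          }
      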
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      60fc3d02
    • S
      KVM: x86: Add explicit flag for forced emulation on #UD · b4000606
      Authored by Sean Christopherson
      Add an explicit emulation type for forced #UD emulation and use it to
      detect that KVM should unconditionally inject a #UD instead of falling
      into its standard emulation failure handling.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b4000606
    • S
      KVM: x86: Move #GP injection for VMware into x86_emulate_instruction() · 42cbf068
      Authored by Sean Christopherson
      Immediately inject a #GP when VMware emulation fails and return
      EMULATE_DONE instead of propagating EMULATE_FAIL up the stack.  This
      helps pave the way for removing EMULATE_FAIL altogether.
      
      Rename EMULTYPE_VMWARE to EMULTYPE_VMWARE_GP to document that the x86
      emulator is called to handle VMware #GP interception, e.g. why a #GP
      is injected on emulation failure for EMULTYPE_VMWARE_GP.
      
      Drop EMULTYPE_NO_UD_ON_FAIL as a standalone type.  The "no #UD on fail"
      is used only in the VMware case and is obsoleted by having the emulator
      itself reinject #GP.
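      
      A hedged sketch of the failure path described above, as it might appear
      inside x86_emulate_instruction() (exact placement assumed):
      
          /* VMware #GP interception: on emulation failure, reinject #GP(0)
           * and report emulation as complete. */
          if (emulation_type & EMULTYPE_VMWARE_GP) {
                  kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
                  return EMULATE_DONE;
          }
      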
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      42cbf068
    • V
      KVM: x86: hyper-v: set NoNonArchitecturalCoreSharing CPUID bit when SMT is impossible · b2d8b167
      Authored by Vitaly Kuznetsov
      Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
      guarantee that two virtual processors won't end up running on sibling SMT
      threads without knowing about it. This is done as an optimization as in
      this case there is nothing the guest can do to protect itself against MDS
      and issuing additional flush requests is just pointless. On bare metal the
      topology is known, however, when Hyper-V is running nested (e.g. on top of
      KVM) it needs an additional piece of information: a confirmation that the
      exposed topology (wrt vCPU placement on different SMT threads) is
      trustworthy.
      
      NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
      TLFS as follows: "Indicates that a virtual processor will never share a
      physical core with another virtual processor, except for virtual processors
      that are reported as sibling SMT threads." From KVM we can give such
      guarantee in two cases:
      - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
       as it can become re-enabled during the lifetime of the guest).
      - vCPUs are properly pinned so the scheduler won't put them on sibling
      SMT threads (when they're not reported as such).
      
      This patch reports the NoNonArchitecturalCoreSharing bit to userspace in the
      first case. The second case is outside of KVM's domain of responsibility
      (as vCPU pinning is actually done by someone who manages KVM's userspace -
      e.g. libvirt pinning QEMU threads).
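      
      A hedged sketch of the first case, as it might look in the Hyper-V CPUID
      code (helper and flag names are assumptions):
      
          /* CPUID 0x40000004 EAX, bit 18: report the guarantee only when
           * sibling-SMT sharing is impossible for the lifetime of the VM. */
          if (!cpu_smt_possible())
                  ent->eax |= HV_X64_NO_NONARCH_CORESHARING;
      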
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b2d8b167
    • V
      KVM/Hyper-V/VMX: Add direct tlb flush support · 6f6a657c
      Authored by Vitaly Kuznetsov
      Hyper-V provides a direct TLB flush function which helps the
      L1 hypervisor handle Hyper-V TLB flush requests from
      L2 guests. Add support for this function to VMX.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6f6a657c
    • T
      KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH · 344c6c80
      Authored by Tianyu Lan
      The Hyper-V direct TLB flush function should be enabled only for
      guests that use Hyper-V hypercalls exclusively. A user space
      hypervisor (e.g. QEMU) can disable KVM identification in
      CPUID and expose only the Hyper-V identification to ensure
      that precondition. Add the new KVM capability
      KVM_CAP_HYPERV_DIRECT_TLBFLUSH so that user space can enable the
      Hyper-V direct TLB flush function; it is disabled by default in KVM.
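      
      A hedged userspace sketch of opting in, assuming the capability is
      enabled per vCPU via KVM_ENABLE_CAP:
      
          #include <linux/kvm.h>
          #include <sys/ioctl.h>
      
          /* Ask KVM to let the underlying Hyper-V handle the guest's
           * TLB flush hypercalls directly. */
          static int enable_hv_direct_tlbflush(int vcpu_fd)
          {
                  struct kvm_enable_cap cap = {
                          .cap = KVM_CAP_HYPERV_DIRECT_TLBFLUSH,
                  };
      
                  return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
          }
      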
      Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      344c6c80
    • T
      x86/Hyper-V: Fix definition of struct hv_vp_assist_page · 7a83247e
      Authored by Tianyu Lan
      The struct hv_vp_assist_page was defined incorrectly:
      "vtl_control" should be u64[3], "nested_enlightenments_control"
      should be a u64, and there are 7 reserved bytes
      following "enlighten_vmentry". Fix the definition.
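      
      A hedged sketch of the corrected layout as described; the field types and
      the members beyond what the text states are assumptions:
      
          struct hv_vp_assist_page {
                  __u32 apic_assist;
                  __u32 reserved1;
                  __u64 vtl_control[3];                   /* now three u64s       */
                  __u64 nested_enlightenments_control;    /* now a single u64     */
                  __u8  enlighten_vmentry;
                  __u8  reserved2[7];                     /* the 7 reserved bytes */
                  __u64 current_nested_vmcs;
          } __packed;
      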
      Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7a83247e
  8. 16 Sep, 2019 (2 commits)
    • R
      x86: bug.h: use asm_inline in _BUG_FLAGS definitions · 32ee8230
      Authored by Rasmus Villemoes
      This helps prevent a BUG* or WARN* in some static inline from
      keeping that inline (or one of its callers) from being inlined, so it
      should allow gcc to make better-informed inlining decisions.
      
      For example, with gcc 9.2, tcp_fastopen_no_cookie() vanishes from
      net/ipv4/tcp_fastopen.o. It does not itself have any BUG or WARN, but
      it calls dst_metric() which has a WARN_ON_ONCE - and despite that
      WARN_ON_ONCE vanishing since the condition is compile-time false,
      dst_metric() is apparently sufficiently "large" that when it gets
      inlined into tcp_fastopen_no_cookie(), the latter becomes too large
      for inlining.
      
      Overall, according to size(1), .text decreases a little and .data
      increases by about the same amount (x86-64 defconfig):
      
      $ size vmlinux.{before,after}
         text    data     bss     dec     hex filename
      19709726        5202600 1630280 26542606        195020e vmlinux.before
      19709330        5203068 1630280 26542678        1950256 vmlinux.after
      
      while bloat-o-meter says
      
      add/remove: 10/28 grow/shrink: 103/51 up/down: 3669/-2854 (815)
      ...
      Total: Before=14783683, After=14784498, chg +0.01%
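      
      For reference, a hedged sketch of the asm_inline fallback these patches
      rely on (the config symbol name is an assumption):
      
          /* gcc 9+ understands "asm inline", which makes it treat the asm
           * statement as minimum size for inlining decisions; older compilers
           * simply see a plain asm. */
          #ifdef CONFIG_CC_HAS_ASM_INLINE
          #define asm_inline asm __inline
          #else
          #define asm_inline asm
          #endif
      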
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      32ee8230
    • R
      x86: alternative.h: use asm_inline for all alternative variants · 40576e5e
      Authored by Rasmus Villemoes
      Most, if not all, uses of the alternative* family just provide one or
      two instructions in .text, but the string literal can be quite large,
      causing gcc to overestimate the size of the generated code. That in
      turn affects its decisions about inlining of the function containing
      the alternative() asm statement.
      
      New enough versions of gcc allow one to overrule the estimated size by
      using "asm inline" instead of just "asm". So replace asm with the helper
      asm_inline, which for older gccs just expands to asm.
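      
      A hedged sketch of the resulting pattern (macro body abridged):
      
          /* The replacement is one or two instructions, but the ALTERNATIVE()
           * string literal is large; asm_inline keeps gcc from overestimating
           * the code size of the containing function. */
          #define alternative(oldinstr, newinstr, feature)                        \
                  asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, feature)    \
                                      : : : "memory")
      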
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      40576e5e
  9. 14 Sep, 2019 (1 commit)
    • S
      KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot · 002c5f73
      Authored by Sean Christopherson
      James Harvey reported a livelock that was introduced by commit
      d012a06a ("Revert "KVM: x86/mmu: Zap only the relevant pages when
      removing a memslot"").
      
      The livelock occurs because kvm_mmu_zap_all() as it exists today will
      voluntarily reschedule and drop KVM's mmu_lock, which allows other vCPUs
      to add shadow pages.  With enough vCPUs, kvm_mmu_zap_all() can get stuck
      in an infinite loop as it can never zap all pages before observing lock
      contention or the need to reschedule.  The equivalent of kvm_mmu_zap_all()
      that was in use at the time of the reverted commit (4e103134, "KVM:
      x86/mmu: Zap only the relevant pages when removing a memslot") employed
      a fast invalidate mechanism and was not susceptible to the above livelock.
      
      There are three ways to fix the livelock:
      
      - Reverting the revert (commit d012a06a) is not a viable option as
        the revert is needed to fix a regression that occurs when the guest has
        one or more assigned devices.  It's unlikely we'll root cause the device
        assignment regression soon enough to fix the regression timely.
      
      - Remove the conditional reschedule from kvm_mmu_zap_all().  However, although
        removing the reschedule would be a smaller code change, it's less safe
        in the sense that the resulting kvm_mmu_zap_all() hasn't been used in
        the wild for flushing memslots since the fast invalidate mechanism was
        introduced by commit 6ca18b69 ("KVM: x86: use the fast way to
        invalidate all pages"), back in 2013.
      
      - Reintroduce the fast invalidate mechanism and use it when zapping shadow
        pages in response to a memslot being deleted/moved, which is what this
        patch does.
      
      For all intents and purposes, this is a revert of commit ea145aac
      ("Revert "KVM: MMU: fast invalidate all pages"") and a partial revert of
      commit 7390de1e ("Revert "KVM: x86: use the fast way to invalidate
      all pages""), i.e. restores the behavior of commit 5304b8d3 ("KVM:
      MMU: fast invalidate all pages") and commit 6ca18b69 ("KVM: x86:
      use the fast way to invalidate all pages") respectively.
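      
      A hedged sketch of the memslot-flush path after this patch; the function
      names are assumed from the description:
      
          /* Flush a deleted/moved memslot via the reintroduced fast-invalidate
           * mechanism instead of the reschedule-prone kvm_mmu_zap_all(). */
          static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
                          struct kvm_memory_slot *slot,
                          struct kvm_page_track_notifier_node *node)
          {
                  kvm_mmu_zap_all_fast(kvm);
          }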
      
      Fixes: d012a06a ("Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"")
      Reported-by: James Harvey <jamespharvey20@gmail.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      002c5f73
  10. 12 Sep, 2019 (2 commits)
    • L
      KVM: x86: Fix INIT signal handling in various CPU states · 4b9852f4
      Authored by Liran Alon
      Commit cd7764fe ("KVM: x86: latch INITs while in system management mode")
      changed the code to latch INIT while the vCPU is in SMM and to process the
      latched INIT when leaving SMM. It left a subtle remark in the commit message
      that similar treatment should also be done while the vCPU is in VMX non-root mode.
      
      However, INIT signals should actually be latched in various vCPU states:
      (*) For both Intel and AMD, INIT signals should be latched while vCPU
      is in SMM.
      (*) For Intel, INIT should also be latched while vCPU is in VMX
      operation and later processed when vCPU leaves VMX operation by
      executing VMXOFF.
      (*) For AMD, INIT should also be latched while vCPU runs with GIF=0
      or in guest-mode with intercept defined on INIT signal.
      
      To fix this:
      1) Add kvm_x86_ops->apic_init_signal_blocked() such that each CPU vendor
      can define the various CPU states in which INIT signals should be
      blocked and modify kvm_apic_accept_events() to use it.
      2) Modify vmx_check_nested_events() to check for a pending INIT signal
      while the vCPU is in guest mode. If so, emulate a vmexit on
      EXIT_REASON_INIT_SIGNAL. Note that nSVM should have similar behaviour,
      but this is currently left as a TODO comment to implement in the future
      because nSVM doesn't yet implement svm_check_nested_events().
      
      Note: Currently, KVM's nVMX implementation doesn't support the VMX
      wait-for-SIPI activity state as specified in MSR_IA32_VMX_MISC bits 6:8
      exposed to the guest (see nested_vmx_setup_ctls_msrs()).
      If and when support for this activity state is implemented,
      kvm_check_nested_events() would need to avoid emulating a vmexit on an
      INIT signal in case the activity state is wait-for-SIPI. In addition,
      kvm_apic_accept_events() would need to be modified to avoid discarding a
      SIPI in case the VMX activity state is wait-for-SIPI and instead delay
      SIPI processing to vmx_check_nested_events(), which would clear
      pending APIC events and emulate a vmexit on SIPI.
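      
      A hedged sketch of the VMX side of point 1 above (INIT stays blocked for
      as long as the vCPU is in VMX operation):
      
          static bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
          {
                  /* Latched until the guest executes VMXOFF. */
                  return to_vmx(vcpu)->nested.vmxon;
          }
      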
      Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
      Co-developed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
      Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4b9852f4
    • L
      KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode · 4a53d99d
      Authored by Liran Alon
      According to Intel SDM section 25.2 "Other Causes of VM Exits",
      when an INIT signal is received on a CPU that is running in VMX
      non-root mode, it should cause a VM exit with exit reason 3
      (see Intel SDM Appendix C "VMX Basic Exit Reasons").
      
      This patch introduces the exit-reason definition.
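      
      The definition itself is a single constant, matching the SDM table cited
      above:
      
          #define EXIT_REASON_INIT_SIGNAL         3
      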
      Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com>
      Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
      Co-developed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
      Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4a53d99d
  11. 11 Sep, 2019 (4 commits)
  12. 10 Sep, 2019 (1 commit)
  13. 06 Sep, 2019 (3 commits)
  14. 03 Sep, 2019 (2 commits)