1. 17 March 2020, 1 commit
  2. 22 February 2020, 1 commit
  3. 05 February 2020, 1 commit
  4. 14 January 2020, 1 commit
  5. 09 January 2020, 3 commits
  6. 24 September 2019, 2 commits
    • kvm: nvmx: limit atomic switch MSRs · f0b5105a
      Marc Orr committed
      Allowing an unlimited number of MSRs to be specified via the VMX
      load/store MSR lists (e.g., vm-entry MSR load list) is bad for two
      reasons. First, a guest can specify an unreasonable number of MSRs,
      forcing KVM to process all of them in software. Second, the SDM bounds
      the number of MSRs allowed to be packed into the atomic switch MSR lists.
      Quoting the "Miscellaneous Data" section in the "VMX Capability
      Reporting Facility" appendix:
      
      "Bits 27:25 is used to compute the recommended maximum number of MSRs
      that should appear in the VM-exit MSR-store list, the VM-exit MSR-load
      list, or the VM-entry MSR-load list. Specifically, if the value bits
      27:25 of IA32_VMX_MISC is N, then 512 * (N + 1) is the recommended
      maximum number of MSRs to be included in each list. If the limit is
      exceeded, undefined processor behavior may result (including a machine
      check during the VMX transition)."
      
      Because KVM needs to protect itself and can't model "undefined processor
      behavior", arbitrarily force a VM-entry to fail due to MSR loading when
      the MSR load list is too large. Similarly, trigger an abort during a VM
      exit that encounters an MSR load list or MSR store list that is too large.
      
      The MSR list size is intentionally not pre-checked so as to maintain
      compatibility with hardware as much as possible.
      
      Test these new checks with the kvm-unit-test "x86: nvmx: test max atomic
      switch MSRs".
      Suggested-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Peter Shier <pshier@google.com>
      Signed-off-by: Marc Orr <marcorr@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f0b5105a
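      A minimal sketch of the limit that the quoted SDM text defines; the
      helpers below are illustrative, not the functions KVM actually uses:

          #include <stdint.h>

          /* Recommended maximum atomic-switch MSR list length, derived from
           * IA32_VMX_MISC per the SDM: 512 * (N + 1), where N is bits 27:25. */
          static inline uint32_t max_atomic_switch_msrs(uint64_t vmx_misc)
          {
                  uint32_t n = (vmx_misc >> 25) & 0x7;    /* bits 27:25 */

                  return 512 * (n + 1);
          }

          /* Entry-time check described above: fail the (nested) VM-entry if
           * the guest's MSR load/store list exceeds the recommended maximum,
           * rather than emulating undefined hardware behavior. */
          static inline int check_msr_list_len(uint64_t vmx_misc, uint32_t count)
          {
                  return count <= max_atomic_switch_msrs(vmx_misc) ? 0 : -1;
          }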
    • KVM: x86: Add support for user wait instructions · e69e72fa
      Tao Xu committed
      UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
      This patch adds support for user wait instructions in KVM. Availability
      of the user wait instructions is indicated by the presence of the CPUID
      feature flag WAITPKG, CPUID.0x07.0x0:ECX[5]. User wait instructions may
      be executed at any privilege level and use the 32-bit IA32_UMWAIT_CONTROL
      MSR to set the maximum wait time.
      
      The behavior of user wait instructions in VMX non-root operation is
      determined first by the setting of the "enable user wait and pause"
      secondary processor-based VM-execution control bit 26.
      	If the VM-execution control is 0, UMONITOR/UMWAIT/TPAUSE cause
      an invalid-opcode exception (#UD).
      	If the VM-execution control is 1, treatment is based on the
      setting of the "RDTSC exiting" VM-execution control. Because KVM never
      enables RDTSC exiting, if the instruction causes a delay, the amount of
      time delayed is called here the physical delay. The physical delay is
      first computed by determining the virtual delay. If
      IA32_UMWAIT_CONTROL[31:2] is zero, the virtual delay is the value in
      EDX:EAX minus the value that RDTSC would return; if
      IA32_UMWAIT_CONTROL[31:2] is not zero, the virtual delay is the minimum
      of that difference and AND(IA32_UMWAIT_CONTROL,FFFFFFFCH).
      
      Because umwait and tpause can put a (physical) CPU into a power saving
      state, by default we don't expose them to KVM guests and enable them
      only when the guest CPUID has the feature.
      
      Detailed information about user wait instructions can be found in the
      latest Intel 64 and IA-32 Architectures Software Developer's Manual.
      Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
      Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
      Signed-off-by: Tao Xu <tao3.xu@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e69e72fa
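      A small, illustrative rendering of the virtual-delay rule quoted above;
      the helper and parameter names are hypothetical:

          #include <stdint.h>

          /* 'tsc_deadline' is the EDX:EAX operand of UMWAIT/TPAUSE and
           * 'tsc_now' is what RDTSC would return at that point. */
          static inline uint64_t umwait_virtual_delay(uint64_t tsc_deadline,
                                                      uint64_t tsc_now,
                                                      uint32_t umwait_control)
          {
                  uint64_t delay = tsc_deadline > tsc_now ? tsc_deadline - tsc_now : 0;
                  uint32_t max = umwait_control & 0xfffffffcu;  /* bits 31:2 */

                  /* No cap if IA32_UMWAIT_CONTROL[31:2] is zero; otherwise the
                   * delay is clamped to AND(IA32_UMWAIT_CONTROL, FFFFFFFCH). */
                  if (max && delay > max)
                          delay = max;
                  return delay;
          }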
  7. 11 September 2019, 1 commit
  8. 05 June 2019, 1 commit
  9. 21 December 2018, 1 commit
    • KVM: x86: Add Intel PT virtualization work mode · f99e3daf
      Chao Peng committed
      Intel Processor Trace virtualization can work in one of two possible
      modes:
      
      a. System-Wide mode (default):
         When the host configures Intel PT to collect trace packets
         of the entire system, it can leave the relevant VMX controls
         clear to allow VMX-specific packets to provide information
         across VMX transitions.
          The KVM guest is not aware of this feature in this mode, and both
          the host and KVM guest traces are output to the host buffer.
      
      b. Host-Guest mode:
          The host can configure trace-packet generation both while in
          VMX non-root operation (for guests) and while in root operation
          (for native execution).
          Intel PT is exposed to the KVM guest in this mode, and the trace
          output goes to the respective buffers of the host and the guest.
          In this mode, the PT state is saved and tracing is disabled
          before VM-entry, and restored after VM-exit, when tracing a
          virtual machine.
      Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
      Signed-off-by: Luwei Kang <luwei.kang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f99e3daf
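      A rough sketch of the Host-Guest mode sequence described above; the
      structure and helpers are illustrative, not the ones used in vmx.c:

          #include <stdint.h>

          /* Hypothetical container for the PT state that has to be
           * context-switched when tracing a virtual machine. */
          struct pt_state {
                  uint64_t rtit_ctl;      /* IA32_RTIT_CTL              */
                  uint64_t output_base;   /* IA32_RTIT_OUTPUT_BASE      */
                  uint64_t output_mask;   /* IA32_RTIT_OUTPUT_MASK_PTRS */
          };

          /* Before VM-entry: save the host configuration, stop host-side
           * tracing, then install the guest values. */
          static void pt_guest_enter(struct pt_state *host_save,
                                     const struct pt_state *guest,
                                     struct pt_state *hw)
          {
                  *host_save = *hw;
                  hw->rtit_ctl &= ~1ull;          /* clear TraceEn (bit 0) */
                  hw->output_base = guest->output_base;
                  hw->output_mask = guest->output_mask;
                  hw->rtit_ctl = guest->rtit_ctl;
          }

          /* After VM-exit: stash the guest values and restore the host ones
           * so host-side tracing can resume. */
          static void pt_guest_exit(struct pt_state *host_save,
                                    struct pt_state *guest_save,
                                    struct pt_state *hw)
          {
                  *guest_save = *hw;
                  *hw = *host_save;
          }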
  10. 27 November 2018, 1 commit
  11. 17 October 2018, 1 commit
    • KVM/x86: Use assembly instruction mnemonics instead of .byte streams · 4b1e5478
      Uros Bizjak committed
      Recently the minimum required version of binutils was changed to 2.20,
      which supports all VMX instruction mnemonics. The patch removes
      all .byte #defines and uses real instruction mnemonics instead.
      
      The compiler is now able to pass a memory operand to the instruction,
      so there is no need for the memory clobber anymore. Also, the compiler
      adds the CC register clobber automatically to all extended asm clauses,
      so the patch also removes the explicit CC clobber.
      
      The immediate benefit of the patch is the removal of many unnecessary
      register moves, resulting in 1434 saved bytes in vmx.o:
      
         text    data     bss     dec     hex filename
       151257   18246    8500  178003   2b753 vmx.o
       152691   18246    8500  179437   2bced vmx-old.o
      
      Some examples of improvement include removal of unneeded moves
      of %rsp to %rax in front of invept and invvpid instructions:
      
          a57e:	b9 01 00 00 00       	mov    $0x1,%ecx
          a583:	48 89 04 24          	mov    %rax,(%rsp)
          a587:	48 89 e0             	mov    %rsp,%rax
          a58a:	48 c7 44 24 08 00 00 	movq   $0x0,0x8(%rsp)
          a591:	00 00
          a593:	66 0f 38 80 08       	invept (%rax),%rcx
      
      to:
      
          a45c:	48 89 04 24          	mov    %rax,(%rsp)
          a460:	b8 01 00 00 00       	mov    $0x1,%eax
          a465:	48 c7 44 24 08 00 00 	movq   $0x0,0x8(%rsp)
          a46c:	00 00
          a46e:	66 0f 38 80 04 24    	invept (%rsp),%rax
      
      and the ability to use more optimal registers and memory operands
      in the instruction:
      
          8faa:	48 8b 44 24 28       	mov    0x28(%rsp),%rax
          8faf:	4c 89 c2             	mov    %r8,%rdx
          8fb2:	0f 79 d0             	vmwrite %rax,%rdx
      
      to:
      
          8e7c:	44 0f 79 44 24 28    	vmwrite 0x28(%rsp),%r8
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4b1e5478
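      For illustration only, a simplified version of what the mnemonic change
      enables (not the actual vmx.c code): with a binutils that knows the VMX
      mnemonics, the descriptor can be passed as a real "m" operand instead of
      being forced through %rax for a hand-encoded .byte stream:

          #include <stdint.h>

          struct invept_desc {
                  uint64_t eptp;
                  uint64_t reserved;
          };

          static inline void invept_single_context(uint64_t eptp)
          {
                  struct invept_desc desc = { .eptp = eptp, .reserved = 0 };
                  uint64_t type = 1;      /* single-context invalidation */

                  /* AT&T operand order: memory descriptor first, type register
                   * second, matching the "invept (%rsp),%rax" form above. */
                  asm volatile("invept %0, %1" : : "m"(desc), "r"(type));
          }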
  12. 22 August 2018, 1 commit
    • KVM: vmx: Add defines for SGX ENCLS exiting · 802ec461
      Sean Christopherson committed
      Hardware support for basic SGX virtualization adds a new execution
      control (ENCLS_EXITING), VMCS field (ENCLS_EXITING_BITMAP) and exit
      reason (ENCLS), that enables a VMM to intercept specific ENCLS leaf
      functions, e.g. to inject faults when the VMM isn't exposing SGX to
      a VM.  When ENCLS_EXITING is enabled, the VMM can set/clear bits in
      the bitmap to intercept/allow ENCLS leaf functions in non-root, e.g.
      setting bit 2 in the ENCLS_EXITING_BITMAP will cause ENCLS[EINIT]
      to VMExit(ENCLS).
      
      Note: EXIT_REASON_ENCLS was previously added by commit 1f519992
      ("KVM: VMX: add missing exit reasons").
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20180814163334.25724-2-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      802ec461
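      A sketch of how the new defines would be used; the constant values below
      follow the SDM/commit description but should be treated as illustrative:

          #include <stdint.h>

          #define SECONDARY_EXEC_ENCLS_EXITING    0x00008000u  /* secondary control bit 15 */
          #define ENCLS_EXITING_BITMAP            0x0000202eu  /* VMCS field encoding      */
          #define EXIT_REASON_ENCLS               60

          /* To trap ENCLS[EINIT] in non-root mode, enable ENCLS exiting and
           * set bit 2 (the EINIT leaf number) in the exiting bitmap. */
          static inline uint64_t encls_intercept_einit(uint64_t bitmap)
          {
                  return bitmap | (1ull << 2);
          }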
  13. 05 August 2018, 1 commit
  14. 13 July 2018, 2 commits
  15. 22 June 2018, 1 commit
    • kvm: vmx: Nested VM-entry prereqs for event inj. · 0447378a
      Marc Orr committed
      This patch extends the checks done prior to a nested VM entry.
      Specifically, it extends the check_vmentry_prereqs function with checks
      for fields relevant to the VM-entry event injection information, as
      described in the Intel SDM, volume 3.
      
      This patch is motivated by a syzkaller bug, where a bad VM-entry
      interruption information field is generated in the VMCS02, which causes
      the nested VM launch to fail. Then, KVM fails to resume L1.
      
      While KVM should be improved to correctly resume L1 execution after a
      failed nested launch, this change is justified because the existing code
      to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
      sparse.
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Marc Orr <marcorr@google.com>
      [Removed comment whose parts were describing previous revisions and the
       rest was obvious from function/variable naming. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      0447378a
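      A partial, illustrative version of the SDM checks on the VM-entry
      interruption-information field; this is not the exact set of checks the
      patch adds to check_vmentry_prereqs:

          #include <stdbool.h>
          #include <stdint.h>

          static bool vmentry_intr_info_looks_valid(uint32_t intr_info)
          {
                  uint32_t vector = intr_info & 0xff;           /* bits 7:0  */
                  uint32_t type   = (intr_info >> 8) & 0x7;     /* bits 10:8 */
                  bool valid      = intr_info & (1u << 31);

                  if (!valid)
                          return true;          /* nothing to inject */
                  if (intr_info & 0x7ffff000)   /* reserved bits 30:12 must be 0 */
                          return false;
                  if (type == 1)                /* interruption type 1 is reserved */
                          return false;
                  if (type == 2 && vector != 2) /* NMI must use vector 2 */
                          return false;
                  if (type == 3 && vector > 31) /* hardware exception vector range */
                          return false;
                  return true;
          }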
  16. 23 May 2018, 1 commit
    • KVM: nVMX: Restore the VMCS12 offsets for v4.0 fields · b348e793
      Jim Mattson committed
      Changing the VMCS12 layout will break save/restore compatibility with
      older kvm releases once the KVM_{GET,SET}_NESTED_STATE ioctls are
      accepted upstream. Google has already been using these ioctls for some
      time, and we implore the community not to disturb the existing layout.
      
      Move the four most recently added fields to preserve the offsets of
      the previously defined fields and reserve locations for the vmread and
      vmwrite bitmaps, which will be used in the virtualization of VMCS
      shadowing (to improve the performance of double-nesting).
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      [Kept the SDM order in vmcs_field_to_offset_table. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b348e793
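      A toy illustration of the offset-preserving idea (not the real vmcs12
      layout): fields that already shipped keep their offsets, and new fields
      take explicitly reserved slots instead of shifting everything after them:

          #include <stdint.h>

          struct toy_state_v1 {
                  uint64_t old_field_a;    /* offset 0x00, frozen by the ABI */
                  uint64_t old_field_b;    /* offset 0x08, frozen by the ABI */
                  uint64_t reserved[2];    /* room for future fields         */
          };

          struct toy_state_v2 {
                  uint64_t old_field_a;    /* still offset 0x00     */
                  uint64_t old_field_b;    /* still offset 0x08     */
                  uint64_t vmread_bitmap;  /* takes a reserved slot */
                  uint64_t vmwrite_bitmap; /* takes the other one   */
          };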
  17. 21 March 2018, 1 commit
    • kvm/x86: fix icebp instruction handling · 32d43cd3
      Linus Torvalds committed
      The undocumented 'icebp' instruction (aka 'int1') works pretty much like
      'int3' in the absence of in-circuit probing equipment (except,
      obviously, that it raises #DB instead of raising #BP), and is used by
      some validation test-suites as such.
      
      But Andy Lutomirski noticed that his test suite acted differently in kvm
      than on bare hardware.
      
      The reason is that kvm used an inexact test for the icebp instruction:
      it just assumed that an all-zero VM exit qualification value meant that
      the VM exit was due to icebp.
      
      That is not unlike the guess that do_debug() does for the actual
      exception handling case, but it's purely a heuristic, not an absolute
      rule.  do_debug() does it because it wants to ascribe _some_ reasons to
      the #DB that happened, and an empty %dr6 value means that 'icebp' is the
      most likely cause and we have no better information.
      
      But kvm can just do it right, because unlike the do_debug() case, kvm
      actually sees the real reason for the #DB in the VM-exit interruption
      information field.
      
      So instead of relying on an inexact heuristic, just use the actual VM
      exit information that says "it was 'icebp'".
      
      Right now the 'icebp' instruction isn't technically documented by Intel,
      but that will hopefully change.  The special "privileged software
      exception" information _is_ actually mentioned in the Intel SDM, even
      though the cause of it isn't enumerated.
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Tested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      32d43cd3
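      A sketch of the "use the real reason" idea: the VM-exit interruption
      information reports interruption type 5 (privileged software exception)
      for icebp/int1, so no guessing from an empty exit qualification or %dr6
      is needed. The bit layout follows the SDM; treat the helper as illustrative:

          #include <stdbool.h>
          #include <stdint.h>

          static bool exit_was_icebp(uint32_t exit_intr_info)
          {
                  const uint32_t valid = 1u << 31;                    /* bit 31    */
                  const uint32_t type  = (exit_intr_info >> 8) & 0x7; /* bits 10:8 */

                  return (exit_intr_info & valid) && type == 5;
          }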
  18. 12 October 2017, 1 commit
  19. 25 August 2017, 1 commit
    • KVM: MMU: Add 5 level EPT & Shadow page table support. · 855feb67
      Yu Zhang committed
      Extends the shadow paging code, so that a 5-level shadow page
      table can be constructed if the VM is running in 5-level paging
      mode.
      
      Also extends the EPT code, so that a 5-level EPT table can be
      constructed if the maxphysaddr of the VM exceeds 48 bits. Unlike the
      shadow logic, KVM should still use a 4-level EPT table for a VM
      whose physical address width is less than 48 bits, even when
      the VM is running in 5-level paging mode.
      Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
      [Unconditionally reset the MMU context in kvm_cpuid_update.
       Changing MAXPHYADDR invalidates the reserved bit bitmasks.
       - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      855feb67
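      A minimal sketch of the level-selection policy described above (the real
      decision lives in KVM's MMU code and also depends on host support):

          #include <stdint.h>

          static int ept_page_table_levels(uint8_t guest_maxphyaddr)
          {
                  /* 4-level EPT covers a 48-bit guest-physical address space;
                   * only go to 5 levels when the VM's MAXPHYADDR exceeds 48. */
                  return guest_maxphyaddr > 48 ? 5 : 4;
          }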
  20. 18 August 2017, 1 commit
  21. 07 August 2017, 2 commits
  22. 07 April 2017, 2 commits
  23. 27 January 2017, 1 commit
  24. 09 January 2017, 3 commits
  25. 08 December 2016, 1 commit
    • KVM: nVMX: support restore of VMX capability MSRs · 62cc6b9d
      David Matlack committed
      The VMX capability MSRs advertise the set of features the KVM virtual
      CPU can support. This set of features varies across different host CPUs
      and KVM versions. This patch aims to address both sources of
      differences, allowing VMs to be migrated across CPUs and KVM versions
      without guest-visible changes to these MSRs. Note that cross-KVM-
      version migration is only supported from this point forward.
      
      When the VMX capability MSRs are restored, they are audited to check
      that the set of features advertised is a subset of what KVM and the
      CPU support.
      
      Since the VMX capability MSRs are read-only, they do not need to be on
      the default MSR save/restore lists. The userspace hypervisor can set
      the values of these MSRs or read them from KVM at VCPU creation time,
      and restore the same value after every save/restore.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      62cc6b9d
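      A sketch of the restore-time audit described above, for a VMX control
      MSR whose high 32 bits report the allowed-1 settings and whose low
      32 bits report the allowed-0 (must-be-1) settings; illustrative only:

          #include <stdbool.h>
          #include <stdint.h>

          static bool vmx_control_msr_is_subset(uint64_t restored, uint64_t supported)
          {
                  uint32_t rest_lo = (uint32_t)restored,  rest_hi = restored >> 32;
                  uint32_t supp_lo = (uint32_t)supported, supp_hi = supported >> 32;

                  /* The restored value may not allow a control that KVM and the
                   * CPU cannot allow ... */
                  if (rest_hi & ~supp_hi)
                          return false;
                  /* ... and may not claim a control is optional when KVM and
                   * the CPU require it to be 1. */
                  if (supp_lo & ~rest_lo)
                          return false;
                  return true;
          }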
  26. 23 November 2016, 1 commit
  27. 03 November 2016, 1 commit
  28. 24 July 2016, 1 commit
    • Revert "KVM: x86: add pcommit support" · dfa169bb
      Dan Williams committed
      This reverts commit 8b3e34e4.
      
      Given the deprecation of the pcommit instruction, the relevant VMX
      features and CPUID bits are not going to be rolled into the SDM.  Remove
      their usage from KVM.
      
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      dfa169bb
  29. 10 November 2015, 1 commit
  30. 16 October 2015, 1 commit
  31. 01 October 2015, 1 commit
  32. 15 August 2015, 1 commit