1. 08 Mar, 2012 4 commits
    • KVM: SVM: Fix CPL updates · ea5e97e8
      Kevin Wolf committed
      Keep CPL at 0 in real mode and at 3 in VM86. In protected/long mode, use
      RPL rather than DPL of the code segment.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
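The CPL rule stated in this commit message can be sketched as a small C function. This is only an illustration: `struct guest_mode` and `guest_cpl()` are hypothetical names, not the kernel's data structures, and the mode flags are reduced to the two that matter for the rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the guest state relevant to CPL. */
struct guest_mode {
    bool protected_mode;   /* CR0.PE is set */
    bool vm86;             /* EFLAGS.VM is set */
    uint16_t cs_selector;  /* visible part of the CS register */
};

/* Derive the CPL following the rule in the commit message. */
static int guest_cpl(const struct guest_mode *g)
{
    if (!g->protected_mode)
        return 0;                /* real mode: CPL stays at 0 */
    if (g->vm86)
        return 3;                /* VM86: CPL is always 3 */
    return g->cs_selector & 3;   /* RPL is the low two bits of the selector */
}
```

For example, a protected-mode guest with a CS selector of 0x33 yields CPL 3, while a selector of 0x10 yields CPL 0.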
    • KVM: x86 emulator: Fix task switch privilege checks · 7f3d35fd
      Kevin Wolf committed
      Currently, all task switches check privileges against the DPL of the
      TSS. This is only correct for jmp/call to a TSS. If a task gate is used,
      the DPL of this task gate is used for the check instead. Exceptions,
      external interrupts and iret shouldn't perform any check.
      
      [avi: kill kvm-kmod remnants]
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
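A minimal sketch of the privilege rule described in this commit message, assuming a hypothetical `enum ts_reason` and helper `task_switch_allowed()` (neither is the emulator's real interface). It assumes the check compares both the CPL and the RPL of the selector against the relevant DPL.

```c
#include <stdbool.h>

/* Hypothetical classification of what triggered the task switch. */
enum ts_reason {
    TS_JMP,            /* jmp to a TSS */
    TS_CALL,           /* call to a TSS */
    TS_TASK_GATE,      /* switch through a task gate */
    TS_IRET,           /* iret with NT set */
    TS_EXCEPTION,      /* exception delivery */
    TS_EXT_INTERRUPT   /* external interrupt delivery */
};

/*
 * Return true if the switch passes the privilege check.
 * cpl/rpl: current privilege level and RPL of the selector used.
 * tss_dpl/gate_dpl: DPL of the target TSS and of the task gate.
 */
static bool task_switch_allowed(enum ts_reason reason, int cpl, int rpl,
                                int tss_dpl, int gate_dpl)
{
    int dpl;

    switch (reason) {
    case TS_JMP:
    case TS_CALL:
        dpl = tss_dpl;    /* direct jmp/call: check against the TSS */
        break;
    case TS_TASK_GATE:
        dpl = gate_dpl;   /* task gate: check against the gate instead */
        break;
    default:
        return true;      /* exceptions, interrupts, iret: no check */
    }
    return cpl <= dpl && rpl <= dpl;
}
```

With these assumptions, a CPL 3 `jmp` to a TSS with DPL 0 is rejected, while the same switch through a task gate with DPL 3 is allowed.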
    • KVM: Allow adjust_tsc_offset to be in host or guest cycles · f1e2b260
      Marcelo Tosatti committed
      Redefine the API to take a parameter indicating whether an
      adjustment is in host or guest cycles.
      Signed-off-by: Zachary Amsden <zamsden@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
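One way to picture an API that takes a host-or-guest flag is sketched below. `struct vcpu_tsc`, its fields, and both functions are hypothetical stand-ins, and the linear kHz-ratio conversion is an assumption for illustration, not the kernel's scaling code. The 128-bit intermediate relies on the `unsigned __int128` extension of GCC and Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VCPU TSC state. */
struct vcpu_tsc {
    uint64_t tsc_offset;     /* offset added to the host TSC */
    uint32_t host_tsc_khz;   /* measured host TSC rate */
    uint32_t guest_tsc_khz;  /* virtual TSC rate seen by the guest */
};

/* Convert a cycle count from host rate to guest rate (assumed linear). */
static uint64_t host_to_guest_cycles(const struct vcpu_tsc *v,
                                     uint64_t host_cycles)
{
    /* 128-bit intermediate avoids overflow of cycles * khz. */
    unsigned __int128 scaled =
        (unsigned __int128)host_cycles * v->guest_tsc_khz;
    return (uint64_t)(scaled / v->host_tsc_khz);
}

/* The 'host' flag says which unit the caller's adjustment is in. */
static void adjust_tsc_offset(struct vcpu_tsc *v, uint64_t adjustment,
                              bool host)
{
    if (host)
        adjustment = host_to_guest_cycles(v, adjustment);
    v->tsc_offset += adjustment;
}
```

A caller that measured elapsed time on the host passes `host = true`, while a caller that already computed the adjustment in guest cycles passes `false`.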
    • KVM: Infrastructure for software and hardware based TSC rate scaling · cc578287
      Zachary Amsden committed
      This requires some restructuring; rather than use 'virtual_tsc_khz'
      to indicate whether hardware rate scaling is in effect, we consider
      each VCPU to always have a virtual TSC rate.  Instead, there is new
      logic above the vendor-specific hardware scaling that decides whether
      it is even necessary to use and updates all rate variables used by
      common code.  This means we can simply query the virtual rate at
      any point, which is needed for software rate scaling.
      
      There is also now a threshold added to the TSC rate scaling; minor
      differences and variations of measured TSC rate can accidentally
      provoke rate scaling to be used when it is not needed.  Instead,
      we have a tolerance variable called tsc_tolerance_ppm, which is
      the maximum deviation from the user-requested rate that is tolerated
      before scaling is used.  The default is 250ppm, which is half the
      threshold for NTP adjustment, allowing for some hardware variation.
      
      In the event that hardware rate scaling is not available, we can
      kludge a bit by forcing TSC catchup to turn on when a faster than
      hardware speed has been requested, but there is nothing available
      yet for the reverse case; this requires a trap and emulate software
      implementation for RDTSC, which is still forthcoming.
      
      [avi: fix 64-bit division on i386]
      Signed-off-by: Zachary Amsden <zamsden@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
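The tolerance check described in this commit message could look roughly like the following. The helper names `adjust_khz_by_ppm()` and `needs_rate_scaling()` are illustrative assumptions. Only the idea of a ppm window around the host rate comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scale a rate in kHz by (1 + ppm / 1,000,000); ppm may be negative. */
static uint32_t adjust_khz_by_ppm(uint32_t khz, int32_t ppm)
{
    uint64_t v = (uint64_t)khz * (uint64_t)(1000000 + ppm);
    return (uint32_t)(v / 1000000);
}

/*
 * Decide whether the requested guest rate differs enough from the
 * host rate to warrant scaling. tolerance_ppm plays the role of
 * tsc_tolerance_ppm (default 250 according to the commit message).
 */
static bool needs_rate_scaling(uint32_t host_khz, uint32_t requested_khz,
                               int32_t tolerance_ppm)
{
    uint32_t lo = adjust_khz_by_ppm(host_khz, -tolerance_ppm);
    uint32_t hi = adjust_khz_by_ppm(host_khz, tolerance_ppm);

    return requested_khz < lo || requested_khz > hi;
}
```

On a 2,000,000 kHz host with a 250 ppm tolerance the window is 1,999,500 to 2,000,500 kHz, so a request for 2,000,400 kHz runs unscaled and a request for 2,001,000 kHz triggers scaling.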
  2. 05 Mar, 2012 2 commits
  3. 27 Dec, 2011 1 commit
  4. 30 Oct, 2011 1 commit
    • KVM: SVM: Keep intercepting task switching with NPT enabled · f1c1da2b
      Jan Kiszka committed
      AMD processors apparently have a bug in the hardware task switching
      support when NPT is enabled. If the task switch triggers an NPF, we can
      get wrong EXITINTINFO along with that fault. On resume, spurious
      exceptions may then be injected into the guest.
      
      We were able to reproduce this bug when our guest triggered #SS and the
      handler was supposed to run in a separate task with not-yet-touched
      stack pages.
      
      Work around the issue by continuing to emulate task switches even in
      NPT mode.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  5. 26 Sep, 2011 6 commits
    • KVM: x86: Move kvm_trace_exit into atomic vmexit section · 1e2b1dd7
      Jan Kiszka committed
      This prevents events that cause the vmexit from being recorded before
      the actual exit reason.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: SVM: Fix TSC MSR read in nested SVM · 45133eca
      Nadav Har'El committed
      When the TSC MSR is read by an L2 guest (when L1 allowed this MSR to be
      read without exit), we need to return L2's notion of the TSC, not L1's.
      
      The current code incorrectly returned L1 TSC, because svm_get_msr() was also
      used in x86.c where this was assumed, but now that these places call the new
      svm_read_l1_tsc(), the MSR read can be fixed.
      Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
      Tested-by: Joerg Roedel <joerg.roedel@amd.com>
      Acked-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: L1 TSC handling · d5c1785d
      Nadav Har'El committed
      KVM assumed in several places that reading the TSC MSR returns the value for
      L1. This is incorrect, because when L2 is running, the correct TSC read exit
      emulation is to return L2's value.
      
      We therefore add a new x86_ops function, read_l1_tsc, to use in places that
      specifically need to read the L1 TSC, NOT the TSC of the current level of
      guest.
      
      Note that one change, of one line in kvm_arch_vcpu_load, is made redundant
      by a different patch sent by Zachary Amsden (and not yet applied):
      kvm_arch_vcpu_load() should not read the guest TSC, and if it did not,
      we would not have had to change the call of kvm_get_msr() to read_l1_tsc().
      
      [avi: moved callback to kvm_x86_ops tsc block]
      Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
      Acked-by: Zachary Amsden <zamsden@gmail.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
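The distinction between the L1 TSC and the TSC of the currently running guest level, as described in this commit and in 45133eca, can be sketched as follows. `struct nested_tsc` and its fields are hypothetical. The sketch assumes the usual offset model, in which the guest TSC is the host TSC plus an offset and L2 adds a second offset on top of L1's. The functions below are stand-ins for the idea behind `read_l1_tsc`, not the kernel callbacks themselves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical TSC offset state for a VCPU that may run a nested guest. */
struct nested_tsc {
    uint64_t l1_offset;  /* offset the host applies for L1 */
    uint64_t l2_offset;  /* extra offset L1 requested for L2 */
    bool in_l2;          /* true while the L2 guest is running */
};

/* Always returns L1's view of the TSC, regardless of the running level. */
static uint64_t read_l1_tsc(const struct nested_tsc *t, uint64_t host_tsc)
{
    return host_tsc + t->l1_offset;
}

/* Returns the TSC as seen by whichever guest level is currently running. */
static uint64_t read_guest_tsc(const struct nested_tsc *t, uint64_t host_tsc)
{
    uint64_t tsc = read_l1_tsc(t, host_tsc);

    if (t->in_l2)
        tsc += t->l2_offset;  /* unsigned arithmetic wraps modulo 2^64 */
    return tsc;
}
```

Emulating a TSC MSR read on behalf of L2 would use the second function, while host-side bookkeeping that needs L1's notion of time would use the first.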
    • KVM: MMU: Do not unconditionally read PDPTE from guest memory · e4e517b4
      Avi Kivity committed
      Architecturally, PDPTEs are cached in the PDPTRs when CR3 is reloaded.
      On SVM, it is not possible to implement this, but on VMX this is possible
      and was indeed implemented until nested SVM changed this to unconditionally
      read PDPTEs dynamically.  This has a noticeable impact when running PAE guests.
      
      Fix by changing the MMU to read PDPTRs from the cache, falling back to
      reading from memory for the nested MMU.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Tested-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: Use __print_symbolic() for vmexit tracepoints · 0d460ffc
      Stefan Hajnoczi committed
      The vmexit tracepoints format the exit_reason to make it human-readable.
      Since the exit_reason depends on the instruction set (vmx or svm),
      formatting is handled with ftrace_print_symbols_seq() by referring to
      the appropriate exit reason table.
      
      However, the ftrace_print_symbols_seq() function is not meant to be used
      directly in tracepoints since it does not export the formatting table
      which userspace tools like trace-cmd and perf use to format traces.
      
      In practice perf dies when formatting vmexit-related events and
      trace-cmd falls back to printing the numeric value (with extra
      formatting code in the kvm plugin to paper over this limitation).  Other
      userspace consumers of vmexit-related tracepoints would be in similar
      trouble.
      
      To avoid significant changes to the kvm_exit tracepoint, this patch
      moves the vmx and svm exit reason tables into arch/x86/kvm/trace.h and
      selects the right table with __print_symbolic() depending on the
      instruction set.  Note that __print_symbolic() is designed for exporting
      the formatting table to userspace and allows trace-cmd and perf to work.
      Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
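The idea of selecting an exit-reason table by instruction set can be illustrated in plain userspace C. This is not the `__print_symbolic()` macro itself. The tables are short excerpts, and the `ISA_VMX`/`ISA_SVM` constants, `struct symbol`, and `exit_reason_name()` are names chosen for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ISA_VMX 1  /* assumed identifier for Intel VMX */
#define ISA_SVM 2  /* assumed identifier for AMD SVM */

struct symbol {
    uint32_t value;
    const char *name;
};

/* Excerpts only: the same number means different things per ISA. */
static const struct symbol vmx_exit_reasons[] = {
    { 1, "EXTERNAL_INTERRUPT" }, { 10, "CPUID" }, { 12, "HLT" },
};
static const struct symbol svm_exit_reasons[] = {
    { 0x060, "interrupt" }, { 0x072, "cpuid" },
    { 0x078, "hlt" }, { 0x400, "npf" },
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const char *lookup(const struct symbol *tbl, size_t n, uint32_t value)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].value == value)
            return tbl[i].name;
    return NULL;
}

/* Pick the table by ISA, then translate the numeric exit reason. */
static const char *exit_reason_name(uint32_t isa, uint32_t exit_reason)
{
    switch (isa) {
    case ISA_VMX:
        return lookup(vmx_exit_reasons, ARRAY_LEN(vmx_exit_reasons),
                      exit_reason);
    case ISA_SVM:
        return lookup(svm_exit_reasons, ARRAY_LEN(svm_exit_reasons),
                      exit_reason);
    default:
        return NULL;  /* unknown ISA: caller prints the raw number */
    }
}
```

Because the tables are plain data keyed by the ISA, a trace recorded on one machine can be decoded on another, which is the property the commit message is after.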
    • KVM: Record instruction set in all vmexit tracepoints · e097e5ff
      Stefan Hajnoczi committed
      The kvm_exit tracepoint recently added the isa argument to aid decoding
      exit_reason.  The semantics of exit_reason depend on the instruction set
      (vmx or svm) and the isa argument allows traces to be analyzed on other
      machines.
      
      Add the isa argument to kvm_nested_vmexit and kvm_nested_vmexit_inject
      so these tracepoints can also be self-describing.
      Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
  6. 12 Jul, 2011 1 commit
    • KVM: nVMX: Allow setting the VMXE bit in CR4 · 5e1746d6
      Nadav Har'El committed
      This patch allows the guest to enable the VMXE bit in CR4, which is a
      prerequisite to running VMXON.
      
      Whether to allow setting the VMXE bit now depends on the architecture (svm
      or vmx), so its checking has moved to kvm_x86_ops->set_cr4(). This function
      now returns an int: If kvm_x86_ops->set_cr4() returns 1, __kvm_set_cr4()
      will also return 1, and this will cause kvm_set_cr4() to throw a #GP.
      
      Turning on the VMXE bit is allowed only when the nested VMX feature is
      enabled, and turning it off is forbidden after a vmxon.
      Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
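A rough sketch of the vendor-specific CR4.VMXE check described in this commit message. `struct vcpu_state` and the `*_sketch` functions are hypothetical simplifications. Only the rule (VMXE allowed with nested VMX enabled, not clearable after vmxon, never allowed on SVM) and the return-1-means-#GP convention come from the message.

```c
#include <stdbool.h>

#define X86_CR4_VMXE (1UL << 13)  /* CR4 bit 13 enables VMX operation */

/* Hypothetical, reduced VCPU state. */
struct vcpu_state {
    unsigned long cr4;
    bool nested_vmx_enabled;  /* nested VMX feature switched on */
    bool vmxon;               /* guest has executed VMXON */
};

/* VMX-side hook: returns 1 to refuse the new CR4 value, 0 to accept it. */
static int vmx_set_cr4_sketch(struct vcpu_state *v, unsigned long cr4)
{
    if (cr4 & X86_CR4_VMXE) {
        if (!v->nested_vmx_enabled)
            return 1;   /* setting VMXE needs nested VMX */
    } else if (v->vmxon) {
        return 1;       /* clearing VMXE after vmxon is forbidden */
    }
    v->cr4 = cr4;
    return 0;
}

/* SVM-side hook: VMXE is never valid for the guest here. */
static int svm_set_cr4_sketch(struct vcpu_state *v, unsigned long cr4)
{
    if (cr4 & X86_CR4_VMXE)
        return 1;
    v->cr4 = cr4;
    return 0;
}
```

In this model the common code simply propagates a return value of 1 upward, and the outermost caller turns it into a #GP for the guest.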
  7. 22 May, 2011 2 commits
  8. 11 May, 2011 18 commits
  9. 18 Mar, 2011 3 commits
  10. 22 Feb, 2011 1 commit
  11. 10 Feb, 2011 1 commit
    • KVM: SVM: Make sure KERNEL_GS_BASE is valid when loading gs_index · 893a5ab6
      Joerg Roedel committed
      The gs_index loading code uses the swapgs instruction to
      switch to the user gs_base temporarily. This is unsafe in a
      lightweight exit-path in KVM on AMD because the
      KERNEL_GS_BASE MSR is switched lazily. An NMI happening in
      the critical path of load_gs_index may then use the wrong GS_BASE
      value, leading to unpredictable behavior, e.g. a
      triple-fault.
      
      This patch fixes the issue by making sure that load_gs_index
      is called only with a valid KERNEL_GS_BASE value loaded in
      KVM.
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>