1. 03 10月, 2013 1 次提交
  2. 27 6月, 2013 1 次提交
    • Y
      kvm: Add a tracepoint write_tsc_offset · 489223ed
      Yoshihiro YUNOMAE 提交于
      Add a tracepoint write_tsc_offset for tracing TSC offset change.
      We want to merge ftrace's trace data of guest OSs and the host OS using
      TSC for timestamp in chronological order. We need "TSC offset" values for
      each guest when merge those because the TSC value on a guest is always the
      host TSC plus guest's TSC offset. If we get the TSC offset values, we can
      calculate the host TSC value for each guest events from the TSC offset and
      the event TSC value. The host TSC values of the guest events are used when we
      want to merge trace data of guests and the host in chronological order.
      (Note: the trace_clock of both the host and the guest must be set x86-tsc in
      this case)
      
      This tracepoint also records vcpu_id which can be used to merge trace data for
      SMP guests. A merge tool will read TSC offset for each vcpu, then the tool
      converts guest TSC values to host TSC values for each vcpu.
      
      TSC offset is stored in the VMCS by vmx_write_tsc_offset() or
      vmx_adjust_tsc_offset(). KVM executes the former function when a guest boots.
      The latter function is executed when kvm clock is updated. Only host can read
      TSC offset value from VMCS, so a host needs to output TSC offset value
      when TSC offset is changed.
      
      Since the TSC offset is not often changed, it could be overwritten by other
      frequent events while tracing. To avoid that, I recommend to use a special
      instance for getting this event:
      
      1. set a instance before booting a guest
       # cd /sys/kernel/debug/tracing/instances
       # mkdir tsc_offset
       # cd tsc_offset
       # echo x86-tsc > trace_clock
       # echo 1 > events/kvm/kvm_write_tsc_offset/enable
      
      2. boot a guest
      Signed-off-by: NYoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      489223ed
  3. 03 5月, 2013 1 次提交
  4. 28 4月, 2013 2 次提交
  5. 17 4月, 2013 2 次提交
    • Y
      KVM: VMX: Add the deliver posted interrupt algorithm · a20ed54d
      Yang Zhang 提交于
      Only deliver the posted interrupt when target vcpu is running
      and there is no previous interrupt pending in pir.
      Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
      Reviewed-by: NGleb Natapov <gleb@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      a20ed54d
    • Y
      KVM: VMX: Enable acknowledge interupt on vmexit · a547c6db
      Yang Zhang 提交于
      The "acknowledge interrupt on exit" feature controls processor behavior
      for external interrupt acknowledgement. When this control is set, the
      processor acknowledges the interrupt controller to acquire the
      interrupt vector on VM exit.
      
      After enabling this feature, an interrupt which arrived when target cpu is
      running in vmx non-root mode will be handled by vmx handler instead of handler
      in idt. Currently, vmx handler only fakes an interrupt stack and jump to idt
      table to let real handler to handle it. Further, we will recognize the interrupt
      and only delivery the interrupt which not belong to current vcpu through idt table.
      The interrupt which belonged to current vcpu will be handled inside vmx handler.
      This will reduce the interrupt handle cost of KVM.
      
      Also, interrupt enable logic is changed if this feature is turnning on:
      Before this patch, hypervior call local_irq_enable() to enable it directly.
      Now IF bit is set on interrupt stack frame, and will be enabled on a return from
      interrupt handler if exterrupt interrupt exists. If no external interrupt, still
      call local_irq_enable() to enable it.
      
      Refer to Intel SDM volum 3, chapter 33.2.
      Signed-off-by: NYang Zhang <yang.z.zhang@Intel.com>
      Reviewed-by: NGleb Natapov <gleb@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      a547c6db
  6. 03 4月, 2013 1 次提交
  7. 21 3月, 2013 1 次提交
  8. 13 3月, 2013 1 次提交
    • J
      KVM: x86: Rework INIT and SIPI handling · 66450a21
      Jan Kiszka 提交于
      A VCPU sending INIT or SIPI to some other VCPU races for setting the
      remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED
      was overwritten by kvm_emulate_halt and, thus, got lost.
      
      This introduces APIC events for those two signals, keeping them in
      kvm_apic until kvm_apic_accept_events is run over the target vcpu
      context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable if there
      are pending events, thus if vcpu blocking should end.
      
      The patch comes with the side effect of effectively obsoleting
      KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but
      immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI.
      The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED
      state. That also means we no longer exit to user space after receiving a
      SIPI event.
      
      Furthermore, we already reset the VCPU on INIT, only fixing up the code
      segment later on when SIPI arrives. Moreover, we fix INIT handling for
      the BSP: it never enter wait-for-SIPI but directly starts over on INIT.
      Tested-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      66450a21
  9. 12 3月, 2013 1 次提交
  10. 29 1月, 2013 2 次提交
  11. 06 12月, 2012 1 次提交
  12. 01 12月, 2012 2 次提交
    • W
      KVM: x86: Emulate IA32_TSC_ADJUST MSR · ba904635
      Will Auld 提交于
      CPUID.7.0.EBX[1]=1 indicates IA32_TSC_ADJUST MSR 0x3b is supported
      
      Basic design is to emulate the MSR by allowing reads and writes to a guest
      vcpu specific location to store the value of the emulated MSR while adding
      the value to the vmcs tsc_offset. In this way the IA32_TSC_ADJUST value will
      be included in all reads to the TSC MSR whether through rdmsr or rdtsc. This
      is of course as long as the "use TSC counter offsetting" VM-execution control
      is enabled as well as the IA32_TSC_ADJUST control.
      
      However, because hardware will only return the TSC + IA32_TSC_ADJUST +
      vmsc tsc_offset for a guest process when it does and rdtsc (with the correct
      settings) the value of our virtualized IA32_TSC_ADJUST must be stored in one
      of these three locations. The argument against storing it in the actual MSR
      is performance. This is likely to be seldom used while the save/restore is
      required on every transition. IA32_TSC_ADJUST was created as a way to solve
      some issues with writing TSC itself so that is not an option either.
      
      The remaining option, defined above as our solution has the problem of
      returning incorrect vmcs tsc_offset values (unless we intercept and fix, not
      done here) as mentioned above. However, more problematic is that storing the
      data in vmcs tsc_offset will have a different semantic effect on the system
      than does using the actual MSR. This is illustrated in the following example:
      
      The hypervisor set the IA32_TSC_ADJUST, then the guest sets it and a guest
      process performs a rdtsc. In this case the guest process will get
      TSC + IA32_TSC_ADJUST_hyperviser + vmsc tsc_offset including
      IA32_TSC_ADJUST_guest. While the total system semantics changed the semantics
      as seen by the guest do not and hence this will not cause a problem.
      Signed-off-by: NWill Auld <will.auld@intel.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      ba904635
    • W
      KVM: x86: Add code to track call origin for msr assignment · 8fe8ab46
      Will Auld 提交于
      In order to track who initiated the call (host or guest) to modify an msr
      value I have changed function call parameters along the call path. The
      specific change is to add a struct pointer parameter that points to (index,
      data, caller) information rather than having this information passed as
      individual parameters.
      
      The initial use for this capability is for updating the IA32_TSC_ADJUST msr
      while setting the tsc value. It is anticipated that this capability is
      useful for other tasks.
      Signed-off-by: NWill Auld <will.auld@intel.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      8fe8ab46
  13. 28 11月, 2012 2 次提交
  14. 22 10月, 2012 1 次提交
  15. 23 9月, 2012 1 次提交
    • J
      KVM: x86: Fix guest debug across vcpu INIT reset · c8639010
      Jan Kiszka 提交于
      If we reset a vcpu on INIT, we so far overwrote dr7 as provided by
      KVM_SET_GUEST_DEBUG, and we also cleared switch_db_regs unconditionally.
      
      Fix this by saving the dr7 used for guest debugging and calculating the
      effective register value as well as switch_db_regs on any potential
      change. This will change to focus of the set_guest_debug vendor op to
      update_dp_bp_intercept.
      
      Found while trying to stop on start_secondary.
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      c8639010
  16. 17 9月, 2012 1 次提交
  17. 05 9月, 2012 1 次提交
  18. 06 8月, 2012 1 次提交
  19. 21 7月, 2012 1 次提交
  20. 12 7月, 2012 1 次提交
    • M
      KVM: VMX: Implement PCID/INVPCID for guests with EPT · ad756a16
      Mao, Junjie 提交于
      This patch handles PCID/INVPCID for guests.
      
      Process-context identifiers (PCIDs) are a facility by which a logical processor
      may cache information for multiple linear-address spaces so that the processor
      may retain cached information when software switches to a different linear
      address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual
      Volume 3A for details.
      
      For guests with EPT, the PCID feature is enabled and INVPCID behaves as running
      natively.
      For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD.
      Signed-off-by: NJunjie Mao <junjie.mao@intel.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      ad756a16
  21. 06 6月, 2012 1 次提交
  22. 17 4月, 2012 1 次提交
  23. 08 4月, 2012 1 次提交
  24. 08 3月, 2012 5 次提交
  25. 05 3月, 2012 2 次提交
  26. 02 3月, 2012 1 次提交
  27. 27 12月, 2011 1 次提交
  28. 30 10月, 2011 1 次提交
    • J
      KVM: SVM: Keep intercepting task switching with NPT enabled · f1c1da2b
      Jan Kiszka 提交于
      AMD processors apparently have a bug in the hardware task switching
      support when NPT is enabled. If the task switch triggers a NPF, we can
      get wrong EXITINTINFO along with that fault. On resume, spurious
      exceptions may then be injected into the guest.
      
      We were able to reproduce this bug when our guest triggered #SS and the
      handler were supposed to run over a separate task with not yet touched
      stack pages.
      
      Work around the issue by continuing to emulate task switches even in
      NPT mode.
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      f1c1da2b
  29. 26 9月, 2011 2 次提交