1. 03 Oct, 2013 4 commits
  2. 26 Aug, 2013 1 commit
  3. 07 Aug, 2013 1 commit
  4. 29 Jul, 2013 1 commit
  5. 20 Jul, 2013 1 commit
  6. 27 Jun, 2013 5 commits
  7. 26 Jun, 2013 1 commit
  8. 05 Jun, 2013 2 commits
    • KVM: MMU: reclaim the zapped-obsolete page first · 365c8868
      Committed by Xiao Guangrong
      As Marcelo pointed out:
      | "(retention of large number of pages while zapping)
      | can be fatal, it can lead to OOM and host crash"
      
      We introduce a list, kvm->arch.zapped_obsolete_pages, to link all
      the pages which are deleted from the mmu cache but not actually
      freed. When page reclaiming is needed, we always zap these pages
      first.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      365c8868
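The reclaim order described in this commit can be sketched in plain C. This is a minimal illustration, not the kernel code: `struct page_node`, `struct mmu_cache`, `list_pop` and `reclaim_one` are hypothetical names, and only the `zapped_obsolete_pages` list name comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified page descriptor; not the kernel's kvm_mmu_page. */
struct page_node {
    struct page_node *next;
};

/* Hypothetical container holding the two lists discussed above. */
struct mmu_cache {
    struct page_node *zapped_obsolete_pages; /* unlinked from the MMU, not yet freed */
    struct page_node *active_pages;          /* still in use by the MMU */
};

enum reclaim_source {
    RECLAIM_NONE,
    RECLAIM_ZAPPED_OBSOLETE,
    RECLAIM_ACTIVE
};

/* Detach and return the first node of a list, or NULL if it is empty. */
static struct page_node *list_pop(struct page_node **head)
{
    struct page_node *node = *head;

    if (node != NULL) {
        *head = node->next;
        node->next = NULL;
    }
    return node;
}

/*
 * Pick one page to reclaim. Pages that were already zapped but not freed
 * are always taken first; active pages are touched only when none remain.
 */
static enum reclaim_source reclaim_one(struct mmu_cache *cache,
                                       struct page_node **out)
{
    assert(cache != NULL && out != NULL);

    *out = list_pop(&cache->zapped_obsolete_pages);
    if (*out != NULL)
        return RECLAIM_ZAPPED_OBSOLETE;

    *out = list_pop(&cache->active_pages);
    return *out != NULL ? RECLAIM_ACTIVE : RECLAIM_NONE;
}
```

Draining the zapped-but-unfreed list first means memory pressure releases pages that no longer serve any guest before it evicts useful ones.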
    • KVM: MMU: fast invalidate all pages · 5304b8d3
      Committed by Xiao Guangrong
      The current kvm_mmu_zap_all is really slow: it holds mmu-lock while it
      walks and zaps all shadow pages one by one, and it also needs to zap every
      guest page's rmap and every shadow page's parent spte list. Things get
      worse as the guest uses more memory or vcpus, so it does not scale
      well.
      
      In this patch, we introduce a faster way to invalidate all shadow pages.
      KVM maintains a global mmu invalid generation-number which is stored in
      kvm->arch.mmu_valid_gen, and every shadow page stores the current global
      generation-number into sp->mmu_valid_gen when it is created.
      
      When KVM needs to zap all shadow pages' sptes, it simply increases the
      global generation-number and then reloads the root shadow pages on all vcpus.
      Each vcpu will create a new shadow page table according to kvm's current
      generation-number, which ensures the old pages are not used any more.
      The obsolete pages (sp->mmu_valid_gen != kvm->arch.mmu_valid_gen)
      are then zapped using the lock-break technique.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      5304b8d3
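The generation-number scheme can be illustrated with a small, single-threaded C sketch. The struct and function names below are hypothetical stand-ins (only the `mmu_valid_gen` field name comes from the commit message), and the "lock-break" step is modelled simply as zapping in bounded batches so that a caller could drop and retake the lock between calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct shadow_page {
    unsigned long mmu_valid_gen; /* generation stamped at creation */
    bool zapped;
};

struct mmu {
    unsigned long mmu_valid_gen; /* global generation number */
};

/* A new shadow page records the generation that is current at creation. */
static void sp_init(const struct mmu *mmu, struct shadow_page *sp)
{
    sp->mmu_valid_gen = mmu->mmu_valid_gen;
    sp->zapped = false;
}

/* A page is obsolete once its stamp no longer matches the global value. */
static bool sp_is_obsolete(const struct mmu *mmu, const struct shadow_page *sp)
{
    return sp->mmu_valid_gen != mmu->mmu_valid_gen;
}

/* Invalidating every existing page is a single increment. */
static void invalidate_all(struct mmu *mmu)
{
    mmu->mmu_valid_gen++;
}

/*
 * Zap at most `batch` obsolete pages and return how many were zapped.
 * A caller would release the lock between batches (the lock-break idea).
 */
static size_t zap_obsolete_batch(const struct mmu *mmu,
                                 struct shadow_page *pages, size_t n,
                                 size_t batch)
{
    size_t zapped = 0;

    assert(pages != NULL || n == 0);

    for (size_t i = 0; i < n && zapped < batch; i++) {
        if (!pages[i].zapped && sp_is_obsolete(mmu, &pages[i])) {
            pages[i].zapped = true;
            zapped++;
        }
    }
    return zapped;
}
```

The point of the design is that the expensive walk moves out of the invalidation itself: bumping the counter is constant-time, and the cleanup can proceed incrementally afterwards.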
  9. 03 May, 2013 1 commit
  10. 28 Apr, 2013 2 commits
  11. 27 Apr, 2013 1 commit
  12. 22 Apr, 2013 1 commit
  13. 17 Apr, 2013 2 commits
    • KVM: VMX: Add the deliver posted interrupt algorithm · a20ed54d
      Committed by Yang Zhang
      Only deliver the posted interrupt when target vcpu is running
      and there is no previous interrupt pending in pir.
      Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
      Reviewed-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      a20ed54d
    • KVM: VMX: Enable acknowledge interupt on vmexit · a547c6db
      Committed by Yang Zhang
      The "acknowledge interrupt on exit" feature controls processor behavior
      for external interrupt acknowledgement. When this control is set, the
      processor acknowledges the interrupt controller to acquire the
      interrupt vector on VM exit.
      
      After enabling this feature, an interrupt that arrives while the target cpu is
      running in vmx non-root mode will be handled by the vmx handler instead of the
      handler in the idt. Currently, the vmx handler only fakes an interrupt stack and
      jumps to the idt to let the real handler handle it. In the future, we will recognize
      the interrupt and only deliver interrupts that do not belong to the current vcpu
      through the idt. Interrupts that belong to the current vcpu will be handled inside
      the vmx handler. This will reduce the interrupt handling cost of KVM.
      
      Also, the interrupt enable logic changes if this feature is turned on:
      Before this patch, the hypervisor called local_irq_enable() to enable interrupts directly.
      Now the IF bit is set on the interrupt stack frame, and interrupts are enabled on return
      from the interrupt handler if an external interrupt exists. If there is no external
      interrupt, we still call local_irq_enable() to enable them.
      
      Refer to Intel SDM volume 3, chapter 33.2.
      Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
      Reviewed-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      a547c6db
  14. 14 Apr, 2013 1 commit
  15. 08 Apr, 2013 2 commits
  16. 02 Apr, 2013 1 commit
    • pmu: prepare for migration support · afd80d85
      Committed by Paolo Bonzini
      In order to migrate the PMU state correctly, we need to restore the
      values of MSR_CORE_PERF_GLOBAL_STATUS (a read-only register) and
      MSR_CORE_PERF_GLOBAL_OVF_CTRL (which has side effects when written).
      We also need to write the full 40-bit value of the performance counter,
      which would only be possible with a v3 architectural PMU's full-width
      counter MSRs.
      
      To distinguish host-initiated writes from the guest's, pass the
      full struct msr_data to kvm_pmu_set_msr.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      afd80d85
  17. 20 Mar, 2013 1 commit
  18. 14 Mar, 2013 2 commits
  19. 13 Mar, 2013 1 commit
    • KVM: x86: Rework INIT and SIPI handling · 66450a21
      Committed by Jan Kiszka
      A VCPU sending INIT or SIPI to some other VCPU races for setting the
      remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED
      was overwritten by kvm_emulate_halt and, thus, got lost.
      
      This introduces APIC events for those two signals, keeping them in
      kvm_apic until kvm_apic_accept_events is run over the target vcpu
      context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable whether there
      are pending events, and thus whether vcpu blocking should end.
      
      The patch comes with the side effect of effectively obsoleting
      KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but
      immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI.
      The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED
      state. That also means we no longer exit to user space after receiving a
      SIPI event.
      
      Furthermore, we already reset the VCPU on INIT, only fixing up the code
      segment later on when SIPI arrives. Moreover, we fix INIT handling for
      the BSP: it never enters wait-for-SIPI but directly starts over on INIT.
      Tested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      66450a21
  20. 12 Mar, 2013 1 commit
  21. 29 Jan, 2013 2 commits
  22. 22 Jan, 2013 1 commit
  23. 14 Jan, 2013 1 commit
  24. 14 Dec, 2012 3 commits
  25. 01 Dec, 2012 1 commit
    • KVM: x86: Emulate IA32_TSC_ADJUST MSR · ba904635
      Committed by Will Auld
      CPUID.7.0.EBX[1]=1 indicates that the IA32_TSC_ADJUST MSR (0x3b) is supported.
      
      Basic design is to emulate the MSR by allowing reads and writes to a guest
      vcpu specific location to store the value of the emulated MSR while adding
      the value to the vmcs tsc_offset. In this way the IA32_TSC_ADJUST value will
      be included in all reads to the TSC MSR whether through rdmsr or rdtsc. This
      is of course as long as the "use TSC counter offsetting" VM-execution control
      is enabled as well as the IA32_TSC_ADJUST control.
      
      However, because hardware will only return the TSC + IA32_TSC_ADJUST +
      vmcs tsc_offset for a guest process when it does a rdtsc (with the correct
      settings), the value of our virtualized IA32_TSC_ADJUST must be stored in one
      of these three locations. The argument against storing it in the actual MSR
      is performance. This is likely to be seldom used while the save/restore is
      required on every transition. IA32_TSC_ADJUST was created as a way to solve
      some issues with writing TSC itself so that is not an option either.
      
      The remaining option, defined above as our solution, has the problem of
      returning incorrect vmcs tsc_offset values (unless we intercept and fix, not
      done here) as mentioned above. However, more problematic is that storing the
      data in vmcs tsc_offset will have a different semantic effect on the system
      than does using the actual MSR. This is illustrated in the following example:
      
      The hypervisor sets the IA32_TSC_ADJUST, then the guest sets it and a guest
      process performs a rdtsc. In this case the guest process will get
      TSC + IA32_TSC_ADJUST_hypervisor + vmcs tsc_offset including
      IA32_TSC_ADJUST_guest. While the total system semantics changed, the semantics
      as seen by the guest do not, and hence this will not cause a problem.
      Signed-off-by: Will Auld <will.auld@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      ba904635