1. 19 Aug, 2014 2 commits
  2. 28 Jul, 2014 1 commit
  3. 24 Jul, 2014 6 commits
  4. 21 Jul, 2014 3 commits
  5. 11 Jul, 2014 6 commits
    • KVM: x86: use kvm_read_guest_page for emulator accesses · 44583cba
      Committed by Paolo Bonzini
      Emulator accesses are always done a page at a time, either by the emulator
      itself (for fetches) or because we need to query the MMU for address
      translations.  Speed up these accesses by using kvm_read_guest_page
      and, in the case of fetches, by inlining kvm_read_guest_virt_helper and
      dropping the loop around kvm_read_guest_page.
      
      This final tweak saves 30-100 more clock cycles (4-10%), bringing the
      count (as measured by kvm-unit-tests) down to 720-1100 clock cycles on
      a Sandy Bridge Xeon host, compared to 2300-3200 before the whole series
      and 925-1700 after the first two low-hanging fruit changes.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
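A minimal user-space sketch of why a loop is unnecessary when every access stays within one page. It only models the address arithmetic; it is not the kernel helper. All names (`sketch_bytes_in_page`, `sketch_page_reads`, `SKETCH_PAGE_SIZE`) and the 4 KiB page size are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Bytes readable from addr without crossing into the next page. */
static size_t sketch_bytes_in_page(uint64_t addr, size_t len)
{
    size_t room = SKETCH_PAGE_SIZE - (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
    return len < room ? len : room;
}

/* Number of single-page reads a generic, looping helper would issue. */
static unsigned sketch_page_reads(uint64_t addr, size_t len)
{
    unsigned reads = 0;

    while (len > 0) {
        size_t chunk = sketch_bytes_in_page(addr, len);

        addr += chunk;
        len -= chunk;
        reads++;
    }
    return reads;
}
```

If the caller already clamps each request with something like `sketch_bytes_in_page`, the count returned by `sketch_page_reads` is always 1, so a single page read can replace the loop.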
    • KVM: emulate: move init_decode_cache to emulate.c · 1498507a
      Committed by Bandan Das
      Core emulator functions all belong in emulate.c;
      x86.c should have no knowledge of emulator internals.
      Signed-off-by: Bandan Das <bsd@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: avoid useless set of KVM_REQ_EVENT after emulation · 6addfc42
      Committed by Paolo Bonzini
      Despite the provisions to emulate up to 130 consecutive instructions, in
      practice KVM will emulate just one before exiting handle_invalid_guest_state,
      because x86_emulate_instruction always sets KVM_REQ_EVENT.
      
      However, we only need to do this if an interrupt could be injected,
      which happens a) if an interrupt shadow bit (STI or MOV SS) has gone
      away; b) if the interrupt flag has just been set (other instructions
      than STI can set it without enabling an interrupt shadow).
      
      This cuts another 700-900 cycles from the cost of emulating an
      instruction (measured on a Sandy Bridge Xeon: 1650-2600 cycles
      before the patch on kvm-unit-tests, 925-1700 afterwards).
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
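The two conditions (a) and (b) above can be sketched as a pure predicate. This is an illustrative model, not the kernel code: the function name and the way state is passed in are assumptions. The only fixed fact used is that the x86 interrupt flag (IF) is bit 9 of EFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EFLAGS_IF (1u << 9)  /* x86 EFLAGS.IF, the interrupt flag */

/*
 * Decide whether an event re-check is needed after emulating one
 * instruction, given the interrupt shadow and flags before and after.
 */
static bool sketch_need_event_request(uint32_t old_shadow, uint32_t new_shadow,
                                      uint64_t old_rflags, uint64_t new_rflags)
{
    /* (a) some shadow bit that was set before is clear now */
    bool shadow_bit_cleared = (old_shadow & ~new_shadow) != 0;

    /* (b) IF went from 0 to 1 */
    bool if_just_set = !(old_rflags & SKETCH_EFLAGS_IF) &&
                       (new_rflags & SKETCH_EFLAGS_IF);

    return shadow_bit_cleared || if_just_set;
}
```

When neither condition holds, no interrupt can have become injectable as a result of the emulated instruction, so the request can be skipped.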
    • KVM: x86: return all bits from get_interrupt_shadow · 37ccdcbe
      Committed by Paolo Bonzini
      For the next patch we will need to know the full state of the
      interrupt shadow; we will then set KVM_REQ_EVENT when one bit
      is cleared.
      
      However, right now get_interrupt_shadow only returns the bit
      corresponding to the emulated instruction, or an unconditional
      0 if the emulated instruction does not have an interrupt shadow.
      This is confusing and does not allow us to check for cleared
      bits as mentioned above.
      
      Clean the callback up, and modify toggle_interruptibility to
      match the comment above the call.  As a small side effect, the
      call to set_interrupt_shadow will be skipped in the common
      case where int_shadow == 0 && mask == 0.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
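A small model of the skip described above, assuming the getter returns the full shadow state. Everything here is hypothetical: the struct, the `SKETCH_SHADOW_*` values, and the rule that a shadow already in effect for the same cause is not re-armed are assumptions for illustration, not a copy of the kernel's toggle_interruptibility.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; not taken from kernel headers. */
#define SKETCH_SHADOW_MOV_SS 0x1u
#define SKETCH_SHADOW_STI    0x2u

struct sketch_shadow_vcpu {
    uint32_t int_shadow;  /* full interrupt shadow state */
    unsigned set_calls;   /* how often the setter was invoked */
};

static void sketch_set_interrupt_shadow(struct sketch_shadow_vcpu *v,
                                        uint32_t mask)
{
    v->int_shadow = mask;
    v->set_calls++;
}

static void sketch_toggle_interruptibility(struct sketch_shadow_vcpu *v,
                                           uint32_t mask)
{
    uint32_t int_shadow = v->int_shadow;  /* all bits, not just one */

    /* Assumption: do not re-arm a shadow that is already in effect. */
    if (int_shadow & mask)
        mask = 0;

    /* Common case: nothing was set and nothing needs setting. */
    if (int_shadow == 0 && mask == 0)
        return;

    sketch_set_interrupt_shadow(v, mask);
}
```

In this model the setter runs only when the shadow state actually has to change, which is the point of returning all bits from the getter.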
    • KVM: svm: writes to MSR_K7_HWCR generates GPE in guest · 22d48b2d
      Committed by Matthias Lange
      Since commit 575203, the MCE subsystem in the Linux kernel sets bit 18
      in MSR_K7_HWCR on AMD. Running such a kernel as a guest in KVM on an
      AMD host results in a GPE (general protection exception, #GP) being
      injected into the guest, because kvm_set_msr_common returns 1. This
      patch fixes this by masking bit 18 from the MSR value desired by the guest.
      Signed-off-by: Matthias Lange <matthias.lange@kernkonzept.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
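A hedged sketch of the masking idea. The function name is hypothetical, and the rule that any other nonzero bit is rejected is a simplifying assumption; the real handler may tolerate further bits. A return value of 1 stands for "the caller injects #GP", mirroring the behaviour described above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HWCR_BIT18 (1ULL << 18)  /* the bit set by the guest MCE code */

/* Returns 0 if the write is accepted, 1 if the caller should inject #GP. */
static int sketch_set_hwcr(uint64_t data)
{
    /* Ignore bit 18 instead of treating it as unsupported. */
    data &= ~SKETCH_HWCR_BIT18;

    /* Assumption: anything still set is unsupported and is rejected. */
    return data != 0 ? 1 : 0;
}
```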
    • KVM: x86: Pending interrupt may be delivered after INIT · 5f7552d4
      Committed by Nadav Amit
      We encountered a scenario in which, after an INIT is delivered, a pending
      interrupt is delivered although it was sent before the INIT.  As the SDM
      states in section 10.4.7.1, the ISR and the IRR should be cleared after
      INIT, which KVM already does.  This also means that pending interrupts
      should be cleared.  This patch clears the pending interrupts on reset
      (and INIT) and, at the same time, clears the pending exceptions, since
      they may cause a similar issue.
      Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 10 Jul, 2014 1 commit
    • KVM: x86: fix TSC matching · 0d3da0d2
      Committed by Tomasz Grabiec
      I've observed kvmclock being marked as unstable on a modern
      single-socket system with a stable TSC and qemu-1.6.2 or qemu-2.0.0.
      
      The culprit was a failure in TSC matching caused by overflow of
      kvm_arch::nr_vcpus_matched_tsc when there were multiple TSC writes
      in a single synchronization cycle.

      It turns out that qemu does multiple TSC writes during init. Below is
      the evidence (qemu-2.0.0):
      
      The first one:
      
       0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
       0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
       0xffffffffa04cfd6b : kvm_arch_vcpu_postcreate+0x4b/0x80 [kvm]
       0xffffffffa04b8188 : kvm_vm_ioctl+0x418/0x750 [kvm]
      
      The second one:
      
       0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
       0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
       0xffffffffa090610d : vmx_set_msr+0x29d/0x350 [kvm_intel]
       0xffffffffa04be83b : do_set_msr+0x3b/0x60 [kvm]
       0xffffffffa04c10a8 : msr_io+0xc8/0x160 [kvm]
       0xffffffffa04caeb6 : kvm_arch_vcpu_ioctl+0xc86/0x1060 [kvm]
       0xffffffffa04b6797 : kvm_vcpu_ioctl+0xc7/0x5a0 [kvm]
      
       #0  kvm_vcpu_ioctl at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1780
       #1  kvm_put_msrs at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1270
       #2  kvm_arch_put_registers at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1909
       #3  kvm_cpu_synchronize_post_init at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1641
       #4  cpu_synchronize_post_init at /build/buildd/qemu-2.0.0+dfsg/include/sysemu/kvm.h:330
       #5  cpu_synchronize_all_post_init () at /build/buildd/qemu-2.0.0+dfsg/cpus.c:521
       #6  main at /build/buildd/qemu-2.0.0+dfsg/vl.c:4390
      
      The third one:
      
       0xffffffffa08ff2b4 : vmx_write_tsc_offset+0xa4/0xb0 [kvm_intel]
       0xffffffffa04c9c05 : kvm_write_tsc+0x1a5/0x360 [kvm]
       0xffffffffa090610d : vmx_set_msr+0x29d/0x350 [kvm_intel]
       0xffffffffa04be83b : do_set_msr+0x3b/0x60 [kvm]
       0xffffffffa04c10a8 : msr_io+0xc8/0x160 [kvm]
       0xffffffffa04caeb6 : kvm_arch_vcpu_ioctl+0xc86/0x1060 [kvm]
       0xffffffffa04b6797 : kvm_vcpu_ioctl+0xc7/0x5a0 [kvm]
      
       #0  kvm_vcpu_ioctl at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1780
       #1  kvm_put_msrs  at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1270
       #2  kvm_arch_put_registers  at /build/buildd/qemu-2.0.0+dfsg/target-i386/kvm.c:1909
       #3  kvm_cpu_synchronize_post_reset  at /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:1635
       #4  cpu_synchronize_post_reset  at /build/buildd/qemu-2.0.0+dfsg/include/sysemu/kvm.h:323
       #5  cpu_synchronize_all_post_reset () at /build/buildd/qemu-2.0.0+dfsg/cpus.c:512
       #6  main  at /build/buildd/qemu-2.0.0+dfsg/vl.c:4482
      
      The fix is to count each vCPU only once when matched, so that
      nr_vcpus_matched_tsc holds the size of the matched set. This is
      achieved by reusing generation counters. Every vCPU with
      this_tsc_generation == cur_tsc_generation is in the matched set. The
      matched set is cleared by setting cur_tsc_generation to a value that no
      other vCPU is set to (by incrementing it).

      I needed to bump up the counter size from u8 to u64 to ensure it never
      overflows. Otherwise, in cases where the TSC is not written the same
      number of times on each vCPU, the counter could overflow and incorrectly
      indicate some vCPUs as being in the matched set. This scenario seems
      unlikely but I'm not sure if it can be disregarded.
      Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
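The generation-counter scheme can be modelled in user space as below. This is a simplified sketch, not the kernel code: the struct and function names are hypothetical, the `matched` flag is passed in rather than computed, and the count here includes the vCPU that started the generation, which is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_tsc_vm {
    uint64_t cur_tsc_generation;   /* u64 so it cannot realistically wrap */
    unsigned nr_vcpus_matched_tsc; /* size of the matched set */
};

struct sketch_tsc_vcpu {
    uint64_t this_tsc_generation;
};

/* Record one TSC write; 'matched' says whether it matched the last one. */
static void sketch_tsc_write(struct sketch_tsc_vm *vm,
                             struct sketch_tsc_vcpu *vcpu, bool matched)
{
    if (!matched) {
        /* New generation: the matched set now holds only this vCPU. */
        vm->cur_tsc_generation++;
        vcpu->this_tsc_generation = vm->cur_tsc_generation;
        vm->nr_vcpus_matched_tsc = 1;
        return;
    }

    /* Count a vCPU only the first time it joins the current generation. */
    if (vcpu->this_tsc_generation != vm->cur_tsc_generation) {
        vcpu->this_tsc_generation = vm->cur_tsc_generation;
        vm->nr_vcpus_matched_tsc++;
    }
}
```

Repeated matched writes on the same vCPU, like the three qemu writes shown in the traces, leave the count unchanged in this model, so it cannot exceed the number of vCPUs.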
  7. 08 Jul, 2014 1 commit
    • KVM: x86: Check for nested events if there is an injectable interrupt · 9242b5b6
      Committed by Bandan Das
      With commit b6b8a145, which introduced
      vmx_check_nested_events, checks for injectable interrupts happen
      at different points in time for L1 and L2, which could potentially
      cause a race. The regression occurs because KVM_REQ_EVENT is always
      set when nested_run_pending is set, even if there's no pending interrupt.
      Consequently, there could be a small window when check_nested_events
      returns without exiting to L1, but an interrupt comes through soon
      after and incorrectly gets injected into L2 by inject_pending_event.
      Fix this by also calling the check for nested events when the check
      for an injectable interrupt returns true.
      Signed-off-by: Bandan Das <bsd@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. 30 Jun, 2014 1 commit
  9. 19 Jun, 2014 2 commits
  10. 18 Jun, 2014 1 commit
  11. 22 May, 2014 1 commit
  12. 14 May, 2014 1 commit
  13. 13 May, 2014 1 commit
  14. 08 May, 2014 1 commit
    • kvm/x86: implement hv EOI assist · b63cf42f
      Committed by Michael S. Tsirkin
      It seems that it's easy to implement the EOI assist
      on top of the PV EOI feature: simply convert the
      page address to the format expected by PV EOI.
      
      Notes:
      - "No EOI required" is set only if the injected interrupt
        is edge triggered; this is true because level interrupts go
        through the IOAPIC, which disables PV EOI.
        In any case, if the guest triggers EOI, the bit will get cleared on exit.
      - For migration, setting HV_X64_MSR_APIC_ASSIST_PAGE sets
        KVM_PV_EOI_EN internally, so restoring HV_X64_MSR_APIC_ASSIST_PAGE
        seems sufficient.
        In any case, the bit is cleared on exit, so in the worst case it's
        never re-enabled.
      - No handling of PV EOI data is performed at HV_X64_MSR_EOI write;
        HV_X64_MSR_EOI is a separate optimization: it's an x2APIC
        replacement that lets you do EOI with an MSR and not IO.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  15. 06 May, 2014 2 commits
  16. 24 Apr, 2014 4 commits
  17. 18 Apr, 2014 1 commit
    • KVM: support any-length wildcard ioeventfd · f848a5a8
      Committed by Michael S. Tsirkin
      It is sometimes beneficial to ignore IO size and match only on address.
      In hindsight this would have been a better default than matching length
      when KVM_IOEVENTFD_FLAG_DATAMATCH is not set. In particular, this kind
      of access can be optimized on VMX: there is no need to do page lookups.
      This can currently be done with many ioeventfds, but in a suboptimal way.

      However, we can't change the kernel/userspace ABI without risk of breaking
      some applications.
      Use len = 0 to mean "ignore length for matching" in a more optimal way.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
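The len = 0 convention can be sketched as a matching predicate. This is a user-space illustration under stated assumptions: the struct and function names are hypothetical, and data matching (KVM_IOEVENTFD_FLAG_DATAMATCH) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical registration record; only the fields needed here. */
struct sketch_ioeventfd {
    uint64_t addr;
    uint32_t len;  /* 0 means "ignore length for matching" */
};

/* Does a guest write of 'len' bytes at 'addr' hit this registration? */
static bool sketch_ioeventfd_matches(const struct sketch_ioeventfd *p,
                                     uint64_t addr, uint32_t len)
{
    if (addr != p->addr)
        return false;

    /* Wildcard: any access size at this address matches. */
    if (p->len == 0)
        return true;

    return len == p->len;
}
```

One wildcard registration then covers what would otherwise need a separate ioeventfd per access size at the same address.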
  18. 15 Apr, 2014 2 commits
  19. 27 Mar, 2014 1 commit
  20. 20 Mar, 2014 1 commit
    • x86, kvm: Fix CPU hotplug callback registration · 460dd42e
      Committed by Srivatsa S. Bhat
      Subsystems that want to register CPU hotplug callbacks, as well as perform
      initialization for the CPUs that are already online, often do it as shown
      below:
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      This is wrong, since it is prone to ABBA deadlocks involving the
      cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
      with CPU hotplug operations).
      
      Instead, the correct and race-free way of performing the callback
      registration is:
      
      	cpu_notifier_register_begin();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	/* Note the use of the double underscored version of the API */
      	__register_cpu_notifier(&foobar_cpu_notifier);
      
      	cpu_notifier_register_done();
      
      Fix the kvm code in x86 by using this latter form of callback registration.
      
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  21. 17 Mar, 2014 1 commit
    • KVM: x86: handle missing MPX in nested virtualization · 93c4adc7
      Committed by Paolo Bonzini
      When doing nested virtualization, we may be able to read BNDCFGS but
      still not be allowed to write to GUEST_BNDCFGS in the VMCS.  Guard
      writes to the field with vmx_mpx_supported(), and similarly hide the
      MSR from userspace if the processor does not support the field.
      
      We could work around this with the generic MSR save/load machinery,
      but there is only a limited number of MSR save/load slots and it is
      not really worthwhile to waste one for a scenario that should not
      happen except in the nested virtualization case.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>