1. 15 Dec, 2018 5 commits
    • x86/kvm/hyper-v: use stimer config definition from hyperv-tlfs.h · 6a058a1e
      Vitaly Kuznetsov committed
      As a preparation to implementing Direct Mode for Hyper-V synthetic
      timers switch to using stimer config definition from hyperv-tlfs.h.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID · 2bc39970
      Vitaly Kuznetsov committed
      With every new Hyper-V Enlightenment we implement we're forced to add a
      KVM_CAP_HYPERV_* capability. While this approach works, it is fairly
      inconvenient: most of the enlightenments we implement have corresponding
      CPUID feature bit(s), and userspace has to know these anyway to be able to
      expose the feature to the guest.
      
      Add KVM_GET_SUPPORTED_HV_CPUID ioctl (backed by KVM_CAP_HYPERV_CPUID, "one
      cap to rule them all!") returning all Hyper-V CPUID feature leaves.
      
      Using the existing KVM_GET_SUPPORTED_CPUID doesn't seem to be possible:
      Hyper-V CPUID feature leaves intersect with KVM's (e.g. 0x40000000,
      0x40000001) and we would probably confuse userspace in case we decide to
      return these twice.
      
      KVM_CAP_HYPERV_CPUID's number is interim: we intend to drop
      KVM_CAP_HYPERV_STIMER_DIRECT and use its number instead.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/hyper-v: Do some housekeeping in hyperv-tlfs.h · a4987def
      Vitaly Kuznetsov committed
      hyperv-tlfs.h is a bit messy: CPUID feature bits are not always sorted,
      it's hard to tell which CPUID they belong to, and some items are duplicated
      (e.g. HV_X64_MSR_CRASH_CTL_NOTIFY/HV_CRASH_CTL_CRASH_NOTIFY).
      
      Do some housekeeping work. While at it, replace all (1 << X) with the
      BIT(X) macro.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
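The `(1 << X)` to `BIT(X)` conversion described in this commit can be illustrated with a minimal userspace sketch. The `BIT` definition below is a local stand-in written for this example, and the `EXAMPLE_FEATURE_*` names are hypothetical; none of them are taken from hyperv-tlfs.h.

```c
#include <assert.h>

/* Local stand-in for the kernel's BIT() macro (assumption: userspace build). */
#define BIT(nr) (1UL << (nr))

/* Hypothetical feature bits, not taken from hyperv-tlfs.h. */
#define EXAMPLE_FEATURE_A_OLD (1 << 3) /* old spelling */
#define EXAMPLE_FEATURE_A     BIT(3)   /* same bit, new spelling */
#define EXAMPLE_FEATURE_B     BIT(5)

/* Returns 1 if every bit in 'mask' is set in 'features', 0 otherwise. */
int example_has_feature(unsigned long features, unsigned long mask)
{
    return (features & mask) == mask;
}

/* Small self-check: both spellings name the same bit. */
void example_bit_check(void)
{
    unsigned long features = EXAMPLE_FEATURE_A | EXAMPLE_FEATURE_B;

    assert(EXAMPLE_FEATURE_A_OLD == EXAMPLE_FEATURE_A);
    assert(example_has_feature(features, EXAMPLE_FEATURE_B));
    assert(!example_has_feature(features, BIT(4)));
}
```

One practical difference is that this `BIT` yields an `unsigned long`, so high bit numbers do not run into signed-integer shifts the way a plain `1 << X` can.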
    • x86: kvm: hyperv: don't retry message delivery for periodic timers · 7deec5e0
      Roman Kagan committed
      The SynIC message delivery protocol allows the message originator to
      request, should the message slot be busy, to be notified when it's free.
      
      However, this is unnecessary and even undesirable for messages generated
      by SynIC timers in periodic mode: if the period is short compared to the
      time the guest spends in the timer interrupt handler, the timer ticks
      start piling up, and the excessive interactions due to this notification
      and retried message delivery only make things worse.
      
      [This was observed, in particular, with Windows L2 guests setting
      (temporarily) the periodic timer to 2 kHz, and spending hundreds of
      microseconds in the timer interrupt handler due to several L2->L1 exits;
      under some load in L0 this could exceed 500 us so the timer ticks
      started to pile up and the guest livelocked.]
      
      Relieve the situation somewhat by not retrying message delivery for
      periodic SynIC timers.  This appears to remain within the "lazy" lost
      ticks policy for SynIC timers as implemented in KVM.
      
      Note that it doesn't solve the fundamental problem of livelocking the
      guest with a periodic timer whose period is smaller than the time needed
      to process a tick, but it makes it a bit less likely to be triggered.
      Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
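The policy described in this commit can be modelled roughly as follows. This is a simplified sketch, not the KVM code: the `example_*` types and functions are hypothetical, and the slot is reduced to two flags.

```c
#include <assert.h>

enum example_outcome {
    EXAMPLE_DELIVERED,   /* slot was free, message written */
    EXAMPLE_DROPPED,     /* slot busy, periodic tick discarded */
    EXAMPLE_RETRY_LATER  /* slot busy, guest asked to notify when free */
};

struct example_slot {
    int busy;        /* non-zero while the guest has not consumed the message */
    int msg_pending; /* set to request a "slot is free" notification */
};

/* Handles one timer expiration against a (possibly busy) message slot. */
enum example_outcome example_timer_expired(struct example_slot *slot, int periodic)
{
    if (!slot->busy) {
        slot->busy = 1;
        return EXAMPLE_DELIVERED;
    }
    if (periodic)
        return EXAMPLE_DROPPED; /* no notification request, no retry */
    slot->msg_pending = 1;      /* one-shot: retry once the slot is released */
    return EXAMPLE_RETRY_LATER;
}

/* Small self-check: a periodic tick hitting a busy slot leaves no trace. */
void example_periodic_check(void)
{
    struct example_slot slot = { 1, 0 };

    assert(example_timer_expired(&slot, 1) == EXAMPLE_DROPPED);
    assert(slot.msg_pending == 0);
}
```

Dropping the tick keeps a slow guest from being hit with extra notifications and redeliveries on top of the ticks it is already failing to keep up with.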
    • x86: kvm: hyperv: simplify SynIC message delivery · 3a0e7731
      Roman Kagan committed
      SynIC message delivery is somewhat overengineered: it pretends to follow
      the ordering rules when grabbing the message slot, using atomic
      operations and all that, but does it incorrectly and unnecessarily.
      
      The correct order would be to first set .msg_pending, then atomically
      replace .message_type if it was zero, and then clear .msg_pending if
      the previous step was successful.  But all of this is done in vcpu context,
      so the whole update looks atomic to the guest (it's assumed to only
      access the message page from this cpu), and therefore can be done in
      whatever order is most convenient (which is also the reason why the
      incorrect order hasn't triggered any bugs so far).
      
      While at it, also switch to kvm_vcpu_{read,write}_guest_page, and drop
      the no longer needed synic_clear_sint_msg_pending.
      Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
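A minimal sketch of the simplified delivery follows. It assumes, as the commit message does, that the update runs in vcpu context, so plain reads and writes are enough. The struct layout and the `example_*` names are hypothetical and much smaller than the real SynIC message page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MSG_NONE 0u

struct example_msg_slot {
    uint32_t message_type; /* EXAMPLE_MSG_NONE means the slot is free */
    uint8_t  msg_pending;  /* asks the guest to signal when the slot frees up */
    uint8_t  payload[16];
};

/*
 * Delivers one message. No atomics: the guest vcpu is not running while
 * this executes, so the update appears atomic to it regardless of order.
 */
int example_deliver(struct example_msg_slot *slot, uint32_t type,
                    const uint8_t *payload, size_t len)
{
    if (type == EXAMPLE_MSG_NONE || len > sizeof(slot->payload))
        return -EINVAL;
    if (slot->message_type != EXAMPLE_MSG_NONE) {
        slot->msg_pending = 1; /* slot busy: request a notification */
        return -EAGAIN;
    }
    memset(slot->payload, 0, sizeof(slot->payload));
    memcpy(slot->payload, payload, len);
    slot->message_type = type;
    return 0;
}

/* Small self-check: a second delivery into a busy slot is refused. */
void example_deliver_check(void)
{
    struct example_msg_slot slot = { 0 };
    const uint8_t data[2] = { 0xAA, 0xBB };

    assert(example_deliver(&slot, 1, data, sizeof(data)) == 0);
    assert(example_deliver(&slot, 2, data, sizeof(data)) == -EAGAIN);
    assert(slot.message_type == 1 && slot.msg_pending == 1);
}
```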
  2. 17 Oct, 2018 11 commits
  3. 06 Aug, 2018 1 commit
    • KVM: x86: ensure all MSRs can always be KVM_GET/SET_MSR'd · 44883f01
      Paolo Bonzini committed
      Some of the MSRs returned by GET_MSR_INDEX_LIST currently cannot be sent back
      to KVM_GET_MSR and/or KVM_SET_MSR; either they can never be sent back, or
      they are only accepted under special conditions.  This makes the API a pain to
      use.
      
      To avoid this pain, this patch makes it so that the result of the get-list
      ioctl can always be used for host-initiated get and set.  Since we don't have
      a separate way to check for read-only MSRs, this means some Hyper-V MSRs are
      ignored when written.  Arguably they should not even be in the result of
      GET_MSR_INDEX_LIST, but I am leaving them there in case userspace is using
      the outcome of GET_MSR_INDEX_LIST to derive the support for the corresponding
      Hyper-V feature.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
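The guarantee this commit describes (every listed MSR survives a host-initiated get followed by a set) can be sketched with a toy model. The `example_*` names and MSR indices are hypothetical. Rejecting guest writes to read-only MSRs is an assumption of this sketch, since the commit message only covers the host-initiated case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct example_msr {
    uint32_t index;     /* hypothetical MSR number */
    uint64_t value;
    int      read_only; /* no separate way to report this to userspace */
};

int example_get_msr(const struct example_msr *msr, uint64_t *data)
{
    *data = msr->value;
    return 0;
}

/* Host-initiated writes to read-only MSRs are accepted and ignored. */
int example_set_msr(struct example_msr *msr, uint64_t data, int host_initiated)
{
    if (msr->read_only)
        return host_initiated ? 0 : -1; /* guest case: assumption */
    msr->value = data;
    return 0;
}

/* Counts entries for which a host-initiated get followed by set fails. */
size_t example_roundtrip_failures(struct example_msr *list, size_t n)
{
    size_t i, failures = 0;
    uint64_t data;

    for (i = 0; i < n; i++) {
        if (example_get_msr(&list[i], &data) != 0 ||
            example_set_msr(&list[i], data, 1) != 0)
            failures++;
    }
    return failures;
}

/* Small self-check: a list containing a read-only MSR round-trips cleanly. */
void example_msr_check(void)
{
    struct example_msr list[2] = { { 0x10, 5, 0 }, { 0x20, 7, 1 } };

    assert(example_roundtrip_failures(list, 2) == 0);
}
```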
  4. 26 May, 2018 5 commits
  5. 11 May, 2018 2 commits
  6. 29 Mar, 2018 1 commit
  7. 24 Mar, 2018 1 commit
  8. 17 Mar, 2018 3 commits
  9. 07 Mar, 2018 2 commits
  10. 08 Aug, 2017 2 commits
  11. 07 Aug, 2017 1 commit
  12. 20 Jul, 2017 1 commit
    • kvm: x86: hyperv: avoid livelock in oneshot SynIC timers · f1ff89ec
      Roman Kagan committed
      If the SynIC timer message delivery fails due to the SINT message slot
      being busy, there's no point in attempting to start the timer again until
      we're notified of the slot being released by the guest (via EOM or EOI).
      
      Even worse, when a oneshot timer fails to deliver its message, its
      re-arming with an expiration time in the past leads to an immediate retry
      of the delivery, and so on, without ever letting the guest vcpu run
      and release the slot, which results in a livelock.
      
      To avoid that, only start the timer when there's no timer message
      pending delivery.  When there is, meaning the slot is busy, the
      processing will be restarted upon notification from the guest that the
      slot is released.
      Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
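The rule "only start the timer when there's no timer message pending delivery" can be sketched as a small state machine. This is an illustrative model with hypothetical `example_*` names, not the kernel's stimer code.

```c
#include <assert.h>

struct example_stimer {
    int msg_pending; /* an expiration message is waiting for a free slot */
    int armed;       /* the host-side timer is running */
};

/* Starts the timer only if no expiration message still awaits delivery. */
int example_stimer_try_start(struct example_stimer *t)
{
    if (t->msg_pending)
        return 0; /* slot busy: wait for the guest instead of re-arming */
    t->armed = 1;
    return 1;
}

/* The timer fired while the slot was busy: remember the undelivered message. */
void example_stimer_delivery_failed(struct example_stimer *t)
{
    t->armed = 0;
    t->msg_pending = 1;
}

/* The guest signalled (EOM or EOI) that the slot is free: deliver, restart. */
int example_stimer_slot_released(struct example_stimer *t)
{
    t->msg_pending = 0; /* the pending message is delivered at this point */
    return example_stimer_try_start(t);
}

/* Small self-check: no re-arming while a message is pending. */
void example_stimer_check(void)
{
    struct example_stimer t = { 0, 0 };

    assert(example_stimer_try_start(&t) == 1);
    example_stimer_delivery_failed(&t);
    assert(example_stimer_try_start(&t) == 0);
    assert(example_stimer_slot_released(&t) == 1);
}
```

Because the restart is driven by the guest's notification, the guest vcpu gets a chance to run and release the slot between delivery attempts.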
  13. 14 Jul, 2017 1 commit
    • kvm: x86: hyperv: make VP_INDEX managed by userspace · d3457c87
      Roman Kagan committed
      Hyper-V identifies vCPUs by Virtual Processor Index, which can be
      queried via HV_X64_MSR_VP_INDEX msr.  It is defined by the spec as a
      sequential number which can't exceed the maximum number of vCPUs per VM.
      APIC ids can be sparse and thus aren't a valid replacement for VP
      indices.
      
      Current KVM uses its internal vcpu index as VP_INDEX.  However, to make
      it predictable and persistent across VM migrations, the userspace has to
      control the value of VP_INDEX.
      
      This patch achieves that by storing vp_index explicitly on the vcpu, and
      allowing HV_X64_MSR_VP_INDEX to be set from the host side.  For
      compatibility it's initialized to the KVM vcpu index.  Also, a few variables
      are renamed to make a clear distinction between this Hyper-V vp_index and
      KVM vcpu_id (== APIC id).  Besides, a new capability,
      KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip
      attempting msr writes where unsupported, to avoid spamming error logs.
      Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
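The relationship between the three identifiers can be sketched as below. The `example_*` names are hypothetical. Rejecting guest-side writes is an assumption of this sketch, inferred from the commit only mentioning writes "from the host side".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct example_vcpu {
    uint32_t vcpu_id;  /* APIC id, may be sparse */
    uint32_t vcpu_idx; /* KVM-internal index, assigned in creation order */
    uint32_t vp_index; /* Hyper-V VP index, now stored explicitly */
};

/* For compatibility, vp_index starts out equal to the KVM vcpu index. */
void example_vcpu_init(struct example_vcpu *v, uint32_t vcpu_id, uint32_t vcpu_idx)
{
    v->vcpu_id = vcpu_id;
    v->vcpu_idx = vcpu_idx;
    v->vp_index = vcpu_idx;
}

/* Only the host (userspace) may change vp_index in this model. */
int example_set_vp_index(struct example_vcpu *v, uint32_t value, int host_initiated)
{
    if (!host_initiated)
        return -1; /* assumption: guest writes are refused */
    v->vp_index = value;
    return 0;
}

/* Looks a vcpu up by its Hyper-V VP index; returns NULL if absent. */
struct example_vcpu *example_find_by_vp_index(struct example_vcpu *vcpus,
                                              size_t n, uint32_t vp_index)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (vcpus[i].vp_index == vp_index)
            return &vcpus[i];
    return NULL;
}

/* Small self-check: sparse APIC ids, dense default VP indices. */
void example_vp_index_check(void)
{
    struct example_vcpu vcpus[2];

    example_vcpu_init(&vcpus[0], 0, 0);
    example_vcpu_init(&vcpus[1], 8, 1); /* APIC id 8, index 1 */
    assert(example_find_by_vp_index(vcpus, 2, 1) == &vcpus[1]);
    assert(example_find_by_vp_index(vcpus, 2, 8) == NULL);
}
```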
  14. 13 Jul, 2017 1 commit
    • kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2 · efc479e6
      Roman Kagan committed
      There is a flaw in the Hyper-V SynIC implementation in KVM: when the
      message page or event flags page is enabled by setting the corresponding
      msr, KVM zeroes it out.  This is problematic because on migration the
      corresponding MSRs are loaded on the destination, so the content of
      those pages is lost.
      
      This went unnoticed so far because the only user of those pages was
      in-KVM hyperv synic timers, which could continue working despite that
      zeroing.
      
      Newer QEMU uses those pages for Hyper-V VMBus implementation, and
      zeroing them breaks the migration.
      
      Besides, in newer QEMU the content of those pages is fully managed by
      QEMU, so zeroing them is undesirable even when writing the MSRs from the
      guest side.
      
      To support this new scheme, introduce a new capability,
      KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic
      pages aren't zeroed out in KVM.
      Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
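The behavioural difference introduced by the new capability can be sketched as a single flag that suppresses zeroing. The `example_*` names are hypothetical, and the page is modelled as a plain byte buffer rather than guest memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct example_synic {
    int dont_zero_pages; /* set when the newer capability is enabled */
};

/*
 * Models enabling a SynIC page via its MSR. The legacy behaviour clears
 * the page; with the flag set, the existing content (for example, content
 * restored on a migration destination) is left untouched.
 */
void example_enable_page(const struct example_synic *synic,
                         unsigned char *page, size_t len)
{
    if (!synic->dont_zero_pages)
        memset(page, 0, len);
}

/* Small self-check: content survives only when zeroing is disabled. */
void example_synic_check(void)
{
    struct example_synic legacy = { 0 }, newer = { 1 };
    unsigned char page[8];

    memset(page, 0xAB, sizeof(page));
    example_enable_page(&newer, page, sizeof(page));
    assert(page[0] == 0xAB);
    example_enable_page(&legacy, page, sizeof(page));
    assert(page[0] == 0);
}
```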
  15. 02 Mar, 2017 1 commit
  16. 01 Feb, 2017 1 commit
    • sched/cputime: Convert task/group cputime to nsecs · 5613fda9
      Frederic Weisbecker committed
      Now that most cputime readers use the transition API, which returns the
      task cputime in old-style cputime_t, we can safely store the cputime in
      nsecs. This will eventually make cputime statistics less opaque and more
      granular. Back-and-forth conversions between cputime_t and nsecs in order
      to deal with cputime_t's random granularity won't be needed anymore.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-8-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  17. 09 Jan, 2017 1 commit