1. 14 Jul, 2022 1 commit
    • kvm: stats: tell userspace which values are boolean · 1b870fa5
      Committed by Paolo Bonzini
      Some of the statistics values exported by KVM are always only 0 or 1.
      It can be useful to export this fact to userspace so that it can track
      them specially (for example by polling the value every now and then to
      compute a % of time spent in a specific state).
      
      Therefore, add "boolean value" as a new "unit".  While it is not exactly
      a unit, it walks and quacks like one.  In particular, using the type
      would be wrong because boolean values could be instantaneous or peak
      values (e.g. "is the rmap allocated?") or even two-bucket histograms
      (e.g. "number of posted vs. non-posted interrupt injections").
      Suggested-by: Amneesh Singh <natto@weirdnatto.in>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
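A minimal sketch of the polling use case mentioned in the commit message: sample a 0/1 statistic periodically and report the share of samples in which it was set. The `read_stat` callable and the function name are hypothetical stand-ins for however a VMM or telemetry agent actually reads the value; nothing here touches the real KVM stats file descriptor.

```python
import time
from typing import Callable


def percent_time_set(read_stat: Callable[[], int], samples: int,
                     interval_s: float = 0.0) -> float:
    """Poll a boolean (0 or 1) statistic; return the percentage of samples equal to 1."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    hits = 0
    for _ in range(samples):
        value = read_stat()
        if value not in (0, 1):
            # A statistic tagged as boolean should never report anything else.
            raise ValueError(f"non-boolean value: {value}")
        hits += value
        if interval_s:
            time.sleep(interval_s)
    return 100.0 * hits / samples


# Usage with a canned sequence standing in for successive reads.
readings = iter([1, 0, 1, 1])
print(percent_time_set(lambda: next(readings), samples=4))
```

The result is only an estimate: a state that flips between two polls is missed, so the polling interval bounds the accuracy.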
  2. 04 May, 2022 2 commits
    • KVM: arm64: Implement PSCI SYSTEM_SUSPEND · bfbab445
      Committed by Oliver Upton
      ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows
      software to request that a system be placed in the deepest possible
      low-power state. Effectively, software can use this to suspend itself to
      RAM.
      
      Unfortunately, there really is no good way to implement a system-wide
      PSCI call in KVM. Any precondition checks done in the kernel will need
      to be repeated by userspace since there is no good way to protect a
      critical section that spans an exit to userspace. SYSTEM_RESET and
      SYSTEM_OFF are equally plagued by this issue, although no users have
      seemingly cared for the relatively long time these calls have been
      supported.
      
      The solution is to just make the whole implementation userspace's
      problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that
      indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND.
      Additionally, add a CAP to get buy-in from userspace for this new exit
      type.
      
      Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in.
      If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide
      explicit documentation of userspace's responsibilities for the exit and
      point to the PSCI specification to describe the actual PSCI call.
      Reviewed-by: Reiji Watanabe <reijiw@google.com>
      Signed-off-by: Oliver Upton <oupton@google.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220504032446.4133305-8-oupton@google.com
    • KVM: arm64: Add support for userspace to suspend a vCPU · 7b33a09d
      Committed by Oliver Upton
      Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU
      is in a suspended state. In the suspended state the vCPU will block
      until a wakeup event (pending interrupt) is recognized.
      
      Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to
      userspace that KVM has recognized one such wakeup event. It is the
      responsibility of userspace to then make the vCPU runnable, or leave it
      suspended until the next wakeup event.
      Signed-off-by: Oliver Upton <oupton@google.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220504032446.4133305-7-oupton@google.com
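The division of labour described in this commit can be pictured with a toy model. This is a sketch under stated assumptions, not KVM's implementation: `ToyVcpu`, `MpState` and `Exit` are hypothetical names, and the enum members merely stand in for `KVM_MP_STATE_SUSPENDED` and `KVM_SYSTEM_EVENT_WAKEUP` without reproducing their numeric values.

```python
from enum import Enum, auto


class MpState(Enum):
    RUNNABLE = auto()
    SUSPENDED = auto()            # stands in for KVM_MP_STATE_SUSPENDED


class Exit(Enum):
    NONE = auto()                 # nothing to report to userspace
    SYSTEM_EVENT_WAKEUP = auto()  # stands in for KVM_SYSTEM_EVENT_WAKEUP


class ToyVcpu:
    def __init__(self) -> None:
        self.mp_state = MpState.RUNNABLE
        self.pending_irq = False

    def run(self) -> Exit:
        """Model one run call from userspace."""
        if self.mp_state is MpState.SUSPENDED and self.pending_irq:
            # A wakeup event was recognized. Report it, but leave mp_state
            # alone: making the vCPU runnable is userspace's decision.
            return Exit.SYSTEM_EVENT_WAKEUP
        return Exit.NONE


vcpu = ToyVcpu()
vcpu.mp_state = MpState.SUSPENDED
vcpu.pending_irq = True
if vcpu.run() is Exit.SYSTEM_EVENT_WAKEUP:
    vcpu.mp_state = MpState.RUNNABLE  # userspace chooses to resume the vCPU
print(vcpu.mp_state.name)
```

Userspace could equally leave `mp_state` untouched, in which case the modelled vCPU stays suspended until the next wakeup event.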
  3. 30 Apr, 2022 1 commit
    • KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT · d495f942
      Committed by Paolo Bonzini
      When KVM_EXIT_SYSTEM_EVENT was introduced, it included a flags
      member that at the time was unused.  Unfortunately this extensibility
      mechanism has several issues:
      
      - x86 is not writing the member, so it would not be possible to use it
        on x86 except for new events
      
      - the member is not aligned to 64 bits, so the definition of the
        uAPI struct is incorrect for 32- on 64-bit userspace.  This is a
        problem for RISC-V, which supports CONFIG_KVM_COMPAT, but fortunately
        usage of flags was only introduced in 5.18.
      
      Since padding has to be introduced, place a new field in there
      that tells if the flags field is valid.  To allow further extensibility,
      in fact, change flags to an array of 16 values, and store how many
      of the values are valid.  The availability of the new ndata field
      is tied to a system capability; all architectures are changed to
      fill in the field.
      
      To avoid breaking compilation of userspace that was using the flags
      field, provide a userspace-only union to overlap flags with data[0].
      The new field is placed at the same offset for both 32- and 64-bit
      userspace.
      
      Cc: Will Deacon <will@kernel.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Peter Gonda <pgonda@google.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Message-Id: <20220422103013.34832-1-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
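The layout described in this commit message can be checked with a small `ctypes` sketch. It assumes the fields are a 32-bit `type`, a 32-bit `ndata` count occupying the former padding, and a union in which `flags` overlaps `data[0]`; the class names are hypothetical, and the authoritative definition is the one in `linux/kvm.h`.

```python
import ctypes


class _FlagsOrData(ctypes.Union):
    _fields_ = [
        ("flags", ctypes.c_uint64),      # legacy name kept for existing userspace
        ("data", ctypes.c_uint64 * 16),  # ndata says how many entries are valid
    ]


class SystemEvent(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("ndata", ctypes.c_uint32),
        ("u", _FlagsOrData),
    ]


# Two 32-bit members fill the first 8 bytes, so the union starts at offset 8
# whether 64-bit values are aligned to 4 or to 8 bytes.
print(SystemEvent.type.offset, SystemEvent.ndata.offset, SystemEvent.u.offset)

ev = SystemEvent(type=1, ndata=1)  # arbitrary illustrative values
ev.data[0] = 0xABCD
print(hex(ev.flags))               # same storage as data[0]
```

Because `ndata` fills what used to be padding, no member's offset depends on the compiler's 64-bit alignment rule, which is the property the commit relies on for 32-bit userspace on a 64-bit kernel.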
  4. 14 Apr, 2022 1 commit
  5. 02 Apr, 2022 8 commits
  6. 21 Mar, 2022 1 commit
    • KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 · 6d849191
      Committed by Oliver Upton
      KVM_CAP_DISABLE_QUIRKS is irrevocably broken. The capability does not
      advertise the set of quirks which may be disabled to userspace, so it is
      impossible to predict the behavior of KVM. Worse yet,
      KVM_CAP_DISABLE_QUIRKS will tolerate any value for cap->args[0], meaning
      it fails to reject attempts to set invalid quirk bits.
      
      The only valid workaround for the quirky quirks API is to add a new CAP.
      Actually advertise the set of quirks that can be disabled to userspace
      so it can predict KVM's behavior. Reject values for cap->args[0] that
      contain invalid bits.
      
      Finally, add documentation for the new capability and describe the
      existing quirks.
      Signed-off-by: Oliver Upton <oupton@google.com>
      Message-Id: <20220301060351.442881-5-oupton@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
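The two behaviours the new capability adds, advertising the supported set and rejecting unknown bits, amount to a simple mask check. The sketch below is illustrative only: `ToyKvm`, its method names and the example mask are hypothetical and do not reproduce KVM's code or its actual quirk bits.

```python
import errno


class ToyKvm:
    def __init__(self, supported_quirks: int) -> None:
        self.supported_quirks = supported_quirks
        self.disabled_quirks = 0

    def check_quirks2(self) -> int:
        """Advertise which quirk bits userspace is allowed to disable."""
        return self.supported_quirks

    def enable_quirks2(self, requested: int) -> int:
        """Return 0 on success, or -EINVAL if any requested bit is unsupported."""
        if requested & ~self.supported_quirks:
            return -errno.EINVAL
        self.disabled_quirks = requested
        return 0


kvm = ToyKvm(supported_quirks=0b0111)  # hypothetical mask
print(kvm.enable_quirks2(0b0101))      # subset of the advertised bits
print(kvm.enable_quirks2(0b1000))      # bit outside the advertised set
```

The old behaviour corresponds to dropping the mask check, which is why userspace could not tell whether a requested bit had any effect.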
  7. 25 Feb, 2022 1 commit
  8. 22 Feb, 2022 1 commit
  9. 14 Feb, 2022 4 commits
  10. 31 Jan, 2022 1 commit
  11. 28 Jan, 2022 1 commit
    • KVM: x86: add system attribute to retrieve full set of supported xsave states · dd6e6312
      Committed by Paolo Bonzini
      Because KVM_GET_SUPPORTED_CPUID is meant to be passed (by simple-minded
      VMMs) to KVM_SET_CPUID2, it cannot include any dynamic xsave states that
      have not been enabled.  Probing those, for example so that they can be
      passed to ARCH_REQ_XCOMP_GUEST_PERM, requires a new ioctl or arch_prctl.
      The latter is in fact worse, even though that is what the rest of the
      API uses, because it would require supported_xcr0 to be moved from the
      KVM module to the kernel just for this use.  In addition, the value
      would be nonsensical (or an error would have to be returned) until
      the KVM module is loaded in.
      
      Therefore, to limit the growth of system ioctls, add a /dev/kvm
      variant of KVM_{GET,HAS}_DEVICE_ATTR, and implement it in x86
      with just one group (0) and attribute (KVM_X86_XCOMP_GUEST_SUPP).
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  12. 15 Jan, 2022 1 commit
  13. 07 Jan, 2022 1 commit
    • KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery · 14243b38
      Committed by David Woodhouse
      This adds basic support for delivering 2-level event channels to a guest.
      
      Initially, it only supports delivery via the IRQ routing table, triggered
      by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast()
      function which will use the pre-mapped shared_info page if it already
      exists and is still valid, while the slow path through the irqfd_inject
      workqueue will remap the shared_info page if necessary.
      
      It sets the bits in the shared_info page but not the vcpu_info; that is
      deferred to __kvm_xen_has_interrupt() which raises the vector to the
      appropriate vCPU.
      
      Add a 'verbose' mode to xen_shinfo_test while adding test cases for this.
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-5-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
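For readers unfamiliar with the 2-level scheme, the sketch below is a toy model of the general idea: a per-port pending bitmap plus a per-port mask in the shared page, with a word index that a per-vCPU selector would later point at. It is not KVM's `kvm_xen_set_evtchn_fast()`; the class, the function and the 64-bit word size are assumptions made for illustration.

```python
from typing import Optional

BITS_PER_WORD = 64  # assumption: 64-bit guest, so 64 words of 64 bits each


class ToySharedInfo:
    def __init__(self) -> None:
        self.evtchn_pending = [0] * BITS_PER_WORD
        self.evtchn_mask = [0] * BITS_PER_WORD


def set_evtchn(shinfo: ToySharedInfo, port: int) -> Optional[int]:
    """Mark `port` pending in the shared page.

    Returns the index of the pending word when the vCPU still needs to be
    notified, or None when the port was already pending or is masked.
    """
    if not 0 <= port < BITS_PER_WORD * BITS_PER_WORD:
        raise ValueError("port out of range")
    word, bit = divmod(port, BITS_PER_WORD)
    mask = 1 << bit
    if shinfo.evtchn_pending[word] & mask:
        return None                      # already pending, nothing new to signal
    shinfo.evtchn_pending[word] |= mask
    if shinfo.evtchn_mask[word] & mask:
        return None                      # recorded, but the guest masked this port
    return word                          # the selector bit for this word comes next


shinfo = ToySharedInfo()
print(set_evtchn(shinfo, 70))  # port 70 lives in word 1, bit 6
print(set_evtchn(shinfo, 70))  # second delivery finds it already pending
```

The returned word index is the piece of information that the second level, the per-vCPU side, would consume; this model stops at the shared page, mirroring the split the commit message describes.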
  14. 06 Jan, 2022 1 commit
  15. 11 Nov, 2021 1 commit
    • KVM: SEV: Add support for SEV intra host migration · b5663931
      Committed by Peter Gonda
      For SEV to work with intra host migration, contents of the SEV info struct
      such as the ASID (used to index the encryption key in the AMD SP) and
      the list of memory regions need to be transferred to the target VM.
      This change adds a command for a target VMM to get a source SEV VM's sev
      info.
      Signed-off-by: Peter Gonda <pgonda@google.com>
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Marc Orr <marcorr@google.com>
      Cc: Marc Orr <marcorr@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Message-Id: <20211021174303.385706-3-pgonda@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  16. 25 Oct, 2021 2 commits
  17. 19 Oct, 2021 1 commit
  18. 04 Oct, 2021 1 commit
  19. 21 Aug, 2021 1 commit
  20. 25 Jun, 2021 1 commit
  21. 24 Jun, 2021 1 commit
    • KVM: stats: Add fd-based API to read binary stats data · cb082bfa
      Committed by Jing Zhang
      This commit defines the userspace API and prepares the common
      functionality needed to read per-VM and per-vCPU binary stats data.

      KVM stats are currently accessible only through debugfs, which has some
      shortcomings that this series is meant to fix:
      1. The current debugfs stats solution in KVM can be disabled
         when kernel Lockdown mode is enabled, which is a potential
         risk for production.
      2. The current debugfs stats solution in KVM is organized as "one
         stat per file", which is good for debugging but not efficient
         for production.
      3. Stats reads and clears in the current debugfs solution are
         protected by the global kvm_lock.

      Besides that, this change has some other benefits:
      1. All KVM VM/vCPU stats can be read out in bulk with a single copy
         to userspace.
      2. A schema is used to describe KVM statistics. From userspace's
         perspective, the KVM statistics are self-describing.
      3. With the fd-based solution, a separate telemetry process can
         read KVM stats in a less privileged environment.
      4. After the initial setup of reading in the stats descriptors, a
         telemetry process only needs to read the stats data itself; no
         further parsing or setup is needed.
      Reviewed-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Ricardo Koller <ricarkol@google.com>
      Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Reviewed-by: Fuad Tabba <tabba@google.com>
      Tested-by: Fuad Tabba <tabba@google.com> #arm64
      Signed-off-by: Jing Zhang <jingzhangos@google.com>
      Message-Id: <20210618222709.1858088-3-jingzhangos@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
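The "parse the schema once, then only read data" pattern from the list above can be sketched against an in-memory buffer. The header and descriptor layouts used here are my reading of the uAPI (six 32-bit header fields, then fixed-size descriptors each followed by a name of `name_size` bytes) and should be treated as assumptions to verify against `linux/kvm.h`. The function names and the synthetic buffer are hypothetical, and no real stats file descriptor is involved.

```python
import struct

# Assumed layouts, native byte order, no implicit padding.
HEADER = struct.Struct("=6I")         # flags, name_size, num_desc, id_offset, desc_offset, data_offset
DESC_FIXED = struct.Struct("=IhHII")  # flags, exponent, size, offset, one further 32-bit field


def parse_descriptors(blob: bytes):
    """One-time setup: return (data_offset, [(name, offset, size), ...])."""
    _flags, name_size, num_desc, _id_off, desc_off, data_off = HEADER.unpack_from(blob, 0)
    stride = DESC_FIXED.size + name_size
    descs = []
    for i in range(num_desc):
        base = desc_off + i * stride
        _f, _exp, size, offset, _last = DESC_FIXED.unpack_from(blob, base)
        raw_name = blob[base + DESC_FIXED.size:base + stride]
        descs.append((raw_name.split(b"\0", 1)[0].decode(), offset, size))
    return data_off, descs


def read_values(blob: bytes, data_off: int, descs):
    """Steady state: decode only the data block, as 64-bit unsigned values."""
    return {name: list(struct.unpack_from(f"={size}Q", blob, data_off + offset))
            for name, offset, size in descs}


# Synthetic buffer: two descriptors with 8-byte names, then three u64 values.
NAME_SIZE = 8
descs_raw = (DESC_FIXED.pack(0, 0, 1, 0, 0) + b"exits".ljust(NAME_SIZE, b"\0") +
             DESC_FIXED.pack(0, 0, 2, 8, 0) + b"hist".ljust(NAME_SIZE, b"\0"))
header = HEADER.pack(0, NAME_SIZE, 2, HEADER.size, HEADER.size,
                     HEADER.size + len(descs_raw))
blob = header + descs_raw + struct.pack("=3Q", 7, 1, 2)

data_off, descs = parse_descriptors(blob)
print(read_values(blob, data_off, descs))
```

In a real reader only `read_values` would run on each poll, against freshly read data bytes, while the descriptor list from the initial setup is reused.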
  22. 22 Jun, 2021 3 commits
  23. 18 Jun, 2021 3 commits
  24. 27 May, 2021 1 commit