1. 29 September 2020, 2 commits
  2. 22 August 2020, 1 commit
    • KVM: Pass MMU notifier range flags to kvm_unmap_hva_range() · fdfe7cbd
      Committed by Will Deacon
      The 'flags' field of 'struct mmu_notifier_range' is used to indicate
      whether invalidate_range_{start,end}() are permitted to block. In the
      case of kvm_mmu_notifier_invalidate_range_start(), this field is not
      forwarded on to the architecture-specific implementation of
      kvm_unmap_hva_range() and therefore the backend cannot sensibly decide
      whether or not to block.
      
      Add an extra 'flags' parameter to kvm_unmap_hva_range() so that
      architectures know whether or not they are permitted to block.
      
      Cc: <stable@vger.kernel.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Message-Id: <20200811102725.7121-2-will@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
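
      As a hedged sketch of the resulting interface (illustrative only:
      kvm_handle_hva_range() stands in for whatever walker the architecture
      uses, and MMU_NOTIFIER_RANGE_BLOCKABLE comes from <linux/mmu_notifier.h>):

          #include <linux/kvm_host.h>
          #include <linux/mmu_notifier.h>

          /* Sketch: the extra 'flags' argument carries the notifier range
           * flags, so the backend can tell whether blocking is allowed. */
          int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start,
                                  unsigned long end, unsigned int flags)
          {
                  bool may_block = flags & MMU_NOTIFIER_RANGE_BLOCKABLE;

                  /* Hypothetical arch walker: only sleep if may_block. */
                  return kvm_handle_hva_range(kvm, start, end, may_block);
          }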
  3. 28 July 2020, 1 commit
  4. 10 July 2020, 3 commits
  5. 07 July 2020, 9 commits
  6. 06 July 2020, 2 commits
  7. 10 June 2020, 1 commit
  8. 09 June 2020, 1 commit
  9. 29 May 2020, 1 commit
  10. 28 May 2020, 1 commit
  11. 25 May 2020, 1 commit
  12. 16 May 2020, 2 commits
    • KVM: arm64: Support enabling dirty log gradually in small chunks · c862626e
      Committed by Keqian Zhu
      There is already support for enabling dirty log gradually in small chunks
      for x86 in commit 3c9bd400 ("KVM: x86: enable dirty log gradually in
      small chunks"). This adds support for arm64.
      
      x86 still write-protects all huge pages when DIRTY_LOG_INITIALLY_ALL_SET
      is enabled. For arm64, however, both huge pages and normal pages can be
      write-protected gradually by userspace.
      
      On a Huawei Kunpeng 920 2.6GHz platform, I ran tests on 128G Linux VMs
      with different page sizes. The memory pressure was 127G in each case.
      The time taken by memory_global_dirty_log_start in QEMU is listed below:
      
      Page Size      Before    After Optimization
        4K            650ms         1.8ms
        2M             4ms          1.8ms
        1G             2ms          1.8ms
      
      Besides the time reduction, the biggest improvement is that the
      performance side effects on the guest (from dissolving huge pages and
      marking memslots dirty) after enabling the dirty log are minimized.
      Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com
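
      For context, a minimal userspace sketch of opting in (hedged: assumes
      vm_fd is an open KVM VM descriptor on a kernel exposing
      KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2):

          #include <linux/kvm.h>
          #include <sys/ioctl.h>

          /* Sketch: with KVM_DIRTY_LOG_INITIALLY_SET, write protection is
           * applied gradually as userspace clears chunks of the dirty log,
           * rather than all at once when logging is enabled. */
          static int enable_dirty_log_initially_set(int vm_fd)
          {
                  struct kvm_enable_cap cap = {
                          .cap  = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2,
                          .args = { KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
                                    KVM_DIRTY_LOG_INITIALLY_SET },
                  };

                  return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
          }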
    • kvm: add halt-polling cpu usage stats · cb953129
      Committed by David Matlack
      Two new stats for exposing halt-polling cpu usage:
      halt_poll_success_ns
      halt_poll_fail_ns
      
      The sum of these two stats is the total CPU time spent polling. "Success"
      means the VCPU polled until a virtual interrupt was delivered. "Fail"
      means the VCPU had to schedule out (either because the maximum poll time
      was reached or because it needed to yield the CPU).
      
      To avoid touching every arch's kvm_vcpu_stat struct, only update and
      export halt-polling cpu usage stats if we're on x86.
      
      Exporting CPU usage as a u64 in nanoseconds means we will overflow after
      ~500 years, which seems reasonably large.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Jon Cargille <jcargill@google.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      
      Message-Id: <20200508182240.68440-1-jcargill@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
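
      A hedged sketch of the bookkeeping (the stat field names match the
      commit; the helper itself is illustrative):

          #include <linux/kvm_host.h>
          #include <linux/ktime.h>

          /* Sketch: split elapsed polling time into the two buckets. */
          static void account_halt_poll(struct kvm_vcpu_stat *stat,
                                        ktime_t start, ktime_t end,
                                        bool success)
          {
                  u64 poll_ns = ktime_to_ns(ktime_sub(end, start));

                  if (success)    /* a virtual interrupt arrived while polling */
                          stat->halt_poll_success_ns += poll_ns;
                  else            /* timed out or had to yield the CPU */
                          stat->halt_poll_fail_ns += poll_ns;
          }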
  13. 04 May 2020, 1 commit
  14. 24 March 2020, 1 commit
  15. 17 February 2020, 1 commit
  16. 28 January 2020, 3 commits
  17. 23 January 2020, 1 commit
    • KVM: arm/arm64: Cleanup MMIO handling · 0e20f5e2
      Committed by Marc Zyngier
      Our MMIO handling is a bit odd, in the sense that it uses an
      intermediate per-vcpu structure to store the various decoded
      information that describe the access.
      
      But the same information is readily available in the HSR/ESR_EL2
      field, and we actually use this field to populate the structure.
      
      Let's simplify the whole thing by getting rid of the superfluous
      structure, saving a (tiny) bit of space in the vcpu structure.
      
      [32bit fix courtesy of Olof Johansson <olof@lixom.net>]
      Signed-off-by: Marc Zyngier <maz@kernel.org>
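
      The shape of the change, as a hedged sketch (kvm_vcpu_get_hsr() was the
      then-current accessor for the saved HSR/ESR_EL2 value; ESR_ELx_WNR is
      the write-not-read bit):

          #include <asm/esr.h>

          /* Sketch: decode straight from the saved fault syndrome instead
           * of consulting a cached per-vcpu decode structure. */
          static bool vcpu_dabt_is_write(const struct kvm_vcpu *vcpu)
          {
                  return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR;
          }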
  18. 16 January 2020, 1 commit
  19. 15 January 2020, 1 commit
    • arm64: Introduce system_capabilities_finalized() marker · b51c6ac2
      Committed by Suzuki K Poulose
      We finalize the system-wide capabilities after the SMP CPUs are booted
      by the kernel. This is used as a marker for various checks in the
      kernel, e.g., sanity-checking hotplugged CPUs for missing mandatory
      features.
      
      However, there is no explicit helper available for this in the kernel.
      There is sys_caps_initialised, which is not exposed. The closest one we
      have is the jump_label arm64_const_caps_ready, which denotes that the
      capabilities are set and that capability checks can use the individual
      jump_labels for the fast path. This is performed before setting the ELF
      Hwcaps, which must be checked against the new CPUs. We also perform some
      of the other initialization, e.g., SVE setup, which is important for the
      use of FP/SIMD where SVE is supported. Normally userspace doesn't get to
      run before we finish this. However, in-kernel users may potentially
      start using NEON mode, so we need to reject uses of NEON mode until the
      setup is complete. Instead of defining a new marker for the completion
      of SVE setup, we can simply reuse arm64_const_caps_ready and enable it
      once we have finished all the setup. We also expose this to the various
      users as "system_capabilities_finalized()" to make it more meaningful
      than "const_caps_ready".
      
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
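
      The helper itself is a thin wrapper over the existing static key, along
      these lines (sketch matching the description above):

          #include <linux/jump_label.h>

          DECLARE_STATIC_KEY_FALSE(arm64_const_caps_ready);

          /* True once all system-wide capability setup (including SVE
           * setup) has completed. */
          static inline bool system_capabilities_finalized(void)
          {
                  return static_branch_likely(&arm64_const_caps_ready);
          }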
  20. 22 October 2019, 4 commits
    • KVM: arm64: Provide VCPU attributes for stolen time · 58772e9a
      Committed by Steven Price
      Allow user space to inform the KVM host where in the physical memory
      map the paravirtualized time structures should be located.
      
      User space can set an attribute on the VCPU providing the IPA base
      address of the stolen time structure for that VCPU. This must be
      repeated for every VCPU in the VM.
      
      The address is given in terms of the physical address visible to the
      guest and must be 64-byte aligned. The guest will discover the address
      via a hypercall.
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
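
      A hedged userspace sketch of setting the attribute (assumes vcpu_fd is
      an open VCPU descriptor; the constants come from the UAPI headers this
      series adds):

          #include <linux/kvm.h>
          #include <sys/ioctl.h>

          /* Sketch: tell KVM where this VCPU's stolen-time structure lives.
           * The IPA must be 64-byte aligned; repeat for every VCPU. */
          static int set_stolen_time_ipa(int vcpu_fd, __u64 ipa)
          {
                  struct kvm_device_attr attr = {
                          .group = KVM_ARM_VCPU_PVTIME_CTRL,
                          .attr  = KVM_ARM_VCPU_PVTIME_IPA,
                          .addr  = (__u64)(unsigned long)&ipa,
                  };

                  return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
          }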
    • KVM: arm64: Support stolen time reporting via shared structure · 8564d637
      Committed by Steven Price
      Implement the service call for configuring a shared structure between a
      VCPU and the hypervisor in which the hypervisor can write the time
      stolen from the VCPU's execution time by other tasks on the host.
      
      User space allocates memory which is placed at an IPA also chosen by user
      space. The hypervisor then updates the shared structure using
      kvm_put_guest() to ensure single copy atomicity of the 64-bit value
      reporting the stolen time in nanoseconds.
      
      Whenever stolen time is enabled by the guest, the stolen time counter is
      reset.
      
      The stolen time itself is retrieved from the sched_info structure
      maintained by the Linux scheduler code. We enable SCHEDSTATS when
      selecting KVM Kconfig to ensure this value is meaningful.
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
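
      A hedged sketch of the shared layout and the update path (the struct
      follows the 64-byte ABI this series defines; the vcpu->arch.steal field
      names and the helper body are illustrative, not the exact patch):

          #include <linux/kvm_host.h>
          #include <linux/sched.h>

          /* Shared structure the hypervisor writes for the guest. */
          struct pvclock_vcpu_stolen_time {
                  __le32 revision;
                  __le32 attributes;
                  __le64 stolen_time;     /* nanoseconds */
                  u8 padding[48];         /* pad to 64 bytes */
          };

          /* Sketch: fold the scheduler's run_delay into the counter and
           * publish it with single-copy atomicity via kvm_put_guest(). */
          static void update_stolen_time(struct kvm_vcpu *vcpu, gpa_t base)
          {
                  u64 steal = vcpu->arch.steal.steal;

                  steal += current->sched_info.run_delay -
                           vcpu->arch.steal.last_steal;
                  vcpu->arch.steal.last_steal = current->sched_info.run_delay;
                  vcpu->arch.steal.steal = steal;

                  kvm_put_guest(vcpu->kvm,
                                base + offsetof(struct pvclock_vcpu_stolen_time,
                                                stolen_time),
                                cpu_to_le64(steal), u64);
          }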
    • KVM: arm64: Implement PV_TIME_FEATURES call · b48c1a45
      Committed by Steven Price
      This provides a mechanism for querying which paravirtualized time
      features are available in this hypervisor.
      
      Also add the header file which defines the ABI for the paravirtualized
      time features we're about to add.
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
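
      From the guest side, the query looks roughly like this (hedged sketch
      using the SMCCC constants the series defines):

          #include <linux/arm-smccc.h>

          /* Sketch: ask the hypervisor whether stolen time is offered. */
          static bool pv_stolen_time_available(void)
          {
                  struct arm_smccc_res res;

                  arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_FEATURES,
                                       ARM_SMCCC_HV_PV_TIME_ST, &res);

                  return res.a0 == SMCCC_RET_SUCCESS;
          }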
    • KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace · c726200d
      Committed by Christoffer Dall
      For a long time, if a guest accessed memory outside of a memslot using
      any of the load/store instructions in the architecture which don't
      supply decoding information in the ESR_EL2 (the ISV bit is not set), the
      kernel would print the following message and terminate the VM as a
      result of returning -ENOSYS to userspace:
      
        load/store instruction decoding not implemented
      
      The reason behind this message is that KVM assumes that all accesses
      outside a memslot are MMIO accesses which should be handled by
      userspace, and we originally expected to eventually implement some sort
      of decoding of load/store instructions where the ISV bit was not set.
      
      However, it turns out that many of the instructions which don't provide
      decoding information on abort are not safe to use for MMIO accesses, and
      the remaining few that would potentially make sense to use on MMIO
      accesses, such as those with register writeback, are not used in
      practice.  It also turns out that fetching an instruction from guest
      memory can be a pretty horrible affair, involving stopping all CPUs on
      SMP systems, handling multiple corner cases of address translation in
      software, and more.  It doesn't appear likely that we'll ever implement
      this in the kernel.
      
      What is much more common is that a user has misconfigured his/her guest
      and is actually not accessing an MMIO region, but just hitting some
      random hole in the IPA space.  In this scenario, the error message above
      is rather misleading and has led to a great deal of confusion over the
      years.
      
      It is, nevertheless, ABI to userspace, and we therefore need to
      introduce a new capability that userspace explicitly enables to change
      behavior.
      
      This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV)
      which does exactly that, and introduces a new exit reason to report the
      event to userspace.  User space can then emulate an exception to the
      guest, restart the guest, suspend the guest, or take any other
      appropriate action as per the policy of the running system.
      Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
      Reviewed-by: Alexander Graf <graf@amazon.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
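
      A hedged userspace sketch of both halves, capability enable plus exit
      handling (vm_fd and run are assumed to exist in the caller's VMM):

          #include <linux/kvm.h>
          #include <sys/ioctl.h>
          #include <stdio.h>

          /* Sketch: opt in to receiving non-ISV aborts as exits. */
          static int enable_nisv_exits(int vm_fd)
          {
                  struct kvm_enable_cap cap = {
                          .cap = KVM_CAP_ARM_NISV_TO_USER,
                  };

                  return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
          }

          /* Sketch: the new exit carries the fault IPA and the ESR ISS
           * bits, so userspace can emulate, inject an abort, or stop. */
          static void handle_nisv_exit(struct kvm_run *run)
          {
                  if (run->exit_reason != KVM_EXIT_ARM_NISV)
                          return;

                  fprintf(stderr, "non-ISV abort: IPA=0x%llx ISS=0x%llx\n",
                          (unsigned long long)run->arm_nisv.fault_ipa,
                          (unsigned long long)run->arm_nisv.esr_iss);
          }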
  21. 15 October 2019, 1 commit
    • arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear · f2266504
      Committed by Marc Zyngier
      The GICv3 architecture specification is incredibly misleading when it
      comes to PMR and the requirement for a DSB. It turns out that this DSB
      is only required if the CPU interface sends an Upstream Control
      message to the redistributor in order to update the RD's view of PMR.
      
      This message is only sent when ICC_CTLR_EL1.PMHE is set, which isn't
      the case in Linux. It can still be set from EL3, so some special care
      is required. But the upshot is that in the (hopefully large) majority
      of cases, we can drop the DSB altogether.
      
      This relies on a new static key being set if the boot CPU has PMHE
      set. The drawback is that this static key has to be exported to
      modules.
      
      Cc: Will Deacon <will@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
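
      The resulting barrier helper looks roughly like this (sketch; the
      static key is the one this patch exports to modules):

          #include <linux/jump_label.h>
          #include <asm/barrier.h>

          DECLARE_STATIC_KEY_FALSE(gic_pmr_sync);

          /* Sketch: only pay for the DSB when the boot CPU had PMHE set,
           * i.e. when a PMR write must reach the redistributor. */
          static inline void pmr_sync(void)
          {
                  if (static_branch_unlikely(&gic_pmr_sync))
                          dsb(sy);
          }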
  22. 08 July 2019, 1 commit
    • KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register · 1e0cf16c
      Committed by Marc Zyngier
      As part of setting up the host context, we populate its
      MPIDR by using cpu_logical_map(). It turns out that contrary
      to arm64, cpu_logical_map() on 32bit ARM doesn't return the
      *full* MPIDR, but a truncated version.
      
      This leaves the host MPIDR slightly corrupted after the first
      run of a VM, since we won't correctly restore the MPIDR on
      exit. Oops.
      
      Since we cannot trust cpu_logical_map(), let's adopt a different
      strategy. We move the initialization of the host CPU context as
      part of the per-CPU initialization (which, in retrospect, makes
      a lot of sense), and directly read the MPIDR from the HW. This
      is guaranteed to work on both arm and arm64.
      Reported-by: Andre Przywara <Andre.Przywara@arm.com>
      Tested-by: Andre Przywara <Andre.Przywara@arm.com>
      Fixes: 32f13955 ("arm/arm64: KVM: Statically configure the host's view of MPIDR")
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
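
      The fix, sketched for the arm64 side (read_cpuid_mpidr() reads the
      hardware register directly and works on both arm and arm64):

          #include <asm/cputype.h>

          /* Sketch: populate the host context's MPIDR straight from the
           * hardware during per-CPU init, not from cpu_logical_map(). */
          static void kvm_init_host_cpu_context(struct kvm_cpu_context *ctxt)
          {
                  ctxt->sys_regs[MPIDR_EL1] = read_cpuid_mpidr();
          }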