1. 08 Nov, 2019 1 commit
  2. 29 Oct, 2019 1 commit
    • C
      KVM: arm64: Don't set HCR_EL2.TVM when S2FWB is supported · 5c401308
      Committed by Christoffer Dall
      On CPUs that support S2FWB (Armv8.4+), KVM configures the stage 2 page
      tables to override the memory attributes of memory accesses, regardless
      of the stage 1 page table configurations, and also when the stage 1 MMU
      is turned off.  This results in all memory accesses to RAM being
      cacheable, including during early boot of the guest.
      
      On CPUs without this feature, memory accesses were non-cacheable during
      boot until the guest turned on the stage 1 MMU, and we had to detect
      when the guest turned on the MMU, such that we could invalidate all cache
      entries and ensure a consistent view of memory with the MMU turned on.
      When the guest turned on the caches, we would call stage2_flush_vm()
      from kvm_toggle_cache().
      
      However, stage2_flush_vm() walks all the stage 2 tables, and calls
      __kvm_flush_dcache_pte, which on a system with S2FWB does ... absolutely
      nothing.
      
      We can avoid that whole song and dance, and simply not set TVM when
      creating a VM on a system that has S2FWB.
      Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Link: https://lore.kernel.org/r/20191028130541.30536-1-christoffer.dall@arm.com
      5c401308
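The choice described in this commit message can be sketched in plain C. This is an illustrative stand-alone model, not the kernel's code: the helper name `guest_cache_hcr_bits` is hypothetical, and only the two HCR_EL2 bit positions are taken from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2 bit positions as defined by the Arm architecture. */
#define HCR_TVM (UINT64_C(1) << 26) /* trap guest writes to VM control registers */
#define HCR_FWB (UINT64_C(1) << 46) /* stage 2 forced write-back */

/* Hypothetical helper: pick the cache-related HCR_EL2 bits for a new VM. */
static uint64_t guest_cache_hcr_bits(bool cpu_has_s2fwb)
{
    if (cpu_has_s2fwb)
        return HCR_FWB; /* RAM is always cacheable, so no MMU-on detection */
    return HCR_TVM;     /* trap SCTLR_EL1 writes to notice the MMU being enabled */
}
```

With S2FWB the trap is never armed, so the stage 2 flush path is never reached.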
  3. 05 Jul, 2019 2 commits
    • D
      KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s · fdec2a9e
      Committed by Dave Martin
      Currently, the {read,write}_sysreg_el*() accessors for accessing
      particular ELs' sysregs in the presence of VHE rely on some local
      hacks and define their system register encodings in a way that is
      inconsistent with the core definitions in <asm/sysreg.h>.
      
      As a result, it is necessary to add duplicate definitions for any
      system register that already needs a definition in sysreg.h for
      other reasons.
      
      This is a bit of a maintenance headache, and the reasons for the
      _el*() accessors working the way they do are a bit historical.
      
      This patch gets rid of the shadow sysreg definitions in
      <asm/kvm_hyp.h>, converts the _el*() accessors to use the core
      __msr_s/__mrs_s interface, and converts all call sites to use the
      standard sysreg #define names (i.e., upper case, with SYS_ prefix).
      
      This patch will conflict heavily anyway, so the opportunity
      to clean up some bad whitespace in the context of the changes is
      taken.
      
      The change exposes a few system registers that have no sysreg.h
      definition, due to msr_s/mrs_s being used in place of msr/mrs:
      additions are made in order to fill in the gaps.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoffer Dall <christoffer.dall@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: https://www.spinics.net/lists/kvm-arm/msg31717.html
      [Rebased to v4.21-rc1]
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      [Rebased to v5.2-rc5, changelog updates]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      fdec2a9e
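As a rough illustration of the "core definitions" this commit converges on, the sketch below packs the five system register fields into a single encoding the way the `sys_reg()` macro in `<asm/sysreg.h>` does. The function name `sys_reg_enc` is made up for this example; the shift values and the two register encodings are the architectural ones.

```c
#include <stdint.h>

/* Field shifts of the system register encoding. */
#define OP0_SHIFT 19
#define OP1_SHIFT 16
#define CRN_SHIFT 12
#define CRM_SHIFT 8
#define OP2_SHIFT 5

/* Hypothetical stand-in for the kernel's sys_reg() macro. */
static uint32_t sys_reg_enc(uint32_t op0, uint32_t op1, uint32_t crn,
                            uint32_t crm, uint32_t op2)
{
    return (op0 << OP0_SHIFT) | (op1 << OP1_SHIFT) | (crn << CRN_SHIFT) |
           (crm << CRM_SHIFT) | (op2 << OP2_SHIFT);
}

/* SCTLR_EL1 and its VHE alias SCTLR_EL12 differ only in op1 (0 vs 5). */
static uint32_t toy_sys_sctlr_el1(void)  { return sys_reg_enc(3, 0, 1, 0, 0); }
static uint32_t toy_sys_sctlr_el12(void) { return sys_reg_enc(3, 5, 1, 0, 0); }
```

Keeping one encoding per register means an `_EL12` accessor can be derived from the same table as the plain `_EL1` one instead of a shadow definition.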
    • A
      KVM: arm/arm64: Add save/restore support for firmware workaround state · 99adb567
      Committed by Andre Przywara
      KVM implements the firmware interface for mitigating cache speculation
      vulnerabilities. Guests may use this interface to ensure mitigation is
      active.
      If we want to migrate such a guest to a host with a different support
      level for those workarounds, migration might need to fail, to ensure that
      critical guests don't lose their protection.
      
      Introduce a way for userland to save and restore the workarounds state.
      On restoring we do checks that make sure we don't downgrade our
      mitigation level.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      99adb567
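The "don't downgrade" check on restore might look roughly like the following. The enum values, their ordering, and the function name are assumptions made for illustration; the real interface uses firmware pseudo-registers whose exact values are not given in this log.

```c
#include <errno.h>

/* Hypothetical mitigation levels, ordered from weakest to strongest. */
enum toy_wa_level {
    TOY_WA_NOT_AVAIL = 0,    /* no workaround offered */
    TOY_WA_AVAIL = 1,        /* workaround offered to the guest */
    TOY_WA_NOT_REQUIRED = 2, /* hardware is not affected */
};

/* Refuse to restore a saved level that this host cannot match. */
static int toy_restore_wa_level(enum toy_wa_level host, enum toy_wa_level saved)
{
    if (saved > host)
        return -EINVAL; /* migrating would silently drop the mitigation */
    return 0;
}
```

Restoring onto an equal or better protected host succeeds; restoring onto a weaker one fails, so userland can abort the migration.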
  4. 19 Jun, 2019 1 commit
  5. 24 Apr, 2019 1 commit
    • M
      KVM: arm/arm64: Context-switch ptrauth registers · 384b40ca
      Committed by Mark Rutland
      When pointer authentication is supported, a guest may wish to use it.
      This patch adds the necessary KVM infrastructure for this to work, with
      a semi-lazy context switch of the pointer auth state.
      
      The pointer authentication feature is only enabled when VHE is built
      into the kernel and present in the CPU implementation, so only VHE code
      paths are modified.
      
      When we schedule a vcpu, we disable guest usage of pointer
      authentication instructions and accesses to the keys. While these are
      disabled, we avoid context-switching the keys. When we trap the guest
      trying to use pointer authentication functionality, we change to eagerly
      context-switching the keys, and enable the feature. The next time the
      vcpu is scheduled out/in, we start again. However, the host key save is
      optimized and implemented inside the ptrauth instruction/register access
      trap.
      
      Pointer authentication consists of address authentication and generic
      authentication, and CPUs in a system might have varied support for
      either. Where support for either feature is not uniform, it is hidden
      from guests via ID register emulation, as a result of the cpufeature
      framework in the host.
      
      Unfortunately, address authentication and generic authentication cannot
      be trapped separately, as the architecture provides a single EL2 trap
      covering both. If we wish to expose one without the other, we cannot
      prevent a (badly-written) guest from intermittently using a feature
      which is not uniformly supported (when scheduled on a physical CPU which
      supports the relevant feature). Hence, this patch expects both types of
      authentication to be present in a CPU.
      
      The key switch is done from the guest enter/exit assembly as preparation
      for the upcoming in-kernel pointer authentication support. Hence, these
      key switching routines are not implemented in C code, as they may cause
      pointer authentication key signing errors in some situations.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      [Only VHE, key switch in full assembly, vcpu_has_ptrauth checks,
      save host key in ptrauth exception trap]
      Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
      Reviewed-by: Julien Thierry <julien.thierry@arm.com>
      Cc: Christoffer Dall <christoffer.dall@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      [maz: various fixups]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      384b40ca
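The semi-lazy scheme in this commit message can be modelled as a tiny state machine. The struct and function names below are hypothetical; the two HCR_EL2 bits are architectural, and a cleared bit means the corresponding guest access traps to EL2.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_APK (UINT64_C(1) << 40) /* 0: guest key register accesses trap */
#define HCR_API (UINT64_C(1) << 41) /* 0: guest ptrauth instructions trap */

/* Hypothetical, heavily reduced vcpu state. */
struct toy_vcpu {
    uint64_t hcr_el2;
    bool switch_keys; /* true once keys are context-switched eagerly */
};

/* On schedule-in: arm the traps and stop switching keys. */
static void toy_vcpu_load(struct toy_vcpu *v)
{
    v->hcr_el2 &= ~(HCR_API | HCR_APK);
    v->switch_keys = false;
}

/* On the first trapped ptrauth use: enable the feature, switch eagerly. */
static void toy_handle_ptrauth_trap(struct toy_vcpu *v)
{
    v->hcr_el2 |= HCR_API | HCR_APK;
    v->switch_keys = true;
}
```

A guest that never touches pointer authentication therefore never pays for a key switch.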
  6. 20 Feb, 2019 3 commits
  7. 18 Dec, 2018 1 commit
    • M
      arm64: KVM: Consistently advance singlestep when emulating instructions · bd7d95ca
      Committed by Mark Rutland
      When we emulate a guest instruction, we don't advance the hardware
      singlestep state machine, and thus the guest will receive a software
      step exception after the next instruction which is not emulated by the
      host.
      
      We bodge around this in an ad-hoc fashion. Sometimes we explicitly check
      whether userspace requested a single step, and fake a debug exception
      from within the kernel. Other times, we advance the HW singlestep state
      and rely on the HW to generate the exception for us. Thus, the observed
      step behaviour differs for host and guest.
      
      Let's make this simpler and consistent by always advancing the HW
      singlestep state machine when we skip an instruction. Thus we can rely
      on the hardware to generate the singlestep exception for us, and never
      need to explicitly check for an active-pending step, nor do we need to
      fake a debug exception from the guest.
      
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      bd7d95ca
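A minimal sketch of "always advance the state machine when skipping an instruction", assuming a 4-byte instruction and a reduced vcpu with only a PC and a saved PSTATE. The names are hypothetical; the SS bit position (bit 21 of the saved program status) is architectural.

```c
#include <stdint.h>

#define DBG_SPSR_SS (UINT64_C(1) << 21) /* software step bit in the saved PSTATE */

/* Hypothetical reduced vcpu state. */
struct toy_vcpu {
    uint64_t pc;
    uint64_t pstate;
};

/* Skip an emulated instruction and move the step state machine along. */
static void toy_skip_instr(struct toy_vcpu *v)
{
    v->pc += 4;                /* assumes a 4-byte A64 instruction */
    v->pstate &= ~DBG_SPSR_SS; /* active-not-pending -> active-pending */
}
```

With SS cleared, the hardware raises the step exception on guest entry, so no fake debug exception is needed.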
  8. 21 Sep, 2018 1 commit
  9. 21 Jul, 2018 1 commit
  10. 09 Jul, 2018 2 commits
  11. 06 Jul, 2018 1 commit
  12. 04 May, 2018 1 commit
    • J
      KVM: arm64: Fix order of vcpu_write_sys_reg() arguments · 1975fa56
      Committed by James Morse
      A typo in kvm_vcpu_set_be()'s call:
      | vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr)
      causes us to use the 32bit register value as an index into the sys_reg[]
      array, and sail off the end of the linear map when we try to bring up
      big-endian secondaries.
      
      | Unable to handle kernel paging request at virtual address ffff80098b982c00
      | Mem abort info:
      |  ESR = 0x96000045
      |  Exception class = DABT (current EL), IL = 32 bits
      |   SET = 0, FnV = 0
      |   EA = 0, S1PTW = 0
      | Data abort info:
      |   ISV = 0, ISS = 0x00000045
      |   CM = 0, WnR = 1
      | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a
      | [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000
      | Internal error: Oops: 96000045 [#1] PREEMPT SMP
      | Modules linked in:
      | CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323
      | Hardware name: ARM Juno development board (r1) (DT)
      | pstate: 60000005 (nZCv daif -PAN -UAO)
      | pc : vcpu_write_sys_reg+0x50/0x134
      | lr : vcpu_write_sys_reg+0x50/0x134
      
      | Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b)
      | Call trace:
      |  vcpu_write_sys_reg+0x50/0x134
      |  kvm_psci_vcpu_on+0x14c/0x150
      |  kvm_psci_0_2_call+0x244/0x2a4
      |  kvm_hvc_call_handler+0x1cc/0x258
      |  handle_hvc+0x20/0x3c
      |  handle_exit+0x130/0x1ec
      |  kvm_arch_vcpu_ioctl_run+0x340/0x614
      |  kvm_vcpu_ioctl+0x4d0/0x840
      |  do_vfs_ioctl+0xc8/0x8d0
      |  ksys_ioctl+0x78/0xa8
      |  sys_ioctl+0xc/0x18
      |  el0_svc_naked+0x30/0x34
      | Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8)
      |---[ end trace 4b4a4f9628596602 ]---
      
      Fix the order of the arguments.
      
      Fixes: 8d404c4c ("KVM: arm64: Rewrite system register accessors to read/write functions")
      CC: Christoffer Dall <cdall@cs.columbia.edu>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      1975fa56
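The failure mode is easy to reproduce in a toy model where the accessor takes the value before the register index, as `vcpu_write_sys_reg(vcpu, val, reg)` does. Everything below is hypothetical, including the bounds check, which is added here only so that the swapped call can be observed safely instead of writing out of bounds.

```c
#include <stdint.h>

#define TOY_NR_SYS_REGS 8
#define TOY_SCTLR_EL1   1 /* hypothetical index into the register array */

struct toy_vcpu {
    uint64_t sys_regs[TOY_NR_SYS_REGS];
};

/* Value first, register index second. Returns -1 on a bad index. */
static int toy_write_sys_reg(struct toy_vcpu *v, uint64_t val, int reg)
{
    if (reg < 0 || reg >= TOY_NR_SYS_REGS)
        return -1; /* illustration only: catches a value passed as an index */
    v->sys_regs[reg] = val;
    return 0;
}
```

Calling `toy_write_sys_reg(&v, sctlr, TOY_SCTLR_EL1)` stores the value, while the swapped form passes a 32-bit register value as the index and is rejected here. Both arguments are integers, so the compiler accepts either order.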
  13. 19 Mar, 2018 6 commits
  14. 26 Feb, 2018 1 commit
  15. 16 Jan, 2018 3 commits
    • D
      KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA · 558daf69
      Committed by Dongjiu Geng
      ARMv8.2 adds a new bit HCR_EL2.TEA which routes synchronous external
      aborts to EL2, and adds a trap control bit HCR_EL2.TERR which traps
      all Non-secure EL1&0 error record accesses to EL2.
      
      This patch enables the two bits for the guest OS, guaranteeing that
      KVM takes external aborts and traps attempts to access the physical
      error registers.
      
      ERRIDR_EL1 advertises the number of error records; we return zero,
      meaning we can treat all the other registers as RAZ/WI too.
      Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
      [removed specific emulation, use trap_raz_wi() directly for everything,
       rephrased parts of the commit message]
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      558daf69
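RAZ/WI ("read as zero, writes ignored") handling is simple enough to sketch in a few lines. The parameter struct and handler name below are hypothetical stand-ins for the kernel's trap handler plumbing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one trapped system register access. */
struct toy_sys_reg_params {
    bool is_write;
    uint64_t regval; /* value written by, or to be returned to, the guest */
};

/* RAZ/WI: reads yield zero, writes are silently dropped. */
static bool toy_trap_raz_wi(struct toy_sys_reg_params *p)
{
    if (!p->is_write)
        p->regval = 0;
    return true; /* handled: the caller skips the trapped instruction */
}
```

A guest reading ERRIDR_EL1 through such a handler sees zero error records and has no reason to touch the remaining error record registers.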
    • J
      KVM: arm64: Handle RAS SErrors from EL2 on guest exit · 0067df41
      Committed by James Morse
      We expect to have firmware-first handling of RAS SErrors, with errors
      notified via an APEI method. For systems without firmware-first, add
      some minimal handling to KVM.
      
      There are two ways KVM can take an SError due to a guest, either may be a
      RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
      or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
      
      The current SError from EL2 code unmasks SError and tries to fence any
      pending SError into a single instruction window. It then leaves SError
      unmasked.
      
      With the v8.2 RAS Extensions we may take an SError for a 'corrected'
      error, but KVM is only able to handle SError from EL2 if they occur
      during this single instruction window...
      
      The RAS Extensions give us a new instruction to synchronise and
      consume SErrors. The RAS Extensions document (ARM DDI0587),
      '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
      SError interrupts generated by 'instructions, translation table walks,
      hardware updates to the translation tables, and instruction fetches on
      the same PE'. This makes ESB equivalent to KVM's existing
      'dsb, mrs-daifclr, isb' sequence.
      
      Use the alternatives to synchronise and consume any SError using ESB
      instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
      in the exit_code so that we can restart the vcpu if it turns out this
      SError has no impact on the vcpu.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      0067df41
    • J
      KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. · 4715c14b
      Committed by James Morse
      Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
      generated an SError with an implementation defined ESR_EL1.ISS, because we
      had no mechanism to specify the ESR value.
      
      On Juno this generates an all-zero ESR; the most significant bit, 'ISV',
      is clear, indicating that the remainder of the ISS field is invalid.
      
      With the RAS Extensions we have a mechanism to specify this value, and the
      most significant bit has a new meaning: 'IDS - Implementation Defined
      Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
      instead of 'no valid ISS'.
      
      Add KVM support for the VSESR_EL2 register to specify an ESR value when
      HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
      specify an implementation-defined value.
      
      We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set. KVM
      saves/restores this bit during __{,de}activate_traps(), and hardware clears
      the bit once the guest has consumed the virtual SError.
      
      Future patches may add an API (or KVM CAP) to pend a virtual SError with
      a specified ESR.
      
      Cc: Dongjiu Geng <gengdongjiu@huawei.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      4715c14b
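Pending a virtual SError with a chosen syndrome can be sketched as below. The struct and function names are hypothetical; the bit positions (HCR_EL2.VSE at bit 8, the ISS field in bits 24:0 with IDS as its top bit) follow the architecture.

```c
#include <stdint.h>

#define HCR_VSE          (UINT64_C(1) << 8)  /* virtual SError pending */
#define ESR_ELx_ISS_MASK UINT64_C(0x1FFFFFF) /* ISS field, bits 24:0 */
#define ESR_ELx_IDS      (UINT64_C(1) << 24) /* implementation defined syndrome */

/* Hypothetical reduced vcpu state. */
struct toy_vcpu {
    uint64_t hcr_el2;
    uint64_t vsesr_el2;
};

/* Record the syndrome the guest should see, then pend the SError. */
static void toy_pend_guest_serror(struct toy_vcpu *v, uint64_t esr)
{
    v->vsesr_el2 = esr & ESR_ELx_ISS_MASK;
    v->hcr_el2 |= HCR_VSE;
}

/* Inject an SError flagged as implementation defined, not "uncategorized". */
static void toy_inject_vabt(struct toy_vcpu *v)
{
    toy_pend_guest_serror(v, ESR_ELx_IDS);
}
```

Setting IDS keeps the guest from misreading an all-zero syndrome as a RAS error.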
  16. 06 Nov, 2017 2 commits
  17. 05 Sep, 2017 1 commit
  18. 02 May, 2017 1 commit
  19. 22 Oct, 2016 1 commit
    • W
      arm64: KVM: Take S1 walks into account when determining S2 write faults · 60e21a0e
      Committed by Will Deacon
      The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
      generated by a read or a write instruction. For stage 2 data aborts
      generated by a stage 1 translation table walk (i.e. the actual page
      table access faults at EL2), the WnR bit therefore reports whether the
      instruction generating the walk was a load or a store, *not* whether the
      page table walker was reading or writing the entry.
      
      For page tables marked as read-only at stage 2 (e.g. due to KSM merging
      them with the tables from another guest), this could result in livelock,
      where a page table walk generated by a load instruction attempts to
      set the access flag in the stage 1 descriptor, but fails to trigger
      CoW in the host since only a read fault is reported.
      
      This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
      take into account stage 2 faults in stage 1 walks. Since DBM cannot be
      disabled at EL2 for CPUs that implement it, we assume that these faults
      are always caused by writes, avoiding the livelock situation at the
      expense of occasional, spurious CoWs.
      
      We could, in theory, do a bit better by checking the guest TCR
      configuration and inspecting the page table to see why the PTE faulted.
      However, I doubt this is measurable in practice, and the threat of
      livelock is real.
      
      Cc: <stable@vger.kernel.org>
      Cc: Julien Grall <julien.grall@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      60e21a0e
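The rule "a stage 2 fault on a stage 1 walk counts as a write" reduces to a one-line predicate on the syndrome. The function name below is a stand-alone stand-in taking a raw ESR value rather than a vcpu; the WnR and S1PTW bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESR_ELx_WNR   (UINT64_C(1) << 6) /* write, not read */
#define ESR_ELx_S1PTW (UINT64_C(1) << 7) /* fault on a stage 1 table walk */

/* Treat the abort as a write if WnR says so or if it hit a stage 1 walk. */
static bool toy_dabt_iswrite(uint64_t esr)
{
    return (esr & ESR_ELx_WNR) || (esr & ESR_ELx_S1PTW);
}
```

A load whose table walk needs to update the access flag is then handled as a write fault, which lets the host break CoW instead of refaulting forever.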
  20. 08 Sep, 2016 2 commits
  21. 22 Jun, 2016 1 commit
  22. 01 Mar, 2016 2 commits
    • M
      arm64: KVM: VHE: Implement VHE activate/deactivate_traps · 68908bf7
      Committed by Marc Zyngier
      Running the kernel in HYP mode requires the HCR_E2H bit to be set
      at all times, and the HCR_TGE bit to be set when running as a host
      (and cleared when running as a guest). At the same time, the vector
      must be set to the current role of the kernel (either host or
      hypervisor), and a couple of system registers differ between VHE
      and non-VHE.
      
      We implement these by using another set of alternate functions
      that get dynamically patched.
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      68908bf7
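The E2H/TGE rule from this commit message, written out as two small helpers. The helper names are hypothetical and the vector and per-register differences are left out; only the two HCR_EL2 bit positions come from the architecture.

```c
#include <stdint.h>

#define HCR_TGE (UINT64_C(1) << 27) /* trap general exceptions: host role */
#define HCR_E2H (UINT64_C(1) << 34) /* EL2 host: always set under VHE */

/* HCR_EL2 value while the VHE kernel runs as the host. */
static uint64_t toy_vhe_host_hcr(void)
{
    return HCR_E2H | HCR_TGE;
}

/* HCR_EL2 value while running a guest: keep E2H, drop TGE. */
static uint64_t toy_vhe_guest_hcr(uint64_t guest_flags)
{
    return (guest_flags | HCR_E2H) & ~HCR_TGE;
}
```

Activating traps corresponds to switching from the first value to the second, and deactivating them to switching back.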
    • M
      arm/arm64: KVM: Handle out-of-RAM cache maintenance as a NOP · 57c841f1
      Committed by Marc Zyngier
      So far, our handling of cache maintenance by VA has been pretty
      simple: Either the access is in the guest RAM and generates a S2
      fault, which results in the page being mapped RW, or we go down
      the io_mem_abort() path, and nuke the guest.
      
      The first one is fine, but the second one is extremely weird.
      Treating the CM as an I/O is wrong, and nothing in the ARM ARM
      indicates that we should generate a fault for something that
      cannot end up in the cache anyway (even if the guest maps it,
      it will keep on faulting at stage-2 for emulation).
      
      So let's just skip this instruction, and let the guest get away
      with it.
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      57c841f1
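The resulting decision can be summarised as a three-way classification. The enum, the function, and the `in_guest_ram` flag are hypothetical simplifications of the real abort path; the CM bit position in the syndrome is architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESR_ELx_CM (UINT64_C(1) << 8) /* abort came from cache maintenance */

enum toy_abort_action {
    TOY_MAP_PAGE,     /* address is guest RAM: fault the page in */
    TOY_SKIP_INSTR,   /* cache maintenance outside RAM: treat as a NOP */
    TOY_EMULATE_MMIO, /* ordinary access outside RAM: I/O emulation */
};

static enum toy_abort_action toy_classify_abort(uint64_t esr, bool in_guest_ram)
{
    if (in_guest_ram)
        return TOY_MAP_PAGE;
    if (esr & ESR_ELx_CM)
        return TOY_SKIP_INSTR;
    return TOY_EMULATE_MMIO;
}
```

Only the middle case is new behaviour: previously such an access went down the I/O path and killed the guest.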
  23. 25 Jan, 2016 1 commit
  24. 14 Dec, 2015 1 commit
  25. 05 Dec, 2015 2 commits
    • P
      arm64: KVM: Get rid of old vcpu_reg() · f6be563a
      Committed by Pavel Fedin
      Using the old-style vcpu_reg() accessor has proven to be inappropriate
      and unsafe on ARM64. This patch converts the remaining use cases to the
      new accessors and completely removes vcpu_reg() on ARM64.
      Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      f6be563a
    • P
      arm64: KVM: Correctly handle zero register during MMIO · bc45a516
      Committed by Pavel Fedin
      On ARM64, a register index of 31 corresponds to both the zero register
      and SP. However, all memory access instructions use ZR as the transfer
      register. SP is used only as a base register in indirect memory
      addressing, or by register-register arithmetic, which cannot be trapped here.
      
      Correct emulation is achieved by introducing new register accessor
      functions, which can do special handling for reg_num == 31. These new
      accessors intentionally do not rely on old vcpu_reg() on ARM64, because
      it is to be removed. Since the affected code is shared by both ARM
      flavours, implementations of these accessors are also added to ARM32 code.
      
      This patch fixes an MMIO register being set to a random value (actually
      SP) instead of zero by code such as:
      
       *((volatile int *)reg) = 0;
      
      Compilers tend to generate "str wzr, [xx]" here.
      
      [Marc: Fixed 32bit splat]
      Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      bc45a516
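Accessors with special handling for register number 31 might look like this. The struct layout and names are hypothetical, and `reg_num` is assumed to be the 5-bit transfer register field (0 to 31) decoded from the trapped instruction.

```c
#include <stdint.h>

/* Hypothetical reduced register file: x0..x30 plus a separate SP. */
struct toy_vcpu {
    uint64_t regs[31];
    uint64_t sp;
};

/* Read a transfer register: index 31 is the zero register, never SP. */
static uint64_t toy_vcpu_get_reg(const struct toy_vcpu *v, unsigned int reg_num)
{
    return reg_num == 31 ? 0 : v->regs[reg_num];
}

/* Write a transfer register: writes to the zero register are discarded. */
static void toy_vcpu_set_reg(struct toy_vcpu *v, unsigned int reg_num, uint64_t val)
{
    if (reg_num != 31)
        v->regs[reg_num] = val;
}
```

An emulated `str wzr, [xx]` then stores zero to the device, and an emulated load into the zero register leaves SP untouched.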