Commit d8312a3f authored by Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - VHE optimizations

   - EL2 address space randomization

   - speculative execution mitigations ("variant 3a", aka execution past
     invalid privilege register access)

   - bugfixes and cleanups

  PPC:
   - improvements for the radix page fault handler for HV KVM on POWER9

  s390:
   - more kvm stat counters

   - virtio gpu plumbing

   - documentation

   - facilities improvements

  x86:
   - support for VMware magic I/O port and pseudo-PMCs

   - AMD pause loop exiting

   - support for AMD core performance extensions

   - support for synchronous register access

   - expose nVMX capabilities to userspace

   - support for Hyper-V signaling via eventfd

   - use Enlightened VMCS when running on Hyper-V

   - allow userspace to disable MWAIT/HLT/PAUSE vmexits

   - usual roundup of optimizations and nested virtualization bugfixes

  Generic:
   - API selftest infrastructure (though the only tests are for x86 as
     of now)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (174 commits)
  kvm: x86: fix a prototype warning
  kvm: selftests: add sync_regs_test
  kvm: selftests: add API testing infrastructure
  kvm: x86: fix a compile warning
  KVM: X86: Add Force Emulation Prefix for "emulate the next instruction"
  KVM: X86: Introduce handle_ud()
  KVM: vmx: unify adjacent #ifdefs
  x86: kvm: hide the unused 'cpu' variable
  KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig
  Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown"
  kvm: Add emulation for movups/movupd
  KVM: VMX: raise internal error for exception during invalid protected mode state
  KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending
  KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending
  KVM: x86: Fix misleading comments on handling pending exceptions
  KVM: x86: Rename interrupt.pending to interrupt.injected
  KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt
  x86/kvm: use Enlightened VMCS when running on Hyper-V
  x86/hyper-v: detect nested features
  x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits
  ...
@@ -1907,6 +1907,9 @@
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
Default is 0 (don't ignore, but inject #GP)
kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
Default is false (don't support).
kvm.mmu_audit= [KVM] This is a R/W parameter which allows audit
KVM MMU at runtime.
Default is 0 (off)
......
@@ -86,9 +86,12 @@ Translation table lookup with 64KB pages:
 +-------------------------------------------------> [63] TTBR0/1

-When using KVM without the Virtualization Host Extensions, the hypervisor
-maps kernel pages in EL2 at a fixed offset from the kernel VA. See the
-kern_hyp_va macro for more details.
When using KVM without the Virtualization Host Extensions, the
hypervisor maps kernel pages in EL2 at a fixed (and potentially
random) offset from the linear mapping. See the kern_hyp_va macro and
kvm_update_va_mask function for more details. MMIO devices such as
GICv2 gets mapped next to the HYP idmap page, as do vectors when
ARM64_HARDEN_EL2_VECTORS is selected for particular CPUs.

When using KVM with the Virtualization Host Extensions, no additional
mappings are created, since the host kernel runs directly in EL2.
...
00-INDEX
- this file.
amd-memory-encryption.rst
- notes on AMD Secure Encrypted Virtualization feature and SEV firmware
command description
api.txt
- KVM userspace API.
arm
- internal ABI between the kernel and HYP (for arm/arm64)
cpuid.txt
- KVM-specific cpuid leaves (x86).
devices/
...
@@ -26,6 +31,5 @@ s390-diag.txt
- Diagnose hypercall description (for IBM S/390)
timekeeping.txt
- timekeeping virtualization for x86-based architectures.
-amd-memory-encryption.txt
-  - notes on AMD Secure Encrypted Virtualization feature and SEV firmware
-    command description
vcpu-requests.rst
- internal VCPU request API
...
@@ -3480,7 +3480,7 @@ encrypted VMs.

Currently, this ioctl is used for issuing Secure Encrypted Virtualization
(SEV) commands on AMD Processors. The SEV commands are defined in
-Documentation/virtual/kvm/amd-memory-encryption.txt.
Documentation/virtual/kvm/amd-memory-encryption.rst.

4.111 KVM_MEMORY_ENCRYPT_REG_REGION
...
@@ -3516,6 +3516,38 @@ Returns: 0 on success; -1 on error

This ioctl can be used to unregister the guest memory region registered
with KVM_MEMORY_ENCRYPT_REG_REGION ioctl above.
4.113 KVM_HYPERV_EVENTFD
Capability: KVM_CAP_HYPERV_EVENTFD
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_hyperv_eventfd (in)
This ioctl (un)registers an eventfd to receive notifications from the guest on
the specified Hyper-V connection id through the SIGNAL_EVENT hypercall, without
causing a user exit. SIGNAL_EVENT hypercall with non-zero event flag number
(bits 24-31) still triggers a KVM_EXIT_HYPERV_HCALL user exit.
struct kvm_hyperv_eventfd {
__u32 conn_id;
__s32 fd;
__u32 flags;
__u32 padding[3];
};
The conn_id field should fit within 24 bits:
#define KVM_HYPERV_CONN_ID_MASK 0x00ffffff
The acceptable values for the flags field are:
#define KVM_HYPERV_EVENTFD_DEASSIGN (1 << 0)
Returns: 0 on success,
-EINVAL if conn_id or flags is outside the allowed range
-ENOENT on deassign if the conn_id isn't registered
-EEXIST on assign if the conn_id is already registered
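As a rough usage illustration only (not part of the patch), a VMM could wire up such an eventfd along the following lines; vm_fd is assumed to be an existing KVM VM file descriptor, the connection id is arbitrary, error handling is minimal, and KVM_CAP_HYPERV_EVENTFD should be probed with KVM_CHECK_EXTENSION first.

/*
 * Illustrative sketch: register an eventfd for a Hyper-V connection id on an
 * existing VM fd, so guest HvSignalEvent hypercalls do not cause user exits.
 */
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

static int hyperv_eventfd_assign(int vm_fd, __u32 conn_id)
{
	struct kvm_hyperv_eventfd hvevfd = {
		.conn_id = conn_id & KVM_HYPERV_CONN_ID_MASK,
		.fd      = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK),
		.flags   = 0,	/* set KVM_HYPERV_EVENTFD_DEASSIGN to unregister */
	};

	if (hvevfd.fd < 0)
		return -1;
	if (ioctl(vm_fd, KVM_HYPERV_EVENTFD, &hvevfd) < 0) {
		close(hvevfd.fd);	/* -EINVAL/-EEXIST as listed above */
		return -1;
	}
	return hvevfd.fd;	/* read/poll this fd to observe SIGNAL_EVENT */
}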
5. The kvm_run structure
------------------------
...
@@ -3873,7 +3905,7 @@ in userspace.

	__u64 kvm_dirty_regs;
	union {
		struct kvm_sync_regs regs;
-		char padding[1024];
		char padding[SYNC_REGS_SIZE_BYTES];
	} s;

If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access
...
@@ -4078,6 +4110,46 @@ Once this is done the KVM_REG_MIPS_VEC_* and KVM_REG_MIPS_MSA_* registers can be
accessed, and the Config5.MSAEn bit is accessible via the KVM API and also from
the guest.
6.74 KVM_CAP_SYNC_REGS
Architectures: s390, x86
Target: s390: always enabled, x86: vcpu
Parameters: none
Returns: x86: KVM_CHECK_EXTENSION returns a bit-array indicating which register
sets are supported (bitfields defined in arch/x86/include/uapi/asm/kvm.h).
As described above in the kvm_sync_regs struct info in section 5 (kvm_run):
KVM_CAP_SYNC_REGS "allow[s] userspace to access certain guest registers
without having to call SET/GET_*REGS". This reduces overhead by eliminating
repeated ioctl calls for setting and/or getting register values. This is
particularly important when userspace is making synchronous guest state
modifications, e.g. when emulating and/or intercepting instructions in
userspace.
For s390 specifics, please refer to the source code.
For x86:
- the register sets to be copied out to kvm_run are selectable
by userspace (rather than all sets being copied out for every exit).
- vcpu_events are available in addition to regs and sregs.
For x86, the 'kvm_valid_regs' field of struct kvm_run is overloaded to
function as an input bit-array field set by userspace to indicate the
specific register sets to be copied out on the next exit.
To indicate when userspace has modified values that should be copied into
the vCPU, the all architecture bitarray field, 'kvm_dirty_regs' must be set.
This is done using the same bitflags as for the 'kvm_valid_regs' field.
If the dirty bit is not set, then the register set values will not be copied
into the vCPU even if they've been modified.
Unused bitfields in the bitarrays must be set to zero.
struct kvm_sync_regs {
struct kvm_regs regs;
struct kvm_sregs sregs;
struct kvm_vcpu_events events;
};
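A minimal sketch of how an x86 VMM might use this (not part of the patch); 'run' is assumed to be the mmap()ed struct kvm_run of an existing vCPU fd, and the KVM_SYNC_X86_* names are the bitfields referenced above from arch/x86/include/uapi/asm/kvm.h.

/*
 * Illustrative sketch: ask KVM to copy registers out on exit and mark them
 * dirty after modifying them in place, avoiding KVM_GET/SET_*REGS ioctls.
 */
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int run_with_sync_regs(int vcpu_fd, struct kvm_run *run)
{
	/* Request GPRs, sregs and vcpu_events in run->s.regs on the next exit. */
	run->kvm_valid_regs = KVM_SYNC_X86_REGS | KVM_SYNC_X86_SREGS |
			      KVM_SYNC_X86_EVENTS;

	if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
		return -1;

	/* Guest state is now in run->s.regs; tweak it directly. */
	run->s.regs.regs.rip += 1;			/* e.g. skip a one-byte insn */
	run->kvm_dirty_regs = KVM_SYNC_X86_REGS;	/* copy regs back in on next KVM_RUN */

	return 0;
}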
7. Capabilities that can be enabled on VMs
------------------------------------------
...
@@ -4286,6 +4358,26 @@ enables QEMU to build error log and branch to guest kernel registered
machine check handling routine. Without this capability KVM will
branch to guests' 0x200 interrupt vector.
7.13 KVM_CAP_X86_DISABLE_EXITS
Architectures: x86
Parameters: args[0] defines which exits are disabled
Returns: 0 on success, -EINVAL when args[0] contains invalid exits
Valid bits in args[0] are
#define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0)
#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
Enabling this capability on a VM provides userspace with a way to no
longer intercept some instructions for improved latency in some
workloads, and is suggested when vCPUs are associated to dedicated
physical CPUs. More bits can be added in the future; userspace can
just pass the KVM_CHECK_EXTENSION result to KVM_ENABLE_CAP to disable
all such vmexits.
Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
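A minimal sketch (not part of the patch) of enabling this on a VM fd, following the suggestion above of feeding the KVM_CHECK_EXTENSION result back into KVM_ENABLE_CAP; vm_fd is assumed to be an existing VM file descriptor.

/*
 * Illustrative sketch: disable the MWAIT and HLT intercepts for a VM whose
 * vCPUs are pinned to dedicated physical CPUs.
 */
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int disable_mwait_hlt_exits(int vm_fd)
{
	int supported = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_X86_DISABLE_EXITS);
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = supported & (KVM_X86_DISABLE_EXITS_MWAIT |
					KVM_X86_DISABLE_EXITS_HLT),
	};

	if (supported <= 0)
		return supported;	/* capability absent (0) or error (<0) */
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}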
8. Other capabilities.
----------------------
...
@@ -4398,15 +4490,6 @@ reserved.

Both registers and addresses are 64-bits wide.
It will be possible to run 64-bit or 32-bit guest code.
-8.8 KVM_CAP_X86_GUEST_MWAIT
-
-Architectures: x86
-
-This capability indicates that guest using memory monotoring instructions
-(MWAIT/MWAITX) to stop the virtual CPU will not cause a VM exit. As such time
-spent while virtual CPU is halted in this way will then be accounted for as
-guest running time on the host (as opposed to e.g. HLT).
8.9 KVM_CAP_ARM_USER_IRQ

Architectures: arm, arm64
...
@@ -4483,3 +4566,33 @@ Parameters: none

This capability indicates if the flic device will be able to get/set the
AIS states for migration via the KVM_DEV_FLIC_AISM_ALL attribute and allows
to discover this without having to create a flic device.
8.14 KVM_CAP_S390_PSW
Architectures: s390
This capability indicates that the PSW is exposed via the kvm_run structure.
8.15 KVM_CAP_S390_GMAP
Architectures: s390
This capability indicates that the user space memory used as guest mapping can
be anywhere in the user memory address space, as long as the memory slots are
aligned and sized to a segment (1MB) boundary.
8.16 KVM_CAP_S390_COW
Architectures: s390
This capability indicates that the user space memory used as guest mapping can
use copy-on-write semantics as well as dirty pages tracking via read-only page
tables.
8.17 KVM_CAP_S390_BPB
Architectures: s390
This capability indicates that kvm will implement the interfaces to handle
reset, migration and nested KVM for branch prediction blocking. The stfle
facility 82 should not be provided to the guest without this capability.
...
@@ -23,8 +23,8 @@ This function queries the presence of KVM cpuid leafs.

function: define KVM_CPUID_FEATURES (0x40000001)
-returns : ebx, ecx, edx = 0
-          eax = and OR'ed group of (1 << flag), where each flags is:
returns : ebx, ecx
          eax = an OR'ed group of (1 << flag), where each flags is:

flag                               || value || meaning
...
@@ -66,3 +66,14 @@ KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
                                   ||       || per-cpu warps are expected in
                                   ||       || kvmclock.
------------------------------------------------------------------------------
edx = an OR'ed group of (1 << flag), where each flags is:
flag || value || meaning
==================================================================================
KVM_HINTS_DEDICATED || 0 || guest checks this feature bit to
|| || determine if there is vCPU pinning
|| || and there is no vCPU over-commitment,
|| || allowing optimizations
----------------------------------------------------------------------------------
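For illustration only (not part of the patch), a guest might probe this hint roughly as follows; the __cpuid() helper from the compiler's <cpuid.h> is assumed, and a real guest would first verify the KVM signature at leaf 0x40000000 before trusting leaf 0x40000001.

/*
 * Illustrative guest-side sketch: check the KVM_HINTS_DEDICATED bit in edx of
 * the KVM_CPUID_FEATURES leaf (0x40000001).
 */
#include <cpuid.h>
#include <stdbool.h>

static bool kvm_hints_dedicated(void)
{
	unsigned int eax, ebx, ecx, edx;

	__cpuid(0x40000001, eax, ebx, ecx, edx);
	(void)eax; (void)ebx; (void)ecx;
	return edx & (1u << 0);		/* KVM_HINTS_DEDICATED */
}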
...
@@ -6516,7 +6516,7 @@ S: Maintained
F: Documentation/networking/netvsc.txt
F: arch/x86/include/asm/mshyperv.h
F: arch/x86/include/asm/trace/hyperv.h
-F: arch/x86/include/uapi/asm/hyperv.h
F: arch/x86/include/asm/hyperv-tlfs.h
F: arch/x86/kernel/cpu/mshyperv.c
F: arch/x86/hyperv
F: drivers/hid/hid-hyperv.c
......
...@@ -70,7 +70,10 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu); ...@@ -70,7 +70,10 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high); extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high);
extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu); /* no VHE on 32-bit :( */
static inline int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) { BUG(); return 0; }
extern int __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu);
extern void __init_stage2_translation(void); extern void __init_stage2_translation(void);
......
...@@ -41,7 +41,17 @@ static inline unsigned long *vcpu_reg32(struct kvm_vcpu *vcpu, u8 reg_num) ...@@ -41,7 +41,17 @@ static inline unsigned long *vcpu_reg32(struct kvm_vcpu *vcpu, u8 reg_num)
return vcpu_reg(vcpu, reg_num); return vcpu_reg(vcpu, reg_num);
} }
unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu); unsigned long *__vcpu_spsr(struct kvm_vcpu *vcpu);
static inline unsigned long vpcu_read_spsr(struct kvm_vcpu *vcpu)
{
return *__vcpu_spsr(vcpu);
}
static inline void vcpu_write_spsr(struct kvm_vcpu *vcpu, unsigned long v)
{
*__vcpu_spsr(vcpu) = v;
}
static inline unsigned long vcpu_get_reg(struct kvm_vcpu *vcpu, static inline unsigned long vcpu_get_reg(struct kvm_vcpu *vcpu,
u8 reg_num) u8 reg_num)
...@@ -92,14 +102,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) ...@@ -92,14 +102,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
vcpu->arch.hcr = HCR_GUEST_MASK; vcpu->arch.hcr = HCR_GUEST_MASK;
} }
static inline unsigned long vcpu_get_hcr(const struct kvm_vcpu *vcpu) static inline unsigned long *vcpu_hcr(const struct kvm_vcpu *vcpu)
{
return vcpu->arch.hcr;
}
static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
{ {
vcpu->arch.hcr = hcr; return (unsigned long *)&vcpu->arch.hcr;
} }
static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu) static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
......
...@@ -155,9 +155,6 @@ struct kvm_vcpu_arch { ...@@ -155,9 +155,6 @@ struct kvm_vcpu_arch {
/* HYP trapping configuration */ /* HYP trapping configuration */
u32 hcr; u32 hcr;
/* Interrupt related fields */
u32 irq_lines; /* IRQ and FIQ levels */
/* Exception Information */ /* Exception Information */
struct kvm_vcpu_fault_info fault; struct kvm_vcpu_fault_info fault;
...@@ -315,4 +312,7 @@ static inline bool kvm_arm_harden_branch_predictor(void) ...@@ -315,4 +312,7 @@ static inline bool kvm_arm_harden_branch_predictor(void)
return false; return false;
} }
static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
#endif /* __ARM_KVM_HOST_H__ */ #endif /* __ARM_KVM_HOST_H__ */
...@@ -110,6 +110,10 @@ void __sysreg_restore_state(struct kvm_cpu_context *ctxt); ...@@ -110,6 +110,10 @@ void __sysreg_restore_state(struct kvm_cpu_context *ctxt);
void __vgic_v3_save_state(struct kvm_vcpu *vcpu); void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
void __vgic_v3_restore_state(struct kvm_vcpu *vcpu); void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
void __vgic_v3_activate_traps(struct kvm_vcpu *vcpu);
void __vgic_v3_deactivate_traps(struct kvm_vcpu *vcpu);
void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu);
void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu);
asmlinkage void __vfp_save_state(struct vfp_hard_struct *vfp); asmlinkage void __vfp_save_state(struct vfp_hard_struct *vfp);
asmlinkage void __vfp_restore_state(struct vfp_hard_struct *vfp); asmlinkage void __vfp_restore_state(struct vfp_hard_struct *vfp);
......
...@@ -28,6 +28,13 @@ ...@@ -28,6 +28,13 @@
*/ */
#define kern_hyp_va(kva) (kva) #define kern_hyp_va(kva) (kva)
/* Contrary to arm64, there is no need to generate a PC-relative address */
#define hyp_symbol_addr(s) \
({ \
typeof(s) *addr = &(s); \
addr; \
})
/* /*
* KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels. * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
*/ */
...@@ -42,8 +49,15 @@ ...@@ -42,8 +49,15 @@
#include <asm/pgalloc.h> #include <asm/pgalloc.h>
#include <asm/stage2_pgtable.h> #include <asm/stage2_pgtable.h>
/* Ensure compatibility with arm64 */
#define VA_BITS 32
int create_hyp_mappings(void *from, void *to, pgprot_t prot); int create_hyp_mappings(void *from, void *to, pgprot_t prot);
int create_hyp_io_mappings(void *from, void *to, phys_addr_t); int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
void __iomem **kaddr,
void __iomem **haddr);
int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
void **haddr);
void free_hyp_pgds(void); void free_hyp_pgds(void);
void stage2_unmap_vm(struct kvm *kvm); void stage2_unmap_vm(struct kvm *kvm);
......
...@@ -135,6 +135,15 @@ struct kvm_arch_memory_slot { ...@@ -135,6 +135,15 @@ struct kvm_arch_memory_slot {
#define KVM_REG_ARM_CRM_SHIFT 7 #define KVM_REG_ARM_CRM_SHIFT 7
#define KVM_REG_ARM_32_CRN_MASK 0x0000000000007800 #define KVM_REG_ARM_32_CRN_MASK 0x0000000000007800
#define KVM_REG_ARM_32_CRN_SHIFT 11 #define KVM_REG_ARM_32_CRN_SHIFT 11
/*
* For KVM currently all guest registers are nonsecure, but we reserve a bit
* in the encoding to distinguish secure from nonsecure for AArch32 system
* registers that are banked by security. This is 1 for the secure banked
* register, and 0 for the nonsecure banked register or if the register is
* not banked by security.
*/
#define KVM_REG_ARM_SECURE_MASK 0x0000000010000000
#define KVM_REG_ARM_SECURE_SHIFT 28
#define ARM_CP15_REG_SHIFT_MASK(x,n) \ #define ARM_CP15_REG_SHIFT_MASK(x,n) \
(((x) << KVM_REG_ARM_ ## n ## _SHIFT) & KVM_REG_ARM_ ## n ## _MASK) (((x) << KVM_REG_ARM_ ## n ## _SHIFT) & KVM_REG_ARM_ ## n ## _MASK)
......
...@@ -270,6 +270,60 @@ static bool access_gic_sre(struct kvm_vcpu *vcpu, ...@@ -270,6 +270,60 @@ static bool access_gic_sre(struct kvm_vcpu *vcpu,
return true; return true;
} }
static bool access_cntp_tval(struct kvm_vcpu *vcpu,
const struct coproc_params *p,
const struct coproc_reg *r)
{
u64 now = kvm_phys_timer_read();
u64 val;
if (p->is_write) {
val = *vcpu_reg(vcpu, p->Rt1);
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL, val + now);
} else {
val = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL);
*vcpu_reg(vcpu, p->Rt1) = val - now;
}
return true;
}
static bool access_cntp_ctl(struct kvm_vcpu *vcpu,
const struct coproc_params *p,
const struct coproc_reg *r)
{
u32 val;
if (p->is_write) {
val = *vcpu_reg(vcpu, p->Rt1);
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CTL, val);
} else {
val = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CTL);
*vcpu_reg(vcpu, p->Rt1) = val;
}
return true;
}
static bool access_cntp_cval(struct kvm_vcpu *vcpu,
const struct coproc_params *p,
const struct coproc_reg *r)
{
u64 val;
if (p->is_write) {
val = (u64)*vcpu_reg(vcpu, p->Rt2) << 32;
val |= *vcpu_reg(vcpu, p->Rt1);
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL, val);
} else {
val = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL);
*vcpu_reg(vcpu, p->Rt1) = val;
*vcpu_reg(vcpu, p->Rt2) = val >> 32;
}
return true;
}
/* /*
* We could trap ID_DFR0 and tell the guest we don't support performance * We could trap ID_DFR0 and tell the guest we don't support performance
* monitoring. Unfortunately the patch to make the kernel check ID_DFR0 was * monitoring. Unfortunately the patch to make the kernel check ID_DFR0 was
...@@ -423,10 +477,17 @@ static const struct coproc_reg cp15_regs[] = { ...@@ -423,10 +477,17 @@ static const struct coproc_reg cp15_regs[] = {
{ CRn(13), CRm( 0), Op1( 0), Op2( 4), is32, { CRn(13), CRm( 0), Op1( 0), Op2( 4), is32,
NULL, reset_unknown, c13_TID_PRIV }, NULL, reset_unknown, c13_TID_PRIV },
/* CNTP */
{ CRm64(14), Op1( 2), is64, access_cntp_cval},
/* CNTKCTL: swapped by interrupt.S. */ /* CNTKCTL: swapped by interrupt.S. */
{ CRn(14), CRm( 1), Op1( 0), Op2( 0), is32, { CRn(14), CRm( 1), Op1( 0), Op2( 0), is32,
NULL, reset_val, c14_CNTKCTL, 0x00000000 }, NULL, reset_val, c14_CNTKCTL, 0x00000000 },
/* CNTP */
{ CRn(14), CRm( 2), Op1( 0), Op2( 0), is32, access_cntp_tval },
{ CRn(14), CRm( 2), Op1( 0), Op2( 1), is32, access_cntp_ctl },
/* The Configuration Base Address Register. */ /* The Configuration Base Address Register. */
{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar}, { CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
}; };
......
...@@ -142,7 +142,7 @@ unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num) ...@@ -142,7 +142,7 @@ unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num)
/* /*
* Return the SPSR for the current mode of the virtual CPU. * Return the SPSR for the current mode of the virtual CPU.
*/ */
unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu) unsigned long *__vcpu_spsr(struct kvm_vcpu *vcpu)
{ {
unsigned long mode = *vcpu_cpsr(vcpu) & MODE_MASK; unsigned long mode = *vcpu_cpsr(vcpu) & MODE_MASK;
switch (mode) { switch (mode) {
...@@ -174,5 +174,5 @@ unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu) ...@@ -174,5 +174,5 @@ unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu)
*/ */
void kvm_inject_vabt(struct kvm_vcpu *vcpu) void kvm_inject_vabt(struct kvm_vcpu *vcpu)
{ {
vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VA); *vcpu_hcr(vcpu) |= HCR_VA;
} }
...@@ -9,7 +9,6 @@ KVM=../../../../virt/kvm ...@@ -9,7 +9,6 @@ KVM=../../../../virt/kvm
CFLAGS_ARMV7VE :=$(call cc-option, -march=armv7ve) CFLAGS_ARMV7VE :=$(call cc-option, -march=armv7ve)
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
......
...@@ -44,7 +44,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu, u32 *fpexc_host) ...@@ -44,7 +44,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu, u32 *fpexc_host)
isb(); isb();
} }
write_sysreg(vcpu->arch.hcr | vcpu->arch.irq_lines, HCR); write_sysreg(vcpu->arch.hcr, HCR);
/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
write_sysreg(HSTR_T(15), HSTR); write_sysreg(HSTR_T(15), HSTR);
write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR); write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR);
...@@ -90,18 +90,18 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu) ...@@ -90,18 +90,18 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu)
static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu)
{ {
if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
__vgic_v3_save_state(vcpu); __vgic_v3_save_state(vcpu);
else __vgic_v3_deactivate_traps(vcpu);
__vgic_v2_save_state(vcpu); }
} }
static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
{ {
if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
__vgic_v3_activate_traps(vcpu);
__vgic_v3_restore_state(vcpu); __vgic_v3_restore_state(vcpu);
else }
__vgic_v2_restore_state(vcpu);
} }
static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu) static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
...@@ -154,7 +154,7 @@ static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu) ...@@ -154,7 +154,7 @@ static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
return true; return true;
} }
int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
{ {
struct kvm_cpu_context *host_ctxt; struct kvm_cpu_context *host_ctxt;
struct kvm_cpu_context *guest_ctxt; struct kvm_cpu_context *guest_ctxt;
......
...@@ -922,6 +922,22 @@ config HARDEN_BRANCH_PREDICTOR ...@@ -922,6 +922,22 @@ config HARDEN_BRANCH_PREDICTOR
If unsure, say Y. If unsure, say Y.
config HARDEN_EL2_VECTORS
bool "Harden EL2 vector mapping against system register leak" if EXPERT
default y
help
Speculation attacks against some high-performance processors can
be used to leak privileged information such as the vector base
register, resulting in a potential defeat of the EL2 layout
randomization.
This config option will map the vectors to a fixed location,
independent of the EL2 code mapping, so that revealing VBAR_EL2
to an attacker does not give away any extra information. This
only gets enabled on affected CPUs.
If unsure, say Y.
menuconfig ARMV8_DEPRECATED menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions" bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT depends on COMPAT
......
...@@ -5,6 +5,8 @@ ...@@ -5,6 +5,8 @@
#include <asm/cpucaps.h> #include <asm/cpucaps.h>
#include <asm/insn.h> #include <asm/insn.h>
#define ARM64_CB_PATCH ARM64_NCAPS
#ifndef __ASSEMBLY__ #ifndef __ASSEMBLY__
#include <linux/init.h> #include <linux/init.h>
...@@ -22,12 +24,19 @@ struct alt_instr { ...@@ -22,12 +24,19 @@ struct alt_instr {
u8 alt_len; /* size of new instruction(s), <= orig_len */ u8 alt_len; /* size of new instruction(s), <= orig_len */
}; };
typedef void (*alternative_cb_t)(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst);
void __init apply_alternatives_all(void); void __init apply_alternatives_all(void);
void apply_alternatives(void *start, size_t length); void apply_alternatives(void *start, size_t length);
#define ALTINSTR_ENTRY(feature) \ #define ALTINSTR_ENTRY(feature,cb) \
" .word 661b - .\n" /* label */ \ " .word 661b - .\n" /* label */ \
" .if " __stringify(cb) " == 0\n" \
" .word 663f - .\n" /* new instruction */ \ " .word 663f - .\n" /* new instruction */ \
" .else\n" \
" .word " __stringify(cb) "- .\n" /* callback */ \
" .endif\n" \
" .hword " __stringify(feature) "\n" /* feature bit */ \ " .hword " __stringify(feature) "\n" /* feature bit */ \
" .byte 662b-661b\n" /* source len */ \ " .byte 662b-661b\n" /* source len */ \
" .byte 664f-663f\n" /* replacement len */ " .byte 664f-663f\n" /* replacement len */
...@@ -45,15 +54,18 @@ void apply_alternatives(void *start, size_t length); ...@@ -45,15 +54,18 @@ void apply_alternatives(void *start, size_t length);
* but most assemblers die if insn1 or insn2 have a .inst. This should * but most assemblers die if insn1 or insn2 have a .inst. This should
* be fixed in a binutils release posterior to 2.25.51.0.2 (anything * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
* containing commit 4e4d08cf7399b606 or c1baaddf8861). * containing commit 4e4d08cf7399b606 or c1baaddf8861).
*
* Alternatives with callbacks do not generate replacement instructions.
*/ */
#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled) \ #define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled, cb) \
".if "__stringify(cfg_enabled)" == 1\n" \ ".if "__stringify(cfg_enabled)" == 1\n" \
"661:\n\t" \ "661:\n\t" \
oldinstr "\n" \ oldinstr "\n" \
"662:\n" \ "662:\n" \
".pushsection .altinstructions,\"a\"\n" \ ".pushsection .altinstructions,\"a\"\n" \
ALTINSTR_ENTRY(feature) \ ALTINSTR_ENTRY(feature,cb) \
".popsection\n" \ ".popsection\n" \
" .if " __stringify(cb) " == 0\n" \
".pushsection .altinstr_replacement, \"a\"\n" \ ".pushsection .altinstr_replacement, \"a\"\n" \
"663:\n\t" \ "663:\n\t" \
newinstr "\n" \ newinstr "\n" \
...@@ -61,11 +73,17 @@ void apply_alternatives(void *start, size_t length); ...@@ -61,11 +73,17 @@ void apply_alternatives(void *start, size_t length);
".popsection\n\t" \ ".popsection\n\t" \
".org . - (664b-663b) + (662b-661b)\n\t" \ ".org . - (664b-663b) + (662b-661b)\n\t" \
".org . - (662b-661b) + (664b-663b)\n" \ ".org . - (662b-661b) + (664b-663b)\n" \
".else\n\t" \
"663:\n\t" \
"664:\n\t" \
".endif\n" \
".endif\n" ".endif\n"
#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...) \ #define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...) \
__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg)) __ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg), 0)
#define ALTERNATIVE_CB(oldinstr, cb) \
__ALTERNATIVE_CFG(oldinstr, "NOT_AN_INSTRUCTION", ARM64_CB_PATCH, 1, cb)
#else #else
#include <asm/assembler.h> #include <asm/assembler.h>
...@@ -132,6 +150,14 @@ void apply_alternatives(void *start, size_t length); ...@@ -132,6 +150,14 @@ void apply_alternatives(void *start, size_t length);
661: 661:
.endm .endm
.macro alternative_cb cb
.set .Lasm_alt_mode, 0
.pushsection .altinstructions, "a"
altinstruction_entry 661f, \cb, ARM64_CB_PATCH, 662f-661f, 0
.popsection
661:
.endm
/* /*
* Provide the other half of the alternative code sequence. * Provide the other half of the alternative code sequence.
*/ */
...@@ -157,6 +183,13 @@ void apply_alternatives(void *start, size_t length); ...@@ -157,6 +183,13 @@ void apply_alternatives(void *start, size_t length);
.org . - (662b-661b) + (664b-663b) .org . - (662b-661b) + (664b-663b)
.endm .endm
/*
* Callback-based alternative epilogue
*/
.macro alternative_cb_end
662:
.endm
/* /*
* Provides a trivial alternative or default sequence consisting solely * Provides a trivial alternative or default sequence consisting solely
* of NOPs. The number of NOPs is chosen automatically to match the * of NOPs. The number of NOPs is chosen automatically to match the
......
...@@ -32,7 +32,7 @@ ...@@ -32,7 +32,7 @@
#define ARM64_HAS_VIRT_HOST_EXTN 11 #define ARM64_HAS_VIRT_HOST_EXTN 11
#define ARM64_WORKAROUND_CAVIUM_27456 12 #define ARM64_WORKAROUND_CAVIUM_27456 12
#define ARM64_HAS_32BIT_EL0 13 #define ARM64_HAS_32BIT_EL0 13
#define ARM64_HYP_OFFSET_LOW 14 #define ARM64_HARDEN_EL2_VECTORS 14
#define ARM64_MISMATCHED_CACHE_LINE_SIZE 15 #define ARM64_MISMATCHED_CACHE_LINE_SIZE 15
#define ARM64_HAS_NO_FPSIMD 16 #define ARM64_HAS_NO_FPSIMD 16
#define ARM64_WORKAROUND_REPEAT_TLBI 17 #define ARM64_WORKAROUND_REPEAT_TLBI 17
......
...@@ -70,6 +70,7 @@ enum aarch64_insn_imm_type { ...@@ -70,6 +70,7 @@ enum aarch64_insn_imm_type {
AARCH64_INSN_IMM_6, AARCH64_INSN_IMM_6,
AARCH64_INSN_IMM_S, AARCH64_INSN_IMM_S,
AARCH64_INSN_IMM_R, AARCH64_INSN_IMM_R,
AARCH64_INSN_IMM_N,
AARCH64_INSN_IMM_MAX AARCH64_INSN_IMM_MAX
}; };
...@@ -314,6 +315,11 @@ __AARCH64_INSN_FUNCS(eor, 0x7F200000, 0x4A000000) ...@@ -314,6 +315,11 @@ __AARCH64_INSN_FUNCS(eor, 0x7F200000, 0x4A000000)
__AARCH64_INSN_FUNCS(eon, 0x7F200000, 0x4A200000) __AARCH64_INSN_FUNCS(eon, 0x7F200000, 0x4A200000)
__AARCH64_INSN_FUNCS(ands, 0x7F200000, 0x6A000000) __AARCH64_INSN_FUNCS(ands, 0x7F200000, 0x6A000000)
__AARCH64_INSN_FUNCS(bics, 0x7F200000, 0x6A200000) __AARCH64_INSN_FUNCS(bics, 0x7F200000, 0x6A200000)
__AARCH64_INSN_FUNCS(and_imm, 0x7F800000, 0x12000000)
__AARCH64_INSN_FUNCS(orr_imm, 0x7F800000, 0x32000000)
__AARCH64_INSN_FUNCS(eor_imm, 0x7F800000, 0x52000000)
__AARCH64_INSN_FUNCS(ands_imm, 0x7F800000, 0x72000000)
__AARCH64_INSN_FUNCS(extr, 0x7FA00000, 0x13800000)
__AARCH64_INSN_FUNCS(b, 0xFC000000, 0x14000000) __AARCH64_INSN_FUNCS(b, 0xFC000000, 0x14000000)
__AARCH64_INSN_FUNCS(bl, 0xFC000000, 0x94000000) __AARCH64_INSN_FUNCS(bl, 0xFC000000, 0x94000000)
__AARCH64_INSN_FUNCS(cbz, 0x7F000000, 0x34000000) __AARCH64_INSN_FUNCS(cbz, 0x7F000000, 0x34000000)
...@@ -423,6 +429,16 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst, ...@@ -423,6 +429,16 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst,
int shift, int shift,
enum aarch64_insn_variant variant, enum aarch64_insn_variant variant,
enum aarch64_insn_logic_type type); enum aarch64_insn_logic_type type);
u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
enum aarch64_insn_variant variant,
enum aarch64_insn_register Rn,
enum aarch64_insn_register Rd,
u64 imm);
u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
enum aarch64_insn_register Rm,
enum aarch64_insn_register Rn,
enum aarch64_insn_register Rd,
u8 lsb);
u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base, u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
enum aarch64_insn_prfm_type type, enum aarch64_insn_prfm_type type,
enum aarch64_insn_prfm_target target, enum aarch64_insn_prfm_target target,
......
...@@ -25,6 +25,7 @@ ...@@ -25,6 +25,7 @@
/* Hyp Configuration Register (HCR) bits */ /* Hyp Configuration Register (HCR) bits */
#define HCR_TEA (UL(1) << 37) #define HCR_TEA (UL(1) << 37)
#define HCR_TERR (UL(1) << 36) #define HCR_TERR (UL(1) << 36)
#define HCR_TLOR (UL(1) << 35)
#define HCR_E2H (UL(1) << 34) #define HCR_E2H (UL(1) << 34)
#define HCR_ID (UL(1) << 33) #define HCR_ID (UL(1) << 33)
#define HCR_CD (UL(1) << 32) #define HCR_CD (UL(1) << 32)
...@@ -64,6 +65,7 @@ ...@@ -64,6 +65,7 @@
/* /*
* The bits we set in HCR: * The bits we set in HCR:
* TLOR: Trap LORegion register accesses
* RW: 64bit by default, can be overridden for 32bit VMs * RW: 64bit by default, can be overridden for 32bit VMs
* TAC: Trap ACTLR * TAC: Trap ACTLR
* TSC: Trap SMC * TSC: Trap SMC
...@@ -81,9 +83,9 @@ ...@@ -81,9 +83,9 @@
*/ */
#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \ #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \ HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW) HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
HCR_FMO | HCR_IMO)
#define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
#define HCR_INT_OVERRIDE (HCR_FMO | HCR_IMO)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H) #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
/* TCR_EL2 Registers bits */ /* TCR_EL2 Registers bits */
......
...@@ -33,6 +33,7 @@ ...@@ -33,6 +33,7 @@
#define KVM_ARM64_DEBUG_DIRTY_SHIFT 0 #define KVM_ARM64_DEBUG_DIRTY_SHIFT 0
#define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT) #define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
/* Translate a kernel address of @sym into its equivalent linear mapping */
#define kvm_ksym_ref(sym) \ #define kvm_ksym_ref(sym) \
({ \ ({ \
void *val = &sym; \ void *val = &sym; \
...@@ -57,7 +58,9 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu); ...@@ -57,7 +58,9 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high); extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high);
extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu); extern int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu);
extern int __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu);
extern u64 __vgic_v3_get_ich_vtr_el2(void); extern u64 __vgic_v3_get_ich_vtr_el2(void);
extern u64 __vgic_v3_read_vmcr(void); extern u64 __vgic_v3_read_vmcr(void);
...@@ -70,6 +73,20 @@ extern u32 __init_stage2_translation(void); ...@@ -70,6 +73,20 @@ extern u32 __init_stage2_translation(void);
extern void __qcom_hyp_sanitize_btac_predictors(void); extern void __qcom_hyp_sanitize_btac_predictors(void);
#else /* __ASSEMBLY__ */
.macro get_host_ctxt reg, tmp
adr_l \reg, kvm_host_cpu_state
mrs \tmp, tpidr_el2
add \reg, \reg, \tmp
.endm
.macro get_vcpu_ptr vcpu, ctxt
get_host_ctxt \ctxt, \vcpu
ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
kern_hyp_va \vcpu
.endm
#endif #endif
#endif /* __ARM_KVM_ASM_H__ */ #endif /* __ARM_KVM_ASM_H__ */
...@@ -26,13 +26,15 @@ ...@@ -26,13 +26,15 @@
#include <asm/esr.h> #include <asm/esr.h>
#include <asm/kvm_arm.h> #include <asm/kvm_arm.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmio.h> #include <asm/kvm_mmio.h>
#include <asm/ptrace.h> #include <asm/ptrace.h>
#include <asm/cputype.h> #include <asm/cputype.h>
#include <asm/virt.h> #include <asm/virt.h>
unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num); unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num);
unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu); unsigned long vcpu_read_spsr32(const struct kvm_vcpu *vcpu);
void vcpu_write_spsr32(struct kvm_vcpu *vcpu, unsigned long v);
bool kvm_condition_valid32(const struct kvm_vcpu *vcpu); bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
void kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr); void kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr);
...@@ -45,6 +47,11 @@ void kvm_inject_undef32(struct kvm_vcpu *vcpu); ...@@ -45,6 +47,11 @@ void kvm_inject_undef32(struct kvm_vcpu *vcpu);
void kvm_inject_dabt32(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_dabt32(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_inject_pabt32(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_pabt32(struct kvm_vcpu *vcpu, unsigned long addr);
static inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
{
return !(vcpu->arch.hcr_el2 & HCR_RW);
}
static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
{ {
vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS; vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
...@@ -59,16 +66,19 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) ...@@ -59,16 +66,19 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
vcpu->arch.hcr_el2 &= ~HCR_RW; vcpu->arch.hcr_el2 &= ~HCR_RW;
}
static inline unsigned long vcpu_get_hcr(struct kvm_vcpu *vcpu) /*
{ * TID3: trap feature register accesses that we virtualise.
return vcpu->arch.hcr_el2; * For now this is conditional, since no AArch32 feature regs
* are currently virtualised.
*/
if (!vcpu_el1_is_32bit(vcpu))
vcpu->arch.hcr_el2 |= HCR_TID3;
} }
static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
{ {
vcpu->arch.hcr_el2 = hcr; return (unsigned long *)&vcpu->arch.hcr_el2;
} }
static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr) static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
...@@ -81,11 +91,27 @@ static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) ...@@ -81,11 +91,27 @@ static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc; return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
} }
static inline unsigned long *vcpu_elr_el1(const struct kvm_vcpu *vcpu) static inline unsigned long *__vcpu_elr_el1(const struct kvm_vcpu *vcpu)
{ {
return (unsigned long *)&vcpu_gp_regs(vcpu)->elr_el1; return (unsigned long *)&vcpu_gp_regs(vcpu)->elr_el1;
} }
static inline unsigned long vcpu_read_elr_el1(const struct kvm_vcpu *vcpu)
{
if (vcpu->arch.sysregs_loaded_on_cpu)
return read_sysreg_el1(elr);
else
return *__vcpu_elr_el1(vcpu);
}
static inline void vcpu_write_elr_el1(const struct kvm_vcpu *vcpu, unsigned long v)
{
if (vcpu->arch.sysregs_loaded_on_cpu)
write_sysreg_el1(v, elr);
else
*__vcpu_elr_el1(vcpu) = v;
}
static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu) static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
{ {
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate; return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate;
...@@ -135,13 +161,28 @@ static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num, ...@@ -135,13 +161,28 @@ static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val; vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val;
} }
/* Get vcpu SPSR for current mode */ static inline unsigned long vcpu_read_spsr(const struct kvm_vcpu *vcpu)
static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
{ {
if (vcpu_mode_is_32bit(vcpu)) if (vcpu_mode_is_32bit(vcpu))
return vcpu_spsr32(vcpu); return vcpu_read_spsr32(vcpu);
return (unsigned long *)&vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1]; if (vcpu->arch.sysregs_loaded_on_cpu)
return read_sysreg_el1(spsr);
else
return vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1];
}
static inline void vcpu_write_spsr(struct kvm_vcpu *vcpu, unsigned long v)
{
if (vcpu_mode_is_32bit(vcpu)) {
vcpu_write_spsr32(vcpu, v);
return;
}
if (vcpu->arch.sysregs_loaded_on_cpu)
write_sysreg_el1(v, spsr);
else
vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1] = v;
} }
static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu) static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu)
...@@ -282,15 +323,18 @@ static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) ...@@ -282,15 +323,18 @@ static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu) static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
{ {
return vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
} }
static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu) static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
{ {
if (vcpu_mode_is_32bit(vcpu)) if (vcpu_mode_is_32bit(vcpu)) {
*vcpu_cpsr(vcpu) |= COMPAT_PSR_E_BIT; *vcpu_cpsr(vcpu) |= COMPAT_PSR_E_BIT;
else } else {
vcpu_sys_reg(vcpu, SCTLR_EL1) |= (1 << 25); u64 sctlr = vcpu_read_sys_reg(vcpu, SCTLR_EL1);
sctlr |= (1 << 25);
vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr);
}
} }
static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu) static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu)
...@@ -298,7 +342,7 @@ static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu) ...@@ -298,7 +342,7 @@ static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu)
if (vcpu_mode_is_32bit(vcpu)) if (vcpu_mode_is_32bit(vcpu))
return !!(*vcpu_cpsr(vcpu) & COMPAT_PSR_E_BIT); return !!(*vcpu_cpsr(vcpu) & COMPAT_PSR_E_BIT);
return !!(vcpu_sys_reg(vcpu, SCTLR_EL1) & (1 << 25)); return !!(vcpu_read_sys_reg(vcpu, SCTLR_EL1) & (1 << 25));
} }
static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu, static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
......
...@@ -272,9 +272,6 @@ struct kvm_vcpu_arch { ...@@ -272,9 +272,6 @@ struct kvm_vcpu_arch {
/* IO related fields */ /* IO related fields */
struct kvm_decode mmio_decode; struct kvm_decode mmio_decode;
/* Interrupt related fields */
u64 irq_lines; /* IRQ and FIQ levels */
/* Cache some mmu pages needed inside spinlock regions */ /* Cache some mmu pages needed inside spinlock regions */
struct kvm_mmu_memory_cache mmu_page_cache; struct kvm_mmu_memory_cache mmu_page_cache;
...@@ -287,10 +284,25 @@ struct kvm_vcpu_arch { ...@@ -287,10 +284,25 @@ struct kvm_vcpu_arch {
/* Virtual SError ESR to restore when HCR_EL2.VSE is set */ /* Virtual SError ESR to restore when HCR_EL2.VSE is set */
u64 vsesr_el2; u64 vsesr_el2;
/* True when deferrable sysregs are loaded on the physical CPU,
* see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
bool sysregs_loaded_on_cpu;
}; };
#define vcpu_gp_regs(v) (&(v)->arch.ctxt.gp_regs) #define vcpu_gp_regs(v) (&(v)->arch.ctxt.gp_regs)
#define vcpu_sys_reg(v,r) ((v)->arch.ctxt.sys_regs[(r)])
/*
* Only use __vcpu_sys_reg if you know you want the memory backed version of a
* register, and not the one most recently accessed by a running VCPU. For
* example, for userspace access or for system registers that are never context
* switched, but only emulated.
*/
#define __vcpu_sys_reg(v,r) ((v)->arch.ctxt.sys_regs[(r)])
u64 vcpu_read_sys_reg(struct kvm_vcpu *vcpu, int reg);
void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg);
/* /*
* CP14 and CP15 live in the same array, as they are backed by the * CP14 and CP15 live in the same array, as they are backed by the
* same system registers. * same system registers.
...@@ -298,14 +310,6 @@ struct kvm_vcpu_arch { ...@@ -298,14 +310,6 @@ struct kvm_vcpu_arch {
#define vcpu_cp14(v,r) ((v)->arch.ctxt.copro[(r)]) #define vcpu_cp14(v,r) ((v)->arch.ctxt.copro[(r)])
#define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r)]) #define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r)])
#ifdef CONFIG_CPU_BIG_ENDIAN
#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r))
#define vcpu_cp15_64_low(v,r) vcpu_cp15((v),(r) + 1)
#else
#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r) + 1)
#define vcpu_cp15_64_low(v,r) vcpu_cp15((v),(r))
#endif
struct kvm_vm_stat { struct kvm_vm_stat {
ulong remote_tlb_flush; ulong remote_tlb_flush;
}; };
...@@ -358,10 +362,15 @@ int kvm_perf_teardown(void); ...@@ -358,10 +362,15 @@ int kvm_perf_teardown(void);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
void __kvm_set_tpidr_el2(u64 tpidr_el2);
DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr, static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
unsigned long hyp_stack_ptr, unsigned long hyp_stack_ptr,
unsigned long vector_ptr) unsigned long vector_ptr)
{ {
u64 tpidr_el2;
/* /*
* Call initialization code, and switch to the full blown HYP code. * Call initialization code, and switch to the full blown HYP code.
* If the cpucaps haven't been finalized yet, something has gone very * If the cpucaps haven't been finalized yet, something has gone very
...@@ -370,6 +379,16 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr, ...@@ -370,6 +379,16 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
*/ */
BUG_ON(!static_branch_likely(&arm64_const_caps_ready)); BUG_ON(!static_branch_likely(&arm64_const_caps_ready));
__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr); __kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
/*
* Calculate the raw per-cpu offset without a translation from the
* kernel's mapping to the linear mapping, and store it in tpidr_el2
* so that we can use adr_l to access per-cpu variables in EL2.
*/
tpidr_el2 = (u64)this_cpu_ptr(&kvm_host_cpu_state)
- (u64)kvm_ksym_ref(kvm_host_cpu_state);
kvm_call_hyp(__kvm_set_tpidr_el2, tpidr_el2);
} }
static inline void kvm_arch_hardware_unsetup(void) {} static inline void kvm_arch_hardware_unsetup(void) {}
...@@ -416,6 +435,13 @@ static inline void kvm_arm_vhe_guest_enter(void) ...@@ -416,6 +435,13 @@ static inline void kvm_arm_vhe_guest_enter(void)
static inline void kvm_arm_vhe_guest_exit(void) static inline void kvm_arm_vhe_guest_exit(void)
{ {
local_daif_restore(DAIF_PROCCTX_NOIRQ); local_daif_restore(DAIF_PROCCTX_NOIRQ);
/*
* When we exit from the guest we change a number of CPU configuration
* parameters, such as traps. Make sure these changes take effect
* before running the host or additional guests.
*/
isb();
} }
static inline bool kvm_arm_harden_branch_predictor(void) static inline bool kvm_arm_harden_branch_predictor(void)
...@@ -423,4 +449,7 @@ static inline bool kvm_arm_harden_branch_predictor(void) ...@@ -423,4 +449,7 @@ static inline bool kvm_arm_harden_branch_predictor(void)
return cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR); return cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR);
} }
void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu);
void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu);
#endif /* __ARM64_KVM_HOST_H__ */ #endif /* __ARM64_KVM_HOST_H__ */
...@@ -120,37 +120,38 @@ typeof(orig) * __hyp_text fname(void) \ ...@@ -120,37 +120,38 @@ typeof(orig) * __hyp_text fname(void) \
return val; \ return val; \
} }
void __vgic_v2_save_state(struct kvm_vcpu *vcpu);
void __vgic_v2_restore_state(struct kvm_vcpu *vcpu);
int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
void __vgic_v3_save_state(struct kvm_vcpu *vcpu); void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
void __vgic_v3_restore_state(struct kvm_vcpu *vcpu); void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
void __vgic_v3_activate_traps(struct kvm_vcpu *vcpu);
void __vgic_v3_deactivate_traps(struct kvm_vcpu *vcpu);
void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu);
void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu);
int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu); int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
void __timer_enable_traps(struct kvm_vcpu *vcpu); void __timer_enable_traps(struct kvm_vcpu *vcpu);
void __timer_disable_traps(struct kvm_vcpu *vcpu); void __timer_disable_traps(struct kvm_vcpu *vcpu);
void __sysreg_save_host_state(struct kvm_cpu_context *ctxt); void __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt);
void __sysreg_restore_host_state(struct kvm_cpu_context *ctxt); void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt);
void __sysreg_save_guest_state(struct kvm_cpu_context *ctxt); void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt);
void __sysreg_restore_guest_state(struct kvm_cpu_context *ctxt); void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt);
void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt);
void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt);
void __sysreg32_save_state(struct kvm_vcpu *vcpu); void __sysreg32_save_state(struct kvm_vcpu *vcpu);
void __sysreg32_restore_state(struct kvm_vcpu *vcpu); void __sysreg32_restore_state(struct kvm_vcpu *vcpu);
void __debug_save_state(struct kvm_vcpu *vcpu, void __debug_switch_to_guest(struct kvm_vcpu *vcpu);
struct kvm_guest_debug_arch *dbg, void __debug_switch_to_host(struct kvm_vcpu *vcpu);
struct kvm_cpu_context *ctxt);
void __debug_restore_state(struct kvm_vcpu *vcpu,
struct kvm_guest_debug_arch *dbg,
struct kvm_cpu_context *ctxt);
void __debug_cond_save_host_state(struct kvm_vcpu *vcpu);
void __debug_cond_restore_host_state(struct kvm_vcpu *vcpu);
void __fpsimd_save_state(struct user_fpsimd_state *fp_regs); void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs); void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
bool __fpsimd_enabled(void); bool __fpsimd_enabled(void);
void activate_traps_vhe_load(struct kvm_vcpu *vcpu);
void deactivate_traps_vhe_put(void);
u64 __guest_enter(struct kvm_vcpu *vcpu, struct kvm_cpu_context *host_ctxt); u64 __guest_enter(struct kvm_vcpu *vcpu, struct kvm_cpu_context *host_ctxt);
void __noreturn __hyp_do_panic(unsigned long, ...); void __noreturn __hyp_do_panic(unsigned long, ...);
......
...@@ -69,9 +69,6 @@ ...@@ -69,9 +69,6 @@
* mappings, and none of this applies in that case. * mappings, and none of this applies in that case.
*/ */
#define HYP_PAGE_OFFSET_HIGH_MASK ((UL(1) << VA_BITS) - 1)
#define HYP_PAGE_OFFSET_LOW_MASK ((UL(1) << (VA_BITS - 1)) - 1)
#ifdef __ASSEMBLY__ #ifdef __ASSEMBLY__
#include <asm/alternative.h> #include <asm/alternative.h>
...@@ -81,28 +78,19 @@ ...@@ -81,28 +78,19 @@
* Convert a kernel VA into a HYP VA. * Convert a kernel VA into a HYP VA.
* reg: VA to be converted. * reg: VA to be converted.
* *
* This generates the following sequences: * The actual code generation takes place in kvm_update_va_mask, and
* - High mask: * the instructions below are only there to reserve the space and
* and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK * perform the register allocation (kvm_update_va_mask uses the
* nop * specific registers encoded in the instructions).
* - Low mask:
* and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
* and x0, x0, #HYP_PAGE_OFFSET_LOW_MASK
* - VHE:
* nop
* nop
*
* The "low mask" version works because the mask is a strict subset of
* the "high mask", hence performing the first mask for nothing.
* Should be completely invisible on any viable CPU.
*/ */
.macro kern_hyp_va reg .macro kern_hyp_va reg
alternative_if_not ARM64_HAS_VIRT_HOST_EXTN alternative_cb kvm_update_va_mask
and \reg, \reg, #HYP_PAGE_OFFSET_HIGH_MASK and \reg, \reg, #1 /* mask with va_mask */
alternative_else_nop_endif ror \reg, \reg, #1 /* rotate to the first tag bit */
alternative_if ARM64_HYP_OFFSET_LOW add \reg, \reg, #0 /* insert the low 12 bits of the tag */
and \reg, \reg, #HYP_PAGE_OFFSET_LOW_MASK add \reg, \reg, #0, lsl 12 /* insert the top 12 bits of the tag */
alternative_else_nop_endif ror \reg, \reg, #63 /* rotate back */
alternative_cb_end
.endm .endm
#else #else
...@@ -113,23 +101,43 @@ alternative_else_nop_endif ...@@ -113,23 +101,43 @@ alternative_else_nop_endif
#include <asm/mmu_context.h> #include <asm/mmu_context.h>
#include <asm/pgtable.h> #include <asm/pgtable.h>
void kvm_update_va_mask(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst);
static inline unsigned long __kern_hyp_va(unsigned long v) static inline unsigned long __kern_hyp_va(unsigned long v)
{ {
asm volatile(ALTERNATIVE("and %0, %0, %1", asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
"nop", "ror %0, %0, #1\n"
ARM64_HAS_VIRT_HOST_EXTN) "add %0, %0, #0\n"
: "+r" (v) "add %0, %0, #0, lsl 12\n"
: "i" (HYP_PAGE_OFFSET_HIGH_MASK)); "ror %0, %0, #63\n",
asm volatile(ALTERNATIVE("nop", kvm_update_va_mask)
"and %0, %0, %1", : "+r" (v));
ARM64_HYP_OFFSET_LOW)
: "+r" (v)
: "i" (HYP_PAGE_OFFSET_LOW_MASK));
return v; return v;
} }
#define kern_hyp_va(v) ((typeof(v))(__kern_hyp_va((unsigned long)(v)))) #define kern_hyp_va(v) ((typeof(v))(__kern_hyp_va((unsigned long)(v))))
/*
* Obtain the PC-relative address of a kernel symbol
* s: symbol
*
* The goal of this macro is to return a symbol's address based on a
* PC-relative computation, as opposed to a loading the VA from a
* constant pool or something similar. This works well for HYP, as an
* absolute VA is guaranteed to be wrong. Only use this if trying to
* obtain the address of a symbol (i.e. not something you obtained by
* following a pointer).
*/
#define hyp_symbol_addr(s) \
({ \
typeof(s) *addr; \
asm("adrp %0, %1\n" \
"add %0, %0, :lo12:%1\n" \
: "=r" (addr) : "S" (&s)); \
addr; \
})
/* /*
* We currently only support a 40bit IPA. * We currently only support a 40bit IPA.
*/ */
...@@ -140,7 +148,11 @@ static inline unsigned long __kern_hyp_va(unsigned long v) ...@@ -140,7 +148,11 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
#include <asm/stage2_pgtable.h> #include <asm/stage2_pgtable.h>
int create_hyp_mappings(void *from, void *to, pgprot_t prot); int create_hyp_mappings(void *from, void *to, pgprot_t prot);
int create_hyp_io_mappings(void *from, void *to, phys_addr_t); int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
void __iomem **kaddr,
void __iomem **haddr);
int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
void **haddr);
void free_hyp_pgds(void);

void stage2_unmap_vm(struct kvm *kvm);
...@@ -249,7 +261,7 @@ struct kvm;

static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
{
	return (vcpu_read_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
}

static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
...@@ -348,36 +360,95 @@ static inline unsigned int kvm_get_vmid_bits(void)
	return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
}
#ifdef CONFIG_KVM_INDIRECT_VECTORS
/*
* EL2 vectors can be mapped and rerouted in a number of ways,
* depending on the kernel configuration and CPU present:
*
* - If the CPU has the ARM64_HARDEN_BRANCH_PREDICTOR cap, the
* hardening sequence is placed in one of the vector slots, which is
* executed before jumping to the real vectors.
*
* - If the CPU has both the ARM64_HARDEN_EL2_VECTORS cap and the
* ARM64_HARDEN_BRANCH_PREDICTOR cap, the slot containing the
* hardening sequence is mapped next to the idmap page, and executed
* before jumping to the real vectors.
*
* - If the CPU only has the ARM64_HARDEN_EL2_VECTORS cap, then an
* empty slot is selected, mapped next to the idmap page, and
* executed before jumping to the real vectors.
*
* Note that ARM64_HARDEN_EL2_VECTORS is somewhat incompatible with
* VHE, as we don't have hypervisor-specific mappings. If the system
* is VHE and yet selects this capability, it will be ignored.
*/
#include <asm/mmu.h>
extern void *__kvm_bp_vect_base;
extern int __kvm_harden_el2_vector_slot;
static inline void *kvm_get_hyp_vector(void)
{
	struct bp_hardening_data *data = arm64_get_bp_hardening_data();
	void *vect = kern_hyp_va(kvm_ksym_ref(__kvm_hyp_vector));
	int slot = -1;

	if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) && data->fn) {
		vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs_start));
		slot = data->hyp_vectors_slot;
	}

	if (this_cpu_has_cap(ARM64_HARDEN_EL2_VECTORS) && !has_vhe()) {
		vect = __kvm_bp_vect_base;
		if (slot == -1)
			slot = __kvm_harden_el2_vector_slot;
	}

	if (slot != -1)
		vect += slot * SZ_2K;

	return vect;
}
/* This is only called on a !VHE system */
static inline int kvm_map_vectors(void)
{
	/*
	 * HBP  = ARM64_HARDEN_BRANCH_PREDICTOR
	 * HEL2 = ARM64_HARDEN_EL2_VECTORS
	 *
	 * !HBP + !HEL2 -> use direct vectors
	 *  HBP + !HEL2 -> use hardened vectors in place
	 * !HBP +  HEL2 -> allocate one vector slot and use exec mapping
	 *  HBP +  HEL2 -> use hardened vectors and use exec mapping
	 */
	if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) {
		__kvm_bp_vect_base = kvm_ksym_ref(__bp_harden_hyp_vecs_start);
		__kvm_bp_vect_base = kern_hyp_va(__kvm_bp_vect_base);
	}

	if (cpus_have_const_cap(ARM64_HARDEN_EL2_VECTORS)) {
		phys_addr_t vect_pa = __pa_symbol(__bp_harden_hyp_vecs_start);
		unsigned long size = (__bp_harden_hyp_vecs_end -
				      __bp_harden_hyp_vecs_start);

		/*
		 * Always allocate a spare vector slot, as we don't
		 * know yet which CPUs have a BP hardening slot that
		 * we can reuse.
		 */
		__kvm_harden_el2_vector_slot = atomic_inc_return(&arm64_el2_vector_last_slot);
		BUG_ON(__kvm_harden_el2_vector_slot >= BP_HARDEN_EL2_SLOTS);
		return create_hyp_exec_mappings(vect_pa, size,
						&__kvm_bp_vect_base);
	}

	return 0;
}
#else
static inline void *kvm_get_hyp_vector(void)
{
	return kern_hyp_va(kvm_ksym_ref(__kvm_hyp_vector));
}

static inline int kvm_map_vectors(void)
......
...@@ -21,6 +21,8 @@
#define USER_ASID_FLAG	(UL(1) << USER_ASID_BIT)
#define TTBR_ASID_MASK	(UL(0xffff) << 48)

#define BP_HARDEN_EL2_SLOTS 4

#ifndef __ASSEMBLY__

typedef struct {
...@@ -49,9 +51,13 @@ struct bp_hardening_data {
	bp_hardening_cb_t	fn;
};

#if (defined(CONFIG_HARDEN_BRANCH_PREDICTOR) ||	\
     defined(CONFIG_HARDEN_EL2_VECTORS))
extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[];
extern atomic_t arm64_el2_vector_last_slot;
#endif  /* CONFIG_HARDEN_BRANCH_PREDICTOR || CONFIG_HARDEN_EL2_VECTORS */

#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);

static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
......
...@@ -288,6 +288,12 @@
#define SYS_MAIR_EL1			sys_reg(3, 0, 10, 2, 0)
#define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)

#define SYS_LORSA_EL1			sys_reg(3, 0, 10, 4, 0)
#define SYS_LOREA_EL1			sys_reg(3, 0, 10, 4, 1)
#define SYS_LORN_EL1			sys_reg(3, 0, 10, 4, 2)
#define SYS_LORC_EL1			sys_reg(3, 0, 10, 4, 3)
#define SYS_LORID_EL1			sys_reg(3, 0, 10, 4, 7)

#define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1, 1)
......
...@@ -55,9 +55,7 @@ arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
arm64-obj-$(CONFIG_ARM_SDE_INTERFACE)	+= sdei.o

arm64-obj-$(CONFIG_KVM_INDIRECT_VECTORS)+= bpi.o

obj-y					+= $(arm64-obj-y) vdso/ probes/
obj-m					+= $(arm64-obj-m)
......
...@@ -107,32 +107,53 @@ static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnp
	return insn;
}
static void patch_alternative(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst)
{
__le32 *replptr;
int i;
replptr = ALT_REPL_PTR(alt);
for (i = 0; i < nr_inst; i++) {
u32 insn;
insn = get_alt_insn(alt, origptr + i, replptr + i);
updptr[i] = cpu_to_le32(insn);
}
}
static void __apply_alternatives(void *alt_region, bool use_linear_alias)
{
	struct alt_instr *alt;
	struct alt_region *region = alt_region;
	__le32 *origptr, *updptr;
	alternative_cb_t alt_cb;

	for (alt = region->begin; alt < region->end; alt++) {
		int nr_inst;

		/* Use ARM64_CB_PATCH as an unconditional patch */
		if (alt->cpufeature < ARM64_CB_PATCH &&
		    !cpus_have_cap(alt->cpufeature))
			continue;

		if (alt->cpufeature == ARM64_CB_PATCH)
			BUG_ON(alt->alt_len != 0);
		else
			BUG_ON(alt->alt_len != alt->orig_len);

		pr_info_once("patching kernel code\n");

		origptr = ALT_ORIG_PTR(alt);
		updptr = use_linear_alias ? lm_alias(origptr) : origptr;
		nr_inst = alt->orig_len / AARCH64_INSN_SIZE;

		if (alt->cpufeature < ARM64_CB_PATCH)
			alt_cb = patch_alternative;
		else
			alt_cb = ALT_REPL_PTR(alt);

		alt_cb(alt, origptr, updptr, nr_inst);

		flush_icache_range((uintptr_t)origptr,
				   (uintptr_t)(origptr + nr_inst));
......
...@@ -138,6 +138,7 @@ int main(void)
  DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
  DEFINE(VCPU_FPEXC32_EL2,	offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2]));
  DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));
  DEFINE(HOST_CONTEXT_VCPU,	offsetof(struct kvm_cpu_context, __hyp_running_vcpu));
#endif
#ifdef CONFIG_CPU_PM
  DEFINE(CPU_SUSPEND_SZ,	sizeof(struct cpu_suspend_ctx));
......
...@@ -19,42 +19,61 @@
#include <linux/linkage.h>
#include <linux/arm-smccc.h>

#include <asm/alternative.h>
#include <asm/mmu.h>

.macro hyp_ventry
	.align 7
1:	.rept 27
	nop
	.endr
/*
 * The default sequence is to directly branch to the KVM vectors,
 * using the computed offset. This applies for VHE as well as
 * !ARM64_HARDEN_EL2_VECTORS.
 *
 * For ARM64_HARDEN_EL2_VECTORS configurations, this gets replaced
 * with:
 *
 * stp	x0, x1, [sp, #-16]!
 * movz	x0, #(addr & 0xffff)
 * movk	x0, #((addr >> 16) & 0xffff), lsl #16
 * movk	x0, #((addr >> 32) & 0xffff), lsl #32
 * br	x0
 *
 * Where addr = kern_hyp_va(__kvm_hyp_vector) + vector-offset + 4.
 * See kvm_patch_vector_branch for details.
 */
alternative_cb	kvm_patch_vector_branch
	b	__kvm_hyp_vector + (1b - 0b)
	nop
	nop
	nop
	nop
alternative_cb_end
.endm

.macro generate_vectors
0:
	.rept 16
	hyp_ventry
	.endr
	.org 0b + SZ_2K		// Safety measure
.endm

	.text
	.pushsection	.hyp.text, "ax"

	.align	11
ENTRY(__bp_harden_hyp_vecs_start)
	.rept BP_HARDEN_EL2_SLOTS
	generate_vectors
	.endr
ENTRY(__bp_harden_hyp_vecs_end)

	.popsection

ENTRY(__qcom_hyp_sanitize_link_stack_start)
	stp	x29, x30, [sp, #-16]!
	.rept	16
......
...@@ -78,6 +78,8 @@ cpu_enable_trap_ctr_access(const struct arm64_cpu_capabilities *__unused)
	config_sctlr_el1(SCTLR_EL1_UCT, 0);
}

atomic_t arm64_el2_vector_last_slot = ATOMIC_INIT(-1);

#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
#include <asm/mmu_context.h>
#include <asm/cacheflush.h>
...@@ -108,7 +110,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
				      const char *hyp_vecs_start,
				      const char *hyp_vecs_end)
{
	static DEFINE_SPINLOCK(bp_lock);
	int cpu, slot = -1;
...@@ -121,10 +122,8 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
	}

	if (slot == -1) {
		slot = atomic_inc_return(&arm64_el2_vector_last_slot);
		BUG_ON(slot >= BP_HARDEN_EL2_SLOTS);
		__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
	}
...@@ -348,6 +347,10 @@ static const struct arm64_cpu_capabilities arm64_bp_harden_list[] = {
#endif

#ifndef ERRATA_MIDR_ALL_VERSIONS
#define ERRATA_MIDR_ALL_VERSIONS(x)	MIDR_ALL_VERSIONS(x)
#endif

const struct arm64_cpu_capabilities arm64_errata[] = {
#if	defined(CONFIG_ARM64_ERRATUM_826319) || \
	defined(CONFIG_ARM64_ERRATUM_827319) || \
...@@ -500,6 +503,18 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
		.capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
		ERRATA_MIDR_RANGE_LIST(qcom_bp_harden_cpus),
	},
#endif
#ifdef CONFIG_HARDEN_EL2_VECTORS
{
.desc = "Cortex-A57 EL2 vector hardening",
.capability = ARM64_HARDEN_EL2_VECTORS,
ERRATA_MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
},
{
.desc = "Cortex-A72 EL2 vector hardening",
.capability = ARM64_HARDEN_EL2_VECTORS,
ERRATA_MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
},
#endif
	{
	}
......
...@@ -838,19 +838,6 @@ static bool has_no_hw_prefetch(const struct arm64_cpu_capabilities *entry, int _
		MIDR_CPU_VAR_REV(1, MIDR_REVISION_MASK));
}
static bool hyp_offset_low(const struct arm64_cpu_capabilities *entry,
int __unused)
{
phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
/*
* Activate the lower HYP offset only if:
* - the idmap doesn't clash with it,
* - the kernel is not running at EL2.
*/
return idmap_addr > GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
}
static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unused)
{
	u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
...@@ -1121,12 +1108,6 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
		.field_pos = ID_AA64PFR0_EL0_SHIFT,
		.min_field_value = ID_AA64PFR0_EL0_32BIT_64BIT,
	},
{
.desc = "Reduced HYP mapping offset",
.capability = ARM64_HYP_OFFSET_LOW,
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.matches = hyp_offset_low,
},
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
	{
		.desc = "Kernel page table isolation (KPTI)",
......
...@@ -577,6 +577,13 @@ set_hcr:
7:
	msr	mdcr_el2, x3		// Configure debug traps
/* LORegions */
mrs x1, id_aa64mmfr1_el1
ubfx x0, x1, #ID_AA64MMFR1_LOR_SHIFT, 4
cbz x0, 1f
msr_s SYS_LORC_EL1, xzr
1:
	/* Stage-2 translation */
	msr	vttbr_el2, xzr
......
...@@ -35,6 +35,7 @@
#define AARCH64_INSN_SF_BIT	BIT(31)
#define AARCH64_INSN_N_BIT	BIT(22)

#define AARCH64_INSN_LSL_12	BIT(22)

static int aarch64_insn_encoding_class[] = {
	AARCH64_INSN_CLS_UNKNOWN,
...@@ -343,6 +344,10 @@ static int __kprobes aarch64_get_imm_shift_mask(enum aarch64_insn_imm_type type,
		mask = BIT(6) - 1;
		shift = 16;
		break;
case AARCH64_INSN_IMM_N:
mask = 1;
shift = 22;
break;
	default:
		return -EINVAL;
	}
...@@ -899,9 +904,18 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
		return AARCH64_BREAK_FAULT;
	}
/* We can't encode more than a 24bit value (12bit + 12bit shift) */
if (imm & ~(BIT(24) - 1))
goto out;
/* If we have something in the top 12 bits... */
	if (imm & ~(SZ_4K - 1)) {
		/* ... and in the low 12 bits -> error */
		if (imm & (SZ_4K - 1))
			goto out;

		imm >>= 12;
		insn |= AARCH64_INSN_LSL_12;
	}

	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, dst);
...@@ -909,6 +923,10 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, src);

	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_12, insn, imm);
out:
pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
return AARCH64_BREAK_FAULT;
}
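The rule above boils down to: a plain 12-bit immediate always encodes, and a value using the upper half of the 24-bit range encodes only if its low 12 bits are zero (it is then emitted with LSL #12). A small standalone sketch of that check (the helper name is illustrative, not a kernel API):

#include <stdbool.h>
#include <stdint.h>

/* Which immediates fit an AArch64 ADD/SUB (immediate)? Illustration only. */
static bool add_sub_imm_is_encodable(uint32_t imm)
{
	if (imm & ~((1u << 24) - 1))	/* more than 24 bits: never */
		return false;
	if (imm & ~0xfffu)		/* something in the top 12 bits... */
		return !(imm & 0xfffu);	/* ...only if the low 12 bits are clear */
	return true;			/* plain 12-bit immediate */
}

/* Examples: 0xabc -> true, 0xabc000 -> true (emitted with LSL #12), 0xabc001 -> false. */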
u32 aarch64_insn_gen_bitfield(enum aarch64_insn_register dst,
...@@ -1481,3 +1499,171 @@ pstate_check_t * const aarch32_opcode_cond_checks[16] = {
	__check_hi, __check_ls, __check_ge, __check_lt,
	__check_gt, __check_le, __check_al, __check_al
};
static bool range_of_ones(u64 val)
{
/* Doesn't handle full ones or full zeroes */
u64 sval = val >> __ffs64(val);
/* One of Sean Eron Anderson's bithack tricks */
return ((sval + 1) & (sval)) == 0;
}
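A standalone re-implementation of the bithack, for illustration only, with a couple of example values (__builtin_ctzll plays the role of __ffs64 here):

#include <assert.h>
#include <stdint.h>

/* True when val (non-zero, not all-ones) is a single contiguous run of ones. */
static int is_range_of_ones(uint64_t val)
{
	uint64_t sval = val >> __builtin_ctzll(val);	/* shift out trailing zeroes */

	return ((sval + 1) & sval) == 0;		/* power-of-two-minus-one check */
}

int main(void)
{
	assert(is_range_of_ones(0x0ff0));	/* 0..01..10..0: accepted */
	assert(!is_range_of_ones(0x0f0f));	/* two runs of ones: rejected */
	return 0;
}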
static u32 aarch64_encode_immediate(u64 imm,
enum aarch64_insn_variant variant,
u32 insn)
{
unsigned int immr, imms, n, ones, ror, esz, tmp;
u64 mask = ~0UL;
/* Can't encode full zeroes or full ones */
if (!imm || !~imm)
return AARCH64_BREAK_FAULT;
switch (variant) {
case AARCH64_INSN_VARIANT_32BIT:
if (upper_32_bits(imm))
return AARCH64_BREAK_FAULT;
esz = 32;
break;
case AARCH64_INSN_VARIANT_64BIT:
insn |= AARCH64_INSN_SF_BIT;
esz = 64;
break;
default:
pr_err("%s: unknown variant encoding %d\n", __func__, variant);
return AARCH64_BREAK_FAULT;
}
/*
* Inverse of Replicate(). Try to spot a repeating pattern
* with a pow2 stride.
*/
for (tmp = esz / 2; tmp >= 2; tmp /= 2) {
u64 emask = BIT(tmp) - 1;
if ((imm & emask) != ((imm >> tmp) & emask))
break;
esz = tmp;
mask = emask;
}
/* N is only set if we're encoding a 64bit value */
n = esz == 64;
/* Trim imm to the element size */
imm &= mask;
/* That's how many ones we need to encode */
ones = hweight64(imm);
/*
* imms is set to (ones - 1), prefixed with a string of ones
* and a zero if they fit. Cap it to 6 bits.
*/
imms = ones - 1;
imms |= 0xf << ffs(esz);
imms &= BIT(6) - 1;
/* Compute the rotation */
if (range_of_ones(imm)) {
/*
* Pattern: 0..01..10..0
*
* Compute how many rotate we need to align it right
*/
ror = __ffs64(imm);
} else {
/*
* Pattern: 0..01..10..01..1
*
* Fill the unused top bits with ones, and check if
* the result is a valid immediate (all ones with a
		 * contiguous range of zeroes).
*/
imm |= ~mask;
if (!range_of_ones(~imm))
return AARCH64_BREAK_FAULT;
/*
* Compute the rotation to get a continuous set of
* ones, with the first bit set at position 0
*/
ror = fls(~imm);
}
/*
* immr is the number of bits we need to rotate back to the
* original set of ones. Note that this is relative to the
* element size...
*/
immr = (esz - ror) % esz;
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, n);
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_R, insn, immr);
return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, imms);
}
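As a worked example of the field computation (a simplified standalone sketch that only handles the non-wrapping "single run of ones" case; names are illustrative, not kernel API): for 0x00ff00ff00ff00ff the repeating element is 16 bits wide, it contains 8 ones with no rotation, which gives N=0, immr=0, imms=0b100111.

#include <assert.h>
#include <stdint.h>

struct bitmask_fields { unsigned int n, immr, imms; };

static struct bitmask_fields encode_simple(uint64_t imm)
{
	unsigned int esz = 64, ones, ror;
	uint64_t mask = ~0ULL;

	/* Find the smallest repeating element size (inverse of Replicate()). */
	for (unsigned int tmp = esz / 2; tmp >= 2; tmp /= 2) {
		uint64_t emask = (1ULL << tmp) - 1;

		if ((imm & emask) != ((imm >> tmp) & emask))
			break;
		esz = tmp;
		mask = emask;
	}

	imm &= mask;					/* trim to one element */
	ones = (unsigned int)__builtin_popcountll(imm);
	ror = (unsigned int)__builtin_ctzll(imm);	/* where the run starts */

	return (struct bitmask_fields){
		.n    = (esz == 64),
		.immr = (esz - ror) % esz,
		.imms = ((ones - 1) | (0xf << __builtin_ffs(esz))) & 0x3f,
	};
}

int main(void)
{
	struct bitmask_fields f = encode_simple(0x00ff00ff00ff00ffULL);

	assert(f.n == 0 && f.immr == 0 && f.imms == 0x27);
	return 0;
}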
u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
enum aarch64_insn_variant variant,
enum aarch64_insn_register Rn,
enum aarch64_insn_register Rd,
u64 imm)
{
u32 insn;
switch (type) {
case AARCH64_INSN_LOGIC_AND:
insn = aarch64_insn_get_and_imm_value();
break;
case AARCH64_INSN_LOGIC_ORR:
insn = aarch64_insn_get_orr_imm_value();
break;
case AARCH64_INSN_LOGIC_EOR:
insn = aarch64_insn_get_eor_imm_value();
break;
case AARCH64_INSN_LOGIC_AND_SETFLAGS:
insn = aarch64_insn_get_ands_imm_value();
break;
default:
pr_err("%s: unknown logical encoding %d\n", __func__, type);
return AARCH64_BREAK_FAULT;
}
insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
return aarch64_encode_immediate(imm, variant, insn);
}
u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
enum aarch64_insn_register Rm,
enum aarch64_insn_register Rn,
enum aarch64_insn_register Rd,
u8 lsb)
{
u32 insn;
insn = aarch64_insn_get_extr_value();
switch (variant) {
case AARCH64_INSN_VARIANT_32BIT:
if (lsb > 31)
return AARCH64_BREAK_FAULT;
break;
case AARCH64_INSN_VARIANT_64BIT:
if (lsb > 63)
return AARCH64_BREAK_FAULT;
insn |= AARCH64_INSN_SF_BIT;
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, 1);
break;
default:
pr_err("%s: unknown variant encoding %d\n", __func__, variant);
return AARCH64_BREAK_FAULT;
}
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, lsb);
insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
}
...@@ -57,6 +57,9 @@ config KVM_ARM_PMU
	  Adds support for a virtual Performance Monitoring Unit (PMU) in
	  virtual machines.

config KVM_INDIRECT_VECTORS
       def_bool KVM && (HARDEN_BRANCH_PREDICTOR || HARDEN_EL2_VECTORS)

source drivers/vhost/Kconfig

endif # VIRTUALIZATION
...@@ -16,7 +16,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o

kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
......
...@@ -46,7 +46,9 @@ static DEFINE_PER_CPU(u32, mdcr_el2);
 */
static void save_guest_debug_regs(struct kvm_vcpu *vcpu)
{
	u64 val = vcpu_read_sys_reg(vcpu, MDSCR_EL1);

	vcpu->arch.guest_debug_preserved.mdscr_el1 = val;

	trace_kvm_arm_set_dreg32("Saved MDSCR_EL1",
				vcpu->arch.guest_debug_preserved.mdscr_el1);
...@@ -54,10 +56,12 @@ static void save_guest_debug_regs(struct kvm_vcpu *vcpu)

static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
{
	u64 val = vcpu->arch.guest_debug_preserved.mdscr_el1;

	vcpu_write_sys_reg(vcpu, val, MDSCR_EL1);

	trace_kvm_arm_set_dreg32("Restored MDSCR_EL1",
				vcpu_read_sys_reg(vcpu, MDSCR_EL1));
}

/**
...@@ -108,6 +112,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
{
	bool trap_debug = !(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY);
	unsigned long mdscr;

	trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);
...@@ -152,9 +157,13 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
	 */
	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
		*vcpu_cpsr(vcpu) |= DBG_SPSR_SS;
		mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
		mdscr |= DBG_MDSCR_SS;
		vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
	} else {
		mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
		mdscr &= ~DBG_MDSCR_SS;
		vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
	}

	trace_kvm_arm_set_dreg32("SPSR_EL2", *vcpu_cpsr(vcpu));
...@@ -170,7 +179,9 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
	 */
	if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW) {
		/* Enable breakpoints/watchpoints */
		mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
		mdscr |= DBG_MDSCR_MDE;
		vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);

		vcpu->arch.debug_ptr = &vcpu->arch.external_debug_state;
		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
...@@ -193,8 +204,12 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
	if (trap_debug)
		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;

	/* If KDE or MDE are set, perform a full save/restore cycle. */
	if (vcpu_read_sys_reg(vcpu, MDSCR_EL1) & (DBG_MDSCR_KDE | DBG_MDSCR_MDE))
		vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;

	trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
	trace_kvm_arm_set_dreg32("MDSCR_EL1", vcpu_read_sys_reg(vcpu, MDSCR_EL1));
}

void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
......
...@@ -117,7 +117,6 @@ CPU_BE(	orr	x4, x4, #SCTLR_ELx_EE)
	/* Set the stack and new vectors */
	kern_hyp_va	x1
	mov	sp, x1
kern_hyp_va x2
	msr	vbar_el2, x2

	/* copy tpidr_el1 into tpidr_el2 for use by HYP */
......
...@@ -7,10 +7,10 @@ ccflags-y += -fno-stack-protector -DDISABLE_BRANCH_PROFILING
KVM=../../../../virt/kvm

obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += vgic-v2-cpuif-proxy.o
obj-$(CONFIG_KVM_ARM_HOST) += sysreg-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += debug-sr.o
obj-$(CONFIG_KVM_ARM_HOST) += entry.o
......
...@@ -66,11 +66,6 @@
	default:	write_debug(ptr[0], reg, 0);			\
	}

static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1)
{
	u64 reg;
...@@ -103,11 +98,7 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1)
	dsb(nsh);
}

static void __hyp_text __debug_restore_spe_nvhe(u64 pmscr_el1)
{
	if (!pmscr_el1)
		return;
...@@ -119,16 +110,13 @@ static void __hyp_text __debug_restore_spe(u64 pmscr_el1)
	write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
}

static void __hyp_text __debug_save_state(struct kvm_vcpu *vcpu,
					  struct kvm_guest_debug_arch *dbg,
					  struct kvm_cpu_context *ctxt)
{
	u64 aa64dfr0;
	int brps, wrps;

	aa64dfr0 = read_sysreg(id_aa64dfr0_el1);
	brps = (aa64dfr0 >> 12) & 0xf;
	wrps = (aa64dfr0 >> 20) & 0xf;
...@@ -141,16 +129,13 @@ void __hyp_text __debug_save_state(struct kvm_vcpu *vcpu,
	ctxt->sys_regs[MDCCINT_EL1] = read_sysreg(mdccint_el1);
}

static void __hyp_text __debug_restore_state(struct kvm_vcpu *vcpu,
					     struct kvm_guest_debug_arch *dbg,
					     struct kvm_cpu_context *ctxt)
{
	u64 aa64dfr0;
	int brps, wrps;

	aa64dfr0 = read_sysreg(id_aa64dfr0_el1);
	brps = (aa64dfr0 >> 12) & 0xf;
...@@ -164,27 +149,54 @@ void __hyp_text __debug_restore_state(struct kvm_vcpu *vcpu,
	write_sysreg(ctxt->sys_regs[MDCCINT_EL1], mdccint_el1);
}

void __hyp_text __debug_switch_to_guest(struct kvm_vcpu *vcpu)
{
	struct kvm_cpu_context *host_ctxt;
	struct kvm_cpu_context *guest_ctxt;
	struct kvm_guest_debug_arch *host_dbg;
	struct kvm_guest_debug_arch *guest_dbg;

	/*
	 * Non-VHE: Disable and flush SPE data generation
	 * VHE: The vcpu can run, but it can't hide.
	 */
	if (!has_vhe())
		__debug_save_spe_nvhe(&vcpu->arch.host_debug_state.pmscr_el1);

	if (!(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY))
		return;

	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
	guest_ctxt = &vcpu->arch.ctxt;
	host_dbg = &vcpu->arch.host_debug_state.regs;
	guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);

	__debug_save_state(vcpu, host_dbg, host_ctxt);
	__debug_restore_state(vcpu, guest_dbg, guest_ctxt);
}

void __hyp_text __debug_switch_to_host(struct kvm_vcpu *vcpu)
{
	struct kvm_cpu_context *host_ctxt;
	struct kvm_cpu_context *guest_ctxt;
	struct kvm_guest_debug_arch *host_dbg;
	struct kvm_guest_debug_arch *guest_dbg;

	if (!has_vhe())
		__debug_restore_spe_nvhe(vcpu->arch.host_debug_state.pmscr_el1);

	if (!(vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY))
		return;

	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
	guest_ctxt = &vcpu->arch.ctxt;
	host_dbg = &vcpu->arch.host_debug_state.regs;
	guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);

	__debug_save_state(vcpu, guest_dbg, guest_ctxt);
	__debug_restore_state(vcpu, host_dbg, host_ctxt);

	vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY;
}

u32 __hyp_text __kvm_get_mdcr_el2(void)
......
...@@ -62,9 +62,6 @@ ENTRY(__guest_enter)
	// Store the host regs
	save_callee_saved_regs x1
// Store host_ctxt and vcpu for use at exit time
stp x1, x0, [sp, #-16]!
	add	x18, x0, #VCPU_CONTEXT

	// Restore guest regs x0-x17
...@@ -118,8 +115,7 @@ ENTRY(__guest_exit)
	// Store the guest regs x19-x29, lr
	save_callee_saved_regs x1
	get_host_ctxt	x2, x3

	// Now restore the host regs
	restore_callee_saved_regs x2
......
...@@ -55,15 +55,9 @@ ENTRY(__vhe_hyp_call)
ENDPROC(__vhe_hyp_call)

el1_sync:				// Guest trapped into EL2
	mrs	x0, esr_el2
	lsr	x0, x0, #ESR_ELx_EC_SHIFT
	cmp	x0, #ESR_ELx_EC_HVC64
	ccmp	x0, #ESR_ELx_EC_HVC32, #4, ne
	b.ne	el1_trap
...@@ -117,10 +111,14 @@ el1_hvc_guest:
	eret

el1_trap:
	get_vcpu_ptr	x1, x0

	mrs	x0, esr_el2
	lsr	x0, x0, #ESR_ELx_EC_SHIFT
	/*
	 * x0: ESR_EC
	 * x1: vcpu pointer
	 */

	/*
	 * We trap the first access to the FP/SIMD to save the host context
...@@ -137,18 +135,18 @@ alternative_else_nop_endif
	b	__guest_exit

el1_irq:
	get_vcpu_ptr	x1, x0
	mov	x0, #ARM_EXCEPTION_IRQ
	b	__guest_exit

el1_error:
	get_vcpu_ptr	x1, x0
	mov	x0, #ARM_EXCEPTION_EL1_SERROR
	b	__guest_exit

el2_error:
	ldp	x0, x1, [sp], #16

	/*
	 * Only two possibilities:
	 * 1) Either we come from the exit path, having just unmasked
...@@ -180,14 +178,7 @@ ENTRY(__hyp_do_panic)
ENDPROC(__hyp_do_panic)

ENTRY(__hyp_panic)
	get_host_ctxt x0, x1
	b	hyp_panic
ENDPROC(__hyp_panic)
...@@ -206,32 +197,43 @@ ENDPROC(\label)
	invalid_vector	el2h_sync_invalid
	invalid_vector	el2h_irq_invalid
	invalid_vector	el2h_fiq_invalid
	invalid_vector	el1_fiq_invalid

	.ltorg

	.align 11
.macro valid_vect target
.align 7
stp x0, x1, [sp, #-16]!
b \target
.endm
.macro invalid_vect target
.align 7
b \target
ldp x0, x1, [sp], #16
b \target
.endm
ENTRY(__kvm_hyp_vector)
	invalid_vect	el2t_sync_invalid	// Synchronous EL2t
	invalid_vect	el2t_irq_invalid	// IRQ EL2t
	invalid_vect	el2t_fiq_invalid	// FIQ EL2t
	invalid_vect	el2t_error_invalid	// Error EL2t

	invalid_vect	el2h_sync_invalid	// Synchronous EL2h
	invalid_vect	el2h_irq_invalid	// IRQ EL2h
	invalid_vect	el2h_fiq_invalid	// FIQ EL2h
	valid_vect	el2_error		// Error EL2h

	valid_vect	el1_sync		// Synchronous 64-bit EL1
	valid_vect	el1_irq			// IRQ 64-bit EL1
	invalid_vect	el1_fiq_invalid		// FIQ 64-bit EL1
	valid_vect	el1_error		// Error 64-bit EL1

	valid_vect	el1_sync		// Synchronous 32-bit EL1
	valid_vect	el1_irq			// IRQ 32-bit EL1
	invalid_vect	el1_fiq_invalid		// FIQ 32-bit EL1
	valid_vect	el1_error		// Error 32-bit EL1
ENDPROC(__kvm_hyp_vector)
...@@ -33,49 +33,22 @@ static bool __hyp_text __fpsimd_enabled_nvhe(void)
	return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP);
}
static bool fpsimd_enabled_vhe(void)
{
	return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN);
}

/* Save the 32-bit only FPSIMD system register state */
static void __hyp_text __fpsimd_save_fpexc32(struct kvm_vcpu *vcpu)
{
	if (!vcpu_el1_is_32bit(vcpu))
		return;

	vcpu->arch.ctxt.sys_regs[FPEXC32_EL2] = read_sysreg(fpexc32_el2);
}

static void __hyp_text __activate_traps_fpsimd32(struct kvm_vcpu *vcpu)
{
	/*
	 * We are about to set CPTR_EL2.TFP to trap all floating point
	 * register accesses to EL2, however, the ARM ARM clearly states that
...@@ -85,23 +58,17 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
	 * If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to
	 * it will cause an exception.
	 */
	if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd()) {
		write_sysreg(1 << 30, fpexc32_el2);
		isb();
	}
}

static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu)
{
	/* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */
	write_sysreg(1 << 15, hstr_el2);
	/*
	 * Make sure we trap PMU access from EL0 to EL2. Also sanitize
	 * PMSELR_EL0 to make sure it never contains the cycle
...@@ -111,19 +78,56 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
	write_sysreg(0, pmselr_el0);
	write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
	write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
}
static void __hyp_text __deactivate_traps_common(void)
{
	write_sysreg(0, hstr_el2);
	write_sysreg(0, pmuserenr_el0);
}

static void activate_traps_vhe(struct kvm_vcpu *vcpu)
{
	u64 val;

	val = read_sysreg(cpacr_el1);
	val |= CPACR_EL1_TTA;
	val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN);
	write_sysreg(val, cpacr_el1);

	write_sysreg(kvm_get_hyp_vector(), vbar_el1);
}
static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
{
u64 val;
__activate_traps_common(vcpu);
val = CPTR_EL2_DEFAULT;
val |= CPTR_EL2_TTA | CPTR_EL2_TFP | CPTR_EL2_TZ;
write_sysreg(val, cptr_el2);
}
static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
{
u64 hcr = vcpu->arch.hcr_el2;
write_sysreg(hcr, hcr_el2);
if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (hcr & HCR_VSE))
write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
__activate_traps_fpsimd32(vcpu);
if (has_vhe())
activate_traps_vhe(vcpu);
else
__activate_traps_nvhe(vcpu);
}
static void deactivate_traps_vhe(void)
{
extern char vectors[]; /* kernel exception vectors */
	write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
	write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
	write_sysreg(vectors, vbar_el1);
...@@ -133,6 +137,8 @@ static void __hyp_text __deactivate_traps_nvhe(void)
{
	u64 mdcr_el2 = read_sysreg(mdcr_el2);

	__deactivate_traps_common();

	mdcr_el2 &= MDCR_EL2_HPMN_MASK;
	mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
...@@ -141,10 +147,6 @@ static void __hyp_text __deactivate_traps_nvhe(void)
	write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
}
static hyp_alternate_select(__deactivate_traps_arch,
__deactivate_traps_nvhe, __deactivate_traps_vhe,
ARM64_HAS_VIRT_HOST_EXTN);
static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
{
	/*
...@@ -156,14 +158,32 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
	if (vcpu->arch.hcr_el2 & HCR_VSE)
		vcpu->arch.hcr_el2 = read_sysreg(hcr_el2);

	if (has_vhe())
		deactivate_traps_vhe();
	else
__deactivate_traps_nvhe();
}
void activate_traps_vhe_load(struct kvm_vcpu *vcpu)
{
__activate_traps_common(vcpu);
}
void deactivate_traps_vhe_put(void)
{
u64 mdcr_el2 = read_sysreg(mdcr_el2);
mdcr_el2 &= MDCR_EL2_HPMN_MASK |
MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT |
MDCR_EL2_TPMS;
write_sysreg(mdcr_el2, mdcr_el2);
__deactivate_traps_common();
} }
static void __hyp_text __activate_vm(struct kvm *kvm)
{
	write_sysreg(kvm->arch.vttbr, vttbr_el2);
}
...@@ -172,29 +192,22 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu)
	write_sysreg(0, vttbr_el2);
}

/* Save VGICv3 state on non-VHE systems */
static void __hyp_text __hyp_vgic_save_state(struct kvm_vcpu *vcpu)
{
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
		__vgic_v3_save_state(vcpu);
		__vgic_v3_deactivate_traps(vcpu);
	}
}

/* Restore VGICv3 state on non-VHE systems */
static void __hyp_text __hyp_vgic_restore_state(struct kvm_vcpu *vcpu)
{
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
		__vgic_v3_activate_traps(vcpu);
		__vgic_v3_restore_state(vcpu);
	}
}

static bool __hyp_text __true_value(void)
...@@ -305,54 +318,27 @@ static bool __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
	}
}

/*
 * Return true when we were able to fixup the guest exit and should return to
 * the guest, false when we should restore the host state and return to the
 * main run loop.
 */
static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
{
	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);

	/*
	 * We're using the raw exception code in order to only process
	 * the trap if no SError is pending. We will come back to the
	 * same PC once the SError has been injected, and replay the
	 * trapping instruction.
	 */
	if (*exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
		return true;

	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
	    *exit_code == ARM_EXCEPTION_TRAP) {
		bool valid;

		valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
...@@ -366,9 +352,9 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)

			if (ret == 1) {
				if (__skip_instr(vcpu))
					return true;
				else
					*exit_code = ARM_EXCEPTION_TRAP;
			}

			if (ret == -1) {
...@@ -380,29 +366,112 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
				 */
				if (!__skip_instr(vcpu))
					*vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
				*exit_code = ARM_EXCEPTION_EL1_SERROR;
			}
		}
	}

	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
	    *exit_code == ARM_EXCEPTION_TRAP &&
	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 ||
	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
		int ret = __vgic_v3_perform_cpuif_access(vcpu);

		if (ret == 1) {
			if (__skip_instr(vcpu))
				return true;
			else
				*exit_code = ARM_EXCEPTION_TRAP;
		}
	}

	/* Return to the host kernel and handle the exit */
	return false;
}
/* Switch to the guest for VHE systems running in EL2 */
int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *host_ctxt;
struct kvm_cpu_context *guest_ctxt;
bool fp_enabled;
u64 exit_code;
host_ctxt = vcpu->arch.host_cpu_context;
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;
sysreg_save_host_state_vhe(host_ctxt);
__activate_traps(vcpu);
__activate_vm(vcpu->kvm);
sysreg_restore_guest_state_vhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
do {
/* Jump in the fire! */
exit_code = __guest_enter(vcpu, host_ctxt);
/* And we're baaack! */
} while (fixup_guest_exit(vcpu, &exit_code));
fp_enabled = fpsimd_enabled_vhe();
sysreg_save_guest_state_vhe(guest_ctxt);
__deactivate_traps(vcpu);
sysreg_restore_host_state_vhe(host_ctxt);
if (fp_enabled) {
__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
__fpsimd_save_fpexc32(vcpu);
} }
__debug_switch_to_host(vcpu);
return exit_code;
}
/* Switch to the guest for legacy non-VHE systems */
int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *host_ctxt;
struct kvm_cpu_context *guest_ctxt;
bool fp_enabled;
u64 exit_code;
vcpu = kern_hyp_va(vcpu);
host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;
__sysreg_save_state_nvhe(host_ctxt);
__activate_traps(vcpu);
__activate_vm(kern_hyp_va(vcpu->kvm));
__hyp_vgic_restore_state(vcpu);
__timer_enable_traps(vcpu);
/*
* We must restore the 32-bit state before the sysregs, thanks
* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
*/
__sysreg32_restore_state(vcpu);
__sysreg_restore_state_nvhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
do {
/* Jump in the fire! */
exit_code = __guest_enter(vcpu, host_ctxt);
/* And we're baaack! */
} while (fixup_guest_exit(vcpu, &exit_code));
	if (cpus_have_const_cap(ARM64_HARDEN_BP_POST_GUEST_EXIT)) {
		u32 midr = read_cpuid_id();
...@@ -413,29 +482,29 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
		}
	}

	fp_enabled = __fpsimd_enabled_nvhe();

	__sysreg_save_state_nvhe(guest_ctxt);
	__sysreg32_save_state(vcpu);
	__timer_disable_traps(vcpu);
	__hyp_vgic_save_state(vcpu);

	__deactivate_traps(vcpu);
	__deactivate_vm(vcpu);

	__sysreg_restore_state_nvhe(host_ctxt);

	if (fp_enabled) {
		__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
		__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
		__fpsimd_save_fpexc32(vcpu);
	}

	/*
	 * This must come after restoring the host sysregs, since a non-VHE
	 * system may enable SPE here and make use of the TTBRs.
	 */
	__debug_switch_to_host(vcpu);

	return exit_code;
}
...@@ -443,10 +512,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";

static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,
					     struct kvm_cpu_context *__host_ctxt)
{
	struct kvm_vcpu *vcpu;
	unsigned long str_va;

	vcpu = __host_ctxt->__hyp_running_vcpu;

	if (read_sysreg(vttbr_el2)) {
		__timer_disable_traps(vcpu);
		__deactivate_traps(vcpu);
		__deactivate_vm(vcpu);
		__sysreg_restore_state_nvhe(__host_ctxt);
	}

	/*
	 * Force the panic string to be loaded from the literal pool,
	 * making sure it is a kernel address and not a PC-relative
...@@ -460,40 +539,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,
	       read_sysreg(hpfar_el2), par, vcpu);
}

static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
				 struct kvm_cpu_context *host_ctxt)
{
	struct kvm_vcpu *vcpu;

	vcpu = host_ctxt->__hyp_running_vcpu;

	__deactivate_traps(vcpu);
	sysreg_restore_host_state_vhe(host_ctxt);

	panic(__hyp_panic_string,
	      spsr,  elr,
	      read_sysreg_el2(esr),   read_sysreg_el2(far),
	      read_sysreg(hpfar_el2), par, vcpu);
}

void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *host_ctxt)
{
	u64 spsr = read_sysreg_el2(spsr);
	u64 elr = read_sysreg_el2(elr);
	u64 par = read_sysreg(par_el1);

	if (!has_vhe())
		__hyp_call_panic_nvhe(spsr, elr, par, host_ctxt);
	else
		__hyp_call_panic_vhe(spsr, elr, par, host_ctxt);

	unreachable();
}
@@ -19,32 +19,43 @@
#include <linux/kvm_host.h> #include <linux/kvm_host.h>
#include <asm/kvm_asm.h> #include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h> #include <asm/kvm_hyp.h>
/* Yes, this does nothing, on purpose */
static void __hyp_text __sysreg_do_nothing(struct kvm_cpu_context *ctxt) { }
/* /*
* Non-VHE: Both host and guest must save everything. * Non-VHE: Both host and guest must save everything.
* *
* VHE: Host must save tpidr*_el0, actlr_el1, mdscr_el1, sp_el0, * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and pstate,
* and guest must save everything. * which are handled as part of the el2 return state) on every switch.
* tpidr_el0 and tpidrro_el0 only need to be switched when going
* to host userspace or a different VCPU. EL1 registers only need to be
* switched when potentially going to run a different VCPU. The latter two
* classes are handled as part of kvm_arch_vcpu_load and kvm_arch_vcpu_put.
*/ */
static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt) static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
{ {
ctxt->sys_regs[ACTLR_EL1] = read_sysreg(actlr_el1);
ctxt->sys_regs[TPIDR_EL0] = read_sysreg(tpidr_el0);
ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0);
ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1); ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1);
/*
* The host arm64 Linux uses sp_el0 to point to 'current' and it must
* therefore be saved/restored on every entry/exit to/from the guest.
*/
ctxt->gp_regs.regs.sp = read_sysreg(sp_el0); ctxt->gp_regs.regs.sp = read_sysreg(sp_el0);
} }
static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt) static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
{
ctxt->sys_regs[TPIDR_EL0] = read_sysreg(tpidr_el0);
ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0);
}
static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
{ {
ctxt->sys_regs[MPIDR_EL1] = read_sysreg(vmpidr_el2); ctxt->sys_regs[MPIDR_EL1] = read_sysreg(vmpidr_el2);
ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1); ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1);
ctxt->sys_regs[SCTLR_EL1] = read_sysreg_el1(sctlr); ctxt->sys_regs[SCTLR_EL1] = read_sysreg_el1(sctlr);
ctxt->sys_regs[ACTLR_EL1] = read_sysreg(actlr_el1);
ctxt->sys_regs[CPACR_EL1] = read_sysreg_el1(cpacr); ctxt->sys_regs[CPACR_EL1] = read_sysreg_el1(cpacr);
ctxt->sys_regs[TTBR0_EL1] = read_sysreg_el1(ttbr0); ctxt->sys_regs[TTBR0_EL1] = read_sysreg_el1(ttbr0);
ctxt->sys_regs[TTBR1_EL1] = read_sysreg_el1(ttbr1); ctxt->sys_regs[TTBR1_EL1] = read_sysreg_el1(ttbr1);
@@ -64,6 +75,10 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
ctxt->gp_regs.sp_el1 = read_sysreg(sp_el1); ctxt->gp_regs.sp_el1 = read_sysreg(sp_el1);
ctxt->gp_regs.elr_el1 = read_sysreg_el1(elr); ctxt->gp_regs.elr_el1 = read_sysreg_el1(elr);
ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr); ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr);
}
static void __hyp_text __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
{
ctxt->gp_regs.regs.pc = read_sysreg_el2(elr); ctxt->gp_regs.regs.pc = read_sysreg_el2(elr);
ctxt->gp_regs.regs.pstate = read_sysreg_el2(spsr); ctxt->gp_regs.regs.pstate = read_sysreg_el2(spsr);
@@ -71,36 +86,48 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2); ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2);
} }
static hyp_alternate_select(__sysreg_call_save_host_state, void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt)
__sysreg_save_state, __sysreg_do_nothing, {
ARM64_HAS_VIRT_HOST_EXTN); __sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
__sysreg_save_user_state(ctxt);
__sysreg_save_el2_return_state(ctxt);
}
void __hyp_text __sysreg_save_host_state(struct kvm_cpu_context *ctxt) void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
{ {
__sysreg_call_save_host_state()(ctxt);
__sysreg_save_common_state(ctxt); __sysreg_save_common_state(ctxt);
} }
void __hyp_text __sysreg_save_guest_state(struct kvm_cpu_context *ctxt) void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt)
{ {
__sysreg_save_state(ctxt);
__sysreg_save_common_state(ctxt); __sysreg_save_common_state(ctxt);
__sysreg_save_el2_return_state(ctxt);
} }
static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt) static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
{ {
write_sysreg(ctxt->sys_regs[ACTLR_EL1], actlr_el1);
write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0);
write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);
write_sysreg(ctxt->sys_regs[MDSCR_EL1], mdscr_el1); write_sysreg(ctxt->sys_regs[MDSCR_EL1], mdscr_el1);
/*
* The host arm64 Linux uses sp_el0 to point to 'current' and it must
* therefore be saved/restored on every entry/exit to/from the guest.
*/
write_sysreg(ctxt->gp_regs.regs.sp, sp_el0); write_sysreg(ctxt->gp_regs.regs.sp, sp_el0);
} }
static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt) static void __hyp_text __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
{
write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0);
write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);
}
static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
{ {
write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2); write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1); write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], sctlr); write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], sctlr);
write_sysreg(ctxt->sys_regs[ACTLR_EL1], actlr_el1);
write_sysreg_el1(ctxt->sys_regs[CPACR_EL1], cpacr); write_sysreg_el1(ctxt->sys_regs[CPACR_EL1], cpacr);
write_sysreg_el1(ctxt->sys_regs[TTBR0_EL1], ttbr0); write_sysreg_el1(ctxt->sys_regs[TTBR0_EL1], ttbr0);
write_sysreg_el1(ctxt->sys_regs[TTBR1_EL1], ttbr1); write_sysreg_el1(ctxt->sys_regs[TTBR1_EL1], ttbr1);
@@ -120,6 +147,11 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
write_sysreg(ctxt->gp_regs.sp_el1, sp_el1); write_sysreg(ctxt->gp_regs.sp_el1, sp_el1);
write_sysreg_el1(ctxt->gp_regs.elr_el1, elr); write_sysreg_el1(ctxt->gp_regs.elr_el1, elr);
write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr); write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr);
}
static void __hyp_text
__sysreg_restore_el2_return_state(struct kvm_cpu_context *ctxt)
{
write_sysreg_el2(ctxt->gp_regs.regs.pc, elr); write_sysreg_el2(ctxt->gp_regs.regs.pc, elr);
write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr); write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr);
@@ -127,27 +159,30 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2); write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2);
} }
static hyp_alternate_select(__sysreg_call_restore_host_state, void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt)
__sysreg_restore_state, __sysreg_do_nothing, {
ARM64_HAS_VIRT_HOST_EXTN); __sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
__sysreg_restore_user_state(ctxt);
__sysreg_restore_el2_return_state(ctxt);
}
void __hyp_text __sysreg_restore_host_state(struct kvm_cpu_context *ctxt) void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt)
{ {
__sysreg_call_restore_host_state()(ctxt);
__sysreg_restore_common_state(ctxt); __sysreg_restore_common_state(ctxt);
} }
void __hyp_text __sysreg_restore_guest_state(struct kvm_cpu_context *ctxt) void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt)
{ {
__sysreg_restore_state(ctxt);
__sysreg_restore_common_state(ctxt); __sysreg_restore_common_state(ctxt);
__sysreg_restore_el2_return_state(ctxt);
} }
void __hyp_text __sysreg32_save_state(struct kvm_vcpu *vcpu) void __hyp_text __sysreg32_save_state(struct kvm_vcpu *vcpu)
{ {
u64 *spsr, *sysreg; u64 *spsr, *sysreg;
if (read_sysreg(hcr_el2) & HCR_RW) if (!vcpu_el1_is_32bit(vcpu))
return; return;
spsr = vcpu->arch.ctxt.gp_regs.spsr; spsr = vcpu->arch.ctxt.gp_regs.spsr;
@@ -161,10 +196,7 @@ void __hyp_text __sysreg32_save_state(struct kvm_vcpu *vcpu)
sysreg[DACR32_EL2] = read_sysreg(dacr32_el2); sysreg[DACR32_EL2] = read_sysreg(dacr32_el2);
sysreg[IFSR32_EL2] = read_sysreg(ifsr32_el2); sysreg[IFSR32_EL2] = read_sysreg(ifsr32_el2);
if (__fpsimd_enabled()) if (has_vhe() || vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
sysreg[FPEXC32_EL2] = read_sysreg(fpexc32_el2);
if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
sysreg[DBGVCR32_EL2] = read_sysreg(dbgvcr32_el2); sysreg[DBGVCR32_EL2] = read_sysreg(dbgvcr32_el2);
} }
@@ -172,7 +204,7 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu)
{ {
u64 *spsr, *sysreg; u64 *spsr, *sysreg;
if (read_sysreg(hcr_el2) & HCR_RW) if (!vcpu_el1_is_32bit(vcpu))
return; return;
spsr = vcpu->arch.ctxt.gp_regs.spsr; spsr = vcpu->arch.ctxt.gp_regs.spsr;
@@ -186,6 +218,78 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu)
write_sysreg(sysreg[DACR32_EL2], dacr32_el2); write_sysreg(sysreg[DACR32_EL2], dacr32_el2);
write_sysreg(sysreg[IFSR32_EL2], ifsr32_el2); write_sysreg(sysreg[IFSR32_EL2], ifsr32_el2);
if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY) if (has_vhe() || vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2); write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2);
} }
/**
* kvm_vcpu_load_sysregs - Load guest system registers to the physical CPU
*
* @vcpu: The VCPU pointer
*
* Load system registers that do not affect the host's execution, for
* example EL1 system registers on a VHE system where the host kernel
* runs at EL2. This function is called from KVM's vcpu_load() function
* and loading system register state early avoids having to load them on
* every entry to the VM.
*/
void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *host_ctxt = vcpu->arch.host_cpu_context;
struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
if (!has_vhe())
return;
__sysreg_save_user_state(host_ctxt);
/*
* Load guest EL1 and user state
*
* We must restore the 32-bit state before the sysregs, thanks
* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
*/
__sysreg32_restore_state(vcpu);
__sysreg_restore_user_state(guest_ctxt);
__sysreg_restore_el1_state(guest_ctxt);
vcpu->arch.sysregs_loaded_on_cpu = true;
activate_traps_vhe_load(vcpu);
}
/**
* kvm_vcpu_put_sysregs - Restore host system registers to the physical CPU
*
* @vcpu: The VCPU pointer
*
* Save guest system registers that do not affect the host's execution, for
* example EL1 system registers on a VHE system where the host kernel
* runs at EL2. This function is called from KVM's vcpu_put() function
* and deferring saving system register state until we're no longer running the
* VCPU avoids having to save them on every exit from the VM.
*/
void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *host_ctxt = vcpu->arch.host_cpu_context;
struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
if (!has_vhe())
return;
deactivate_traps_vhe_put();
__sysreg_save_el1_state(guest_ctxt);
__sysreg_save_user_state(guest_ctxt);
__sysreg32_save_state(vcpu);
/* Restore host user state */
__sysreg_restore_user_state(host_ctxt);
vcpu->arch.sysregs_loaded_on_cpu = false;
}
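For context, the two helpers above only pay off because they bracket the period during which the VCPU is resident on a physical CPU. A minimal sketch of the intended call sites, assuming the generic arch hooks simply chain into them (the hook names are real, the bodies are illustrative only and omit the vgic/timer work the real hooks do):

void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
	/* ... existing per-CPU setup elided ... */
	kvm_vcpu_load_sysregs(vcpu);	/* VHE: move guest EL1/user state onto the CPU */
}

void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
	kvm_vcpu_put_sysregs(vcpu);	/* VHE: save guest state, restore host user state */
	/* ... existing teardown elided ... */
}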
void __hyp_text __kvm_set_tpidr_el2(u64 tpidr_el2)
{
asm("msr tpidr_el2, %0": : "r" (tpidr_el2));
}
@@ -23,86 +23,6 @@
#include <asm/kvm_hyp.h> #include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h> #include <asm/kvm_mmu.h>
static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
{
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
u32 elrsr0, elrsr1;
elrsr0 = readl_relaxed(base + GICH_ELRSR0);
if (unlikely(nr_lr > 32))
elrsr1 = readl_relaxed(base + GICH_ELRSR1);
else
elrsr1 = 0;
cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
}
static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
{
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
int i;
u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
for (i = 0; i < used_lrs; i++) {
if (cpu_if->vgic_elrsr & (1UL << i))
cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
else
cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
writel_relaxed(0, base + GICH_LR0 + (i * 4));
}
}
/* vcpu is already in the HYP VA space */
void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = kern_hyp_va(vcpu->kvm);
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
struct vgic_dist *vgic = &kvm->arch.vgic;
void __iomem *base = kern_hyp_va(vgic->vctrl_base);
u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
if (!base)
return;
if (used_lrs) {
cpu_if->vgic_apr = readl_relaxed(base + GICH_APR);
save_elrsr(vcpu, base);
save_lrs(vcpu, base);
writel_relaxed(0, base + GICH_HCR);
} else {
cpu_if->vgic_elrsr = ~0UL;
cpu_if->vgic_apr = 0;
}
}
/* vcpu is already in the HYP VA space */
void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = kern_hyp_va(vcpu->kvm);
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
struct vgic_dist *vgic = &kvm->arch.vgic;
void __iomem *base = kern_hyp_va(vgic->vctrl_base);
int i;
u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
if (!base)
return;
if (used_lrs) {
writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
for (i = 0; i < used_lrs; i++) {
writel_relaxed(cpu_if->vgic_lr[i],
base + GICH_LR0 + (i * 4));
}
}
}
#ifdef CONFIG_ARM64
/* /*
* __vgic_v2_perform_cpuif_access -- perform a GICV access on behalf of the * __vgic_v2_perform_cpuif_access -- perform a GICV access on behalf of the
* guest. * guest.
@@ -140,7 +60,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
return -1; return -1;
rd = kvm_vcpu_dabt_get_rd(vcpu); rd = kvm_vcpu_dabt_get_rd(vcpu);
addr = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va); addr = hyp_symbol_addr(kvm_vgic_global_state)->vcpu_hyp_va;
addr += fault_ipa - vgic->vgic_cpu_base; addr += fault_ipa - vgic->vgic_cpu_base;
if (kvm_vcpu_dabt_iswrite(vcpu)) { if (kvm_vcpu_dabt_iswrite(vcpu)) {
@@ -156,4 +76,3 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
return 1; return 1;
} }
#endif
@@ -58,7 +58,7 @@ static u64 get_except_vector(struct kvm_vcpu *vcpu, enum exception_type type)
exc_offset = LOWER_EL_AArch32_VECTOR; exc_offset = LOWER_EL_AArch32_VECTOR;
} }
return vcpu_sys_reg(vcpu, VBAR_EL1) + exc_offset + type; return vcpu_read_sys_reg(vcpu, VBAR_EL1) + exc_offset + type;
} }
static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr) static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr)
@@ -67,13 +67,13 @@ static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr
bool is_aarch32 = vcpu_mode_is_32bit(vcpu); bool is_aarch32 = vcpu_mode_is_32bit(vcpu);
u32 esr = 0; u32 esr = 0;
*vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu); vcpu_write_elr_el1(vcpu, *vcpu_pc(vcpu));
*vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync); *vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync);
*vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64; *vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64;
*vcpu_spsr(vcpu) = cpsr; vcpu_write_spsr(vcpu, cpsr);
vcpu_sys_reg(vcpu, FAR_EL1) = addr; vcpu_write_sys_reg(vcpu, addr, FAR_EL1);
/* /*
* Build an {i,d}abort, depending on the level and the * Build an {i,d}abort, depending on the level and the
@@ -94,7 +94,7 @@ static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr
if (!is_iabt) if (!is_iabt)
esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT; esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT;
vcpu_sys_reg(vcpu, ESR_EL1) = esr | ESR_ELx_FSC_EXTABT; vcpu_write_sys_reg(vcpu, esr | ESR_ELx_FSC_EXTABT, ESR_EL1);
} }
static void inject_undef64(struct kvm_vcpu *vcpu) static void inject_undef64(struct kvm_vcpu *vcpu)
@@ -102,11 +102,11 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
unsigned long cpsr = *vcpu_cpsr(vcpu); unsigned long cpsr = *vcpu_cpsr(vcpu);
u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT); u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
*vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu); vcpu_write_elr_el1(vcpu, *vcpu_pc(vcpu));
*vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync); *vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync);
*vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64; *vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64;
*vcpu_spsr(vcpu) = cpsr; vcpu_write_spsr(vcpu, cpsr);
/* /*
* Build an unknown exception, depending on the instruction * Build an unknown exception, depending on the instruction
@@ -115,7 +115,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
if (kvm_vcpu_trap_il_is32bit(vcpu)) if (kvm_vcpu_trap_il_is32bit(vcpu))
esr |= ESR_ELx_IL; esr |= ESR_ELx_IL;
vcpu_sys_reg(vcpu, ESR_EL1) = esr; vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
} }
/** /**
@@ -128,7 +128,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
*/ */
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr) void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
{ {
if (!(vcpu->arch.hcr_el2 & HCR_RW)) if (vcpu_el1_is_32bit(vcpu))
kvm_inject_dabt32(vcpu, addr); kvm_inject_dabt32(vcpu, addr);
else else
inject_abt64(vcpu, false, addr); inject_abt64(vcpu, false, addr);
@@ -144,7 +144,7 @@ void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
*/ */
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr) void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
{ {
if (!(vcpu->arch.hcr_el2 & HCR_RW)) if (vcpu_el1_is_32bit(vcpu))
kvm_inject_pabt32(vcpu, addr); kvm_inject_pabt32(vcpu, addr);
else else
inject_abt64(vcpu, true, addr); inject_abt64(vcpu, true, addr);
@@ -158,7 +158,7 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
*/ */
void kvm_inject_undefined(struct kvm_vcpu *vcpu) void kvm_inject_undefined(struct kvm_vcpu *vcpu)
{ {
if (!(vcpu->arch.hcr_el2 & HCR_RW)) if (vcpu_el1_is_32bit(vcpu))
kvm_inject_undef32(vcpu); kvm_inject_undef32(vcpu);
else else
inject_undef64(vcpu); inject_undef64(vcpu);
@@ -167,7 +167,7 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr) static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
{ {
vcpu_set_vsesr(vcpu, esr); vcpu_set_vsesr(vcpu, esr);
vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE); *vcpu_hcr(vcpu) |= HCR_VSE;
} }
/** /**
......
@@ -141,28 +141,61 @@ unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num)
/* /*
* Return the SPSR for the current mode of the virtual CPU. * Return the SPSR for the current mode of the virtual CPU.
*/ */
unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu) static int vcpu_spsr32_mode(const struct kvm_vcpu *vcpu)
{ {
unsigned long mode = *vcpu_cpsr(vcpu) & COMPAT_PSR_MODE_MASK; unsigned long mode = *vcpu_cpsr(vcpu) & COMPAT_PSR_MODE_MASK;
switch (mode) { switch (mode) {
case COMPAT_PSR_MODE_SVC: case COMPAT_PSR_MODE_SVC: return KVM_SPSR_SVC;
mode = KVM_SPSR_SVC; case COMPAT_PSR_MODE_ABT: return KVM_SPSR_ABT;
break; case COMPAT_PSR_MODE_UND: return KVM_SPSR_UND;
case COMPAT_PSR_MODE_ABT: case COMPAT_PSR_MODE_IRQ: return KVM_SPSR_IRQ;
mode = KVM_SPSR_ABT; case COMPAT_PSR_MODE_FIQ: return KVM_SPSR_FIQ;
break; default: BUG();
case COMPAT_PSR_MODE_UND: }
mode = KVM_SPSR_UND; }
break;
case COMPAT_PSR_MODE_IRQ: unsigned long vcpu_read_spsr32(const struct kvm_vcpu *vcpu)
mode = KVM_SPSR_IRQ; {
break; int spsr_idx = vcpu_spsr32_mode(vcpu);
case COMPAT_PSR_MODE_FIQ:
mode = KVM_SPSR_FIQ; if (!vcpu->arch.sysregs_loaded_on_cpu)
break; return vcpu_gp_regs(vcpu)->spsr[spsr_idx];
switch (spsr_idx) {
case KVM_SPSR_SVC:
return read_sysreg_el1(spsr);
case KVM_SPSR_ABT:
return read_sysreg(spsr_abt);
case KVM_SPSR_UND:
return read_sysreg(spsr_und);
case KVM_SPSR_IRQ:
return read_sysreg(spsr_irq);
case KVM_SPSR_FIQ:
return read_sysreg(spsr_fiq);
default: default:
BUG(); BUG();
} }
}
void vcpu_write_spsr32(struct kvm_vcpu *vcpu, unsigned long v)
{
int spsr_idx = vcpu_spsr32_mode(vcpu);
if (!vcpu->arch.sysregs_loaded_on_cpu) {
vcpu_gp_regs(vcpu)->spsr[spsr_idx] = v;
return;
}
return (unsigned long *)&vcpu_gp_regs(vcpu)->spsr[mode]; switch (spsr_idx) {
case KVM_SPSR_SVC:
write_sysreg_el1(v, spsr);
case KVM_SPSR_ABT:
write_sysreg(v, spsr_abt);
case KVM_SPSR_UND:
write_sysreg(v, spsr_und);
case KVM_SPSR_IRQ:
write_sysreg(v, spsr_irq);
case KVM_SPSR_FIQ:
write_sysreg(v, spsr_fiq);
}
} }
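Callers do not normally use vcpu_read_spsr32()/vcpu_write_spsr32() directly; the injection code later in this diff goes through vcpu_write_spsr(). As a hedged sketch only (the exact wrapper body is an assumption, not a quote of kvm_emulate.h), such a mode-aware wrapper would pick between the AArch32 banked SPSRs, the live SPSR_EL1 and the in-memory copy:

/* Sketch: assumed shape of a mode-aware SPSR writer, for illustration only. */
static inline void example_vcpu_write_spsr(struct kvm_vcpu *vcpu, unsigned long v)
{
	if (vcpu_mode_is_32bit(vcpu)) {
		vcpu_write_spsr32(vcpu, v);			/* banked SPSR_<mode> */
		return;
	}

	if (vcpu->arch.sysregs_loaded_on_cpu)
		write_sysreg_el1(v, spsr);			/* live SPSR_EL1 */
	else
		vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1] = v;	/* in-memory copy */
}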
@@ -35,6 +35,7 @@
#include <asm/kvm_coproc.h> #include <asm/kvm_coproc.h>
#include <asm/kvm_emulate.h> #include <asm/kvm_emulate.h>
#include <asm/kvm_host.h> #include <asm/kvm_host.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h> #include <asm/kvm_mmu.h>
#include <asm/perf_event.h> #include <asm/perf_event.h>
#include <asm/sysreg.h> #include <asm/sysreg.h>
@@ -76,6 +77,93 @@ static bool write_to_read_only(struct kvm_vcpu *vcpu,
return false; return false;
} }
u64 vcpu_read_sys_reg(struct kvm_vcpu *vcpu, int reg)
{
if (!vcpu->arch.sysregs_loaded_on_cpu)
goto immediate_read;
/*
* System registers listed in the switch are not saved on every
* exit from the guest but are only saved on vcpu_put.
*
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the guest cannot modify its
* own MPIDR_EL1 and MPIDR_EL1 is accessed for VCPU A from VCPU B's
* thread when emulating cross-VCPU communication.
*/
switch (reg) {
case CSSELR_EL1: return read_sysreg_s(SYS_CSSELR_EL1);
case SCTLR_EL1: return read_sysreg_s(sctlr_EL12);
case ACTLR_EL1: return read_sysreg_s(SYS_ACTLR_EL1);
case CPACR_EL1: return read_sysreg_s(cpacr_EL12);
case TTBR0_EL1: return read_sysreg_s(ttbr0_EL12);
case TTBR1_EL1: return read_sysreg_s(ttbr1_EL12);
case TCR_EL1: return read_sysreg_s(tcr_EL12);
case ESR_EL1: return read_sysreg_s(esr_EL12);
case AFSR0_EL1: return read_sysreg_s(afsr0_EL12);
case AFSR1_EL1: return read_sysreg_s(afsr1_EL12);
case FAR_EL1: return read_sysreg_s(far_EL12);
case MAIR_EL1: return read_sysreg_s(mair_EL12);
case VBAR_EL1: return read_sysreg_s(vbar_EL12);
case CONTEXTIDR_EL1: return read_sysreg_s(contextidr_EL12);
case TPIDR_EL0: return read_sysreg_s(SYS_TPIDR_EL0);
case TPIDRRO_EL0: return read_sysreg_s(SYS_TPIDRRO_EL0);
case TPIDR_EL1: return read_sysreg_s(SYS_TPIDR_EL1);
case AMAIR_EL1: return read_sysreg_s(amair_EL12);
case CNTKCTL_EL1: return read_sysreg_s(cntkctl_EL12);
case PAR_EL1: return read_sysreg_s(SYS_PAR_EL1);
case DACR32_EL2: return read_sysreg_s(SYS_DACR32_EL2);
case IFSR32_EL2: return read_sysreg_s(SYS_IFSR32_EL2);
case DBGVCR32_EL2: return read_sysreg_s(SYS_DBGVCR32_EL2);
}
immediate_read:
return __vcpu_sys_reg(vcpu, reg);
}
void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
{
if (!vcpu->arch.sysregs_loaded_on_cpu)
goto immediate_write;
/*
* System registers listed in the switch are not restored on every
* entry to the guest but are only restored on vcpu_load.
*
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the MPIDR should only be
* set once, before running the VCPU, and never changed later.
*/
switch (reg) {
case CSSELR_EL1: write_sysreg_s(val, SYS_CSSELR_EL1); return;
case SCTLR_EL1: write_sysreg_s(val, sctlr_EL12); return;
case ACTLR_EL1: write_sysreg_s(val, SYS_ACTLR_EL1); return;
case CPACR_EL1: write_sysreg_s(val, cpacr_EL12); return;
case TTBR0_EL1: write_sysreg_s(val, ttbr0_EL12); return;
case TTBR1_EL1: write_sysreg_s(val, ttbr1_EL12); return;
case TCR_EL1: write_sysreg_s(val, tcr_EL12); return;
case ESR_EL1: write_sysreg_s(val, esr_EL12); return;
case AFSR0_EL1: write_sysreg_s(val, afsr0_EL12); return;
case AFSR1_EL1: write_sysreg_s(val, afsr1_EL12); return;
case FAR_EL1: write_sysreg_s(val, far_EL12); return;
case MAIR_EL1: write_sysreg_s(val, mair_EL12); return;
case VBAR_EL1: write_sysreg_s(val, vbar_EL12); return;
case CONTEXTIDR_EL1: write_sysreg_s(val, contextidr_EL12); return;
case TPIDR_EL0: write_sysreg_s(val, SYS_TPIDR_EL0); return;
case TPIDRRO_EL0: write_sysreg_s(val, SYS_TPIDRRO_EL0); return;
case TPIDR_EL1: write_sysreg_s(val, SYS_TPIDR_EL1); return;
case AMAIR_EL1: write_sysreg_s(val, amair_EL12); return;
case CNTKCTL_EL1: write_sysreg_s(val, cntkctl_EL12); return;
case PAR_EL1: write_sysreg_s(val, SYS_PAR_EL1); return;
case DACR32_EL2: write_sysreg_s(val, SYS_DACR32_EL2); return;
case IFSR32_EL2: write_sysreg_s(val, SYS_IFSR32_EL2); return;
case DBGVCR32_EL2: write_sysreg_s(val, SYS_DBGVCR32_EL2); return;
}
immediate_write:
__vcpu_sys_reg(vcpu, reg) = val;
}
/* 3 bits per cache level, as per CLIDR, but non-existent caches always 0 */ /* 3 bits per cache level, as per CLIDR, but non-existent caches always 0 */
static u32 cache_levels; static u32 cache_levels;
@@ -121,16 +209,26 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r) const struct sys_reg_desc *r)
{ {
bool was_enabled = vcpu_has_cache_enabled(vcpu); bool was_enabled = vcpu_has_cache_enabled(vcpu);
u64 val;
int reg = r->reg;
BUG_ON(!p->is_write); BUG_ON(!p->is_write);
if (!p->is_aarch32) { /* See the 32bit mapping in kvm_host.h */
vcpu_sys_reg(vcpu, r->reg) = p->regval; if (p->is_aarch32)
reg = r->reg / 2;
if (!p->is_aarch32 || !p->is_32bit) {
val = p->regval;
} else { } else {
if (!p->is_32bit) val = vcpu_read_sys_reg(vcpu, reg);
vcpu_cp15_64_high(vcpu, r->reg) = upper_32_bits(p->regval); if (r->reg % 2)
vcpu_cp15_64_low(vcpu, r->reg) = lower_32_bits(p->regval); val = (p->regval << 32) | (u64)lower_32_bits(val);
else
val = ((u64)upper_32_bits(val) << 32) |
lower_32_bits(p->regval);
} }
vcpu_write_sys_reg(vcpu, val, reg);
kvm_toggle_cache(vcpu, was_enabled); kvm_toggle_cache(vcpu, was_enabled);
return true; return true;
@@ -175,6 +273,14 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
return read_zero(vcpu, p); return read_zero(vcpu, p);
} }
static bool trap_undef(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
kvm_inject_undefined(vcpu);
return false;
}
static bool trap_oslsr_el1(struct kvm_vcpu *vcpu, static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
struct sys_reg_params *p, struct sys_reg_params *p,
const struct sys_reg_desc *r) const struct sys_reg_desc *r)
@@ -231,10 +337,10 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r) const struct sys_reg_desc *r)
{ {
if (p->is_write) { if (p->is_write) {
vcpu_sys_reg(vcpu, r->reg) = p->regval; vcpu_write_sys_reg(vcpu, p->regval, r->reg);
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY; vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else { } else {
p->regval = vcpu_sys_reg(vcpu, r->reg); p->regval = vcpu_read_sys_reg(vcpu, r->reg);
} }
trace_trap_reg(__func__, r->reg, p->is_write, p->regval); trace_trap_reg(__func__, r->reg, p->is_write, p->regval);
@@ -447,7 +553,8 @@ static void reset_wcr(struct kvm_vcpu *vcpu,
static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{ {
vcpu_sys_reg(vcpu, AMAIR_EL1) = read_sysreg(amair_el1); u64 amair = read_sysreg(amair_el1);
vcpu_write_sys_reg(vcpu, amair, AMAIR_EL1);
} }
static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
@@ -464,7 +571,7 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0); mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1); mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2); mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
vcpu_sys_reg(vcpu, MPIDR_EL1) = (1ULL << 31) | mpidr; vcpu_write_sys_reg(vcpu, (1ULL << 31) | mpidr, MPIDR_EL1);
} }
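A worked example of the affinity packing above, assuming MPIDR_LEVEL_SHIFT(n) is 8 * n as on arm64: vcpu_id 0x1234 gives Aff0 = 0x04, Aff1 = 0x23, Aff2 = 0x01, and with bit 31 set the guest reads MPIDR_EL1 = 0x80012304.

/* Illustrative helper reproducing the packing (assumes 8-bit affinity fields). */
static u64 example_vmpidr(u32 vcpu_id)
{
	u64 mpidr;

	mpidr  = vcpu_id & 0x0f;			/* Aff0: 16 vcpus per cluster */
	mpidr |= (u64)((vcpu_id >> 4) & 0xff) << 8;	/* Aff1 */
	mpidr |= (u64)((vcpu_id >> 12) & 0xff) << 16;	/* Aff2 */
	return (1ULL << 31) | mpidr;			/* 0x1234 -> 0x80012304 */
}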
static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
@@ -478,12 +585,12 @@ static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
*/ */
val = ((pmcr & ~ARMV8_PMU_PMCR_MASK) val = ((pmcr & ~ARMV8_PMU_PMCR_MASK)
| (ARMV8_PMU_PMCR_MASK & 0xdecafbad)) & (~ARMV8_PMU_PMCR_E); | (ARMV8_PMU_PMCR_MASK & 0xdecafbad)) & (~ARMV8_PMU_PMCR_E);
vcpu_sys_reg(vcpu, PMCR_EL0) = val; __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
} }
static bool check_pmu_access_disabled(struct kvm_vcpu *vcpu, u64 flags) static bool check_pmu_access_disabled(struct kvm_vcpu *vcpu, u64 flags)
{ {
u64 reg = vcpu_sys_reg(vcpu, PMUSERENR_EL0); u64 reg = __vcpu_sys_reg(vcpu, PMUSERENR_EL0);
bool enabled = (reg & flags) || vcpu_mode_priv(vcpu); bool enabled = (reg & flags) || vcpu_mode_priv(vcpu);
if (!enabled) if (!enabled)
@@ -525,14 +632,14 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (p->is_write) { if (p->is_write) {
/* Only update writeable bits of PMCR */ /* Only update writeable bits of PMCR */
val = vcpu_sys_reg(vcpu, PMCR_EL0); val = __vcpu_sys_reg(vcpu, PMCR_EL0);
val &= ~ARMV8_PMU_PMCR_MASK; val &= ~ARMV8_PMU_PMCR_MASK;
val |= p->regval & ARMV8_PMU_PMCR_MASK; val |= p->regval & ARMV8_PMU_PMCR_MASK;
vcpu_sys_reg(vcpu, PMCR_EL0) = val; __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
kvm_pmu_handle_pmcr(vcpu, val); kvm_pmu_handle_pmcr(vcpu, val);
} else { } else {
/* PMCR.P & PMCR.C are RAZ */ /* PMCR.P & PMCR.C are RAZ */
val = vcpu_sys_reg(vcpu, PMCR_EL0) val = __vcpu_sys_reg(vcpu, PMCR_EL0)
& ~(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C); & ~(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C);
p->regval = val; p->regval = val;
} }
@@ -550,10 +657,10 @@ static bool access_pmselr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return false; return false;
if (p->is_write) if (p->is_write)
vcpu_sys_reg(vcpu, PMSELR_EL0) = p->regval; __vcpu_sys_reg(vcpu, PMSELR_EL0) = p->regval;
else else
/* return PMSELR.SEL field */ /* return PMSELR.SEL field */
p->regval = vcpu_sys_reg(vcpu, PMSELR_EL0) p->regval = __vcpu_sys_reg(vcpu, PMSELR_EL0)
& ARMV8_PMU_COUNTER_MASK; & ARMV8_PMU_COUNTER_MASK;
return true; return true;
@@ -586,7 +693,7 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
{ {
u64 pmcr, val; u64 pmcr, val;
pmcr = vcpu_sys_reg(vcpu, PMCR_EL0); pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0);
val = (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; val = (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) { if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) {
kvm_inject_undefined(vcpu); kvm_inject_undefined(vcpu);
@@ -611,7 +718,7 @@ static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
if (pmu_access_event_counter_el0_disabled(vcpu)) if (pmu_access_event_counter_el0_disabled(vcpu))
return false; return false;
idx = vcpu_sys_reg(vcpu, PMSELR_EL0) idx = __vcpu_sys_reg(vcpu, PMSELR_EL0)
& ARMV8_PMU_COUNTER_MASK; & ARMV8_PMU_COUNTER_MASK;
} else if (r->Op2 == 0) { } else if (r->Op2 == 0) {
/* PMCCNTR_EL0 */ /* PMCCNTR_EL0 */
@@ -666,7 +773,7 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 1) { if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 1) {
/* PMXEVTYPER_EL0 */ /* PMXEVTYPER_EL0 */
idx = vcpu_sys_reg(vcpu, PMSELR_EL0) & ARMV8_PMU_COUNTER_MASK; idx = __vcpu_sys_reg(vcpu, PMSELR_EL0) & ARMV8_PMU_COUNTER_MASK;
reg = PMEVTYPER0_EL0 + idx; reg = PMEVTYPER0_EL0 + idx;
} else if (r->CRn == 14 && (r->CRm & 12) == 12) { } else if (r->CRn == 14 && (r->CRm & 12) == 12) {
idx = ((r->CRm & 3) << 3) | (r->Op2 & 7); idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
@@ -684,9 +791,9 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (p->is_write) { if (p->is_write) {
kvm_pmu_set_counter_event_type(vcpu, p->regval, idx); kvm_pmu_set_counter_event_type(vcpu, p->regval, idx);
vcpu_sys_reg(vcpu, reg) = p->regval & ARMV8_PMU_EVTYPE_MASK; __vcpu_sys_reg(vcpu, reg) = p->regval & ARMV8_PMU_EVTYPE_MASK;
} else { } else {
p->regval = vcpu_sys_reg(vcpu, reg) & ARMV8_PMU_EVTYPE_MASK; p->regval = __vcpu_sys_reg(vcpu, reg) & ARMV8_PMU_EVTYPE_MASK;
} }
return true; return true;
@@ -708,15 +815,15 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
val = p->regval & mask; val = p->regval & mask;
if (r->Op2 & 0x1) { if (r->Op2 & 0x1) {
/* accessing PMCNTENSET_EL0 */ /* accessing PMCNTENSET_EL0 */
vcpu_sys_reg(vcpu, PMCNTENSET_EL0) |= val; __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) |= val;
kvm_pmu_enable_counter(vcpu, val); kvm_pmu_enable_counter(vcpu, val);
} else { } else {
/* accessing PMCNTENCLR_EL0 */ /* accessing PMCNTENCLR_EL0 */
vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= ~val; __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= ~val;
kvm_pmu_disable_counter(vcpu, val); kvm_pmu_disable_counter(vcpu, val);
} }
} else { } else {
p->regval = vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask; p->regval = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
} }
return true; return true;
@@ -740,12 +847,12 @@ static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (r->Op2 & 0x1) if (r->Op2 & 0x1)
/* accessing PMINTENSET_EL1 */ /* accessing PMINTENSET_EL1 */
vcpu_sys_reg(vcpu, PMINTENSET_EL1) |= val; __vcpu_sys_reg(vcpu, PMINTENSET_EL1) |= val;
else else
/* accessing PMINTENCLR_EL1 */ /* accessing PMINTENCLR_EL1 */
vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= ~val; __vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= ~val;
} else { } else {
p->regval = vcpu_sys_reg(vcpu, PMINTENSET_EL1) & mask; p->regval = __vcpu_sys_reg(vcpu, PMINTENSET_EL1) & mask;
} }
return true; return true;
@@ -765,12 +872,12 @@ static bool access_pmovs(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (p->is_write) { if (p->is_write) {
if (r->CRm & 0x2) if (r->CRm & 0x2)
/* accessing PMOVSSET_EL0 */ /* accessing PMOVSSET_EL0 */
vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= (p->regval & mask); __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= (p->regval & mask);
else else
/* accessing PMOVSCLR_EL0 */ /* accessing PMOVSCLR_EL0 */
vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= ~(p->regval & mask); __vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= ~(p->regval & mask);
} else { } else {
p->regval = vcpu_sys_reg(vcpu, PMOVSSET_EL0) & mask; p->regval = __vcpu_sys_reg(vcpu, PMOVSSET_EL0) & mask;
} }
return true; return true;
@@ -807,10 +914,10 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return false; return false;
} }
vcpu_sys_reg(vcpu, PMUSERENR_EL0) = p->regval __vcpu_sys_reg(vcpu, PMUSERENR_EL0) =
& ARMV8_PMU_USERENR_MASK; p->regval & ARMV8_PMU_USERENR_MASK;
} else { } else {
p->regval = vcpu_sys_reg(vcpu, PMUSERENR_EL0) p->regval = __vcpu_sys_reg(vcpu, PMUSERENR_EL0)
& ARMV8_PMU_USERENR_MASK; & ARMV8_PMU_USERENR_MASK;
} }
@@ -893,6 +1000,12 @@ static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
task_pid_nr(current)); task_pid_nr(current));
val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
} else if (id == SYS_ID_AA64MMFR1_EL1) {
if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT))
pr_err_once("kvm [%i]: LORegions unsupported for guests, suppressing\n",
task_pid_nr(current));
val &= ~(0xfUL << ID_AA64MMFR1_LOR_SHIFT);
} }
return val; return val;
@@ -1178,6 +1291,12 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 }, { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 }, { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
{ SYS_DESC(SYS_LORSA_EL1), trap_undef },
{ SYS_DESC(SYS_LOREA_EL1), trap_undef },
{ SYS_DESC(SYS_LORN_EL1), trap_undef },
{ SYS_DESC(SYS_LORC_EL1), trap_undef },
{ SYS_DESC(SYS_LORID_EL1), trap_undef },
{ SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 }, { SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 },
{ SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 }, { SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 },
@@ -1545,6 +1664,11 @@ static const struct sys_reg_desc cp15_regs[] = {
{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID }, { Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
/* CNTP_TVAL */
{ Op1( 0), CRn(14), CRm( 2), Op2( 0), access_cntp_tval },
/* CNTP_CTL */
{ Op1( 0), CRn(14), CRm( 2), Op2( 1), access_cntp_ctl },
/* PMEVCNTRn */ /* PMEVCNTRn */
PMU_PMEVCNTR(0), PMU_PMEVCNTR(0),
PMU_PMEVCNTR(1), PMU_PMEVCNTR(1),
@@ -1618,6 +1742,7 @@ static const struct sys_reg_desc cp15_64_regs[] = {
{ Op1( 0), CRn( 0), CRm( 9), Op2( 0), access_pmu_evcntr }, { Op1( 0), CRn( 0), CRm( 9), Op2( 0), access_pmu_evcntr },
{ Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, { Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi },
{ Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 }, { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 },
{ Op1( 2), CRn( 0), CRm(14), Op2( 0), access_cntp_cval },
}; };
/* Target specific emulation tables */ /* Target specific emulation tables */
@@ -2194,7 +2319,7 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
if (r->get_user) if (r->get_user)
return (r->get_user)(vcpu, r, reg, uaddr); return (r->get_user)(vcpu, r, reg, uaddr);
return reg_to_user(uaddr, &vcpu_sys_reg(vcpu, r->reg), reg->id); return reg_to_user(uaddr, &__vcpu_sys_reg(vcpu, r->reg), reg->id);
} }
int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
@@ -2215,7 +2340,7 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
if (r->set_user) if (r->set_user)
return (r->set_user)(vcpu, r, reg, uaddr); return (r->set_user)(vcpu, r, reg, uaddr);
return reg_from_user(&vcpu_sys_reg(vcpu, r->reg), uaddr, reg->id); return reg_from_user(&__vcpu_sys_reg(vcpu, r->reg), uaddr, reg->id);
} }
static unsigned int num_demux_regs(void) static unsigned int num_demux_regs(void)
@@ -2421,6 +2546,6 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
reset_sys_reg_descs(vcpu, table, num); reset_sys_reg_descs(vcpu, table, num);
for (num = 1; num < NR_SYS_REGS; num++) for (num = 1; num < NR_SYS_REGS; num++)
if (vcpu_sys_reg(vcpu, num) == 0x4242424242424242) if (__vcpu_sys_reg(vcpu, num) == 0x4242424242424242)
panic("Didn't reset vcpu_sys_reg(%zi)", num); panic("Didn't reset __vcpu_sys_reg(%zi)", num);
} }
@@ -89,14 +89,14 @@ static inline void reset_unknown(struct kvm_vcpu *vcpu,
{ {
BUG_ON(!r->reg); BUG_ON(!r->reg);
BUG_ON(r->reg >= NR_SYS_REGS); BUG_ON(r->reg >= NR_SYS_REGS);
vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL; __vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL;
} }
static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{ {
BUG_ON(!r->reg); BUG_ON(!r->reg);
BUG_ON(r->reg >= NR_SYS_REGS); BUG_ON(r->reg >= NR_SYS_REGS);
vcpu_sys_reg(vcpu, r->reg) = r->val; __vcpu_sys_reg(vcpu, r->reg) = r->val;
} }
static inline int cmp_sys_reg(const struct sys_reg_desc *i1, static inline int cmp_sys_reg(const struct sys_reg_desc *i1,
......
@@ -38,13 +38,13 @@ static bool access_actlr(struct kvm_vcpu *vcpu,
if (p->is_write) if (p->is_write)
return ignore_write(vcpu, p); return ignore_write(vcpu, p);
p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1); p->regval = vcpu_read_sys_reg(vcpu, ACTLR_EL1);
return true; return true;
} }
static void reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) static void reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{ {
vcpu_sys_reg(vcpu, ACTLR_EL1) = read_sysreg(actlr_el1); __vcpu_sys_reg(vcpu, ACTLR_EL1) = read_sysreg(actlr_el1);
} }
/* /*
......
/*
* Copyright (C) 2017 ARM Ltd.
* Author: Marc Zyngier <marc.zyngier@arm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <linux/kvm_host.h>
#include <linux/random.h>
#include <linux/memblock.h>
#include <asm/alternative.h>
#include <asm/debug-monitors.h>
#include <asm/insn.h>
#include <asm/kvm_mmu.h>
/*
* The LSB of the random hyp VA tag or 0 if no randomization is used.
*/
static u8 tag_lsb;
/*
* The random hyp VA tag value with the region bit if hyp randomization is used
*/
static u64 tag_val;
static u64 va_mask;
static void compute_layout(void)
{
phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
u64 hyp_va_msb;
int kva_msb;
/* Where is my RAM region? */
hyp_va_msb = idmap_addr & BIT(VA_BITS - 1);
hyp_va_msb ^= BIT(VA_BITS - 1);
kva_msb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
(u64)(high_memory - 1));
if (kva_msb == (VA_BITS - 1)) {
/*
* No space in the address, let's compute the mask so
* that it covers (VA_BITS - 1) bits, and the region
* bit. The tag stays set to zero.
*/
va_mask = BIT(VA_BITS - 1) - 1;
va_mask |= hyp_va_msb;
} else {
/*
* We do have some free bits to insert a random tag.
* Hyp VAs are now created from kernel linear map VAs
* using the following formula (with V == VA_BITS):
*
* 63 ... V | V-1 | V-2 .. tag_lsb | tag_lsb - 1 .. 0
* ---------------------------------------------------------
* | 0000000 | hyp_va_msb | random tag | kern linear VA |
*/
tag_lsb = kva_msb;
va_mask = GENMASK_ULL(tag_lsb - 1, 0);
tag_val = get_random_long() & GENMASK_ULL(VA_BITS - 2, tag_lsb);
tag_val |= hyp_va_msb;
tag_val >>= tag_lsb;
}
}
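When the randomized tag does not overlap the kernel VA bits kept by va_mask, the AND/ROR/ADD/ADD/ROR sequence generated below reduces to the same two-step transformation that kvm_patch_vector_branch() performs by hand further down. A minimal sketch (the helper name is illustrative):

/* Sketch: what the patched kern_hyp_va sequence computes for a kernel
 * linear-map address, given the values produced by compute_layout(). */
static u64 example_kern_hyp_va(u64 va)
{
	va &= va_mask;			/* keep the low kernel VA bits */
	va |= tag_val << tag_lsb;	/* insert the region bit and random tag */
	return va;
}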
static u32 compute_instruction(int n, u32 rd, u32 rn)
{
u32 insn = AARCH64_BREAK_FAULT;
switch (n) {
case 0:
insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
AARCH64_INSN_VARIANT_64BIT,
rn, rd, va_mask);
break;
case 1:
/* ROR is a variant of EXTR with Rm = Rn */
insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
rn, rn, rd,
tag_lsb);
break;
case 2:
insn = aarch64_insn_gen_add_sub_imm(rd, rn,
tag_val & GENMASK(11, 0),
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_ADSB_ADD);
break;
case 3:
insn = aarch64_insn_gen_add_sub_imm(rd, rn,
tag_val & GENMASK(23, 12),
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_ADSB_ADD);
break;
case 4:
/* ROR is a variant of EXTR with Rm = Rn */
insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
rn, rn, rd, 64 - tag_lsb);
break;
}
return insn;
}
void __init kvm_update_va_mask(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst)
{
int i;
BUG_ON(nr_inst != 5);
if (!has_vhe() && !va_mask)
compute_layout();
for (i = 0; i < nr_inst; i++) {
u32 rd, rn, insn, oinsn;
/*
* VHE doesn't need any address translation, let's NOP
* everything.
*
* Alternatively, if we don't have any spare bits in
* the address, NOP everything after masking that
* kernel VA.
*/
if (has_vhe() || (!tag_lsb && i > 0)) {
updptr[i] = cpu_to_le32(aarch64_insn_gen_nop());
continue;
}
oinsn = le32_to_cpu(origptr[i]);
rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, oinsn);
rn = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, oinsn);
insn = compute_instruction(i, rd, rn);
BUG_ON(insn == AARCH64_BREAK_FAULT);
updptr[i] = cpu_to_le32(insn);
}
}
void *__kvm_bp_vect_base;
int __kvm_harden_el2_vector_slot;
void kvm_patch_vector_branch(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst)
{
u64 addr;
u32 insn;
BUG_ON(nr_inst != 5);
if (has_vhe() || !cpus_have_const_cap(ARM64_HARDEN_EL2_VECTORS)) {
WARN_ON_ONCE(cpus_have_const_cap(ARM64_HARDEN_EL2_VECTORS));
return;
}
if (!va_mask)
compute_layout();
/*
* Compute HYP VA by using the same computation as kern_hyp_va()
*/
addr = (uintptr_t)kvm_ksym_ref(__kvm_hyp_vector);
addr &= va_mask;
addr |= tag_val << tag_lsb;
/* Use PC[10:7] to branch to the same vector in KVM */
addr |= ((u64)origptr & GENMASK_ULL(10, 7));
/*
* Branch to the second instruction in the vectors in order to
* avoid the initial store on the stack (which we already
* perform in the hardening vectors).
*/
addr += AARCH64_INSN_SIZE;
/* stp x0, x1, [sp, #-16]! */
insn = aarch64_insn_gen_load_store_pair(AARCH64_INSN_REG_0,
AARCH64_INSN_REG_1,
AARCH64_INSN_REG_SP,
-16,
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_LDST_STORE_PAIR_PRE_INDEX);
*updptr++ = cpu_to_le32(insn);
/* movz x0, #(addr & 0xffff) */
insn = aarch64_insn_gen_movewide(AARCH64_INSN_REG_0,
(u16)addr,
0,
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_MOVEWIDE_ZERO);
*updptr++ = cpu_to_le32(insn);
/* movk x0, #((addr >> 16) & 0xffff), lsl #16 */
insn = aarch64_insn_gen_movewide(AARCH64_INSN_REG_0,
(u16)(addr >> 16),
16,
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_MOVEWIDE_KEEP);
*updptr++ = cpu_to_le32(insn);
/* movk x0, #((addr >> 32) & 0xffff), lsl #32 */
insn = aarch64_insn_gen_movewide(AARCH64_INSN_REG_0,
(u16)(addr >> 32),
32,
AARCH64_INSN_VARIANT_64BIT,
AARCH64_INSN_MOVEWIDE_KEEP);
*updptr++ = cpu_to_le32(insn);
/* br x0 */
insn = aarch64_insn_gen_branch_reg(AARCH64_INSN_REG_0,
AARCH64_INSN_BRANCH_NOLINK);
*updptr++ = cpu_to_le32(insn);
}
@@ -94,6 +94,11 @@ static inline unsigned int kvm_arch_para_features(void)
return 0; return 0;
} }
static inline unsigned int kvm_arch_para_hints(void)
{
return 0;
}
#ifdef CONFIG_MIPS_PARAVIRT #ifdef CONFIG_MIPS_PARAVIRT
static inline bool kvm_para_available(void) static inline bool kvm_para_available(void)
{ {
......
@@ -60,7 +60,6 @@
#define KVM_ARCH_WANT_MMU_NOTIFIER #define KVM_ARCH_WANT_MMU_NOTIFIER
extern int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
extern int kvm_unmap_hva_range(struct kvm *kvm, extern int kvm_unmap_hva_range(struct kvm *kvm,
unsigned long start, unsigned long end); unsigned long start, unsigned long end);
extern int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end); extern int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
......
@@ -61,6 +61,11 @@ static inline unsigned int kvm_arch_para_features(void)
return r; return r;
} }
static inline unsigned int kvm_arch_para_hints(void)
{
return 0;
}
static inline bool kvm_check_and_clear_guest_paused(void) static inline bool kvm_check_and_clear_guest_paused(void)
{ {
return false; return false;
......
@@ -295,7 +295,6 @@ struct kvmppc_ops {
const struct kvm_userspace_memory_region *mem, const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old, const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new); const struct kvm_memory_slot *new);
int (*unmap_hva)(struct kvm *kvm, unsigned long hva);
int (*unmap_hva_range)(struct kvm *kvm, unsigned long start, int (*unmap_hva_range)(struct kvm *kvm, unsigned long start,
unsigned long end); unsigned long end);
int (*age_hva)(struct kvm *kvm, unsigned long start, unsigned long end); int (*age_hva)(struct kvm *kvm, unsigned long start, unsigned long end);
......
@@ -819,12 +819,6 @@ void kvmppc_core_commit_memory_region(struct kvm *kvm,
kvm->arch.kvm_ops->commit_memory_region(kvm, mem, old, new); kvm->arch.kvm_ops->commit_memory_region(kvm, mem, old, new);
} }
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
{
return kvm->arch.kvm_ops->unmap_hva(kvm, hva);
}
EXPORT_SYMBOL_GPL(kvm_unmap_hva);
int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end) int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
{ {
return kvm->arch.kvm_ops->unmap_hva_range(kvm, start, end); return kvm->arch.kvm_ops->unmap_hva_range(kvm, start, end);
......
@@ -14,7 +14,6 @@
extern void kvmppc_core_flush_memslot_hv(struct kvm *kvm, extern void kvmppc_core_flush_memslot_hv(struct kvm *kvm,
struct kvm_memory_slot *memslot); struct kvm_memory_slot *memslot);
extern int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva);
extern int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, extern int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start,
unsigned long end); unsigned long end);
extern int kvm_age_hva_hv(struct kvm *kvm, unsigned long start, extern int kvm_age_hva_hv(struct kvm *kvm, unsigned long start,
......
@@ -877,15 +877,6 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot,
return 0; return 0;
} }
int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva)
{
hva_handler_fn handler;
handler = kvm_is_radix(kvm) ? kvm_unmap_radix : kvm_unmap_rmapp;
kvm_handle_hva(kvm, hva, handler);
return 0;
}
int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long end) int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long end)
{ {
hva_handler_fn handler; hva_handler_fn handler;
......
@@ -150,7 +150,9 @@ static void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
{ {
int psize = MMU_BASE_PSIZE; int psize = MMU_BASE_PSIZE;
if (pshift >= PMD_SHIFT) if (pshift >= PUD_SHIFT)
psize = MMU_PAGE_1G;
else if (pshift >= PMD_SHIFT)
psize = MMU_PAGE_2M; psize = MMU_PAGE_2M;
addr &= ~0xfffUL; addr &= ~0xfffUL;
addr |= mmu_psize_defs[psize].ap << 5; addr |= mmu_psize_defs[psize].ap << 5;
@@ -163,6 +165,17 @@ static void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
asm volatile("ptesync": : :"memory"); asm volatile("ptesync": : :"memory");
} }
static void kvmppc_radix_flush_pwc(struct kvm *kvm, unsigned long addr)
{
unsigned long rb = 0x2 << PPC_BITLSHIFT(53); /* IS = 2 */
asm volatile("ptesync": : :"memory");
/* RIC=1 PRS=0 R=1 IS=2 */
asm volatile(PPC_TLBIE_5(%0, %1, 1, 0, 1)
: : "r" (rb), "r" (kvm->arch.lpid) : "memory");
asm volatile("ptesync": : :"memory");
}
unsigned long kvmppc_radix_update_pte(struct kvm *kvm, pte_t *ptep, unsigned long kvmppc_radix_update_pte(struct kvm *kvm, pte_t *ptep,
unsigned long clr, unsigned long set, unsigned long clr, unsigned long set,
unsigned long addr, unsigned int shift) unsigned long addr, unsigned int shift)
@@ -223,9 +236,9 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
new_pud = pud_alloc_one(kvm->mm, gpa); new_pud = pud_alloc_one(kvm->mm, gpa);
pmd = NULL; pmd = NULL;
if (pud && pud_present(*pud)) if (pud && pud_present(*pud) && !pud_huge(*pud))
pmd = pmd_offset(pud, gpa); pmd = pmd_offset(pud, gpa);
else else if (level <= 1)
new_pmd = pmd_alloc_one(kvm->mm, gpa); new_pmd = pmd_alloc_one(kvm->mm, gpa);
if (level == 0 && !(pmd && pmd_present(*pmd) && !pmd_is_leaf(*pmd))) if (level == 0 && !(pmd && pmd_present(*pmd) && !pmd_is_leaf(*pmd)))
...@@ -246,6 +259,50 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa, ...@@ -246,6 +259,50 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
new_pud = NULL; new_pud = NULL;
} }
pud = pud_offset(pgd, gpa); pud = pud_offset(pgd, gpa);
if (pud_huge(*pud)) {
unsigned long hgpa = gpa & PUD_MASK;
/*
* If we raced with another CPU which has just put
* a 1GB pte in after we saw a pmd page, try again.
*/
if (level <= 1 && !new_pmd) {
ret = -EAGAIN;
goto out_unlock;
}
/* Check if we raced and someone else has set the same thing */
if (level == 2 && pud_raw(*pud) == pte_raw(pte)) {
ret = 0;
goto out_unlock;
}
/* Valid 1GB page here already, remove it */
old = kvmppc_radix_update_pte(kvm, (pte_t *)pud,
~0UL, 0, hgpa, PUD_SHIFT);
kvmppc_radix_tlbie_page(kvm, hgpa, PUD_SHIFT);
if (old & _PAGE_DIRTY) {
unsigned long gfn = hgpa >> PAGE_SHIFT;
struct kvm_memory_slot *memslot;
memslot = gfn_to_memslot(kvm, gfn);
if (memslot && memslot->dirty_bitmap)
kvmppc_update_dirty_map(memslot,
gfn, PUD_SIZE);
}
}
if (level == 2) {
if (!pud_none(*pud)) {
/*
* There's a page table page here, but we wanted to
* install a large page, so remove and free the page
* table page. new_pmd will be NULL since level == 2.
*/
new_pmd = pmd_offset(pud, 0);
pud_clear(pud);
kvmppc_radix_flush_pwc(kvm, gpa);
}
kvmppc_radix_set_pte_at(kvm, gpa, (pte_t *)pud, pte);
ret = 0;
goto out_unlock;
}
if (pud_none(*pud)) { if (pud_none(*pud)) {
if (!new_pmd) if (!new_pmd)
goto out_unlock; goto out_unlock;
...@@ -264,6 +321,11 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa, ...@@ -264,6 +321,11 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
ret = -EAGAIN; ret = -EAGAIN;
goto out_unlock; goto out_unlock;
} }
/* Check if we raced and someone else has set the same thing */
if (level == 1 && pmd_raw(*pmd) == pte_raw(pte)) {
ret = 0;
goto out_unlock;
}
/* Valid 2MB page here already, remove it */ /* Valid 2MB page here already, remove it */
old = kvmppc_radix_update_pte(kvm, pmdp_ptep(pmd), old = kvmppc_radix_update_pte(kvm, pmdp_ptep(pmd),
~0UL, 0, lgpa, PMD_SHIFT); ~0UL, 0, lgpa, PMD_SHIFT);
...@@ -276,35 +338,43 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa, ...@@ -276,35 +338,43 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
kvmppc_update_dirty_map(memslot, kvmppc_update_dirty_map(memslot,
gfn, PMD_SIZE); gfn, PMD_SIZE);
} }
} else if (level == 1 && !pmd_none(*pmd)) {
/*
* There's a page table page here, but we wanted
* to install a large page. Tell the caller and let
* it try installing a normal page if it wants.
*/
ret = -EBUSY;
goto out_unlock;
} }
if (level == 0) { if (level == 1) {
if (pmd_none(*pmd)) { if (!pmd_none(*pmd)) {
if (!new_ptep) /*
goto out_unlock; * There's a page table page here, but we wanted to
pmd_populate(kvm->mm, pmd, new_ptep); * install a large page, so remove and free the page
new_ptep = NULL; * table page. new_ptep will be NULL since level == 1.
} */
ptep = pte_offset_kernel(pmd, gpa); new_ptep = pte_offset_kernel(pmd, 0);
if (pte_present(*ptep)) { pmd_clear(pmd);
/* PTE was previously valid, so invalidate it */ kvmppc_radix_flush_pwc(kvm, gpa);
old = kvmppc_radix_update_pte(kvm, ptep, _PAGE_PRESENT,
0, gpa, 0);
kvmppc_radix_tlbie_page(kvm, gpa, 0);
if (old & _PAGE_DIRTY)
mark_page_dirty(kvm, gpa >> PAGE_SHIFT);
} }
kvmppc_radix_set_pte_at(kvm, gpa, ptep, pte);
} else {
kvmppc_radix_set_pte_at(kvm, gpa, pmdp_ptep(pmd), pte); kvmppc_radix_set_pte_at(kvm, gpa, pmdp_ptep(pmd), pte);
ret = 0;
goto out_unlock;
} }
if (pmd_none(*pmd)) {
if (!new_ptep)
goto out_unlock;
pmd_populate(kvm->mm, pmd, new_ptep);
new_ptep = NULL;
}
ptep = pte_offset_kernel(pmd, gpa);
if (pte_present(*ptep)) {
/* Check if someone else set the same thing */
if (pte_raw(*ptep) == pte_raw(pte)) {
ret = 0;
goto out_unlock;
}
/* PTE was previously valid, so invalidate it */
old = kvmppc_radix_update_pte(kvm, ptep, _PAGE_PRESENT,
0, gpa, 0);
kvmppc_radix_tlbie_page(kvm, gpa, 0);
if (old & _PAGE_DIRTY)
mark_page_dirty(kvm, gpa >> PAGE_SHIFT);
}
kvmppc_radix_set_pte_at(kvm, gpa, ptep, pte);
ret = 0; ret = 0;
out_unlock: out_unlock:
...@@ -325,11 +395,11 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, ...@@ -325,11 +395,11 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned long mmu_seq, pte_size; unsigned long mmu_seq, pte_size;
unsigned long gpa, gfn, hva, pfn; unsigned long gpa, gfn, hva, pfn;
struct kvm_memory_slot *memslot; struct kvm_memory_slot *memslot;
struct page *page = NULL, *pages[1]; struct page *page = NULL;
long ret, npages, ok; long ret;
unsigned int writing; bool writing;
struct vm_area_struct *vma; bool upgrade_write = false;
unsigned long flags; bool *upgrade_p = &upgrade_write;
pte_t pte, *ptep; pte_t pte, *ptep;
unsigned long pgflags; unsigned long pgflags;
unsigned int shift, level; unsigned int shift, level;
...@@ -369,122 +439,131 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, ...@@ -369,122 +439,131 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
dsisr & DSISR_ISSTORE); dsisr & DSISR_ISSTORE);
} }
/* used to check for invalidations in progress */
mmu_seq = kvm->mmu_notifier_seq;
smp_rmb();
writing = (dsisr & DSISR_ISSTORE) != 0; writing = (dsisr & DSISR_ISSTORE) != 0;
hva = gfn_to_hva_memslot(memslot, gfn); if (memslot->flags & KVM_MEM_READONLY) {
if (writing) {
/* give the guest a DSI */
dsisr = DSISR_ISSTORE | DSISR_PROTFAULT;
kvmppc_core_queue_data_storage(vcpu, ea, dsisr);
return RESUME_GUEST;
}
upgrade_p = NULL;
}
if (dsisr & DSISR_SET_RC) { if (dsisr & DSISR_SET_RC) {
/* /*
* Need to set an R or C bit in the 2nd-level tables; * Need to set an R or C bit in the 2nd-level tables;
* if the relevant bits aren't already set in the linux * since we are just helping out the hardware here,
* page tables, fall through to do the gup_fast to * it is sufficient to do what the hardware does.
* set them in the linux page tables too.
*/ */
ok = 0;
pgflags = _PAGE_ACCESSED; pgflags = _PAGE_ACCESSED;
if (writing) if (writing)
pgflags |= _PAGE_DIRTY; pgflags |= _PAGE_DIRTY;
local_irq_save(flags); /*
ptep = find_current_mm_pte(current->mm->pgd, hva, NULL, NULL); * We are walking the secondary page table here. We can do this
if (ptep) { * without disabling irq.
pte = READ_ONCE(*ptep); */
if (pte_present(pte) && spin_lock(&kvm->mmu_lock);
(pte_val(pte) & pgflags) == pgflags) ptep = __find_linux_pte(kvm->arch.pgtable,
ok = 1; gpa, NULL, &shift);
} if (ptep && pte_present(*ptep) &&
local_irq_restore(flags); (!writing || pte_write(*ptep))) {
if (ok) { kvmppc_radix_update_pte(kvm, ptep, 0, pgflags,
spin_lock(&kvm->mmu_lock); gpa, shift);
if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) { dsisr &= ~DSISR_SET_RC;
spin_unlock(&kvm->mmu_lock);
return RESUME_GUEST;
}
/*
* We are walking the secondary page table here. We can do this
* without disabling irq.
*/
ptep = __find_linux_pte(kvm->arch.pgtable,
gpa, NULL, &shift);
if (ptep && pte_present(*ptep)) {
kvmppc_radix_update_pte(kvm, ptep, 0, pgflags,
gpa, shift);
spin_unlock(&kvm->mmu_lock);
return RESUME_GUEST;
}
spin_unlock(&kvm->mmu_lock);
} }
spin_unlock(&kvm->mmu_lock);
if (!(dsisr & (DSISR_BAD_FAULT_64S | DSISR_NOHPTE |
DSISR_PROTFAULT | DSISR_SET_RC)))
return RESUME_GUEST;
} }
ret = -EFAULT; /* used to check for invalidations in progress */
pfn = 0; mmu_seq = kvm->mmu_notifier_seq;
pte_size = PAGE_SIZE; smp_rmb();
pgflags = _PAGE_READ | _PAGE_EXEC;
level = 0; /*
npages = get_user_pages_fast(hva, 1, writing, pages); * Do a fast check first, since __gfn_to_pfn_memslot doesn't
if (npages < 1) { * do it with !atomic && !async, which is how we call it.
/* Check if it's an I/O mapping */ * We always ask for write permission since the common case
down_read(&current->mm->mmap_sem); * is that the page is writable.
vma = find_vma(current->mm, hva); */
if (vma && vma->vm_start <= hva && hva < vma->vm_end && hva = gfn_to_hva_memslot(memslot, gfn);
(vma->vm_flags & VM_PFNMAP)) { if (upgrade_p && __get_user_pages_fast(hva, 1, 1, &page) == 1) {
pfn = vma->vm_pgoff +
((hva - vma->vm_start) >> PAGE_SHIFT);
pgflags = pgprot_val(vma->vm_page_prot);
}
up_read(&current->mm->mmap_sem);
if (!pfn)
return -EFAULT;
} else {
page = pages[0];
pfn = page_to_pfn(page); pfn = page_to_pfn(page);
if (PageCompound(page)) { upgrade_write = true;
pte_size <<= compound_order(compound_head(page)); } else {
/* See if we can insert a 2MB large-page PTE here */ /* Call KVM generic code to do the slow-path check */
if (pte_size >= PMD_SIZE && pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
(gpa & (PMD_SIZE - PAGE_SIZE)) == writing, upgrade_p);
(hva & (PMD_SIZE - PAGE_SIZE))) { if (is_error_noslot_pfn(pfn))
level = 1; return -EFAULT;
pfn &= ~((PMD_SIZE >> PAGE_SHIFT) - 1); page = NULL;
} if (pfn_valid(pfn)) {
page = pfn_to_page(pfn);
if (PageReserved(page))
page = NULL;
} }
/* See if we can provide write access */ }
if (writing) {
pgflags |= _PAGE_WRITE; /* See if we can insert a 1GB or 2MB large PTE here */
} else { level = 0;
local_irq_save(flags); if (page && PageCompound(page)) {
ptep = find_current_mm_pte(current->mm->pgd, pte_size = PAGE_SIZE << compound_order(compound_head(page));
hva, NULL, NULL); if (pte_size >= PUD_SIZE &&
if (ptep && pte_write(*ptep)) (gpa & (PUD_SIZE - PAGE_SIZE)) ==
pgflags |= _PAGE_WRITE; (hva & (PUD_SIZE - PAGE_SIZE))) {
local_irq_restore(flags); level = 2;
pfn &= ~((PUD_SIZE >> PAGE_SHIFT) - 1);
} else if (pte_size >= PMD_SIZE &&
(gpa & (PMD_SIZE - PAGE_SIZE)) ==
(hva & (PMD_SIZE - PAGE_SIZE))) {
level = 1;
pfn &= ~((PMD_SIZE >> PAGE_SHIFT) - 1);
} }
} }
/* /*
* Compute the PTE value that we need to insert. * Compute the PTE value that we need to insert.
*/ */
pgflags |= _PAGE_PRESENT | _PAGE_PTE | _PAGE_ACCESSED; if (page) {
if (pgflags & _PAGE_WRITE) pgflags = _PAGE_READ | _PAGE_EXEC | _PAGE_PRESENT | _PAGE_PTE |
pgflags |= _PAGE_DIRTY; _PAGE_ACCESSED;
pte = pfn_pte(pfn, __pgprot(pgflags)); if (writing || upgrade_write)
pgflags |= _PAGE_WRITE | _PAGE_DIRTY;
/* Allocate space in the tree and write the PTE */ pte = pfn_pte(pfn, __pgprot(pgflags));
ret = kvmppc_create_pte(kvm, pte, gpa, level, mmu_seq); } else {
if (ret == -EBUSY) {
/* /*
* There's already a PMD where wanted to install a large page; * Read the PTE from the process' radix tree and use that
* for now, fall back to installing a small page. * so we get the attribute bits.
*/ */
level = 0; local_irq_disable();
pfn |= gfn & ((PMD_SIZE >> PAGE_SHIFT) - 1); ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
pte = pfn_pte(pfn, __pgprot(pgflags)); pte = *ptep;
ret = kvmppc_create_pte(kvm, pte, gpa, level, mmu_seq); local_irq_enable();
if (shift == PUD_SHIFT &&
(gpa & (PUD_SIZE - PAGE_SIZE)) ==
(hva & (PUD_SIZE - PAGE_SIZE))) {
level = 2;
} else if (shift == PMD_SHIFT &&
(gpa & (PMD_SIZE - PAGE_SIZE)) ==
(hva & (PMD_SIZE - PAGE_SIZE))) {
level = 1;
} else if (shift && shift != PAGE_SHIFT) {
/* Adjust PFN */
unsigned long mask = (1ul << shift) - PAGE_SIZE;
pte = __pte(pte_val(pte) | (hva & mask));
}
if (!(writing || upgrade_write))
pte = __pte(pte_val(pte) & ~ _PAGE_WRITE);
pte = __pte(pte_val(pte) | _PAGE_EXEC);
} }
/* Allocate space in the tree and write the PTE */
ret = kvmppc_create_pte(kvm, pte, gpa, level, mmu_seq);
if (page) { if (page) {
if (!ret && (pgflags & _PAGE_WRITE)) if (!ret && (pte_val(pte) & _PAGE_WRITE))
set_page_dirty_lock(page); set_page_dirty_lock(page);
put_page(page); put_page(page);
} }
...@@ -662,6 +741,10 @@ void kvmppc_free_radix(struct kvm *kvm) ...@@ -662,6 +741,10 @@ void kvmppc_free_radix(struct kvm *kvm)
for (iu = 0; iu < PTRS_PER_PUD; ++iu, ++pud) { for (iu = 0; iu < PTRS_PER_PUD; ++iu, ++pud) {
if (!pud_present(*pud)) if (!pud_present(*pud))
continue; continue;
if (pud_huge(*pud)) {
pud_clear(pud);
continue;
}
pmd = pmd_offset(pud, 0); pmd = pmd_offset(pud, 0);
for (im = 0; im < PTRS_PER_PMD; ++im, ++pmd) { for (im = 0; im < PTRS_PER_PMD; ++im, ++pmd) {
if (pmd_is_leaf(*pmd)) { if (pmd_is_leaf(*pmd)) {
......
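The radix fault path above now lets kvmppc_create_pte() install 1GB leaves at the PUD level (level 2) alongside 2MB leaves at the PMD level (level 1) and base pages (level 0), and it resolves the host page with __get_user_pages_fast()/__gfn_to_pfn_memslot() instead of get_user_pages_fast() plus a VMA walk. A small illustrative helper, not in the patch, that spells out what the "level" argument implies; PUD_SHIFT, PMD_SHIFT and PAGE_SHIFT are the kernel's existing macros, only the helper itself is hypothetical:

/*
 * Illustration only: the page size implied by the "level" value that
 * kvmppc_book3s_radix_page_fault() passes to kvmppc_create_pte().
 */
static unsigned int radix_level_to_shift(unsigned int level)
{
	switch (level) {
	case 2:
		return PUD_SHIFT;	/* 1 GiB leaf installed at the PUD */
	case 1:
		return PMD_SHIFT;	/* 2 MiB leaf installed at the PMD */
	default:
		return PAGE_SHIFT;	/* base page installed at the PTE  */
	}
}

The alignment checks in the fault handler, e.g. (gpa & (PUD_SIZE - PAGE_SIZE)) == (hva & (PUD_SIZE - PAGE_SIZE)), only allow a large leaf when guest-physical and host-virtual addresses are congruent modulo the large-page size; if kvmppc_create_pte() races with a concurrent insertion at a different level it returns -EAGAIN and the fault is simply retried.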
...@@ -450,7 +450,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu, ...@@ -450,7 +450,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
/* /*
* Synchronize with the MMU notifier callbacks in * Synchronize with the MMU notifier callbacks in
* book3s_64_mmu_hv.c (kvm_unmap_hva_hv etc.). * book3s_64_mmu_hv.c (kvm_unmap_hva_range_hv etc.).
* While we have the rmap lock, code running on other CPUs * While we have the rmap lock, code running on other CPUs
* cannot finish unmapping the host real page that backs * cannot finish unmapping the host real page that backs
* this guest real page, so we are OK to access the host * this guest real page, so we are OK to access the host
......
...@@ -4375,7 +4375,6 @@ static struct kvmppc_ops kvm_ops_hv = { ...@@ -4375,7 +4375,6 @@ static struct kvmppc_ops kvm_ops_hv = {
.flush_memslot = kvmppc_core_flush_memslot_hv, .flush_memslot = kvmppc_core_flush_memslot_hv,
.prepare_memory_region = kvmppc_core_prepare_memory_region_hv, .prepare_memory_region = kvmppc_core_prepare_memory_region_hv,
.commit_memory_region = kvmppc_core_commit_memory_region_hv, .commit_memory_region = kvmppc_core_commit_memory_region_hv,
.unmap_hva = kvm_unmap_hva_hv,
.unmap_hva_range = kvm_unmap_hva_range_hv, .unmap_hva_range = kvm_unmap_hva_range_hv,
.age_hva = kvm_age_hva_hv, .age_hva = kvm_age_hva_hv,
.test_age_hva = kvm_test_age_hva_hv, .test_age_hva = kvm_test_age_hva_hv,
......
...@@ -277,15 +277,6 @@ static void do_kvm_unmap_hva(struct kvm *kvm, unsigned long start, ...@@ -277,15 +277,6 @@ static void do_kvm_unmap_hva(struct kvm *kvm, unsigned long start,
} }
} }
static int kvm_unmap_hva_pr(struct kvm *kvm, unsigned long hva)
{
trace_kvm_unmap_hva(hva);
do_kvm_unmap_hva(kvm, hva, hva + PAGE_SIZE);
return 0;
}
static int kvm_unmap_hva_range_pr(struct kvm *kvm, unsigned long start, static int kvm_unmap_hva_range_pr(struct kvm *kvm, unsigned long start,
unsigned long end) unsigned long end)
{ {
...@@ -1773,7 +1764,6 @@ static struct kvmppc_ops kvm_ops_pr = { ...@@ -1773,7 +1764,6 @@ static struct kvmppc_ops kvm_ops_pr = {
.flush_memslot = kvmppc_core_flush_memslot_pr, .flush_memslot = kvmppc_core_flush_memslot_pr,
.prepare_memory_region = kvmppc_core_prepare_memory_region_pr, .prepare_memory_region = kvmppc_core_prepare_memory_region_pr,
.commit_memory_region = kvmppc_core_commit_memory_region_pr, .commit_memory_region = kvmppc_core_commit_memory_region_pr,
.unmap_hva = kvm_unmap_hva_pr,
.unmap_hva_range = kvm_unmap_hva_range_pr, .unmap_hva_range = kvm_unmap_hva_range_pr,
.age_hva = kvm_age_hva_pr, .age_hva = kvm_age_hva_pr,
.test_age_hva = kvm_test_age_hva_pr, .test_age_hva = kvm_test_age_hva_pr,
......
...@@ -724,7 +724,7 @@ int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type, ...@@ -724,7 +724,7 @@ int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
/************* MMU Notifiers *************/ /************* MMU Notifiers *************/
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva) static int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
{ {
trace_kvm_unmap_hva(hva); trace_kvm_unmap_hva(hva);
......
...@@ -254,21 +254,6 @@ TRACE_EVENT(kvm_exit, ...@@ -254,21 +254,6 @@ TRACE_EVENT(kvm_exit,
) )
); );
TRACE_EVENT(kvm_unmap_hva,
TP_PROTO(unsigned long hva),
TP_ARGS(hva),
TP_STRUCT__entry(
__field( unsigned long, hva )
),
TP_fast_assign(
__entry->hva = hva;
),
TP_printk("unmap hva 0x%lx\n", __entry->hva)
);
#endif /* _TRACE_KVM_H */ #endif /* _TRACE_KVM_H */
/* This part must be outside protection */ /* This part must be outside protection */
......
...@@ -294,6 +294,7 @@ struct kvm_vcpu_stat { ...@@ -294,6 +294,7 @@ struct kvm_vcpu_stat {
u64 exit_userspace; u64 exit_userspace;
u64 exit_null; u64 exit_null;
u64 exit_external_request; u64 exit_external_request;
u64 exit_io_request;
u64 exit_external_interrupt; u64 exit_external_interrupt;
u64 exit_stop_request; u64 exit_stop_request;
u64 exit_validity; u64 exit_validity;
...@@ -310,16 +311,29 @@ struct kvm_vcpu_stat { ...@@ -310,16 +311,29 @@ struct kvm_vcpu_stat {
u64 exit_program_interruption; u64 exit_program_interruption;
u64 exit_instr_and_program; u64 exit_instr_and_program;
u64 exit_operation_exception; u64 exit_operation_exception;
u64 deliver_ckc;
u64 deliver_cputm;
u64 deliver_external_call; u64 deliver_external_call;
u64 deliver_emergency_signal; u64 deliver_emergency_signal;
u64 deliver_service_signal; u64 deliver_service_signal;
u64 deliver_virtio_interrupt; u64 deliver_virtio;
u64 deliver_stop_signal; u64 deliver_stop_signal;
u64 deliver_prefix_signal; u64 deliver_prefix_signal;
u64 deliver_restart_signal; u64 deliver_restart_signal;
u64 deliver_program_int; u64 deliver_program;
u64 deliver_io_int; u64 deliver_io;
u64 deliver_machine_check;
u64 exit_wait_state; u64 exit_wait_state;
u64 inject_ckc;
u64 inject_cputm;
u64 inject_external_call;
u64 inject_emergency_signal;
u64 inject_mchk;
u64 inject_pfault_init;
u64 inject_program;
u64 inject_restart;
u64 inject_set_prefix;
u64 inject_stop_signal;
u64 instruction_epsw; u64 instruction_epsw;
u64 instruction_gs; u64 instruction_gs;
u64 instruction_io_other; u64 instruction_io_other;
...@@ -644,7 +658,12 @@ struct kvm_vcpu_arch { ...@@ -644,7 +658,12 @@ struct kvm_vcpu_arch {
}; };
struct kvm_vm_stat { struct kvm_vm_stat {
ulong remote_tlb_flush; u64 inject_io;
u64 inject_float_mchk;
u64 inject_pfault_done;
u64 inject_service_signal;
u64 inject_virtio;
u64 remote_tlb_flush;
}; };
struct kvm_arch_memory_slot { struct kvm_arch_memory_slot {
...@@ -792,6 +811,7 @@ struct kvm_arch{ ...@@ -792,6 +811,7 @@ struct kvm_arch{
int css_support; int css_support;
int use_irqchip; int use_irqchip;
int use_cmma; int use_cmma;
int use_pfmfi;
int user_cpu_state_ctrl; int user_cpu_state_ctrl;
int user_sigp; int user_sigp;
int user_stsi; int user_stsi;
......
...@@ -193,6 +193,11 @@ static inline unsigned int kvm_arch_para_features(void) ...@@ -193,6 +193,11 @@ static inline unsigned int kvm_arch_para_features(void)
return 0; return 0;
} }
static inline unsigned int kvm_arch_para_hints(void)
{
return 0;
}
static inline bool kvm_check_and_clear_guest_paused(void) static inline bool kvm_check_and_clear_guest_paused(void)
{ {
return false; return false;
......
...@@ -22,8 +22,8 @@ typedef struct { ...@@ -22,8 +22,8 @@ typedef struct {
unsigned int has_pgste:1; unsigned int has_pgste:1;
/* The mmu context uses storage keys. */ /* The mmu context uses storage keys. */
unsigned int use_skey:1; unsigned int use_skey:1;
/* The mmu context uses CMMA. */ /* The mmu context uses CMM. */
unsigned int use_cmma:1; unsigned int uses_cmm:1;
} mm_context_t; } mm_context_t;
#define INIT_MM_CONTEXT(name) \ #define INIT_MM_CONTEXT(name) \
......
...@@ -31,7 +31,7 @@ static inline int init_new_context(struct task_struct *tsk, ...@@ -31,7 +31,7 @@ static inline int init_new_context(struct task_struct *tsk,
(current->mm && current->mm->context.alloc_pgste); (current->mm && current->mm->context.alloc_pgste);
mm->context.has_pgste = 0; mm->context.has_pgste = 0;
mm->context.use_skey = 0; mm->context.use_skey = 0;
mm->context.use_cmma = 0; mm->context.uses_cmm = 0;
#endif #endif
switch (mm->context.asce_limit) { switch (mm->context.asce_limit) {
case _REGION2_SIZE: case _REGION2_SIZE:
......
...@@ -1050,8 +1050,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr, ...@@ -1050,8 +1050,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
rc = gmap_shadow_r2t(sg, saddr, rfte.val, *fake); rc = gmap_shadow_r2t(sg, saddr, rfte.val, *fake);
if (rc) if (rc)
return rc; return rc;
/* fallthrough */ } /* fallthrough */
}
case ASCE_TYPE_REGION2: { case ASCE_TYPE_REGION2: {
union region2_table_entry rste; union region2_table_entry rste;
...@@ -1077,8 +1076,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr, ...@@ -1077,8 +1076,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
rc = gmap_shadow_r3t(sg, saddr, rste.val, *fake); rc = gmap_shadow_r3t(sg, saddr, rste.val, *fake);
if (rc) if (rc)
return rc; return rc;
/* fallthrough */ } /* fallthrough */
}
case ASCE_TYPE_REGION3: { case ASCE_TYPE_REGION3: {
union region3_table_entry rtte; union region3_table_entry rtte;
...@@ -1113,8 +1111,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr, ...@@ -1113,8 +1111,7 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr,
rc = gmap_shadow_sgt(sg, saddr, rtte.val, *fake); rc = gmap_shadow_sgt(sg, saddr, rtte.val, *fake);
if (rc) if (rc)
return rc; return rc;
/* fallthrough */ } /* fallthrough */
}
case ASCE_TYPE_SEGMENT: { case ASCE_TYPE_SEGMENT: {
union segment_table_entry ste; union segment_table_entry ste;
......
...@@ -50,18 +50,6 @@ u8 kvm_s390_get_ilen(struct kvm_vcpu *vcpu) ...@@ -50,18 +50,6 @@ u8 kvm_s390_get_ilen(struct kvm_vcpu *vcpu)
return ilen; return ilen;
} }
static int handle_noop(struct kvm_vcpu *vcpu)
{
switch (vcpu->arch.sie_block->icptcode) {
case 0x10:
vcpu->stat.exit_external_request++;
break;
default:
break; /* nothing */
}
return 0;
}
static int handle_stop(struct kvm_vcpu *vcpu) static int handle_stop(struct kvm_vcpu *vcpu)
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
...@@ -465,8 +453,11 @@ int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu) ...@@ -465,8 +453,11 @@ int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
switch (vcpu->arch.sie_block->icptcode) { switch (vcpu->arch.sie_block->icptcode) {
case ICPT_EXTREQ: case ICPT_EXTREQ:
vcpu->stat.exit_external_request++;
return 0;
case ICPT_IOREQ: case ICPT_IOREQ:
return handle_noop(vcpu); vcpu->stat.exit_io_request++;
return 0;
case ICPT_INST: case ICPT_INST:
rc = handle_instruction(vcpu); rc = handle_instruction(vcpu);
break; break;
......
...@@ -391,6 +391,7 @@ static int __must_check __deliver_cpu_timer(struct kvm_vcpu *vcpu) ...@@ -391,6 +391,7 @@ static int __must_check __deliver_cpu_timer(struct kvm_vcpu *vcpu)
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
int rc; int rc;
vcpu->stat.deliver_cputm++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER,
0, 0); 0, 0);
...@@ -410,6 +411,7 @@ static int __must_check __deliver_ckc(struct kvm_vcpu *vcpu) ...@@ -410,6 +411,7 @@ static int __must_check __deliver_ckc(struct kvm_vcpu *vcpu)
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
int rc; int rc;
vcpu->stat.deliver_ckc++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP,
0, 0); 0, 0);
...@@ -595,6 +597,7 @@ static int __must_check __deliver_machine_check(struct kvm_vcpu *vcpu) ...@@ -595,6 +597,7 @@ static int __must_check __deliver_machine_check(struct kvm_vcpu *vcpu)
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
KVM_S390_MCHK, KVM_S390_MCHK,
mchk.cr14, mchk.mcic); mchk.cr14, mchk.mcic);
vcpu->stat.deliver_machine_check++;
rc = __write_machine_check(vcpu, &mchk); rc = __write_machine_check(vcpu, &mchk);
} }
return rc; return rc;
...@@ -710,7 +713,7 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu) ...@@ -710,7 +713,7 @@ static int __must_check __deliver_prog(struct kvm_vcpu *vcpu)
ilen = pgm_info.flags & KVM_S390_PGM_FLAGS_ILC_MASK; ilen = pgm_info.flags & KVM_S390_PGM_FLAGS_ILC_MASK;
VCPU_EVENT(vcpu, 3, "deliver: program irq code 0x%x, ilen:%d", VCPU_EVENT(vcpu, 3, "deliver: program irq code 0x%x, ilen:%d",
pgm_info.code, ilen); pgm_info.code, ilen);
vcpu->stat.deliver_program_int++; vcpu->stat.deliver_program++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
pgm_info.code, 0); pgm_info.code, 0);
...@@ -899,7 +902,7 @@ static int __must_check __deliver_virtio(struct kvm_vcpu *vcpu) ...@@ -899,7 +902,7 @@ static int __must_check __deliver_virtio(struct kvm_vcpu *vcpu)
VCPU_EVENT(vcpu, 4, VCPU_EVENT(vcpu, 4,
"deliver: virtio parm: 0x%x,parm64: 0x%llx", "deliver: virtio parm: 0x%x,parm64: 0x%llx",
inti->ext.ext_params, inti->ext.ext_params2); inti->ext.ext_params, inti->ext.ext_params2);
vcpu->stat.deliver_virtio_interrupt++; vcpu->stat.deliver_virtio++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
inti->type, inti->type,
inti->ext.ext_params, inti->ext.ext_params,
...@@ -975,7 +978,7 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu, ...@@ -975,7 +978,7 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu,
inti->io.subchannel_id >> 1 & 0x3, inti->io.subchannel_id >> 1 & 0x3,
inti->io.subchannel_nr); inti->io.subchannel_nr);
vcpu->stat.deliver_io_int++; vcpu->stat.deliver_io++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
inti->type, inti->type,
((__u32)inti->io.subchannel_id << 16) | ((__u32)inti->io.subchannel_id << 16) |
...@@ -1004,7 +1007,7 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu, ...@@ -1004,7 +1007,7 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu,
VCPU_EVENT(vcpu, 4, "%s isc %u", "deliver: I/O (AI/gisa)", isc); VCPU_EVENT(vcpu, 4, "%s isc %u", "deliver: I/O (AI/gisa)", isc);
memset(&io, 0, sizeof(io)); memset(&io, 0, sizeof(io));
io.io_int_word = isc_to_int_word(isc); io.io_int_word = isc_to_int_word(isc);
vcpu->stat.deliver_io_int++; vcpu->stat.deliver_io++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
KVM_S390_INT_IO(1, 0, 0, 0), KVM_S390_INT_IO(1, 0, 0, 0),
((__u32)io.subchannel_id << 16) | ((__u32)io.subchannel_id << 16) |
...@@ -1268,6 +1271,7 @@ static int __inject_prog(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) ...@@ -1268,6 +1271,7 @@ static int __inject_prog(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
vcpu->stat.inject_program++;
VCPU_EVENT(vcpu, 3, "inject: program irq code 0x%x", irq->u.pgm.code); VCPU_EVENT(vcpu, 3, "inject: program irq code 0x%x", irq->u.pgm.code);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_PROGRAM_INT, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_PROGRAM_INT,
irq->u.pgm.code, 0); irq->u.pgm.code, 0);
...@@ -1309,6 +1313,7 @@ static int __inject_pfault_init(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) ...@@ -1309,6 +1313,7 @@ static int __inject_pfault_init(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
vcpu->stat.inject_pfault_init++;
VCPU_EVENT(vcpu, 4, "inject: pfault init parameter block at 0x%llx", VCPU_EVENT(vcpu, 4, "inject: pfault init parameter block at 0x%llx",
irq->u.ext.ext_params2); irq->u.ext.ext_params2);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_PFAULT_INIT, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_PFAULT_INIT,
...@@ -1327,6 +1332,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) ...@@ -1327,6 +1332,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
struct kvm_s390_extcall_info *extcall = &li->irq.extcall; struct kvm_s390_extcall_info *extcall = &li->irq.extcall;
uint16_t src_id = irq->u.extcall.code; uint16_t src_id = irq->u.extcall.code;
vcpu->stat.inject_external_call++;
VCPU_EVENT(vcpu, 4, "inject: external call source-cpu:%u", VCPU_EVENT(vcpu, 4, "inject: external call source-cpu:%u",
src_id); src_id);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_EXTERNAL_CALL, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_EXTERNAL_CALL,
...@@ -1351,6 +1357,7 @@ static int __inject_set_prefix(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) ...@@ -1351,6 +1357,7 @@ static int __inject_set_prefix(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
struct kvm_s390_prefix_info *prefix = &li->irq.prefix; struct kvm_s390_prefix_info *prefix = &li->irq.prefix;
vcpu->stat.inject_set_prefix++;
VCPU_EVENT(vcpu, 3, "inject: set prefix to %x", VCPU_EVENT(vcpu, 3, "inject: set prefix to %x",
irq->u.prefix.address); irq->u.prefix.address);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_SIGP_SET_PREFIX, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_SIGP_SET_PREFIX,
...@@ -1371,6 +1378,7 @@ static int __inject_sigp_stop(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) ...@@ -1371,6 +1378,7 @@ static int __inject_sigp_stop(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
struct kvm_s390_stop_info *stop = &li->irq.stop; struct kvm_s390_stop_info *stop = &li->irq.stop;
int rc = 0; int rc = 0;
vcpu->stat.inject_stop_signal++;
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_SIGP_STOP, 0, 0); trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_SIGP_STOP, 0, 0);
if (irq->u.stop.flags & ~KVM_S390_STOP_SUPP_FLAGS) if (irq->u.stop.flags & ~KVM_S390_STOP_SUPP_FLAGS)
...@@ -1395,6 +1403,7 @@ static int __inject_sigp_restart(struct kvm_vcpu *vcpu, ...@@ -1395,6 +1403,7 @@ static int __inject_sigp_restart(struct kvm_vcpu *vcpu,
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
vcpu->stat.inject_restart++;
VCPU_EVENT(vcpu, 3, "%s", "inject: restart int"); VCPU_EVENT(vcpu, 3, "%s", "inject: restart int");
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_RESTART, 0, 0); trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_RESTART, 0, 0);
...@@ -1407,6 +1416,7 @@ static int __inject_sigp_emergency(struct kvm_vcpu *vcpu, ...@@ -1407,6 +1416,7 @@ static int __inject_sigp_emergency(struct kvm_vcpu *vcpu,
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
vcpu->stat.inject_emergency_signal++;
VCPU_EVENT(vcpu, 4, "inject: emergency from cpu %u", VCPU_EVENT(vcpu, 4, "inject: emergency from cpu %u",
irq->u.emerg.code); irq->u.emerg.code);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_EMERGENCY, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_EMERGENCY,
...@@ -1427,6 +1437,7 @@ static int __inject_mchk(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq) ...@@ -1427,6 +1437,7 @@ static int __inject_mchk(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
struct kvm_s390_mchk_info *mchk = &li->irq.mchk; struct kvm_s390_mchk_info *mchk = &li->irq.mchk;
vcpu->stat.inject_mchk++;
VCPU_EVENT(vcpu, 3, "inject: machine check mcic 0x%llx", VCPU_EVENT(vcpu, 3, "inject: machine check mcic 0x%llx",
irq->u.mchk.mcic); irq->u.mchk.mcic);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_MCHK, 0, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_MCHK, 0,
...@@ -1457,6 +1468,7 @@ static int __inject_ckc(struct kvm_vcpu *vcpu) ...@@ -1457,6 +1468,7 @@ static int __inject_ckc(struct kvm_vcpu *vcpu)
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
vcpu->stat.inject_ckc++;
VCPU_EVENT(vcpu, 3, "%s", "inject: clock comparator external"); VCPU_EVENT(vcpu, 3, "%s", "inject: clock comparator external");
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP,
0, 0); 0, 0);
...@@ -1470,6 +1482,7 @@ static int __inject_cpu_timer(struct kvm_vcpu *vcpu) ...@@ -1470,6 +1482,7 @@ static int __inject_cpu_timer(struct kvm_vcpu *vcpu)
{ {
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
vcpu->stat.inject_cputm++;
VCPU_EVENT(vcpu, 3, "%s", "inject: cpu timer external"); VCPU_EVENT(vcpu, 3, "%s", "inject: cpu timer external");
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER, trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER,
0, 0); 0, 0);
...@@ -1596,6 +1609,7 @@ static int __inject_service(struct kvm *kvm, ...@@ -1596,6 +1609,7 @@ static int __inject_service(struct kvm *kvm,
{ {
struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int; struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
kvm->stat.inject_service_signal++;
spin_lock(&fi->lock); spin_lock(&fi->lock);
fi->srv_signal.ext_params |= inti->ext.ext_params & SCCB_EVENT_PENDING; fi->srv_signal.ext_params |= inti->ext.ext_params & SCCB_EVENT_PENDING;
/* /*
...@@ -1621,6 +1635,7 @@ static int __inject_virtio(struct kvm *kvm, ...@@ -1621,6 +1635,7 @@ static int __inject_virtio(struct kvm *kvm,
{ {
struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int; struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
kvm->stat.inject_virtio++;
spin_lock(&fi->lock); spin_lock(&fi->lock);
if (fi->counters[FIRQ_CNTR_VIRTIO] >= KVM_S390_MAX_VIRTIO_IRQS) { if (fi->counters[FIRQ_CNTR_VIRTIO] >= KVM_S390_MAX_VIRTIO_IRQS) {
spin_unlock(&fi->lock); spin_unlock(&fi->lock);
...@@ -1638,6 +1653,7 @@ static int __inject_pfault_done(struct kvm *kvm, ...@@ -1638,6 +1653,7 @@ static int __inject_pfault_done(struct kvm *kvm,
{ {
struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int; struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
kvm->stat.inject_pfault_done++;
spin_lock(&fi->lock); spin_lock(&fi->lock);
if (fi->counters[FIRQ_CNTR_PFAULT] >= if (fi->counters[FIRQ_CNTR_PFAULT] >=
(ASYNC_PF_PER_VCPU * KVM_MAX_VCPUS)) { (ASYNC_PF_PER_VCPU * KVM_MAX_VCPUS)) {
...@@ -1657,6 +1673,7 @@ static int __inject_float_mchk(struct kvm *kvm, ...@@ -1657,6 +1673,7 @@ static int __inject_float_mchk(struct kvm *kvm,
{ {
struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int; struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
kvm->stat.inject_float_mchk++;
spin_lock(&fi->lock); spin_lock(&fi->lock);
fi->mchk.cr14 |= inti->mchk.cr14 & (1UL << CR_PENDING_SUBCLASS); fi->mchk.cr14 |= inti->mchk.cr14 & (1UL << CR_PENDING_SUBCLASS);
fi->mchk.mcic |= inti->mchk.mcic; fi->mchk.mcic |= inti->mchk.mcic;
...@@ -1672,6 +1689,7 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti) ...@@ -1672,6 +1689,7 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
struct list_head *list; struct list_head *list;
int isc; int isc;
kvm->stat.inject_io++;
isc = int_word_to_isc(inti->io.io_int_word); isc = int_word_to_isc(inti->io.io_int_word);
if (kvm->arch.gisa && inti->type & KVM_S390_INT_IO_AI_MASK) { if (kvm->arch.gisa && inti->type & KVM_S390_INT_IO_AI_MASK) {
......
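The injection paths above each gain a dedicated counter: vCPU-local events go into struct kvm_vcpu_stat, while floating (VM-wide) events such as inject_io and inject_virtio go into struct kvm_vm_stat, which is why kvm-s390.c (next file) grows a VM_STAT() companion to VCPU_STAT(). A reduced sketch of the pattern with a hypothetical counter name, assuming the structures shown in these hunks:

/*
 * Pattern sketch with an invented counter "inject_example", not part of
 * the patch: add a u64 field to struct kvm_vcpu_stat, bump it in the
 * injection path, and expose it via debugfs_entries[] in kvm-s390.c:
 *	{ "inject_example", VCPU_STAT(inject_example) },
 * VM-wide events would use a field in struct kvm_vm_stat and VM_STAT().
 */
static int __inject_example(struct kvm_vcpu *vcpu)
{
	vcpu->stat.inject_example++;	/* counted per vCPU, like inject_ckc */
	return 0;
}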
...@@ -57,6 +57,7 @@ ...@@ -57,6 +57,7 @@
(KVM_MAX_VCPUS + LOCAL_IRQS)) (KVM_MAX_VCPUS + LOCAL_IRQS))
#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
struct kvm_stats_debugfs_item debugfs_entries[] = { struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "userspace_handled", VCPU_STAT(exit_userspace) }, { "userspace_handled", VCPU_STAT(exit_userspace) },
...@@ -64,6 +65,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { ...@@ -64,6 +65,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "exit_validity", VCPU_STAT(exit_validity) }, { "exit_validity", VCPU_STAT(exit_validity) },
{ "exit_stop_request", VCPU_STAT(exit_stop_request) }, { "exit_stop_request", VCPU_STAT(exit_stop_request) },
{ "exit_external_request", VCPU_STAT(exit_external_request) }, { "exit_external_request", VCPU_STAT(exit_external_request) },
{ "exit_io_request", VCPU_STAT(exit_io_request) },
{ "exit_external_interrupt", VCPU_STAT(exit_external_interrupt) }, { "exit_external_interrupt", VCPU_STAT(exit_external_interrupt) },
{ "exit_instruction", VCPU_STAT(exit_instruction) }, { "exit_instruction", VCPU_STAT(exit_instruction) },
{ "exit_pei", VCPU_STAT(exit_pei) }, { "exit_pei", VCPU_STAT(exit_pei) },
...@@ -78,16 +80,34 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { ...@@ -78,16 +80,34 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "instruction_lctl", VCPU_STAT(instruction_lctl) }, { "instruction_lctl", VCPU_STAT(instruction_lctl) },
{ "instruction_stctl", VCPU_STAT(instruction_stctl) }, { "instruction_stctl", VCPU_STAT(instruction_stctl) },
{ "instruction_stctg", VCPU_STAT(instruction_stctg) }, { "instruction_stctg", VCPU_STAT(instruction_stctg) },
{ "deliver_ckc", VCPU_STAT(deliver_ckc) },
{ "deliver_cputm", VCPU_STAT(deliver_cputm) },
{ "deliver_emergency_signal", VCPU_STAT(deliver_emergency_signal) }, { "deliver_emergency_signal", VCPU_STAT(deliver_emergency_signal) },
{ "deliver_external_call", VCPU_STAT(deliver_external_call) }, { "deliver_external_call", VCPU_STAT(deliver_external_call) },
{ "deliver_service_signal", VCPU_STAT(deliver_service_signal) }, { "deliver_service_signal", VCPU_STAT(deliver_service_signal) },
{ "deliver_virtio_interrupt", VCPU_STAT(deliver_virtio_interrupt) }, { "deliver_virtio", VCPU_STAT(deliver_virtio) },
{ "deliver_stop_signal", VCPU_STAT(deliver_stop_signal) }, { "deliver_stop_signal", VCPU_STAT(deliver_stop_signal) },
{ "deliver_prefix_signal", VCPU_STAT(deliver_prefix_signal) }, { "deliver_prefix_signal", VCPU_STAT(deliver_prefix_signal) },
{ "deliver_restart_signal", VCPU_STAT(deliver_restart_signal) }, { "deliver_restart_signal", VCPU_STAT(deliver_restart_signal) },
{ "deliver_program_interruption", VCPU_STAT(deliver_program_int) }, { "deliver_program", VCPU_STAT(deliver_program) },
{ "deliver_io_interrupt", VCPU_STAT(deliver_io_int) }, { "deliver_io", VCPU_STAT(deliver_io) },
{ "deliver_machine_check", VCPU_STAT(deliver_machine_check) },
{ "exit_wait_state", VCPU_STAT(exit_wait_state) }, { "exit_wait_state", VCPU_STAT(exit_wait_state) },
{ "inject_ckc", VCPU_STAT(inject_ckc) },
{ "inject_cputm", VCPU_STAT(inject_cputm) },
{ "inject_external_call", VCPU_STAT(inject_external_call) },
{ "inject_float_mchk", VM_STAT(inject_float_mchk) },
{ "inject_emergency_signal", VCPU_STAT(inject_emergency_signal) },
{ "inject_io", VM_STAT(inject_io) },
{ "inject_mchk", VCPU_STAT(inject_mchk) },
{ "inject_pfault_done", VM_STAT(inject_pfault_done) },
{ "inject_program", VCPU_STAT(inject_program) },
{ "inject_restart", VCPU_STAT(inject_restart) },
{ "inject_service_signal", VM_STAT(inject_service_signal) },
{ "inject_set_prefix", VCPU_STAT(inject_set_prefix) },
{ "inject_stop_signal", VCPU_STAT(inject_stop_signal) },
{ "inject_pfault_init", VCPU_STAT(inject_pfault_init) },
{ "inject_virtio", VM_STAT(inject_virtio) },
{ "instruction_epsw", VCPU_STAT(instruction_epsw) }, { "instruction_epsw", VCPU_STAT(instruction_epsw) },
{ "instruction_gs", VCPU_STAT(instruction_gs) }, { "instruction_gs", VCPU_STAT(instruction_gs) },
{ "instruction_io_other", VCPU_STAT(instruction_io_other) }, { "instruction_io_other", VCPU_STAT(instruction_io_other) },
...@@ -152,13 +172,33 @@ static int nested; ...@@ -152,13 +172,33 @@ static int nested;
module_param(nested, int, S_IRUGO); module_param(nested, int, S_IRUGO);
MODULE_PARM_DESC(nested, "Nested virtualization support"); MODULE_PARM_DESC(nested, "Nested virtualization support");
/* upper facilities limit for kvm */
unsigned long kvm_s390_fac_list_mask[16] = { FACILITIES_KVM };
unsigned long kvm_s390_fac_list_mask_size(void) /*
* For now we handle at most 16 double words as this is what the s390 base
* kernel handles and stores in the prefix page. If we ever need to go beyond
* this, this requires changes to code, but the external uapi can stay.
*/
#define SIZE_INTERNAL 16
/*
* Base feature mask that defines default mask for facilities. Consists of the
* defines in FACILITIES_KVM and the non-hypervisor managed bits.
*/
static unsigned long kvm_s390_fac_base[SIZE_INTERNAL] = { FACILITIES_KVM };
/*
* Extended feature mask. Consists of the defines in FACILITIES_KVM_CPUMODEL
* and defines the facilities that can be enabled via a cpu model.
*/
static unsigned long kvm_s390_fac_ext[SIZE_INTERNAL] = { FACILITIES_KVM_CPUMODEL };
static unsigned long kvm_s390_fac_size(void)
{ {
BUILD_BUG_ON(ARRAY_SIZE(kvm_s390_fac_list_mask) > S390_ARCH_FAC_MASK_SIZE_U64); BUILD_BUG_ON(SIZE_INTERNAL > S390_ARCH_FAC_MASK_SIZE_U64);
return ARRAY_SIZE(kvm_s390_fac_list_mask); BUILD_BUG_ON(SIZE_INTERNAL > S390_ARCH_FAC_LIST_SIZE_U64);
BUILD_BUG_ON(SIZE_INTERNAL * sizeof(unsigned long) >
sizeof(S390_lowcore.stfle_fac_list));
return SIZE_INTERNAL;
} }
/* available cpu features supported by kvm */ /* available cpu features supported by kvm */
...@@ -679,6 +719,8 @@ static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *att ...@@ -679,6 +719,8 @@ static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *att
mutex_lock(&kvm->lock); mutex_lock(&kvm->lock);
if (!kvm->created_vcpus) { if (!kvm->created_vcpus) {
kvm->arch.use_cmma = 1; kvm->arch.use_cmma = 1;
/* Not compatible with cmma. */
kvm->arch.use_pfmfi = 0;
ret = 0; ret = 0;
} }
mutex_unlock(&kvm->lock); mutex_unlock(&kvm->lock);
...@@ -1583,7 +1625,7 @@ static int kvm_s390_get_cmma_bits(struct kvm *kvm, ...@@ -1583,7 +1625,7 @@ static int kvm_s390_get_cmma_bits(struct kvm *kvm,
return -EINVAL; return -EINVAL;
/* CMMA is disabled or was not used, or the buffer has length zero */ /* CMMA is disabled or was not used, or the buffer has length zero */
bufsize = min(args->count, KVM_S390_CMMA_SIZE_MAX); bufsize = min(args->count, KVM_S390_CMMA_SIZE_MAX);
if (!bufsize || !kvm->mm->context.use_cmma) { if (!bufsize || !kvm->mm->context.uses_cmm) {
memset(args, 0, sizeof(*args)); memset(args, 0, sizeof(*args));
return 0; return 0;
} }
...@@ -1660,7 +1702,7 @@ static int kvm_s390_get_cmma_bits(struct kvm *kvm, ...@@ -1660,7 +1702,7 @@ static int kvm_s390_get_cmma_bits(struct kvm *kvm,
/* /*
* This function sets the CMMA attributes for the given pages. If the input * This function sets the CMMA attributes for the given pages. If the input
* buffer has zero length, no action is taken, otherwise the attributes are * buffer has zero length, no action is taken, otherwise the attributes are
* set and the mm->context.use_cmma flag is set. * set and the mm->context.uses_cmm flag is set.
*/ */
static int kvm_s390_set_cmma_bits(struct kvm *kvm, static int kvm_s390_set_cmma_bits(struct kvm *kvm,
const struct kvm_s390_cmma_log *args) const struct kvm_s390_cmma_log *args)
...@@ -1710,9 +1752,9 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm, ...@@ -1710,9 +1752,9 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
srcu_read_unlock(&kvm->srcu, srcu_idx); srcu_read_unlock(&kvm->srcu, srcu_idx);
up_read(&kvm->mm->mmap_sem); up_read(&kvm->mm->mmap_sem);
if (!kvm->mm->context.use_cmma) { if (!kvm->mm->context.uses_cmm) {
down_write(&kvm->mm->mmap_sem); down_write(&kvm->mm->mmap_sem);
kvm->mm->context.use_cmma = 1; kvm->mm->context.uses_cmm = 1;
up_write(&kvm->mm->mmap_sem); up_write(&kvm->mm->mmap_sem);
} }
out: out:
...@@ -1967,20 +2009,15 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) ...@@ -1967,20 +2009,15 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (!kvm->arch.sie_page2) if (!kvm->arch.sie_page2)
goto out_err; goto out_err;
/* Populate the facility mask initially. */
memcpy(kvm->arch.model.fac_mask, S390_lowcore.stfle_fac_list,
sizeof(S390_lowcore.stfle_fac_list));
for (i = 0; i < S390_ARCH_FAC_LIST_SIZE_U64; i++) {
if (i < kvm_s390_fac_list_mask_size())
kvm->arch.model.fac_mask[i] &= kvm_s390_fac_list_mask[i];
else
kvm->arch.model.fac_mask[i] = 0UL;
}
/* Populate the facility list initially. */
kvm->arch.model.fac_list = kvm->arch.sie_page2->fac_list; kvm->arch.model.fac_list = kvm->arch.sie_page2->fac_list;
memcpy(kvm->arch.model.fac_list, kvm->arch.model.fac_mask,
S390_ARCH_FAC_LIST_SIZE_BYTE); for (i = 0; i < kvm_s390_fac_size(); i++) {
kvm->arch.model.fac_mask[i] = S390_lowcore.stfle_fac_list[i] &
(kvm_s390_fac_base[i] |
kvm_s390_fac_ext[i]);
kvm->arch.model.fac_list[i] = S390_lowcore.stfle_fac_list[i] &
kvm_s390_fac_base[i];
}
/* we are always in czam mode - even on pre z14 machines */ /* we are always in czam mode - even on pre z14 machines */
set_kvm_facility(kvm->arch.model.fac_mask, 138); set_kvm_facility(kvm->arch.model.fac_mask, 138);
...@@ -2028,6 +2065,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) ...@@ -2028,6 +2065,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm->arch.css_support = 0; kvm->arch.css_support = 0;
kvm->arch.use_irqchip = 0; kvm->arch.use_irqchip = 0;
kvm->arch.use_pfmfi = sclp.has_pfmfi;
kvm->arch.epoch = 0; kvm->arch.epoch = 0;
spin_lock_init(&kvm->arch.start_stop_lock); spin_lock_init(&kvm->arch.start_stop_lock);
...@@ -2454,8 +2492,6 @@ int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu) ...@@ -2454,8 +2492,6 @@ int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->cbrlo = get_zeroed_page(GFP_KERNEL); vcpu->arch.sie_block->cbrlo = get_zeroed_page(GFP_KERNEL);
if (!vcpu->arch.sie_block->cbrlo) if (!vcpu->arch.sie_block->cbrlo)
return -ENOMEM; return -ENOMEM;
vcpu->arch.sie_block->ecb2 &= ~ECB2_PFMFI;
return 0; return 0;
} }
...@@ -2491,7 +2527,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) ...@@ -2491,7 +2527,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
if (test_kvm_facility(vcpu->kvm, 73)) if (test_kvm_facility(vcpu->kvm, 73))
vcpu->arch.sie_block->ecb |= ECB_TE; vcpu->arch.sie_block->ecb |= ECB_TE;
if (test_kvm_facility(vcpu->kvm, 8) && sclp.has_pfmfi) if (test_kvm_facility(vcpu->kvm, 8) && vcpu->kvm->arch.use_pfmfi)
vcpu->arch.sie_block->ecb2 |= ECB2_PFMFI; vcpu->arch.sie_block->ecb2 |= ECB2_PFMFI;
if (test_kvm_facility(vcpu->kvm, 130)) if (test_kvm_facility(vcpu->kvm, 130))
vcpu->arch.sie_block->ecb2 |= ECB2_IEP; vcpu->arch.sie_block->ecb2 |= ECB2_IEP;
...@@ -3023,7 +3059,7 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu) ...@@ -3023,7 +3059,7 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_START_MIGRATION, vcpu)) { if (kvm_check_request(KVM_REQ_START_MIGRATION, vcpu)) {
/* /*
* Disable CMMA virtualization; we will emulate the ESSA * Disable CMM virtualization; we will emulate the ESSA
* instruction manually, in order to provide additional * instruction manually, in order to provide additional
* functionalities needed for live migration. * functionalities needed for live migration.
*/ */
...@@ -3033,11 +3069,11 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu) ...@@ -3033,11 +3069,11 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_STOP_MIGRATION, vcpu)) { if (kvm_check_request(KVM_REQ_STOP_MIGRATION, vcpu)) {
/* /*
* Re-enable CMMA virtualization if CMMA is available and * Re-enable CMM virtualization if CMMA is available and
* was used. * CMM has been used.
*/ */
if ((vcpu->kvm->arch.use_cmma) && if ((vcpu->kvm->arch.use_cmma) &&
(vcpu->kvm->mm->context.use_cmma)) (vcpu->kvm->mm->context.uses_cmm))
vcpu->arch.sie_block->ecb2 |= ECB2_CMMA; vcpu->arch.sie_block->ecb2 |= ECB2_CMMA;
goto retry; goto retry;
} }
...@@ -4044,7 +4080,7 @@ static int __init kvm_s390_init(void) ...@@ -4044,7 +4080,7 @@ static int __init kvm_s390_init(void)
} }
for (i = 0; i < 16; i++) for (i = 0; i < 16; i++)
kvm_s390_fac_list_mask[i] |= kvm_s390_fac_base[i] |=
S390_lowcore.stfle_fac_list[i] & nonhyp_mask(i); S390_lowcore.stfle_fac_list[i] & nonhyp_mask(i);
return kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE); return kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
......
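After this rework the facility *mask* advertises everything KVM can virtualize on this host (base plus CPU-model-only facilities, ANDed with what STFLE reports), while the facility *list* a guest gets without a CPU model is restricted to the base set. A worked sketch of the per-doubleword arithmetic from kvm_arch_init_vm() above, with made-up example values (the helper is purely illustrative):

/*
 * Example values (invented):
 *   host STFLE word        = 0xff00
 *   kvm_s390_fac_base word = 0x0f00  (default facilities)
 *   kvm_s390_fac_ext  word = 0xf000  (cpu-model-only facilities)
 * giving fac_mask = 0xff00 = host & (base | ext)   -- what *can* be enabled
 * and    fac_list = 0x0f00 = host & base           -- what a guest gets by default
 */
static void fac_words_example(unsigned long host_word,
			      unsigned long base_word, unsigned long ext_word,
			      unsigned long *mask_word, unsigned long *list_word)
{
	*mask_word = host_word & (base_word | ext_word);
	*list_word = host_word & base_word;
}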
...@@ -294,8 +294,6 @@ void exit_sie(struct kvm_vcpu *vcpu); ...@@ -294,8 +294,6 @@ void exit_sie(struct kvm_vcpu *vcpu);
void kvm_s390_sync_request(int req, struct kvm_vcpu *vcpu); void kvm_s390_sync_request(int req, struct kvm_vcpu *vcpu);
int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu); int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu);
void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu);
unsigned long kvm_s390_fac_list_mask_size(void);
extern unsigned long kvm_s390_fac_list_mask[];
void kvm_s390_set_cpu_timer(struct kvm_vcpu *vcpu, __u64 cputm); void kvm_s390_set_cpu_timer(struct kvm_vcpu *vcpu, __u64 cputm);
__u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu); __u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu);
......
...@@ -1078,9 +1078,9 @@ static int handle_essa(struct kvm_vcpu *vcpu) ...@@ -1078,9 +1078,9 @@ static int handle_essa(struct kvm_vcpu *vcpu)
* value really needs to be written to; if the value is * value really needs to be written to; if the value is
* already correct, we do nothing and avoid the lock. * already correct, we do nothing and avoid the lock.
*/ */
if (vcpu->kvm->mm->context.use_cmma == 0) { if (vcpu->kvm->mm->context.uses_cmm == 0) {
down_write(&vcpu->kvm->mm->mmap_sem); down_write(&vcpu->kvm->mm->mmap_sem);
vcpu->kvm->mm->context.use_cmma = 1; vcpu->kvm->mm->context.uses_cmm = 1;
up_write(&vcpu->kvm->mm->mmap_sem); up_write(&vcpu->kvm->mm->mmap_sem);
} }
/* /*
......
...@@ -62,6 +62,13 @@ static struct facility_def facility_defs[] = { ...@@ -62,6 +62,13 @@ static struct facility_def facility_defs[] = {
} }
}, },
{ {
/*
* FACILITIES_KVM contains the list of facilities that are part
* of the default facility mask and list that are passed to the
* initial CPU model. If no CPU model is used, this, together
* with the non-hypervisor managed bits, is the maximum list of
* guest facilities supported by KVM.
*/
.name = "FACILITIES_KVM", .name = "FACILITIES_KVM",
.bits = (int[]){ .bits = (int[]){
0, /* N3 instructions */ 0, /* N3 instructions */
...@@ -89,6 +96,19 @@ static struct facility_def facility_defs[] = { ...@@ -89,6 +96,19 @@ static struct facility_def facility_defs[] = {
-1 /* END */ -1 /* END */
} }
}, },
{
/*
* FACILITIES_KVM_CPUMODEL contains the list of facilities
* that can be enabled by CPU model code if the host supports
* it. These facilities are not passed to the guest without
* CPU model support.
*/
.name = "FACILITIES_KVM_CPUMODEL",
.bits = (int[]){
-1 /* END */
}
},
}; };
static void print_facility_list(struct facility_def *def) static void print_facility_list(struct facility_def *def)
......
...@@ -88,6 +88,7 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1, ...@@ -88,6 +88,7 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
#ifdef CONFIG_KVM_GUEST #ifdef CONFIG_KVM_GUEST
bool kvm_para_available(void); bool kvm_para_available(void);
unsigned int kvm_arch_para_features(void); unsigned int kvm_arch_para_features(void);
unsigned int kvm_arch_para_hints(void);
void kvm_async_pf_task_wait(u32 token, int interrupt_kernel); void kvm_async_pf_task_wait(u32 token, int interrupt_kernel);
void kvm_async_pf_task_wake(u32 token); void kvm_async_pf_task_wake(u32 token);
u32 kvm_read_and_reset_pf_reason(void); u32 kvm_read_and_reset_pf_reason(void);
...@@ -115,6 +116,11 @@ static inline unsigned int kvm_arch_para_features(void) ...@@ -115,6 +116,11 @@ static inline unsigned int kvm_arch_para_features(void)
return 0; return 0;
} }
static inline unsigned int kvm_arch_para_hints(void)
{
return 0;
}
static inline u32 kvm_read_and_reset_pf_reason(void) static inline u32 kvm_read_and_reset_pf_reason(void)
{ {
return 0; return 0;
......
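kvm_arch_para_hints() gives the guest a second CPUID-backed word, separate from kvm_arch_para_features(), intended for scheduling hints rather than feature bits. A hedged sketch of guest-side use; KVM_HINTS_DEDICATED is assumed here purely as an example bit name and the real names live in the uapi kvm_para.h of the tree:

/*
 * Sketch only: consult the new hints word from guest code.  The bit
 * name is an assumption for illustration.
 */
static bool guest_has_dedicated_pcpus(void)
{
	if (!kvm_para_available())
		return false;
	return kvm_arch_para_hints() & (1u << KVM_HINTS_DEDICATED);
}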
...@@ -60,7 +60,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area { ...@@ -60,7 +60,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
u32 intercept_dr; u32 intercept_dr;
u32 intercept_exceptions; u32 intercept_exceptions;
u64 intercept; u64 intercept;
u8 reserved_1[42]; u8 reserved_1[40];
u16 pause_filter_thresh;
u16 pause_filter_count; u16 pause_filter_count;
u64 iopm_base_pa; u64 iopm_base_pa;
u64 msrpm_base_pa; u64 msrpm_base_pa;
......
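The VMCB layout change carves pause_filter_thresh out of reserved_1, so the structure size is unchanged (40 reserved bytes plus two u16 fields where 42 reserved bytes plus one u16 used to be). A hedged sketch of how the two fields relate when PAUSE-loop exiting is programmed; the numeric defaults are invented for illustration and enabling the PAUSE intercept itself is omitted:

/*
 * Illustration only (invented default values): roughly, the CPU
 * decrements pause_filter_count for each PAUSE that follows the
 * previous one within pause_filter_thresh cycles, reloads the count
 * otherwise, and takes a #VMEXIT when the count reaches zero.
 */
static void program_pause_filter(struct vmcb_control_area *control)
{
	control->pause_filter_thresh = 128;	/* "spinning" gap, in cycles   */
	control->pause_filter_count  = 3000;	/* PAUSEs tolerated per window */
}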
(Diffs for the remaining files in this merge are collapsed in the web view and not shown here.)