Commit 4e241557 authored by Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull first batch of KVM updates from Paolo Bonzini:
 "The bulk of the changes here is for x86.  And for once it's not for
  silicon that no one owns: these are really new features for everyone.

  Details:

   - ARM:
        several features are in progress but missed the 4.2 deadline.
        So here is just a smattering of bug fixes, plus enabling the
        VFIO integration.

   - s390:
        Some fixes/refactorings/optimizations, plus support for 2GB
        pages.

   - x86:
        * host and guest support for marking kvmclock as a stable
          scheduler clock.
        * support for write combining.
        * support for system management mode, needed for secure boot in
          guests.
        * a bunch of cleanups required for the above
        * support for virtualized performance counters on AMD
        * legacy PCI device assignment is deprecated and defaults to "n"
          in Kconfig; VFIO replaces it

        On top of this there are also bug fixes and eager FPU context
        loading for FPU-heavy guests.

   - Common code:
        Support for multiple address spaces; for now it is used only for
        x86 SMM but the s390 folks also have plans"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits)
  KVM: s390: clear floating interrupt bitmap and parameters
  KVM: x86/vPMU: Enable PMU handling for AMD PERFCTRn and EVNTSELn MSRs
  KVM: x86/vPMU: Implement AMD vPMU code for KVM
  KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch
  KVM: x86/vPMU: introduce kvm_pmu_msr_idx_to_pmc
  KVM: x86/vPMU: reorder PMU functions
  KVM: x86/vPMU: whitespace and stylistic adjustments in PMU code
  KVM: x86/vPMU: use the new macros to go between PMC, PMU and VCPU
  KVM: x86/vPMU: introduce pmu.h header
  KVM: x86/vPMU: rename a few PMU functions
  KVM: MTRR: do not map huge page for non-consistent range
  KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type
  KVM: MTRR: introduce mtrr_for_each_mem_type
  KVM: MTRR: introduce fixed_mtrr_addr_* functions
  KVM: MTRR: sort variable MTRRs
  KVM: MTRR: introduce var_mtrr_range
  KVM: MTRR: introduce fixed_mtrr_segment table
  KVM: MTRR: improve kvm_mtrr_get_guest_memory_type
  KVM: MTRR: do not split 64 bits MSR content
  KVM: MTRR: clean up mtrr default type
  ...
@@ -254,6 +254,11 @@ since the last call to this ioctl. Bit 0 is the first page in the
memory slot. Ensure the entire structure is cleared to avoid padding
issues.
If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 specifies
the address space for which you want to return the dirty bitmap.
They must be less than the value that KVM_CHECK_EXTENSION returns for
the KVM_CAP_MULTI_ADDRESS_SPACE capability.
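
As a rough illustration (not part of the patch), userspace could request the
dirty bitmap of a secondary address space along these lines; the helper name,
the file descriptors and the assumption that address space 1 is the x86 SMM
space are ours:

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	/* Hypothetical helper: vm_fd is the VM file descriptor, bitmap a
	 * user-allocated buffer sized for the slot.  Address space 1 is
	 * assumed to be the x86 SMM address space. */
	static int get_dirty_log_for_as(int vm_fd, int as_id, int slot_id,
					void *bitmap)
	{
		struct kvm_dirty_log log = { 0 };	/* clear to avoid padding issues */

		log.slot = (as_id << 16) | slot_id;	/* bits 16-31 select the address space */
		log.dirty_bitmap = bitmap;
		return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
	}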
4.9 KVM_SET_MEMORY_ALIAS

@@ -820,11 +825,21 @@ struct kvm_vcpu_events {
} nmi;
__u32 sipi_vector;
__u32 flags;
struct {
__u8 smm;
__u8 pending;
__u8 smm_inside_nmi;
__u8 latched_init;
} smi;
};

Only two fields are defined in the flags field:
- KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
interrupt.shadow contains a valid state.
- KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
smi contains a valid state.
4.32 KVM_SET_VCPU_EVENTS

@@ -841,17 +856,20 @@ vcpu.
See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending, sipi_vector, smi.smm,
smi.pending. Keep the corresponding bits in the flags field cleared to
suppress overwriting the current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector
KVM_VCPUEVENT_VALID_SMM - transfer the smi sub-struct.
If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.
KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.
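
A minimal sketch of driving this from userspace (illustrative only, not part
of the patch; the helper name and the idea of forcing smi.smm are ours):

	#include <linux/kvm.h>
	#include <string.h>
	#include <sys/ioctl.h>

	/* Read back the current events, mark the vcpu as being in SMM and
	 * write the state back.  Because only KVM_VCPUEVENT_VALID_SMM is set
	 * in flags, just the smi sub-struct is taken from userspace. */
	static int force_vcpu_into_smm(int vcpu_fd)
	{
		struct kvm_vcpu_events events;

		memset(&events, 0, sizeof(events));
		if (ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events) < 0)
			return -1;

		events.smi.smm = 1;			/* vcpu is in system management mode */
		events.flags = KVM_VCPUEVENT_VALID_SMM;
		return ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
	}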
4.33 KVM_GET_DEBUGREGS

@@ -911,6 +929,13 @@ slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified. It may not be
resized. Slots may not overlap in guest physical address space.
If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
specifies the address space which is being modified. They must be
less than the value that KVM_CHECK_EXTENSION returns for the
KVM_CAP_MULTI_ADDRESS_SPACE capability. Slots in separate address spaces
are unrelated; the restriction on overlapping slots only applies within
each address space.
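
For illustration (not part of the patch), registering a slot in a second
address space could look like the sketch below; the helper name and the
assumption that address space 1 is the x86 SMM space are ours:

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	/* Hypothetical helper: map host_mem at gpa in the given address
	 * space.  Bits 16-31 of the slot field carry the address space id;
	 * slots in different address spaces may overlap in guest physical
	 * address space. */
	static int map_region_in_as(int vm_fd, int as_id, int slot_id,
				    __u64 gpa, __u64 size, void *host_mem)
	{
		struct kvm_userspace_memory_region region = {
			.slot = (as_id << 16) | slot_id,
			.guest_phys_addr = gpa,
			.memory_size = size,
			.userspace_addr = (__u64)(unsigned long)host_mem,
		};

		return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
	}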
Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size. Any object may back this memory, including

@@ -959,7 +984,8 @@ documentation when it pops into existence).

4.37 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP, KVM_CAP_ENABLE_CAP_VM
Architectures: x86 (only KVM_CAP_ENABLE_CAP_VM),
               mips (only KVM_CAP_ENABLE_CAP), ppc, s390
Type: vcpu ioctl, vm ioctl (with KVM_CAP_ENABLE_CAP_VM)
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error
@@ -1268,7 +1294,7 @@ The flags bitmap is defined as:
/* the host supports the ePAPR idle hcall
#define KVM_PPC_PVINFO_FLAGS_EV_IDLE (1<<0)

4.48 KVM_ASSIGN_PCI_DEVICE (deprecated)

Capability: none
Architectures: x86
@@ -1318,7 +1344,7 @@ Errors:
have their standard meanings.

4.49 KVM_DEASSIGN_PCI_DEVICE (deprecated)

Capability: none
Architectures: x86
@@ -1337,7 +1363,7 @@ Errors:
Other error conditions may be defined by individual device types or
have their standard meanings.

4.50 KVM_ASSIGN_DEV_IRQ (deprecated)

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
@@ -1377,7 +1403,7 @@ Errors:
have their standard meanings.

4.51 KVM_DEASSIGN_DEV_IRQ (deprecated)

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
@@ -1451,7 +1477,7 @@ struct kvm_irq_routing_s390_adapter {
};

4.53 KVM_ASSIGN_SET_MSIX_NR (deprecated)

Capability: none
Architectures: x86
@@ -1473,7 +1499,7 @@ struct kvm_assigned_msix_nr {
#define KVM_MAX_MSIX_PER_DEV 256

4.54 KVM_ASSIGN_SET_MSIX_ENTRY (deprecated)

Capability: none
Architectures: x86
@@ -1629,7 +1655,7 @@ should skip processing the bitmap and just invalidate everything. It must
be set to the number of set bits in the bitmap.

4.61 KVM_ASSIGN_SET_INTX_MASK (deprecated)

Capability: KVM_CAP_PCI_2_3
Architectures: x86

@@ -2978,6 +3004,16 @@ len must be a multiple of sizeof(struct kvm_s390_irq). It must be > 0
and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
which is the maximum number of possibly pending cpu-local interrupts.
4.90 KVM_SMI
Capability: KVM_CAP_X86_SMM
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Queues an SMI on the thread's vcpu.
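
Usage is a plain parameterless vcpu ioctl; a hedged sketch (the probing note
and helper name are ours):

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	/* Queue a system management interrupt on the vcpu.  Availability
	 * should be checked first with KVM_CHECK_EXTENSION(KVM_CAP_X86_SMM). */
	static int inject_smi(int vcpu_fd)
	{
		return ioctl(vcpu_fd, KVM_SMI, 0);
	}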
5. The kvm_run structure
------------------------

@@ -3013,7 +3049,12 @@ an interrupt can be injected now with KVM_INTERRUPT.
The value of the current interrupt flag. Only valid if in-kernel
local APIC is not used.

__u16 flags;
More architecture-specific flags detailing state of the VCPU that may
affect the device's behavior. The only currently defined flag is
KVM_RUN_X86_SMM, which is valid on x86 machines and is set if the
VCPU is in system management mode.
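
A device model might consult the flag after KVM_RUN returns, for example to
restrict access to SMRAM-backed regions; a minimal sketch (the two exit
handlers are hypothetical placeholders for the caller's own code):

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	void handle_exit(struct kvm_run *run);		/* hypothetical, user supplied */
	void handle_exit_in_smm(struct kvm_run *run);	/* hypothetical, user supplied */

	/* run is the mmap'ed struct kvm_run of this vcpu */
	static void run_once(int vcpu_fd, struct kvm_run *run)
	{
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
			return;
		if (run->flags & KVM_RUN_X86_SMM)	/* vcpu exited while in SMM */
			handle_exit_in_smm(run);
		else
			handle_exit(run);
	}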
/* in (pre_kvm_run), out (post_kvm_run) */
__u64 cr8;
...

@@ -173,6 +173,12 @@ Shadow pages contain the following information:
Contains the value of cr4.smap && !cr0.wp for which the page is valid
(pages for which this is true are different from other pages; see the
treatment of cr0.wp=0 below).
role.smm:
Is 1 if the page is valid in system management mode. This field
determines which of the kvm_memslots array was used to build this
shadow page; it is also used to go back from a struct kvm_mmu_page
to a memslot, through the kvm_memslots_for_spte_role macro and
__gfn_to_memslot.
gfn:
Either the guest page table containing the translations shadowed by this
page, or the base page frame for linear translations. See role.direct.
...

@@ -28,6 +28,7 @@ config KVM
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select SRCU
select MMU_NOTIFIER
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
...

@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)
KVM := ../../../virt/kvm
kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o
obj-y += kvm-arm.o init.o interrupts.o
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
...

@@ -171,7 +171,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
int r;
switch (ext) {
case KVM_CAP_IRQCHIP:
case KVM_CAP_IRQFD:
case KVM_CAP_IOEVENTFD:
case KVM_CAP_DEVICE_CTRL:
case KVM_CAP_USER_MEMORY:

@@ -532,6 +531,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
kvm_vgic_flush_hwstate(vcpu);
kvm_timer_flush_hwstate(vcpu);
preempt_disable();
local_irq_disable();
/*

@@ -544,6 +544,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
local_irq_enable();
preempt_enable();
kvm_timer_sync_hwstate(vcpu);
kvm_vgic_sync_hwstate(vcpu);
continue;

@@ -553,14 +554,16 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
* Enter the guest
*/
trace_kvm_entry(*vcpu_pc(vcpu));
__kvm_guest_enter();
vcpu->mode = IN_GUEST_MODE;
ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
vcpu->mode = OUTSIDE_GUEST_MODE;
/*
* Back from guest
*************************************************************/
/*
* We may have taken a host interrupt in HYP mode (ie
* while executing the guest). This interrupt is still

@@ -574,8 +577,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
local_irq_enable();
/*
* We do local_irq_enable() before calling kvm_guest_exit() so
* that if a timer interrupt hits while running the guest we
* account that tick as being spent in the guest. We enable
* preemption after calling kvm_guest_exit() so that if we get
* preempted we make sure ticks after that is not counted as
* guest time.
*/
kvm_guest_exit();
trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
preempt_enable();
kvm_timer_sync_hwstate(vcpu);
kvm_vgic_sync_hwstate(vcpu);
...
@@ -170,13 +170,9 @@ __kvm_vcpu_return:
@ Don't trap coprocessor accesses for host kernel
set_hstr vmexit
set_hdcr vmexit
set_hcptr vmexit, (HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11)), after_vfp_restore
#ifdef CONFIG_VFPv3
@ Save floating point registers we if let guest use them.
tst r2, #(HCPTR_TCP(10) | HCPTR_TCP(11))
bne after_vfp_restore
@ Switch VFP/NEON hardware state to the host's
add r7, vcpu, #VCPU_VFP_GUEST
store_vfp_state r7

@@ -188,6 +184,8 @@ after_vfp_restore:
@ Restore FPEXC_EN which we clobbered on entry
pop {r2}
VFPFMXR FPEXC, r2
#else
after_vfp_restore:
#endif
@ Reset Hyp-role

@@ -483,7 +481,7 @@ switch_to_guest_vfp:
push {r3-r7}
@ NEON/VFP used. Turn on VFP access.
set_hcptr vmtrap, (HCPTR_TCP(10) | HCPTR_TCP(11))
@ Switch VFP/NEON hardware state to the guest's
add r7, r0, #VCPU_VFP_HOST
...
@@ -412,7 +412,6 @@ vcpu .req r0 @ vcpu pointer always in r0
add r11, vcpu, #VCPU_VGIC_CPU
/* Save all interesting registers */
ldr r3, [r2, #GICH_HCR]
ldr r4, [r2, #GICH_VMCR]
ldr r5, [r2, #GICH_MISR]
ldr r6, [r2, #GICH_EISR0]

@@ -420,7 +419,6 @@ vcpu .req r0 @ vcpu pointer always in r0
ldr r8, [r2, #GICH_ELRSR0]
ldr r9, [r2, #GICH_ELRSR1]
ldr r10, [r2, #GICH_APR]
ARM_BE8(rev r3, r3 )
ARM_BE8(rev r4, r4 )
ARM_BE8(rev r5, r5 )
ARM_BE8(rev r6, r6 )

@@ -429,7 +427,6 @@ ARM_BE8(rev r8, r8 )
ARM_BE8(rev r9, r9 )
ARM_BE8(rev r10, r10 )
str r3, [r11, #VGIC_V2_CPU_HCR]
str r4, [r11, #VGIC_V2_CPU_VMCR]
str r5, [r11, #VGIC_V2_CPU_MISR]
#ifdef CONFIG_CPU_ENDIAN_BE8

@@ -591,8 +588,13 @@ ARM_BE8(rev r6, r6 )
.endm

/* Configures the HCPTR (Hyp Coprocessor Trap Register) on entry/return
* (hardware reset value is 0). Keep previous value in r2.
* An ISB is emited on vmexit/vmtrap, but executed on vmexit only if
* VFP wasn't already enabled (always executed on vmtrap).
* If a label is specified with vmexit, it is branched to if VFP wasn't
* enabled.
*/
.macro set_hcptr operation, mask, label = none
mrc p15, 4, r2, c1, c1, 2
ldr r3, =\mask
.if \operation == vmentry

@@ -601,6 +603,17 @@ ARM_BE8(rev r6, r6 )
bic r3, r2, r3 @ Don't trap defined coproc-accesses
.endif
mcr p15, 4, r3, c1, c1, 2
.if \operation != vmentry
.if \operation == vmexit
tst r2, #(HCPTR_TCP(10) | HCPTR_TCP(11))
beq 1f
.endif
isb
.if \label != none
b \label
.endif
1:
.endif
.endm

/* Configures the HDCR (Hyp Debug Configuration Register) on entry/return
...

@@ -691,8 +691,8 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm)
* work. This is not used by the hardware and we have no
* alignment requirement for this allocation.
*/
pgd = kmalloc(PTRS_PER_S2_PGD * sizeof(pgd_t),
GFP_KERNEL | __GFP_ZERO);
if (!pgd) {
kvm_free_hwpgd(hwpgd);

@@ -1155,7 +1155,8 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
*/
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
{
struct kvm_memslots *slots = kvm_memslots(kvm);
struct kvm_memory_slot *memslot = id_to_memslot(slots, slot);
phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;

@@ -1718,8 +1719,9 @@ int kvm_mmu_init(void)
}

void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
/*

@@ -1733,7 +1735,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
{
hva_t hva = mem->userspace_addr;

@@ -1838,7 +1840,7 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
return 0;
}

void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots)
{
}
...
@@ -230,10 +230,6 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
case PSCI_0_2_FN64_AFFINITY_INFO:
val = kvm_psci_vcpu_affinity_info(vcpu);
break;
case PSCI_0_2_FN_MIGRATE:
case PSCI_0_2_FN64_MIGRATE:
val = PSCI_RET_NOT_SUPPORTED;
break;
case PSCI_0_2_FN_MIGRATE_INFO_TYPE:
/*
* Trusted OS is MP hence does not require migration

@@ -242,10 +238,6 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
*/
val = PSCI_0_2_TOS_MP;
break;
case PSCI_0_2_FN_MIGRATE_INFO_UP_CPU:
case PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
val = PSCI_RET_NOT_SUPPORTED;
break;
case PSCI_0_2_FN_SYSTEM_OFF:
kvm_psci_system_off(vcpu);
/*

@@ -271,7 +263,8 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
ret = 0;
break;
default:
val = PSCI_RET_NOT_SUPPORTED;
break;
}
*vcpu_reg(vcpu, 0) = val;
@@ -291,12 +284,9 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
case KVM_PSCI_FN_CPU_ON:
val = kvm_psci_vcpu_on(vcpu);
break;
default:
case KVM_PSCI_FN_MIGRATE:
val = PSCI_RET_NOT_SUPPORTED;
break;
default:
return -EINVAL;
}
*vcpu_reg(vcpu, 0) = val;
...
@@ -28,6 +28,7 @@ config KVM
select KVM_ARM_HOST
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select SRCU
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
---help---
...

@@ -11,7 +11,7 @@ ARM=../../../arch/arm/kvm
obj-$(CONFIG_KVM_ARM_HOST) += kvm.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/arm.o $(ARM)/mmu.o $(ARM)/mmio.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o
...
@@ -50,8 +50,8 @@
stp x29, lr, [x3, #80]
mrs x19, sp_el0
mrs x20, elr_el2 // pc before entering el2
mrs x21, spsr_el2 // pstate before entering el2
stp x19, x20, [x3, #96]
str x21, [x3, #112]

@@ -82,8 +82,8 @@
ldr x21, [x3, #16]
msr sp_el0, x19
msr elr_el2, x20 // pc on return from el2
msr spsr_el2, x21 // pstate on return from el2
add x3, x2, #CPU_XREG_OFFSET(19)
ldp x19, x20, [x3]
...
@@ -47,7 +47,6 @@ __save_vgic_v2_state:
add x3, x0, #VCPU_VGIC_CPU
/* Save all interesting registers */
ldr w4, [x2, #GICH_HCR]
ldr w5, [x2, #GICH_VMCR]
ldr w6, [x2, #GICH_MISR]
ldr w7, [x2, #GICH_EISR0]

@@ -55,7 +54,6 @@ __save_vgic_v2_state:
ldr w9, [x2, #GICH_ELRSR0]
ldr w10, [x2, #GICH_ELRSR1]
ldr w11, [x2, #GICH_APR]
CPU_BE( rev w4, w4 )
CPU_BE( rev w5, w5 )
CPU_BE( rev w6, w6 )
CPU_BE( rev w7, w7 )

@@ -64,7 +62,6 @@ CPU_BE( rev w9, w9 )
CPU_BE( rev w10, w10 )
CPU_BE( rev w11, w11 )
str w4, [x3, #VGIC_V2_CPU_HCR]
str w5, [x3, #VGIC_V2_CPU_VMCR]
str w6, [x3, #VGIC_V2_CPU_MISR]
CPU_LE( str w7, [x3, #VGIC_V2_CPU_EISR] )
...

@@ -48,13 +48,11 @@
dsb st
// Save all interesting registers
mrs_s x4, ICH_HCR_EL2
mrs_s x5, ICH_VMCR_EL2
mrs_s x6, ICH_MISR_EL2
mrs_s x7, ICH_EISR_EL2
mrs_s x8, ICH_ELSR_EL2
str w4, [x3, #VGIC_V3_CPU_HCR]
str w5, [x3, #VGIC_V3_CPU_VMCR]
str w6, [x3, #VGIC_V3_CPU_MISR]
str w7, [x3, #VGIC_V3_CPU_EISR]
...

@@ -839,7 +839,7 @@ static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_free_memslot(struct kvm *kvm,
struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {}
static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {}
static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot) {}
...
@@ -198,15 +198,16 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
{
return 0;
}

void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
unsigned long npages = 0;

@@ -393,7 +394,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
kvm_mips_deliver_interrupts(vcpu,
kvm_read_c0_guest_cause(vcpu->arch.cop0));
__kvm_guest_enter();
/* Disable hardware page table walking while in guest */
htw_stop();

@@ -403,7 +404,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
/* Re-enable HTW before enabling interrupts */
htw_start();
__kvm_guest_exit();
local_irq_enable();
if (vcpu->sigset_active)

@@ -968,6 +969,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl,
/* Get (and clear) the dirty memory log for a memory slot. */
int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
unsigned long ga, ga_end;
int is_dirty = 0;

@@ -982,7 +984,8 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
/* If nothing is dirty, don't bother messing with page tables. */
if (is_dirty) {
slots = kvm_memslots(kvm);
memslot = id_to_memslot(slots, log->slot);
ga = memslot->base_gfn << PAGE_SHIFT;
ga_end = ga + (memslot->npages << PAGE_SHIFT);
...
@@ -430,7 +430,7 @@ static inline void note_hpte_modification(struct kvm *kvm,
*/
static inline struct kvm_memslots *kvm_memslots_raw(struct kvm *kvm)
{
return rcu_dereference_raw_notrace(kvm->memslots[0]);
}

extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
...
@@ -698,7 +698,7 @@ struct kvm_vcpu_arch {
static inline void kvm_arch_hardware_disable(void) {}
static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {}
static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_exit(void) {}
...
@@ -182,10 +182,11 @@ extern int kvmppc_core_create_memslot(struct kvm *kvm,
unsigned long npages);
extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem);
extern void kvmppc_core_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new);
extern int kvm_vm_ioctl_get_smmu_info(struct kvm *kvm,
struct kvm_ppc_smmu_info *info);
extern void kvmppc_core_flush_memslot(struct kvm *kvm,

@@ -243,10 +244,11 @@ struct kvmppc_ops {
void (*flush_memslot)(struct kvm *kvm, struct kvm_memory_slot *memslot);
int (*prepare_memory_region)(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem);
void (*commit_memory_region)(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new);
int (*unmap_hva)(struct kvm *kvm, unsigned long hva);
int (*unmap_hva_range)(struct kvm *kvm, unsigned long start,
unsigned long end);
...
@@ -757,16 +757,17 @@ void kvmppc_core_flush_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem)
{
return kvm->arch.kvm_ops->prepare_memory_region(kvm, memslot, mem);
}

void kvmppc_core_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new)
{
kvm->arch.kvm_ops->commit_memory_region(kvm, mem, old, new);
}

int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
...

@@ -650,7 +650,7 @@ static void kvmppc_rmap_reset(struct kvm *kvm)
int srcu_idx;
srcu_idx = srcu_read_lock(&kvm->srcu);
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots) {
/*
* This assumes it is acceptable to lose reference and
...
@@ -2321,6 +2321,7 @@ static int kvm_vm_ioctl_get_smmu_info_hv(struct kvm *kvm,
static int kvm_vm_ioctl_get_dirty_log_hv(struct kvm *kvm,
struct kvm_dirty_log *log)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int r;
unsigned long n;

@@ -2331,7 +2332,8 @@ static int kvm_vm_ioctl_get_dirty_log_hv(struct kvm *kvm,
if (log->slot >= KVM_USER_MEM_SLOTS)
goto out;
slots = kvm_memslots(kvm);
memslot = id_to_memslot(slots, log->slot);
r = -ENOENT;
if (!memslot->dirty_bitmap)
goto out;

@@ -2374,16 +2376,18 @@ static int kvmppc_core_create_memslot_hv(struct kvm_memory_slot *slot,
static int kvmppc_core_prepare_memory_region_hv(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem)
{
return 0;
}

static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new)
{
unsigned long npages = mem->memory_size >> PAGE_SHIFT;
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
if (npages && old->npages) {

@@ -2393,7 +2397,8 @@ static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
* since the rmap array starts out as all zeroes,
* i.e. no pages are dirty.
*/
slots = kvm_memslots(kvm);
memslot = id_to_memslot(slots, mem->slot);
kvmppc_hv_get_dirty_log(kvm, memslot, NULL);
}
}
...
@@ -1530,6 +1530,7 @@ static int kvmppc_vcpu_run_pr(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
static int kvm_vm_ioctl_get_dirty_log_pr(struct kvm *kvm,
struct kvm_dirty_log *log)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
struct kvm_vcpu *vcpu;
ulong ga, ga_end;

@@ -1545,7 +1546,8 @@ static int kvm_vm_ioctl_get_dirty_log_pr(struct kvm *kvm,
/* If nothing is dirty, don't bother messing with page tables. */
if (is_dirty) {
slots = kvm_memslots(kvm);
memslot = id_to_memslot(slots, log->slot);
ga = memslot->base_gfn << PAGE_SHIFT;
ga_end = ga + (memslot->npages << PAGE_SHIFT);

@@ -1571,14 +1573,15 @@ static void kvmppc_core_flush_memslot_pr(struct kvm *kvm,
static int kvmppc_core_prepare_memory_region_pr(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem)
{
return 0;
}

static void kvmppc_core_commit_memory_region_pr(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new)
{
return;
}
...
@@ -1004,10 +1004,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
break;
}
local_irq_enable();
trace_kvm_exit(exit_nr, vcpu);
__kvm_guest_exit();
local_irq_enable();
run->exit_reason = KVM_EXIT_UNKNOWN;
run->ready_for_interrupt_injection = 1;

@@ -1784,14 +1784,15 @@ int kvmppc_core_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem)
{
return 0;
}

void kvmppc_core_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new)
{
}
...
@@ -115,7 +115,7 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
continue;
}
__kvm_guest_enter();
return 1;
}

@@ -595,18 +595,19 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
{
return kvmppc_core_prepare_memory_region(kvm, memslot, mem);
}

void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
kvmppc_core_commit_memory_region(kvm, mem, old, new);
}

void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
...
@@ -80,6 +80,7 @@ struct sca_block {
#define CPUSTAT_MCDS 0x00000100
#define CPUSTAT_SM 0x00000080
#define CPUSTAT_IBS 0x00000040
#define CPUSTAT_GED2 0x00000010
#define CPUSTAT_G 0x00000008
#define CPUSTAT_GED 0x00000004
#define CPUSTAT_J 0x00000002

@@ -95,7 +96,8 @@ struct kvm_s390_sie_block {
#define PROG_IN_SIE (1<<0)
__u32 prog0c; /* 0x000c */
__u8 reserved10[16]; /* 0x0010 */
#define PROG_BLOCK_SIE (1<<0)
#define PROG_REQUEST (1<<1)
atomic_t prog20; /* 0x0020 */
__u8 reserved24[4]; /* 0x0024 */
__u64 cputm; /* 0x0028 */

@@ -634,7 +636,7 @@ static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_free_memslot(struct kvm *kvm,
struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {}
static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {}
static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot) {}
...
@@ -1005,7 +1005,7 @@ ENTRY(sie64a)
.Lsie_gmap:
lg %r14,__SF_EMPTY(%r15) # get control block pointer
oi __SIE_PROG0C+3(%r14),1 # we are going into SIE now
tm __SIE_PROG20+3(%r14),3 # last exit...
jnz .Lsie_done
LPP __SF_EMPTY(%r15) # set guest id
sie 0(%r14)
...
@@ -241,21 +241,6 @@ static int handle_prog(struct kvm_vcpu *vcpu)
return kvm_s390_inject_prog_irq(vcpu, &pgm_info);
}
static int handle_instruction_and_prog(struct kvm_vcpu *vcpu)
{
int rc, rc2;
vcpu->stat.exit_instr_and_program++;
rc = handle_instruction(vcpu);
rc2 = handle_prog(vcpu);
if (rc == -EOPNOTSUPP)
vcpu->arch.sie_block->icptcode = 0x04;
if (rc)
return rc;
return rc2;
}
/**
* handle_external_interrupt - used for external interruption interceptions
*

@@ -355,7 +340,6 @@ static const intercept_handler_t intercept_funcs[] = {
[0x00 >> 2] = handle_noop,
[0x04 >> 2] = handle_instruction,
[0x08 >> 2] = handle_prog,
[0x0C >> 2] = handle_instruction_and_prog,
[0x10 >> 2] = handle_noop,
[0x14 >> 2] = handle_external_interrupt,
[0x18 >> 2] = handle_noop,
...

@@ -134,6 +134,8 @@ static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
active_mask = pending_local_irqs(vcpu);
active_mask |= pending_floating_irqs(vcpu);
if (!active_mask)
return 0;
if (psw_extint_disabled(vcpu))
active_mask &= ~IRQ_PEND_EXT_MASK;

@@ -941,12 +943,9 @@ int __must_check kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
if (cpu_timer_irq_pending(vcpu))
set_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs);
while ((irqs = deliverable_irqs(vcpu)) && !rc) {
irqs = deliverable_irqs(vcpu);
/* bits are in the order of interrupt priority */
irq_type = find_first_bit(&irqs, IRQ_PEND_COUNT);
if (irq_type == IRQ_PEND_COUNT)
break;
if (is_ioirq(irq_type)) {
rc = __deliver_io(vcpu, irq_type);
} else {

@@ -958,9 +957,7 @@ int __must_check kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
}
rc = func(vcpu);
}
}
if (rc)
break;
} while (!rc);

set_intercept_indicators(vcpu);
@@ -1061,7 +1058,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
if (sclp.has_sigpif)
return __inject_extcall_sigpif(vcpu, src_id);
if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs))
return -EBUSY;
*extcall = irq->u.extcall;
atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);
@@ -1340,12 +1337,54 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
return 0;
}

/*
* Find a destination VCPU for a floating irq and kick it.
*/
static void __floating_irq_kick(struct kvm *kvm, u64 type)
{
struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
struct kvm_s390_local_interrupt *li;
struct kvm_vcpu *dst_vcpu;
int sigcpu, online_vcpus, nr_tries = 0;
online_vcpus = atomic_read(&kvm->online_vcpus);
if (!online_vcpus)
return;
/* find idle VCPUs first, then round robin */
sigcpu = find_first_bit(fi->idle_mask, online_vcpus);
if (sigcpu == online_vcpus) {
do {
sigcpu = fi->next_rr_cpu;
fi->next_rr_cpu = (fi->next_rr_cpu + 1) % online_vcpus;
/* avoid endless loops if all vcpus are stopped */
if (nr_tries++ >= online_vcpus)
return;
} while (is_vcpu_stopped(kvm_get_vcpu(kvm, sigcpu)));
}
dst_vcpu = kvm_get_vcpu(kvm, sigcpu);
/* make the VCPU drop out of the SIE, or wake it up if sleeping */
li = &dst_vcpu->arch.local_int;
spin_lock(&li->lock);
switch (type) {
case KVM_S390_MCHK:
atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
break;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
atomic_set_mask(CPUSTAT_IO_INT, li->cpuflags);
break;
default:
atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);
break;
}
spin_unlock(&li->lock);
kvm_s390_vcpu_wakeup(dst_vcpu);
}
static int __inject_vm(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
{
struct kvm_s390_float_interrupt *fi;
struct kvm_vcpu *dst_vcpu = NULL;
int sigcpu;
u64 type = READ_ONCE(inti->type);
int rc;
@@ -1373,32 +1412,8 @@ static int __inject_vm(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
if (rc)
return rc;
__floating_irq_kick(kvm, type);
if (sigcpu == KVM_MAX_VCPUS) {
do {
sigcpu = fi->next_rr_cpu++;
if (sigcpu == KVM_MAX_VCPUS)
sigcpu = fi->next_rr_cpu = 0;
} while (kvm_get_vcpu(kvm, sigcpu) == NULL);
}
dst_vcpu = kvm_get_vcpu(kvm, sigcpu);
li = &dst_vcpu->arch.local_int;
spin_lock(&li->lock);
switch (type) {
case KVM_S390_MCHK:
atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
break;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
atomic_set_mask(CPUSTAT_IO_INT, li->cpuflags);
break;
default:
atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);
break;
}
spin_unlock(&li->lock);
kvm_s390_vcpu_wakeup(kvm_get_vcpu(kvm, sigcpu));
return 0;
}

int kvm_s390_inject_vm(struct kvm *kvm,

@@ -1606,6 +1621,9 @@ void kvm_s390_clear_float_irqs(struct kvm *kvm)
int i;
spin_lock(&fi->lock);
fi->pending_irqs = 0;
memset(&fi->srv_signal, 0, sizeof(fi->srv_signal));
memset(&fi->mchk, 0, sizeof(fi->mchk));
for (i = 0; i < FIRQ_LIST_COUNT; i++)
clear_irq_list(&fi->lists[i]);
for (i = 0; i < FIRQ_MAX_COUNT; i++)
...
@@ -36,6 +36,10 @@
#include "kvm-s390.h"
#include "gaccess.h"
#define KMSG_COMPONENT "kvm-s390"
#undef pr_fmt
#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
#define CREATE_TRACE_POINTS
#include "trace.h"
#include "trace-s390.h"

@@ -110,7 +114,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
/* upper facilities limit for kvm */
unsigned long kvm_s390_fac_list_mask[] = {
0xffe6fffbfcfdfc40UL,
0x005e800000000000UL,
};

unsigned long kvm_s390_fac_list_mask_size(void)

@@ -236,6 +240,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
{
int r;
unsigned long n;
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int is_dirty = 0;

@@ -245,7 +250,8 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
if (log->slot >= KVM_USER_MEM_SLOTS)
goto out;
slots = kvm_memslots(kvm);
memslot = id_to_memslot(slots, log->slot);
r = -ENOENT;
if (!memslot->dirty_bitmap)
goto out;

@@ -454,10 +460,10 @@ static int kvm_s390_set_tod_low(struct kvm *kvm, struct kvm_device_attr *attr)
mutex_lock(&kvm->lock);
kvm->arch.epoch = gtod - host_tod;
kvm_s390_vcpu_block_all(kvm);
kvm_for_each_vcpu(vcpu_idx, cur_vcpu, kvm)
cur_vcpu->arch.sie_block->epoch = kvm->arch.epoch;
kvm_s390_vcpu_unblock_all(kvm);
}
mutex_unlock(&kvm->lock); mutex_unlock(&kvm->lock);
return 0; return 0;
} }
...@@ -1311,8 +1317,13 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) ...@@ -1311,8 +1317,13 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH | atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH |
CPUSTAT_SM | CPUSTAT_SM |
CPUSTAT_STOPPED | CPUSTAT_STOPPED);
CPUSTAT_GED);
if (test_kvm_facility(vcpu->kvm, 78))
atomic_set_mask(CPUSTAT_GED2, &vcpu->arch.sie_block->cpuflags);
else if (test_kvm_facility(vcpu->kvm, 8))
atomic_set_mask(CPUSTAT_GED, &vcpu->arch.sie_block->cpuflags);
kvm_s390_vcpu_setup_model(vcpu); kvm_s390_vcpu_setup_model(vcpu);
vcpu->arch.sie_block->ecb = 6; vcpu->arch.sie_block->ecb = 6;
...@@ -1409,16 +1420,28 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu) ...@@ -1409,16 +1420,28 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_s390_vcpu_has_irq(vcpu, 0); return kvm_s390_vcpu_has_irq(vcpu, 0);
} }
void s390_vcpu_block(struct kvm_vcpu *vcpu) void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
{ {
atomic_set_mask(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20); atomic_set_mask(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
exit_sie(vcpu);
} }
void s390_vcpu_unblock(struct kvm_vcpu *vcpu) void kvm_s390_vcpu_unblock(struct kvm_vcpu *vcpu)
{ {
atomic_clear_mask(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20); atomic_clear_mask(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
} }
static void kvm_s390_vcpu_request(struct kvm_vcpu *vcpu)
{
atomic_set_mask(PROG_REQUEST, &vcpu->arch.sie_block->prog20);
exit_sie(vcpu);
}
static void kvm_s390_vcpu_request_handled(struct kvm_vcpu *vcpu)
{
atomic_clear_mask(PROG_REQUEST, &vcpu->arch.sie_block->prog20);
}
/* /*
* Kick a guest cpu out of SIE and wait until SIE is not running. * Kick a guest cpu out of SIE and wait until SIE is not running.
* If the CPU is not running (e.g. waiting as idle) the function will * If the CPU is not running (e.g. waiting as idle) the function will
...@@ -1430,11 +1453,11 @@ void exit_sie(struct kvm_vcpu *vcpu) ...@@ -1430,11 +1453,11 @@ void exit_sie(struct kvm_vcpu *vcpu)
cpu_relax(); cpu_relax();
} }
/* Kick a guest cpu out of SIE and prevent SIE-reentry */ /* Kick a guest cpu out of SIE to process a request synchronously */
void exit_sie_sync(struct kvm_vcpu *vcpu) void kvm_s390_sync_request(int req, struct kvm_vcpu *vcpu)
{ {
s390_vcpu_block(vcpu); kvm_make_request(req, vcpu);
exit_sie(vcpu); kvm_s390_vcpu_request(vcpu);
} }
static void kvm_gmap_notifier(struct gmap *gmap, unsigned long address) static void kvm_gmap_notifier(struct gmap *gmap, unsigned long address)
...@@ -1447,8 +1470,7 @@ static void kvm_gmap_notifier(struct gmap *gmap, unsigned long address) ...@@ -1447,8 +1470,7 @@ static void kvm_gmap_notifier(struct gmap *gmap, unsigned long address)
/* match against both prefix pages */ /* match against both prefix pages */
if (kvm_s390_get_prefix(vcpu) == (address & ~0x1000UL)) { if (kvm_s390_get_prefix(vcpu) == (address & ~0x1000UL)) {
VCPU_EVENT(vcpu, 2, "gmap notifier for %lx", address); VCPU_EVENT(vcpu, 2, "gmap notifier for %lx", address);
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu); kvm_s390_sync_request(KVM_REQ_MMU_RELOAD, vcpu);
exit_sie_sync(vcpu);
} }
} }
} }
...@@ -1720,8 +1742,10 @@ static bool ibs_enabled(struct kvm_vcpu *vcpu) ...@@ -1720,8 +1742,10 @@ static bool ibs_enabled(struct kvm_vcpu *vcpu)
static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu) static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
{ {
if (!vcpu->requests)
return 0;
retry: retry:
s390_vcpu_unblock(vcpu); kvm_s390_vcpu_request_handled(vcpu);
/* /*
* We use MMU_RELOAD just to re-arm the ipte notifier for the * We use MMU_RELOAD just to re-arm the ipte notifier for the
* guest prefix page. gmap_ipte_notify will wait on the ptl lock. * guest prefix page. gmap_ipte_notify will wait on the ptl lock.
...@@ -1993,12 +2017,14 @@ static int __vcpu_run(struct kvm_vcpu *vcpu) ...@@ -1993,12 +2017,14 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
* As PF_VCPU will be used in fault handler, between * As PF_VCPU will be used in fault handler, between
* guest_enter and guest_exit should be no uaccess. * guest_enter and guest_exit should be no uaccess.
*/ */
preempt_disable(); local_irq_disable();
kvm_guest_enter(); __kvm_guest_enter();
preempt_enable(); local_irq_enable();
exit_reason = sie64a(vcpu->arch.sie_block, exit_reason = sie64a(vcpu->arch.sie_block,
vcpu->run->s.regs.gprs); vcpu->run->s.regs.gprs);
kvm_guest_exit(); local_irq_disable();
__kvm_guest_exit();
local_irq_enable();
vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
rc = vcpu_post_run(vcpu, exit_reason); rc = vcpu_post_run(vcpu, exit_reason);
...@@ -2068,7 +2094,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) ...@@ -2068,7 +2094,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) { if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) {
kvm_s390_vcpu_start(vcpu); kvm_s390_vcpu_start(vcpu);
} else if (is_vcpu_stopped(vcpu)) { } else if (is_vcpu_stopped(vcpu)) {
pr_err_ratelimited("kvm-s390: can't run stopped vcpu %d\n", pr_err_ratelimited("can't run stopped vcpu %d\n",
vcpu->vcpu_id); vcpu->vcpu_id);
return -EINVAL; return -EINVAL;
} }
...@@ -2206,8 +2232,7 @@ int kvm_s390_vcpu_store_adtl_status(struct kvm_vcpu *vcpu, unsigned long addr) ...@@ -2206,8 +2232,7 @@ int kvm_s390_vcpu_store_adtl_status(struct kvm_vcpu *vcpu, unsigned long addr)
static void __disable_ibs_on_vcpu(struct kvm_vcpu *vcpu) static void __disable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
{ {
kvm_check_request(KVM_REQ_ENABLE_IBS, vcpu); kvm_check_request(KVM_REQ_ENABLE_IBS, vcpu);
kvm_make_request(KVM_REQ_DISABLE_IBS, vcpu); kvm_s390_sync_request(KVM_REQ_DISABLE_IBS, vcpu);
exit_sie_sync(vcpu);
} }
static void __disable_ibs_on_all_vcpus(struct kvm *kvm) static void __disable_ibs_on_all_vcpus(struct kvm *kvm)
...@@ -2223,8 +2248,7 @@ static void __disable_ibs_on_all_vcpus(struct kvm *kvm) ...@@ -2223,8 +2248,7 @@ static void __disable_ibs_on_all_vcpus(struct kvm *kvm)
static void __enable_ibs_on_vcpu(struct kvm_vcpu *vcpu) static void __enable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
{ {
kvm_check_request(KVM_REQ_DISABLE_IBS, vcpu); kvm_check_request(KVM_REQ_DISABLE_IBS, vcpu);
kvm_make_request(KVM_REQ_ENABLE_IBS, vcpu); kvm_s390_sync_request(KVM_REQ_ENABLE_IBS, vcpu);
exit_sie_sync(vcpu);
} }
void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu) void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
...@@ -2563,7 +2587,7 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, ...@@ -2563,7 +2587,7 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
/* Section: memory related */ /* Section: memory related */
int kvm_arch_prepare_memory_region(struct kvm *kvm, int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot, struct kvm_memory_slot *memslot,
struct kvm_userspace_memory_region *mem, const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change) enum kvm_mr_change change)
{ {
/* A few sanity checks. We can have memory slots which have to be /* A few sanity checks. We can have memory slots which have to be
...@@ -2581,8 +2605,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, ...@@ -2581,8 +2605,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
} }
void kvm_arch_commit_memory_region(struct kvm *kvm, void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem, const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old, const struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change) enum kvm_mr_change change)
{ {
int rc; int rc;
...@@ -2601,7 +2626,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, ...@@ -2601,7 +2626,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
rc = gmap_map_segment(kvm->arch.gmap, mem->userspace_addr, rc = gmap_map_segment(kvm->arch.gmap, mem->userspace_addr,
mem->guest_phys_addr, mem->memory_size); mem->guest_phys_addr, mem->memory_size);
if (rc) if (rc)
printk(KERN_WARNING "kvm-s390: failed to commit memory region\n"); pr_warn("failed to commit memory region\n");
return; return;
} }
......
...@@ -211,10 +211,10 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr); ...@@ -211,10 +211,10 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
int kvm_s390_vcpu_store_adtl_status(struct kvm_vcpu *vcpu, unsigned long addr); int kvm_s390_vcpu_store_adtl_status(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu);
void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu);
void s390_vcpu_block(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu);
void s390_vcpu_unblock(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_unblock(struct kvm_vcpu *vcpu);
void exit_sie(struct kvm_vcpu *vcpu); void exit_sie(struct kvm_vcpu *vcpu);
void exit_sie_sync(struct kvm_vcpu *vcpu); void kvm_s390_sync_request(int req, struct kvm_vcpu *vcpu);
int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu); int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu);
void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu);
/* is cmma enabled */ /* is cmma enabled */
...@@ -228,6 +228,25 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu); ...@@ -228,6 +228,25 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
int kvm_s390_inject_prog_irq(struct kvm_vcpu *vcpu, int kvm_s390_inject_prog_irq(struct kvm_vcpu *vcpu,
struct kvm_s390_pgm_info *pgm_info); struct kvm_s390_pgm_info *pgm_info);
static inline void kvm_s390_vcpu_block_all(struct kvm *kvm)
{
int i;
struct kvm_vcpu *vcpu;
WARN_ON(!mutex_is_locked(&kvm->lock));
kvm_for_each_vcpu(i, vcpu, kvm)
kvm_s390_vcpu_block(vcpu);
}
static inline void kvm_s390_vcpu_unblock_all(struct kvm *kvm)
{
int i;
struct kvm_vcpu *vcpu;
kvm_for_each_vcpu(i, vcpu, kvm)
kvm_s390_vcpu_unblock(vcpu);
}
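These two inline helpers exist to bracket per-VCPU SIE-block updates, as in the TOD epoch change in kvm-s390.c above: block every VCPU out of SIE, touch the per-VCPU state, then unblock. Condensed usage sketch, mirroring kvm_s390_set_tod_low():

	mutex_lock(&kvm->lock);
	kvm->arch.epoch = gtod - host_tod;
	kvm_s390_vcpu_block_all(kvm);		/* needs kvm->lock, see the WARN_ON above */
	kvm_for_each_vcpu(vcpu_idx, cur_vcpu, kvm)
		cur_vcpu->arch.sie_block->epoch = kvm->arch.epoch;
	kvm_s390_vcpu_unblock_all(kvm);
	mutex_unlock(&kvm->lock);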
/** /**
* kvm_s390_inject_prog_cond - conditionally inject a program check * kvm_s390_inject_prog_cond - conditionally inject a program check
* @vcpu: virtual cpu * @vcpu: virtual cpu
......
...@@ -698,10 +698,14 @@ static int handle_pfmf(struct kvm_vcpu *vcpu) ...@@ -698,10 +698,14 @@ static int handle_pfmf(struct kvm_vcpu *vcpu)
case 0x00001000: case 0x00001000:
end = (start + (1UL << 20)) & ~((1UL << 20) - 1); end = (start + (1UL << 20)) & ~((1UL << 20) - 1);
break; break;
/* We dont support EDAT2
case 0x00002000: case 0x00002000:
/* only support 2G frame size if EDAT2 is available and we are
not in 24-bit addressing mode */
if (!test_kvm_facility(vcpu->kvm, 78) ||
psw_bits(vcpu->arch.sie_block->gpsw).eaba == PSW_AMODE_24BIT)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
end = (start + (1UL << 31)) & ~((1UL << 31) - 1); end = (start + (1UL << 31)) & ~((1UL << 31) - 1);
break;*/ break;
default: default:
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
} }
......
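The end calculation for the new 2G case follows the same pattern as the 4K and 1M cases above: round start up to the next frame boundary of the given size. A quick worked check of the masking arithmetic, using a hypothetical helper with order 31 for a 2G frame:

static unsigned long pfmf_frame_end(unsigned long start, unsigned int order)
{
	/* e.g. order = 31, start = 0x90000000UL -> returns 0x100000000UL */
	return (start + (1UL << order)) & ~((1UL << order) - 1);
}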
...@@ -193,6 +193,8 @@ struct x86_emulate_ops { ...@@ -193,6 +193,8 @@ struct x86_emulate_ops {
int (*cpl)(struct x86_emulate_ctxt *ctxt); int (*cpl)(struct x86_emulate_ctxt *ctxt);
int (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong *dest); int (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong *dest);
int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value); int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value);
u64 (*get_smbase)(struct x86_emulate_ctxt *ctxt);
void (*set_smbase)(struct x86_emulate_ctxt *ctxt, u64 smbase);
int (*set_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 data); int (*set_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 data);
int (*get_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata); int (*get_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata);
int (*check_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc); int (*check_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc);
...@@ -262,6 +264,11 @@ enum x86emul_mode { ...@@ -262,6 +264,11 @@ enum x86emul_mode {
X86EMUL_MODE_PROT64, /* 64-bit (long) mode. */ X86EMUL_MODE_PROT64, /* 64-bit (long) mode. */
}; };
/* These match some of the HF_* flags defined in kvm_host.h */
#define X86EMUL_GUEST_MASK (1 << 5) /* VCPU is in guest-mode */
#define X86EMUL_SMM_MASK (1 << 6)
#define X86EMUL_SMM_INSIDE_NMI_MASK (1 << 7)
struct x86_emulate_ctxt { struct x86_emulate_ctxt {
const struct x86_emulate_ops *ops; const struct x86_emulate_ops *ops;
...@@ -273,8 +280,8 @@ struct x86_emulate_ctxt { ...@@ -273,8 +280,8 @@ struct x86_emulate_ctxt {
/* interruptibility state, as a result of execution of STI or MOV SS */ /* interruptibility state, as a result of execution of STI or MOV SS */
int interruptibility; int interruptibility;
int emul_flags;
bool guest_mode; /* guest running a nested guest */
bool perm_ok; /* do not check permissions if true */ bool perm_ok; /* do not check permissions if true */
bool ud; /* inject an #UD if host doesn't support insn */ bool ud; /* inject an #UD if host doesn't support insn */
......
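The emulator thus drops its private guest_mode flag and instead carries a copy of the relevant HF_* bits in emul_flags, so x86.c can hand guest-mode and SMM state over in a single word. The checks then read like this (the pattern used by x86_emulate_insn() and em_rsm() further down):

	if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept))
		/* nested-guest intercept checks, as before */;

	if (ctxt->emul_flags & X86EMUL_SMM_MASK)
		/* in SMM: RSM becomes legal, see em_rsm() */;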
...@@ -184,23 +184,12 @@ struct kvm_mmu_memory_cache { ...@@ -184,23 +184,12 @@ struct kvm_mmu_memory_cache {
void *objects[KVM_NR_MEM_OBJS]; void *objects[KVM_NR_MEM_OBJS];
}; };
/*
* kvm_mmu_page_role, below, is defined as:
*
* bits 0:3 - total guest paging levels (2-4, or zero for real mode)
* bits 4:7 - page table level for this shadow (1-4)
* bits 8:9 - page table quadrant for 2-level guests
* bit 16 - direct mapping of virtual to physical mapping at gfn
* used for real mode and two-dimensional paging
* bits 17:19 - common access permissions for all ptes in this shadow page
*/
union kvm_mmu_page_role { union kvm_mmu_page_role {
unsigned word; unsigned word;
struct { struct {
unsigned level:4; unsigned level:4;
unsigned cr4_pae:1; unsigned cr4_pae:1;
unsigned quadrant:2; unsigned quadrant:2;
unsigned pad_for_nice_hex_output:6;
unsigned direct:1; unsigned direct:1;
unsigned access:3; unsigned access:3;
unsigned invalid:1; unsigned invalid:1;
...@@ -208,6 +197,15 @@ union kvm_mmu_page_role { ...@@ -208,6 +197,15 @@ union kvm_mmu_page_role {
unsigned cr0_wp:1; unsigned cr0_wp:1;
unsigned smep_andnot_wp:1; unsigned smep_andnot_wp:1;
unsigned smap_andnot_wp:1; unsigned smap_andnot_wp:1;
unsigned :8;
/*
* This is left at the top of the word so that
* kvm_memslots_for_spte_role can extract it with a
* simple shift. While there is room, give it a whole
* byte so it is also faster to load it from memory.
*/
unsigned smm:8;
}; };
}; };
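Because the SMM flag is part of the page role, shadow pages built while the VCPU is in system management mode never alias the normal ones, and the same field doubles as the memslot address-space index. A sketch of the relationship, using the is_smm() helper and the kvm_memslots_for_spte_role()/__kvm_memslots() macros that appear later in this patch (the base_role field name is assumed from context):

	union kvm_mmu_page_role role = vcpu->arch.mmu.base_role;	/* assumption */

	role.smm = is_smm(vcpu) ? 1 : 0;
	/* gfn lookups for a shadow page with role.smm == 1 use the SMM slots */
	slots = __kvm_memslots(vcpu->kvm, role.smm);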
...@@ -338,12 +336,28 @@ struct kvm_pmu { ...@@ -338,12 +336,28 @@ struct kvm_pmu {
u64 reprogram_pmi; u64 reprogram_pmi;
}; };
struct kvm_pmu_ops;
enum { enum {
KVM_DEBUGREG_BP_ENABLED = 1, KVM_DEBUGREG_BP_ENABLED = 1,
KVM_DEBUGREG_WONT_EXIT = 2, KVM_DEBUGREG_WONT_EXIT = 2,
KVM_DEBUGREG_RELOAD = 4, KVM_DEBUGREG_RELOAD = 4,
}; };
struct kvm_mtrr_range {
u64 base;
u64 mask;
struct list_head node;
};
struct kvm_mtrr {
struct kvm_mtrr_range var_ranges[KVM_NR_VAR_MTRR];
mtrr_type fixed_ranges[KVM_NR_FIXED_MTRR_REGION];
u64 deftype;
struct list_head head;
};
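Variable MTRRs keep the architectural MSR layout in base/mask, and are additionally linked through node into mtrr_state.head so type lookups can walk them as a list rather than a fixed array. A rough sketch of such a walk; mtrr_range_covers() is a hypothetical helper, and per the architectural MTRR layout the low byte of the PhysBase/DefType MSRs holds the memory type:

	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
	struct kvm_mtrr_range *range;

	list_for_each_entry(range, &mtrr_state->head, node) {
		if (mtrr_range_covers(range, gpa))	/* hypothetical */
			return range->base & 0xff;
	}
	return mtrr_state->deftype & 0xff;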
struct kvm_vcpu_arch { struct kvm_vcpu_arch {
/* /*
* rip and regs accesses must go through * rip and regs accesses must go through
...@@ -368,6 +382,7 @@ struct kvm_vcpu_arch { ...@@ -368,6 +382,7 @@ struct kvm_vcpu_arch {
int32_t apic_arb_prio; int32_t apic_arb_prio;
int mp_state; int mp_state;
u64 ia32_misc_enable_msr; u64 ia32_misc_enable_msr;
u64 smbase;
bool tpr_access_reporting; bool tpr_access_reporting;
u64 ia32_xss; u64 ia32_xss;
...@@ -471,8 +486,9 @@ struct kvm_vcpu_arch { ...@@ -471,8 +486,9 @@ struct kvm_vcpu_arch {
atomic_t nmi_queued; /* unprocessed asynchronous NMIs */ atomic_t nmi_queued; /* unprocessed asynchronous NMIs */
unsigned nmi_pending; /* NMI queued after currently running handler */ unsigned nmi_pending; /* NMI queued after currently running handler */
bool nmi_injected; /* Trying to inject an NMI this entry */ bool nmi_injected; /* Trying to inject an NMI this entry */
bool smi_pending; /* SMI queued after currently running handler */
struct mtrr_state_type mtrr_state; struct kvm_mtrr mtrr_state;
u64 pat; u64 pat;
unsigned switch_db_regs; unsigned switch_db_regs;
...@@ -637,6 +653,8 @@ struct kvm_arch { ...@@ -637,6 +653,8 @@ struct kvm_arch {
#endif #endif
bool boot_vcpu_runs_old_kvmclock; bool boot_vcpu_runs_old_kvmclock;
u64 disabled_quirks;
}; };
struct kvm_vm_stat { struct kvm_vm_stat {
...@@ -689,12 +707,13 @@ struct msr_data { ...@@ -689,12 +707,13 @@ struct msr_data {
struct kvm_lapic_irq { struct kvm_lapic_irq {
u32 vector; u32 vector;
u32 delivery_mode; u16 delivery_mode;
u32 dest_mode; u16 dest_mode;
u32 level; bool level;
u32 trig_mode; u16 trig_mode;
u32 shorthand; u32 shorthand;
u32 dest_id; u32 dest_id;
bool msi_redir_hint;
}; };
struct kvm_x86_ops { struct kvm_x86_ops {
...@@ -706,19 +725,20 @@ struct kvm_x86_ops { ...@@ -706,19 +725,20 @@ struct kvm_x86_ops {
int (*hardware_setup)(void); /* __init */ int (*hardware_setup)(void); /* __init */
void (*hardware_unsetup)(void); /* __exit */ void (*hardware_unsetup)(void); /* __exit */
bool (*cpu_has_accelerated_tpr)(void); bool (*cpu_has_accelerated_tpr)(void);
bool (*cpu_has_high_real_mode_segbase)(void);
void (*cpuid_update)(struct kvm_vcpu *vcpu); void (*cpuid_update)(struct kvm_vcpu *vcpu);
/* Create, but do not attach this VCPU */ /* Create, but do not attach this VCPU */
struct kvm_vcpu *(*vcpu_create)(struct kvm *kvm, unsigned id); struct kvm_vcpu *(*vcpu_create)(struct kvm *kvm, unsigned id);
void (*vcpu_free)(struct kvm_vcpu *vcpu); void (*vcpu_free)(struct kvm_vcpu *vcpu);
void (*vcpu_reset)(struct kvm_vcpu *vcpu); void (*vcpu_reset)(struct kvm_vcpu *vcpu, bool init_event);
void (*prepare_guest_switch)(struct kvm_vcpu *vcpu); void (*prepare_guest_switch)(struct kvm_vcpu *vcpu);
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu); void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu); void (*vcpu_put)(struct kvm_vcpu *vcpu);
void (*update_db_bp_intercept)(struct kvm_vcpu *vcpu); void (*update_db_bp_intercept)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata); int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg); u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg);
void (*get_segment)(struct kvm_vcpu *vcpu, void (*get_segment)(struct kvm_vcpu *vcpu,
...@@ -836,6 +856,8 @@ struct kvm_x86_ops { ...@@ -836,6 +856,8 @@ struct kvm_x86_ops {
void (*enable_log_dirty_pt_masked)(struct kvm *kvm, void (*enable_log_dirty_pt_masked)(struct kvm *kvm,
struct kvm_memory_slot *slot, struct kvm_memory_slot *slot,
gfn_t offset, unsigned long mask); gfn_t offset, unsigned long mask);
/* pmu operations of sub-arch */
const struct kvm_pmu_ops *pmu_ops;
}; };
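With pmu_ops in place, the architecture-neutral PMU code dispatches through a per-vendor table instead of assuming Intel perf counters; kvm-intel and kvm-amd each supply their own table (pmu_intel.o and pmu_amd.o in the Makefile change further down, declared in the new pmu.h). The member names in this sketch are assumptions, shown only to illustrate the indirection:

	/* sketch only; member names are assumptions, see pmu.h for the real table */
	if (kvm_x86_ops->pmu_ops->is_valid_msr(vcpu, msr))
		return kvm_x86_ops->pmu_ops->get_msr(vcpu, msr, data);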
struct kvm_arch_async_pf { struct kvm_arch_async_pf {
...@@ -871,7 +893,7 @@ void kvm_mmu_reset_context(struct kvm_vcpu *vcpu); ...@@ -871,7 +893,7 @@ void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
void kvm_mmu_slot_remove_write_access(struct kvm *kvm, void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
struct kvm_memory_slot *memslot); struct kvm_memory_slot *memslot);
void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
struct kvm_memory_slot *memslot); const struct kvm_memory_slot *memslot);
void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm, void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
struct kvm_memory_slot *memslot); struct kvm_memory_slot *memslot);
void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm, void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
...@@ -882,7 +904,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm, ...@@ -882,7 +904,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot, struct kvm_memory_slot *slot,
gfn_t gfn_offset, unsigned long mask); gfn_t gfn_offset, unsigned long mask);
void kvm_mmu_zap_all(struct kvm *kvm); void kvm_mmu_zap_all(struct kvm *kvm);
void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm); void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, struct kvm_memslots *slots);
unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm); unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages); void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
...@@ -890,7 +912,6 @@ int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3); ...@@ -890,7 +912,6 @@ int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3);
int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
const void *val, int bytes); const void *val, int bytes);
u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
struct kvm_irq_mask_notifier { struct kvm_irq_mask_notifier {
void (*func)(struct kvm_irq_mask_notifier *kimn, bool masked); void (*func)(struct kvm_irq_mask_notifier *kimn, bool masked);
...@@ -938,7 +959,7 @@ static inline int emulate_instruction(struct kvm_vcpu *vcpu, ...@@ -938,7 +959,7 @@ static inline int emulate_instruction(struct kvm_vcpu *vcpu,
void kvm_enable_efer_bits(u64); void kvm_enable_efer_bits(u64);
bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer);
int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data); int kvm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr); int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
struct x86_emulate_ctxt; struct x86_emulate_ctxt;
...@@ -967,7 +988,7 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw); ...@@ -967,7 +988,7 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l); void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l);
int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr); int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr);
int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata); int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr); int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu); unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu);
...@@ -1110,6 +1131,14 @@ enum { ...@@ -1110,6 +1131,14 @@ enum {
#define HF_NMI_MASK (1 << 3) #define HF_NMI_MASK (1 << 3)
#define HF_IRET_MASK (1 << 4) #define HF_IRET_MASK (1 << 4)
#define HF_GUEST_MASK (1 << 5) /* VCPU is in guest-mode */ #define HF_GUEST_MASK (1 << 5) /* VCPU is in guest-mode */
#define HF_SMM_MASK (1 << 6)
#define HF_SMM_INSIDE_NMI_MASK (1 << 7)
#define __KVM_VCPU_MULTIPLE_ADDRESS_SPACE
#define KVM_ADDRESS_SPACE_NUM 2
#define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0)
#define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
/* /*
* Hardware virtualization extension instructions may fault if a * Hardware virtualization extension instructions may fault if a
...@@ -1144,7 +1173,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v); ...@@ -1144,7 +1173,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu); int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu); int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
int kvm_cpu_get_interrupt(struct kvm_vcpu *v); int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
void kvm_vcpu_reset(struct kvm_vcpu *vcpu); void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu); void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu);
void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm, void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
unsigned long address); unsigned long address);
...@@ -1168,16 +1197,9 @@ void kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err); ...@@ -1168,16 +1197,9 @@ void kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err);
int kvm_is_in_guest(void); int kvm_is_in_guest(void);
void kvm_pmu_init(struct kvm_vcpu *vcpu); int __x86_set_memory_region(struct kvm *kvm,
void kvm_pmu_destroy(struct kvm_vcpu *vcpu); const struct kvm_userspace_memory_region *mem);
void kvm_pmu_reset(struct kvm_vcpu *vcpu); int x86_set_memory_region(struct kvm *kvm,
void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu); const struct kvm_userspace_memory_region *mem);
bool kvm_pmu_msr(struct kvm_vcpu *vcpu, u32 msr);
int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
int kvm_pmu_check_pmc(struct kvm_vcpu *vcpu, unsigned pmc);
int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
void kvm_handle_pmu_event(struct kvm_vcpu *vcpu);
void kvm_deliver_pmi(struct kvm_vcpu *vcpu);
#endif /* _ASM_X86_KVM_HOST_H */ #endif /* _ASM_X86_KVM_HOST_H */
...@@ -41,5 +41,6 @@ struct pvclock_wall_clock { ...@@ -41,5 +41,6 @@ struct pvclock_wall_clock {
#define PVCLOCK_TSC_STABLE_BIT (1 << 0) #define PVCLOCK_TSC_STABLE_BIT (1 << 0)
#define PVCLOCK_GUEST_STOPPED (1 << 1) #define PVCLOCK_GUEST_STOPPED (1 << 1)
#define PVCLOCK_COUNTS_FROM_ZERO (1 << 2)
#endif /* __ASSEMBLY__ */ #endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_PVCLOCK_ABI_H */ #endif /* _ASM_X86_PVCLOCK_ABI_H */
...@@ -86,7 +86,6 @@ unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src, ...@@ -86,7 +86,6 @@ unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
offset = pvclock_get_nsec_offset(src); offset = pvclock_get_nsec_offset(src);
ret = src->system_time + offset; ret = src->system_time + offset;
ret_flags = src->flags; ret_flags = src->flags;
rdtsc_barrier();
*cycles = ret; *cycles = ret;
*flags = ret_flags; *flags = ret_flags;
......
...@@ -106,6 +106,8 @@ struct kvm_ioapic_state { ...@@ -106,6 +106,8 @@ struct kvm_ioapic_state {
#define KVM_IRQCHIP_IOAPIC 2 #define KVM_IRQCHIP_IOAPIC 2
#define KVM_NR_IRQCHIPS 3 #define KVM_NR_IRQCHIPS 3
#define KVM_RUN_X86_SMM (1 << 0)
/* for KVM_GET_REGS and KVM_SET_REGS */ /* for KVM_GET_REGS and KVM_SET_REGS */
struct kvm_regs { struct kvm_regs {
/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */ /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
...@@ -281,6 +283,7 @@ struct kvm_reinject_control { ...@@ -281,6 +283,7 @@ struct kvm_reinject_control {
#define KVM_VCPUEVENT_VALID_NMI_PENDING 0x00000001 #define KVM_VCPUEVENT_VALID_NMI_PENDING 0x00000001
#define KVM_VCPUEVENT_VALID_SIPI_VECTOR 0x00000002 #define KVM_VCPUEVENT_VALID_SIPI_VECTOR 0x00000002
#define KVM_VCPUEVENT_VALID_SHADOW 0x00000004 #define KVM_VCPUEVENT_VALID_SHADOW 0x00000004
#define KVM_VCPUEVENT_VALID_SMM 0x00000008
/* Interrupt shadow states */ /* Interrupt shadow states */
#define KVM_X86_SHADOW_INT_MOV_SS 0x01 #define KVM_X86_SHADOW_INT_MOV_SS 0x01
...@@ -309,7 +312,13 @@ struct kvm_vcpu_events { ...@@ -309,7 +312,13 @@ struct kvm_vcpu_events {
} nmi; } nmi;
__u32 sipi_vector; __u32 sipi_vector;
__u32 flags; __u32 flags;
__u32 reserved[10]; struct {
__u8 smm;
__u8 pending;
__u8 smm_inside_nmi;
__u8 latched_init;
} smi;
__u32 reserved[9];
}; };
/* for KVM_GET/SET_DEBUGREGS */ /* for KVM_GET/SET_DEBUGREGS */
...@@ -345,4 +354,7 @@ struct kvm_xcrs { ...@@ -345,4 +354,7 @@ struct kvm_xcrs {
struct kvm_sync_regs { struct kvm_sync_regs {
}; };
#define KVM_QUIRK_LINT0_REENABLED (1 << 0)
#define KVM_QUIRK_CD_NW_CLEARED (1 << 1)
#endif /* _ASM_X86_KVM_H */ #endif /* _ASM_X86_KVM_H */
...@@ -331,7 +331,7 @@ static void kvm_guest_apic_eoi_write(u32 reg, u32 val) ...@@ -331,7 +331,7 @@ static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
apic_write(APIC_EOI, APIC_EOI_ACK); apic_write(APIC_EOI, APIC_EOI_ACK);
} }
void kvm_guest_cpu_init(void) static void kvm_guest_cpu_init(void)
{ {
if (!kvm_para_available()) if (!kvm_para_available())
return; return;
...@@ -688,7 +688,7 @@ static inline void spin_time_accum_blocked(u64 start) ...@@ -688,7 +688,7 @@ static inline void spin_time_accum_blocked(u64 start)
static struct dentry *d_spin_debug; static struct dentry *d_spin_debug;
static struct dentry *d_kvm_debug; static struct dentry *d_kvm_debug;
struct dentry *kvm_init_debugfs(void) static struct dentry *kvm_init_debugfs(void)
{ {
d_kvm_debug = debugfs_create_dir("kvm-guest", NULL); d_kvm_debug = debugfs_create_dir("kvm-guest", NULL);
if (!d_kvm_debug) if (!d_kvm_debug)
......
...@@ -24,6 +24,7 @@ ...@@ -24,6 +24,7 @@
#include <linux/percpu.h> #include <linux/percpu.h>
#include <linux/hardirq.h> #include <linux/hardirq.h>
#include <linux/memblock.h> #include <linux/memblock.h>
#include <linux/sched.h>
#include <asm/x86_init.h> #include <asm/x86_init.h>
#include <asm/reboot.h> #include <asm/reboot.h>
...@@ -217,8 +218,10 @@ static void kvm_shutdown(void) ...@@ -217,8 +218,10 @@ static void kvm_shutdown(void)
void __init kvmclock_init(void) void __init kvmclock_init(void)
{ {
struct pvclock_vcpu_time_info *vcpu_time;
unsigned long mem; unsigned long mem;
int size; int size, cpu;
u8 flags;
size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS); size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
...@@ -264,7 +267,14 @@ void __init kvmclock_init(void) ...@@ -264,7 +267,14 @@ void __init kvmclock_init(void)
pv_info.name = "KVM"; pv_info.name = "KVM";
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT)) if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT); pvclock_set_flags(~0);
cpu = get_cpu();
vcpu_time = &hv_clock[cpu].pvti;
flags = pvclock_read_flags(vcpu_time);
if (flags & PVCLOCK_COUNTS_FROM_ZERO)
set_sched_clock_stable();
put_cpu();
} }
int __init kvm_setup_vsyscall_timeinfo(void) int __init kvm_setup_vsyscall_timeinfo(void)
......
...@@ -86,15 +86,16 @@ config KVM_MMU_AUDIT ...@@ -86,15 +86,16 @@ config KVM_MMU_AUDIT
auditing of KVM MMU events at runtime. auditing of KVM MMU events at runtime.
config KVM_DEVICE_ASSIGNMENT config KVM_DEVICE_ASSIGNMENT
bool "KVM legacy PCI device assignment support" bool "KVM legacy PCI device assignment support (DEPRECATED)"
depends on KVM && PCI && IOMMU_API depends on KVM && PCI && IOMMU_API
default y default n
---help--- ---help---
Provide support for legacy PCI device assignment through KVM. The Provide support for legacy PCI device assignment through KVM. The
kernel now also supports a full featured userspace device driver kernel now also supports a full featured userspace device driver
framework through VFIO, which supersedes much of this support. framework through VFIO, which supersedes this support and provides
better security.
If unsure, say Y. If unsure, say N.
# OK, it's a little counter-intuitive to do this, but it puts it neatly under # OK, it's a little counter-intuitive to do this, but it puts it neatly under
# the virtualization menu. # the virtualization menu.
......
...@@ -12,10 +12,10 @@ kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ ...@@ -12,10 +12,10 @@ kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \ kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o
kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += assigned-dev.o iommu.o kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += assigned-dev.o iommu.o
kvm-intel-y += vmx.o kvm-intel-y += vmx.o pmu_intel.o
kvm-amd-y += svm.o kvm-amd-y += svm.o pmu_amd.o
obj-$(CONFIG_KVM) += kvm.o obj-$(CONFIG_KVM) += kvm.o
obj-$(CONFIG_KVM_INTEL) += kvm-intel.o obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
......
...@@ -16,12 +16,14 @@ ...@@ -16,12 +16,14 @@
#include <linux/module.h> #include <linux/module.h>
#include <linux/vmalloc.h> #include <linux/vmalloc.h>
#include <linux/uaccess.h> #include <linux/uaccess.h>
#include <asm/fpu/internal.h> /* For use_eager_fpu. Ugh! */
#include <asm/user.h> #include <asm/user.h>
#include <asm/fpu/xstate.h> #include <asm/fpu/xstate.h>
#include "cpuid.h" #include "cpuid.h"
#include "lapic.h" #include "lapic.h"
#include "mmu.h" #include "mmu.h"
#include "trace.h" #include "trace.h"
#include "pmu.h"
static u32 xstate_required_size(u64 xstate_bv, bool compacted) static u32 xstate_required_size(u64 xstate_bv, bool compacted)
{ {
...@@ -95,7 +97,7 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) ...@@ -95,7 +97,7 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
if (best && (best->eax & (F(XSAVES) | F(XSAVEC)))) if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
best->ebx = xstate_required_size(vcpu->arch.xcr0, true); best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
vcpu->arch.eager_fpu = guest_cpuid_has_mpx(vcpu); vcpu->arch.eager_fpu = use_eager_fpu() || guest_cpuid_has_mpx(vcpu);
/* /*
* The existing code assumes virtual address is 48-bit in the canonical * The existing code assumes virtual address is 48-bit in the canonical
...@@ -109,7 +111,7 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) ...@@ -109,7 +111,7 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
/* Update physical-address width */ /* Update physical-address width */
vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
kvm_pmu_cpuid_update(vcpu); kvm_pmu_refresh(vcpu);
return 0; return 0;
} }
...@@ -413,6 +415,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, ...@@ -413,6 +415,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
} }
break; break;
} }
case 6: /* Thermal management */
entry->eax = 0x4; /* allow ARAT */
entry->ebx = 0;
entry->ecx = 0;
entry->edx = 0;
break;
case 7: { case 7: {
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX; entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
/* Mask ebx against host capability word 9 */ /* Mask ebx against host capability word 9 */
...@@ -589,7 +597,6 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, ...@@ -589,7 +597,6 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
break; break;
case 3: /* Processor serial number */ case 3: /* Processor serial number */
case 5: /* MONITOR/MWAIT */ case 5: /* MONITOR/MWAIT */
case 6: /* Thermal management */
case 0xC0000002: case 0xC0000002:
case 0xC0000003: case 0xC0000003:
case 0xC0000004: case 0xC0000004:
......
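Leaf 6 moves out of the reserved/zeroed group so KVM can advertise exactly one bit of it: EAX=0x4 sets CPUID.06H:EAX[2], the ARAT flag, telling the guest that the APIC timer keeps running in deep C-states; EBX, ECX and EDX stay zero. From the guest side the check is simply:

	unsigned int eax, ebx, ecx, edx;

	cpuid_count(6, 0, &eax, &ebx, &ecx, &edx);
	if (eax & (1 << 2))
		/* ARAT: the LAPIC timer keeps ticking in deep idle states */;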
...@@ -70,6 +70,14 @@ static inline bool guest_cpuid_has_fsgsbase(struct kvm_vcpu *vcpu) ...@@ -70,6 +70,14 @@ static inline bool guest_cpuid_has_fsgsbase(struct kvm_vcpu *vcpu)
return best && (best->ebx & bit(X86_FEATURE_FSGSBASE)); return best && (best->ebx & bit(X86_FEATURE_FSGSBASE));
} }
static inline bool guest_cpuid_has_longmode(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;
best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
return best && (best->edx & bit(X86_FEATURE_LM));
}
static inline bool guest_cpuid_has_osvw(struct kvm_vcpu *vcpu) static inline bool guest_cpuid_has_osvw(struct kvm_vcpu *vcpu)
{ {
struct kvm_cpuid_entry2 *best; struct kvm_cpuid_entry2 *best;
......
...@@ -25,6 +25,7 @@ ...@@ -25,6 +25,7 @@
#include <linux/module.h> #include <linux/module.h>
#include <asm/kvm_emulate.h> #include <asm/kvm_emulate.h>
#include <linux/stringify.h> #include <linux/stringify.h>
#include <asm/debugreg.h>
#include "x86.h" #include "x86.h"
#include "tss.h" #include "tss.h"
...@@ -523,13 +524,9 @@ static void masked_increment(ulong *reg, ulong mask, int inc) ...@@ -523,13 +524,9 @@ static void masked_increment(ulong *reg, ulong mask, int inc)
static inline void static inline void
register_address_increment(struct x86_emulate_ctxt *ctxt, int reg, int inc) register_address_increment(struct x86_emulate_ctxt *ctxt, int reg, int inc)
{ {
ulong mask; ulong *preg = reg_rmw(ctxt, reg);
if (ctxt->ad_bytes == sizeof(unsigned long)) assign_register(preg, *preg + inc, ctxt->ad_bytes);
mask = ~0UL;
else
mask = ad_mask(ctxt);
masked_increment(reg_rmw(ctxt, reg), mask, inc);
} }
static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc) static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
...@@ -2262,6 +2259,260 @@ static int em_lseg(struct x86_emulate_ctxt *ctxt) ...@@ -2262,6 +2259,260 @@ static int em_lseg(struct x86_emulate_ctxt *ctxt)
return rc; return rc;
} }
static int emulator_has_longmode(struct x86_emulate_ctxt *ctxt)
{
u32 eax, ebx, ecx, edx;
eax = 0x80000001;
ecx = 0;
ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
return edx & bit(X86_FEATURE_LM);
}
#define GET_SMSTATE(type, smbase, offset) \
({ \
type __val; \
int r = ctxt->ops->read_std(ctxt, smbase + offset, &__val, \
sizeof(__val), NULL); \
if (r != X86EMUL_CONTINUE) \
return X86EMUL_UNHANDLEABLE; \
__val; \
})
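GET_SMSTATE is a statement expression, and the return inside it leaves the enclosing rsm_load_state_32/64 or rsm_load_seg_* function, not just the macro; that is what keeps the long field-by-field loads below free of per-read error handling. Typical use, with an offset taken from the 32-bit save map below:

	u32 cr3 = GET_SMSTATE(u32, smbase, 0x7ff8);	/* returns X86EMUL_UNHANDLEABLE to the caller on a failed read */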
static void rsm_set_desc_flags(struct desc_struct *desc, u32 flags)
{
desc->g = (flags >> 23) & 1;
desc->d = (flags >> 22) & 1;
desc->l = (flags >> 21) & 1;
desc->avl = (flags >> 20) & 1;
desc->p = (flags >> 15) & 1;
desc->dpl = (flags >> 13) & 3;
desc->s = (flags >> 12) & 1;
desc->type = (flags >> 8) & 15;
}
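The bit positions correspond to the packed access-rights word kept in the SMRAM segment fields: the descriptor attribute byte in bits 8-15 and the limit flags in bits 20-23. A worked example:

	/* flags == 0x00c09300 -> type=3, s=1, dpl=0, p=1, avl=0, l=0, d=1, g=1,
	 * i.e. a present, flat, writable 32-bit data segment */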
static int rsm_load_seg_32(struct x86_emulate_ctxt *ctxt, u64 smbase, int n)
{
struct desc_struct desc;
int offset;
u16 selector;
selector = GET_SMSTATE(u32, smbase, 0x7fa8 + n * 4);
if (n < 3)
offset = 0x7f84 + n * 12;
else
offset = 0x7f2c + (n - 3) * 12;
set_desc_base(&desc, GET_SMSTATE(u32, smbase, offset + 8));
set_desc_limit(&desc, GET_SMSTATE(u32, smbase, offset + 4));
rsm_set_desc_flags(&desc, GET_SMSTATE(u32, smbase, offset));
ctxt->ops->set_segment(ctxt, selector, &desc, 0, n);
return X86EMUL_CONTINUE;
}
static int rsm_load_seg_64(struct x86_emulate_ctxt *ctxt, u64 smbase, int n)
{
struct desc_struct desc;
int offset;
u16 selector;
u32 base3;
offset = 0x7e00 + n * 16;
selector = GET_SMSTATE(u16, smbase, offset);
rsm_set_desc_flags(&desc, GET_SMSTATE(u16, smbase, offset + 2) << 8);
set_desc_limit(&desc, GET_SMSTATE(u32, smbase, offset + 4));
set_desc_base(&desc, GET_SMSTATE(u32, smbase, offset + 8));
base3 = GET_SMSTATE(u32, smbase, offset + 12);
ctxt->ops->set_segment(ctxt, selector, &desc, base3, n);
return X86EMUL_CONTINUE;
}
static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt,
u64 cr0, u64 cr4)
{
int bad;
/*
* First enable PAE, long mode needs it before CR0.PG = 1 is set.
* Then enable protected mode. However, PCID cannot be enabled
* if EFER.LMA=0, so set it separately.
*/
bad = ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PCIDE);
if (bad)
return X86EMUL_UNHANDLEABLE;
bad = ctxt->ops->set_cr(ctxt, 0, cr0);
if (bad)
return X86EMUL_UNHANDLEABLE;
if (cr4 & X86_CR4_PCIDE) {
bad = ctxt->ops->set_cr(ctxt, 4, cr4);
if (bad)
return X86EMUL_UNHANDLEABLE;
}
return X86EMUL_CONTINUE;
}
static int rsm_load_state_32(struct x86_emulate_ctxt *ctxt, u64 smbase)
{
struct desc_struct desc;
struct desc_ptr dt;
u16 selector;
u32 val, cr0, cr4;
int i;
cr0 = GET_SMSTATE(u32, smbase, 0x7ffc);
ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u32, smbase, 0x7ff8));
ctxt->eflags = GET_SMSTATE(u32, smbase, 0x7ff4) | X86_EFLAGS_FIXED;
ctxt->_eip = GET_SMSTATE(u32, smbase, 0x7ff0);
for (i = 0; i < 8; i++)
*reg_write(ctxt, i) = GET_SMSTATE(u32, smbase, 0x7fd0 + i * 4);
val = GET_SMSTATE(u32, smbase, 0x7fcc);
ctxt->ops->set_dr(ctxt, 6, (val & DR6_VOLATILE) | DR6_FIXED_1);
val = GET_SMSTATE(u32, smbase, 0x7fc8);
ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1);
selector = GET_SMSTATE(u32, smbase, 0x7fc4);
set_desc_base(&desc, GET_SMSTATE(u32, smbase, 0x7f64));
set_desc_limit(&desc, GET_SMSTATE(u32, smbase, 0x7f60));
rsm_set_desc_flags(&desc, GET_SMSTATE(u32, smbase, 0x7f5c));
ctxt->ops->set_segment(ctxt, selector, &desc, 0, VCPU_SREG_TR);
selector = GET_SMSTATE(u32, smbase, 0x7fc0);
set_desc_base(&desc, GET_SMSTATE(u32, smbase, 0x7f80));
set_desc_limit(&desc, GET_SMSTATE(u32, smbase, 0x7f7c));
rsm_set_desc_flags(&desc, GET_SMSTATE(u32, smbase, 0x7f78));
ctxt->ops->set_segment(ctxt, selector, &desc, 0, VCPU_SREG_LDTR);
dt.address = GET_SMSTATE(u32, smbase, 0x7f74);
dt.size = GET_SMSTATE(u32, smbase, 0x7f70);
ctxt->ops->set_gdt(ctxt, &dt);
dt.address = GET_SMSTATE(u32, smbase, 0x7f58);
dt.size = GET_SMSTATE(u32, smbase, 0x7f54);
ctxt->ops->set_idt(ctxt, &dt);
for (i = 0; i < 6; i++) {
int r = rsm_load_seg_32(ctxt, smbase, i);
if (r != X86EMUL_CONTINUE)
return r;
}
cr4 = GET_SMSTATE(u32, smbase, 0x7f14);
ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7ef8));
return rsm_enter_protected_mode(ctxt, cr0, cr4);
}
static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase)
{
struct desc_struct desc;
struct desc_ptr dt;
u64 val, cr0, cr4;
u32 base3;
u16 selector;
int i;
for (i = 0; i < 16; i++)
*reg_write(ctxt, i) = GET_SMSTATE(u64, smbase, 0x7ff8 - i * 8);
ctxt->_eip = GET_SMSTATE(u64, smbase, 0x7f78);
ctxt->eflags = GET_SMSTATE(u32, smbase, 0x7f70) | X86_EFLAGS_FIXED;
val = GET_SMSTATE(u32, smbase, 0x7f68);
ctxt->ops->set_dr(ctxt, 6, (val & DR6_VOLATILE) | DR6_FIXED_1);
val = GET_SMSTATE(u32, smbase, 0x7f60);
ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1);
cr0 = GET_SMSTATE(u64, smbase, 0x7f58);
ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u64, smbase, 0x7f50));
cr4 = GET_SMSTATE(u64, smbase, 0x7f48);
ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7f00));
val = GET_SMSTATE(u64, smbase, 0x7ed0);
ctxt->ops->set_msr(ctxt, MSR_EFER, val & ~EFER_LMA);
selector = GET_SMSTATE(u32, smbase, 0x7e90);
rsm_set_desc_flags(&desc, GET_SMSTATE(u32, smbase, 0x7e92) << 8);
set_desc_limit(&desc, GET_SMSTATE(u32, smbase, 0x7e94));
set_desc_base(&desc, GET_SMSTATE(u32, smbase, 0x7e98));
base3 = GET_SMSTATE(u32, smbase, 0x7e9c);
ctxt->ops->set_segment(ctxt, selector, &desc, base3, VCPU_SREG_TR);
dt.size = GET_SMSTATE(u32, smbase, 0x7e84);
dt.address = GET_SMSTATE(u64, smbase, 0x7e88);
ctxt->ops->set_idt(ctxt, &dt);
selector = GET_SMSTATE(u32, smbase, 0x7e70);
rsm_set_desc_flags(&desc, GET_SMSTATE(u32, smbase, 0x7e72) << 8);
set_desc_limit(&desc, GET_SMSTATE(u32, smbase, 0x7e74));
set_desc_base(&desc, GET_SMSTATE(u32, smbase, 0x7e78));
base3 = GET_SMSTATE(u32, smbase, 0x7e7c);
ctxt->ops->set_segment(ctxt, selector, &desc, base3, VCPU_SREG_LDTR);
dt.size = GET_SMSTATE(u32, smbase, 0x7e64);
dt.address = GET_SMSTATE(u64, smbase, 0x7e68);
ctxt->ops->set_gdt(ctxt, &dt);
for (i = 0; i < 6; i++) {
int r = rsm_load_seg_64(ctxt, smbase, i);
if (r != X86EMUL_CONTINUE)
return r;
}
return rsm_enter_protected_mode(ctxt, cr0, cr4);
}
static int em_rsm(struct x86_emulate_ctxt *ctxt)
{
unsigned long cr0, cr4, efer;
u64 smbase;
int ret;
if ((ctxt->emul_flags & X86EMUL_SMM_MASK) == 0)
return emulate_ud(ctxt);
/*
* Get back to real mode, to prepare a safe state in which to load
* CR0/CR3/CR4/EFER. Also this will ensure that addresses passed
* to read_std/write_std are not virtual.
*
* CR4.PCIDE must be zero, because it is a 64-bit mode only feature.
*/
cr0 = ctxt->ops->get_cr(ctxt, 0);
if (cr0 & X86_CR0_PE)
ctxt->ops->set_cr(ctxt, 0, cr0 & ~(X86_CR0_PG | X86_CR0_PE));
cr4 = ctxt->ops->get_cr(ctxt, 4);
if (cr4 & X86_CR4_PAE)
ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PAE);
efer = 0;
ctxt->ops->set_msr(ctxt, MSR_EFER, efer);
smbase = ctxt->ops->get_smbase(ctxt);
if (emulator_has_longmode(ctxt))
ret = rsm_load_state_64(ctxt, smbase + 0x8000);
else
ret = rsm_load_state_32(ctxt, smbase + 0x8000);
if (ret != X86EMUL_CONTINUE) {
/* FIXME: should triple fault */
return X86EMUL_UNHANDLEABLE;
}
if ((ctxt->emul_flags & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
ctxt->ops->set_nmi_mask(ctxt, false);
ctxt->emul_flags &= ~X86EMUL_SMM_INSIDE_NMI_MASK;
ctxt->emul_flags &= ~X86EMUL_SMM_MASK;
return X86EMUL_CONTINUE;
}
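Note that em_rsm() passes smbase + 0x8000 to the rsm_load_state_*() helpers, so their 0x7xxx offsets land in the architectural state-save area at the top of SMRAM (smbase + 0xfe00 through smbase + 0xffff). A quick check of the math against two offsets used above:

	u64 eip_slot = (smbase + 0x8000) + 0x7ff0;	/* == smbase + 0xfff0, 32-bit EIP field */
	u64 tr_slot  = (smbase + 0x8000) + 0x7e90;	/* == smbase + 0xfe90, 64-bit TR selector */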
static void static void
setup_syscalls_segments(struct x86_emulate_ctxt *ctxt, setup_syscalls_segments(struct x86_emulate_ctxt *ctxt,
struct desc_struct *cs, struct desc_struct *ss) struct desc_struct *cs, struct desc_struct *ss)
...@@ -2573,6 +2824,30 @@ static bool emulator_io_permited(struct x86_emulate_ctxt *ctxt, ...@@ -2573,6 +2824,30 @@ static bool emulator_io_permited(struct x86_emulate_ctxt *ctxt,
return true; return true;
} }
static void string_registers_quirk(struct x86_emulate_ctxt *ctxt)
{
/*
* Intel CPUs mask the counter and pointers in quite strange
* manner when ECX is zero due to REP-string optimizations.
*/
#ifdef CONFIG_X86_64
if (ctxt->ad_bytes != 4 || !vendor_intel(ctxt))
return;
*reg_write(ctxt, VCPU_REGS_RCX) = 0;
switch (ctxt->b) {
case 0xa4: /* movsb */
case 0xa5: /* movsd/w */
*reg_rmw(ctxt, VCPU_REGS_RSI) &= (u32)-1;
/* fall through */
case 0xaa: /* stosb */
case 0xab: /* stosd/w */
*reg_rmw(ctxt, VCPU_REGS_RDI) &= (u32)-1;
}
#endif
}
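Concretely, per the comment above: with a 32-bit address size the REP loop terminates as soon as ECX is zero, even if the upper half of RCX is not, and Intel hardware leaves the index registers truncated; the quirk reproduces that. For example:

	/* before: RCX = 0x0000123400000000, RSI = 0x0000123456789abc (rep movsb, ad_bytes == 4)
	 * after string_registers_quirk(): RCX = 0, RSI = 0x0000000056789abc */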
static void save_state_to_tss16(struct x86_emulate_ctxt *ctxt, static void save_state_to_tss16(struct x86_emulate_ctxt *ctxt,
struct tss_segment_16 *tss) struct tss_segment_16 *tss)
{ {
...@@ -2849,7 +3124,7 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, ...@@ -2849,7 +3124,7 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt,
ulong old_tss_base = ulong old_tss_base =
ops->get_cached_segment_base(ctxt, VCPU_SREG_TR); ops->get_cached_segment_base(ctxt, VCPU_SREG_TR);
u32 desc_limit; u32 desc_limit;
ulong desc_addr; ulong desc_addr, dr7;
/* FIXME: old_tss_base == ~0 ? */ /* FIXME: old_tss_base == ~0 ? */
...@@ -2934,6 +3209,9 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, ...@@ -2934,6 +3209,9 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt,
ret = em_push(ctxt); ret = em_push(ctxt);
} }
ops->get_dr(ctxt, 7, &dr7);
ops->set_dr(ctxt, 7, dr7 & ~(DR_LOCAL_ENABLE_MASK | DR_LOCAL_SLOWDOWN));
return ret; return ret;
} }
...@@ -3840,7 +4118,7 @@ static const struct opcode group5[] = { ...@@ -3840,7 +4118,7 @@ static const struct opcode group5[] = {
F(DstMem | SrcNone | Lock, em_inc), F(DstMem | SrcNone | Lock, em_inc),
F(DstMem | SrcNone | Lock, em_dec), F(DstMem | SrcNone | Lock, em_dec),
I(SrcMem | NearBranch, em_call_near_abs), I(SrcMem | NearBranch, em_call_near_abs),
I(SrcMemFAddr | ImplicitOps | Stack, em_call_far), I(SrcMemFAddr | ImplicitOps, em_call_far),
I(SrcMem | NearBranch, em_jmp_abs), I(SrcMem | NearBranch, em_jmp_abs),
I(SrcMemFAddr | ImplicitOps, em_jmp_far), I(SrcMemFAddr | ImplicitOps, em_jmp_far),
I(SrcMem | Stack, em_push), D(Undefined), I(SrcMem | Stack, em_push), D(Undefined),
...@@ -4173,7 +4451,7 @@ static const struct opcode twobyte_table[256] = { ...@@ -4173,7 +4451,7 @@ static const struct opcode twobyte_table[256] = {
F(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N, F(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N,
/* 0xA8 - 0xAF */ /* 0xA8 - 0xAF */
I(Stack | Src2GS, em_push_sreg), I(Stack | Src2GS, em_pop_sreg), I(Stack | Src2GS, em_push_sreg), I(Stack | Src2GS, em_pop_sreg),
DI(ImplicitOps, rsm), II(No64 | EmulateOnUD | ImplicitOps, em_rsm, rsm),
F(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_bts), F(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_bts),
F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shrd), F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shrd),
F(DstMem | SrcReg | Src2CL | ModRM, em_shrd), F(DstMem | SrcReg | Src2CL | ModRM, em_shrd),
...@@ -4871,7 +5149,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt) ...@@ -4871,7 +5149,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
fetch_possible_mmx_operand(ctxt, &ctxt->dst); fetch_possible_mmx_operand(ctxt, &ctxt->dst);
} }
if (unlikely(ctxt->guest_mode) && (ctxt->d & Intercept)) { if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && ctxt->intercept) {
rc = emulator_check_intercept(ctxt, ctxt->intercept, rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_PRE_EXCEPT); X86_ICPT_PRE_EXCEPT);
if (rc != X86EMUL_CONTINUE) if (rc != X86EMUL_CONTINUE)
...@@ -4900,7 +5178,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt) ...@@ -4900,7 +5178,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
goto done; goto done;
} }
if (unlikely(ctxt->guest_mode) && (ctxt->d & Intercept)) { if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept, rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_EXCEPT); X86_ICPT_POST_EXCEPT);
if (rc != X86EMUL_CONTINUE) if (rc != X86EMUL_CONTINUE)
...@@ -4910,6 +5188,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt) ...@@ -4910,6 +5188,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
if (ctxt->rep_prefix && (ctxt->d & String)) { if (ctxt->rep_prefix && (ctxt->d & String)) {
/* All REP prefixes have the same first termination condition */ /* All REP prefixes have the same first termination condition */
if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) { if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
string_registers_quirk(ctxt);
ctxt->eip = ctxt->_eip; ctxt->eip = ctxt->_eip;
ctxt->eflags &= ~X86_EFLAGS_RF; ctxt->eflags &= ~X86_EFLAGS_RF;
goto done; goto done;
...@@ -4953,7 +5232,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt) ...@@ -4953,7 +5232,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
special_insn: special_insn:
if (unlikely(ctxt->guest_mode) && (ctxt->d & Intercept)) { if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept, rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_MEMACCESS); X86_ICPT_POST_MEMACCESS);
if (rc != X86EMUL_CONTINUE) if (rc != X86EMUL_CONTINUE)
......
...@@ -349,6 +349,7 @@ static int ioapic_service(struct kvm_ioapic *ioapic, int irq, bool line_status) ...@@ -349,6 +349,7 @@ static int ioapic_service(struct kvm_ioapic *ioapic, int irq, bool line_status)
irqe.delivery_mode = entry->fields.delivery_mode << 8; irqe.delivery_mode = entry->fields.delivery_mode << 8;
irqe.level = 1; irqe.level = 1;
irqe.shorthand = 0; irqe.shorthand = 0;
irqe.msi_redir_hint = false;
if (irqe.trig_mode == IOAPIC_EDGE_TRIG) if (irqe.trig_mode == IOAPIC_EDGE_TRIG)
ioapic->irr_delivered |= 1 << irq; ioapic->irr_delivered |= 1 << irq;
...@@ -637,11 +638,9 @@ void kvm_ioapic_destroy(struct kvm *kvm) ...@@ -637,11 +638,9 @@ void kvm_ioapic_destroy(struct kvm *kvm)
struct kvm_ioapic *ioapic = kvm->arch.vioapic; struct kvm_ioapic *ioapic = kvm->arch.vioapic;
cancel_delayed_work_sync(&ioapic->eoi_inject); cancel_delayed_work_sync(&ioapic->eoi_inject);
if (ioapic) { kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &ioapic->dev);
kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &ioapic->dev); kvm->arch.vioapic = NULL;
kvm->arch.vioapic = NULL; kfree(ioapic);
kfree(ioapic);
}
} }
int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state) int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
......
...@@ -31,6 +31,8 @@ ...@@ -31,6 +31,8 @@
#include "ioapic.h" #include "ioapic.h"
#include "lapic.h"
static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e, static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e,
struct kvm *kvm, int irq_source_id, int level, struct kvm *kvm, int irq_source_id, int level,
bool line_status) bool line_status)
...@@ -48,11 +50,6 @@ static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e, ...@@ -48,11 +50,6 @@ static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
line_status); line_status);
} }
inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
{
return irq->delivery_mode == APIC_DM_LOWEST;
}
int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
struct kvm_lapic_irq *irq, unsigned long *dest_map) struct kvm_lapic_irq *irq, unsigned long *dest_map)
{ {
...@@ -60,7 +57,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, ...@@ -60,7 +57,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
struct kvm_vcpu *vcpu, *lowest = NULL; struct kvm_vcpu *vcpu, *lowest = NULL;
if (irq->dest_mode == 0 && irq->dest_id == 0xff && if (irq->dest_mode == 0 && irq->dest_id == 0xff &&
kvm_is_dm_lowest_prio(irq)) { kvm_lowest_prio_delivery(irq)) {
printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n"); printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n");
irq->delivery_mode = APIC_DM_FIXED; irq->delivery_mode = APIC_DM_FIXED;
} }
...@@ -76,7 +73,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, ...@@ -76,7 +73,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
irq->dest_id, irq->dest_mode)) irq->dest_id, irq->dest_mode))
continue; continue;
if (!kvm_is_dm_lowest_prio(irq)) { if (!kvm_lowest_prio_delivery(irq)) {
if (r < 0) if (r < 0)
r = 0; r = 0;
r += kvm_apic_set_irq(vcpu, irq, dest_map); r += kvm_apic_set_irq(vcpu, irq, dest_map);
...@@ -106,9 +103,10 @@ static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e, ...@@ -106,9 +103,10 @@ static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
irq->dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo; irq->dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
irq->trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data; irq->trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
irq->delivery_mode = e->msi.data & 0x700; irq->delivery_mode = e->msi.data & 0x700;
irq->msi_redir_hint = ((e->msi.address_lo
& MSI_ADDR_REDIRECTION_LOWPRI) > 0);
irq->level = 1; irq->level = 1;
irq->shorthand = 0; irq->shorthand = 0;
/* TODO Deal with RH bit of MSI message address */
} }
int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
......
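MSI_ADDR_REDIRECTION_LOWPRI is bit 3 of the MSI address word, the Redirection Hint. Decoding it into msi_redir_hint lets the delivery code treat such an MSI like a lowest-priority interrupt rather than a fixed-destination one. kvm_lowest_prio_delivery() itself is not part of this excerpt; a plausible shape, offered only as an assumption, is:

static inline bool kvm_lowest_prio_delivery(struct kvm_lapic_irq *irq)
{
	return irq->delivery_mode == APIC_DM_LOWEST || irq->msi_redir_hint;
}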
...@@ -99,4 +99,9 @@ static inline bool is_guest_mode(struct kvm_vcpu *vcpu) ...@@ -99,4 +99,9 @@ static inline bool is_guest_mode(struct kvm_vcpu *vcpu)
return vcpu->arch.hflags & HF_GUEST_MASK; return vcpu->arch.hflags & HF_GUEST_MASK;
} }
static inline bool is_smm(struct kvm_vcpu *vcpu)
{
return vcpu->arch.hflags & HF_SMM_MASK;
}
#endif #endif
...@@ -240,6 +240,15 @@ static inline void kvm_apic_set_ldr(struct kvm_lapic *apic, u32 id) ...@@ -240,6 +240,15 @@ static inline void kvm_apic_set_ldr(struct kvm_lapic *apic, u32 id)
recalculate_apic_map(apic->vcpu->kvm); recalculate_apic_map(apic->vcpu->kvm);
} }
static inline void kvm_apic_set_x2apic_id(struct kvm_lapic *apic, u8 id)
{
u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));
apic_set_reg(apic, APIC_ID, id << 24);
apic_set_reg(apic, APIC_LDR, ldr);
recalculate_apic_map(apic->vcpu->kvm);
}
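The derived logical ID follows the x2APIC cluster scheme: bits 31-16 hold the cluster (id >> 4) and the low 16 bits a one-hot position inside the cluster. For example:

	/* id = 0x23 -> cluster 2, position 3: LDR = (2 << 16) | (1 << 3) = 0x00020008 */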
static inline int apic_lvt_enabled(struct kvm_lapic *apic, int lvt_type) static inline int apic_lvt_enabled(struct kvm_lapic *apic, int lvt_type)
{ {
return !(kvm_apic_get_reg(apic, lvt_type) & APIC_LVT_MASKED); return !(kvm_apic_get_reg(apic, lvt_type) & APIC_LVT_MASKED);
...@@ -728,7 +737,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src, ...@@ -728,7 +737,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
dst = map->logical_map[cid]; dst = map->logical_map[cid];
if (irq->delivery_mode == APIC_DM_LOWEST) { if (kvm_lowest_prio_delivery(irq)) {
int l = -1; int l = -1;
for_each_set_bit(i, &bitmap, 16) { for_each_set_bit(i, &bitmap, 16) {
if (!dst[i]) if (!dst[i])
...@@ -799,7 +808,9 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode, ...@@ -799,7 +808,9 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
break; break;
case APIC_DM_SMI: case APIC_DM_SMI:
apic_debug("Ignoring guest SMI\n"); result = 1;
kvm_make_request(KVM_REQ_SMI, vcpu);
kvm_vcpu_kick(vcpu);
break; break;
case APIC_DM_NMI: case APIC_DM_NMI:
...@@ -914,9 +925,10 @@ static void apic_send_ipi(struct kvm_lapic *apic) ...@@ -914,9 +925,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
irq.vector = icr_low & APIC_VECTOR_MASK; irq.vector = icr_low & APIC_VECTOR_MASK;
irq.delivery_mode = icr_low & APIC_MODE_MASK; irq.delivery_mode = icr_low & APIC_MODE_MASK;
irq.dest_mode = icr_low & APIC_DEST_MASK; irq.dest_mode = icr_low & APIC_DEST_MASK;
irq.level = icr_low & APIC_INT_ASSERT; irq.level = (icr_low & APIC_INT_ASSERT) != 0;
irq.trig_mode = icr_low & APIC_INT_LEVELTRIG; irq.trig_mode = icr_low & APIC_INT_LEVELTRIG;
irq.shorthand = icr_low & APIC_SHORT_MASK; irq.shorthand = icr_low & APIC_SHORT_MASK;
irq.msi_redir_hint = false;
if (apic_x2apic_mode(apic)) if (apic_x2apic_mode(apic))
irq.dest_id = icr_high; irq.dest_id = icr_high;
else else
...@@ -926,10 +938,11 @@ static void apic_send_ipi(struct kvm_lapic *apic) ...@@ -926,10 +938,11 @@ static void apic_send_ipi(struct kvm_lapic *apic)
apic_debug("icr_high 0x%x, icr_low 0x%x, " apic_debug("icr_high 0x%x, icr_low 0x%x, "
"short_hand 0x%x, dest 0x%x, trig_mode 0x%x, level 0x%x, " "short_hand 0x%x, dest 0x%x, trig_mode 0x%x, level 0x%x, "
"dest_mode 0x%x, delivery_mode 0x%x, vector 0x%x\n", "dest_mode 0x%x, delivery_mode 0x%x, vector 0x%x, "
"msi_redir_hint 0x%x\n",
icr_high, icr_low, irq.shorthand, irq.dest_id, icr_high, icr_low, irq.shorthand, irq.dest_id,
irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode, irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
irq.vector); irq.vector, irq.msi_redir_hint);
kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL); kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
} }
...@@ -1541,9 +1554,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value) ...@@ -1541,9 +1554,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
if ((old_value ^ value) & X2APIC_ENABLE) { if ((old_value ^ value) & X2APIC_ENABLE) {
if (value & X2APIC_ENABLE) { if (value & X2APIC_ENABLE) {
u32 id = kvm_apic_id(apic); kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);
u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));
kvm_apic_set_ldr(apic, ldr);
kvm_x86_ops->set_virtual_x2apic_mode(vcpu, true); kvm_x86_ops->set_virtual_x2apic_mode(vcpu, true);
} else } else
kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false); kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false);
...@@ -1562,7 +1573,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value) ...@@ -1562,7 +1573,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
} }
void kvm_lapic_reset(struct kvm_vcpu *vcpu) void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
{ {
struct kvm_lapic *apic; struct kvm_lapic *apic;
int i; int i;
...@@ -1576,19 +1587,22 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu) ...@@ -1576,19 +1587,22 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
/* Stop the timer in case it's a reset to an active apic */ /* Stop the timer in case it's a reset to an active apic */
hrtimer_cancel(&apic->lapic_timer.timer); hrtimer_cancel(&apic->lapic_timer.timer);
kvm_apic_set_id(apic, vcpu->vcpu_id); if (!init_event)
kvm_apic_set_id(apic, vcpu->vcpu_id);
kvm_apic_set_version(apic->vcpu); kvm_apic_set_version(apic->vcpu);
for (i = 0; i < APIC_LVT_NUM; i++) for (i = 0; i < APIC_LVT_NUM; i++)
apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED); apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED);
apic_update_lvtt(apic); apic_update_lvtt(apic);
apic_set_reg(apic, APIC_LVT0, if (!(vcpu->kvm->arch.disabled_quirks & KVM_QUIRK_LINT0_REENABLED))
SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT)); apic_set_reg(apic, APIC_LVT0,
SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
apic_set_reg(apic, APIC_DFR, 0xffffffffU); apic_set_reg(apic, APIC_DFR, 0xffffffffU);
apic_set_spiv(apic, 0xff); apic_set_spiv(apic, 0xff);
apic_set_reg(apic, APIC_TASKPRI, 0); apic_set_reg(apic, APIC_TASKPRI, 0);
kvm_apic_set_ldr(apic, 0); if (!apic_x2apic_mode(apic))
kvm_apic_set_ldr(apic, 0);
apic_set_reg(apic, APIC_ESR, 0); apic_set_reg(apic, APIC_ESR, 0);
apic_set_reg(apic, APIC_ICR, 0); apic_set_reg(apic, APIC_ICR, 0);
apic_set_reg(apic, APIC_ICR2, 0); apic_set_reg(apic, APIC_ICR2, 0);
...@@ -1717,7 +1731,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu) ...@@ -1717,7 +1731,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE); APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE);
static_key_slow_inc(&apic_sw_disabled.key); /* sw disabled at reset */ static_key_slow_inc(&apic_sw_disabled.key); /* sw disabled at reset */
kvm_lapic_reset(vcpu); kvm_lapic_reset(vcpu, false);
kvm_iodevice_init(&apic->dev, &apic_mmio_ops); kvm_iodevice_init(&apic->dev, &apic_mmio_ops);
return 0; return 0;
...@@ -2049,11 +2063,22 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu) ...@@ -2049,11 +2063,22 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
if (!kvm_vcpu_has_lapic(vcpu) || !apic->pending_events) if (!kvm_vcpu_has_lapic(vcpu) || !apic->pending_events)
return; return;
pe = xchg(&apic->pending_events, 0); /*
* INITs are latched while in SMM. Because an SMM CPU cannot
* be in KVM_MP_STATE_INIT_RECEIVED state, just eat SIPIs
* and delay processing of INIT until the next RSM.
*/
if (is_smm(vcpu)) {
WARN_ON_ONCE(vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED);
if (test_bit(KVM_APIC_SIPI, &apic->pending_events))
clear_bit(KVM_APIC_SIPI, &apic->pending_events);
return;
}
pe = xchg(&apic->pending_events, 0);
if (test_bit(KVM_APIC_INIT, &pe)) { if (test_bit(KVM_APIC_INIT, &pe)) {
kvm_lapic_reset(vcpu); kvm_lapic_reset(vcpu, true);
kvm_vcpu_reset(vcpu); kvm_vcpu_reset(vcpu, true);
if (kvm_vcpu_is_bsp(apic->vcpu)) if (kvm_vcpu_is_bsp(apic->vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
else else
......
@@ -48,7 +48,7 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
void kvm_apic_accept_events(struct kvm_vcpu *vcpu);
- void kvm_lapic_reset(struct kvm_vcpu *vcpu);
+ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event);
u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu);
void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8);
void kvm_lapic_set_eoi(struct kvm_vcpu *vcpu);
@@ -150,7 +150,18 @@ static inline bool kvm_apic_vid_enabled(struct kvm *kvm)
static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
{
- return vcpu->arch.apic->pending_events;
+ return kvm_vcpu_has_lapic(vcpu) && vcpu->arch.apic->pending_events;
+ }
+ static inline bool kvm_lowest_prio_delivery(struct kvm_lapic_irq *irq)
+ {
+ return (irq->delivery_mode == APIC_DM_LOWEST ||
+ irq->msi_redir_hint);
+ }
+ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)
+ {
+ return kvm_vcpu_has_lapic(vcpu) && test_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
}
bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
...
@@ -43,6 +43,7 @@
#define PT_PDPE_LEVEL 3
#define PT_DIRECTORY_LEVEL 2
#define PT_PAGE_TABLE_LEVEL 1
+ #define PT_MAX_HUGEPAGE_LEVEL (PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES - 1)
static inline u64 rsvd_bits(int s, int e)
{
@@ -170,4 +171,5 @@ static inline bool permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
}
void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm);
+ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
#endif
@@ -114,7 +114,7 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
return;
gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
- pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn);
+ pfn = kvm_vcpu_gfn_to_pfn_atomic(vcpu, gfn);
if (is_error_pfn(pfn))
return;
@@ -131,12 +131,16 @@ static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep)
static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 10);
unsigned long *rmapp;
struct kvm_mmu_page *rev_sp;
+ struct kvm_memslots *slots;
+ struct kvm_memory_slot *slot;
gfn_t gfn;
rev_sp = page_header(__pa(sptep));
gfn = kvm_mmu_page_get_gfn(rev_sp, sptep - rev_sp->spt);
- if (!gfn_to_memslot(kvm, gfn)) {
+ slots = kvm_memslots_for_spte_role(kvm, rev_sp->role);
+ slot = __gfn_to_memslot(slots, gfn);
+ if (!slot) {
if (!__ratelimit(&ratelimit_state))
return;
audit_printk(kvm, "no memslot for gfn %llx\n", gfn);
@@ -146,7 +150,7 @@ static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep)
return;
}
- rmapp = gfn_to_rmap(kvm, gfn, rev_sp->role.level);
+ rmapp = __gfn_to_rmap(gfn, rev_sp->role.level, slot);
if (!*rmapp) {
if (!__ratelimit(&ratelimit_state))
return;
@@ -191,19 +195,21 @@ static void audit_write_protection(struct kvm *kvm, struct kvm_mmu_page *sp)
unsigned long *rmapp;
u64 *sptep;
struct rmap_iterator iter;
+ struct kvm_memslots *slots;
+ struct kvm_memory_slot *slot;
if (sp->role.direct || sp->unsync || sp->role.invalid)
return;
- rmapp = gfn_to_rmap(kvm, sp->gfn, PT_PAGE_TABLE_LEVEL);
+ slots = kvm_memslots_for_spte_role(kvm, sp->role);
+ slot = __gfn_to_memslot(slots, sp->gfn);
+ rmapp = __gfn_to_rmap(sp->gfn, PT_PAGE_TABLE_LEVEL, slot);
- for (sptep = rmap_get_first(*rmapp, &iter); sptep;
- sptep = rmap_get_next(&iter)) {
+ for_each_rmap_spte(rmapp, &iter, sptep)
if (is_writable_pte(*sptep))
audit_printk(kvm, "shadow page has writable "
"mappings: gfn %llx role %x\n",
sp->gfn, sp->role.word);
- }
}
static void audit_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
...
/*
* vMTRR implementation
*
* Copyright (C) 2006 Qumranet, Inc.
* Copyright 2010 Red Hat, Inc. and/or its affiliates.
* Copyright(C) 2015 Intel Corporation.
*
* Authors:
* Yaniv Kamay <yaniv@qumranet.com>
* Avi Kivity <avi@qumranet.com>
* Marcelo Tosatti <mtosatti@redhat.com>
* Paolo Bonzini <pbonzini@redhat.com>
* Xiao Guangrong <guangrong.xiao@linux.intel.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*/
#include <linux/kvm_host.h>
#include <asm/mtrr.h>
#include "cpuid.h"
#include "mmu.h"
#define IA32_MTRR_DEF_TYPE_E (1ULL << 11)
#define IA32_MTRR_DEF_TYPE_FE (1ULL << 10)
#define IA32_MTRR_DEF_TYPE_TYPE_MASK (0xff)
static bool msr_mtrr_valid(unsigned msr)
{
switch (msr) {
case 0x200 ... 0x200 + 2 * KVM_NR_VAR_MTRR - 1:
case MSR_MTRRfix64K_00000:
case MSR_MTRRfix16K_80000:
case MSR_MTRRfix16K_A0000:
case MSR_MTRRfix4K_C0000:
case MSR_MTRRfix4K_C8000:
case MSR_MTRRfix4K_D0000:
case MSR_MTRRfix4K_D8000:
case MSR_MTRRfix4K_E0000:
case MSR_MTRRfix4K_E8000:
case MSR_MTRRfix4K_F0000:
case MSR_MTRRfix4K_F8000:
case MSR_MTRRdefType:
case MSR_IA32_CR_PAT:
return true;
case 0x2f8:
return true;
}
return false;
}
static bool valid_pat_type(unsigned t)
{
return t < 8 && (1 << t) & 0xf3; /* 0, 1, 4, 5, 6, 7 */
}
static bool valid_mtrr_type(unsigned t)
{
return t < 8 && (1 << t) & 0x73; /* 0, 1, 4, 5, 6 */
}
bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
int i;
u64 mask;
if (!msr_mtrr_valid(msr))
return false;
if (msr == MSR_IA32_CR_PAT) {
for (i = 0; i < 8; i++)
if (!valid_pat_type((data >> (i * 8)) & 0xff))
return false;
return true;
} else if (msr == MSR_MTRRdefType) {
if (data & ~0xcff)
return false;
return valid_mtrr_type(data & 0xff);
} else if (msr >= MSR_MTRRfix64K_00000 && msr <= MSR_MTRRfix4K_F8000) {
for (i = 0; i < 8 ; i++)
if (!valid_mtrr_type((data >> (i * 8)) & 0xff))
return false;
return true;
}
/* variable MTRRs */
WARN_ON(!(msr >= 0x200 && msr < 0x200 + 2 * KVM_NR_VAR_MTRR));
mask = (~0ULL) << cpuid_maxphyaddr(vcpu);
if ((msr & 1) == 0) {
/* MTRR base */
if (!valid_mtrr_type(data & 0xff))
return false;
mask |= 0xf00;
} else
/* MTRR mask */
mask |= 0x7ff;
if (data & mask) {
kvm_inject_gp(vcpu, 0);
return false;
}
return true;
}
EXPORT_SYMBOL_GPL(kvm_mtrr_valid);
static bool mtrr_is_enabled(struct kvm_mtrr *mtrr_state)
{
return !!(mtrr_state->deftype & IA32_MTRR_DEF_TYPE_E);
}
static bool fixed_mtrr_is_enabled(struct kvm_mtrr *mtrr_state)
{
return !!(mtrr_state->deftype & IA32_MTRR_DEF_TYPE_FE);
}
static u8 mtrr_default_type(struct kvm_mtrr *mtrr_state)
{
return mtrr_state->deftype & IA32_MTRR_DEF_TYPE_TYPE_MASK;
}
/*
* Three terms are used in the following code:
* - segment: an address segment covered by the fixed MTRRs.
* - unit: one MSR entry within a segment.
* - range: a block of addresses that shares a single memory cache type.
*/
struct fixed_mtrr_segment {
u64 start;
u64 end;
int range_shift;
/* the start position in kvm_mtrr.fixed_ranges[]. */
int range_start;
};
static struct fixed_mtrr_segment fixed_seg_table[] = {
/* MSR_MTRRfix64K_00000, 1 unit. 64K fixed mtrr. */
{
.start = 0x0,
.end = 0x80000,
.range_shift = 16, /* 64K */
.range_start = 0,
},
/*
* MSR_MTRRfix16K_80000 ... MSR_MTRRfix16K_A0000, 2 units,
* 16K fixed mtrr.
*/
{
.start = 0x80000,
.end = 0xc0000,
.range_shift = 14, /* 16K */
.range_start = 8,
},
/*
* MSR_MTRRfix4K_C0000 ... MSR_MTRRfix4K_F8000, 8 units,
* 4K fixed mtrr.
*/
{
.start = 0xc0000,
.end = 0x100000,
.range_shift = 12, /* 4K */
.range_start = 24,
}
};
/*
* A unit is the region covered by one MSR.  Each MSR entry contains
* 8 ranges, so the unit size is always 8 * 2^range_shift.
*/
static u64 fixed_mtrr_seg_unit_size(int seg)
{
return 8 << fixed_seg_table[seg].range_shift;
}
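/*
 * For example: segment 0 has a unit size of 8 << 16 = 512KB and covers
 * [0x0, 0x80000) with range indexes 0-7; segment 1 has a unit size of
 * 8 << 14 = 128KB and covers [0x80000, 0xc0000) with range indexes 8-23;
 * segment 2 has a unit size of 8 << 12 = 32KB and covers
 * [0xc0000, 0x100000) with range indexes 24-87.  Together the 11 fixed
 * MTRR MSRs describe all 88 fixed ranges.
 */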
static bool fixed_msr_to_seg_unit(u32 msr, int *seg, int *unit)
{
switch (msr) {
case MSR_MTRRfix64K_00000:
*seg = 0;
*unit = 0;
break;
case MSR_MTRRfix16K_80000 ... MSR_MTRRfix16K_A0000:
*seg = 1;
*unit = msr - MSR_MTRRfix16K_80000;
break;
case MSR_MTRRfix4K_C0000 ... MSR_MTRRfix4K_F8000:
*seg = 2;
*unit = msr - MSR_MTRRfix4K_C0000;
break;
default:
return false;
}
return true;
}
static void fixed_mtrr_seg_unit_range(int seg, int unit, u64 *start, u64 *end)
{
struct fixed_mtrr_segment *mtrr_seg = &fixed_seg_table[seg];
u64 unit_size = fixed_mtrr_seg_unit_size(seg);
*start = mtrr_seg->start + unit * unit_size;
*end = *start + unit_size;
WARN_ON(*end > mtrr_seg->end);
}
static int fixed_mtrr_seg_unit_range_index(int seg, int unit)
{
struct fixed_mtrr_segment *mtrr_seg = &fixed_seg_table[seg];
WARN_ON(mtrr_seg->start + unit * fixed_mtrr_seg_unit_size(seg)
> mtrr_seg->end);
/* each unit has 8 ranges. */
return mtrr_seg->range_start + 8 * unit;
}
static int fixed_mtrr_seg_end_range_index(int seg)
{
struct fixed_mtrr_segment *mtrr_seg = &fixed_seg_table[seg];
int n;
n = (mtrr_seg->end - mtrr_seg->start) >> mtrr_seg->range_shift;
return mtrr_seg->range_start + n - 1;
}
static bool fixed_msr_to_range(u32 msr, u64 *start, u64 *end)
{
int seg, unit;
if (!fixed_msr_to_seg_unit(msr, &seg, &unit))
return false;
fixed_mtrr_seg_unit_range(seg, unit, start, end);
return true;
}
static int fixed_msr_to_range_index(u32 msr)
{
int seg, unit;
if (!fixed_msr_to_seg_unit(msr, &seg, &unit))
return -1;
return fixed_mtrr_seg_unit_range_index(seg, unit);
}
static int fixed_mtrr_addr_to_seg(u64 addr)
{
struct fixed_mtrr_segment *mtrr_seg;
int seg, seg_num = ARRAY_SIZE(fixed_seg_table);
for (seg = 0; seg < seg_num; seg++) {
mtrr_seg = &fixed_seg_table[seg];
if (mtrr_seg->start <= addr && addr < mtrr_seg->end)
return seg;
}
return -1;
}
static int fixed_mtrr_addr_seg_to_range_index(u64 addr, int seg)
{
struct fixed_mtrr_segment *mtrr_seg;
int index;
mtrr_seg = &fixed_seg_table[seg];
index = mtrr_seg->range_start;
index += (addr - mtrr_seg->start) >> mtrr_seg->range_shift;
return index;
}
static u64 fixed_mtrr_range_end_addr(int seg, int index)
{
struct fixed_mtrr_segment *mtrr_seg = &fixed_seg_table[seg];
int pos = index - mtrr_seg->range_start;
return mtrr_seg->start + ((pos + 1) << mtrr_seg->range_shift);
}
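/*
 * Worked example: addr = 0xc8000 lies in segment 2, so
 * fixed_mtrr_addr_seg_to_range_index() returns
 * 24 + ((0xc8000 - 0xc0000) >> 12) = 32, and
 * fixed_mtrr_range_end_addr(2, 32) returns
 * 0xc0000 + ((32 - 24 + 1) << 12) = 0xc9000, i.e. the fixed range
 * covering [0xc8000, 0xc9000).
 */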
static void var_mtrr_range(struct kvm_mtrr_range *range, u64 *start, u64 *end)
{
u64 mask;
*start = range->base & PAGE_MASK;
mask = range->mask & PAGE_MASK;
mask |= ~0ULL << boot_cpu_data.x86_phys_bits;
/* This cannot overflow because writing to the reserved bits of
* variable MTRRs causes a #GP.
*/
*end = (*start | ~mask) + 1;
}
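/*
 * Worked example (assuming 36 physical address bits): base = 0x80000006
 * (write-back) and mask = 0xfc0000800 (valid bit set) decode to
 * *start = 0x80000000 and *end = 0xc0000000, i.e. a 1GB WB range.
 */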
static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
gfn_t start, end;
int index;
if (msr == MSR_IA32_CR_PAT || !tdp_enabled ||
!kvm_arch_has_noncoherent_dma(vcpu->kvm))
return;
if (!mtrr_is_enabled(mtrr_state) && msr != MSR_MTRRdefType)
return;
/* fixed MTRRs. */
if (fixed_msr_to_range(msr, &start, &end)) {
if (!fixed_mtrr_is_enabled(mtrr_state))
return;
} else if (msr == MSR_MTRRdefType) {
start = 0x0;
end = ~0ULL;
} else {
/* variable range MTRRs. */
index = (msr - 0x200) / 2;
var_mtrr_range(&mtrr_state->var_ranges[index], &start, &end);
}
kvm_zap_gfn_range(vcpu->kvm, gpa_to_gfn(start), gpa_to_gfn(end));
}
static bool var_mtrr_range_is_valid(struct kvm_mtrr_range *range)
{
return (range->mask & (1 << 11)) != 0;
}
static void set_var_mtrr_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
struct kvm_mtrr_range *tmp, *cur;
int index, is_mtrr_mask;
index = (msr - 0x200) / 2;
is_mtrr_mask = msr - 0x200 - 2 * index;
cur = &mtrr_state->var_ranges[index];
/* remove the entry if it's in the list. */
if (var_mtrr_range_is_valid(cur))
list_del(&mtrr_state->var_ranges[index].node);
if (!is_mtrr_mask)
cur->base = data;
else
cur->mask = data;
/* add it to the list if it's enabled. */
if (var_mtrr_range_is_valid(cur)) {
list_for_each_entry(tmp, &mtrr_state->head, node)
if (cur->base >= tmp->base)
break;
list_add_tail(&cur->node, &tmp->node);
}
}
int kvm_mtrr_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
int index;
if (!kvm_mtrr_valid(vcpu, msr, data))
return 1;
index = fixed_msr_to_range_index(msr);
if (index >= 0)
*(u64 *)&vcpu->arch.mtrr_state.fixed_ranges[index] = data;
else if (msr == MSR_MTRRdefType)
vcpu->arch.mtrr_state.deftype = data;
else if (msr == MSR_IA32_CR_PAT)
vcpu->arch.pat = data;
else
set_var_mtrr_msr(vcpu, msr, data);
update_mtrr(vcpu, msr);
return 0;
}
int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
int index;
/* MSR_MTRRcap is a readonly MSR. */
if (msr == MSR_MTRRcap) {
/*
* SMRR = 0
* WC = 1
* FIX = 1
* VCNT = KVM_NR_VAR_MTRR
*/
*pdata = 0x500 | KVM_NR_VAR_MTRR;
return 0;
}
if (!msr_mtrr_valid(msr))
return 1;
index = fixed_msr_to_range_index(msr);
if (index >= 0)
*pdata = *(u64 *)&vcpu->arch.mtrr_state.fixed_ranges[index];
else if (msr == MSR_MTRRdefType)
*pdata = vcpu->arch.mtrr_state.deftype;
else if (msr == MSR_IA32_CR_PAT)
*pdata = vcpu->arch.pat;
else { /* Variable MTRRs */
int is_mtrr_mask;
index = (msr - 0x200) / 2;
is_mtrr_mask = msr - 0x200 - 2 * index;
if (!is_mtrr_mask)
*pdata = vcpu->arch.mtrr_state.var_ranges[index].base;
else
*pdata = vcpu->arch.mtrr_state.var_ranges[index].mask;
}
return 0;
}
void kvm_vcpu_mtrr_init(struct kvm_vcpu *vcpu)
{
INIT_LIST_HEAD(&vcpu->arch.mtrr_state.head);
}
struct mtrr_iter {
/* input fields. */
struct kvm_mtrr *mtrr_state;
u64 start;
u64 end;
/* output fields. */
int mem_type;
/* is [start, end) not fully covered by MTRRs? */
bool partial_map;
/* private fields. */
union {
/* used for fixed MTRRs. */
struct {
int index;
int seg;
};
/* used for var MTRRs. */
struct {
struct kvm_mtrr_range *range;
/* the maximum address that has been covered by variable MTRRs. */
u64 start_max;
};
};
bool fixed;
};
static bool mtrr_lookup_fixed_start(struct mtrr_iter *iter)
{
int seg, index;
if (!fixed_mtrr_is_enabled(iter->mtrr_state))
return false;
seg = fixed_mtrr_addr_to_seg(iter->start);
if (seg < 0)
return false;
iter->fixed = true;
index = fixed_mtrr_addr_seg_to_range_index(iter->start, seg);
iter->index = index;
iter->seg = seg;
return true;
}
static bool match_var_range(struct mtrr_iter *iter,
struct kvm_mtrr_range *range)
{
u64 start, end;
var_mtrr_range(range, &start, &end);
if (!(start >= iter->end || end <= iter->start)) {
iter->range = range;
/*
* This function is called while walking kvm_mtrr.head; the matched
* range has the lowest base address that overlaps
* [iter->start_max, iter->end).
*/
iter->partial_map |= iter->start_max < start;
/* update the maximum address that has been covered so far. */
iter->start_max = max(iter->start_max, end);
return true;
}
return false;
}
static void __mtrr_lookup_var_next(struct mtrr_iter *iter)
{
struct kvm_mtrr *mtrr_state = iter->mtrr_state;
list_for_each_entry_continue(iter->range, &mtrr_state->head, node)
if (match_var_range(iter, iter->range))
return;
iter->range = NULL;
iter->partial_map |= iter->start_max < iter->end;
}
static void mtrr_lookup_var_start(struct mtrr_iter *iter)
{
struct kvm_mtrr *mtrr_state = iter->mtrr_state;
iter->fixed = false;
iter->start_max = iter->start;
iter->range = list_prepare_entry(iter->range, &mtrr_state->head, node);
__mtrr_lookup_var_next(iter);
}
static void mtrr_lookup_fixed_next(struct mtrr_iter *iter)
{
/* terminate the lookup. */
if (fixed_mtrr_range_end_addr(iter->seg, iter->index) >= iter->end) {
iter->fixed = false;
iter->range = NULL;
return;
}
iter->index++;
/* all fixed MTRRs have been looked up. */
if (iter->index >= ARRAY_SIZE(iter->mtrr_state->fixed_ranges))
return mtrr_lookup_var_start(iter);
/* switch to next segment. */
if (iter->index > fixed_mtrr_seg_end_range_index(iter->seg))
iter->seg++;
}
static void mtrr_lookup_var_next(struct mtrr_iter *iter)
{
__mtrr_lookup_var_next(iter);
}
static void mtrr_lookup_start(struct mtrr_iter *iter)
{
if (!mtrr_is_enabled(iter->mtrr_state)) {
iter->partial_map = true;
return;
}
if (!mtrr_lookup_fixed_start(iter))
mtrr_lookup_var_start(iter);
}
static void mtrr_lookup_init(struct mtrr_iter *iter,
struct kvm_mtrr *mtrr_state, u64 start, u64 end)
{
iter->mtrr_state = mtrr_state;
iter->start = start;
iter->end = end;
iter->partial_map = false;
iter->fixed = false;
iter->range = NULL;
mtrr_lookup_start(iter);
}
static bool mtrr_lookup_okay(struct mtrr_iter *iter)
{
if (iter->fixed) {
iter->mem_type = iter->mtrr_state->fixed_ranges[iter->index];
return true;
}
if (iter->range) {
iter->mem_type = iter->range->base & 0xff;
return true;
}
return false;
}
static void mtrr_lookup_next(struct mtrr_iter *iter)
{
if (iter->fixed)
mtrr_lookup_fixed_next(iter);
else
mtrr_lookup_var_next(iter);
}
#define mtrr_for_each_mem_type(_iter_, _mtrr_, _gpa_start_, _gpa_end_) \
for (mtrr_lookup_init(_iter_, _mtrr_, _gpa_start_, _gpa_end_); \
mtrr_lookup_okay(_iter_); mtrr_lookup_next(_iter_))
u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn)
{
struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
struct mtrr_iter iter;
u64 start, end;
int type = -1;
const int wt_wb_mask = (1 << MTRR_TYPE_WRBACK)
| (1 << MTRR_TYPE_WRTHROUGH);
start = gfn_to_gpa(gfn);
end = start + PAGE_SIZE;
mtrr_for_each_mem_type(&iter, mtrr_state, start, end) {
int curr_type = iter.mem_type;
/*
* Please refer to Intel SDM Volume 3: 11.11.4.1 MTRR
* Precedences.
*/
if (type == -1) {
type = curr_type;
continue;
}
/*
* If two or more variable memory ranges match and the
* memory types are identical, then that memory type is
* used.
*/
if (type == curr_type)
continue;
/*
* If two or more variable memory ranges match and one of
* the memory types is UC, the UC memory type is used.
*/
if (curr_type == MTRR_TYPE_UNCACHABLE)
return MTRR_TYPE_UNCACHABLE;
/*
* If two or more variable memory ranges match and the
* memory types are WT and WB, the WT memory type is used.
*/
if (((1 << type) & wt_wb_mask) &&
((1 << curr_type) & wt_wb_mask)) {
type = MTRR_TYPE_WRTHROUGH;
continue;
}
/*
* For overlaps not defined by the above rules, processor
* behavior is undefined.
*/
/* We use WB for this undefined behavior. :( */
return MTRR_TYPE_WRBACK;
}
/* It is not covered by MTRRs. */
if (iter.partial_map) {
/*
* Only a single page was checked, so it cannot be partially
* covered by MTRRs.
*/
WARN_ON(type != -1);
type = mtrr_default_type(mtrr_state);
}
return type;
}
EXPORT_SYMBOL_GPL(kvm_mtrr_get_guest_memory_type);
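/*
 * For example, a page that falls inside both a WB variable range and a
 * WT variable range resolves to WT; if either overlapping range is UC,
 * the result is UC; and a page matched by no range at all gets the
 * default type from IA32_MTRR_DEF_TYPE.
 */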
bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
int page_num)
{
struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
struct mtrr_iter iter;
u64 start, end;
int type = -1;
start = gfn_to_gpa(gfn);
end = gfn_to_gpa(gfn + page_num);
mtrr_for_each_mem_type(&iter, mtrr_state, start, end) {
if (type == -1) {
type = iter.mem_type;
continue;
}
if (type != iter.mem_type)
return false;
}
if (!iter.partial_map)
return true;
if (type == -1)
return true;
return type == mtrr_default_type(mtrr_state);
}
@@ -256,7 +256,7 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu,
if (ret)
return ret;
- mark_page_dirty(vcpu->kvm, table_gfn);
+ kvm_vcpu_mark_page_dirty(vcpu, table_gfn);
walker->ptes[level] = pte;
}
return 0;
@@ -338,7 +338,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
real_gfn = gpa_to_gfn(real_gfn);
- host_addr = gfn_to_hva_prot(vcpu->kvm, real_gfn,
+ host_addr = kvm_vcpu_gfn_to_hva_prot(vcpu, real_gfn,
&walker->pte_writable[walker->level - 1]);
if (unlikely(kvm_is_error_hva(host_addr)))
goto error;
@@ -511,11 +511,11 @@ static bool FNAME(gpte_changed)(struct kvm_vcpu *vcpu,
base_gpa = pte_gpa & ~mask;
index = (pte_gpa - base_gpa) / sizeof(pt_element_t);
- r = kvm_read_guest_atomic(vcpu->kvm, base_gpa,
+ r = kvm_vcpu_read_guest_atomic(vcpu, base_gpa,
gw->prefetch_ptes, sizeof(gw->prefetch_ptes));
curr_pte = gw->prefetch_ptes[index];
} else
- r = kvm_read_guest_atomic(vcpu->kvm, pte_gpa,
+ r = kvm_vcpu_read_guest_atomic(vcpu, pte_gpa,
&curr_pte, sizeof(curr_pte));
return r || curr_pte != gw->ptes[level - 1];
@@ -869,8 +869,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
if (!rmap_can_add(vcpu))
break;
- if (kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &gpte,
+ if (kvm_vcpu_read_guest_atomic(vcpu, pte_gpa, &gpte,
sizeof(pt_element_t)))
break;
FNAME(update_pte)(vcpu, sp, sptep, &gpte);
@@ -956,8 +956,8 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
pte_gpa = first_pte_gpa + i * sizeof(pt_element_t);
- if (kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &gpte,
+ if (kvm_vcpu_read_guest_atomic(vcpu, pte_gpa, &gpte,
sizeof(pt_element_t)))
return -EINVAL;
if (FNAME(prefetch_invalid_gpte)(vcpu, sp, &sp->spt[i], gpte)) {
@@ -970,7 +970,7 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
pte_access &= FNAME(gpte_access)(vcpu, gpte);
FNAME(protect_clean_gpte)(&pte_access, gpte);
- if (sync_mmio_spte(vcpu->kvm, &sp->spt[i], gfn, pte_access,
+ if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access,
&nr_present))
continue;
...
@@ -952,6 +952,28 @@ TRACE_EVENT(kvm_wait_lapic_expire,
__entry->delta < 0 ? "early" : "late")
);
+ TRACE_EVENT(kvm_enter_smm,
+ TP_PROTO(unsigned int vcpu_id, u64 smbase, bool entering),
+ TP_ARGS(vcpu_id, smbase, entering),
+ TP_STRUCT__entry(
+ __field( unsigned int, vcpu_id )
+ __field( u64, smbase )
+ __field( bool, entering )
+ ),
+ TP_fast_assign(
+ __entry->vcpu_id = vcpu_id;
+ __entry->smbase = smbase;
+ __entry->entering = entering;
+ ),
+ TP_printk("vcpu %u: %s SMM, smbase 0x%llx",
+ __entry->vcpu_id,
+ __entry->entering ? "entering" : "leaving",
+ __entry->smbase)
+ );
#endif /* _TRACE_KVM_H */
#undef TRACE_INCLUDE_PATH
...
@@ -4,6 +4,8 @@
#include <linux/kvm_host.h>
#include "kvm_cache_regs.h"
+ #define MSR_IA32_CR_PAT_DEFAULT 0x0007040600070406ULL
static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu)
{
vcpu->arch.exception.pending = false;
@@ -160,7 +162,13 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
gva_t addr, void *val, unsigned int bytes,
struct x86_exception *exception);
+ void kvm_vcpu_mtrr_init(struct kvm_vcpu *vcpu);
+ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data);
+ int kvm_mtrr_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data);
+ int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
+ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
+ int page_num);
#define KVM_SUPPORTED_XCR0 (XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
| XSTATE_BNDREGS | XSTATE_BNDCSR \
...
@@ -28,6 +28,7 @@ struct kvm_run;
struct kvm_userspace_memory_region;
struct kvm_vcpu;
struct kvm_vcpu_init;
+ struct kvm_memslots;
enum kvm_mr_change;
...
@@ -29,8 +29,8 @@ void kvm_async_pf_deinit(void);
void kvm_async_pf_vcpu_init(struct kvm_vcpu *vcpu);
#else
#define kvm_async_pf_init() (0)
- #define kvm_async_pf_deinit() do{}while(0)
+ #define kvm_async_pf_deinit() do {} while (0)
- #define kvm_async_pf_vcpu_init(C) do{}while(0)
+ #define kvm_async_pf_vcpu_init(C) do {} while (0)
#endif
#endif
@@ -24,9 +24,9 @@ struct kvm_coalesced_mmio_dev {
int kvm_coalesced_mmio_init(struct kvm *kvm);
void kvm_coalesced_mmio_free(struct kvm *kvm);
int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
struct kvm_coalesced_mmio_zone *zone);
int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
struct kvm_coalesced_mmio_zone *zone);
#else
...