Commit 636deed6 authored by Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - some cleanups
   - direct physical timer assignment
   - cache sanitization for 32-bit guests

  s390:
   - interrupt cleanup
   - introduction of the Guest Information Block
   - preparation for processor subfunctions in cpu models

  PPC:
   - bug fixes and improvements, especially related to machine checks
     and protection keys

  x86:
   - many, many cleanups, including removing a bunch of MMU code for
     unnecessary optimizations
   - AVIC fixes

  Generic:
   - memcg accounting"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits)
  kvm: vmx: fix formatting of a comment
  KVM: doc: Document the life cycle of a VM and its resources
  MAINTAINERS: Add KVM selftests to existing KVM entry
  Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
  KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
  KVM: PPC: Fix compilation when KVM is not enabled
  KVM: Minor cleanups for kvm_main.c
  KVM: s390: add debug logging for cpu model subfunctions
  KVM: s390: implement subfunction processor calls
  arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
  KVM: arm/arm64: Remove unused timer variable
  KVM: PPC: Book3S: Improve KVM reference counting
  KVM: PPC: Book3S HV: Fix build failure without IOMMU support
  Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()"
  x86: kvmguest: use TSC clocksource if invariant TSC is exposed
  KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start
  KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter
  KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns
  KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()
  KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children
  ...
...@@ -45,6 +45,23 @@ the API. The only supported use is one virtual machine per process,
and one vcpu per thread.
It is important to note that although VM ioctls may only be issued from
the process that created the VM, a VM's lifecycle is associated with its
file descriptor, not its creator (process). In other words, the VM and
its resources, *including the associated address space*, are not freed
until the last reference to the VM's file descriptor has been released.
For example, if fork() is issued after ioctl(KVM_CREATE_VM), the VM will
not be freed until both the parent (original) process and its child have
put their references to the VM's file descriptor.
Because a VM's resources are not freed until the last reference to its
file descriptor is released, creating additional references to a VM via
fork(), dup(), etc... without careful consideration is strongly
discouraged and may have unwanted side effects, e.g. memory allocated
by and on behalf of the VM's process may not be freed/unaccounted when
the VM is shut down.
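
For illustration only (this snippet is not part of the patch, and error
handling is elided), the reference-count behaviour described above looks
roughly like this from user space:

    /*
     * Illustrative sketch, not from the kernel tree: the VM persists until
     * the last reference to its file descriptor is gone, even across fork().
     */
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <linux/kvm.h>

    int main(void)
    {
        int kvm = open("/dev/kvm", O_RDWR);
        int vm  = ioctl(kvm, KVM_CREATE_VM, 0);    /* VM is tied to this fd */

        if (fork() == 0) {
            /* The child inherits a reference; it may not issue VM ioctls,
             * but the VM stays alive until this close(). */
            close(vm);
            _exit(0);
        }
        wait(NULL);
        close(vm);    /* last reference dropped: the VM and its memory are freed */
        return 0;
    }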

3. Extensions
-------------
......
...@@ -53,7 +53,8 @@ the global max polling interval then the polling interval can be increased in
the hope that next time during the longer polling interval the wake up source
will be received while the host is polling and the latency benefits will be
received. The polling interval is grown in the function grow_halt_poll_ns() and
is multiplied by the module parameters halt_poll_ns_grow and
halt_poll_ns_grow_start.
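
A simplified sketch of that grow logic follows (the helper name is made up;
the real implementation in virt/kvm/kvm_main.c differs in detail):

    /* Simplified sketch of the grow behaviour described above; not the
     * kernel's actual grow_halt_poll_ns(). */
    static unsigned int grow_halt_poll_ns_sketch(unsigned int val,
                                                 unsigned int grow,       /* halt_poll_ns_grow */
                                                 unsigned int grow_start) /* halt_poll_ns_grow_start */
    {
        if (!grow)                   /* growing disabled */
            return val;

        val *= grow;                 /* multiply the current interval */
        if (val < grow_start)        /* never start growing below grow_start */
            val = grow_start;

        return val;                  /* caller still caps the result at halt_poll_ns */
    }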

In the event that the total block time was greater than the global max polling
interval then the host will never poll for long enough (limited by the global
...@@ -80,22 +81,30 @@ shrunk. These variables are defined in include/linux/kvm_host.h and as module
parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the
powerpc kvm-hv case.

Module Parameter        | Description                | Default Value
--------------------------------------------------------------------------------
halt_poll_ns            | The global max polling     | KVM_HALT_POLL_NS_DEFAULT
                        | interval which defines     |
                        | the ceiling value of the   |
                        | polling interval for       | (per arch value)
                        | each vcpu.                 |
--------------------------------------------------------------------------------
halt_poll_ns_grow       | The value by which the     | 2
                        | halt polling interval is   |
                        | multiplied in the          |
                        | grow_halt_poll_ns()        |
                        | function.                  |
--------------------------------------------------------------------------------
halt_poll_ns_grow_start | The initial value to grow  | 10000
                        | to from zero in the        |
                        | grow_halt_poll_ns()        |
                        | function.                  |
--------------------------------------------------------------------------------
halt_poll_ns_shrink     | The value by which the     | 0
                        | halt polling interval is   |
                        | divided in the             |
                        | shrink_halt_poll_ns()      |
                        | function.                  |
--------------------------------------------------------------------------------
These module parameters can be set from the debugfs files in:
......
...@@ -224,10 +224,6 @@ Shadow pages contain the following information:
A bitmap indicating which sptes in spt point (directly or indirectly) at
pages that may be unsynchronized. Used to quickly locate all unsynchronized
pages reachable from a given page.
mmu_valid_gen:
Generation number of the page. It is compared with kvm->arch.mmu_valid_gen
during hash table lookup, and used to skip invalidated shadow pages (see
"Zapping all pages" below.)
clear_spte_count:
Only present on 32-bit hosts, where a 64-bit spte cannot be written
atomically. The reader uses this while running out of the MMU lock
...@@ -402,27 +398,6 @@ causes its disallow_lpage to be incremented, thus preventing instantiation of
a large spte. The frames at the end of an unaligned memory slot have
artificially inflated ->disallow_lpages so they can never be instantiated.
Zapping all pages (page generation count)
=========================================
For the large memory guests, walking and zapping all pages is really slow
(because there are a lot of pages), and also blocks memory accesses of
all VCPUs because it needs to hold the MMU lock.
To make it be more scalable, kvm maintains a global generation number
which is stored in kvm->arch.mmu_valid_gen. Every shadow page stores
the current global generation-number into sp->mmu_valid_gen when it
is created. Pages with a mismatching generation number are "obsolete".
When KVM need zap all shadow pages sptes, it just simply increases the global
generation-number then reload root shadow pages on all vcpus. As the VCPUs
create new shadow page tables, the old pages are not used because of the
mismatching generation number.
KVM then walks through all pages and zaps obsolete pages. While the zap
operation needs to take the MMU lock, the lock can be released periodically
so that the VCPUs can make progress.
Fast invalidation of MMIO sptes
===============================
...@@ -435,8 +410,7 @@ shadow pages, and is made more scalable with a similar technique.
MMIO sptes have a few spare bits, which are used to store a
generation number. The global generation number is stored in
kvm_memslots(kvm)->generation, and increased whenever guest memory info
changes.
When KVM finds an MMIO spte, it checks the generation number of the spte.
If the generation number of the spte does not equal the global generation
...@@ -452,13 +426,16 @@ stored into the MMIO spte. Thus, the MMIO spte might be created based on
out-of-date information, but with an up-to-date generation number.
To avoid this, the generation number is incremented again after synchronize_srcu
returns; thus, bit 63 of kvm_memslots(kvm)->generation is set to 1 only during a
memslot update, while some SRCU readers might be using the old copy. We do not
want to use an MMIO spte created with an odd generation number, and we can do
this without losing a bit in the MMIO spte. The "update in-progress" bit of the
generation is not stored in the MMIO spte, and so is implicitly zero when the
generation is extracted out of the spte. If KVM is unlucky and creates an MMIO
spte while an update is in-progress, the next access to the spte will always be
a cache miss. For example, a subsequent access during the update window will
miss due to the in-progress flag diverging, while an access after the update
window closes will have a higher generation number (as compared to the spte).
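
To make the check above concrete, here is a conceptual sketch; the names and
the bit layout are illustrative, not the kernel's actual spte encoding:

    /* Conceptual sketch only: made-up names, kernel-style u64/bool assumed. */
    #define GEN_UPDATE_IN_PROGRESS   (1ULL << 63)  /* "memslot update in-progress" flag */

    /* Generation value as packed into an MMIO spte: the in-progress flag is
     * stripped, so it always reads back as zero when extracted. */
    static inline u64 gen_for_mmio_spte(u64 memslots_gen)
    {
        return memslots_gen & ~GEN_UPDATE_IN_PROGRESS;
    }

    /* Check performed on access: while an update is in progress, the flag set
     * in memslots_gen makes the comparison fail, so the spte is treated as
     * stale (the documented "cache miss"). */
    static inline bool mmio_spte_is_stale(u64 spte_gen, u64 memslots_gen)
    {
        return spte_gen != memslots_gen;
    }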

Further reading
......
...@@ -8461,6 +8461,7 @@ F: include/linux/kvm*
F: include/kvm/iodev.h
F: virt/kvm/*
F: tools/kvm/
F: tools/testing/selftests/kvm/
KERNEL VIRTUAL MACHINE FOR AMD-V (KVM/amd)
M: Joerg Roedel <joro@8bytes.org>
...@@ -8470,29 +8471,25 @@ S: Maintained
F: arch/x86/include/asm/svm.h
F: arch/x86/kvm/svm.c

KERNEL VIRTUAL MACHINE FOR ARM/ARM64 (KVM/arm, KVM/arm64)
M: Christoffer Dall <christoffer.dall@arm.com>
M: Marc Zyngier <marc.zyngier@arm.com>
R: James Morse <james.morse@arm.com>
R: Julien Thierry <julien.thierry@arm.com>
R: Suzuki K Pouloze <suzuki.poulose@arm.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: kvmarm@lists.cs.columbia.edu
W: http://systems.cs.columbia.edu/projects/kvm-arm
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
S: Maintained
F: arch/arm/include/uapi/asm/kvm*
F: arch/arm/include/asm/kvm*
F: arch/arm/kvm/
F: virt/kvm/arm/
F: include/kvm/arm_*
KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)
M: Christoffer Dall <christoffer.dall@arm.com>
M: Marc Zyngier <marc.zyngier@arm.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: kvmarm@lists.cs.columbia.edu
S: Maintained
F: arch/arm64/include/uapi/asm/kvm*
F: arch/arm64/include/asm/kvm*
F: arch/arm64/kvm/
F: virt/kvm/arm/
F: include/kvm/arm_*

KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)
M: James Hogan <jhogan@kernel.org>
......
...@@ -55,7 +55,7 @@ ...@@ -55,7 +55,7 @@
#define ICH_VTR __ACCESS_CP15(c12, 4, c11, 1) #define ICH_VTR __ACCESS_CP15(c12, 4, c11, 1)
#define ICH_MISR __ACCESS_CP15(c12, 4, c11, 2) #define ICH_MISR __ACCESS_CP15(c12, 4, c11, 2)
#define ICH_EISR __ACCESS_CP15(c12, 4, c11, 3) #define ICH_EISR __ACCESS_CP15(c12, 4, c11, 3)
#define ICH_ELSR __ACCESS_CP15(c12, 4, c11, 5) #define ICH_ELRSR __ACCESS_CP15(c12, 4, c11, 5)
#define ICH_VMCR __ACCESS_CP15(c12, 4, c11, 7) #define ICH_VMCR __ACCESS_CP15(c12, 4, c11, 7)
#define __LR0(x) __ACCESS_CP15(c12, 4, c12, x) #define __LR0(x) __ACCESS_CP15(c12, 4, c12, x)
...@@ -152,7 +152,7 @@ CPUIF_MAP(ICH_HCR, ICH_HCR_EL2) ...@@ -152,7 +152,7 @@ CPUIF_MAP(ICH_HCR, ICH_HCR_EL2)
CPUIF_MAP(ICH_VTR, ICH_VTR_EL2) CPUIF_MAP(ICH_VTR, ICH_VTR_EL2)
CPUIF_MAP(ICH_MISR, ICH_MISR_EL2) CPUIF_MAP(ICH_MISR, ICH_MISR_EL2)
CPUIF_MAP(ICH_EISR, ICH_EISR_EL2) CPUIF_MAP(ICH_EISR, ICH_EISR_EL2)
CPUIF_MAP(ICH_ELSR, ICH_ELSR_EL2) CPUIF_MAP(ICH_ELRSR, ICH_ELRSR_EL2)
CPUIF_MAP(ICH_VMCR, ICH_VMCR_EL2) CPUIF_MAP(ICH_VMCR, ICH_VMCR_EL2)
CPUIF_MAP(ICH_AP0R3, ICH_AP0R3_EL2) CPUIF_MAP(ICH_AP0R3, ICH_AP0R3_EL2)
CPUIF_MAP(ICH_AP0R2, ICH_AP0R2_EL2) CPUIF_MAP(ICH_AP0R2, ICH_AP0R2_EL2)
......
...@@ -265,6 +265,14 @@ static inline bool kvm_vcpu_dabt_isextabt(struct kvm_vcpu *vcpu) ...@@ -265,6 +265,14 @@ static inline bool kvm_vcpu_dabt_isextabt(struct kvm_vcpu *vcpu)
} }
} }
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
return kvm_vcpu_dabt_iswrite(vcpu);
}
static inline u32 kvm_vcpu_hvc_get_imm(struct kvm_vcpu *vcpu) static inline u32 kvm_vcpu_hvc_get_imm(struct kvm_vcpu *vcpu)
{ {
return kvm_vcpu_get_hsr(vcpu) & HSR_HVC_IMM_MASK; return kvm_vcpu_get_hsr(vcpu) & HSR_HVC_IMM_MASK;
......
...@@ -26,6 +26,7 @@ ...@@ -26,6 +26,7 @@
#include <asm/kvm_asm.h> #include <asm/kvm_asm.h>
#include <asm/kvm_mmio.h> #include <asm/kvm_mmio.h>
#include <asm/fpstate.h> #include <asm/fpstate.h>
#include <asm/smp_plat.h>
#include <kvm/arm_arch_timer.h> #include <kvm/arm_arch_timer.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED #define __KVM_HAVE_ARCH_INTC_INITIALIZED
...@@ -57,10 +58,13 @@ int __attribute_const__ kvm_target_cpu(void); ...@@ -57,10 +58,13 @@ int __attribute_const__ kvm_target_cpu(void);
int kvm_reset_vcpu(struct kvm_vcpu *vcpu); int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
void kvm_reset_coprocs(struct kvm_vcpu *vcpu); void kvm_reset_coprocs(struct kvm_vcpu *vcpu);
struct kvm_arch { struct kvm_vmid {
/* VTTBR value associated with below pgd and vmid */ /* The VMID generation used for the virt. memory system */
u64 vttbr; u64 vmid_gen;
u32 vmid;
};
struct kvm_arch {
/* The last vcpu id that ran on each physical CPU */ /* The last vcpu id that ran on each physical CPU */
int __percpu *last_vcpu_ran; int __percpu *last_vcpu_ran;
...@@ -70,11 +74,11 @@ struct kvm_arch { ...@@ -70,11 +74,11 @@ struct kvm_arch {
*/ */
/* The VMID generation used for the virt. memory system */ /* The VMID generation used for the virt. memory system */
u64 vmid_gen; struct kvm_vmid vmid;
u32 vmid;
/* Stage-2 page table */ /* Stage-2 page table */
pgd_t *pgd; pgd_t *pgd;
phys_addr_t pgd_phys;
/* Interrupt controller */ /* Interrupt controller */
struct vgic_dist vgic; struct vgic_dist vgic;
...@@ -148,6 +152,13 @@ struct kvm_cpu_context { ...@@ -148,6 +152,13 @@ struct kvm_cpu_context {
typedef struct kvm_cpu_context kvm_cpu_context_t; typedef struct kvm_cpu_context kvm_cpu_context_t;
static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
int cpu)
{
/* The host's MPIDR is immutable, so let's set it up at boot time */
cpu_ctxt->cp15[c0_MPIDR] = cpu_logical_map(cpu);
}
struct vcpu_reset_state { struct vcpu_reset_state {
unsigned long pc; unsigned long pc;
unsigned long r0; unsigned long r0;
...@@ -224,7 +235,35 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu); ...@@ -224,7 +235,35 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices); int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg); int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg); int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
unsigned long kvm_call_hyp(void *hypfn, ...);
unsigned long __kvm_call_hyp(void *hypfn, ...);
/*
* The has_vhe() part doesn't get emitted, but is used for type-checking.
*/
#define kvm_call_hyp(f, ...) \
do { \
if (has_vhe()) { \
f(__VA_ARGS__); \
} else { \
__kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__); \
} \
} while(0)
#define kvm_call_hyp_ret(f, ...) \
({ \
typeof(f(__VA_ARGS__)) ret; \
\
if (has_vhe()) { \
ret = f(__VA_ARGS__); \
} else { \
ret = __kvm_call_hyp(kvm_ksym_ref(f), \
##__VA_ARGS__); \
} \
\
ret; \
})
void force_vm_exit(const cpumask_t *mask); void force_vm_exit(const cpumask_t *mask);
int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu, int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
struct kvm_vcpu_events *events); struct kvm_vcpu_events *events);
...@@ -275,7 +314,7 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr, ...@@ -275,7 +314,7 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
* compliant with the PCS!). * compliant with the PCS!).
*/ */
kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr); __kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
} }
static inline void __cpu_init_stage2(void) static inline void __cpu_init_stage2(void)
......
...@@ -40,6 +40,7 @@ ...@@ -40,6 +40,7 @@
#define TTBR1 __ACCESS_CP15_64(1, c2) #define TTBR1 __ACCESS_CP15_64(1, c2)
#define VTTBR __ACCESS_CP15_64(6, c2) #define VTTBR __ACCESS_CP15_64(6, c2)
#define PAR __ACCESS_CP15_64(0, c7) #define PAR __ACCESS_CP15_64(0, c7)
#define CNTP_CVAL __ACCESS_CP15_64(2, c14)
#define CNTV_CVAL __ACCESS_CP15_64(3, c14) #define CNTV_CVAL __ACCESS_CP15_64(3, c14)
#define CNTVOFF __ACCESS_CP15_64(4, c14) #define CNTVOFF __ACCESS_CP15_64(4, c14)
...@@ -85,6 +86,7 @@ ...@@ -85,6 +86,7 @@
#define TID_PRIV __ACCESS_CP15(c13, 0, c0, 4) #define TID_PRIV __ACCESS_CP15(c13, 0, c0, 4)
#define HTPIDR __ACCESS_CP15(c13, 4, c0, 2) #define HTPIDR __ACCESS_CP15(c13, 4, c0, 2)
#define CNTKCTL __ACCESS_CP15(c14, 0, c1, 0) #define CNTKCTL __ACCESS_CP15(c14, 0, c1, 0)
#define CNTP_CTL __ACCESS_CP15(c14, 0, c2, 1)
#define CNTV_CTL __ACCESS_CP15(c14, 0, c3, 1) #define CNTV_CTL __ACCESS_CP15(c14, 0, c3, 1)
#define CNTHCTL __ACCESS_CP15(c14, 4, c1, 0) #define CNTHCTL __ACCESS_CP15(c14, 4, c1, 0)
...@@ -94,6 +96,8 @@ ...@@ -94,6 +96,8 @@
#define read_sysreg_el0(r) read_sysreg(r##_el0) #define read_sysreg_el0(r) read_sysreg(r##_el0)
#define write_sysreg_el0(v, r) write_sysreg(v, r##_el0) #define write_sysreg_el0(v, r) write_sysreg(v, r##_el0)
#define cntp_ctl_el0 CNTP_CTL
#define cntp_cval_el0 CNTP_CVAL
#define cntv_ctl_el0 CNTV_CTL #define cntv_ctl_el0 CNTV_CTL
#define cntv_cval_el0 CNTV_CVAL #define cntv_cval_el0 CNTV_CVAL
#define cntvoff_el2 CNTVOFF #define cntvoff_el2 CNTVOFF
......
...@@ -421,9 +421,14 @@ static inline int hyp_map_aux_data(void) ...@@ -421,9 +421,14 @@ static inline int hyp_map_aux_data(void)
static inline void kvm_set_ipa_limit(void) {} static inline void kvm_set_ipa_limit(void) {}
static inline bool kvm_cpu_has_cnp(void) static __always_inline u64 kvm_get_vttbr(struct kvm *kvm)
{ {
return false; struct kvm_vmid *vmid = &kvm->arch.vmid;
u64 vmid_field, baddr;
baddr = kvm->arch.pgd_phys;
vmid_field = (u64)vmid->vmid << VTTBR_VMID_SHIFT;
return kvm_phys_to_vttbr(baddr) | vmid_field;
} }
#endif /* !__ASSEMBLY__ */ #endif /* !__ASSEMBLY__ */
......
...@@ -8,9 +8,8 @@ ifeq ($(plus_virt),+virt) ...@@ -8,9 +8,8 @@ ifeq ($(plus_virt),+virt)
plus_virt_def := -DREQUIRES_VIRT=1 plus_virt_def := -DREQUIRES_VIRT=1
endif endif
ccflags-y += -Iarch/arm/kvm -Ivirt/kvm/arm/vgic ccflags-y += -I $(srctree)/$(src) -I $(srctree)/virt/kvm/arm/vgic
CFLAGS_arm.o := -I. $(plus_virt_def) CFLAGS_arm.o := $(plus_virt_def)
CFLAGS_mmu.o := -I.
AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt) AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt) AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)
......
...@@ -293,15 +293,16 @@ static bool access_cntp_tval(struct kvm_vcpu *vcpu, ...@@ -293,15 +293,16 @@ static bool access_cntp_tval(struct kvm_vcpu *vcpu,
const struct coproc_params *p, const struct coproc_params *p,
const struct coproc_reg *r) const struct coproc_reg *r)
{ {
u64 now = kvm_phys_timer_read(); u32 val;
u64 val;
if (p->is_write) { if (p->is_write) {
val = *vcpu_reg(vcpu, p->Rt1); val = *vcpu_reg(vcpu, p->Rt1);
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL, val + now); kvm_arm_timer_write_sysreg(vcpu,
TIMER_PTIMER, TIMER_REG_TVAL, val);
} else { } else {
val = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL); val = kvm_arm_timer_read_sysreg(vcpu,
*vcpu_reg(vcpu, p->Rt1) = val - now; TIMER_PTIMER, TIMER_REG_TVAL);
*vcpu_reg(vcpu, p->Rt1) = val;
} }
return true; return true;
...@@ -315,9 +316,11 @@ static bool access_cntp_ctl(struct kvm_vcpu *vcpu, ...@@ -315,9 +316,11 @@ static bool access_cntp_ctl(struct kvm_vcpu *vcpu,
if (p->is_write) { if (p->is_write) {
val = *vcpu_reg(vcpu, p->Rt1); val = *vcpu_reg(vcpu, p->Rt1);
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CTL, val); kvm_arm_timer_write_sysreg(vcpu,
TIMER_PTIMER, TIMER_REG_CTL, val);
} else { } else {
val = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CTL); val = kvm_arm_timer_read_sysreg(vcpu,
TIMER_PTIMER, TIMER_REG_CTL);
*vcpu_reg(vcpu, p->Rt1) = val; *vcpu_reg(vcpu, p->Rt1) = val;
} }
...@@ -333,9 +336,11 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, ...@@ -333,9 +336,11 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
if (p->is_write) { if (p->is_write) {
val = (u64)*vcpu_reg(vcpu, p->Rt2) << 32; val = (u64)*vcpu_reg(vcpu, p->Rt2) << 32;
val |= *vcpu_reg(vcpu, p->Rt1); val |= *vcpu_reg(vcpu, p->Rt1);
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL, val); kvm_arm_timer_write_sysreg(vcpu,
TIMER_PTIMER, TIMER_REG_CVAL, val);
} else { } else {
val = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL); val = kvm_arm_timer_read_sysreg(vcpu,
TIMER_PTIMER, TIMER_REG_CVAL);
*vcpu_reg(vcpu, p->Rt1) = val; *vcpu_reg(vcpu, p->Rt1) = val;
*vcpu_reg(vcpu, p->Rt2) = val >> 32; *vcpu_reg(vcpu, p->Rt2) = val >> 32;
} }
......
...@@ -27,7 +27,6 @@ static u64 *cp15_64(struct kvm_cpu_context *ctxt, int idx) ...@@ -27,7 +27,6 @@ static u64 *cp15_64(struct kvm_cpu_context *ctxt, int idx)
void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt) void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
{ {
ctxt->cp15[c0_MPIDR] = read_sysreg(VMPIDR);
ctxt->cp15[c0_CSSELR] = read_sysreg(CSSELR); ctxt->cp15[c0_CSSELR] = read_sysreg(CSSELR);
ctxt->cp15[c1_SCTLR] = read_sysreg(SCTLR); ctxt->cp15[c1_SCTLR] = read_sysreg(SCTLR);
ctxt->cp15[c1_CPACR] = read_sysreg(CPACR); ctxt->cp15[c1_CPACR] = read_sysreg(CPACR);
......
...@@ -176,7 +176,7 @@ THUMB( orr lr, lr, #PSR_T_BIT ) ...@@ -176,7 +176,7 @@ THUMB( orr lr, lr, #PSR_T_BIT )
msr spsr_cxsf, lr msr spsr_cxsf, lr
ldr lr, =panic ldr lr, =panic
msr ELR_hyp, lr msr ELR_hyp, lr
ldr lr, =kvm_call_hyp ldr lr, =__kvm_call_hyp
clrex clrex
eret eret
ENDPROC(__hyp_do_panic) ENDPROC(__hyp_do_panic)
......
...@@ -77,7 +77,7 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) ...@@ -77,7 +77,7 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
static void __hyp_text __activate_vm(struct kvm_vcpu *vcpu) static void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
{ {
struct kvm *kvm = kern_hyp_va(vcpu->kvm); struct kvm *kvm = kern_hyp_va(vcpu->kvm);
write_sysreg(kvm->arch.vttbr, VTTBR); write_sysreg(kvm_get_vttbr(kvm), VTTBR);
write_sysreg(vcpu->arch.midr, VPIDR); write_sysreg(vcpu->arch.midr, VPIDR);
} }
......
...@@ -41,7 +41,7 @@ void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm) ...@@ -41,7 +41,7 @@ void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
/* Switch to requested VMID */ /* Switch to requested VMID */
kvm = kern_hyp_va(kvm); kvm = kern_hyp_va(kvm);
write_sysreg(kvm->arch.vttbr, VTTBR); write_sysreg(kvm_get_vttbr(kvm), VTTBR);
isb(); isb();
write_sysreg(0, TLBIALLIS); write_sysreg(0, TLBIALLIS);
...@@ -61,7 +61,7 @@ void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu) ...@@ -61,7 +61,7 @@ void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm); struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);
/* Switch to requested VMID */ /* Switch to requested VMID */
write_sysreg(kvm->arch.vttbr, VTTBR); write_sysreg(kvm_get_vttbr(kvm), VTTBR);
isb(); isb();
write_sysreg(0, TLBIALL); write_sysreg(0, TLBIALL);
......
...@@ -42,7 +42,7 @@ ...@@ -42,7 +42,7 @@
* r12: caller save * r12: caller save
* rest: callee save * rest: callee save
*/ */
ENTRY(kvm_call_hyp) ENTRY(__kvm_call_hyp)
hvc #0 hvc #0
bx lr bx lr
ENDPROC(kvm_call_hyp) ENDPROC(__kvm_call_hyp)
...@@ -77,6 +77,10 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) ...@@ -77,6 +77,10 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
*/ */
if (!vcpu_el1_is_32bit(vcpu)) if (!vcpu_el1_is_32bit(vcpu))
vcpu->arch.hcr_el2 |= HCR_TID3; vcpu->arch.hcr_el2 |= HCR_TID3;
if (cpus_have_const_cap(ARM64_MISMATCHED_CACHE_TYPE) ||
vcpu_el1_is_32bit(vcpu))
vcpu->arch.hcr_el2 |= HCR_TID2;
} }
static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu) static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
...@@ -331,6 +335,14 @@ static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) ...@@ -331,6 +335,14 @@ static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
return ESR_ELx_SYS64_ISS_RT(esr); return ESR_ELx_SYS64_ISS_RT(esr);
} }
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
return kvm_vcpu_dabt_iswrite(vcpu);
}
static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu) static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
{ {
return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
......
...@@ -31,6 +31,7 @@ ...@@ -31,6 +31,7 @@
#include <asm/kvm.h> #include <asm/kvm.h>
#include <asm/kvm_asm.h> #include <asm/kvm_asm.h>
#include <asm/kvm_mmio.h> #include <asm/kvm_mmio.h>
#include <asm/smp_plat.h>
#include <asm/thread_info.h> #include <asm/thread_info.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED #define __KVM_HAVE_ARCH_INTC_INITIALIZED
...@@ -58,16 +59,19 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu); ...@@ -58,16 +59,19 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext); int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext);
void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start); void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start);
struct kvm_arch { struct kvm_vmid {
/* The VMID generation used for the virt. memory system */ /* The VMID generation used for the virt. memory system */
u64 vmid_gen; u64 vmid_gen;
u32 vmid; u32 vmid;
};
struct kvm_arch {
struct kvm_vmid vmid;
/* stage2 entry level table */ /* stage2 entry level table */
pgd_t *pgd; pgd_t *pgd;
phys_addr_t pgd_phys;
/* VTTBR value associated with above pgd and vmid */
u64 vttbr;
/* VTCR_EL2 value for this VM */ /* VTCR_EL2 value for this VM */
u64 vtcr; u64 vtcr;
...@@ -382,7 +386,36 @@ void kvm_arm_halt_guest(struct kvm *kvm); ...@@ -382,7 +386,36 @@ void kvm_arm_halt_guest(struct kvm *kvm);
void kvm_arm_resume_guest(struct kvm *kvm); void kvm_arm_resume_guest(struct kvm *kvm);
u64 __kvm_call_hyp(void *hypfn, ...); u64 __kvm_call_hyp(void *hypfn, ...);
#define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
/*
* The couple of isb() below are there to guarantee the same behaviour
* on VHE as on !VHE, where the eret to EL1 acts as a context
* synchronization event.
*/
#define kvm_call_hyp(f, ...) \
do { \
if (has_vhe()) { \
f(__VA_ARGS__); \
isb(); \
} else { \
__kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__); \
} \
} while(0)
#define kvm_call_hyp_ret(f, ...) \
({ \
typeof(f(__VA_ARGS__)) ret; \
\
if (has_vhe()) { \
ret = f(__VA_ARGS__); \
isb(); \
} else { \
ret = __kvm_call_hyp(kvm_ksym_ref(f), \
##__VA_ARGS__); \
} \
\
ret; \
})
void force_vm_exit(const cpumask_t *mask); void force_vm_exit(const cpumask_t *mask);
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot); void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
...@@ -401,6 +434,13 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr); ...@@ -401,6 +434,13 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state); DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
int cpu)
{
/* The host's MPIDR is immutable, so let's set it up at boot time */
cpu_ctxt->sys_regs[MPIDR_EL1] = cpu_logical_map(cpu);
}
void __kvm_enable_ssbs(void); void __kvm_enable_ssbs(void);
static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr, static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
......
...@@ -21,6 +21,7 @@ ...@@ -21,6 +21,7 @@
#include <linux/compiler.h> #include <linux/compiler.h>
#include <linux/kvm_host.h> #include <linux/kvm_host.h>
#include <asm/alternative.h> #include <asm/alternative.h>
#include <asm/kvm_mmu.h>
#include <asm/sysreg.h> #include <asm/sysreg.h>
#define __hyp_text __section(.hyp.text) notrace #define __hyp_text __section(.hyp.text) notrace
...@@ -163,7 +164,7 @@ void __noreturn __hyp_do_panic(unsigned long, ...); ...@@ -163,7 +164,7 @@ void __noreturn __hyp_do_panic(unsigned long, ...);
static __always_inline void __hyp_text __load_guest_stage2(struct kvm *kvm) static __always_inline void __hyp_text __load_guest_stage2(struct kvm *kvm)
{ {
write_sysreg(kvm->arch.vtcr, vtcr_el2); write_sysreg(kvm->arch.vtcr, vtcr_el2);
write_sysreg(kvm->arch.vttbr, vttbr_el2); write_sysreg(kvm_get_vttbr(kvm), vttbr_el2);
/* /*
* ARM erratum 1165522 requires the actual execution of the above * ARM erratum 1165522 requires the actual execution of the above
......
...@@ -138,7 +138,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v) ...@@ -138,7 +138,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
}) })
/* /*
* We currently only support a 40bit IPA. * We currently support using a VM-specified IPA size. For backward
* compatibility, the default IPA size is fixed to 40bits.
*/ */
#define KVM_PHYS_SHIFT (40) #define KVM_PHYS_SHIFT (40)
...@@ -591,9 +592,15 @@ static inline u64 kvm_vttbr_baddr_mask(struct kvm *kvm) ...@@ -591,9 +592,15 @@ static inline u64 kvm_vttbr_baddr_mask(struct kvm *kvm)
return vttbr_baddr_mask(kvm_phys_shift(kvm), kvm_stage2_levels(kvm)); return vttbr_baddr_mask(kvm_phys_shift(kvm), kvm_stage2_levels(kvm));
} }
static inline bool kvm_cpu_has_cnp(void) static __always_inline u64 kvm_get_vttbr(struct kvm *kvm)
{ {
return system_supports_cnp(); struct kvm_vmid *vmid = &kvm->arch.vmid;
u64 vmid_field, baddr;
u64 cnp = system_supports_cnp() ? VTTBR_CNP_BIT : 0;
baddr = kvm->arch.pgd_phys;
vmid_field = (u64)vmid->vmid << VTTBR_VMID_SHIFT;
return kvm_phys_to_vttbr(baddr) | vmid_field | cnp;
} }
#endif /* __ASSEMBLY__ */ #endif /* __ASSEMBLY__ */
......
...@@ -361,6 +361,7 @@ ...@@ -361,6 +361,7 @@
#define SYS_CNTKCTL_EL1 sys_reg(3, 0, 14, 1, 0) #define SYS_CNTKCTL_EL1 sys_reg(3, 0, 14, 1, 0)
#define SYS_CCSIDR_EL1 sys_reg(3, 1, 0, 0, 0)
#define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1) #define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1)
#define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7) #define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7)
...@@ -392,6 +393,10 @@ ...@@ -392,6 +393,10 @@
#define SYS_CNTP_CTL_EL0 sys_reg(3, 3, 14, 2, 1) #define SYS_CNTP_CTL_EL0 sys_reg(3, 3, 14, 2, 1)
#define SYS_CNTP_CVAL_EL0 sys_reg(3, 3, 14, 2, 2) #define SYS_CNTP_CVAL_EL0 sys_reg(3, 3, 14, 2, 2)
#define SYS_AARCH32_CNTP_TVAL sys_reg(0, 0, 14, 2, 0)
#define SYS_AARCH32_CNTP_CTL sys_reg(0, 0, 14, 2, 1)
#define SYS_AARCH32_CNTP_CVAL sys_reg(0, 2, 0, 14, 0)
#define __PMEV_op2(n) ((n) & 0x7) #define __PMEV_op2(n) ((n) & 0x7)
#define __CNTR_CRm(n) (0x8 | (((n) >> 3) & 0x3)) #define __CNTR_CRm(n) (0x8 | (((n) >> 3) & 0x3))
#define SYS_PMEVCNTRn_EL0(n) sys_reg(3, 3, 14, __CNTR_CRm(n), __PMEV_op2(n)) #define SYS_PMEVCNTRn_EL0(n) sys_reg(3, 3, 14, __CNTR_CRm(n), __PMEV_op2(n))
...@@ -426,7 +431,7 @@ ...@@ -426,7 +431,7 @@
#define SYS_ICH_VTR_EL2 sys_reg(3, 4, 12, 11, 1) #define SYS_ICH_VTR_EL2 sys_reg(3, 4, 12, 11, 1)
#define SYS_ICH_MISR_EL2 sys_reg(3, 4, 12, 11, 2) #define SYS_ICH_MISR_EL2 sys_reg(3, 4, 12, 11, 2)
#define SYS_ICH_EISR_EL2 sys_reg(3, 4, 12, 11, 3) #define SYS_ICH_EISR_EL2 sys_reg(3, 4, 12, 11, 3)
#define SYS_ICH_ELSR_EL2 sys_reg(3, 4, 12, 11, 5) #define SYS_ICH_ELRSR_EL2 sys_reg(3, 4, 12, 11, 5)
#define SYS_ICH_VMCR_EL2 sys_reg(3, 4, 12, 11, 7) #define SYS_ICH_VMCR_EL2 sys_reg(3, 4, 12, 11, 7)
#define __SYS__LR0_EL2(x) sys_reg(3, 4, 12, 12, x) #define __SYS__LR0_EL2(x) sys_reg(3, 4, 12, 12, x)
......
...@@ -3,9 +3,7 @@ ...@@ -3,9 +3,7 @@
# Makefile for Kernel-based Virtual Machine module # Makefile for Kernel-based Virtual Machine module
# #
ccflags-y += -Iarch/arm64/kvm -Ivirt/kvm/arm/vgic ccflags-y += -I $(srctree)/$(src) -I $(srctree)/virt/kvm/arm/vgic
CFLAGS_arm.o := -I.
CFLAGS_mmu.o := -I.
KVM=../../../virt/kvm KVM=../../../virt/kvm
......
...@@ -76,7 +76,7 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu) ...@@ -76,7 +76,7 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
void kvm_arm_init_debug(void) void kvm_arm_init_debug(void)
{ {
__this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2)); __this_cpu_write(mdcr_el2, kvm_call_hyp_ret(__kvm_get_mdcr_el2));
} }
/** /**
......
...@@ -40,9 +40,6 @@ ...@@ -40,9 +40,6 @@
* arch/arm64/kernel/hyp_stub.S. * arch/arm64/kernel/hyp_stub.S.
*/ */
ENTRY(__kvm_call_hyp) ENTRY(__kvm_call_hyp)
alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
hvc #0 hvc #0
ret ret
alternative_else_nop_endif
b __vhe_hyp_call
ENDPROC(__kvm_call_hyp) ENDPROC(__kvm_call_hyp)
...@@ -43,18 +43,6 @@ ...@@ -43,18 +43,6 @@
ldr lr, [sp], #16 ldr lr, [sp], #16
.endm .endm
ENTRY(__vhe_hyp_call)
do_el2_call
/*
* We used to rely on having an exception return to get
* an implicit isb. In the E2H case, we don't have it anymore.
* rather than changing all the leaf functions, just do it here
* before returning to the rest of the kernel.
*/
isb
ret
ENDPROC(__vhe_hyp_call)
el1_sync: // Guest trapped into EL2 el1_sync: // Guest trapped into EL2
mrs x0, esr_el2 mrs x0, esr_el2
......
...@@ -53,7 +53,6 @@ static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt) ...@@ -53,7 +53,6 @@ static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt) static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
{ {
ctxt->sys_regs[MPIDR_EL1] = read_sysreg(vmpidr_el2);
ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1); ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1);
ctxt->sys_regs[SCTLR_EL1] = read_sysreg_el1(sctlr); ctxt->sys_regs[SCTLR_EL1] = read_sysreg_el1(sctlr);
ctxt->sys_regs[ACTLR_EL1] = read_sysreg(actlr_el1); ctxt->sys_regs[ACTLR_EL1] = read_sysreg(actlr_el1);
......
...@@ -982,6 +982,10 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, ...@@ -982,6 +982,10 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true; return true;
} }
#define reg_to_encoding(x) \
sys_reg((u32)(x)->Op0, (u32)(x)->Op1, \
(u32)(x)->CRn, (u32)(x)->CRm, (u32)(x)->Op2);
/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */ /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
#define DBG_BCR_BVR_WCR_WVR_EL1(n) \ #define DBG_BCR_BVR_WCR_WVR_EL1(n) \
{ SYS_DESC(SYS_DBGBVRn_EL1(n)), \ { SYS_DESC(SYS_DBGBVRn_EL1(n)), \
...@@ -1003,44 +1007,38 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, ...@@ -1003,44 +1007,38 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
{ SYS_DESC(SYS_PMEVTYPERn_EL0(n)), \ { SYS_DESC(SYS_PMEVTYPERn_EL0(n)), \
access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), } access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), }
static bool access_cntp_tval(struct kvm_vcpu *vcpu, static bool access_arch_timer(struct kvm_vcpu *vcpu,
struct sys_reg_params *p, struct sys_reg_params *p,
const struct sys_reg_desc *r) const struct sys_reg_desc *r)
{ {
u64 now = kvm_phys_timer_read(); enum kvm_arch_timers tmr;
u64 cval; enum kvm_arch_timer_regs treg;
u64 reg = reg_to_encoding(r);
if (p->is_write) { switch (reg) {
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL, case SYS_CNTP_TVAL_EL0:
p->regval + now); case SYS_AARCH32_CNTP_TVAL:
} else { tmr = TIMER_PTIMER;
cval = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL); treg = TIMER_REG_TVAL;
p->regval = cval - now; break;
case SYS_CNTP_CTL_EL0:
case SYS_AARCH32_CNTP_CTL:
tmr = TIMER_PTIMER;
treg = TIMER_REG_CTL;
break;
case SYS_CNTP_CVAL_EL0:
case SYS_AARCH32_CNTP_CVAL:
tmr = TIMER_PTIMER;
treg = TIMER_REG_CVAL;
break;
default:
BUG();
} }
return true;
}
static bool access_cntp_ctl(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
if (p->is_write)
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CTL, p->regval);
else
p->regval = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CTL);
return true;
}
static bool access_cntp_cval(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
if (p->is_write) if (p->is_write)
kvm_arm_timer_set_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL, p->regval); kvm_arm_timer_write_sysreg(vcpu, tmr, treg, p->regval);
else else
p->regval = kvm_arm_timer_get_reg(vcpu, KVM_REG_ARM_PTIMER_CVAL); p->regval = kvm_arm_timer_read_sysreg(vcpu, tmr, treg);
return true; return true;
} }
...@@ -1160,6 +1158,64 @@ static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, ...@@ -1160,6 +1158,64 @@ static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
return __set_id_reg(rd, uaddr, true); return __set_id_reg(rd, uaddr, true);
} }
static bool access_ctr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
if (p->is_write)
return write_to_read_only(vcpu, p, r);
p->regval = read_sanitised_ftr_reg(SYS_CTR_EL0);
return true;
}
static bool access_clidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
if (p->is_write)
return write_to_read_only(vcpu, p, r);
p->regval = read_sysreg(clidr_el1);
return true;
}
static bool access_csselr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
if (p->is_write)
vcpu_write_sys_reg(vcpu, p->regval, r->reg);
else
p->regval = vcpu_read_sys_reg(vcpu, r->reg);
return true;
}
static bool access_ccsidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
u32 csselr;
if (p->is_write)
return write_to_read_only(vcpu, p, r);
csselr = vcpu_read_sys_reg(vcpu, CSSELR_EL1);
p->regval = get_ccsidr(csselr);
/*
* Guests should not be doing cache operations by set/way at all, and
* for this reason, we trap them and attempt to infer the intent, so
* that we can flush the entire guest's address space at the appropriate
* time.
* To prevent this trapping from causing performance problems, let's
* expose the geometry of all data and unified caches (which are
* guaranteed to be PIPT and thus non-aliasing) as 1 set and 1 way.
* [If guests should attempt to infer aliasing properties from the
* geometry (which is not permitted by the architecture), they would
* only do so for virtually indexed caches.]
*/
if (!(csselr & 1)) // data or unified cache
p->regval &= ~GENMASK(27, 3);
return true;
}
/* sys_reg_desc initialiser for known cpufeature ID registers */ /* sys_reg_desc initialiser for known cpufeature ID registers */
#define ID_SANITISED(name) { \ #define ID_SANITISED(name) { \
SYS_DESC(SYS_##name), \ SYS_DESC(SYS_##name), \
...@@ -1377,7 +1433,10 @@ static const struct sys_reg_desc sys_reg_descs[] = { ...@@ -1377,7 +1433,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_CNTKCTL_EL1), NULL, reset_val, CNTKCTL_EL1, 0}, { SYS_DESC(SYS_CNTKCTL_EL1), NULL, reset_val, CNTKCTL_EL1, 0},
{ SYS_DESC(SYS_CSSELR_EL1), NULL, reset_unknown, CSSELR_EL1 }, { SYS_DESC(SYS_CCSIDR_EL1), access_ccsidr },
{ SYS_DESC(SYS_CLIDR_EL1), access_clidr },
{ SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
{ SYS_DESC(SYS_CTR_EL0), access_ctr },
{ SYS_DESC(SYS_PMCR_EL0), access_pmcr, reset_pmcr, }, { SYS_DESC(SYS_PMCR_EL0), access_pmcr, reset_pmcr, },
{ SYS_DESC(SYS_PMCNTENSET_EL0), access_pmcnten, reset_unknown, PMCNTENSET_EL0 }, { SYS_DESC(SYS_PMCNTENSET_EL0), access_pmcnten, reset_unknown, PMCNTENSET_EL0 },
...@@ -1400,9 +1459,9 @@ static const struct sys_reg_desc sys_reg_descs[] = { ...@@ -1400,9 +1459,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 }, { SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 }, { SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
{ SYS_DESC(SYS_CNTP_TVAL_EL0), access_cntp_tval }, { SYS_DESC(SYS_CNTP_TVAL_EL0), access_arch_timer },
{ SYS_DESC(SYS_CNTP_CTL_EL0), access_cntp_ctl }, { SYS_DESC(SYS_CNTP_CTL_EL0), access_arch_timer },
{ SYS_DESC(SYS_CNTP_CVAL_EL0), access_cntp_cval }, { SYS_DESC(SYS_CNTP_CVAL_EL0), access_arch_timer },
/* PMEVCNTRn_EL0 */ /* PMEVCNTRn_EL0 */
PMU_PMEVCNTR_EL0(0), PMU_PMEVCNTR_EL0(0),
...@@ -1476,7 +1535,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { ...@@ -1476,7 +1535,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_DACR32_EL2), NULL, reset_unknown, DACR32_EL2 }, { SYS_DESC(SYS_DACR32_EL2), NULL, reset_unknown, DACR32_EL2 },
{ SYS_DESC(SYS_IFSR32_EL2), NULL, reset_unknown, IFSR32_EL2 }, { SYS_DESC(SYS_IFSR32_EL2), NULL, reset_unknown, IFSR32_EL2 },
{ SYS_DESC(SYS_FPEXC32_EL2), NULL, reset_val, FPEXC32_EL2, 0x70 }, { SYS_DESC(SYS_FPEXC32_EL2), NULL, reset_val, FPEXC32_EL2, 0x700 },
}; };
static bool trap_dbgidr(struct kvm_vcpu *vcpu, static bool trap_dbgidr(struct kvm_vcpu *vcpu,
...@@ -1677,6 +1736,7 @@ static const struct sys_reg_desc cp14_64_regs[] = { ...@@ -1677,6 +1736,7 @@ static const struct sys_reg_desc cp14_64_regs[] = {
* register). * register).
*/ */
static const struct sys_reg_desc cp15_regs[] = { static const struct sys_reg_desc cp15_regs[] = {
{ Op1( 0), CRn( 0), CRm( 0), Op2( 1), access_ctr },
{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_vm_reg, NULL, c1_SCTLR }, { Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_vm_reg, NULL, c1_SCTLR },
{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 }, { Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
{ Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 }, { Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
...@@ -1723,10 +1783,9 @@ static const struct sys_reg_desc cp15_regs[] = { ...@@ -1723,10 +1783,9 @@ static const struct sys_reg_desc cp15_regs[] = {
{ Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID }, { Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID },
/* CNTP_TVAL */ /* Arch Tmers */
{ Op1( 0), CRn(14), CRm( 2), Op2( 0), access_cntp_tval }, { SYS_DESC(SYS_AARCH32_CNTP_TVAL), access_arch_timer },
/* CNTP_CTL */ { SYS_DESC(SYS_AARCH32_CNTP_CTL), access_arch_timer },
{ Op1( 0), CRn(14), CRm( 2), Op2( 1), access_cntp_ctl },
/* PMEVCNTRn */ /* PMEVCNTRn */
PMU_PMEVCNTR(0), PMU_PMEVCNTR(0),
...@@ -1794,6 +1853,10 @@ static const struct sys_reg_desc cp15_regs[] = { ...@@ -1794,6 +1853,10 @@ static const struct sys_reg_desc cp15_regs[] = {
PMU_PMEVTYPER(30), PMU_PMEVTYPER(30),
/* PMCCFILTR */ /* PMCCFILTR */
{ Op1(0), CRn(14), CRm(15), Op2(7), access_pmu_evtyper }, { Op1(0), CRn(14), CRm(15), Op2(7), access_pmu_evtyper },
{ Op1(1), CRn( 0), CRm( 0), Op2(0), access_ccsidr },
{ Op1(1), CRn( 0), CRm( 0), Op2(1), access_clidr },
{ Op1(2), CRn( 0), CRm( 0), Op2(0), access_csselr, NULL, c0_CSSELR },
}; };
static const struct sys_reg_desc cp15_64_regs[] = { static const struct sys_reg_desc cp15_64_regs[] = {
...@@ -1803,7 +1866,7 @@ static const struct sys_reg_desc cp15_64_regs[] = { ...@@ -1803,7 +1866,7 @@ static const struct sys_reg_desc cp15_64_regs[] = {
{ Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 }, { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 },
{ Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */ { Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */
{ Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */ { Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */
{ Op1( 2), CRn( 0), CRm(14), Op2( 0), access_cntp_cval }, { SYS_DESC(SYS_AARCH32_CNTP_CVAL), access_arch_timer },
}; };
/* Target specific emulation tables */ /* Target specific emulation tables */
...@@ -1832,30 +1895,19 @@ static const struct sys_reg_desc *get_target_table(unsigned target, ...@@ -1832,30 +1895,19 @@ static const struct sys_reg_desc *get_target_table(unsigned target,
} }
} }
#define reg_to_match_value(x) \
({ \
unsigned long val; \
val = (x)->Op0 << 14; \
val |= (x)->Op1 << 11; \
val |= (x)->CRn << 7; \
val |= (x)->CRm << 3; \
val |= (x)->Op2; \
val; \
})
static int match_sys_reg(const void *key, const void *elt) static int match_sys_reg(const void *key, const void *elt)
{ {
const unsigned long pval = (unsigned long)key; const unsigned long pval = (unsigned long)key;
const struct sys_reg_desc *r = elt; const struct sys_reg_desc *r = elt;
return pval - reg_to_match_value(r); return pval - reg_to_encoding(r);
} }
static const struct sys_reg_desc *find_reg(const struct sys_reg_params *params, static const struct sys_reg_desc *find_reg(const struct sys_reg_params *params,
const struct sys_reg_desc table[], const struct sys_reg_desc table[],
unsigned int num) unsigned int num)
{ {
unsigned long pval = reg_to_match_value(params); unsigned long pval = reg_to_encoding(params);
return bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg); return bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
} }
...@@ -2218,11 +2270,15 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu, ...@@ -2218,11 +2270,15 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
} }
FUNCTION_INVARIANT(midr_el1) FUNCTION_INVARIANT(midr_el1)
FUNCTION_INVARIANT(ctr_el0)
FUNCTION_INVARIANT(revidr_el1) FUNCTION_INVARIANT(revidr_el1)
FUNCTION_INVARIANT(clidr_el1) FUNCTION_INVARIANT(clidr_el1)
FUNCTION_INVARIANT(aidr_el1) FUNCTION_INVARIANT(aidr_el1)
static void get_ctr_el0(struct kvm_vcpu *v, const struct sys_reg_desc *r)
{
((struct sys_reg_desc *)r)->val = read_sanitised_ftr_reg(SYS_CTR_EL0);
}
/* ->val is filled in by kvm_sys_reg_table_init() */ /* ->val is filled in by kvm_sys_reg_table_init() */
static struct sys_reg_desc invariant_sys_regs[] = { static struct sys_reg_desc invariant_sys_regs[] = {
{ SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 }, { SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 },
......
...@@ -1134,7 +1134,7 @@ static inline void kvm_arch_hardware_unsetup(void) {} ...@@ -1134,7 +1134,7 @@ static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_free_memslot(struct kvm *kvm, static inline void kvm_arch_free_memslot(struct kvm *kvm,
struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {} struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {}
static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {} static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
......
...@@ -99,6 +99,8 @@ struct kvm_nested_guest; ...@@ -99,6 +99,8 @@ struct kvm_nested_guest;
struct kvm_vm_stat { struct kvm_vm_stat {
ulong remote_tlb_flush; ulong remote_tlb_flush;
ulong num_2M_pages;
ulong num_1G_pages;
}; };
struct kvm_vcpu_stat { struct kvm_vcpu_stat {
...@@ -377,6 +379,7 @@ struct kvmppc_mmu { ...@@ -377,6 +379,7 @@ struct kvmppc_mmu {
void (*slbmte)(struct kvm_vcpu *vcpu, u64 rb, u64 rs); void (*slbmte)(struct kvm_vcpu *vcpu, u64 rb, u64 rs);
u64 (*slbmfee)(struct kvm_vcpu *vcpu, u64 slb_nr); u64 (*slbmfee)(struct kvm_vcpu *vcpu, u64 slb_nr);
u64 (*slbmfev)(struct kvm_vcpu *vcpu, u64 slb_nr); u64 (*slbmfev)(struct kvm_vcpu *vcpu, u64 slb_nr);
int (*slbfee)(struct kvm_vcpu *vcpu, gva_t eaddr, ulong *ret_slb);
void (*slbie)(struct kvm_vcpu *vcpu, u64 slb_nr); void (*slbie)(struct kvm_vcpu *vcpu, u64 slb_nr);
void (*slbia)(struct kvm_vcpu *vcpu); void (*slbia)(struct kvm_vcpu *vcpu);
/* book3s */ /* book3s */
...@@ -837,7 +840,7 @@ struct kvm_vcpu_arch { ...@@ -837,7 +840,7 @@ struct kvm_vcpu_arch {
static inline void kvm_arch_hardware_disable(void) {} static inline void kvm_arch_hardware_disable(void) {}
static inline void kvm_arch_hardware_unsetup(void) {} static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {} static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_exit(void) {} static inline void kvm_arch_exit(void) {}
......
...@@ -36,6 +36,8 @@ ...@@ -36,6 +36,8 @@
#endif #endif
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
#include <asm/paca.h> #include <asm/paca.h>
#include <asm/xive.h>
#include <asm/cpu_has_feature.h>
#endif #endif
/* /*
...@@ -617,6 +619,18 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 ir ...@@ -617,6 +619,18 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 ir
static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { } static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
#endif /* CONFIG_KVM_XIVE */ #endif /* CONFIG_KVM_XIVE */
#if defined(CONFIG_PPC_POWERNV) && defined(CONFIG_KVM_BOOK3S_64_HANDLER)
static inline bool xics_on_xive(void)
{
return xive_enabled() && cpu_has_feature(CPU_FTR_HVMODE);
}
#else
static inline bool xics_on_xive(void)
{
return false;
}
#endif
/* /*
* Prototypes for functions called only from assembler code. * Prototypes for functions called only from assembler code.
* Having prototypes reduces sparse errors. * Having prototypes reduces sparse errors.
......
...@@ -463,10 +463,12 @@ struct kvm_ppc_cpu_char { ...@@ -463,10 +463,12 @@ struct kvm_ppc_cpu_char {
#define KVM_PPC_CPU_CHAR_BR_HINT_HONOURED (1ULL << 58) #define KVM_PPC_CPU_CHAR_BR_HINT_HONOURED (1ULL << 58)
#define KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF (1ULL << 57) #define KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF (1ULL << 57)
#define KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS (1ULL << 56) #define KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS (1ULL << 56)
#define KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST (1ull << 54)
#define KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY (1ULL << 63) #define KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY (1ULL << 63)
#define KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR (1ULL << 62) #define KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR (1ULL << 62)
#define KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ULL << 61) #define KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ULL << 61)
#define KVM_PPC_CPU_BEHAV_FLUSH_COUNT_CACHE (1ull << 58)
/* Per-vcpu XICS interrupt controller state */ /* Per-vcpu XICS interrupt controller state */
#define KVM_REG_PPC_ICP_STATE (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8c) #define KVM_REG_PPC_ICP_STATE (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8c)
......
...@@ -39,6 +39,7 @@ ...@@ -39,6 +39,7 @@
#include "book3s.h" #include "book3s.h"
#include "trace.h" #include "trace.h"
#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
/* #define EXIT_DEBUG */ /* #define EXIT_DEBUG */
...@@ -71,6 +72,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { ...@@ -71,6 +72,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "pthru_all", VCPU_STAT(pthru_all) }, { "pthru_all", VCPU_STAT(pthru_all) },
{ "pthru_host", VCPU_STAT(pthru_host) }, { "pthru_host", VCPU_STAT(pthru_host) },
{ "pthru_bad_aff", VCPU_STAT(pthru_bad_aff) }, { "pthru_bad_aff", VCPU_STAT(pthru_bad_aff) },
{ "largepages_2M", VM_STAT(num_2M_pages) },
{ "largepages_1G", VM_STAT(num_1G_pages) },
{ NULL } { NULL }
}; };
...@@ -642,7 +645,7 @@ int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id, ...@@ -642,7 +645,7 @@ int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
r = -ENXIO; r = -ENXIO;
break; break;
} }
if (xive_enabled()) if (xics_on_xive())
*val = get_reg_val(id, kvmppc_xive_get_icp(vcpu)); *val = get_reg_val(id, kvmppc_xive_get_icp(vcpu));
else else
*val = get_reg_val(id, kvmppc_xics_get_icp(vcpu)); *val = get_reg_val(id, kvmppc_xics_get_icp(vcpu));
...@@ -715,7 +718,7 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, ...@@ -715,7 +718,7 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
r = -ENXIO; r = -ENXIO;
break; break;
} }
if (xive_enabled()) if (xics_on_xive())
r = kvmppc_xive_set_icp(vcpu, set_reg_val(id, *val)); r = kvmppc_xive_set_icp(vcpu, set_reg_val(id, *val));
else else
r = kvmppc_xics_set_icp(vcpu, set_reg_val(id, *val)); r = kvmppc_xics_set_icp(vcpu, set_reg_val(id, *val));
...@@ -991,7 +994,7 @@ int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hcall) ...@@ -991,7 +994,7 @@ int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hcall)
int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level, int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
bool line_status) bool line_status)
{ {
if (xive_enabled()) if (xics_on_xive())
return kvmppc_xive_set_irq(kvm, irq_source_id, irq, level, return kvmppc_xive_set_irq(kvm, irq_source_id, irq, level,
line_status); line_status);
else else
...@@ -1044,7 +1047,7 @@ static int kvmppc_book3s_init(void) ...@@ -1044,7 +1047,7 @@ static int kvmppc_book3s_init(void)
#ifdef CONFIG_KVM_XICS #ifdef CONFIG_KVM_XICS
#ifdef CONFIG_KVM_XIVE #ifdef CONFIG_KVM_XIVE
if (xive_enabled()) { if (xics_on_xive()) {
kvmppc_xive_init_module(); kvmppc_xive_init_module();
kvm_register_device_ops(&kvm_xive_ops, KVM_DEV_TYPE_XICS); kvm_register_device_ops(&kvm_xive_ops, KVM_DEV_TYPE_XICS);
} else } else
...@@ -1057,7 +1060,7 @@ static int kvmppc_book3s_init(void) ...@@ -1057,7 +1060,7 @@ static int kvmppc_book3s_init(void)
static void kvmppc_book3s_exit(void) static void kvmppc_book3s_exit(void)
{ {
#ifdef CONFIG_KVM_XICS #ifdef CONFIG_KVM_XICS
if (xive_enabled()) if (xics_on_xive())
kvmppc_xive_exit_module(); kvmppc_xive_exit_module();
#endif #endif
#ifdef CONFIG_KVM_BOOK3S_32_HANDLER #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
......
...@@ -425,6 +425,7 @@ void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu) ...@@ -425,6 +425,7 @@ void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu)
mmu->slbmte = NULL; mmu->slbmte = NULL;
mmu->slbmfee = NULL; mmu->slbmfee = NULL;
mmu->slbmfev = NULL; mmu->slbmfev = NULL;
mmu->slbfee = NULL;
mmu->slbie = NULL; mmu->slbie = NULL;
mmu->slbia = NULL; mmu->slbia = NULL;
} }
...@@ -435,6 +435,19 @@ static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 rb) ...@@ -435,6 +435,19 @@ static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 rb)
kvmppc_mmu_map_segment(vcpu, esid << SID_SHIFT); kvmppc_mmu_map_segment(vcpu, esid << SID_SHIFT);
} }
static int kvmppc_mmu_book3s_64_slbfee(struct kvm_vcpu *vcpu, gva_t eaddr,
ulong *ret_slb)
{
struct kvmppc_slb *slbe = kvmppc_mmu_book3s_64_find_slbe(vcpu, eaddr);
if (slbe) {
*ret_slb = slbe->origv;
return 0;
}
*ret_slb = 0;
return -ENOENT;
}
static u64 kvmppc_mmu_book3s_64_slbmfee(struct kvm_vcpu *vcpu, u64 slb_nr) static u64 kvmppc_mmu_book3s_64_slbmfee(struct kvm_vcpu *vcpu, u64 slb_nr)
{ {
struct kvmppc_slb *slbe; struct kvmppc_slb *slbe;
...@@ -670,6 +683,7 @@ void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu) ...@@ -670,6 +683,7 @@ void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu)
mmu->slbmte = kvmppc_mmu_book3s_64_slbmte; mmu->slbmte = kvmppc_mmu_book3s_64_slbmte;
mmu->slbmfee = kvmppc_mmu_book3s_64_slbmfee; mmu->slbmfee = kvmppc_mmu_book3s_64_slbmfee;
mmu->slbmfev = kvmppc_mmu_book3s_64_slbmfev; mmu->slbmfev = kvmppc_mmu_book3s_64_slbmfev;
mmu->slbfee = kvmppc_mmu_book3s_64_slbfee;
mmu->slbie = kvmppc_mmu_book3s_64_slbie; mmu->slbie = kvmppc_mmu_book3s_64_slbie;
mmu->slbia = kvmppc_mmu_book3s_64_slbia; mmu->slbia = kvmppc_mmu_book3s_64_slbia;
mmu->xlate = kvmppc_mmu_book3s_64_xlate; mmu->xlate = kvmppc_mmu_book3s_64_xlate;
......
...@@ -441,6 +441,24 @@ int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu, ...@@ -441,6 +441,24 @@ int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
{ {
u32 last_inst; u32 last_inst;
/*
 * Fast path - check if the guest physical address corresponds to a
 * device on the FAST_MMIO_BUS; if so, we can avoid loading the
 * instruction altogether and can just handle it and return.
 */
if (is_store) {
int idx, ret;
idx = srcu_read_lock(&vcpu->kvm->srcu);
ret = kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, (gpa_t) gpa, 0,
NULL);
srcu_read_unlock(&vcpu->kvm->srcu, idx);
if (!ret) {
kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) + 4);
return RESUME_GUEST;
}
}
/* /*
* If we fail, we just return to the guest and try executing it again. * If we fail, we just return to the guest and try executing it again.
*/ */
......
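
For emulated MMIO stores, the fast path added above asks the FAST_MMIO_BUS whether any device claims the guest physical address before fetching and decoding the faulting instruction; a return value of 0 from the bus write means "handled", so the vCPU simply skips the 4-byte instruction and resumes. A hedged sketch of that control flow with the bus lookup stubbed out (kvm_io_bus_write_stub() is a stand-in, not the kernel function):

#include <stdbool.h>
#include <stdio.h>

#define RESUME_GUEST 1
#define EMULATE_SLOW 0

static unsigned long guest_pc = 0x1000;

/* Stub: pretend a fast-MMIO device (e.g. an eventfd-backed doorbell) claims this gpa. */
static int kvm_io_bus_write_stub(unsigned long gpa)
{
	return gpa == 0xdead0000 ? 0 : -1;	/* 0 == handled by a device */
}

static int emulate_mmio_store(unsigned long gpa)
{
	if (kvm_io_bus_write_stub(gpa) == 0) {
		guest_pc += 4;		/* skip the store instruction */
		return RESUME_GUEST;	/* no instruction fetch needed */
	}
	return EMULATE_SLOW;		/* fall back to full instruction emulation */
}

int main(void)
{
	int r = emulate_mmio_store(0xdead0000);

	printf("fast path taken: %d, pc now 0x%lx\n", r, guest_pc);
	return 0;
}
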
...@@ -403,8 +403,13 @@ void kvmppc_unmap_pte(struct kvm *kvm, pte_t *pte, unsigned long gpa, ...@@ -403,8 +403,13 @@ void kvmppc_unmap_pte(struct kvm *kvm, pte_t *pte, unsigned long gpa,
if (!memslot) if (!memslot)
return; return;
} }
if (shift) if (shift) { /* 1GB or 2MB page */
page_size = 1ul << shift; page_size = 1ul << shift;
if (shift == PMD_SHIFT)
kvm->stat.num_2M_pages--;
else if (shift == PUD_SHIFT)
kvm->stat.num_1G_pages--;
}
gpa &= ~(page_size - 1); gpa &= ~(page_size - 1);
hpa = old & PTE_RPN_MASK; hpa = old & PTE_RPN_MASK;
...@@ -878,6 +883,14 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu, ...@@ -878,6 +883,14 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
put_page(page); put_page(page);
} }
/* Increment number of large pages if we (successfully) inserted one */
if (!ret) {
if (level == 1)
kvm->stat.num_2M_pages++;
else if (level == 2)
kvm->stat.num_1G_pages++;
}
return ret; return ret;
} }
......
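
The two hunks above keep the new largepages_2M / largepages_1G counters balanced: the insert path keys off the radix mapping level (1 = 2M, 2 = 1G) while the unmap path keys off the page shift. A small sketch of that shift-to-bucket mapping, assuming the usual radix values PMD_SHIFT = 21 and PUD_SHIFT = 30 (those constants are not shown in this hunk):

#include <stdio.h>

#define PMD_SHIFT 21	/* 2 MiB mapping on radix (assumed value) */
#define PUD_SHIFT 30	/* 1 GiB mapping on radix (assumed value) */

struct vm_stats {
	long num_2M_pages;
	long num_1G_pages;
};

static void account_unmap(struct vm_stats *s, unsigned int shift)
{
	if (shift == PMD_SHIFT)
		s->num_2M_pages--;
	else if (shift == PUD_SHIFT)
		s->num_1G_pages--;
}

static void account_insert(struct vm_stats *s, int level)
{
	if (level == 1)
		s->num_2M_pages++;	/* level 1 == PMD == 2M */
	else if (level == 2)
		s->num_1G_pages++;	/* level 2 == PUD == 1G */
}

int main(void)
{
	struct vm_stats s = { 0, 0 };

	account_insert(&s, 1);
	account_unmap(&s, PMD_SHIFT);
	printf("2M=%ld 1G=%ld\n", s.num_2M_pages, s.num_1G_pages);
	return 0;
}
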
...@@ -133,7 +133,6 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm, ...@@ -133,7 +133,6 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
continue; continue;
kref_put(&stit->kref, kvm_spapr_tce_liobn_put); kref_put(&stit->kref, kvm_spapr_tce_liobn_put);
return;
} }
} }
} }
...@@ -338,14 +337,15 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, ...@@ -338,14 +337,15 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
} }
} }
kvm_get_kvm(kvm);
if (!ret) if (!ret)
ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops, ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
stt, O_RDWR | O_CLOEXEC); stt, O_RDWR | O_CLOEXEC);
if (ret >= 0) { if (ret >= 0)
list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables); list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
kvm_get_kvm(kvm); else
} kvm_put_kvm(kvm);
mutex_unlock(&kvm->lock); mutex_unlock(&kvm->lock);
......
...@@ -47,6 +47,7 @@ ...@@ -47,6 +47,7 @@
#define OP_31_XOP_SLBMFEV 851 #define OP_31_XOP_SLBMFEV 851
#define OP_31_XOP_EIOIO 854 #define OP_31_XOP_EIOIO 854
#define OP_31_XOP_SLBMFEE 915 #define OP_31_XOP_SLBMFEE 915
#define OP_31_XOP_SLBFEE 979
#define OP_31_XOP_TBEGIN 654 #define OP_31_XOP_TBEGIN 654
#define OP_31_XOP_TABORT 910 #define OP_31_XOP_TABORT 910
...@@ -416,6 +417,23 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu, ...@@ -416,6 +417,23 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
vcpu->arch.mmu.slbia(vcpu); vcpu->arch.mmu.slbia(vcpu);
break; break;
case OP_31_XOP_SLBFEE:
if (!(inst & 1) || !vcpu->arch.mmu.slbfee) {
return EMULATE_FAIL;
} else {
ulong b, t;
ulong cr = kvmppc_get_cr(vcpu) & ~CR0_MASK;
b = kvmppc_get_gpr(vcpu, rb);
if (!vcpu->arch.mmu.slbfee(vcpu, b, &t))
cr |= 2 << CR0_SHIFT;
kvmppc_set_gpr(vcpu, rt, t);
/* copy XER[SO] bit to CR0[SO] */
cr |= (vcpu->arch.regs.xer & 0x80000000) >>
(31 - CR0_SHIFT);
kvmppc_set_cr(vcpu, cr);
}
break;
case OP_31_XOP_SLBMFEE: case OP_31_XOP_SLBMFEE:
if (!vcpu->arch.mmu.slbmfee) { if (!vcpu->arch.mmu.slbmfee) {
emulated = EMULATE_FAIL; emulated = EMULATE_FAIL;
......
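
The slbfee. emulation above builds CR0 by hand: it clears the CR0 field, sets EQ when a matching SLB entry is found, and copies XER[SO] into CR0[SO]. A worked sketch of that bit arithmetic, assuming CR0_SHIFT is 28 and CR0_MASK is 0xF0000000 (the topmost 4-bit CR field; those constants are not shown in this hunk):

#include <stdio.h>

#define CR0_SHIFT 28		/* assumed: CR0 is the top nibble of CR */
#define CR0_MASK  0xF0000000u
#define XER_SO    0x80000000u	/* summary-overflow bit in the low 32 bits of XER */

static unsigned int slbfee_cr0(unsigned int cr, unsigned int xer, int found)
{
	cr &= ~CR0_MASK;			  /* clear LT/GT/EQ/SO of CR0 */
	if (found)
		cr |= 2u << CR0_SHIFT;		  /* field value 0b0010 -> EQ = 1 */
	cr |= (xer & XER_SO) >> (31 - CR0_SHIFT); /* copy XER[SO] into CR0[SO] */
	return cr;
}

int main(void)
{
	/* entry found and SO pending: CR0 ends up as EQ|SO (0x3 in the top nibble) */
	printf("cr = 0x%08x\n", slbfee_cr0(0x0000000f, XER_SO, 1));
	return 0;
}
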
...@@ -922,7 +922,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) ...@@ -922,7 +922,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
case H_IPOLL: case H_IPOLL:
case H_XIRR_X: case H_XIRR_X:
if (kvmppc_xics_enabled(vcpu)) { if (kvmppc_xics_enabled(vcpu)) {
if (xive_enabled()) { if (xics_on_xive()) {
ret = H_NOT_AVAILABLE; ret = H_NOT_AVAILABLE;
return RESUME_GUEST; return RESUME_GUEST;
} }
...@@ -937,6 +937,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) ...@@ -937,6 +937,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
ret = kvmppc_h_set_xdabr(vcpu, kvmppc_get_gpr(vcpu, 4), ret = kvmppc_h_set_xdabr(vcpu, kvmppc_get_gpr(vcpu, 4),
kvmppc_get_gpr(vcpu, 5)); kvmppc_get_gpr(vcpu, 5));
break; break;
#ifdef CONFIG_SPAPR_TCE_IOMMU
case H_GET_TCE: case H_GET_TCE:
ret = kvmppc_h_get_tce(vcpu, kvmppc_get_gpr(vcpu, 4), ret = kvmppc_h_get_tce(vcpu, kvmppc_get_gpr(vcpu, 4),
kvmppc_get_gpr(vcpu, 5)); kvmppc_get_gpr(vcpu, 5));
...@@ -966,6 +967,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) ...@@ -966,6 +967,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
if (ret == H_TOO_HARD) if (ret == H_TOO_HARD)
return RESUME_HOST; return RESUME_HOST;
break; break;
#endif
case H_RANDOM: case H_RANDOM:
if (!powernv_get_random_long(&vcpu->arch.regs.gpr[4])) if (!powernv_get_random_long(&vcpu->arch.regs.gpr[4]))
ret = H_HARDWARE; ret = H_HARDWARE;
...@@ -1445,7 +1447,7 @@ static int kvmppc_handle_nested_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) ...@@ -1445,7 +1447,7 @@ static int kvmppc_handle_nested_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
case BOOK3S_INTERRUPT_HV_RM_HARD: case BOOK3S_INTERRUPT_HV_RM_HARD:
vcpu->arch.trap = 0; vcpu->arch.trap = 0;
r = RESUME_GUEST; r = RESUME_GUEST;
if (!xive_enabled()) if (!xics_on_xive())
kvmppc_xics_rm_complete(vcpu, 0); kvmppc_xics_rm_complete(vcpu, 0);
break; break;
default: default:
...@@ -3648,11 +3650,12 @@ static void kvmppc_wait_for_exec(struct kvmppc_vcore *vc, ...@@ -3648,11 +3650,12 @@ static void kvmppc_wait_for_exec(struct kvmppc_vcore *vc,
static void grow_halt_poll_ns(struct kvmppc_vcore *vc) static void grow_halt_poll_ns(struct kvmppc_vcore *vc)
{ {
/* 10us base */ if (!halt_poll_ns_grow)
if (vc->halt_poll_ns == 0 && halt_poll_ns_grow) return;
vc->halt_poll_ns = 10000;
else vc->halt_poll_ns *= halt_poll_ns_grow;
vc->halt_poll_ns *= halt_poll_ns_grow; if (vc->halt_poll_ns < halt_poll_ns_grow_start)
vc->halt_poll_ns = halt_poll_ns_grow_start;
} }
static void shrink_halt_poll_ns(struct kvmppc_vcore *vc) static void shrink_halt_poll_ns(struct kvmppc_vcore *vc)
...@@ -3666,7 +3669,7 @@ static void shrink_halt_poll_ns(struct kvmppc_vcore *vc) ...@@ -3666,7 +3669,7 @@ static void shrink_halt_poll_ns(struct kvmppc_vcore *vc)
#ifdef CONFIG_KVM_XICS #ifdef CONFIG_KVM_XICS
static inline bool xive_interrupt_pending(struct kvm_vcpu *vcpu) static inline bool xive_interrupt_pending(struct kvm_vcpu *vcpu)
{ {
if (!xive_enabled()) if (!xics_on_xive())
return false; return false;
return vcpu->arch.irq_pending || vcpu->arch.xive_saved_state.pipr < return vcpu->arch.irq_pending || vcpu->arch.xive_saved_state.pipr <
vcpu->arch.xive_saved_state.cppr; vcpu->arch.xive_saved_state.cppr;
...@@ -4226,7 +4229,7 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu) ...@@ -4226,7 +4229,7 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu)
vcpu->arch.fault_dar, vcpu->arch.fault_dsisr); vcpu->arch.fault_dar, vcpu->arch.fault_dsisr);
srcu_read_unlock(&kvm->srcu, srcu_idx); srcu_read_unlock(&kvm->srcu, srcu_idx);
} else if (r == RESUME_PASSTHROUGH) { } else if (r == RESUME_PASSTHROUGH) {
if (WARN_ON(xive_enabled())) if (WARN_ON(xics_on_xive()))
r = H_SUCCESS; r = H_SUCCESS;
else else
r = kvmppc_xics_rm_complete(vcpu, 0); r = kvmppc_xics_rm_complete(vcpu, 0);
...@@ -4750,7 +4753,7 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm) ...@@ -4750,7 +4753,7 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
* If xive is enabled, we route 0x500 interrupts directly * If xive is enabled, we route 0x500 interrupts directly
* to the guest. * to the guest.
*/ */
if (xive_enabled()) if (xics_on_xive())
lpcr |= LPCR_LPES; lpcr |= LPCR_LPES;
} }
...@@ -4986,7 +4989,7 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) ...@@ -4986,7 +4989,7 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi)
if (i == pimap->n_mapped) if (i == pimap->n_mapped)
pimap->n_mapped++; pimap->n_mapped++;
if (xive_enabled()) if (xics_on_xive())
rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc); rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc);
else else
kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq); kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq);
...@@ -5027,7 +5030,7 @@ static int kvmppc_clr_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) ...@@ -5027,7 +5030,7 @@ static int kvmppc_clr_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi)
return -ENODEV; return -ENODEV;
} }
if (xive_enabled()) if (xics_on_xive())
rc = kvmppc_xive_clr_mapped(kvm, guest_gsi, pimap->mapped[i].desc); rc = kvmppc_xive_clr_mapped(kvm, guest_gsi, pimap->mapped[i].desc);
else else
kvmppc_xics_clr_mapped(kvm, guest_gsi, pimap->mapped[i].r_hwirq); kvmppc_xics_clr_mapped(kvm, guest_gsi, pimap->mapped[i].r_hwirq);
...@@ -5359,13 +5362,11 @@ static int kvm_init_subcore_bitmap(void) ...@@ -5359,13 +5362,11 @@ static int kvm_init_subcore_bitmap(void)
continue; continue;
sibling_subcore_state = sibling_subcore_state =
kmalloc_node(sizeof(struct sibling_subcore_state), kzalloc_node(sizeof(struct sibling_subcore_state),
GFP_KERNEL, node); GFP_KERNEL, node);
if (!sibling_subcore_state) if (!sibling_subcore_state)
return -ENOMEM; return -ENOMEM;
memset(sibling_subcore_state, 0,
sizeof(struct sibling_subcore_state));
for (j = 0; j < threads_per_core; j++) { for (j = 0; j < threads_per_core; j++) {
int cpu = first_cpu + j; int cpu = first_cpu + j;
...@@ -5406,7 +5407,7 @@ static int kvmppc_book3s_init_hv(void) ...@@ -5406,7 +5407,7 @@ static int kvmppc_book3s_init_hv(void)
* indirectly, via OPAL. * indirectly, via OPAL.
*/ */
#ifdef CONFIG_SMP #ifdef CONFIG_SMP
if (!xive_enabled() && !kvmhv_on_pseries() && if (!xics_on_xive() && !kvmhv_on_pseries() &&
!local_paca->kvm_hstate.xics_phys) { !local_paca->kvm_hstate.xics_phys) {
struct device_node *np; struct device_node *np;
......
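
The rewritten grow_halt_poll_ns() above mirrors the generic-KVM change in this series: growth is skipped entirely when halt_poll_ns_grow is 0, and after multiplying, the interval is raised to at least the new halt_poll_ns_grow_start parameter so polling never restarts below that floor. A sketch of the new behaviour with the module parameters modelled as plain variables (their default values here are assumptions):

#include <stdio.h>

/* Module parameters, modelled as plain globals for the sketch. */
static unsigned int halt_poll_ns_grow = 2;
static unsigned int halt_poll_ns_grow_start = 10000;	/* 10 us */

static void grow_halt_poll_ns(unsigned long *halt_poll_ns)
{
	if (!halt_poll_ns_grow)
		return;				/* growth disabled */

	*halt_poll_ns *= halt_poll_ns_grow;
	if (*halt_poll_ns < halt_poll_ns_grow_start)
		*halt_poll_ns = halt_poll_ns_grow_start; /* never start below the floor */
}

int main(void)
{
	unsigned long ns = 0;

	grow_halt_poll_ns(&ns);		/* 0 * 2, then clamped up to 10000 */
	grow_halt_poll_ns(&ns);		/* 10000 * 2 -> 20000 */
	printf("%lu\n", ns);
	return 0;
}
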
...@@ -257,7 +257,7 @@ void kvmhv_rm_send_ipi(int cpu) ...@@ -257,7 +257,7 @@ void kvmhv_rm_send_ipi(int cpu)
} }
/* We should never reach this */ /* We should never reach this */
if (WARN_ON_ONCE(xive_enabled())) if (WARN_ON_ONCE(xics_on_xive()))
return; return;
/* Else poke the target with an IPI */ /* Else poke the target with an IPI */
...@@ -577,7 +577,7 @@ unsigned long kvmppc_rm_h_xirr(struct kvm_vcpu *vcpu) ...@@ -577,7 +577,7 @@ unsigned long kvmppc_rm_h_xirr(struct kvm_vcpu *vcpu)
{ {
if (!kvmppc_xics_enabled(vcpu)) if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD; return H_TOO_HARD;
if (xive_enabled()) { if (xics_on_xive()) {
if (is_rm()) if (is_rm())
return xive_rm_h_xirr(vcpu); return xive_rm_h_xirr(vcpu);
if (unlikely(!__xive_vm_h_xirr)) if (unlikely(!__xive_vm_h_xirr))
...@@ -592,7 +592,7 @@ unsigned long kvmppc_rm_h_xirr_x(struct kvm_vcpu *vcpu) ...@@ -592,7 +592,7 @@ unsigned long kvmppc_rm_h_xirr_x(struct kvm_vcpu *vcpu)
if (!kvmppc_xics_enabled(vcpu)) if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD; return H_TOO_HARD;
vcpu->arch.regs.gpr[5] = get_tb(); vcpu->arch.regs.gpr[5] = get_tb();
if (xive_enabled()) { if (xics_on_xive()) {
if (is_rm()) if (is_rm())
return xive_rm_h_xirr(vcpu); return xive_rm_h_xirr(vcpu);
if (unlikely(!__xive_vm_h_xirr)) if (unlikely(!__xive_vm_h_xirr))
...@@ -606,7 +606,7 @@ unsigned long kvmppc_rm_h_ipoll(struct kvm_vcpu *vcpu, unsigned long server) ...@@ -606,7 +606,7 @@ unsigned long kvmppc_rm_h_ipoll(struct kvm_vcpu *vcpu, unsigned long server)
{ {
if (!kvmppc_xics_enabled(vcpu)) if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD; return H_TOO_HARD;
if (xive_enabled()) { if (xics_on_xive()) {
if (is_rm()) if (is_rm())
return xive_rm_h_ipoll(vcpu, server); return xive_rm_h_ipoll(vcpu, server);
if (unlikely(!__xive_vm_h_ipoll)) if (unlikely(!__xive_vm_h_ipoll))
...@@ -621,7 +621,7 @@ int kvmppc_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long server, ...@@ -621,7 +621,7 @@ int kvmppc_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long server,
{ {
if (!kvmppc_xics_enabled(vcpu)) if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD; return H_TOO_HARD;
if (xive_enabled()) { if (xics_on_xive()) {
if (is_rm()) if (is_rm())
return xive_rm_h_ipi(vcpu, server, mfrr); return xive_rm_h_ipi(vcpu, server, mfrr);
if (unlikely(!__xive_vm_h_ipi)) if (unlikely(!__xive_vm_h_ipi))
...@@ -635,7 +635,7 @@ int kvmppc_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long cppr) ...@@ -635,7 +635,7 @@ int kvmppc_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long cppr)
{ {
if (!kvmppc_xics_enabled(vcpu)) if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD; return H_TOO_HARD;
if (xive_enabled()) { if (xics_on_xive()) {
if (is_rm()) if (is_rm())
return xive_rm_h_cppr(vcpu, cppr); return xive_rm_h_cppr(vcpu, cppr);
if (unlikely(!__xive_vm_h_cppr)) if (unlikely(!__xive_vm_h_cppr))
...@@ -649,7 +649,7 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr) ...@@ -649,7 +649,7 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
{ {
if (!kvmppc_xics_enabled(vcpu)) if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD; return H_TOO_HARD;
if (xive_enabled()) { if (xics_on_xive()) {
if (is_rm()) if (is_rm())
return xive_rm_h_eoi(vcpu, xirr); return xive_rm_h_eoi(vcpu, xirr);
if (unlikely(!__xive_vm_h_eoi)) if (unlikely(!__xive_vm_h_eoi))
......
...@@ -144,6 +144,13 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu, ...@@ -144,6 +144,13 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
return; return;
} }
if (xive_enabled() && kvmhv_on_pseries()) {
/* No XICS access or hypercalls available, too hard */
this_icp->rm_action |= XICS_RM_KICK_VCPU;
this_icp->rm_kick_target = vcpu;
return;
}
/* /*
* Check if the core is loaded, * Check if the core is loaded,
* if not, find an available host core to post to wake the VCPU, * if not, find an available host core to post to wake the VCPU,
......
...@@ -2272,8 +2272,13 @@ hcall_real_table: ...@@ -2272,8 +2272,13 @@ hcall_real_table:
.long DOTSYM(kvmppc_h_clear_mod) - hcall_real_table .long DOTSYM(kvmppc_h_clear_mod) - hcall_real_table
.long DOTSYM(kvmppc_h_clear_ref) - hcall_real_table .long DOTSYM(kvmppc_h_clear_ref) - hcall_real_table
.long DOTSYM(kvmppc_h_protect) - hcall_real_table .long DOTSYM(kvmppc_h_protect) - hcall_real_table
#ifdef CONFIG_SPAPR_TCE_IOMMU
.long DOTSYM(kvmppc_h_get_tce) - hcall_real_table .long DOTSYM(kvmppc_h_get_tce) - hcall_real_table
.long DOTSYM(kvmppc_rm_h_put_tce) - hcall_real_table .long DOTSYM(kvmppc_rm_h_put_tce) - hcall_real_table
#else
.long 0 /* 0x1c */
.long 0 /* 0x20 */
#endif
.long 0 /* 0x24 - H_SET_SPRG0 */ .long 0 /* 0x24 - H_SET_SPRG0 */
.long DOTSYM(kvmppc_h_set_dabr) - hcall_real_table .long DOTSYM(kvmppc_h_set_dabr) - hcall_real_table
.long 0 /* 0x2c */ .long 0 /* 0x2c */
...@@ -2351,8 +2356,13 @@ hcall_real_table: ...@@ -2351,8 +2356,13 @@ hcall_real_table:
.long 0 /* 0x12c */ .long 0 /* 0x12c */
.long 0 /* 0x130 */ .long 0 /* 0x130 */
.long DOTSYM(kvmppc_h_set_xdabr) - hcall_real_table .long DOTSYM(kvmppc_h_set_xdabr) - hcall_real_table
#ifdef CONFIG_SPAPR_TCE_IOMMU
.long DOTSYM(kvmppc_rm_h_stuff_tce) - hcall_real_table .long DOTSYM(kvmppc_rm_h_stuff_tce) - hcall_real_table
.long DOTSYM(kvmppc_rm_h_put_tce_indirect) - hcall_real_table .long DOTSYM(kvmppc_rm_h_put_tce_indirect) - hcall_real_table
#else
.long 0 /* 0x138 */
.long 0 /* 0x13c */
#endif
.long 0 /* 0x140 */ .long 0 /* 0x140 */
.long 0 /* 0x144 */ .long 0 /* 0x144 */
.long 0 /* 0x148 */ .long 0 /* 0x148 */
......
...@@ -33,7 +33,7 @@ static void kvm_rtas_set_xive(struct kvm_vcpu *vcpu, struct rtas_args *args) ...@@ -33,7 +33,7 @@ static void kvm_rtas_set_xive(struct kvm_vcpu *vcpu, struct rtas_args *args)
server = be32_to_cpu(args->args[1]); server = be32_to_cpu(args->args[1]);
priority = be32_to_cpu(args->args[2]); priority = be32_to_cpu(args->args[2]);
if (xive_enabled()) if (xics_on_xive())
rc = kvmppc_xive_set_xive(vcpu->kvm, irq, server, priority); rc = kvmppc_xive_set_xive(vcpu->kvm, irq, server, priority);
else else
rc = kvmppc_xics_set_xive(vcpu->kvm, irq, server, priority); rc = kvmppc_xics_set_xive(vcpu->kvm, irq, server, priority);
...@@ -56,7 +56,7 @@ static void kvm_rtas_get_xive(struct kvm_vcpu *vcpu, struct rtas_args *args) ...@@ -56,7 +56,7 @@ static void kvm_rtas_get_xive(struct kvm_vcpu *vcpu, struct rtas_args *args)
irq = be32_to_cpu(args->args[0]); irq = be32_to_cpu(args->args[0]);
server = priority = 0; server = priority = 0;
if (xive_enabled()) if (xics_on_xive())
rc = kvmppc_xive_get_xive(vcpu->kvm, irq, &server, &priority); rc = kvmppc_xive_get_xive(vcpu->kvm, irq, &server, &priority);
else else
rc = kvmppc_xics_get_xive(vcpu->kvm, irq, &server, &priority); rc = kvmppc_xics_get_xive(vcpu->kvm, irq, &server, &priority);
...@@ -83,7 +83,7 @@ static void kvm_rtas_int_off(struct kvm_vcpu *vcpu, struct rtas_args *args) ...@@ -83,7 +83,7 @@ static void kvm_rtas_int_off(struct kvm_vcpu *vcpu, struct rtas_args *args)
irq = be32_to_cpu(args->args[0]); irq = be32_to_cpu(args->args[0]);
if (xive_enabled()) if (xics_on_xive())
rc = kvmppc_xive_int_off(vcpu->kvm, irq); rc = kvmppc_xive_int_off(vcpu->kvm, irq);
else else
rc = kvmppc_xics_int_off(vcpu->kvm, irq); rc = kvmppc_xics_int_off(vcpu->kvm, irq);
...@@ -105,7 +105,7 @@ static void kvm_rtas_int_on(struct kvm_vcpu *vcpu, struct rtas_args *args) ...@@ -105,7 +105,7 @@ static void kvm_rtas_int_on(struct kvm_vcpu *vcpu, struct rtas_args *args)
irq = be32_to_cpu(args->args[0]); irq = be32_to_cpu(args->args[0]);
if (xive_enabled()) if (xics_on_xive())
rc = kvmppc_xive_int_on(vcpu->kvm, irq); rc = kvmppc_xive_int_on(vcpu->kvm, irq);
else else
rc = kvmppc_xics_int_on(vcpu->kvm, irq); rc = kvmppc_xics_int_on(vcpu->kvm, irq);
......
...@@ -748,7 +748,7 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu) ...@@ -748,7 +748,7 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
kvmppc_mpic_disconnect_vcpu(vcpu->arch.mpic, vcpu); kvmppc_mpic_disconnect_vcpu(vcpu->arch.mpic, vcpu);
break; break;
case KVMPPC_IRQ_XICS: case KVMPPC_IRQ_XICS:
if (xive_enabled()) if (xics_on_xive())
kvmppc_xive_cleanup_vcpu(vcpu); kvmppc_xive_cleanup_vcpu(vcpu);
else else
kvmppc_xics_free_icp(vcpu); kvmppc_xics_free_icp(vcpu);
...@@ -1931,7 +1931,7 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu, ...@@ -1931,7 +1931,7 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
r = -EPERM; r = -EPERM;
dev = kvm_device_from_filp(f.file); dev = kvm_device_from_filp(f.file);
if (dev) { if (dev) {
if (xive_enabled()) if (xics_on_xive())
r = kvmppc_xive_connect_vcpu(dev, vcpu, cap->args[1]); r = kvmppc_xive_connect_vcpu(dev, vcpu, cap->args[1]);
else else
r = kvmppc_xics_connect_vcpu(dev, vcpu, cap->args[1]); r = kvmppc_xics_connect_vcpu(dev, vcpu, cap->args[1]);
...@@ -2189,10 +2189,12 @@ static int pseries_get_cpu_char(struct kvm_ppc_cpu_char *cp) ...@@ -2189,10 +2189,12 @@ static int pseries_get_cpu_char(struct kvm_ppc_cpu_char *cp)
KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV | KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV |
KVM_PPC_CPU_CHAR_BR_HINT_HONOURED | KVM_PPC_CPU_CHAR_BR_HINT_HONOURED |
KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF | KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF |
KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS; KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS |
KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST;
cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY | cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY |
KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR | KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR |
KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR; KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR |
KVM_PPC_CPU_BEHAV_FLUSH_COUNT_CACHE;
} }
return 0; return 0;
} }
...@@ -2251,12 +2253,16 @@ static int kvmppc_get_cpu_char(struct kvm_ppc_cpu_char *cp) ...@@ -2251,12 +2253,16 @@ static int kvmppc_get_cpu_char(struct kvm_ppc_cpu_char *cp)
if (have_fw_feat(fw_features, "enabled", if (have_fw_feat(fw_features, "enabled",
"fw-count-cache-disabled")) "fw-count-cache-disabled"))
cp->character |= KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS; cp->character |= KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS;
if (have_fw_feat(fw_features, "enabled",
"fw-count-cache-flush-bcctr2,0,0"))
cp->character |= KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST;
cp->character_mask = KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31 | cp->character_mask = KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31 |
KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED | KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED |
KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30 | KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30 |
KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2 | KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2 |
KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV | KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV |
KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS; KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS |
KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST;
if (have_fw_feat(fw_features, "enabled", if (have_fw_feat(fw_features, "enabled",
"speculation-policy-favor-security")) "speculation-policy-favor-security"))
...@@ -2267,9 +2273,13 @@ static int kvmppc_get_cpu_char(struct kvm_ppc_cpu_char *cp) ...@@ -2267,9 +2273,13 @@ static int kvmppc_get_cpu_char(struct kvm_ppc_cpu_char *cp)
if (!have_fw_feat(fw_features, "disabled", if (!have_fw_feat(fw_features, "disabled",
"needs-spec-barrier-for-bound-checks")) "needs-spec-barrier-for-bound-checks"))
cp->behaviour |= KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR; cp->behaviour |= KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR;
if (have_fw_feat(fw_features, "enabled",
"needs-count-cache-flush-on-context-switch"))
cp->behaviour |= KVM_PPC_CPU_BEHAV_FLUSH_COUNT_CACHE;
cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY | cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY |
KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR | KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR |
KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR; KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR |
KVM_PPC_CPU_BEHAV_FLUSH_COUNT_CACHE;
of_node_put(fw_features); of_node_put(fw_features);
} }
......
...@@ -331,5 +331,6 @@ extern void css_schedule_reprobe(void); ...@@ -331,5 +331,6 @@ extern void css_schedule_reprobe(void);
/* Function from drivers/s390/cio/chsc.c */ /* Function from drivers/s390/cio/chsc.c */
int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta); int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
int chsc_sstpi(void *page, void *result, size_t size); int chsc_sstpi(void *page, void *result, size_t size);
int chsc_sgib(u32 origin);
#endif #endif
...@@ -62,6 +62,7 @@ enum interruption_class { ...@@ -62,6 +62,7 @@ enum interruption_class {
IRQIO_MSI, IRQIO_MSI,
IRQIO_VIR, IRQIO_VIR,
IRQIO_VAI, IRQIO_VAI,
IRQIO_GAL,
NMI_NMI, NMI_NMI,
CPU_RST, CPU_RST,
NR_ARCH_IRQS NR_ARCH_IRQS
......
...@@ -21,6 +21,7 @@ ...@@ -21,6 +21,7 @@
/* Adapter interrupts. */ /* Adapter interrupts. */
#define QDIO_AIRQ_ISC IO_SCH_ISC /* I/O subchannel in qdio mode */ #define QDIO_AIRQ_ISC IO_SCH_ISC /* I/O subchannel in qdio mode */
#define PCI_ISC 2 /* PCI I/O subchannels */ #define PCI_ISC 2 /* PCI I/O subchannels */
#define GAL_ISC 5 /* GIB alert */
#define AP_ISC 6 /* adjunct processor (crypto) devices */ #define AP_ISC 6 /* adjunct processor (crypto) devices */
/* Functions for registration of I/O interruption subclasses */ /* Functions for registration of I/O interruption subclasses */
......
...@@ -591,7 +591,6 @@ struct kvm_s390_float_interrupt { ...@@ -591,7 +591,6 @@ struct kvm_s390_float_interrupt {
struct kvm_s390_mchk_info mchk; struct kvm_s390_mchk_info mchk;
struct kvm_s390_ext_info srv_signal; struct kvm_s390_ext_info srv_signal;
int next_rr_cpu; int next_rr_cpu;
unsigned long idle_mask[BITS_TO_LONGS(KVM_MAX_VCPUS)];
struct mutex ais_lock; struct mutex ais_lock;
u8 simm; u8 simm;
u8 nimm; u8 nimm;
...@@ -712,6 +711,7 @@ struct s390_io_adapter { ...@@ -712,6 +711,7 @@ struct s390_io_adapter {
struct kvm_s390_cpu_model { struct kvm_s390_cpu_model {
/* facility mask supported by kvm & hosting machine */ /* facility mask supported by kvm & hosting machine */
__u64 fac_mask[S390_ARCH_FAC_LIST_SIZE_U64]; __u64 fac_mask[S390_ARCH_FAC_LIST_SIZE_U64];
struct kvm_s390_vm_cpu_subfunc subfuncs;
/* facility list requested by guest (in dma page) */ /* facility list requested by guest (in dma page) */
__u64 *fac_list; __u64 *fac_list;
u64 cpuid; u64 cpuid;
...@@ -782,9 +782,21 @@ struct kvm_s390_gisa { ...@@ -782,9 +782,21 @@ struct kvm_s390_gisa {
u8 reserved03[11]; u8 reserved03[11];
u32 airq_count; u32 airq_count;
} g1; } g1;
struct {
u64 word[4];
} u64;
}; };
}; };
struct kvm_s390_gib {
u32 alert_list_origin;
u32 reserved01;
u8:5;
u8 nisc:3;
u8 reserved03[3];
u32 reserved04[5];
};
/* /*
* sie_page2 has to be allocated as DMA because fac_list, crycb and * sie_page2 has to be allocated as DMA because fac_list, crycb and
* gisa need 31bit addresses in the sie control block. * gisa need 31bit addresses in the sie control block.
...@@ -793,7 +805,8 @@ struct sie_page2 { ...@@ -793,7 +805,8 @@ struct sie_page2 {
__u64 fac_list[S390_ARCH_FAC_LIST_SIZE_U64]; /* 0x0000 */ __u64 fac_list[S390_ARCH_FAC_LIST_SIZE_U64]; /* 0x0000 */
struct kvm_s390_crypto_cb crycb; /* 0x0800 */ struct kvm_s390_crypto_cb crycb; /* 0x0800 */
struct kvm_s390_gisa gisa; /* 0x0900 */ struct kvm_s390_gisa gisa; /* 0x0900 */
u8 reserved920[0x1000 - 0x920]; /* 0x0920 */ struct kvm *kvm; /* 0x0920 */
u8 reserved928[0x1000 - 0x928]; /* 0x0928 */
}; };
struct kvm_s390_vsie { struct kvm_s390_vsie {
...@@ -804,6 +817,20 @@ struct kvm_s390_vsie { ...@@ -804,6 +817,20 @@ struct kvm_s390_vsie {
struct page *pages[KVM_MAX_VCPUS]; struct page *pages[KVM_MAX_VCPUS];
}; };
struct kvm_s390_gisa_iam {
u8 mask;
spinlock_t ref_lock;
u32 ref_count[MAX_ISC + 1];
};
struct kvm_s390_gisa_interrupt {
struct kvm_s390_gisa *origin;
struct kvm_s390_gisa_iam alert;
struct hrtimer timer;
u64 expires;
DECLARE_BITMAP(kicked_mask, KVM_MAX_VCPUS);
};
struct kvm_arch{ struct kvm_arch{
void *sca; void *sca;
int use_esca; int use_esca;
...@@ -837,7 +864,8 @@ struct kvm_arch{ ...@@ -837,7 +864,8 @@ struct kvm_arch{
atomic64_t cmma_dirty_pages; atomic64_t cmma_dirty_pages;
/* subset of available cpu features enabled by user space */ /* subset of available cpu features enabled by user space */
DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS); DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
struct kvm_s390_gisa *gisa; DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS);
struct kvm_s390_gisa_interrupt gisa_int;
}; };
#define KVM_HVA_ERR_BAD (-1UL) #define KVM_HVA_ERR_BAD (-1UL)
...@@ -871,6 +899,9 @@ void kvm_arch_crypto_set_masks(struct kvm *kvm, unsigned long *apm, ...@@ -871,6 +899,9 @@ void kvm_arch_crypto_set_masks(struct kvm *kvm, unsigned long *apm,
extern int sie64a(struct kvm_s390_sie_block *, u64 *); extern int sie64a(struct kvm_s390_sie_block *, u64 *);
extern char sie_exit; extern char sie_exit;
extern int kvm_s390_gisc_register(struct kvm *kvm, u32 gisc);
extern int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc);
static inline void kvm_arch_hardware_disable(void) {} static inline void kvm_arch_hardware_disable(void) {}
static inline void kvm_arch_check_processor_compat(void *rtn) {} static inline void kvm_arch_check_processor_compat(void *rtn) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sync_events(struct kvm *kvm) {}
...@@ -878,7 +909,7 @@ static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {} ...@@ -878,7 +909,7 @@ static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_free_memslot(struct kvm *kvm, static inline void kvm_arch_free_memslot(struct kvm *kvm,
struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {} struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {}
static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {} static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm, static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot) {} struct kvm_memory_slot *slot) {}
......
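
The new GIB (Guest Information Block) layout shown above packs a 3-bit NISC after five reserved bits, and the whole block is 32 bytes. A compile-time sketch that re-declares the structure as shown and asserts its size (the re-declaration is for illustration only and assumes no extra padding, which holds because every member is naturally aligned):

#include <stdint.h>
#include <stdio.h>

struct kvm_s390_gib {
	uint32_t alert_list_origin;
	uint32_t reserved01;
	uint8_t  :5;		/* anonymous pad bits */
	uint8_t  nisc:3;	/* non-specific interruption subclass (GAL_ISC) */
	uint8_t  reserved03[3];
	uint32_t reserved04[5];
};

_Static_assert(sizeof(struct kvm_s390_gib) == 32, "GIB must be 32 bytes");

int main(void)
{
	printf("sizeof(struct kvm_s390_gib) = %zu\n", sizeof(struct kvm_s390_gib));
	return 0;
}
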
...@@ -88,6 +88,7 @@ static const struct irq_class irqclass_sub_desc[] = { ...@@ -88,6 +88,7 @@ static const struct irq_class irqclass_sub_desc[] = {
{.irq = IRQIO_MSI, .name = "MSI", .desc = "[I/O] MSI Interrupt" }, {.irq = IRQIO_MSI, .name = "MSI", .desc = "[I/O] MSI Interrupt" },
{.irq = IRQIO_VIR, .name = "VIR", .desc = "[I/O] Virtual I/O Devices"}, {.irq = IRQIO_VIR, .name = "VIR", .desc = "[I/O] Virtual I/O Devices"},
{.irq = IRQIO_VAI, .name = "VAI", .desc = "[I/O] Virtual I/O Devices AI"}, {.irq = IRQIO_VAI, .name = "VAI", .desc = "[I/O] Virtual I/O Devices AI"},
{.irq = IRQIO_GAL, .name = "GAL", .desc = "[I/O] GIB Alert"},
{.irq = NMI_NMI, .name = "NMI", .desc = "[NMI] Machine Check"}, {.irq = NMI_NMI, .name = "NMI", .desc = "[NMI] Machine Check"},
{.irq = CPU_RST, .name = "RST", .desc = "[CPU] CPU Restart"}, {.irq = CPU_RST, .name = "RST", .desc = "[CPU] CPU Restart"},
}; };
......
[This diff has been collapsed.]
...@@ -432,11 +432,18 @@ int kvm_arch_init(void *opaque) ...@@ -432,11 +432,18 @@ int kvm_arch_init(void *opaque)
/* Register floating interrupt controller interface. */ /* Register floating interrupt controller interface. */
rc = kvm_register_device_ops(&kvm_flic_ops, KVM_DEV_TYPE_FLIC); rc = kvm_register_device_ops(&kvm_flic_ops, KVM_DEV_TYPE_FLIC);
if (rc) { if (rc) {
pr_err("Failed to register FLIC rc=%d\n", rc); pr_err("A FLIC registration call failed with rc=%d\n", rc);
goto out_debug_unreg; goto out_debug_unreg;
} }
rc = kvm_s390_gib_init(GAL_ISC);
if (rc)
goto out_gib_destroy;
return 0; return 0;
out_gib_destroy:
kvm_s390_gib_destroy();
out_debug_unreg: out_debug_unreg:
debug_unregister(kvm_s390_dbf); debug_unregister(kvm_s390_dbf);
return rc; return rc;
...@@ -444,6 +451,7 @@ int kvm_arch_init(void *opaque) ...@@ -444,6 +451,7 @@ int kvm_arch_init(void *opaque)
void kvm_arch_exit(void) void kvm_arch_exit(void)
{ {
kvm_s390_gib_destroy();
debug_unregister(kvm_s390_dbf); debug_unregister(kvm_s390_dbf);
} }
...@@ -1258,11 +1266,65 @@ static int kvm_s390_set_processor_feat(struct kvm *kvm, ...@@ -1258,11 +1266,65 @@ static int kvm_s390_set_processor_feat(struct kvm *kvm,
static int kvm_s390_set_processor_subfunc(struct kvm *kvm, static int kvm_s390_set_processor_subfunc(struct kvm *kvm,
struct kvm_device_attr *attr) struct kvm_device_attr *attr)
{ {
/* mutex_lock(&kvm->lock);
* Once supported by kernel + hw, we have to store the subfunctions if (kvm->created_vcpus) {
* in kvm->arch and remember that user space configured them. mutex_unlock(&kvm->lock);
*/ return -EBUSY;
return -ENXIO; }
if (copy_from_user(&kvm->arch.model.subfuncs, (void __user *)attr->addr,
sizeof(struct kvm_s390_vm_cpu_subfunc))) {
mutex_unlock(&kvm->lock);
return -EFAULT;
}
mutex_unlock(&kvm->lock);
VM_EVENT(kvm, 3, "SET: guest PLO subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.plo)[0],
((unsigned long *) &kvm->arch.model.subfuncs.plo)[1],
((unsigned long *) &kvm->arch.model.subfuncs.plo)[2],
((unsigned long *) &kvm->arch.model.subfuncs.plo)[3]);
VM_EVENT(kvm, 3, "SET: guest PTFF subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.ptff)[0],
((unsigned long *) &kvm->arch.model.subfuncs.ptff)[1]);
VM_EVENT(kvm, 3, "SET: guest KMAC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmac)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmac)[1]);
VM_EVENT(kvm, 3, "SET: guest KMC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmc)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmc)[1]);
VM_EVENT(kvm, 3, "SET: guest KM subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.km)[0],
((unsigned long *) &kvm->arch.model.subfuncs.km)[1]);
VM_EVENT(kvm, 3, "SET: guest KIMD subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kimd)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kimd)[1]);
VM_EVENT(kvm, 3, "SET: guest KLMD subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.klmd)[0],
((unsigned long *) &kvm->arch.model.subfuncs.klmd)[1]);
VM_EVENT(kvm, 3, "SET: guest PCKMO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.pckmo)[0],
((unsigned long *) &kvm->arch.model.subfuncs.pckmo)[1]);
VM_EVENT(kvm, 3, "SET: guest KMCTR subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmctr)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmctr)[1]);
VM_EVENT(kvm, 3, "SET: guest KMF subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmf)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmf)[1]);
VM_EVENT(kvm, 3, "SET: guest KMO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmo)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmo)[1]);
VM_EVENT(kvm, 3, "SET: guest PCC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.pcc)[0],
((unsigned long *) &kvm->arch.model.subfuncs.pcc)[1]);
VM_EVENT(kvm, 3, "SET: guest PPNO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.ppno)[0],
((unsigned long *) &kvm->arch.model.subfuncs.ppno)[1]);
VM_EVENT(kvm, 3, "SET: guest KMA subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kma)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kma)[1]);
return 0;
} }
static int kvm_s390_set_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr) static int kvm_s390_set_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
...@@ -1381,12 +1443,56 @@ static int kvm_s390_get_machine_feat(struct kvm *kvm, ...@@ -1381,12 +1443,56 @@ static int kvm_s390_get_machine_feat(struct kvm *kvm,
static int kvm_s390_get_processor_subfunc(struct kvm *kvm, static int kvm_s390_get_processor_subfunc(struct kvm *kvm,
struct kvm_device_attr *attr) struct kvm_device_attr *attr)
{ {
/* if (copy_to_user((void __user *)attr->addr, &kvm->arch.model.subfuncs,
* Once we can actually configure subfunctions (kernel + hw support), sizeof(struct kvm_s390_vm_cpu_subfunc)))
* we have to check if they were already set by user space, if so copy return -EFAULT;
* them from kvm->arch.
*/ VM_EVENT(kvm, 3, "GET: guest PLO subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
return -ENXIO; ((unsigned long *) &kvm->arch.model.subfuncs.plo)[0],
((unsigned long *) &kvm->arch.model.subfuncs.plo)[1],
((unsigned long *) &kvm->arch.model.subfuncs.plo)[2],
((unsigned long *) &kvm->arch.model.subfuncs.plo)[3]);
VM_EVENT(kvm, 3, "GET: guest PTFF subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.ptff)[0],
((unsigned long *) &kvm->arch.model.subfuncs.ptff)[1]);
VM_EVENT(kvm, 3, "GET: guest KMAC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmac)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmac)[1]);
VM_EVENT(kvm, 3, "GET: guest KMC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmc)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmc)[1]);
VM_EVENT(kvm, 3, "GET: guest KM subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.km)[0],
((unsigned long *) &kvm->arch.model.subfuncs.km)[1]);
VM_EVENT(kvm, 3, "GET: guest KIMD subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kimd)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kimd)[1]);
VM_EVENT(kvm, 3, "GET: guest KLMD subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.klmd)[0],
((unsigned long *) &kvm->arch.model.subfuncs.klmd)[1]);
VM_EVENT(kvm, 3, "GET: guest PCKMO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.pckmo)[0],
((unsigned long *) &kvm->arch.model.subfuncs.pckmo)[1]);
VM_EVENT(kvm, 3, "GET: guest KMCTR subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmctr)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmctr)[1]);
VM_EVENT(kvm, 3, "GET: guest KMF subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmf)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmf)[1]);
VM_EVENT(kvm, 3, "GET: guest KMO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kmo)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kmo)[1]);
VM_EVENT(kvm, 3, "GET: guest PCC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.pcc)[0],
((unsigned long *) &kvm->arch.model.subfuncs.pcc)[1]);
VM_EVENT(kvm, 3, "GET: guest PPNO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.ppno)[0],
((unsigned long *) &kvm->arch.model.subfuncs.ppno)[1]);
VM_EVENT(kvm, 3, "GET: guest KMA subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm->arch.model.subfuncs.kma)[0],
((unsigned long *) &kvm->arch.model.subfuncs.kma)[1]);
return 0;
} }
static int kvm_s390_get_machine_subfunc(struct kvm *kvm, static int kvm_s390_get_machine_subfunc(struct kvm *kvm,
...@@ -1395,8 +1501,55 @@ static int kvm_s390_get_machine_subfunc(struct kvm *kvm, ...@@ -1395,8 +1501,55 @@ static int kvm_s390_get_machine_subfunc(struct kvm *kvm,
if (copy_to_user((void __user *)attr->addr, &kvm_s390_available_subfunc, if (copy_to_user((void __user *)attr->addr, &kvm_s390_available_subfunc,
sizeof(struct kvm_s390_vm_cpu_subfunc))) sizeof(struct kvm_s390_vm_cpu_subfunc)))
return -EFAULT; return -EFAULT;
VM_EVENT(kvm, 3, "GET: host PLO subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.plo)[0],
((unsigned long *) &kvm_s390_available_subfunc.plo)[1],
((unsigned long *) &kvm_s390_available_subfunc.plo)[2],
((unsigned long *) &kvm_s390_available_subfunc.plo)[3]);
VM_EVENT(kvm, 3, "GET: host PTFF subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.ptff)[0],
((unsigned long *) &kvm_s390_available_subfunc.ptff)[1]);
VM_EVENT(kvm, 3, "GET: host KMAC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kmac)[0],
((unsigned long *) &kvm_s390_available_subfunc.kmac)[1]);
VM_EVENT(kvm, 3, "GET: host KMC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kmc)[0],
((unsigned long *) &kvm_s390_available_subfunc.kmc)[1]);
VM_EVENT(kvm, 3, "GET: host KM subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.km)[0],
((unsigned long *) &kvm_s390_available_subfunc.km)[1]);
VM_EVENT(kvm, 3, "GET: host KIMD subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kimd)[0],
((unsigned long *) &kvm_s390_available_subfunc.kimd)[1]);
VM_EVENT(kvm, 3, "GET: host KLMD subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.klmd)[0],
((unsigned long *) &kvm_s390_available_subfunc.klmd)[1]);
VM_EVENT(kvm, 3, "GET: host PCKMO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.pckmo)[0],
((unsigned long *) &kvm_s390_available_subfunc.pckmo)[1]);
VM_EVENT(kvm, 3, "GET: host KMCTR subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kmctr)[0],
((unsigned long *) &kvm_s390_available_subfunc.kmctr)[1]);
VM_EVENT(kvm, 3, "GET: host KMF subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kmf)[0],
((unsigned long *) &kvm_s390_available_subfunc.kmf)[1]);
VM_EVENT(kvm, 3, "GET: host KMO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kmo)[0],
((unsigned long *) &kvm_s390_available_subfunc.kmo)[1]);
VM_EVENT(kvm, 3, "GET: host PCC subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.pcc)[0],
((unsigned long *) &kvm_s390_available_subfunc.pcc)[1]);
VM_EVENT(kvm, 3, "GET: host PPNO subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.ppno)[0],
((unsigned long *) &kvm_s390_available_subfunc.ppno)[1]);
VM_EVENT(kvm, 3, "GET: host KMA subfunc 0x%16.16lx.%16.16lx",
((unsigned long *) &kvm_s390_available_subfunc.kma)[0],
((unsigned long *) &kvm_s390_available_subfunc.kma)[1]);
return 0; return 0;
} }
static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr) static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
{ {
int ret = -ENXIO; int ret = -ENXIO;
...@@ -1514,10 +1667,9 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr) ...@@ -1514,10 +1667,9 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
case KVM_S390_VM_CPU_PROCESSOR_FEAT: case KVM_S390_VM_CPU_PROCESSOR_FEAT:
case KVM_S390_VM_CPU_MACHINE_FEAT: case KVM_S390_VM_CPU_MACHINE_FEAT:
case KVM_S390_VM_CPU_MACHINE_SUBFUNC: case KVM_S390_VM_CPU_MACHINE_SUBFUNC:
case KVM_S390_VM_CPU_PROCESSOR_SUBFUNC:
ret = 0; ret = 0;
break; break;
/* configuring subfunctions is not supported yet */
case KVM_S390_VM_CPU_PROCESSOR_SUBFUNC:
default: default:
ret = -ENXIO; ret = -ENXIO;
break; break;
...@@ -2209,6 +2361,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) ...@@ -2209,6 +2361,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (!kvm->arch.sie_page2) if (!kvm->arch.sie_page2)
goto out_err; goto out_err;
kvm->arch.sie_page2->kvm = kvm;
kvm->arch.model.fac_list = kvm->arch.sie_page2->fac_list; kvm->arch.model.fac_list = kvm->arch.sie_page2->fac_list;
for (i = 0; i < kvm_s390_fac_size(); i++) { for (i = 0; i < kvm_s390_fac_size(); i++) {
...@@ -2218,6 +2371,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) ...@@ -2218,6 +2371,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm->arch.model.fac_list[i] = S390_lowcore.stfle_fac_list[i] & kvm->arch.model.fac_list[i] = S390_lowcore.stfle_fac_list[i] &
kvm_s390_fac_base[i]; kvm_s390_fac_base[i];
} }
kvm->arch.model.subfuncs = kvm_s390_available_subfunc;
/* we are always in czam mode - even on pre z14 machines */ /* we are always in czam mode - even on pre z14 machines */
set_kvm_facility(kvm->arch.model.fac_mask, 138); set_kvm_facility(kvm->arch.model.fac_mask, 138);
...@@ -2812,7 +2966,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, ...@@ -2812,7 +2966,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
vcpu->arch.sie_block->icpua = id; vcpu->arch.sie_block->icpua = id;
spin_lock_init(&vcpu->arch.local_int.lock); spin_lock_init(&vcpu->arch.local_int.lock);
vcpu->arch.sie_block->gd = (u32)(u64)kvm->arch.gisa; vcpu->arch.sie_block->gd = (u32)(u64)kvm->arch.gisa_int.origin;
if (vcpu->arch.sie_block->gd && sclp.has_gisaf) if (vcpu->arch.sie_block->gd && sclp.has_gisaf)
vcpu->arch.sie_block->gd |= GISA_FORMAT1; vcpu->arch.sie_block->gd |= GISA_FORMAT1;
seqcount_init(&vcpu->arch.cputm_seqcount); seqcount_init(&vcpu->arch.cputm_seqcount);
...@@ -3458,6 +3612,8 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu) ...@@ -3458,6 +3612,8 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu)
kvm_s390_patch_guest_per_regs(vcpu); kvm_s390_patch_guest_per_regs(vcpu);
} }
clear_bit(vcpu->vcpu_id, vcpu->kvm->arch.gisa_int.kicked_mask);
vcpu->arch.sie_block->icptcode = 0; vcpu->arch.sie_block->icptcode = 0;
cpuflags = atomic_read(&vcpu->arch.sie_block->cpuflags); cpuflags = atomic_read(&vcpu->arch.sie_block->cpuflags);
VCPU_EVENT(vcpu, 6, "entering sie flags %x", cpuflags); VCPU_EVENT(vcpu, 6, "entering sie flags %x", cpuflags);
...@@ -4293,12 +4449,12 @@ static int __init kvm_s390_init(void) ...@@ -4293,12 +4449,12 @@ static int __init kvm_s390_init(void)
int i; int i;
if (!sclp.has_sief2) { if (!sclp.has_sief2) {
pr_info("SIE not available\n"); pr_info("SIE is not available\n");
return -ENODEV; return -ENODEV;
} }
if (nested && hpage) { if (nested && hpage) {
pr_info("nested (vSIE) and hpage (huge page backing) can currently not be activated concurrently"); pr_info("A KVM host that supports nesting cannot back its KVM guests with huge pages\n");
return -EINVAL; return -EINVAL;
} }
......
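
kvm_s390_set_processor_subfunc() now really stores the user-supplied subfunction blocks, but only while the VM has no vCPUs; afterwards it fails with -EBUSY so the CPU model cannot change under running vCPUs. A reduced sketch of that gating pattern (the error codes mirror the hunk, while the locking and copy_from_user() are modelled with a flag and a memcpy):

#include <errno.h>
#include <stdio.h>
#include <string.h>

struct cpu_subfunc { unsigned char blob[256]; };

struct vm {
	int created_vcpus;
	struct cpu_subfunc subfuncs;
};

static int set_processor_subfunc(struct vm *kvm, const struct cpu_subfunc *user)
{
	if (kvm->created_vcpus)
		return -EBUSY;		/* model is frozen once vCPUs exist */

	/* copy_from_user() in the kernel; a plain memcpy in this sketch */
	memcpy(&kvm->subfuncs, user, sizeof(*user));
	return 0;
}

int main(void)
{
	struct vm kvm = { .created_vcpus = 0 };
	struct cpu_subfunc cfg = { .blob = { 0x80 } };

	printf("before vcpus: %d\n", set_processor_subfunc(&kvm, &cfg));	/* 0 */
	kvm.created_vcpus = 1;
	printf("after vcpus:  %d\n", set_processor_subfunc(&kvm, &cfg));	/* -EBUSY */
	return 0;
}
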
...@@ -67,7 +67,7 @@ static inline int is_vcpu_stopped(struct kvm_vcpu *vcpu) ...@@ -67,7 +67,7 @@ static inline int is_vcpu_stopped(struct kvm_vcpu *vcpu)
static inline int is_vcpu_idle(struct kvm_vcpu *vcpu) static inline int is_vcpu_idle(struct kvm_vcpu *vcpu)
{ {
return test_bit(vcpu->vcpu_id, vcpu->kvm->arch.float_int.idle_mask); return test_bit(vcpu->vcpu_id, vcpu->kvm->arch.idle_mask);
} }
static inline int kvm_is_ucontrol(struct kvm *kvm) static inline int kvm_is_ucontrol(struct kvm *kvm)
...@@ -381,6 +381,8 @@ int kvm_s390_get_irq_state(struct kvm_vcpu *vcpu, ...@@ -381,6 +381,8 @@ int kvm_s390_get_irq_state(struct kvm_vcpu *vcpu,
void kvm_s390_gisa_init(struct kvm *kvm); void kvm_s390_gisa_init(struct kvm *kvm);
void kvm_s390_gisa_clear(struct kvm *kvm); void kvm_s390_gisa_clear(struct kvm *kvm);
void kvm_s390_gisa_destroy(struct kvm *kvm); void kvm_s390_gisa_destroy(struct kvm *kvm);
int kvm_s390_gib_init(u8 nisc);
void kvm_s390_gib_destroy(void);
/* implemented in guestdbg.c */ /* implemented in guestdbg.c */
void kvm_s390_backup_guest_per_regs(struct kvm_vcpu *vcpu); void kvm_s390_backup_guest_per_regs(struct kvm_vcpu *vcpu);
......
...@@ -35,6 +35,7 @@ ...@@ -35,6 +35,7 @@
#include <asm/msr-index.h> #include <asm/msr-index.h>
#include <asm/asm.h> #include <asm/asm.h>
#include <asm/kvm_page_track.h> #include <asm/kvm_page_track.h>
#include <asm/kvm_vcpu_regs.h>
#include <asm/hyperv-tlfs.h> #include <asm/hyperv-tlfs.h>
#define KVM_MAX_VCPUS 288 #define KVM_MAX_VCPUS 288
...@@ -137,23 +138,23 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level) ...@@ -137,23 +138,23 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
#define ASYNC_PF_PER_VCPU 64 #define ASYNC_PF_PER_VCPU 64
enum kvm_reg { enum kvm_reg {
VCPU_REGS_RAX = 0, VCPU_REGS_RAX = __VCPU_REGS_RAX,
VCPU_REGS_RCX = 1, VCPU_REGS_RCX = __VCPU_REGS_RCX,
VCPU_REGS_RDX = 2, VCPU_REGS_RDX = __VCPU_REGS_RDX,
VCPU_REGS_RBX = 3, VCPU_REGS_RBX = __VCPU_REGS_RBX,
VCPU_REGS_RSP = 4, VCPU_REGS_RSP = __VCPU_REGS_RSP,
VCPU_REGS_RBP = 5, VCPU_REGS_RBP = __VCPU_REGS_RBP,
VCPU_REGS_RSI = 6, VCPU_REGS_RSI = __VCPU_REGS_RSI,
VCPU_REGS_RDI = 7, VCPU_REGS_RDI = __VCPU_REGS_RDI,
#ifdef CONFIG_X86_64 #ifdef CONFIG_X86_64
VCPU_REGS_R8 = 8, VCPU_REGS_R8 = __VCPU_REGS_R8,
VCPU_REGS_R9 = 9, VCPU_REGS_R9 = __VCPU_REGS_R9,
VCPU_REGS_R10 = 10, VCPU_REGS_R10 = __VCPU_REGS_R10,
VCPU_REGS_R11 = 11, VCPU_REGS_R11 = __VCPU_REGS_R11,
VCPU_REGS_R12 = 12, VCPU_REGS_R12 = __VCPU_REGS_R12,
VCPU_REGS_R13 = 13, VCPU_REGS_R13 = __VCPU_REGS_R13,
VCPU_REGS_R14 = 14, VCPU_REGS_R14 = __VCPU_REGS_R14,
VCPU_REGS_R15 = 15, VCPU_REGS_R15 = __VCPU_REGS_R15,
#endif #endif
VCPU_REGS_RIP, VCPU_REGS_RIP,
NR_VCPU_REGS NR_VCPU_REGS
...@@ -319,6 +320,7 @@ struct kvm_mmu_page { ...@@ -319,6 +320,7 @@ struct kvm_mmu_page {
struct list_head link; struct list_head link;
struct hlist_node hash_link; struct hlist_node hash_link;
bool unsync; bool unsync;
bool mmio_cached;
/* /*
* The following two entries are used to key the shadow page in the * The following two entries are used to key the shadow page in the
...@@ -333,10 +335,6 @@ struct kvm_mmu_page { ...@@ -333,10 +335,6 @@ struct kvm_mmu_page {
int root_count; /* Currently serving as active root */ int root_count; /* Currently serving as active root */
unsigned int unsync_children; unsigned int unsync_children;
struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */ struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
/* The page is obsolete if mmu_valid_gen != kvm->arch.mmu_valid_gen. */
unsigned long mmu_valid_gen;
DECLARE_BITMAP(unsync_child_bitmap, 512); DECLARE_BITMAP(unsync_child_bitmap, 512);
#ifdef CONFIG_X86_32 #ifdef CONFIG_X86_32
...@@ -848,13 +846,11 @@ struct kvm_arch { ...@@ -848,13 +846,11 @@ struct kvm_arch {
unsigned int n_requested_mmu_pages; unsigned int n_requested_mmu_pages;
unsigned int n_max_mmu_pages; unsigned int n_max_mmu_pages;
unsigned int indirect_shadow_pages; unsigned int indirect_shadow_pages;
unsigned long mmu_valid_gen;
struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES]; struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
/* /*
* Hash table of struct kvm_mmu_page. * Hash table of struct kvm_mmu_page.
*/ */
struct list_head active_mmu_pages; struct list_head active_mmu_pages;
struct list_head zapped_obsolete_pages;
struct kvm_page_track_notifier_node mmu_sp_tracker; struct kvm_page_track_notifier_node mmu_sp_tracker;
struct kvm_page_track_notifier_head track_notifier_head; struct kvm_page_track_notifier_head track_notifier_head;
...@@ -1255,7 +1251,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm, ...@@ -1255,7 +1251,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot, struct kvm_memory_slot *slot,
gfn_t gfn_offset, unsigned long mask); gfn_t gfn_offset, unsigned long mask);
void kvm_mmu_zap_all(struct kvm *kvm); void kvm_mmu_zap_all(struct kvm *kvm);
void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, struct kvm_memslots *slots); void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen);
unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm); unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages); void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
......
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_X86_KVM_VCPU_REGS_H
#define _ASM_X86_KVM_VCPU_REGS_H
#define __VCPU_REGS_RAX 0
#define __VCPU_REGS_RCX 1
#define __VCPU_REGS_RDX 2
#define __VCPU_REGS_RBX 3
#define __VCPU_REGS_RSP 4
#define __VCPU_REGS_RBP 5
#define __VCPU_REGS_RSI 6
#define __VCPU_REGS_RDI 7
#ifdef CONFIG_X86_64
#define __VCPU_REGS_R8 8
#define __VCPU_REGS_R9 9
#define __VCPU_REGS_R10 10
#define __VCPU_REGS_R11 11
#define __VCPU_REGS_R12 12
#define __VCPU_REGS_R13 13
#define __VCPU_REGS_R14 14
#define __VCPU_REGS_R15 15
#endif
#endif /* _ASM_X86_KVM_VCPU_REGS_H */
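
The new asm/kvm_vcpu_regs.h exists so that assembly code and C share one set of register indices; enum kvm_reg in kvm_host.h is now defined in terms of these macros, as the hunk above shows. A trivial compile-time check that the two stay in sync (the enum is re-declared locally for the sketch, with only a few entries):

#define __VCPU_REGS_RAX 0
#define __VCPU_REGS_RCX 1
#define __VCPU_REGS_RDX 2
#define __VCPU_REGS_RBX 3

enum kvm_reg_sketch {
	VCPU_REGS_RAX = __VCPU_REGS_RAX,
	VCPU_REGS_RCX = __VCPU_REGS_RCX,
	VCPU_REGS_RDX = __VCPU_REGS_RDX,
	VCPU_REGS_RBX = __VCPU_REGS_RBX,
};

_Static_assert(VCPU_REGS_RDX == 2, "asm and C register indices must match");

int main(void) { return 0; }
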
...@@ -104,12 +104,8 @@ static u64 kvm_sched_clock_read(void) ...@@ -104,12 +104,8 @@ static u64 kvm_sched_clock_read(void)
static inline void kvm_sched_clock_init(bool stable) static inline void kvm_sched_clock_init(bool stable)
{ {
if (!stable) { if (!stable)
pv_ops.time.sched_clock = kvm_clock_read;
clear_sched_clock_stable(); clear_sched_clock_stable();
return;
}
kvm_sched_clock_offset = kvm_clock_read(); kvm_sched_clock_offset = kvm_clock_read();
pv_ops.time.sched_clock = kvm_sched_clock_read; pv_ops.time.sched_clock = kvm_sched_clock_read;
...@@ -355,6 +351,20 @@ void __init kvmclock_init(void) ...@@ -355,6 +351,20 @@ void __init kvmclock_init(void)
machine_ops.crash_shutdown = kvm_crash_shutdown; machine_ops.crash_shutdown = kvm_crash_shutdown;
#endif #endif
kvm_get_preset_lpj(); kvm_get_preset_lpj();
/*
 * X86_FEATURE_NONSTOP_TSC means the TSC runs at a constant rate
 * across P- and T-state transitions and does not stop in deep
 * C-states.
 *
 * An invariant TSC exposed by the host means kvmclock is not
 * necessary: the TSC can be used as the clocksource instead.
 */
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
!check_tsc_unstable())
kvm_clock.rating = 299;
clocksource_register_hz(&kvm_clock, NSEC_PER_SEC); clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
pv_info.name = "KVM"; pv_info.name = "KVM";
} }
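
Lowering kvm_clock.rating to 299 when the host exposes a constant, non-stop, stable TSC lets the regular TSC clocksource (normally rated 300) win the clocksource selection, so such guests avoid the extra kvmclock indirection. A sketch of how the rating comparison plays out (the 299 comes from the hunk; the TSC rating of 300 and the kvmclock default of 400 are the usual kernel values and are stated here as assumptions):

#include <stdio.h>

struct clocksource { const char *name; int rating; };

/* Highest rating wins, mirroring the kernel's clocksource selection. */
static const struct clocksource *select(const struct clocksource *a,
					const struct clocksource *b)
{
	return a->rating >= b->rating ? a : b;
}

int main(void)
{
	struct clocksource tsc = { "tsc", 300 };
	struct clocksource kvmclock = { "kvm-clock", 299 };	/* default rating is 400 */

	printf("selected: %s\n", select(&tsc, &kvmclock)->name);
	return 0;
}
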
...@@ -405,7 +405,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, ...@@ -405,7 +405,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
F(CLDEMOTE); F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B);
/* cpuid 7.0.edx*/ /* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features = const u32 kvm_cpuid_7_0_edx_x86_features =
......
...@@ -1729,7 +1729,7 @@ static int kvm_hv_eventfd_assign(struct kvm *kvm, u32 conn_id, int fd) ...@@ -1729,7 +1729,7 @@ static int kvm_hv_eventfd_assign(struct kvm *kvm, u32 conn_id, int fd)
mutex_lock(&hv->hv_lock); mutex_lock(&hv->hv_lock);
ret = idr_alloc(&hv->conn_to_evt, eventfd, conn_id, conn_id + 1, ret = idr_alloc(&hv->conn_to_evt, eventfd, conn_id, conn_id + 1,
GFP_KERNEL); GFP_KERNEL_ACCOUNT);
mutex_unlock(&hv->hv_lock); mutex_unlock(&hv->hv_lock);
if (ret >= 0) if (ret >= 0)
......
...@@ -653,7 +653,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags) ...@@ -653,7 +653,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
pid_t pid_nr; pid_t pid_nr;
int ret; int ret;
pit = kzalloc(sizeof(struct kvm_pit), GFP_KERNEL); pit = kzalloc(sizeof(struct kvm_pit), GFP_KERNEL_ACCOUNT);
if (!pit) if (!pit)
return NULL; return NULL;
......
...@@ -583,7 +583,7 @@ int kvm_pic_init(struct kvm *kvm) ...@@ -583,7 +583,7 @@ int kvm_pic_init(struct kvm *kvm)
struct kvm_pic *s; struct kvm_pic *s;
int ret; int ret;
s = kzalloc(sizeof(struct kvm_pic), GFP_KERNEL); s = kzalloc(sizeof(struct kvm_pic), GFP_KERNEL_ACCOUNT);
if (!s) if (!s)
return -ENOMEM; return -ENOMEM;
spin_lock_init(&s->lock); spin_lock_init(&s->lock);
......
...@@ -622,7 +622,7 @@ int kvm_ioapic_init(struct kvm *kvm) ...@@ -622,7 +622,7 @@ int kvm_ioapic_init(struct kvm *kvm)
struct kvm_ioapic *ioapic; struct kvm_ioapic *ioapic;
int ret; int ret;
ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL); ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL_ACCOUNT);
if (!ioapic) if (!ioapic)
return -ENOMEM; return -ENOMEM;
spin_lock_init(&ioapic->lock); spin_lock_init(&ioapic->lock);
......
...@@ -181,7 +181,8 @@ static void recalculate_apic_map(struct kvm *kvm) ...@@ -181,7 +181,8 @@ static void recalculate_apic_map(struct kvm *kvm)
max_id = max(max_id, kvm_x2apic_id(vcpu->arch.apic)); max_id = max(max_id, kvm_x2apic_id(vcpu->arch.apic));
new = kvzalloc(sizeof(struct kvm_apic_map) + new = kvzalloc(sizeof(struct kvm_apic_map) +
sizeof(struct kvm_lapic *) * ((u64)max_id + 1), GFP_KERNEL); sizeof(struct kvm_lapic *) * ((u64)max_id + 1),
GFP_KERNEL_ACCOUNT);
if (!new) if (!new)
goto out; goto out;
...@@ -2259,13 +2260,13 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu) ...@@ -2259,13 +2260,13 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
ASSERT(vcpu != NULL); ASSERT(vcpu != NULL);
apic_debug("apic_init %d\n", vcpu->vcpu_id); apic_debug("apic_init %d\n", vcpu->vcpu_id);
apic = kzalloc(sizeof(*apic), GFP_KERNEL); apic = kzalloc(sizeof(*apic), GFP_KERNEL_ACCOUNT);
if (!apic) if (!apic)
goto nomem; goto nomem;
vcpu->arch.apic = apic; vcpu->arch.apic = apic;
apic->regs = (void *)get_zeroed_page(GFP_KERNEL); apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
if (!apic->regs) { if (!apic->regs) {
printk(KERN_ERR "malloc apic regs error for vcpu %x\n", printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
vcpu->vcpu_id); vcpu->vcpu_id);
......
This diff has been collapsed.
...@@ -203,7 +203,6 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, ...@@ -203,7 +203,6 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
return -(u32)fault & errcode; return -(u32)fault & errcode;
} }
void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm);
void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end); void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
......
...@@ -8,18 +8,16 @@ ...@@ -8,18 +8,16 @@
#undef TRACE_SYSTEM #undef TRACE_SYSTEM
#define TRACE_SYSTEM kvmmmu #define TRACE_SYSTEM kvmmmu
#define KVM_MMU_PAGE_FIELDS \ #define KVM_MMU_PAGE_FIELDS \
__field(unsigned long, mmu_valid_gen) \ __field(__u64, gfn) \
__field(__u64, gfn) \ __field(__u32, role) \
__field(__u32, role) \ __field(__u32, root_count) \
__field(__u32, root_count) \
__field(bool, unsync) __field(bool, unsync)
#define KVM_MMU_PAGE_ASSIGN(sp) \ #define KVM_MMU_PAGE_ASSIGN(sp) \
__entry->mmu_valid_gen = sp->mmu_valid_gen; \ __entry->gfn = sp->gfn; \
__entry->gfn = sp->gfn; \ __entry->role = sp->role.word; \
__entry->role = sp->role.word; \ __entry->root_count = sp->root_count; \
__entry->root_count = sp->root_count; \
__entry->unsync = sp->unsync; __entry->unsync = sp->unsync;
#define KVM_MMU_PAGE_PRINTK() ({ \ #define KVM_MMU_PAGE_PRINTK() ({ \
...@@ -31,9 +29,8 @@ ...@@ -31,9 +29,8 @@
\ \
role.word = __entry->role; \ role.word = __entry->role; \
\ \
trace_seq_printf(p, "sp gen %lx gfn %llx l%u%s q%u%s %s%s" \ trace_seq_printf(p, "sp gfn %llx l%u%s q%u%s %s%s" \
" %snxe %sad root %u %s%c", \ " %snxe %sad root %u %s%c", \
__entry->mmu_valid_gen, \
__entry->gfn, role.level, \ __entry->gfn, role.level, \
role.cr4_pae ? " pae" : "", \ role.cr4_pae ? " pae" : "", \
role.quadrant, \ role.quadrant, \
...@@ -282,27 +279,6 @@ TRACE_EVENT( ...@@ -282,27 +279,6 @@ TRACE_EVENT(
) )
); );
TRACE_EVENT(
kvm_mmu_invalidate_zap_all_pages,
TP_PROTO(struct kvm *kvm),
TP_ARGS(kvm),
TP_STRUCT__entry(
__field(unsigned long, mmu_valid_gen)
__field(unsigned int, mmu_used_pages)
),
TP_fast_assign(
__entry->mmu_valid_gen = kvm->arch.mmu_valid_gen;
__entry->mmu_used_pages = kvm->arch.n_used_mmu_pages;
),
TP_printk("kvm-mmu-valid-gen %lx used_pages %x",
__entry->mmu_valid_gen, __entry->mmu_used_pages
)
);
TRACE_EVENT( TRACE_EVENT(
check_mmio_spte, check_mmio_spte,
TP_PROTO(u64 spte, unsigned int kvm_gen, unsigned int spte_gen), TP_PROTO(u64 spte, unsigned int kvm_gen, unsigned int spte_gen),
......
...@@ -42,7 +42,7 @@ int kvm_page_track_create_memslot(struct kvm_memory_slot *slot, ...@@ -42,7 +42,7 @@ int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
for (i = 0; i < KVM_PAGE_TRACK_MAX; i++) { for (i = 0; i < KVM_PAGE_TRACK_MAX; i++) {
slot->arch.gfn_track[i] = slot->arch.gfn_track[i] =
kvcalloc(npages, sizeof(*slot->arch.gfn_track[i]), kvcalloc(npages, sizeof(*slot->arch.gfn_track[i]),
GFP_KERNEL); GFP_KERNEL_ACCOUNT);
if (!slot->arch.gfn_track[i]) if (!slot->arch.gfn_track[i])
goto track_free; goto track_free;
} }
......
...@@ -145,7 +145,6 @@ struct kvm_svm { ...@@ -145,7 +145,6 @@ struct kvm_svm {
/* Struct members for AVIC */ /* Struct members for AVIC */
u32 avic_vm_id; u32 avic_vm_id;
u32 ldr_mode;
struct page *avic_logical_id_table_page; struct page *avic_logical_id_table_page;
struct page *avic_physical_id_table_page; struct page *avic_physical_id_table_page;
struct hlist_node hnode; struct hlist_node hnode;
...@@ -236,6 +235,7 @@ struct vcpu_svm { ...@@ -236,6 +235,7 @@ struct vcpu_svm {
bool nrips_enabled : 1; bool nrips_enabled : 1;
u32 ldr_reg; u32 ldr_reg;
u32 dfr_reg;
struct page *avic_backing_page; struct page *avic_backing_page;
u64 *avic_physical_id_cache; u64 *avic_physical_id_cache;
bool avic_is_running; bool avic_is_running;
...@@ -1795,9 +1795,10 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr, ...@@ -1795,9 +1795,10 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
/* Avoid using vmalloc for smaller buffers. */ /* Avoid using vmalloc for smaller buffers. */
size = npages * sizeof(struct page *); size = npages * sizeof(struct page *);
if (size > PAGE_SIZE) if (size > PAGE_SIZE)
pages = vmalloc(size); pages = __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_ZERO,
PAGE_KERNEL);
else else
pages = kmalloc(size, GFP_KERNEL); pages = kmalloc(size, GFP_KERNEL_ACCOUNT);
if (!pages) if (!pages)
return NULL; return NULL;
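Besides switching to an accounted allocation, the sev_pin_memory() hunk above keeps the existing small/large split: page-or-smaller buffers come from the slab allocator, larger ones from vmalloc space. A minimal sketch of that pattern with a hypothetical wrapper name (the three-argument __vmalloc() matches this kernel generation; later kernels dropped the pgprot argument):

#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *charged_buf_alloc(size_t size)
{
	if (size > PAGE_SIZE)
		return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_ZERO,
				 PAGE_KERNEL);
	return kzalloc(size, GFP_KERNEL_ACCOUNT);
}

static void charged_buf_free(void *buf)
{
	kvfree(buf);	/* works for both kmalloc and vmalloc memory */
}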
...@@ -1865,7 +1866,9 @@ static void __unregister_enc_region_locked(struct kvm *kvm, ...@@ -1865,7 +1866,9 @@ static void __unregister_enc_region_locked(struct kvm *kvm,
static struct kvm *svm_vm_alloc(void) static struct kvm *svm_vm_alloc(void)
{ {
struct kvm_svm *kvm_svm = vzalloc(sizeof(struct kvm_svm)); struct kvm_svm *kvm_svm = __vmalloc(sizeof(struct kvm_svm),
GFP_KERNEL_ACCOUNT | __GFP_ZERO,
PAGE_KERNEL);
return &kvm_svm->kvm; return &kvm_svm->kvm;
} }
...@@ -1940,7 +1943,7 @@ static int avic_vm_init(struct kvm *kvm) ...@@ -1940,7 +1943,7 @@ static int avic_vm_init(struct kvm *kvm)
return 0; return 0;
/* Allocating physical APIC ID table (4KB) */ /* Allocating physical APIC ID table (4KB) */
p_page = alloc_page(GFP_KERNEL); p_page = alloc_page(GFP_KERNEL_ACCOUNT);
if (!p_page) if (!p_page)
goto free_avic; goto free_avic;
...@@ -1948,7 +1951,7 @@ static int avic_vm_init(struct kvm *kvm) ...@@ -1948,7 +1951,7 @@ static int avic_vm_init(struct kvm *kvm)
clear_page(page_address(p_page)); clear_page(page_address(p_page));
/* Allocating logical APIC ID table (4KB) */ /* Allocating logical APIC ID table (4KB) */
l_page = alloc_page(GFP_KERNEL); l_page = alloc_page(GFP_KERNEL_ACCOUNT);
if (!l_page) if (!l_page)
goto free_avic; goto free_avic;
...@@ -2106,6 +2109,7 @@ static int avic_init_vcpu(struct vcpu_svm *svm) ...@@ -2106,6 +2109,7 @@ static int avic_init_vcpu(struct vcpu_svm *svm)
INIT_LIST_HEAD(&svm->ir_list); INIT_LIST_HEAD(&svm->ir_list);
spin_lock_init(&svm->ir_list_lock); spin_lock_init(&svm->ir_list_lock);
svm->dfr_reg = APIC_DFR_FLAT;
return ret; return ret;
} }
...@@ -2119,13 +2123,14 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) ...@@ -2119,13 +2123,14 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
struct page *nested_msrpm_pages; struct page *nested_msrpm_pages;
int err; int err;
svm = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL); svm = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL_ACCOUNT);
if (!svm) { if (!svm) {
err = -ENOMEM; err = -ENOMEM;
goto out; goto out;
} }
svm->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, GFP_KERNEL); svm->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache,
GFP_KERNEL_ACCOUNT);
if (!svm->vcpu.arch.guest_fpu) { if (!svm->vcpu.arch.guest_fpu) {
printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n");
err = -ENOMEM; err = -ENOMEM;
...@@ -2137,19 +2142,19 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) ...@@ -2137,19 +2142,19 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
goto free_svm; goto free_svm;
err = -ENOMEM; err = -ENOMEM;
page = alloc_page(GFP_KERNEL); page = alloc_page(GFP_KERNEL_ACCOUNT);
if (!page) if (!page)
goto uninit; goto uninit;
msrpm_pages = alloc_pages(GFP_KERNEL, MSRPM_ALLOC_ORDER); msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
if (!msrpm_pages) if (!msrpm_pages)
goto free_page1; goto free_page1;
nested_msrpm_pages = alloc_pages(GFP_KERNEL, MSRPM_ALLOC_ORDER); nested_msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
if (!nested_msrpm_pages) if (!nested_msrpm_pages)
goto free_page2; goto free_page2;
hsave_page = alloc_page(GFP_KERNEL); hsave_page = alloc_page(GFP_KERNEL_ACCOUNT);
if (!hsave_page) if (!hsave_page)
goto free_page3; goto free_page3;
...@@ -4565,8 +4570,7 @@ static u32 *avic_get_logical_id_entry(struct kvm_vcpu *vcpu, u32 ldr, bool flat) ...@@ -4565,8 +4570,7 @@ static u32 *avic_get_logical_id_entry(struct kvm_vcpu *vcpu, u32 ldr, bool flat)
return &logical_apic_id_table[index]; return &logical_apic_id_table[index];
} }
static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr, static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr)
bool valid)
{ {
bool flat; bool flat;
u32 *entry, new_entry; u32 *entry, new_entry;
...@@ -4579,31 +4583,39 @@ static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr, ...@@ -4579,31 +4583,39 @@ static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr,
new_entry = READ_ONCE(*entry); new_entry = READ_ONCE(*entry);
new_entry &= ~AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK; new_entry &= ~AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
new_entry |= (g_physical_id & AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK); new_entry |= (g_physical_id & AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK);
if (valid) new_entry |= AVIC_LOGICAL_ID_ENTRY_VALID_MASK;
new_entry |= AVIC_LOGICAL_ID_ENTRY_VALID_MASK;
else
new_entry &= ~AVIC_LOGICAL_ID_ENTRY_VALID_MASK;
WRITE_ONCE(*entry, new_entry); WRITE_ONCE(*entry, new_entry);
return 0; return 0;
} }
static void avic_invalidate_logical_id_entry(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
bool flat = svm->dfr_reg == APIC_DFR_FLAT;
u32 *entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
if (entry)
WRITE_ONCE(*entry, (u32) ~AVIC_LOGICAL_ID_ENTRY_VALID_MASK);
}
static int avic_handle_ldr_update(struct kvm_vcpu *vcpu) static int avic_handle_ldr_update(struct kvm_vcpu *vcpu)
{ {
int ret; int ret = 0;
struct vcpu_svm *svm = to_svm(vcpu); struct vcpu_svm *svm = to_svm(vcpu);
u32 ldr = kvm_lapic_get_reg(vcpu->arch.apic, APIC_LDR); u32 ldr = kvm_lapic_get_reg(vcpu->arch.apic, APIC_LDR);
if (!ldr) if (ldr == svm->ldr_reg)
return 1; return 0;
ret = avic_ldr_write(vcpu, vcpu->vcpu_id, ldr, true); avic_invalidate_logical_id_entry(vcpu);
if (ret && svm->ldr_reg) {
avic_ldr_write(vcpu, 0, svm->ldr_reg, false); if (ldr)
svm->ldr_reg = 0; ret = avic_ldr_write(vcpu, vcpu->vcpu_id, ldr);
} else {
if (!ret)
svm->ldr_reg = ldr; svm->ldr_reg = ldr;
}
return ret; return ret;
} }
...@@ -4637,27 +4649,16 @@ static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu) ...@@ -4637,27 +4649,16 @@ static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu)
return 0; return 0;
} }
static int avic_handle_dfr_update(struct kvm_vcpu *vcpu) static void avic_handle_dfr_update(struct kvm_vcpu *vcpu)
{ {
struct vcpu_svm *svm = to_svm(vcpu); struct vcpu_svm *svm = to_svm(vcpu);
struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);
u32 dfr = kvm_lapic_get_reg(vcpu->arch.apic, APIC_DFR); u32 dfr = kvm_lapic_get_reg(vcpu->arch.apic, APIC_DFR);
u32 mod = (dfr >> 28) & 0xf;
/* if (svm->dfr_reg == dfr)
* We assume that all local APICs are using the same type. return;
* If this changes, we need to flush the AVIC logical
* APID id table.
*/
if (kvm_svm->ldr_mode == mod)
return 0;
clear_page(page_address(kvm_svm->avic_logical_id_table_page));
kvm_svm->ldr_mode = mod;
if (svm->ldr_reg) avic_invalidate_logical_id_entry(vcpu);
avic_handle_ldr_update(vcpu); svm->dfr_reg = dfr;
return 0;
} }
static int avic_unaccel_trap_write(struct vcpu_svm *svm) static int avic_unaccel_trap_write(struct vcpu_svm *svm)
...@@ -5125,11 +5126,11 @@ static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu) ...@@ -5125,11 +5126,11 @@ static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu); struct vcpu_svm *svm = to_svm(vcpu);
struct vmcb *vmcb = svm->vmcb; struct vmcb *vmcb = svm->vmcb;
if (!kvm_vcpu_apicv_active(&svm->vcpu)) if (kvm_vcpu_apicv_active(vcpu))
return; vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
else
vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK; vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
mark_dirty(vmcb, VMCB_INTR); mark_dirty(vmcb, VMCB_AVIC);
} }
static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
...@@ -5195,7 +5196,7 @@ static int svm_ir_list_add(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi) ...@@ -5195,7 +5196,7 @@ static int svm_ir_list_add(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
* Allocating new amd_iommu_pi_data, which will be * Allocating new amd_iommu_pi_data, which will be
* added to the per-vcpu ir_list. * added to the per-vcpu ir_list.
*/ */
ir = kzalloc(sizeof(struct amd_svm_iommu_ir), GFP_KERNEL); ir = kzalloc(sizeof(struct amd_svm_iommu_ir), GFP_KERNEL_ACCOUNT);
if (!ir) { if (!ir) {
ret = -ENOMEM; ret = -ENOMEM;
goto out; goto out;
...@@ -6163,8 +6164,7 @@ static inline void avic_post_state_restore(struct kvm_vcpu *vcpu) ...@@ -6163,8 +6164,7 @@ static inline void avic_post_state_restore(struct kvm_vcpu *vcpu)
{ {
if (avic_handle_apic_id_update(vcpu) != 0) if (avic_handle_apic_id_update(vcpu) != 0)
return; return;
if (avic_handle_dfr_update(vcpu) != 0) avic_handle_dfr_update(vcpu);
return;
avic_handle_ldr_update(vcpu); avic_handle_ldr_update(vcpu);
} }
...@@ -6311,7 +6311,7 @@ static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error) ...@@ -6311,7 +6311,7 @@ static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
if (ret) if (ret)
return ret; return ret;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
return -ENOMEM; return -ENOMEM;
...@@ -6361,7 +6361,7 @@ static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp) ...@@ -6361,7 +6361,7 @@ static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params))) if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params)))
return -EFAULT; return -EFAULT;
start = kzalloc(sizeof(*start), GFP_KERNEL); start = kzalloc(sizeof(*start), GFP_KERNEL_ACCOUNT);
if (!start) if (!start)
return -ENOMEM; return -ENOMEM;
...@@ -6458,7 +6458,7 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp) ...@@ -6458,7 +6458,7 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params))) if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params)))
return -EFAULT; return -EFAULT;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
return -ENOMEM; return -ENOMEM;
...@@ -6535,7 +6535,7 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp) ...@@ -6535,7 +6535,7 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
if (copy_from_user(&params, measure, sizeof(params))) if (copy_from_user(&params, measure, sizeof(params)))
return -EFAULT; return -EFAULT;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
return -ENOMEM; return -ENOMEM;
...@@ -6597,7 +6597,7 @@ static int sev_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp) ...@@ -6597,7 +6597,7 @@ static int sev_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
if (!sev_guest(kvm)) if (!sev_guest(kvm))
return -ENOTTY; return -ENOTTY;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
return -ENOMEM; return -ENOMEM;
...@@ -6618,7 +6618,7 @@ static int sev_guest_status(struct kvm *kvm, struct kvm_sev_cmd *argp) ...@@ -6618,7 +6618,7 @@ static int sev_guest_status(struct kvm *kvm, struct kvm_sev_cmd *argp)
if (!sev_guest(kvm)) if (!sev_guest(kvm))
return -ENOTTY; return -ENOTTY;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
return -ENOMEM; return -ENOMEM;
...@@ -6646,7 +6646,7 @@ static int __sev_issue_dbg_cmd(struct kvm *kvm, unsigned long src, ...@@ -6646,7 +6646,7 @@ static int __sev_issue_dbg_cmd(struct kvm *kvm, unsigned long src,
struct sev_data_dbg *data; struct sev_data_dbg *data;
int ret; int ret;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
return -ENOMEM; return -ENOMEM;
...@@ -6901,7 +6901,7 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp) ...@@ -6901,7 +6901,7 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
} }
ret = -ENOMEM; ret = -ENOMEM;
data = kzalloc(sizeof(*data), GFP_KERNEL); data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
if (!data) if (!data)
goto e_unpin_memory; goto e_unpin_memory;
...@@ -7007,7 +7007,7 @@ static int svm_register_enc_region(struct kvm *kvm, ...@@ -7007,7 +7007,7 @@ static int svm_register_enc_region(struct kvm *kvm,
if (range->addr > ULONG_MAX || range->size > ULONG_MAX) if (range->addr > ULONG_MAX || range->size > ULONG_MAX)
return -EINVAL; return -EINVAL;
region = kzalloc(sizeof(*region), GFP_KERNEL); region = kzalloc(sizeof(*region), GFP_KERNEL_ACCOUNT);
if (!region) if (!region)
return -ENOMEM; return -ENOMEM;
......
...@@ -211,7 +211,6 @@ static void free_nested(struct kvm_vcpu *vcpu) ...@@ -211,7 +211,6 @@ static void free_nested(struct kvm_vcpu *vcpu)
if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon) if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon)
return; return;
hrtimer_cancel(&vmx->nested.preemption_timer);
vmx->nested.vmxon = false; vmx->nested.vmxon = false;
vmx->nested.smm.vmxon = false; vmx->nested.smm.vmxon = false;
free_vpid(vmx->nested.vpid02); free_vpid(vmx->nested.vpid02);
...@@ -274,6 +273,7 @@ static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs) ...@@ -274,6 +273,7 @@ static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu) void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu)
{ {
vcpu_load(vcpu); vcpu_load(vcpu);
vmx_leave_nested(vcpu);
vmx_switch_vmcs(vcpu, &to_vmx(vcpu)->vmcs01); vmx_switch_vmcs(vcpu, &to_vmx(vcpu)->vmcs01);
free_nested(vcpu); free_nested(vcpu);
vcpu_put(vcpu); vcpu_put(vcpu);
...@@ -1979,17 +1979,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) ...@@ -1979,17 +1979,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
if (vmx->nested.dirty_vmcs12 || vmx->nested.hv_evmcs) if (vmx->nested.dirty_vmcs12 || vmx->nested.hv_evmcs)
prepare_vmcs02_early_full(vmx, vmcs12); prepare_vmcs02_early_full(vmx, vmcs12);
/*
* HOST_RSP is normally set correctly in vmx_vcpu_run() just before
* entry, but only if the current (host) sp changed from the value
* we wrote last (vmx->host_rsp). This cache is no longer relevant
* if we switch vmcs, and rather than hold a separate cache per vmcs,
* here we just force the write to happen on entry. host_rsp will
* also be written unconditionally by nested_vmx_check_vmentry_hw()
* if we are doing early consistency checks via hardware.
*/
vmx->host_rsp = 0;
/* /*
* PIN CONTROLS * PIN CONTROLS
*/ */
...@@ -2289,10 +2278,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, ...@@ -2289,10 +2278,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
} }
vmx_set_rflags(vcpu, vmcs12->guest_rflags); vmx_set_rflags(vcpu, vmcs12->guest_rflags);
vmx->nested.preemption_timer_expired = false;
if (nested_cpu_has_preemption_timer(vmcs12))
vmx_start_preemption_timer(vcpu);
/* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the /* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the
* bitwise-or of what L1 wants to trap for L2, and what we want to * bitwise-or of what L1 wants to trap for L2, and what we want to
* trap. Note that CR0.TS also needs updating - we do this later. * trap. Note that CR0.TS also needs updating - we do this later.
...@@ -2722,6 +2707,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) ...@@ -2722,6 +2707,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
{ {
struct vcpu_vmx *vmx = to_vmx(vcpu); struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned long cr3, cr4; unsigned long cr3, cr4;
bool vm_fail;
if (!nested_early_check) if (!nested_early_check)
return 0; return 0;
...@@ -2755,29 +2741,34 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) ...@@ -2755,29 +2741,34 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->host_state.cr4 = cr4; vmx->loaded_vmcs->host_state.cr4 = cr4;
} }
vmx->__launched = vmx->loaded_vmcs->launched;
asm( asm(
/* Set HOST_RSP */
"sub $%c[wordsize], %%" _ASM_SP "\n\t" /* temporarily adjust RSP for CALL */ "sub $%c[wordsize], %%" _ASM_SP "\n\t" /* temporarily adjust RSP for CALL */
__ex("vmwrite %%" _ASM_SP ", %%" _ASM_DX) "\n\t" "cmp %%" _ASM_SP ", %c[host_state_rsp](%[loaded_vmcs]) \n\t"
"mov %%" _ASM_SP ", %c[host_rsp](%1)\n\t" "je 1f \n\t"
__ex("vmwrite %%" _ASM_SP ", %[HOST_RSP]") "\n\t"
"mov %%" _ASM_SP ", %c[host_state_rsp](%[loaded_vmcs]) \n\t"
"1: \n\t"
"add $%c[wordsize], %%" _ASM_SP "\n\t" /* un-adjust RSP */ "add $%c[wordsize], %%" _ASM_SP "\n\t" /* un-adjust RSP */
/* Check if vmlaunch or vmresume is needed */ /* Check if vmlaunch or vmresume is needed */
"cmpl $0, %c[launched](%% " _ASM_CX")\n\t" "cmpb $0, %c[launched](%[loaded_vmcs])\n\t"
/*
* VMLAUNCH and VMRESUME clear RFLAGS.{CF,ZF} on VM-Exit, set
* RFLAGS.CF on VM-Fail Invalid and set RFLAGS.ZF on VM-Fail
* Valid. vmx_vmenter() directly "returns" RFLAGS, and so the
* result of VM-Enter is captured via CC_{SET,OUT} to vm_fail.
*/
"call vmx_vmenter\n\t" "call vmx_vmenter\n\t"
/* Set vmx->fail accordingly */ CC_SET(be)
"setbe %c[fail](%% " _ASM_CX")\n\t" : ASM_CALL_CONSTRAINT, CC_OUT(be) (vm_fail)
: ASM_CALL_CONSTRAINT : [HOST_RSP]"r"((unsigned long)HOST_RSP),
: "c"(vmx), "d"((unsigned long)HOST_RSP), [loaded_vmcs]"r"(vmx->loaded_vmcs),
[launched]"i"(offsetof(struct vcpu_vmx, __launched)), [launched]"i"(offsetof(struct loaded_vmcs, launched)),
[fail]"i"(offsetof(struct vcpu_vmx, fail)), [host_state_rsp]"i"(offsetof(struct loaded_vmcs, host_state.rsp)),
[host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp)),
[wordsize]"i"(sizeof(ulong)) [wordsize]"i"(sizeof(ulong))
: "rax", "cc", "memory" : "cc", "memory"
); );
preempt_enable(); preempt_enable();
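The rewritten asm above no longer routes VM-Fail through vmx->fail; instead CC_SET(be)/CC_OUT(be) capture RFLAGS.CF|ZF straight into the local vm_fail. On compilers that define __GCC_ASM_FLAG_OUTPUTS__ those macros expand to a flag-output constraint; a minimal stand-alone sketch of that idiom (kernel-style C, assuming such a compiler, names purely illustrative):

#include <linux/types.h>

/* Capture the "below or equal" condition (CF | ZF) left by an
 * instruction directly into a bool, without a setcc in the asm body. */
static inline bool cmp_below_or_equal(unsigned long a, unsigned long b)
{
	bool be;

	asm("cmp %2, %1" : "=@ccbe" (be) : "r" (a), "r" (b));
	return be;
}

VMLAUNCH/VMRESUME signal VM-Fail by setting CF (invalid) or ZF (valid), so the same "be" condition after the call to vmx_vmenter is exactly the vm_fail value checked a few lines later.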
...@@ -2787,10 +2778,9 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) ...@@ -2787,10 +2778,9 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
if (vmx->msr_autoload.guest.nr) if (vmx->msr_autoload.guest.nr)
vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
if (vmx->fail) { if (vm_fail) {
WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) != WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) !=
VMXERR_ENTRY_INVALID_CONTROL_FIELD); VMXERR_ENTRY_INVALID_CONTROL_FIELD);
vmx->fail = 0;
return 1; return 1;
} }
...@@ -2813,8 +2803,6 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) ...@@ -2813,8 +2803,6 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
return 0; return 0;
} }
STACK_FRAME_NON_STANDARD(nested_vmx_check_vmentry_hw);
static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12); struct vmcs12 *vmcs12);
...@@ -3030,6 +3018,15 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry) ...@@ -3030,6 +3018,15 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry)
if (unlikely(evaluate_pending_interrupts)) if (unlikely(evaluate_pending_interrupts))
kvm_make_request(KVM_REQ_EVENT, vcpu); kvm_make_request(KVM_REQ_EVENT, vcpu);
/*
* Do not start the preemption timer hrtimer until after we know
* we are successful, so that only nested_vmx_vmexit needs to cancel
* the timer.
*/
vmx->nested.preemption_timer_expired = false;
if (nested_cpu_has_preemption_timer(vmcs12))
vmx_start_preemption_timer(vcpu);
/* /*
* Note no nested_vmx_succeed or nested_vmx_fail here. At this point * Note no nested_vmx_succeed or nested_vmx_fail here. At this point
* we are no longer running L1, and VMLAUNCH/VMRESUME has not yet * we are no longer running L1, and VMLAUNCH/VMRESUME has not yet
...@@ -3450,13 +3447,10 @@ static void sync_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) ...@@ -3450,13 +3447,10 @@ static void sync_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
else else
vmcs12->guest_activity_state = GUEST_ACTIVITY_ACTIVE; vmcs12->guest_activity_state = GUEST_ACTIVITY_ACTIVE;
if (nested_cpu_has_preemption_timer(vmcs12)) { if (nested_cpu_has_preemption_timer(vmcs12) &&
if (vmcs12->vm_exit_controls & vmcs12->vm_exit_controls & VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
vmcs12->vmx_preemption_timer_value = vmcs12->vmx_preemption_timer_value =
vmx_get_preemption_timer_value(vcpu); vmx_get_preemption_timer_value(vcpu);
hrtimer_cancel(&to_vmx(vcpu)->nested.preemption_timer);
}
/* /*
* In some cases (usually, nested EPT), L2 is allowed to change its * In some cases (usually, nested EPT), L2 is allowed to change its
...@@ -3864,6 +3858,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, ...@@ -3864,6 +3858,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
leave_guest_mode(vcpu); leave_guest_mode(vcpu);
if (nested_cpu_has_preemption_timer(vmcs12))
hrtimer_cancel(&to_vmx(vcpu)->nested.preemption_timer);
if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
vcpu->arch.tsc_offset -= vmcs12->tsc_offset; vcpu->arch.tsc_offset -= vmcs12->tsc_offset;
...@@ -3915,9 +3912,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, ...@@ -3915,9 +3912,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
vmx_flush_tlb(vcpu, true); vmx_flush_tlb(vcpu, true);
} }
/* This is needed for same reason as it was needed in prepare_vmcs02 */
vmx->host_rsp = 0;
/* Unpin physical memory we referred to in vmcs02 */ /* Unpin physical memory we referred to in vmcs02 */
if (vmx->nested.apic_access_page) { if (vmx->nested.apic_access_page) {
kvm_release_page_dirty(vmx->nested.apic_access_page); kvm_release_page_dirty(vmx->nested.apic_access_page);
...@@ -4035,25 +4029,50 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification, ...@@ -4035,25 +4029,50 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
/* Addr = segment_base + offset */ /* Addr = segment_base + offset */
/* offset = base + [index * scale] + displacement */ /* offset = base + [index * scale] + displacement */
off = exit_qualification; /* holds the displacement */ off = exit_qualification; /* holds the displacement */
if (addr_size == 1)
off = (gva_t)sign_extend64(off, 31);
else if (addr_size == 0)
off = (gva_t)sign_extend64(off, 15);
if (base_is_valid) if (base_is_valid)
off += kvm_register_read(vcpu, base_reg); off += kvm_register_read(vcpu, base_reg);
if (index_is_valid) if (index_is_valid)
off += kvm_register_read(vcpu, index_reg)<<scaling; off += kvm_register_read(vcpu, index_reg)<<scaling;
vmx_get_segment(vcpu, &s, seg_reg); vmx_get_segment(vcpu, &s, seg_reg);
*ret = s.base + off;
/*
* The effective address, i.e. @off, of a memory operand is truncated
* based on the address size of the instruction. Note that this is
* the *effective address*, i.e. the address prior to accounting for
* the segment's base.
*/
if (addr_size == 1) /* 32 bit */ if (addr_size == 1) /* 32 bit */
*ret &= 0xffffffff; off &= 0xffffffff;
else if (addr_size == 0) /* 16 bit */
off &= 0xffff;
/* Checks for #GP/#SS exceptions. */ /* Checks for #GP/#SS exceptions. */
exn = false; exn = false;
if (is_long_mode(vcpu)) { if (is_long_mode(vcpu)) {
/*
* The virtual/linear address is never truncated in 64-bit
* mode, e.g. a 32-bit address size can yield a 64-bit virtual
* address when using FS/GS with a non-zero base.
*/
*ret = s.base + off;
/* Long mode: #GP(0)/#SS(0) if the memory address is in a /* Long mode: #GP(0)/#SS(0) if the memory address is in a
* non-canonical form. This is the only check on the memory * non-canonical form. This is the only check on the memory
* destination for long mode! * destination for long mode!
*/ */
exn = is_noncanonical_address(*ret, vcpu); exn = is_noncanonical_address(*ret, vcpu);
} else if (is_protmode(vcpu)) { } else {
/*
* When not in long mode, the virtual/linear address is
* unconditionally truncated to 32 bits regardless of the
* address size.
*/
*ret = (s.base + off) & 0xffffffff;
/* Protected mode: apply checks for segment validity in the /* Protected mode: apply checks for segment validity in the
* following order: * following order:
* - segment type check (#GP(0) may be thrown) * - segment type check (#GP(0) may be thrown)
...@@ -4077,10 +4096,16 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification, ...@@ -4077,10 +4096,16 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
/* Protected mode: #GP(0)/#SS(0) if the segment is unusable. /* Protected mode: #GP(0)/#SS(0) if the segment is unusable.
*/ */
exn = (s.unusable != 0); exn = (s.unusable != 0);
/* Protected mode: #GP(0)/#SS(0) if the memory
* operand is outside the segment limit. /*
* Protected mode: #GP(0)/#SS(0) if the memory operand is
* outside the segment limit. All CPUs that support VMX ignore
* limit checks for flat segments, i.e. segments with base==0,
* limit==0xffffffff and of type expand-up data or code.
*/ */
exn = exn || (off + sizeof(u64) > s.limit); if (!(s.base == 0 && s.limit == 0xffffffff &&
((s.type & 8) || !(s.type & 4))))
exn = exn || (off + sizeof(u64) > s.limit);
} }
if (exn) { if (exn) {
kvm_queue_exception_e(vcpu, kvm_queue_exception_e(vcpu,
...@@ -4145,11 +4170,11 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu) ...@@ -4145,11 +4170,11 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu)
if (r < 0) if (r < 0)
goto out_vmcs02; goto out_vmcs02;
vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL_ACCOUNT);
if (!vmx->nested.cached_vmcs12) if (!vmx->nested.cached_vmcs12)
goto out_cached_vmcs12; goto out_cached_vmcs12;
vmx->nested.cached_shadow_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); vmx->nested.cached_shadow_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL_ACCOUNT);
if (!vmx->nested.cached_shadow_vmcs12) if (!vmx->nested.cached_shadow_vmcs12)
goto out_cached_shadow_vmcs12; goto out_cached_shadow_vmcs12;
...@@ -5696,6 +5721,10 @@ __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *)) ...@@ -5696,6 +5721,10 @@ __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *))
enable_shadow_vmcs = 0; enable_shadow_vmcs = 0;
if (enable_shadow_vmcs) { if (enable_shadow_vmcs) {
for (i = 0; i < VMX_BITMAP_NR; i++) { for (i = 0; i < VMX_BITMAP_NR; i++) {
/*
* The vmx_bitmap is not tied to a VM and so should
* not be charged to a memcg.
*/
vmx_bitmap[i] = (unsigned long *) vmx_bitmap[i] = (unsigned long *)
__get_free_page(GFP_KERNEL); __get_free_page(GFP_KERNEL);
if (!vmx_bitmap[i]) { if (!vmx_bitmap[i]) {
......
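The get_vmx_mem_address() changes earlier in this file are easy to misread: the displacement is sign-extended and the *effective* address (displacement + base + scaled index) is truncated to the instruction's address size before the segment base is added, while the resulting linear address is only truncated outside long mode. A minimal sketch of that computation (illustrative function and parameters, not part of the commit):

#include <linux/bitops.h>
#include <linux/types.h>

static unsigned long vmx_memop_linear_addr(unsigned long seg_base, u64 disp,
					   unsigned long base, unsigned long index,
					   int scale, int addr_size, bool long_mode)
{
	unsigned long off = disp;

	/* Sign-extend the displacement per the address size, as the hunk does. */
	if (addr_size == 1)		/* 32-bit address size */
		off = (unsigned long)sign_extend64(disp, 31);
	else if (addr_size == 0)	/* 16-bit address size */
		off = (unsigned long)sign_extend64(disp, 15);

	off += base + (index << scale);

	/* Truncate the effective address before adding the segment base. */
	if (addr_size == 1)
		off &= 0xffffffff;
	else if (addr_size == 0)
		off &= 0xffff;

	/* Long mode never truncates the linear address; e.g. a non-zero
	 * FS/GS base can push a 32-bit effective address above 4G. */
	return long_mode ? seg_base + off : (seg_base + off) & 0xffffffff;
}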
...@@ -34,6 +34,7 @@ struct vmcs_host_state { ...@@ -34,6 +34,7 @@ struct vmcs_host_state {
unsigned long cr4; /* May not match real cr4 */ unsigned long cr4; /* May not match real cr4 */
unsigned long gs_base; unsigned long gs_base;
unsigned long fs_base; unsigned long fs_base;
unsigned long rsp;
u16 fs_sel, gs_sel, ldt_sel; u16 fs_sel, gs_sel, ldt_sel;
#ifdef CONFIG_X86_64 #ifdef CONFIG_X86_64
......
/* SPDX-License-Identifier: GPL-2.0 */ /* SPDX-License-Identifier: GPL-2.0 */
#include <linux/linkage.h> #include <linux/linkage.h>
#include <asm/asm.h> #include <asm/asm.h>
#include <asm/bitsperlong.h>
#include <asm/kvm_vcpu_regs.h>
#define WORD_SIZE (BITS_PER_LONG / 8)
#define VCPU_RAX __VCPU_REGS_RAX * WORD_SIZE
#define VCPU_RCX __VCPU_REGS_RCX * WORD_SIZE
#define VCPU_RDX __VCPU_REGS_RDX * WORD_SIZE
#define VCPU_RBX __VCPU_REGS_RBX * WORD_SIZE
/* Intentionally omit RSP as it's context switched by hardware */
#define VCPU_RBP __VCPU_REGS_RBP * WORD_SIZE
#define VCPU_RSI __VCPU_REGS_RSI * WORD_SIZE
#define VCPU_RDI __VCPU_REGS_RDI * WORD_SIZE
#ifdef CONFIG_X86_64
#define VCPU_R8 __VCPU_REGS_R8 * WORD_SIZE
#define VCPU_R9 __VCPU_REGS_R9 * WORD_SIZE
#define VCPU_R10 __VCPU_REGS_R10 * WORD_SIZE
#define VCPU_R11 __VCPU_REGS_R11 * WORD_SIZE
#define VCPU_R12 __VCPU_REGS_R12 * WORD_SIZE
#define VCPU_R13 __VCPU_REGS_R13 * WORD_SIZE
#define VCPU_R14 __VCPU_REGS_R14 * WORD_SIZE
#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
#endif
.text .text
...@@ -55,3 +79,146 @@ ENDPROC(vmx_vmenter) ...@@ -55,3 +79,146 @@ ENDPROC(vmx_vmenter)
ENTRY(vmx_vmexit) ENTRY(vmx_vmexit)
ret ret
ENDPROC(vmx_vmexit) ENDPROC(vmx_vmexit)
/**
* __vmx_vcpu_run - Run a vCPU via a transition to VMX guest mode
* @vmx: struct vcpu_vmx *
* @regs: unsigned long * (to guest registers)
* @launched: %true if the VMCS has been launched
*
* Returns:
* 0 on VM-Exit, 1 on VM-Fail
*/
ENTRY(__vmx_vcpu_run)
push %_ASM_BP
mov %_ASM_SP, %_ASM_BP
#ifdef CONFIG_X86_64
push %r15
push %r14
push %r13
push %r12
#else
push %edi
push %esi
#endif
push %_ASM_BX
/*
* Save @regs, _ASM_ARG2 may be modified by vmx_update_host_rsp() and
* @regs is needed after VM-Exit to save the guest's register values.
*/
push %_ASM_ARG2
/* Copy @launched to BL, _ASM_ARG3 is volatile. */
mov %_ASM_ARG3B, %bl
/* Adjust RSP to account for the CALL to vmx_vmenter(). */
lea -WORD_SIZE(%_ASM_SP), %_ASM_ARG2
call vmx_update_host_rsp
/* Load @regs to RAX. */
mov (%_ASM_SP), %_ASM_AX
/* Check if vmlaunch or vmresume is needed */
cmpb $0, %bl
/* Load guest registers. Don't clobber flags. */
mov VCPU_RBX(%_ASM_AX), %_ASM_BX
mov VCPU_RCX(%_ASM_AX), %_ASM_CX
mov VCPU_RDX(%_ASM_AX), %_ASM_DX
mov VCPU_RSI(%_ASM_AX), %_ASM_SI
mov VCPU_RDI(%_ASM_AX), %_ASM_DI
mov VCPU_RBP(%_ASM_AX), %_ASM_BP
#ifdef CONFIG_X86_64
mov VCPU_R8 (%_ASM_AX), %r8
mov VCPU_R9 (%_ASM_AX), %r9
mov VCPU_R10(%_ASM_AX), %r10
mov VCPU_R11(%_ASM_AX), %r11
mov VCPU_R12(%_ASM_AX), %r12
mov VCPU_R13(%_ASM_AX), %r13
mov VCPU_R14(%_ASM_AX), %r14
mov VCPU_R15(%_ASM_AX), %r15
#endif
/* Load guest RAX. This kills the vmx_vcpu pointer! */
mov VCPU_RAX(%_ASM_AX), %_ASM_AX
/* Enter guest mode */
call vmx_vmenter
/* Jump on VM-Fail. */
jbe 2f
/* Temporarily save guest's RAX. */
push %_ASM_AX
/* Reload @regs to RAX. */
mov WORD_SIZE(%_ASM_SP), %_ASM_AX
/* Save all guest registers, including RAX from the stack */
__ASM_SIZE(pop) VCPU_RAX(%_ASM_AX)
mov %_ASM_BX, VCPU_RBX(%_ASM_AX)
mov %_ASM_CX, VCPU_RCX(%_ASM_AX)
mov %_ASM_DX, VCPU_RDX(%_ASM_AX)
mov %_ASM_SI, VCPU_RSI(%_ASM_AX)
mov %_ASM_DI, VCPU_RDI(%_ASM_AX)
mov %_ASM_BP, VCPU_RBP(%_ASM_AX)
#ifdef CONFIG_X86_64
mov %r8, VCPU_R8 (%_ASM_AX)
mov %r9, VCPU_R9 (%_ASM_AX)
mov %r10, VCPU_R10(%_ASM_AX)
mov %r11, VCPU_R11(%_ASM_AX)
mov %r12, VCPU_R12(%_ASM_AX)
mov %r13, VCPU_R13(%_ASM_AX)
mov %r14, VCPU_R14(%_ASM_AX)
mov %r15, VCPU_R15(%_ASM_AX)
#endif
/* Clear RAX to indicate VM-Exit (as opposed to VM-Fail). */
xor %eax, %eax
/*
* Clear all general purpose registers except RSP and RAX to prevent
* speculative use of the guest's values, even those that are reloaded
* via the stack. In theory, an L1 cache miss when restoring registers
* could lead to speculative execution with the guest's values.
* Zeroing XORs are dirt cheap, i.e. the extra paranoia is essentially
* free. RSP and RAX are exempt as RSP is restored by hardware during
* VM-Exit and RAX is explicitly loaded with 0 or 1 to return VM-Fail.
*/
1: xor %ebx, %ebx
xor %ecx, %ecx
xor %edx, %edx
xor %esi, %esi
xor %edi, %edi
xor %ebp, %ebp
#ifdef CONFIG_X86_64
xor %r8d, %r8d
xor %r9d, %r9d
xor %r10d, %r10d
xor %r11d, %r11d
xor %r12d, %r12d
xor %r13d, %r13d
xor %r14d, %r14d
xor %r15d, %r15d
#endif
/* "POP" @regs. */
add $WORD_SIZE, %_ASM_SP
pop %_ASM_BX
#ifdef CONFIG_X86_64
pop %r12
pop %r13
pop %r14
pop %r15
#else
pop %esi
pop %edi
#endif
pop %_ASM_BP
ret
/* VM-Fail. Out-of-line to avoid a taken Jcc after VM-Exit. */
2: mov $1, %eax
jmp 1b
ENDPROC(__vmx_vcpu_run)
...@@ -246,6 +246,10 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) ...@@ -246,6 +246,10 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages && if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages &&
!boot_cpu_has(X86_FEATURE_FLUSH_L1D)) { !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) {
/*
* This allocation for vmx_l1d_flush_pages is not tied to a VM
* lifetime and so should not be charged to a memcg.
*/
page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER); page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER);
if (!page) if (!page)
return -ENOMEM; return -ENOMEM;
...@@ -2387,13 +2391,13 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf, ...@@ -2387,13 +2391,13 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
return 0; return 0;
} }
struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu) struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
{ {
int node = cpu_to_node(cpu); int node = cpu_to_node(cpu);
struct page *pages; struct page *pages;
struct vmcs *vmcs; struct vmcs *vmcs;
pages = __alloc_pages_node(node, GFP_KERNEL, vmcs_config.order); pages = __alloc_pages_node(node, flags, vmcs_config.order);
if (!pages) if (!pages)
return NULL; return NULL;
vmcs = page_address(pages); vmcs = page_address(pages);
...@@ -2440,7 +2444,8 @@ int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs) ...@@ -2440,7 +2444,8 @@ int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
loaded_vmcs_init(loaded_vmcs); loaded_vmcs_init(loaded_vmcs);
if (cpu_has_vmx_msr_bitmap()) { if (cpu_has_vmx_msr_bitmap()) {
loaded_vmcs->msr_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL); loaded_vmcs->msr_bitmap = (unsigned long *)
__get_free_page(GFP_KERNEL_ACCOUNT);
if (!loaded_vmcs->msr_bitmap) if (!loaded_vmcs->msr_bitmap)
goto out_vmcs; goto out_vmcs;
memset(loaded_vmcs->msr_bitmap, 0xff, PAGE_SIZE); memset(loaded_vmcs->msr_bitmap, 0xff, PAGE_SIZE);
...@@ -2481,7 +2486,7 @@ static __init int alloc_kvm_area(void) ...@@ -2481,7 +2486,7 @@ static __init int alloc_kvm_area(void)
for_each_possible_cpu(cpu) { for_each_possible_cpu(cpu) {
struct vmcs *vmcs; struct vmcs *vmcs;
vmcs = alloc_vmcs_cpu(false, cpu); vmcs = alloc_vmcs_cpu(false, cpu, GFP_KERNEL);
if (!vmcs) { if (!vmcs) {
free_kvm_area(); free_kvm_area();
return -ENOMEM; return -ENOMEM;
...@@ -6360,150 +6365,15 @@ static void vmx_update_hv_timer(struct kvm_vcpu *vcpu) ...@@ -6360,150 +6365,15 @@ static void vmx_update_hv_timer(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->hv_timer_armed = false; vmx->loaded_vmcs->hv_timer_armed = false;
} }
static void __vmx_vcpu_run(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx) void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
{ {
unsigned long evmcs_rsp; if (unlikely(host_rsp != vmx->loaded_vmcs->host_state.rsp)) {
vmx->loaded_vmcs->host_state.rsp = host_rsp;
vmx->__launched = vmx->loaded_vmcs->launched; vmcs_writel(HOST_RSP, host_rsp);
}
evmcs_rsp = static_branch_unlikely(&enable_evmcs) ?
(unsigned long)&current_evmcs->host_rsp : 0;
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
asm(
/* Store host registers */
"push %%" _ASM_DX "; push %%" _ASM_BP ";"
"push %%" _ASM_CX " \n\t" /* placeholder for guest rcx */
"push %%" _ASM_CX " \n\t"
"sub $%c[wordsize], %%" _ASM_SP "\n\t" /* temporarily adjust RSP for CALL */
"cmp %%" _ASM_SP ", %c[host_rsp](%%" _ASM_CX ") \n\t"
"je 1f \n\t"
"mov %%" _ASM_SP ", %c[host_rsp](%%" _ASM_CX ") \n\t"
/* Avoid VMWRITE when Enlightened VMCS is in use */
"test %%" _ASM_SI ", %%" _ASM_SI " \n\t"
"jz 2f \n\t"
"mov %%" _ASM_SP ", (%%" _ASM_SI ") \n\t"
"jmp 1f \n\t"
"2: \n\t"
__ex("vmwrite %%" _ASM_SP ", %%" _ASM_DX) "\n\t"
"1: \n\t"
"add $%c[wordsize], %%" _ASM_SP "\n\t" /* un-adjust RSP */
/* Reload cr2 if changed */
"mov %c[cr2](%%" _ASM_CX "), %%" _ASM_AX " \n\t"
"mov %%cr2, %%" _ASM_DX " \n\t"
"cmp %%" _ASM_AX ", %%" _ASM_DX " \n\t"
"je 3f \n\t"
"mov %%" _ASM_AX", %%cr2 \n\t"
"3: \n\t"
/* Check if vmlaunch or vmresume is needed */
"cmpl $0, %c[launched](%%" _ASM_CX ") \n\t"
/* Load guest registers. Don't clobber flags. */
"mov %c[rax](%%" _ASM_CX "), %%" _ASM_AX " \n\t"
"mov %c[rbx](%%" _ASM_CX "), %%" _ASM_BX " \n\t"
"mov %c[rdx](%%" _ASM_CX "), %%" _ASM_DX " \n\t"
"mov %c[rsi](%%" _ASM_CX "), %%" _ASM_SI " \n\t"
"mov %c[rdi](%%" _ASM_CX "), %%" _ASM_DI " \n\t"
"mov %c[rbp](%%" _ASM_CX "), %%" _ASM_BP " \n\t"
#ifdef CONFIG_X86_64
"mov %c[r8](%%" _ASM_CX "), %%r8 \n\t"
"mov %c[r9](%%" _ASM_CX "), %%r9 \n\t"
"mov %c[r10](%%" _ASM_CX "), %%r10 \n\t"
"mov %c[r11](%%" _ASM_CX "), %%r11 \n\t"
"mov %c[r12](%%" _ASM_CX "), %%r12 \n\t"
"mov %c[r13](%%" _ASM_CX "), %%r13 \n\t"
"mov %c[r14](%%" _ASM_CX "), %%r14 \n\t"
"mov %c[r15](%%" _ASM_CX "), %%r15 \n\t"
#endif
/* Load guest RCX. This kills the vmx_vcpu pointer! */
"mov %c[rcx](%%" _ASM_CX "), %%" _ASM_CX " \n\t"
/* Enter guest mode */
"call vmx_vmenter\n\t"
/* Save guest's RCX to the stack placeholder (see above) */
"mov %%" _ASM_CX ", %c[wordsize](%%" _ASM_SP ") \n\t"
/* Load host's RCX, i.e. the vmx_vcpu pointer */
"pop %%" _ASM_CX " \n\t"
/* Set vmx->fail based on EFLAGS.{CF,ZF} */
"setbe %c[fail](%%" _ASM_CX ")\n\t"
/* Save all guest registers, including RCX from the stack */
"mov %%" _ASM_AX ", %c[rax](%%" _ASM_CX ") \n\t"
"mov %%" _ASM_BX ", %c[rbx](%%" _ASM_CX ") \n\t"
__ASM_SIZE(pop) " %c[rcx](%%" _ASM_CX ") \n\t"
"mov %%" _ASM_DX ", %c[rdx](%%" _ASM_CX ") \n\t"
"mov %%" _ASM_SI ", %c[rsi](%%" _ASM_CX ") \n\t"
"mov %%" _ASM_DI ", %c[rdi](%%" _ASM_CX ") \n\t"
"mov %%" _ASM_BP ", %c[rbp](%%" _ASM_CX ") \n\t"
#ifdef CONFIG_X86_64
"mov %%r8, %c[r8](%%" _ASM_CX ") \n\t"
"mov %%r9, %c[r9](%%" _ASM_CX ") \n\t"
"mov %%r10, %c[r10](%%" _ASM_CX ") \n\t"
"mov %%r11, %c[r11](%%" _ASM_CX ") \n\t"
"mov %%r12, %c[r12](%%" _ASM_CX ") \n\t"
"mov %%r13, %c[r13](%%" _ASM_CX ") \n\t"
"mov %%r14, %c[r14](%%" _ASM_CX ") \n\t"
"mov %%r15, %c[r15](%%" _ASM_CX ") \n\t"
/*
* Clear host registers marked as clobbered to prevent
* speculative use.
*/
"xor %%r8d, %%r8d \n\t"
"xor %%r9d, %%r9d \n\t"
"xor %%r10d, %%r10d \n\t"
"xor %%r11d, %%r11d \n\t"
"xor %%r12d, %%r12d \n\t"
"xor %%r13d, %%r13d \n\t"
"xor %%r14d, %%r14d \n\t"
"xor %%r15d, %%r15d \n\t"
#endif
"mov %%cr2, %%" _ASM_AX " \n\t"
"mov %%" _ASM_AX ", %c[cr2](%%" _ASM_CX ") \n\t"
"xor %%eax, %%eax \n\t"
"xor %%ebx, %%ebx \n\t"
"xor %%esi, %%esi \n\t"
"xor %%edi, %%edi \n\t"
"pop %%" _ASM_BP "; pop %%" _ASM_DX " \n\t"
: ASM_CALL_CONSTRAINT
: "c"(vmx), "d"((unsigned long)HOST_RSP), "S"(evmcs_rsp),
[launched]"i"(offsetof(struct vcpu_vmx, __launched)),
[fail]"i"(offsetof(struct vcpu_vmx, fail)),
[host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp)),
[rax]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RAX])),
[rbx]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RBX])),
[rcx]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RCX])),
[rdx]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RDX])),
[rsi]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RSI])),
[rdi]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RDI])),
[rbp]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RBP])),
#ifdef CONFIG_X86_64
[r8]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R8])),
[r9]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R9])),
[r10]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R10])),
[r11]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R11])),
[r12]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R12])),
[r13]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R13])),
[r14]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R14])),
[r15]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R15])),
#endif
[cr2]"i"(offsetof(struct vcpu_vmx, vcpu.arch.cr2)),
[wordsize]"i"(sizeof(ulong))
: "cc", "memory"
#ifdef CONFIG_X86_64
, "rax", "rbx", "rdi"
, "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"
#else
, "eax", "ebx", "edi"
#endif
);
} }
STACK_FRAME_NON_STANDARD(__vmx_vcpu_run);
bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs, bool launched);
static void vmx_vcpu_run(struct kvm_vcpu *vcpu) static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
{ {
...@@ -6572,7 +6442,16 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) ...@@ -6572,7 +6442,16 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
*/ */
x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0); x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
__vmx_vcpu_run(vcpu, vmx); if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
if (vcpu->arch.cr2 != read_cr2())
write_cr2(vcpu->arch.cr2);
vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
vmx->loaded_vmcs->launched);
vcpu->arch.cr2 = read_cr2();
/* /*
* We do not use IBRS in the kernel. If this vCPU has used the * We do not use IBRS in the kernel. If this vCPU has used the
...@@ -6657,7 +6536,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) ...@@ -6657,7 +6536,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
static struct kvm *vmx_vm_alloc(void) static struct kvm *vmx_vm_alloc(void)
{ {
struct kvm_vmx *kvm_vmx = vzalloc(sizeof(struct kvm_vmx)); struct kvm_vmx *kvm_vmx = __vmalloc(sizeof(struct kvm_vmx),
GFP_KERNEL_ACCOUNT | __GFP_ZERO,
PAGE_KERNEL);
return &kvm_vmx->kvm; return &kvm_vmx->kvm;
} }
...@@ -6673,7 +6554,6 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu) ...@@ -6673,7 +6554,6 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
if (enable_pml) if (enable_pml)
vmx_destroy_pml_buffer(vmx); vmx_destroy_pml_buffer(vmx);
free_vpid(vmx->vpid); free_vpid(vmx->vpid);
leave_guest_mode(vcpu);
nested_vmx_free_vcpu(vcpu); nested_vmx_free_vcpu(vcpu);
free_loaded_vmcs(vmx->loaded_vmcs); free_loaded_vmcs(vmx->loaded_vmcs);
kfree(vmx->guest_msrs); kfree(vmx->guest_msrs);
...@@ -6685,14 +6565,16 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu) ...@@ -6685,14 +6565,16 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
{ {
int err; int err;
struct vcpu_vmx *vmx = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL); struct vcpu_vmx *vmx;
unsigned long *msr_bitmap; unsigned long *msr_bitmap;
int cpu; int cpu;
vmx = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL_ACCOUNT);
if (!vmx) if (!vmx)
return ERR_PTR(-ENOMEM); return ERR_PTR(-ENOMEM);
vmx->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, GFP_KERNEL); vmx->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache,
GFP_KERNEL_ACCOUNT);
if (!vmx->vcpu.arch.guest_fpu) { if (!vmx->vcpu.arch.guest_fpu) {
printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n");
err = -ENOMEM; err = -ENOMEM;
...@@ -6714,12 +6596,12 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) ...@@ -6714,12 +6596,12 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
* for the guest, etc. * for the guest, etc.
*/ */
if (enable_pml) { if (enable_pml) {
vmx->pml_pg = alloc_page(GFP_KERNEL | __GFP_ZERO); vmx->pml_pg = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
if (!vmx->pml_pg) if (!vmx->pml_pg)
goto uninit_vcpu; goto uninit_vcpu;
} }
vmx->guest_msrs = kmalloc(PAGE_SIZE, GFP_KERNEL); vmx->guest_msrs = kmalloc(PAGE_SIZE, GFP_KERNEL_ACCOUNT);
BUILD_BUG_ON(ARRAY_SIZE(vmx_msr_index) * sizeof(vmx->guest_msrs[0]) BUILD_BUG_ON(ARRAY_SIZE(vmx_msr_index) * sizeof(vmx->guest_msrs[0])
> PAGE_SIZE); > PAGE_SIZE);
......
...@@ -175,7 +175,6 @@ struct nested_vmx { ...@@ -175,7 +175,6 @@ struct nested_vmx {
struct vcpu_vmx { struct vcpu_vmx {
struct kvm_vcpu vcpu; struct kvm_vcpu vcpu;
unsigned long host_rsp;
u8 fail; u8 fail;
u8 msr_bitmap_mode; u8 msr_bitmap_mode;
u32 exit_intr_info; u32 exit_intr_info;
...@@ -209,7 +208,7 @@ struct vcpu_vmx { ...@@ -209,7 +208,7 @@ struct vcpu_vmx {
struct loaded_vmcs vmcs01; struct loaded_vmcs vmcs01;
struct loaded_vmcs *loaded_vmcs; struct loaded_vmcs *loaded_vmcs;
struct loaded_vmcs *loaded_cpu_state; struct loaded_vmcs *loaded_cpu_state;
bool __launched; /* temporary, used in vmx_vcpu_run */
struct msr_autoload { struct msr_autoload {
struct vmx_msrs guest; struct vmx_msrs guest;
struct vmx_msrs host; struct vmx_msrs host;
...@@ -339,8 +338,8 @@ static inline int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc) ...@@ -339,8 +338,8 @@ static inline int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
static inline void pi_set_sn(struct pi_desc *pi_desc) static inline void pi_set_sn(struct pi_desc *pi_desc)
{ {
return set_bit(POSTED_INTR_SN, set_bit(POSTED_INTR_SN,
(unsigned long *)&pi_desc->control); (unsigned long *)&pi_desc->control);
} }
static inline void pi_set_on(struct pi_desc *pi_desc) static inline void pi_set_on(struct pi_desc *pi_desc)
...@@ -445,7 +444,8 @@ static inline u32 vmx_vmentry_ctrl(void) ...@@ -445,7 +444,8 @@ static inline u32 vmx_vmentry_ctrl(void)
{ {
u32 vmentry_ctrl = vmcs_config.vmentry_ctrl; u32 vmentry_ctrl = vmcs_config.vmentry_ctrl;
if (pt_mode == PT_MODE_SYSTEM) if (pt_mode == PT_MODE_SYSTEM)
vmentry_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP | VM_EXIT_CLEAR_IA32_RTIT_CTL); vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
VM_ENTRY_LOAD_IA32_RTIT_CTL);
/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */ /* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
return vmentry_ctrl & return vmentry_ctrl &
~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER); ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER);
...@@ -455,9 +455,10 @@ static inline u32 vmx_vmexit_ctrl(void) ...@@ -455,9 +455,10 @@ static inline u32 vmx_vmexit_ctrl(void)
{ {
u32 vmexit_ctrl = vmcs_config.vmexit_ctrl; u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
if (pt_mode == PT_MODE_SYSTEM) if (pt_mode == PT_MODE_SYSTEM)
vmexit_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL); vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */ /* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
return vmcs_config.vmexit_ctrl & return vmexit_ctrl &
~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER); ~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
} }
...@@ -478,7 +479,7 @@ static inline struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu) ...@@ -478,7 +479,7 @@ static inline struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
return &(to_vmx(vcpu)->pi_desc); return &(to_vmx(vcpu)->pi_desc);
} }
struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu); struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags);
void free_vmcs(struct vmcs *vmcs); void free_vmcs(struct vmcs *vmcs);
int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs); int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs);
void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs); void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs);
...@@ -487,7 +488,8 @@ void loaded_vmcs_clear(struct loaded_vmcs *loaded_vmcs); ...@@ -487,7 +488,8 @@ void loaded_vmcs_clear(struct loaded_vmcs *loaded_vmcs);
static inline struct vmcs *alloc_vmcs(bool shadow) static inline struct vmcs *alloc_vmcs(bool shadow)
{ {
return alloc_vmcs_cpu(shadow, raw_smp_processor_id()); return alloc_vmcs_cpu(shadow, raw_smp_processor_id(),
GFP_KERNEL_ACCOUNT);
} }
u64 construct_eptp(struct kvm_vcpu *vcpu, unsigned long root_hpa); u64 construct_eptp(struct kvm_vcpu *vcpu, unsigned long root_hpa);
......
...@@ -3879,7 +3879,8 @@ long kvm_arch_vcpu_ioctl(struct file *filp, ...@@ -3879,7 +3879,8 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = -EINVAL; r = -EINVAL;
if (!lapic_in_kernel(vcpu)) if (!lapic_in_kernel(vcpu))
goto out; goto out;
u.lapic = kzalloc(sizeof(struct kvm_lapic_state), GFP_KERNEL); u.lapic = kzalloc(sizeof(struct kvm_lapic_state),
GFP_KERNEL_ACCOUNT);
r = -ENOMEM; r = -ENOMEM;
if (!u.lapic) if (!u.lapic)
...@@ -4066,7 +4067,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp, ...@@ -4066,7 +4067,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
break; break;
} }
case KVM_GET_XSAVE: { case KVM_GET_XSAVE: {
u.xsave = kzalloc(sizeof(struct kvm_xsave), GFP_KERNEL); u.xsave = kzalloc(sizeof(struct kvm_xsave), GFP_KERNEL_ACCOUNT);
r = -ENOMEM; r = -ENOMEM;
if (!u.xsave) if (!u.xsave)
break; break;
...@@ -4090,7 +4091,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp, ...@@ -4090,7 +4091,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
break; break;
} }
case KVM_GET_XCRS: { case KVM_GET_XCRS: {
u.xcrs = kzalloc(sizeof(struct kvm_xcrs), GFP_KERNEL); u.xcrs = kzalloc(sizeof(struct kvm_xcrs), GFP_KERNEL_ACCOUNT);
r = -ENOMEM; r = -ENOMEM;
if (!u.xcrs) if (!u.xcrs)
break; break;
...@@ -7055,6 +7056,13 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid) ...@@ -7055,6 +7056,13 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu) void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
{ {
if (!lapic_in_kernel(vcpu)) {
WARN_ON_ONCE(vcpu->arch.apicv_active);
return;
}
if (!vcpu->arch.apicv_active)
return;
vcpu->arch.apicv_active = false; vcpu->arch.apicv_active = false;
kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu); kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
} }
...@@ -9005,7 +9013,6 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) ...@@ -9005,7 +9013,6 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
struct page *page; struct page *page;
int r; int r;
vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
vcpu->arch.emulate_ctxt.ops = &emulate_ops; vcpu->arch.emulate_ctxt.ops = &emulate_ops;
if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu)) if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
...@@ -9026,6 +9033,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) ...@@ -9026,6 +9033,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
goto fail_free_pio_data; goto fail_free_pio_data;
if (irqchip_in_kernel(vcpu->kvm)) { if (irqchip_in_kernel(vcpu->kvm)) {
vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
r = kvm_create_lapic(vcpu); r = kvm_create_lapic(vcpu);
if (r < 0) if (r < 0)
goto fail_mmu_destroy; goto fail_mmu_destroy;
...@@ -9033,14 +9041,15 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) ...@@ -9033,14 +9041,15 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
static_key_slow_inc(&kvm_no_apic_vcpu); static_key_slow_inc(&kvm_no_apic_vcpu);
vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4, vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
GFP_KERNEL); GFP_KERNEL_ACCOUNT);
if (!vcpu->arch.mce_banks) { if (!vcpu->arch.mce_banks) {
r = -ENOMEM; r = -ENOMEM;
goto fail_free_lapic; goto fail_free_lapic;
} }
vcpu->arch.mcg_cap = KVM_MAX_MCE_BANKS; vcpu->arch.mcg_cap = KVM_MAX_MCE_BANKS;
if (!zalloc_cpumask_var(&vcpu->arch.wbinvd_dirty_mask, GFP_KERNEL)) { if (!zalloc_cpumask_var(&vcpu->arch.wbinvd_dirty_mask,
GFP_KERNEL_ACCOUNT)) {
r = -ENOMEM; r = -ENOMEM;
goto fail_free_mce_banks; goto fail_free_mce_banks;
} }
...@@ -9104,7 +9113,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) ...@@ -9104,7 +9113,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list); INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
INIT_LIST_HEAD(&kvm->arch.assigned_dev_head); INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
atomic_set(&kvm->arch.noncoherent_dma_count, 0); atomic_set(&kvm->arch.noncoherent_dma_count, 0);
...@@ -9299,13 +9307,13 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, ...@@ -9299,13 +9307,13 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
slot->arch.rmap[i] = slot->arch.rmap[i] =
kvcalloc(lpages, sizeof(*slot->arch.rmap[i]), kvcalloc(lpages, sizeof(*slot->arch.rmap[i]),
GFP_KERNEL); GFP_KERNEL_ACCOUNT);
if (!slot->arch.rmap[i]) if (!slot->arch.rmap[i])
goto out_free; goto out_free;
if (i == 0) if (i == 0)
continue; continue;
linfo = kvcalloc(lpages, sizeof(*linfo), GFP_KERNEL); linfo = kvcalloc(lpages, sizeof(*linfo), GFP_KERNEL_ACCOUNT);
if (!linfo) if (!linfo)
goto out_free; goto out_free;
...@@ -9348,13 +9356,13 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, ...@@ -9348,13 +9356,13 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
return -ENOMEM; return -ENOMEM;
} }
void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen)
{ {
/* /*
* memslots->generation has been incremented. * memslots->generation has been incremented.
* mmio generation may have reached its maximum value. * mmio generation may have reached its maximum value.
*/ */
kvm_mmu_invalidate_mmio_sptes(kvm, slots); kvm_mmu_invalidate_mmio_sptes(kvm, gen);
} }
int kvm_arch_prepare_memory_region(struct kvm *kvm, int kvm_arch_prepare_memory_region(struct kvm *kvm,
...@@ -9462,7 +9470,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, ...@@ -9462,7 +9470,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
void kvm_arch_flush_shadow_all(struct kvm *kvm) void kvm_arch_flush_shadow_all(struct kvm *kvm)
{ {
kvm_mmu_invalidate_zap_all_pages(kvm); kvm_mmu_zap_all(kvm);
} }
void kvm_arch_flush_shadow_memslot(struct kvm *kvm, void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
......
...@@ -181,6 +181,11 @@ static inline bool emul_is_noncanonical_address(u64 la, ...@@ -181,6 +181,11 @@ static inline bool emul_is_noncanonical_address(u64 la,
static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu, static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
gva_t gva, gfn_t gfn, unsigned access) gva_t gva, gfn_t gfn, unsigned access)
{ {
u64 gen = kvm_memslots(vcpu->kvm)->generation;
if (unlikely(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS))
return;
/* /*
* If this is a shadow nested page table, the "GVA" is * If this is a shadow nested page table, the "GVA" is
* actually a nGPA. * actually a nGPA.
...@@ -188,7 +193,7 @@ static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu, ...@@ -188,7 +193,7 @@ static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
vcpu->arch.mmio_gva = mmu_is_nested(vcpu) ? 0 : gva & PAGE_MASK; vcpu->arch.mmio_gva = mmu_is_nested(vcpu) ? 0 : gva & PAGE_MASK;
vcpu->arch.access = access; vcpu->arch.access = access;
vcpu->arch.mmio_gfn = gfn; vcpu->arch.mmio_gfn = gfn;
vcpu->arch.mmio_gen = kvm_memslots(vcpu->kvm)->generation; vcpu->arch.mmio_gen = gen;
} }
static inline bool vcpu_match_mmio_gen(struct kvm_vcpu *vcpu) static inline bool vcpu_match_mmio_gen(struct kvm_vcpu *vcpu)
......
...@@ -1261,6 +1261,13 @@ static enum arch_timer_ppi_nr __init arch_timer_select_ppi(void) ...@@ -1261,6 +1261,13 @@ static enum arch_timer_ppi_nr __init arch_timer_select_ppi(void)
return ARCH_TIMER_PHYS_SECURE_PPI; return ARCH_TIMER_PHYS_SECURE_PPI;
} }
static void __init arch_timer_populate_kvm_info(void)
{
arch_timer_kvm_info.virtual_irq = arch_timer_ppi[ARCH_TIMER_VIRT_PPI];
if (is_kernel_in_hyp_mode())
arch_timer_kvm_info.physical_irq = arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI];
}
static int __init arch_timer_of_init(struct device_node *np) static int __init arch_timer_of_init(struct device_node *np)
{ {
int i, ret; int i, ret;
...@@ -1275,7 +1282,7 @@ static int __init arch_timer_of_init(struct device_node *np) ...@@ -1275,7 +1282,7 @@ static int __init arch_timer_of_init(struct device_node *np)
for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++) for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++)
arch_timer_ppi[i] = irq_of_parse_and_map(np, i); arch_timer_ppi[i] = irq_of_parse_and_map(np, i);
arch_timer_kvm_info.virtual_irq = arch_timer_ppi[ARCH_TIMER_VIRT_PPI]; arch_timer_populate_kvm_info();
rate = arch_timer_get_cntfrq(); rate = arch_timer_get_cntfrq();
arch_timer_of_configure_rate(rate, np); arch_timer_of_configure_rate(rate, np);
...@@ -1605,7 +1612,7 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table) ...@@ -1605,7 +1612,7 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table)
arch_timer_ppi[ARCH_TIMER_HYP_PPI] = arch_timer_ppi[ARCH_TIMER_HYP_PPI] =
acpi_gtdt_map_ppi(ARCH_TIMER_HYP_PPI); acpi_gtdt_map_ppi(ARCH_TIMER_HYP_PPI);
arch_timer_kvm_info.virtual_irq = arch_timer_ppi[ARCH_TIMER_VIRT_PPI]; arch_timer_populate_kvm_info();
/* /*
* When probing via ACPI, we have no mechanism to override the sysreg * When probing via ACPI, we have no mechanism to override the sysreg
......
...@@ -1382,3 +1382,40 @@ int chsc_pnso_brinfo(struct subchannel_id schid, ...@@ -1382,3 +1382,40 @@ int chsc_pnso_brinfo(struct subchannel_id schid,
return chsc_error_from_response(brinfo_area->response.code); return chsc_error_from_response(brinfo_area->response.code);
} }
EXPORT_SYMBOL_GPL(chsc_pnso_brinfo); EXPORT_SYMBOL_GPL(chsc_pnso_brinfo);
int chsc_sgib(u32 origin)
{
struct {
struct chsc_header request;
u16 op;
u8 reserved01[2];
u8 reserved02:4;
u8 fmt:4;
u8 reserved03[7];
/* operation data area begin */
u8 reserved04[4];
u32 gib_origin;
u8 reserved05[10];
u8 aix;
u8 reserved06[4029];
struct chsc_header response;
u8 reserved07[4];
} *sgib_area;
int ret;
spin_lock_irq(&chsc_page_lock);
memset(chsc_page, 0, PAGE_SIZE);
sgib_area = chsc_page;
sgib_area->request.length = 0x0fe0;
sgib_area->request.code = 0x0021;
sgib_area->op = 0x1;
sgib_area->gib_origin = origin;
ret = chsc(sgib_area);
if (ret == 0)
ret = chsc_error_from_response(sgib_area->response.code);
spin_unlock_irq(&chsc_page_lock);
return ret;
}
EXPORT_SYMBOL_GPL(chsc_sgib);
...@@ -164,6 +164,7 @@ int chsc_get_channel_measurement_chars(struct channel_path *chp); ...@@ -164,6 +164,7 @@ int chsc_get_channel_measurement_chars(struct channel_path *chp);
int chsc_ssqd(struct subchannel_id schid, struct chsc_ssqd_area *ssqd); int chsc_ssqd(struct subchannel_id schid, struct chsc_ssqd_area *ssqd);
int chsc_sadc(struct subchannel_id schid, struct chsc_scssc_area *scssc, int chsc_sadc(struct subchannel_id schid, struct chsc_scssc_area *scssc,
u64 summary_indicator_addr, u64 subchannel_indicator_addr); u64 summary_indicator_addr, u64 subchannel_indicator_addr);
int chsc_sgib(u32 origin);
int chsc_error_from_response(int response); int chsc_error_from_response(int response);
int chsc_siosl(struct subchannel_id schid); int chsc_siosl(struct subchannel_id schid);
......
...@@ -74,6 +74,7 @@ enum arch_timer_spi_nr { ...@@ -74,6 +74,7 @@ enum arch_timer_spi_nr {
struct arch_timer_kvm_info { struct arch_timer_kvm_info {
struct timecounter timecounter; struct timecounter timecounter;
int virtual_irq; int virtual_irq;
int physical_irq;
}; };
struct arch_timer_mem_frame { struct arch_timer_mem_frame {
......
...@@ -22,7 +22,22 @@ ...@@ -22,7 +22,22 @@
#include <linux/clocksource.h> #include <linux/clocksource.h>
#include <linux/hrtimer.h> #include <linux/hrtimer.h>
enum kvm_arch_timers {
TIMER_PTIMER,
TIMER_VTIMER,
NR_KVM_TIMERS
};
enum kvm_arch_timer_regs {
TIMER_REG_CNT,
TIMER_REG_CVAL,
TIMER_REG_TVAL,
TIMER_REG_CTL,
};
struct arch_timer_context { struct arch_timer_context {
struct kvm_vcpu *vcpu;
/* Registers: control register, timer value */ /* Registers: control register, timer value */
u32 cnt_ctl; u32 cnt_ctl;
u64 cnt_cval; u64 cnt_cval;
...@@ -30,30 +45,36 @@ struct arch_timer_context { ...@@ -30,30 +45,36 @@ struct arch_timer_context {
/* Timer IRQ */ /* Timer IRQ */
struct kvm_irq_level irq; struct kvm_irq_level irq;
/* Virtual offset */
u64 cntvoff;
/* Emulated Timer (may be unused) */
struct hrtimer hrtimer;
/* /*
* We have multiple paths which can save/restore the timer state * We have multiple paths which can save/restore the timer state onto
* onto the hardware, so we need some way of keeping track of * the hardware, so we need some way of keeping track of where the
* where the latest state is. * latest state is.
*
* loaded == true: State is loaded on the hardware registers.
* loaded == false: State is stored in memory.
*/ */
bool loaded; bool loaded;
/* Virtual offset */ /* Duplicated state from arch_timer.c for convenience */
u64 cntvoff; u32 host_timer_irq;
u32 host_timer_irq_flags;
};
struct timer_map {
struct arch_timer_context *direct_vtimer;
struct arch_timer_context *direct_ptimer;
struct arch_timer_context *emul_ptimer;
}; };
struct arch_timer_cpu { struct arch_timer_cpu {
struct arch_timer_context vtimer; struct arch_timer_context timers[NR_KVM_TIMERS];
struct arch_timer_context ptimer;
/* Background timer used when the guest is not running */ /* Background timer used when the guest is not running */
struct hrtimer bg_timer; struct hrtimer bg_timer;
/* Physical timer emulation */
struct hrtimer phys_timer;
/* Is the timer enabled */ /* Is the timer enabled */
bool enabled; bool enabled;
}; };
...@@ -76,9 +97,6 @@ int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr); ...@@ -76,9 +97,6 @@ int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
bool kvm_timer_is_pending(struct kvm_vcpu *vcpu); bool kvm_timer_is_pending(struct kvm_vcpu *vcpu);
void kvm_timer_schedule(struct kvm_vcpu *vcpu);
void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
u64 kvm_phys_timer_read(void); u64 kvm_phys_timer_read(void);
void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu); void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu);
...@@ -88,7 +106,19 @@ void kvm_timer_init_vhe(void); ...@@ -88,7 +106,19 @@ void kvm_timer_init_vhe(void);
bool kvm_arch_timer_get_input_level(int vintid); bool kvm_arch_timer_get_input_level(int vintid);
#define vcpu_vtimer(v) (&(v)->arch.timer_cpu.vtimer) #define vcpu_timer(v) (&(v)->arch.timer_cpu)
#define vcpu_ptimer(v) (&(v)->arch.timer_cpu.ptimer) #define vcpu_get_timer(v,t) (&vcpu_timer(v)->timers[(t)])
#define vcpu_vtimer(v) (&(v)->arch.timer_cpu.timers[TIMER_VTIMER])
#define vcpu_ptimer(v) (&(v)->arch.timer_cpu.timers[TIMER_PTIMER])
#define arch_timer_ctx_index(ctx) ((ctx) - vcpu_timer((ctx)->vcpu)->timers)
u64 kvm_arm_timer_read_sysreg(struct kvm_vcpu *vcpu,
enum kvm_arch_timers tmr,
enum kvm_arch_timer_regs treg);
void kvm_arm_timer_write_sysreg(struct kvm_vcpu *vcpu,
enum kvm_arch_timers tmr,
enum kvm_arch_timer_regs treg,
u64 val);
#endif #endif
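
The reworked timer header above replaces the fixed vtimer/ptimer pair with an array of contexts plus a timer_map that records which timers are driven directly by hardware and which must be emulated; per the clocksource driver change earlier (arch_timer_populate_kvm_info()), a physical timer interrupt is only recorded when the kernel runs in hyp mode. The following is a minimal, self-contained user-space C sketch of that selection idea only; the types and the has_vhe flag are stand-ins for illustration, not the actual KVM implementation.

/* Hypothetical user-space model of the timer_map selection above;
 * names mirror the header, but nothing here is kernel code. */
#include <stdbool.h>
#include <stdio.h>

enum kvm_arch_timers { TIMER_PTIMER, TIMER_VTIMER, NR_KVM_TIMERS };

struct arch_timer_context {
	const char *name;
};

struct timer_map {
	struct arch_timer_context *direct_vtimer;
	struct arch_timer_context *direct_ptimer;
	struct arch_timer_context *emul_ptimer;
};

/* Assumption: when the kernel runs in hyp mode (VHE) the guest can be
 * handed the physical timer directly; otherwise it is emulated. */
static void get_timer_map(struct arch_timer_context timers[NR_KVM_TIMERS],
			  bool has_vhe, struct timer_map *map)
{
	map->direct_vtimer = &timers[TIMER_VTIMER];
	if (has_vhe) {
		map->direct_ptimer = &timers[TIMER_PTIMER];
		map->emul_ptimer = NULL;
	} else {
		map->direct_ptimer = NULL;
		map->emul_ptimer = &timers[TIMER_PTIMER];
	}
}

int main(void)
{
	struct arch_timer_context timers[NR_KVM_TIMERS] = {
		[TIMER_PTIMER] = { "ptimer" },
		[TIMER_VTIMER] = { "vtimer" },
	};
	struct timer_map map;

	get_timer_map(timers, true, &map);
	printf("VHE:     direct ptimer  = %s\n",
	       map.direct_ptimer ? map.direct_ptimer->name : "none");

	get_timer_map(timers, false, &map);
	printf("non-VHE: emulated ptimer = %s\n",
	       map.emul_ptimer ? map.emul_ptimer->name : "none");
	return 0;
}
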
...@@ -48,6 +48,27 @@ ...@@ -48,6 +48,27 @@
*/ */
#define KVM_MEMSLOT_INVALID (1UL << 16) #define KVM_MEMSLOT_INVALID (1UL << 16)
/*
* Bit 63 of the memslot generation number is an "update in-progress flag",
* e.g. is temporarily set for the duration of install_new_memslots().
* This flag effectively creates a unique generation number that is used to
* mark cached memslot data, e.g. MMIO accesses, as potentially being stale,
* i.e. may (or may not) have come from the previous memslots generation.
*
* This is necessary because the actual memslots update is not atomic with
* respect to the generation number update. Updating the generation number
* first would allow a vCPU to cache a spte from the old memslots using the
* new generation number, and updating the generation number after switching
* to the new memslots would allow cache hits using the old generation number
* to reference the defunct memslots.
*
* This mechanism is used to prevent getting hits in KVM's caches while a
* memslot update is in-progress, and to prevent cache hits *after* updating
* the actual generation number against accesses that were inserted into the
* cache *before* the memslots were updated.
*/
#define KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS BIT_ULL(63)
/* Two fragments for cross MMIO pages. */ /* Two fragments for cross MMIO pages. */
#define KVM_MAX_MMIO_FRAGMENTS 2 #define KVM_MAX_MMIO_FRAGMENTS 2
...@@ -634,7 +655,7 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, ...@@ -634,7 +655,7 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
struct kvm_memory_slot *dont); struct kvm_memory_slot *dont);
int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
unsigned long npages); unsigned long npages);
void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots); void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen);
int kvm_arch_prepare_memory_region(struct kvm *kvm, int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot, struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem, const struct kvm_userspace_memory_region *mem,
...@@ -1182,6 +1203,7 @@ extern bool kvm_rebooting; ...@@ -1182,6 +1203,7 @@ extern bool kvm_rebooting;
extern unsigned int halt_poll_ns; extern unsigned int halt_poll_ns;
extern unsigned int halt_poll_ns_grow; extern unsigned int halt_poll_ns_grow;
extern unsigned int halt_poll_ns_grow_start;
extern unsigned int halt_poll_ns_shrink; extern unsigned int halt_poll_ns_shrink;
struct kvm_device { struct kvm_device {
......
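
The comment added above explains why bit 63 of the memslot generation doubles as an "update in-progress" flag, and the earlier x86.h hunk shows vcpu_cache_mmio_info() refusing to populate the MMIO cache while that bit is set. Below is a small, self-contained user-space C model of that pattern; the cache type and helpers are hypothetical and only illustrate the generation check, they are not KVM code.

/* Minimal model of a generation-tagged cache with an "update in-progress"
 * bit, mirroring the comment above. Hypothetical names; not KVM code. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define GEN_UPDATE_IN_PROGRESS (1ULL << 63)

struct mmio_cache {
	bool     valid;
	uint64_t gen;	/* generation the entry was created under */
	uint64_t gfn;
};

static uint64_t memslots_gen;	/* stands in for memslots->generation */

static void cache_store(struct mmio_cache *c, uint64_t gfn)
{
	uint64_t gen = memslots_gen;

	/* Like vcpu_cache_mmio_info(): never cache while an update is in flight. */
	if (gen & GEN_UPDATE_IN_PROGRESS)
		return;

	c->valid = true;
	c->gen   = gen;
	c->gfn   = gfn;
}

static bool cache_hit(const struct mmio_cache *c, uint64_t gfn)
{
	/* Like vcpu_match_mmio_gen(): stale generations never match. */
	return c->valid && c->gen == memslots_gen && c->gfn == gfn;
}

int main(void)
{
	struct mmio_cache c = { 0 };

	cache_store(&c, 42);
	printf("hit before update: %d\n", cache_hit(&c, 42));	/* 1 */

	/* install_new_memslots()-style update: set the flag, then bump the gen. */
	memslots_gen |= GEN_UPDATE_IN_PROGRESS;
	printf("hit during update: %d\n", cache_hit(&c, 42));	/* 0 */
	cache_store(&c, 42);			/* refused while in progress */

	memslots_gen = (memslots_gen & ~GEN_UPDATE_IN_PROGRESS) + 1;
	printf("hit after update:  %d\n", cache_hit(&c, 42));	/* 0: stale entry */
	return 0;
}
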
...@@ -3,6 +3,7 @@ ...@@ -3,6 +3,7 @@
/x86_64/platform_info_test /x86_64/platform_info_test
/x86_64/set_sregs_test /x86_64/set_sregs_test
/x86_64/sync_regs_test /x86_64/sync_regs_test
/x86_64/vmx_close_while_nested_test
/x86_64/vmx_tsc_adjust_test /x86_64/vmx_tsc_adjust_test
/x86_64/state_test /x86_64/state_test
/dirty_log_test /dirty_log_test
...@@ -16,6 +16,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test ...@@ -16,6 +16,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
TEST_GEN_PROGS_x86_64 += x86_64/state_test TEST_GEN_PROGS_x86_64 += x86_64/state_test
TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test
TEST_GEN_PROGS_x86_64 += dirty_log_test TEST_GEN_PROGS_x86_64 += dirty_log_test
TEST_GEN_PROGS_x86_64 += clear_dirty_log_test TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
......
This diff has been collapsed.
This diff has been collapsed.
...@@ -226,7 +226,7 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu) ...@@ -226,7 +226,7 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
int i; int i;
u32 elrsr; u32 elrsr;
elrsr = read_gicreg(ICH_ELSR_EL2); elrsr = read_gicreg(ICH_ELRSR_EL2);
write_gicreg(cpu_if->vgic_hcr & ~ICH_HCR_EN, ICH_HCR_EL2); write_gicreg(cpu_if->vgic_hcr & ~ICH_HCR_EN, ICH_HCR_EL2);
......
This diff has been collapsed.
This diff has been collapsed.
...@@ -589,7 +589,7 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable); ...@@ -589,7 +589,7 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
*/ */
int vgic_v3_probe(const struct gic_kvm_info *info) int vgic_v3_probe(const struct gic_kvm_info *info)
{ {
u32 ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2); u32 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_ich_vtr_el2);
int ret; int ret;
/* /*
...@@ -679,7 +679,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu) ...@@ -679,7 +679,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
if (likely(cpu_if->vgic_sre)) if (likely(cpu_if->vgic_sre))
cpu_if->vgic_vmcr = kvm_call_hyp(__vgic_v3_read_vmcr); cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
kvm_call_hyp(__vgic_v3_save_aprs, vcpu); kvm_call_hyp(__vgic_v3_save_aprs, vcpu);
......
...@@ -144,7 +144,8 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm, ...@@ -144,7 +144,8 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
if (zone->pio != 1 && zone->pio != 0) if (zone->pio != 1 && zone->pio != 0)
return -EINVAL; return -EINVAL;
dev = kzalloc(sizeof(struct kvm_coalesced_mmio_dev), GFP_KERNEL); dev = kzalloc(sizeof(struct kvm_coalesced_mmio_dev),
GFP_KERNEL_ACCOUNT);
if (!dev) if (!dev)
return -ENOMEM; return -ENOMEM;
......
This diff has been collapsed.
This diff has been collapsed.
This diff has been collapsed.
This diff has been collapsed.