Commit 15303ba5 authored by Linus Torvalds

Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "ARM:

   - icache invalidation optimizations, improving VM startup time

   - support for forwarded level-triggered interrupts, improving
     performance for timers and passthrough platform devices

   - a small fix for power-management notifiers, and some cosmetic
     changes

  PPC:

   - add MMIO emulation for vector loads and stores

   - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
     requiring the complex thread synchronization of older CPU versions

   - improve the handling of escalation interrupts with the XIVE
     interrupt controller

   - support decrement register migration

   - various cleanups and bugfixes.

  s390:

   - Cornelia Huck passed maintainership to Janosch Frank

   - exitless interrupts for emulated devices

   - cleanup of cpuflag handling

   - kvm_stat counter improvements

   - VSIE improvements

   - mm cleanup

  x86:

   - hypervisor part of SEV

   - UMIP, RDPID, and MSR_SMI_COUNT emulation

   - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit

   - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
     AVX512 features

   - show vcpu id in its anonymous inode name

   - many fixes and cleanups

   - per-VCPU MSR bitmaps (already merged through x86/pti branch)

   - stable KVM clock when nesting on Hyper-V (merged through
     x86/hyperv)"

* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
  KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
  KVM: PPC: Book3S HV: Branch inside feature section
  KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
  KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
  KVM: PPC: Book3S PR: Fix broken select due to misspelling
  KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
  KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
  KVM: PPC: Book3S HV: Drop locks before reading guest memory
  kvm: x86: remove efer_reload entry in kvm_vcpu_stat
  KVM: x86: AMD Processor Topology Information
  x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
  kvm: embed vcpu id to dentry of vcpu anon inode
  kvm: Map PFN-type memory regions as writable (if possible)
  x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
  KVM: arm/arm64: Fixup userspace irqchip static key optimization
  KVM: arm/arm64: Fix userspace_irqchip_in_use counting
  KVM: arm/arm64: Fix incorrect timer_is_pending logic
  MAINTAINERS: update KVM/s390 maintainers
  MAINTAINERS: add Halil as additional vfio-ccw maintainer
  MAINTAINERS: add David as a reviewer for KVM/s390
  ...
......@@ -26,3 +26,6 @@ s390-diag.txt
- Diagnose hypercall description (for IBM S/390)
timekeeping.txt
- timekeeping virtualization for x86-based architectures.
amd-memory-encryption.txt
- notes on AMD Secure Encrypted Virtualization feature and SEV firmware
command description
======================================
Secure Encrypted Virtualization (SEV)
======================================
Overview
========
Secure Encrypted Virtualization (SEV) is a feature found on AMD processors.
SEV is an extension to the AMD-V architecture which supports running
virtual machines (VMs) under the control of a hypervisor. When enabled,
the memory contents of a VM will be transparently encrypted with a key
unique to that VM.
The hypervisor can determine SEV support through the CPUID
instruction. CPUID function 0x8000001f reports information related
to SEV::
0x8000001f[eax]:
Bit[1] indicates support for SEV
...
[ecx]:
Bits[31:0] Number of encrypted guests supported simultaneously
If support for SEV is present, MSR 0xc001_0010 (MSR_K8_SYSCFG) and MSR 0xc001_0015
(MSR_K7_HWCR) can be used to determine whether SEV can be enabled::
0xc001_0010:
Bit[23] 1 = memory encryption can be enabled
0 = memory encryption can not be enabled
0xc001_0015:
Bit[0] 1 = memory encryption can be enabled
0 = memory encryption can not be enabled
When SEV support is available, it can be enabled in a specific VM by
setting the SEV bit before executing VMRUN.::
VMCB[0x90]:
Bit[1] 1 = SEV is enabled
0 = SEV is disabled
SEV hardware uses ASIDs to associate a memory encryption key with a VM.
Hence, the ASID for SEV-enabled guests must be from 1 to a maximum value
defined in the CPUID 0x8000001f[ecx] field.
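As a concrete illustration of the detection flow above, here is a minimal
user-space sketch (not part of the original document). It assumes a GCC/Clang
toolchain providing <cpuid.h>; the bit positions follow the description above.
Reading MSR_K8_SYSCFG / MSR_K7_HWCR additionally requires kernel privileges
(e.g. the msr driver), so only the CPUID portion is shown::

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* Extended leaf 0x8000001f reports memory encryption capabilities. */
        if (!__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx)) {
                printf("CPUID leaf 0x8000001f not available\n");
                return 1;
        }

        printf("SEV supported: %s\n", (eax & (1u << 1)) ? "yes" : "no");
        printf("Encrypted guests supported simultaneously: %u\n", ecx);
        return 0;
}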
SEV Key Management
==================
The SEV guest key management is handled by a separate processor called the AMD
Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
key management interface to perform common hypervisor activities such as
encrypting bootstrap code, snapshotting, migrating and debugging the guest. For more
information, see the SEV Key Management spec [api-spec]_.
KVM implements the following commands to support common lifecycle events of SEV
guests, such as launching, running, snapshotting, migrating and decommissioning.
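All of these commands are issued through the KVM_MEMORY_ENCRYPT_OP ioctl
documented later in this series. The following sketch shows how a VMM might
wrap command submission; the struct kvm_sev_cmd layout (id/data/error/sev_fd),
the KVM_SEV_* command ids in <linux/kvm.h>, and the use of an already-open
/dev/sev file descriptor are assumptions based on this patch set rather than
statements from the text above::

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Submit one SEV command for a VM; 'data' points to the command's
 * parameter struct, or is NULL for commands without parameters. */
static int sev_ioctl(int vm_fd, int sev_fd, uint32_t cmd_id, void *data)
{
        struct kvm_sev_cmd cmd = {
                .id = cmd_id,
                .data = (uint64_t)(unsigned long)data,
                .sev_fd = (uint32_t)sev_fd,  /* fd of /dev/sev (assumption) */
        };

        /* On failure, cmd.error is expected to carry the SEV firmware
         * error code (assumption based on this patch set). */
        return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
}

/* For example, initializing the SEV platform context for this VM:
 *
 *     sev_ioctl(vm_fd, sev_fd, KVM_SEV_INIT, NULL);
 */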
1. KVM_SEV_INIT
---------------
The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
context. In a typical workflow, this command should be the first command issued.
Returns: 0 on success, -negative on error
2. KVM_SEV_LAUNCH_START
-----------------------
The KVM_SEV_LAUNCH_START command is used for creating the memory encryption
context. To create the encryption context, the user must provide a guest policy,
the owner's public Diffie-Hellman (PDH) key and session information.
Parameters: struct kvm_sev_launch_start (in/out)
Returns: 0 on success, -negative on error
::
struct kvm_sev_launch_start {
__u32 handle; /* if zero then firmware creates a new handle */
__u32 policy; /* guest's policy */
__u64 dh_uaddr; /* userspace address pointing to the guest owner's PDH key */
__u32 dh_len;
__u64 session_addr; /* userspace address which points to the guest session information */
__u32 session_len;
};
On success, the 'handle' field contains a new handle; on error, a negative value is returned.
For more details, see SEV spec Section 6.2.
3. KVM_SEV_LAUNCH_UPDATE_DATA
-----------------------------
The KVM_SEV_LAUNCH_UPDATE_DATA command is used for encrypting a memory region. It also
calculates a measurement of the memory contents. The measurement is a signature
of the memory contents that can be sent to the guest owner as an attestation
that the memory was encrypted correctly by the firmware.
Parameters (in): struct kvm_sev_launch_update_data
Returns: 0 on success, -negative on error
::
struct kvm_sev_launch_update_data {
__u64 uaddr; /* userspace address to be encrypted (must be 16-byte aligned) */
__u32 len; /* length of the data to be encrypted (must be 16-byte aligned) */
};
For more details, see SEV spec Section 6.3.
4. KVM_SEV_LAUNCH_MEASURE
-------------------------
The KVM_SEV_LAUNCH_MEASURE command is used to retrieve the measurement of the
data encrypted by the KVM_SEV_LAUNCH_UPDATE_DATA command. The guest owner may
wait to provide the guest with confidential information until it can verify the
measurement. Since the guest owner knows the initial contents of the guest at
boot, the measurement can be verified by comparing it to what the guest owner
expects.
Parameters (in): struct kvm_sev_launch_measure
Returns: 0 on success, -negative on error
::
struct kvm_sev_launch_measure {
__u64 uaddr; /* where to copy the measurement */
__u32 len; /* length of measurement blob */
};
For more details on the measurement verification flow, see SEV spec Section 6.4.
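Continuing the hedged sev_ioctl() sketch from earlier, retrieving the
measurement into a caller-provided buffer could look like the following; the
field names mirror the struct above, and the buffer sizing convention is an
assumption::

/* Reuses the sev_ioctl() helper sketched in the introduction above. */
static int sev_get_measurement(int vm_fd, int sev_fd, void *buf, uint32_t buf_len)
{
        struct kvm_sev_launch_measure m = {
                .uaddr = (uint64_t)(unsigned long)buf,  /* where to copy the measurement */
                .len = buf_len,                         /* size of the caller's buffer */
        };

        return sev_ioctl(vm_fd, sev_fd, KVM_SEV_LAUNCH_MEASURE, &m);
}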
5. KVM_SEV_LAUNCH_FINISH
------------------------
After completion of the launch flow, the KVM_SEV_LAUNCH_FINISH command can be
issued to make the guest ready for execution.
Returns: 0 on success, -negative on error
6. KVM_SEV_GUEST_STATUS
-----------------------
The KVM_SEV_GUEST_STATUS command is used to retrieve status information about a
SEV-enabled guest.
Parameters (out): struct kvm_sev_guest_status
Returns: 0 on success, -negative on error
::
struct kvm_sev_guest_status {
__u32 handle; /* guest handle */
__u32 policy; /* guest policy */
__u8 state; /* guest state (see enum below) */
};
SEV guest state:
::
enum {
SEV_STATE_INVALID = 0,
SEV_STATE_LAUNCHING, /* guest is currently being launched */
SEV_STATE_SECRET, /* guest is being launched and ready to accept the ciphertext data */
SEV_STATE_RUNNING, /* guest is fully launched and running */
SEV_STATE_RECEIVING, /* guest is being migrated in from another SEV machine */
SEV_STATE_SENDING /* guest is being migrated out to another SEV machine */
};
7. KVM_SEV_DBG_DECRYPT
----------------------
The KVM_SEV_DBG_DECRYPT command can be used by the hypervisor to request the
firmware to decrypt the data at the given memory region.
Parameters (in): struct kvm_sev_dbg
Returns: 0 on success, -negative on error
::
struct kvm_sev_dbg {
__u64 src_uaddr; /* userspace address of data to decrypt */
__u64 dst_uaddr; /* userspace address of destination */
__u32 len; /* length of memory region to decrypt */
};
The command returns an error if the guest policy does not allow debugging.
8. KVM_SEV_DBG_ENCRYPT
----------------------
The KVM_SEV_DBG_ENCRYPT command can be used by the hypervisor to request the
firmware to encrypt the data at the given memory region.
Parameters (in): struct kvm_sev_dbg
Returns: 0 on success, -negative on error
::
struct kvm_sev_dbg {
__u64 src_uaddr; /* userspace address of data to encrypt */
__u64 dst_uaddr; /* userspace address of destination */
__u32 len; /* length of memory region to encrypt */
};
The command returns an error if the guest policy does not allow debugging.
9. KVM_SEV_LAUNCH_SECRET
------------------------
The KVM_SEV_LAUNCH_SECRET command can be used by the hypervisor to inject secret
data after the measurement has been validated by the guest owner.
Parameters (in): struct kvm_sev_launch_secret
Returns: 0 on success, -negative on error
::
struct kvm_sev_launch_secret {
__u64 hdr_uaddr; /* userspace address containing the packet header */
__u32 hdr_len;
__u64 guest_uaddr; /* the guest memory region where the secret should be injected */
__u32 guest_len;
__u64 trans_uaddr; /* the hypervisor memory region which contains the secret */
__u32 trans_len;
};
References
==========
.. [white-paper] http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
.. [api-spec] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf
.. [amd-apm] http://support.amd.com/TechDocs/24593.pdf (section 15.34)
.. [kvm-forum] http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
......@@ -1841,6 +1841,7 @@ registers, find a list below:
PPC | KVM_REG_PPC_DBSR | 32
PPC | KVM_REG_PPC_TIDR | 64
PPC | KVM_REG_PPC_PSSCR | 64
PPC | KVM_REG_PPC_DEC_EXPIRY | 64
PPC | KVM_REG_PPC_TM_GPR0 | 64
...
PPC | KVM_REG_PPC_TM_GPR31 | 64
......@@ -3403,7 +3404,7 @@ invalid, if invalid pages are written to (e.g. after the end of memory)
or if no page table is present for the addresses (e.g. when using
hugepages).
4.108 KVM_PPC_GET_CPU_CHAR
4.109 KVM_PPC_GET_CPU_CHAR
Capability: KVM_CAP_PPC_GET_CPU_CHAR
Architectures: powerpc
......@@ -3449,6 +3450,57 @@ array bounds check and the array access.
These fields use the same bit definitions as the new
H_GET_CPU_CHARACTERISTICS hypercall.
4.110 KVM_MEMORY_ENCRYPT_OP
Capability: basic
Architectures: x86
Type: system
Parameters: an opaque platform specific structure (in/out)
Returns: 0 on success; -1 on error
If the platform supports creating encrypted VMs then this ioctl can be used
for issuing platform-specific memory encryption commands to manage those
encrypted VMs.
Currently, this ioctl is used for issuing Secure Encrypted Virtualization
(SEV) commands on AMD Processors. The SEV commands are defined in
Documentation/virtual/kvm/amd-memory-encryption.txt.
4.111 KVM_MEMORY_ENCRYPT_REG_REGION
Capability: basic
Architectures: x86
Type: system
Parameters: struct kvm_enc_region (in)
Returns: 0 on success; -1 on error
This ioctl can be used to register a guest memory region which may
contain encrypted data (e.g. guest RAM, SMRAM, etc.).
It is used for SEV-enabled guests. When encryption is enabled, a guest
memory region may contain encrypted data. The SEV memory encryption
engine uses a tweak such that two identical plaintext pages at
different locations will have differing ciphertexts, so swapping or
moving the ciphertext of those pages will not result in the plaintext being
swapped. Relocating (or migrating) the physical backing pages of an SEV
guest therefore requires additional steps.
Note: The current SEV key management spec does not provide commands to
swap or migrate (move) ciphertext pages. Hence, for now we pin the guest
memory region registered with the ioctl.
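A minimal sketch of how a VMM might register guest RAM with this ioctl,
assuming the two-field struct kvm_enc_region (userspace address and size)
added by this series:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Register a userspace memory range that may hold ciphertext so that
 * KVM pins its backing pages (see the note above). */
static int sev_register_encrypted_ram(int vm_fd, void *hva, uint64_t size)
{
        struct kvm_enc_region region = {
                .addr = (uint64_t)(unsigned long)hva,  /* userspace address */
                .size = size,                          /* length in bytes */
        };

        return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_REG_REGION, &region);
}

/* KVM_MEMORY_ENCRYPT_UNREG_REGION (next section) takes the same struct
 * to drop the registration. */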
4.112 KVM_MEMORY_ENCRYPT_UNREG_REGION
Capability: basic
Architectures: x86
Type: system
Parameters: struct kvm_enc_region (in)
Returns: 0 on success; -1 on error
This ioctl can be used to unregister the guest memory region registered
with KVM_MEMORY_ENCRYPT_REG_REGION ioctl above.
5. The kvm_run structure
------------------------
......
KVM/ARM VGIC Forwarded Physical Interrupts
==========================================
The KVM/ARM code implements software support for the ARM Generic
Interrupt Controller's (GIC's) hardware support for virtualization by
allowing software to inject virtual interrupts to a VM, which the guest
OS sees as regular interrupts. The code is famously known as the VGIC.
Some of these virtual interrupts, however, correspond to physical
interrupts from real physical devices. One example could be the
architected timer, which itself supports virtualization, and therefore
lets a guest OS program the hardware device directly to raise an
interrupt at some point in time. When such an interrupt is raised, the
host OS initially handles the interrupt and must somehow signal this
event as a virtual interrupt to the guest. Another example could be a
passthrough device, where the physical interrupts are initially handled
by the host, but the device driver for the device lives in the guest OS
and KVM must therefore somehow inject a virtual interrupt on behalf of
the physical one to the guest OS.
These virtual interrupts corresponding to a physical interrupt on the
host are called forwarded physical interrupts, but are also sometimes
referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
Forwarded physical interrupts are handled slightly differently compared
to virtual interrupts generated purely by a software emulated device.
The HW bit
----------
Virtual interrupts are signalled to the guest by programming the List
Registers (LRs) on the GIC before running a VCPU. The LR is programmed
with the virtual IRQ number and the state of the interrupt (Pending,
Active, or Pending+Active). When the guest ACKs and EOIs a virtual
interrupt, the LR state moves from Pending to Active, and finally to
inactive.
The LRs include an extra bit, called the HW bit. When this bit is set,
KVM must also program an additional field in the LR, the physical IRQ
number, to link the virtual with the physical IRQ.
When the HW bit is set, KVM must EITHER set the Pending OR the Active
bit, never both at the same time.
Setting the HW bit causes the hardware to deactivate the physical
interrupt on the physical distributor when the guest deactivates the
corresponding virtual interrupt.
Forwarded Physical Interrupts Life Cycle
----------------------------------------
The state of forwarded physical interrupts is managed in the following way:
- The physical interrupt is acked by the host, and becomes active on
the physical distributor (*).
- KVM sets the LR.Pending bit, because this is the only way the GICV
interface is going to present it to the guest.
- LR.Pending will stay set as long as the guest has not acked the interrupt.
- LR.Pending transitions to LR.Active on the guest read of the IAR, as
expected.
- On guest EOI, the *physical distributor* active bit gets cleared,
but the LR.Active is left untouched (set).
- KVM clears the LR on VM exits when the physical distributor
active state has been cleared.
(*): The host handling is slightly more complicated. For some forwarded
interrupts (shared), KVM directly sets the active state on the physical
distributor before entering the guest, because the interrupt is never actually
handled on the host (see details on the timer as an example below). For other
forwarded interrupts (non-shared) the host does not deactivate the interrupt
when the host ISR completes, but leaves the interrupt active until the guest
deactivates it. Leaving the interrupt active is allowed, because Linux
configures the physical GIC with EOIMode=1, which causes EOI operations to
perform a priority drop allowing the GIC to receive other interrupts of the
default priority.
Forwarded Edge and Level Triggered PPIs and SPIs
------------------------------------------------
Forwarded physical interrupts should always be active on the
physical distributor when injected to a guest.
Level-triggered interrupts will keep the interrupt line to the GIC
asserted, typically until the guest programs the device to deassert the
line. This means that the interrupt will remain pending on the physical
distributor until the guest has reprogrammed the device. Since we
always run the VM with interrupts enabled on the CPU, a pending
interrupt will exit the guest as soon as we switch into the guest,
preventing the guest from ever making progress as the process repeats
over and over. Therefore, the active state on the physical distributor
must be set when entering the guest, preventing the GIC from forwarding
the pending interrupt to the CPU. As soon as the guest deactivates the
interrupt, the physical line is sampled by the hardware again and the host
takes a new interrupt if and only if the physical line is still asserted.
Edge-triggered interrupts do not exhibit the same problem with
preventing guest execution that level-triggered interrupts do. One
option is to not use the HW bit at all, and inject edge-triggered interrupts
from a physical device as pure virtual interrupts. But that would
potentially slow down handling of the interrupt in the guest, because a
physical interrupt occurring in the middle of the guest ISR would
preempt the guest for the host to handle the interrupt. Additionally,
if you configure the system to handle interrupts on a separate physical
core from that running your VCPU, you still have to interrupt the VCPU
to queue the pending state onto the LR, even though the guest won't use
this information until the guest ISR completes. Therefore, the HW
bit should always be set for forwarded edge-triggered interrupts. With
the HW bit set, the virtual interrupt is injected and additional
physical interrupts occurring before the guest deactivates the interrupt
simply mark the state on the physical distributor as Pending+Active. As
soon as the guest deactivates the interrupt, the host takes another
interrupt if and only if there was a physical interrupt between injecting
the forwarded interrupt to the guest and the guest deactivating the
interrupt.
Consequently, whenever we schedule a VCPU with one or more LRs with the
HW bit set, the interrupt must also be active on the physical
distributor.
Forwarded LPIs
--------------
LPIs, introduced in GICv3, are always edge-triggered and do not have an
active state. They become pending when a device signals them, and as
soon as they are acked by the CPU, they are inactive again.
It therefore doesn't make sense, and is not supported, to set the HW bit
for physical LPIs that are forwarded to a VM as virtual interrupts,
typically virtual SPIs.
For LPIs, there is no other choice than to preempt the VCPU thread if
necessary, and queue the pending state onto the LR.
Putting It Together: The Architected Timer
------------------------------------------
The architected timer is a device that signals interrupts with level
triggered semantics. The timer hardware is directly accessed by VCPUs
which program the timer to fire at some point in time. Each VCPU on a
system programs the timer to fire at different times, and therefore the
hardware is multiplexed between multiple VCPUs. This is implemented by
context-switching the timer state along with each VCPU thread.
However, this means that a scenario like the following is entirely
possible, and in fact, typical:
1. KVM runs the VCPU
2. The guest programs the timer to fire at T+100
3. The guest is idle and calls WFI (wait-for-interrupts)
4. The hardware traps to the host
5. KVM stores the timer state to memory and disables the hardware timer
6. KVM schedules a soft timer to fire in T+(100 - time since step 2)
7. KVM puts the VCPU thread to sleep (on a waitqueue)
8. The soft timer fires, waking up the VCPU thread
9. KVM reprograms the timer hardware with the VCPU's values
10. KVM marks the timer interrupt as active on the physical distributor
11. KVM injects a forwarded physical interrupt to the guest
12. KVM runs the VCPU
Notice that KVM injects a forwarded physical interrupt in step 11 without
the corresponding interrupt having actually fired on the host. That is
exactly why we mark the timer interrupt as active in step 10, because
the active state on the physical distributor is part of the state
belonging to the timer hardware, which is context-switched along with
the VCPU thread.
If the guest does not idle because it is busy, the flow looks like this
instead:
1. KVM runs the VCPU
2. The guest programs the timer to fire at T+100
3. At T+100 the timer fires and a physical IRQ causes the VM to exit
   (note that this initially only traps to EL2 and does not run the host ISR
   until KVM has returned to the host).
4. With interrupts still disabled on the CPU coming back from the guest, KVM
   stores the virtual timer state to memory and disables the virtual hw timer.
5. KVM looks at the timer state (in memory) and injects a forwarded physical
   interrupt because it concludes the timer has expired.
6. KVM marks the timer interrupt as active on the physical distributor
7. KVM enables the timer, enables interrupts, and runs the VCPU
Notice that again the forwarded physical interrupt is injected to the
guest without having actually been handled on the host. In this case the
physical interrupt is never actually seen by the host, because the
timer is disabled upon guest return, and the virtual forwarded interrupt is
injected on the KVM guest entry path.
......@@ -54,6 +54,10 @@ KVM_FEATURE_PV_UNHALT || 7 || guest checks this feature bit
|| || before enabling paravirtualized
|| || spinlock support.
------------------------------------------------------------------------------
KVM_FEATURE_PV_TLB_FLUSH || 9 || guest checks this feature bit
|| || before enabling paravirtualized
|| || tlb flush.
------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
......
......@@ -7748,7 +7748,9 @@ F: arch/powerpc/kernel/kvm*
KERNEL VIRTUAL MACHINE for s390 (KVM/s390)
M: Christian Borntraeger <borntraeger@de.ibm.com>
M: Cornelia Huck <cohuck@redhat.com>
M: Janosch Frank <frankja@linux.vnet.ibm.com>
R: David Hildenbrand <david@redhat.com>
R: Cornelia Huck <cohuck@redhat.com>
L: linux-s390@vger.kernel.org
W: http://www.ibm.com/developerworks/linux/linux390/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
......@@ -12026,6 +12028,7 @@ F: drivers/pci/hotplug/s390_pci_hpc.c
S390 VFIO-CCW DRIVER
M: Cornelia Huck <cohuck@redhat.com>
M: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
M: Halil Pasic <pasic@linux.vnet.ibm.com>
L: linux-s390@vger.kernel.org
L: kvm@vger.kernel.org
S: Supported
......
......@@ -131,7 +131,7 @@ static inline bool mode_has_spsr(struct kvm_vcpu *vcpu)
static inline bool vcpu_mode_priv(struct kvm_vcpu *vcpu)
{
unsigned long cpsr_mode = vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr & MODE_MASK;
return cpsr_mode > USR_MODE;;
return cpsr_mode > USR_MODE;
}
static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
......
......@@ -48,6 +48,8 @@
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
int __attribute_const__ kvm_target_cpu(void);
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
......
......@@ -21,7 +21,6 @@
#include <linux/compiler.h>
#include <linux/kvm_host.h>
#include <asm/cp15.h>
#include <asm/kvm_mmu.h>
#include <asm/vfp.h>
#define __hyp_text __section(.hyp.text) notrace
......@@ -69,6 +68,8 @@
#define HIFAR __ACCESS_CP15(c6, 4, c0, 2)
#define HPFAR __ACCESS_CP15(c6, 4, c0, 4)
#define ICIALLUIS __ACCESS_CP15(c7, 0, c1, 0)
#define BPIALLIS __ACCESS_CP15(c7, 0, c1, 6)
#define ICIMVAU __ACCESS_CP15(c7, 0, c5, 1)
#define ATS1CPR __ACCESS_CP15(c7, 0, c8, 0)
#define TLBIALLIS __ACCESS_CP15(c8, 0, c3, 0)
#define TLBIALL __ACCESS_CP15(c8, 0, c7, 0)
......
......@@ -37,6 +37,8 @@
#include <linux/highmem.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/kvm_hyp.h>
#include <asm/pgalloc.h>
#include <asm/stage2_pgtable.h>
......@@ -83,6 +85,18 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
}
static inline pte_t kvm_s2pte_mkexec(pte_t pte)
{
pte_val(pte) &= ~L_PTE_XN;
return pte;
}
static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
{
pmd_val(pmd) &= ~PMD_SECT_XN;
return pmd;
}
static inline void kvm_set_s2pte_readonly(pte_t *pte)
{
pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
......@@ -93,6 +107,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
}
static inline bool kvm_s2pte_exec(pte_t *pte)
{
return !(pte_val(*pte) & L_PTE_XN);
}
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
{
pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
......@@ -103,6 +122,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
}
static inline bool kvm_s2pmd_exec(pmd_t *pmd)
{
return !(pmd_val(*pmd) & PMD_SECT_XN);
}
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
......@@ -126,10 +150,36 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
return (vcpu_cp15(vcpu, c1_SCTLR) & 0b101) == 0b101;
}
static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
kvm_pfn_t pfn,
unsigned long size)
static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
/*
* Clean the dcache to the Point of Coherency.
*
* We need to do this through a kernel mapping (using the
* user-space mapping has proved to be the wrong
* solution). For that, we need to kmap one page at a time,
* and iterate over the range.
*/
VM_BUG_ON(size & ~PAGE_MASK);
while (size) {
void *va = kmap_atomic_pfn(pfn);
kvm_flush_dcache_to_poc(va, PAGE_SIZE);
size -= PAGE_SIZE;
pfn++;
kunmap_atomic(va);
}
}
static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
unsigned long size)
{
u32 iclsz;
/*
* If we are going to insert an instruction page and the icache is
* either VIPT or PIPT, there is a potential problem where the host
......@@ -141,23 +191,40 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
*
* VIVT caches are tagged using both the ASID and the VMID and doesn't
* need any kind of flushing (DDI 0406C.b - Page B3-1392).
*
* We need to do this through a kernel mapping (using the
* user-space mapping has proved to be the wrong
* solution). For that, we need to kmap one page at a time,
* and iterate over the range.
*/
VM_BUG_ON(size & ~PAGE_MASK);
if (icache_is_vivt_asid_tagged())
return;
if (!icache_is_pipt()) {
/* any kind of VIPT cache */
__flush_icache_all();
return;
}
/*
* CTR IminLine contains Log2 of the number of words in the
* cache line, so we can get the number of words as
* 2 << (IminLine - 1). To get the number of bytes, we
* multiply by 4 (the number of bytes in a 32-bit word), and
* get 4 << (IminLine).
*/
iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
while (size) {
void *va = kmap_atomic_pfn(pfn);
void *end = va + PAGE_SIZE;
void *addr = va;
kvm_flush_dcache_to_poc(va, PAGE_SIZE);
do {
write_sysreg(addr, ICIMVAU);
addr += iclsz;
} while (addr < end);
if (icache_is_pipt())
__cpuc_coherent_user_range((unsigned long)va,
(unsigned long)va + PAGE_SIZE);
dsb(ishst);
isb();
size -= PAGE_SIZE;
pfn++;
......@@ -165,9 +232,11 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
kunmap_atomic(va);
}
if (!icache_is_pipt() && !icache_is_vivt_asid_tagged()) {
/* any kind of VIPT cache */
__flush_icache_all();
/* Check if we need to invalidate the BTB */
if ((read_cpuid_ext(CPUID_EXT_MMFR1) >> 28) != 4) {
write_sysreg(0, BPIALLIS);
dsb(ishst);
isb();
}
}
......
......@@ -102,8 +102,8 @@ extern pgprot_t pgprot_s2_device;
#define PAGE_HYP_EXEC _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY)
#define PAGE_HYP_RO _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY | L_PTE_XN)
#define PAGE_HYP_DEVICE _MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY)
#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY)
#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY | L_PTE_XN)
#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY | L_PTE_XN)
#define __PAGE_NONE __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN | L_PTE_NONE)
#define __PAGE_SHARED __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
......
......@@ -18,6 +18,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
__asm__(".arch_extension virt");
......
......@@ -19,6 +19,7 @@
*/
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
/**
* Flush per-VMID TLBs
......
......@@ -435,6 +435,27 @@ alternative_endif
dsb \domain
.endm
/*
* Macro to perform an instruction cache maintenance for the interval
* [start, end)
*
* start, end: virtual addresses describing the region
* label: A label to branch to on user fault.
* Corrupts: tmp1, tmp2
*/
.macro invalidate_icache_by_line start, end, tmp1, tmp2, label
icache_line_size \tmp1, \tmp2
sub \tmp2, \tmp1, #1
bic \tmp2, \start, \tmp2
9997:
USER(\label, ic ivau, \tmp2) // invalidate I line PoU
add \tmp2, \tmp2, \tmp1
cmp \tmp2, \end
b.lo 9997b
dsb ish
isb
.endm
/*
* reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
*/
......
......@@ -52,6 +52,12 @@
* - start - virtual start address
* - end - virtual end address
*
* invalidate_icache_range(start, end)
*
* Invalidate the I-cache in the region described by start, end.
* - start - virtual start address
* - end - virtual end address
*
* __flush_cache_user_range(start, end)
*
* Ensure coherency between the I-cache and the D-cache in the
......@@ -66,6 +72,7 @@
* - size - region size
*/
extern void flush_icache_range(unsigned long start, unsigned long end);
extern int invalidate_icache_range(unsigned long start, unsigned long end);
extern void __flush_dcache_area(void *addr, size_t len);
extern void __inval_dcache_area(void *addr, size_t len);
extern void __clean_dcache_area_poc(void *addr, size_t len);
......
......@@ -48,6 +48,8 @@
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
int __attribute_const__ kvm_target_cpu(void);
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
......
......@@ -20,7 +20,6 @@
#include <linux/compiler.h>
#include <linux/kvm_host.h>
#include <asm/kvm_mmu.h>
#include <asm/sysreg.h>
#define __hyp_text __section(.hyp.text) notrace
......
......@@ -173,6 +173,18 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
}
static inline pte_t kvm_s2pte_mkexec(pte_t pte)
{
pte_val(pte) &= ~PTE_S2_XN;
return pte;
}
static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
{
pmd_val(pmd) &= ~PMD_S2_XN;
return pmd;
}
static inline void kvm_set_s2pte_readonly(pte_t *pte)
{
pteval_t old_pteval, pteval;
......@@ -191,6 +203,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
}
static inline bool kvm_s2pte_exec(pte_t *pte)
{
return !(pte_val(*pte) & PTE_S2_XN);
}
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
{
kvm_set_s2pte_readonly((pte_t *)pmd);
......@@ -201,6 +218,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
return kvm_s2pte_readonly((pte_t *)pmd);
}
static inline bool kvm_s2pmd_exec(pmd_t *pmd)
{
return !(pmd_val(*pmd) & PMD_S2_XN);
}
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
......@@ -230,21 +252,25 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
}
static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
kvm_pfn_t pfn,
unsigned long size)
static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
{
void *va = page_address(pfn_to_page(pfn));
kvm_flush_dcache_to_poc(va, size);
}
static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
unsigned long size)
{
if (icache_is_aliasing()) {
/* any kind of VIPT cache */
__flush_icache_all();
} else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
flush_icache_range((unsigned long)va,
(unsigned long)va + size);
void *va = page_address(pfn_to_page(pfn));
invalidate_icache_range((unsigned long)va,
(unsigned long)va + size);
}
}
......
......@@ -187,9 +187,11 @@
*/
#define PTE_S2_RDONLY (_AT(pteval_t, 1) << 6) /* HAP[2:1] */
#define PTE_S2_RDWR (_AT(pteval_t, 3) << 6) /* HAP[2:1] */
#define PTE_S2_XN (_AT(pteval_t, 2) << 53) /* XN[1:0] */
#define PMD_S2_RDONLY (_AT(pmdval_t, 1) << 6) /* HAP[2:1] */
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
#define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
/*
* Memory Attribute override for Stage-2 (MemAttr[3:0])
......
......@@ -67,8 +67,8 @@
#define PAGE_HYP_RO __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
#define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
#define PAGE_S2 __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY)
#define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
#define PAGE_S2 __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
#define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
......
......@@ -361,10 +361,16 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
int ret = 0;
vcpu_load(vcpu);
trace_kvm_set_guest_debug(vcpu, dbg->control);
if (dbg->control & ~KVM_GUESTDBG_VALID_MASK)
return -EINVAL;
if (dbg->control & ~KVM_GUESTDBG_VALID_MASK) {
ret = -EINVAL;
goto out;
}
if (dbg->control & KVM_GUESTDBG_ENABLE) {
vcpu->guest_debug = dbg->control;
......@@ -378,7 +384,10 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
/* If not enabled clear all flags */
vcpu->guest_debug = 0;
}
return 0;
out:
vcpu_put(vcpu);
return ret;
}
int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
......
......@@ -21,6 +21,7 @@
#include <asm/debug-monitors.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#define read_debug(r,n) read_sysreg(r##n##_el1)
#define write_debug(v,r,n) write_sysreg(v, r##n##_el1)
......
......@@ -24,6 +24,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#include <asm/fpsimd.h>
#include <asm/debug-monitors.h>
......
......@@ -16,6 +16,7 @@
*/
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#include <asm/tlbflush.h>
static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm)
......
......@@ -60,16 +60,7 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
b.lo 1b
dsb ish
icache_line_size x2, x3
sub x3, x2, #1
bic x4, x0, x3
1:
USER(9f, ic ivau, x4 ) // invalidate I line PoU
add x4, x4, x2
cmp x4, x1
b.lo 1b
dsb ish
isb
invalidate_icache_by_line x0, x1, x2, x3, 9f
mov x0, #0
1:
uaccess_ttbr0_disable x1, x2
......@@ -80,6 +71,27 @@ USER(9f, ic ivau, x4 ) // invalidate I line PoU
ENDPROC(flush_icache_range)
ENDPROC(__flush_cache_user_range)
/*
* invalidate_icache_range(start,end)
*
* Ensure that the I cache is invalid within specified region.
*
* - start - virtual start address of region
* - end - virtual end address of region
*/
ENTRY(invalidate_icache_range)
uaccess_ttbr0_enable x2, x3, x4
invalidate_icache_by_line x0, x1, x2, x3, 2f
mov x0, xzr
1:
uaccess_ttbr0_disable x1, x2
ret
2:
mov x0, #-EFAULT
b 1b
ENDPROC(invalidate_icache_range)
/*
* __flush_dcache_area(kaddr, size)
*
......
......@@ -22,6 +22,7 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_MMIO
select MMU_NOTIFIER
select SRCU
......
......@@ -446,6 +446,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
int r = -EINTR;
vcpu_load(vcpu);
kvm_sigset_activate(vcpu);
if (vcpu->mmio_needed) {
......@@ -480,6 +482,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
out:
kvm_sigset_deactivate(vcpu);
vcpu_put(vcpu);
return r;
}
......@@ -900,6 +903,26 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
return r;
}
long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
unsigned long arg)
{
struct kvm_vcpu *vcpu = filp->private_data;
void __user *argp = (void __user *)arg;
if (ioctl == KVM_INTERRUPT) {
struct kvm_mips_interrupt irq;
if (copy_from_user(&irq, argp, sizeof(irq)))
return -EFAULT;
kvm_debug("[%d] %s: irq: %d\n", vcpu->vcpu_id, __func__,
irq.irq);
return kvm_vcpu_ioctl_interrupt(vcpu, &irq);
}
return -ENOIOCTLCMD;
}
long kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl,
unsigned long arg)
{
......@@ -907,56 +930,54 @@ long kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl,
void __user *argp = (void __user *)arg;
long r;
vcpu_load(vcpu);
switch (ioctl) {
case KVM_SET_ONE_REG:
case KVM_GET_ONE_REG: {
struct kvm_one_reg reg;
r = -EFAULT;
if (copy_from_user(&reg, argp, sizeof(reg)))
return -EFAULT;
break;
if (ioctl == KVM_SET_ONE_REG)
return kvm_mips_set_reg(vcpu, &reg);
r = kvm_mips_set_reg(vcpu, &reg);
else
return kvm_mips_get_reg(vcpu, &reg);
r = kvm_mips_get_reg(vcpu, &reg);
break;
}
case KVM_GET_REG_LIST: {
struct kvm_reg_list __user *user_list = argp;
struct kvm_reg_list reg_list;
unsigned n;
r = -EFAULT;
if (copy_from_user(&reg_list, user_list, sizeof(reg_list)))
return -EFAULT;
break;
n = reg_list.n;
reg_list.n = kvm_mips_num_regs(vcpu);
if (copy_to_user(user_list, &reg_list, sizeof(reg_list)))
return -EFAULT;
break;
r = -E2BIG;
if (n < reg_list.n)
return -E2BIG;
return kvm_mips_copy_reg_indices(vcpu, user_list->reg);
}
case KVM_INTERRUPT:
{
struct kvm_mips_interrupt irq;
if (copy_from_user(&irq, argp, sizeof(irq)))
return -EFAULT;
kvm_debug("[%d] %s: irq: %d\n", vcpu->vcpu_id, __func__,
irq.irq);
r = kvm_vcpu_ioctl_interrupt(vcpu, &irq);
break;
}
r = kvm_mips_copy_reg_indices(vcpu, user_list->reg);
break;
}
case KVM_ENABLE_CAP: {
struct kvm_enable_cap cap;
r = -EFAULT;
if (copy_from_user(&cap, argp, sizeof(cap)))
return -EFAULT;
break;
r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
break;
}
default:
r = -ENOIOCTLCMD;
}
vcpu_put(vcpu);
return r;
}
......@@ -1145,6 +1166,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
vcpu_load(vcpu);
for (i = 1; i < ARRAY_SIZE(vcpu->arch.gprs); i++)
vcpu->arch.gprs[i] = regs->gpr[i];
vcpu->arch.gprs[0] = 0; /* zero is special, and cannot be set. */
......@@ -1152,6 +1175,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
vcpu->arch.lo = regs->lo;
vcpu->arch.pc = regs->pc;
vcpu_put(vcpu);
return 0;
}
......@@ -1159,6 +1183,8 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
vcpu_load(vcpu);
for (i = 0; i < ARRAY_SIZE(vcpu->arch.gprs); i++)
regs->gpr[i] = vcpu->arch.gprs[i];
......@@ -1166,6 +1192,7 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
regs->lo = vcpu->arch.lo;
regs->pc = vcpu->arch.pc;
vcpu_put(vcpu);
return 0;
}
......
......@@ -249,10 +249,8 @@ extern int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd);
extern void kvmppc_pr_init_default_hcalls(struct kvm *kvm);
extern int kvmppc_hcall_impl_pr(unsigned long cmd);
extern int kvmppc_hcall_impl_hv_realmode(unsigned long cmd);
extern void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
struct kvm_vcpu *vcpu);
extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
struct kvmppc_book3s_shadow_vcpu *svcpu);
extern void kvmppc_copy_to_svcpu(struct kvm_vcpu *vcpu);
extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu);
extern int kvm_irq_bypass;
static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
......
......@@ -122,13 +122,13 @@ static inline int kvmppc_hpte_page_shifts(unsigned long h, unsigned long l)
lphi = (l >> 16) & 0xf;
switch ((l >> 12) & 0xf) {
case 0:
return !lphi ? 24 : -1; /* 16MB */
return !lphi ? 24 : 0; /* 16MB */
break;
case 1:
return 16; /* 64kB */
break;
case 3:
return !lphi ? 34 : -1; /* 16GB */
return !lphi ? 34 : 0; /* 16GB */
break;
case 7:
return (16 << 8) + 12; /* 64kB in 4kB */
......@@ -140,7 +140,7 @@ static inline int kvmppc_hpte_page_shifts(unsigned long h, unsigned long l)
return (24 << 8) + 12; /* 16MB in 4kB */
break;
}
return -1;
return 0;
}
static inline int kvmppc_hpte_base_page_shift(unsigned long h, unsigned long l)
......@@ -159,7 +159,11 @@ static inline int kvmppc_hpte_actual_page_shift(unsigned long h, unsigned long l
static inline unsigned long kvmppc_actual_pgsz(unsigned long v, unsigned long r)
{
return 1ul << kvmppc_hpte_actual_page_shift(v, r);
int shift = kvmppc_hpte_actual_page_shift(v, r);
if (shift)
return 1ul << shift;
return 0;
}
static inline int kvmppc_pgsize_lp_encoding(int base_shift, int actual_shift)
......@@ -232,7 +236,7 @@ static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r,
va_low ^= v >> (SID_SHIFT_1T - 16);
va_low &= 0x7ff;
if (b_pgshift == 12) {
if (b_pgshift <= 12) {
if (a_pgshift > 12) {
sllp = (a_pgshift == 16) ? 5 : 4;
rb |= sllp << 5; /* AP field */
......
......@@ -690,6 +690,7 @@ struct kvm_vcpu_arch {
u8 mmio_vsx_offset;
u8 mmio_vsx_copy_type;
u8 mmio_vsx_tx_sx_enabled;
u8 mmio_vmx_copy_nums;
u8 osi_needed;
u8 osi_enabled;
u8 papr_enabled;
......@@ -709,6 +710,7 @@ struct kvm_vcpu_arch {
u8 ceded;
u8 prodded;
u8 doorbell_request;
u8 irq_pending; /* Used by XIVE to signal pending guest irqs */
u32 last_inst;
struct swait_queue_head *wqp;
......@@ -738,8 +740,11 @@ struct kvm_vcpu_arch {
struct kvmppc_icp *icp; /* XICS presentation controller */
struct kvmppc_xive_vcpu *xive_vcpu; /* XIVE virtual CPU data */
__be32 xive_cam_word; /* Cooked W2 in proper endian with valid bit */
u32 xive_pushed; /* Is the VP pushed on the physical CPU ? */
u8 xive_pushed; /* Is the VP pushed on the physical CPU ? */
u8 xive_esc_on; /* Is the escalation irq enabled ? */
union xive_tma_w01 xive_saved_state; /* W0..1 of XIVE thread state */
u64 xive_esc_raddr; /* Escalation interrupt ESB real addr */
u64 xive_esc_vaddr; /* Escalation interrupt ESB virt addr */
#endif
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
......@@ -800,6 +805,7 @@ struct kvm_vcpu_arch {
#define KVM_MMIO_REG_QPR 0x0040
#define KVM_MMIO_REG_FQPR 0x0060
#define KVM_MMIO_REG_VSX 0x0080
#define KVM_MMIO_REG_VMX 0x00c0
#define __KVM_HAVE_ARCH_WQP
#define __KVM_HAVE_CREATE_DEVICE
......
......@@ -81,6 +81,10 @@ extern int kvmppc_handle_loads(struct kvm_run *run, struct kvm_vcpu *vcpu,
extern int kvmppc_handle_vsx_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int rt, unsigned int bytes,
int is_default_endian, int mmio_sign_extend);
extern int kvmppc_handle_load128_by2x64(struct kvm_run *run,
struct kvm_vcpu *vcpu, unsigned int rt, int is_default_endian);
extern int kvmppc_handle_store128_by2x64(struct kvm_run *run,
struct kvm_vcpu *vcpu, unsigned int rs, int is_default_endian);
extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
u64 val, unsigned int bytes,
int is_default_endian);
......
......@@ -1076,6 +1076,7 @@ enum {
/* Flags for OPAL_XIVE_GET/SET_VP_INFO */
enum {
OPAL_XIVE_VP_ENABLED = 0x00000001,
OPAL_XIVE_VP_SINGLE_ESCALATION = 0x00000002,
};
/* "Any chip" replacement for chip ID for allocation functions */
......
......@@ -156,6 +156,12 @@
#define OP_31_XOP_LFDX 599
#define OP_31_XOP_LFDUX 631
/* VMX Vector Load Instructions */
#define OP_31_XOP_LVX 103
/* VMX Vector Store Instructions */
#define OP_31_XOP_STVX 231
#define OP_LWZ 32
#define OP_STFS 52
#define OP_STFSU 53
......
......@@ -111,9 +111,10 @@ extern void xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio);
extern void xive_native_sync_source(u32 hw_irq);
extern bool is_xive_irq(struct irq_chip *chip);
extern int xive_native_enable_vp(u32 vp_id);
extern int xive_native_enable_vp(u32 vp_id, bool single_escalation);
extern int xive_native_disable_vp(u32 vp_id);
extern int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id);
extern bool xive_native_has_single_escalation(void);
#else
......
......@@ -632,6 +632,8 @@ struct kvm_ppc_cpu_char {
#define KVM_REG_PPC_TIDR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbc)
#define KVM_REG_PPC_PSSCR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbd)
#define KVM_REG_PPC_DEC_EXPIRY (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbe)
/* Transactional Memory checkpointed state:
* This is all GPRs, all VSX regs and a subset of SPRs
*/
......
......@@ -520,6 +520,7 @@ int main(void)
OFFSET(VCPU_PENDING_EXC, kvm_vcpu, arch.pending_exceptions);
OFFSET(VCPU_CEDED, kvm_vcpu, arch.ceded);
OFFSET(VCPU_PRODDED, kvm_vcpu, arch.prodded);
OFFSET(VCPU_IRQ_PENDING, kvm_vcpu, arch.irq_pending);
OFFSET(VCPU_DBELL_REQ, kvm_vcpu, arch.doorbell_request);
OFFSET(VCPU_MMCR, kvm_vcpu, arch.mmcr);
OFFSET(VCPU_PMC, kvm_vcpu, arch.pmc);
......@@ -739,6 +740,9 @@ int main(void)
DEFINE(VCPU_XIVE_CAM_WORD, offsetof(struct kvm_vcpu,
arch.xive_cam_word));
DEFINE(VCPU_XIVE_PUSHED, offsetof(struct kvm_vcpu, arch.xive_pushed));
DEFINE(VCPU_XIVE_ESC_ON, offsetof(struct kvm_vcpu, arch.xive_esc_on));
DEFINE(VCPU_XIVE_ESC_RADDR, offsetof(struct kvm_vcpu, arch.xive_esc_raddr));
DEFINE(VCPU_XIVE_ESC_VADDR, offsetof(struct kvm_vcpu, arch.xive_esc_vaddr));
#endif
#ifdef CONFIG_KVM_EXIT_TIMING
......
......@@ -22,6 +22,7 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_EVENTFD
select HAVE_KVM_VCPU_ASYNC_IOCTL
select SRCU
select KVM_VFIO
select IRQ_BYPASS_MANAGER
......@@ -68,7 +69,7 @@ config KVM_BOOK3S_64
select KVM_BOOK3S_64_HANDLER
select KVM
select KVM_BOOK3S_PR_POSSIBLE if !KVM_BOOK3S_HV_POSSIBLE
select SPAPR_TCE_IOMMU if IOMMU_SUPPORT && (PPC_SERIES || PPC_POWERNV)
select SPAPR_TCE_IOMMU if IOMMU_SUPPORT && (PPC_PSERIES || PPC_POWERNV)
---help---
Support running unmodified book3s_64 and book3s_32 guest kernels
in virtual machines on book3s_64 host processors.
......
......@@ -484,19 +484,33 @@ void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu)
int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
return vcpu->kvm->arch.kvm_ops->get_sregs(vcpu, sregs);
int ret;
vcpu_load(vcpu);
ret = vcpu->kvm->arch.kvm_ops->get_sregs(vcpu, sregs);
vcpu_put(vcpu);
return ret;
}
int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
return vcpu->kvm->arch.kvm_ops->set_sregs(vcpu, sregs);
int ret;
vcpu_load(vcpu);
ret = vcpu->kvm->arch.kvm_ops->set_sregs(vcpu, sregs);
vcpu_put(vcpu);
return ret;
}
int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
vcpu_load(vcpu);
regs->pc = kvmppc_get_pc(vcpu);
regs->cr = kvmppc_get_cr(vcpu);
regs->ctr = kvmppc_get_ctr(vcpu);
......@@ -518,6 +532,7 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
vcpu_put(vcpu);
return 0;
}
......@@ -525,6 +540,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
vcpu_load(vcpu);
kvmppc_set_pc(vcpu, regs->pc);
kvmppc_set_cr(vcpu, regs->cr);
kvmppc_set_ctr(vcpu, regs->ctr);
......@@ -545,6 +562,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
kvmppc_set_gpr(vcpu, i, regs->gpr[i]);
vcpu_put(vcpu);
return 0;
}
......@@ -737,7 +755,9 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
vcpu_load(vcpu);
vcpu->guest_debug = dbg->control;
vcpu_put(vcpu);
return 0;
}
......
......@@ -1269,6 +1269,11 @@ static unsigned long resize_hpt_rehash_hpte(struct kvm_resize_hpt *resize,
/* Nothing to do */
goto out;
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
rpte = be64_to_cpu(hptep[1]);
vpte = hpte_new_to_old_v(vpte, rpte);
}
/* Unmap */
rev = &old->rev[idx];
guest_rpte = rev->guest_rpte;
......@@ -1298,7 +1303,6 @@ static unsigned long resize_hpt_rehash_hpte(struct kvm_resize_hpt *resize,
/* Reload PTE after unmap */
vpte = be64_to_cpu(hptep[0]);
BUG_ON(vpte & HPTE_V_VALID);
BUG_ON(!(vpte & HPTE_V_ABSENT));
......@@ -1307,6 +1311,12 @@ static unsigned long resize_hpt_rehash_hpte(struct kvm_resize_hpt *resize,
goto out;
rpte = be64_to_cpu(hptep[1]);
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
vpte = hpte_new_to_old_v(vpte, rpte);
rpte = hpte_new_to_old_r(rpte);
}
pshift = kvmppc_hpte_base_page_shift(vpte, rpte);
avpn = HPTE_V_AVPN_VAL(vpte) & ~(((1ul << pshift) - 1) >> 23);
pteg = idx / HPTES_PER_GROUP;
......@@ -1337,17 +1347,17 @@ static unsigned long resize_hpt_rehash_hpte(struct kvm_resize_hpt *resize,
}
new_pteg = hash & new_hash_mask;
if (vpte & HPTE_V_SECONDARY) {
BUG_ON(~pteg != (hash & old_hash_mask));
new_pteg = ~new_pteg;
} else {
BUG_ON(pteg != (hash & old_hash_mask));
}
if (vpte & HPTE_V_SECONDARY)
new_pteg = ~hash & new_hash_mask;
new_idx = new_pteg * HPTES_PER_GROUP + (idx % HPTES_PER_GROUP);
new_hptep = (__be64 *)(new->virt + (new_idx << 4));
replace_vpte = be64_to_cpu(new_hptep[0]);
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
unsigned long replace_rpte = be64_to_cpu(new_hptep[1]);
replace_vpte = hpte_new_to_old_v(replace_vpte, replace_rpte);
}
if (replace_vpte & (HPTE_V_VALID | HPTE_V_ABSENT)) {
BUG_ON(new->order >= old->order);
......@@ -1363,6 +1373,11 @@ static unsigned long resize_hpt_rehash_hpte(struct kvm_resize_hpt *resize,
/* Discard the previous HPTE */
}
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
rpte = hpte_old_to_new_r(vpte, rpte);
vpte = hpte_old_to_new_v(vpte);
}
new_hptep[1] = cpu_to_be64(rpte);
new->rev[new_idx].guest_rpte = guest_rpte;
/* No need for a barrier, since new HPT isn't active */
......@@ -1380,12 +1395,6 @@ static int resize_hpt_rehash(struct kvm_resize_hpt *resize)
unsigned long i;
int rc;
/*
* resize_hpt_rehash_hpte() doesn't handle the new-format HPTEs
* that POWER9 uses, and could well hit a BUG_ON on POWER9.
*/
if (cpu_has_feature(CPU_FTR_ARCH_300))
return -EIO;
for (i = 0; i < kvmppc_hpt_npte(&kvm->arch.hpt); i++) {
rc = resize_hpt_rehash_hpte(resize, i);
if (rc != 0)
......@@ -1416,6 +1425,9 @@ static void resize_hpt_pivot(struct kvm_resize_hpt *resize)
synchronize_srcu_expedited(&kvm->srcu);
if (cpu_has_feature(CPU_FTR_ARCH_300))
kvmppc_setup_partition_table(kvm);
resize_hpt_debug(resize, "resize_hpt_pivot() done\n");
}
......
......@@ -573,7 +573,7 @@ long kvmppc_hv_get_dirty_log_radix(struct kvm *kvm,
j = i + 1;
if (npages) {
set_dirty_bits(map, i, npages);
i = j + npages;
j = i + npages;
}
}
return 0;
......
......@@ -116,6 +116,9 @@ module_param_cb(h_ipi_redirect, &module_param_ops, &h_ipi_redirect, 0644);
MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup to a free host core");
#endif
/* If set, the threads on each CPU core have to be in the same MMU mode */
static bool no_mixing_hpt_and_radix;
static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
......@@ -1003,8 +1006,6 @@ static int kvmppc_emulate_doorbell_instr(struct kvm_vcpu *vcpu)
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tvcpu;
if (!cpu_has_feature(CPU_FTR_ARCH_300))
return EMULATE_FAIL;
if (kvmppc_get_last_inst(vcpu, INST_GENERIC, &inst) != EMULATE_DONE)
return RESUME_GUEST;
if (get_op(inst) != 31)
......@@ -1054,6 +1055,7 @@ static int kvmppc_emulate_doorbell_instr(struct kvm_vcpu *vcpu)
return RESUME_GUEST;
}
/* Called with vcpu->arch.vcore->lock held */
static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
struct task_struct *tsk)
{
......@@ -1174,7 +1176,10 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
swab32(vcpu->arch.emul_inst) :
vcpu->arch.emul_inst;
if (vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP) {
/* Need vcore unlocked to call kvmppc_get_last_inst */
spin_unlock(&vcpu->arch.vcore->lock);
r = kvmppc_emulate_debug_inst(run, vcpu);
spin_lock(&vcpu->arch.vcore->lock);
} else {
kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
r = RESUME_GUEST;
......@@ -1189,8 +1194,13 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
*/
case BOOK3S_INTERRUPT_H_FAC_UNAVAIL:
r = EMULATE_FAIL;
if ((vcpu->arch.hfscr >> 56) == FSCR_MSGP_LG)
if (((vcpu->arch.hfscr >> 56) == FSCR_MSGP_LG) &&
cpu_has_feature(CPU_FTR_ARCH_300)) {
/* Need vcore unlocked to call kvmppc_get_last_inst */
spin_unlock(&vcpu->arch.vcore->lock);
r = kvmppc_emulate_doorbell_instr(vcpu);
spin_lock(&vcpu->arch.vcore->lock);
}
if (r == EMULATE_FAIL) {
kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
r = RESUME_GUEST;
......@@ -1495,6 +1505,10 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
case KVM_REG_PPC_ARCH_COMPAT:
*val = get_reg_val(id, vcpu->arch.vcore->arch_compat);
break;
case KVM_REG_PPC_DEC_EXPIRY:
*val = get_reg_val(id, vcpu->arch.dec_expires +
vcpu->arch.vcore->tb_offset);
break;
default:
r = -EINVAL;
break;
......@@ -1722,6 +1736,10 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
case KVM_REG_PPC_ARCH_COMPAT:
r = kvmppc_set_arch_compat(vcpu, set_reg_val(id, *val));
break;
case KVM_REG_PPC_DEC_EXPIRY:
vcpu->arch.dec_expires = set_reg_val(id, *val) -
vcpu->arch.vcore->tb_offset;
break;
default:
r = -EINVAL;
break;
......@@ -2376,8 +2394,8 @@ static void init_core_info(struct core_info *cip, struct kvmppc_vcore *vc)
static bool subcore_config_ok(int n_subcores, int n_threads)
{
/*
* POWER9 "SMT4" cores are permanently in what is effectively a 4-way split-core
* mode, with one thread per subcore.
* POWER9 "SMT4" cores are permanently in what is effectively a 4-way
* split-core mode, with one thread per subcore.
*/
if (cpu_has_feature(CPU_FTR_ARCH_300))
return n_subcores <= 4 && n_threads == 1;
......@@ -2413,8 +2431,8 @@ static bool can_dynamic_split(struct kvmppc_vcore *vc, struct core_info *cip)
if (!cpu_has_feature(CPU_FTR_ARCH_207S))
return false;
/* POWER9 currently requires all threads to be in the same MMU mode */
if (cpu_has_feature(CPU_FTR_ARCH_300) &&
/* Some POWER9 chips require all threads to be in the same MMU mode */
if (no_mixing_hpt_and_radix &&
kvm_is_radix(vc->kvm) != kvm_is_radix(cip->vc[0]->kvm))
return false;
......@@ -2677,9 +2695,11 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
* threads are offline. Also check if the number of threads in this
* guest are greater than the current system threads per guest.
* On POWER9, we need to be not in independent-threads mode if
* this is a HPT guest on a radix host.
* this is a HPT guest on a radix host machine where the
* CPU threads may not be in different MMU modes.
*/
hpt_on_radix = radix_enabled() && !kvm_is_radix(vc->kvm);
hpt_on_radix = no_mixing_hpt_and_radix && radix_enabled() &&
!kvm_is_radix(vc->kvm);
if (((controlled_threads > 1) &&
((vc->num_threads > threads_per_subcore) || !on_primary_thread())) ||
(hpt_on_radix && vc->kvm->arch.threads_indep)) {
......@@ -2829,7 +2849,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
*/
if (!thr0_done)
kvmppc_start_thread(NULL, pvc);
thr += pvc->num_threads;
}
/*
......@@ -2932,13 +2951,14 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
/* make sure updates to secondary vcpu structs are visible now */
smp_mb();
preempt_enable();
for (sub = 0; sub < core_info.n_subcores; ++sub) {
pvc = core_info.vc[sub];
post_guest_process(pvc, pvc == vc);
}
spin_lock(&vc->lock);
preempt_enable();
out:
vc->vcore_state = VCORE_INACTIVE;
......@@ -2985,7 +3005,7 @@ static inline bool xive_interrupt_pending(struct kvm_vcpu *vcpu)
{
if (!xive_enabled())
return false;
return vcpu->arch.xive_saved_state.pipr <
return vcpu->arch.irq_pending || vcpu->arch.xive_saved_state.pipr <
vcpu->arch.xive_saved_state.cppr;
}
#else
......@@ -3174,17 +3194,8 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
* this thread straight away and have it join in.
*/
if (!signal_pending(current)) {
if (vc->vcore_state == VCORE_PIGGYBACK) {
if (spin_trylock(&vc->lock)) {
if (vc->vcore_state == VCORE_RUNNING &&
!VCORE_IS_EXITING(vc)) {
kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu, vc);
trace_kvm_guest_enter(vcpu);
}
spin_unlock(&vc->lock);
}
} else if (vc->vcore_state == VCORE_RUNNING &&
if ((vc->vcore_state == VCORE_PIGGYBACK ||
vc->vcore_state == VCORE_RUNNING) &&
!VCORE_IS_EXITING(vc)) {
kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu, vc);
......@@ -4446,6 +4457,19 @@ static int kvmppc_book3s_init_hv(void)
if (kvmppc_radix_possible())
r = kvmppc_radix_init();
/*
* POWER9 chips before version 2.02 can't have some threads in
* HPT mode and some in radix mode on the same core.
*/
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
unsigned int pvr = mfspr(SPRN_PVR);
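/*
 * Bits 0xe000 of the PVR distinguish the POWER9 chip variants and
 * the low 12 bits give the DD level, hence the two cutoffs below.
 */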
if ((pvr >> 16) == PVR_POWER9 &&
(((pvr & 0xe000) == 0 && (pvr & 0xfff) < 0x202) ||
((pvr & 0xe000) == 0x2000 && (pvr & 0xfff) < 0x101)))
no_mixing_hpt_and_radix = true;
}
return r;
}
......
......@@ -413,10 +413,11 @@ FTR_SECTION_ELSE
/* On P9 we use the split_info for coordinating LPCR changes */
lwz r4, KVM_SPLIT_DO_SET(r6)
cmpwi r4, 0
beq 63f
beq 1f
mr r3, r6
bl kvmhv_p9_set_lpcr
nop
1:
ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
63:
/* Order load of vcpu after load of vcore */
......@@ -617,13 +618,6 @@ kvmppc_hv_entry:
lbz r0, KVM_RADIX(r9)
cmpwi cr7, r0, 0
/* Clear out SLB if hash */
bne cr7, 2f
li r6,0
slbmte r6,r6
slbia
ptesync
2:
/*
* POWER7/POWER8 host -> guest partition switch code.
* We don't have to lock against concurrent tlbies,
......@@ -738,19 +732,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
10: cmpdi r4, 0
beq kvmppc_primary_no_guest
kvmppc_got_guest:
/* Load up guest SLB entries (N.B. slb_max will be 0 for radix) */
lwz r5,VCPU_SLB_MAX(r4)
cmpwi r5,0
beq 9f
mtctr r5
addi r6,r4,VCPU_SLB
1: ld r8,VCPU_SLB_E(r6)
ld r9,VCPU_SLB_V(r6)
slbmte r9,r8
addi r6,r6,VCPU_SLB_SIZE
bdnz 1b
9:
/* Increment yield count if they have a VPA */
ld r3, VCPU_VPA(r4)
cmpdi r3, 0
......@@ -957,7 +938,6 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
mftb r7
subf r3,r7,r8
mtspr SPRN_DEC,r3
std r3,VCPU_DEC(r4)
ld r5, VCPU_SPRG0(r4)
ld r6, VCPU_SPRG1(r4)
......@@ -1018,6 +998,29 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
cmpdi r3, 512 /* 1 microsecond */
blt hdec_soon
/* For hash guest, clear out and reload the SLB */
ld r6, VCPU_KVM(r4)
lbz r0, KVM_RADIX(r6)
cmpwi r0, 0
bne 9f
li r6, 0
slbmte r6, r6
slbia
ptesync
/* Load up guest SLB entries (N.B. slb_max will be 0 for radix) */
lwz r5,VCPU_SLB_MAX(r4)
cmpwi r5,0
beq 9f
mtctr r5
addi r6,r4,VCPU_SLB
1: ld r8,VCPU_SLB_E(r6)
ld r9,VCPU_SLB_V(r6)
slbmte r9,r8
addi r6,r6,VCPU_SLB_SIZE
bdnz 1b
9:
#ifdef CONFIG_KVM_XICS
/* We are entering the guest on that thread, push VCPU to XIVE */
ld r10, HSTATE_XIVE_TIMA_PHYS(r13)
......@@ -1031,8 +1034,53 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
li r9, TM_QW1_OS + TM_WORD2
stwcix r11,r9,r10
li r9, 1
stw r9, VCPU_XIVE_PUSHED(r4)
stb r9, VCPU_XIVE_PUSHED(r4)
eieio
/*
* We clear the irq_pending flag. There is a small chance of a
* race vs. the escalation interrupt happening on another
* processor setting it again, but the only consequence is to
* cause a spurious wakeup on the next H_CEDE, which is not an
* issue.
*/
li r0,0
stb r0, VCPU_IRQ_PENDING(r4)
/*
* In single escalation mode, if the escalation interrupt is
* on, we mask it.
*/
lbz r0, VCPU_XIVE_ESC_ON(r4)
cmpwi r0,0
beq 1f
ld r10, VCPU_XIVE_ESC_RADDR(r4)
li r9, XIVE_ESB_SET_PQ_01
ldcix r0, r10, r9
sync
/* We have a possible subtle race here: The escalation interrupt might
* have fired and be on its way to the host queue while we mask it,
* and if we unmask it early enough (re-cede right away), there is
* a theoretical possibility that it fires again, thus landing in the
* target queue more than once which is a big no-no.
*
* Fortunately, solving this is rather easy. If the above load setting
* PQ to 01 returns a previous value where P is set, then we know the
* escalation interrupt is somewhere on its way to the host. In that
* case we simply don't clear the xive_esc_on flag below. It will be
* eventually cleared by the handler for the escalation interrupt.
*
* Then, when doing a cede, we check that flag again before re-enabling
* the escalation interrupt, and if set, we abort the cede.
*/
andi. r0, r0, XIVE_ESB_VAL_P
bne- 1f
/* Now P is 0, we can clear the flag */
li r0, 0
stb r0, VCPU_XIVE_ESC_ON(r4)
1:
no_xive:
#endif /* CONFIG_KVM_XICS */
......@@ -1193,7 +1241,7 @@ hdec_soon:
addi r3, r4, VCPU_TB_RMEXIT
bl kvmhv_accumulate_time
#endif
b guest_exit_cont
b guest_bypass
/******************************************************************************
* *
......@@ -1423,15 +1471,35 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
blt deliver_guest_interrupt
guest_exit_cont: /* r9 = vcpu, r12 = trap, r13 = paca */
/* Save more register state */
mfdar r6
mfdsisr r7
std r6, VCPU_DAR(r9)
stw r7, VCPU_DSISR(r9)
/* don't overwrite fault_dar/fault_dsisr if HDSI */
cmpwi r12,BOOK3S_INTERRUPT_H_DATA_STORAGE
beq mc_cont
std r6, VCPU_FAULT_DAR(r9)
stw r7, VCPU_FAULT_DSISR(r9)
/* See if it is a machine check */
cmpwi r12, BOOK3S_INTERRUPT_MACHINE_CHECK
beq machine_check_realmode
mc_cont:
#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
addi r3, r9, VCPU_TB_RMEXIT
mr r4, r9
bl kvmhv_accumulate_time
#endif
#ifdef CONFIG_KVM_XICS
/* We are exiting, pull the VP from the XIVE */
lwz r0, VCPU_XIVE_PUSHED(r9)
lbz r0, VCPU_XIVE_PUSHED(r9)
cmpwi cr0, r0, 0
beq 1f
li r7, TM_SPC_PULL_OS_CTX
li r6, TM_QW1_OS
mfmsr r0
andi. r0, r0, MSR_IR /* in real mode? */
andi. r0, r0, MSR_DR /* in real mode? */
beq 2f
ld r10, HSTATE_XIVE_TIMA_VIRT(r13)
cmpldi cr0, r10, 0
......@@ -1454,33 +1522,42 @@ guest_exit_cont: /* r9 = vcpu, r12 = trap, r13 = paca */
/* Fixup some of the state for the next load */
li r10, 0
li r0, 0xff
stw r10, VCPU_XIVE_PUSHED(r9)
stb r10, VCPU_XIVE_PUSHED(r9)
stb r10, (VCPU_XIVE_SAVED_STATE+3)(r9)
stb r0, (VCPU_XIVE_SAVED_STATE+4)(r9)
eieio
1:
#endif /* CONFIG_KVM_XICS */
/* Save more register state */
mfdar r6
mfdsisr r7
std r6, VCPU_DAR(r9)
stw r7, VCPU_DSISR(r9)
/* don't overwrite fault_dar/fault_dsisr if HDSI */
cmpwi r12,BOOK3S_INTERRUPT_H_DATA_STORAGE
beq mc_cont
std r6, VCPU_FAULT_DAR(r9)
stw r7, VCPU_FAULT_DSISR(r9)
/* See if it is a machine check */
cmpwi r12, BOOK3S_INTERRUPT_MACHINE_CHECK
beq machine_check_realmode
mc_cont:
#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
addi r3, r9, VCPU_TB_RMEXIT
mr r4, r9
bl kvmhv_accumulate_time
#endif
/* For hash guest, read the guest SLB and save it away */
ld r5, VCPU_KVM(r9)
lbz r0, KVM_RADIX(r5)
li r5, 0
cmpwi r0, 0
bne 3f /* for radix, save 0 entries */
lwz r0,VCPU_SLB_NR(r9) /* number of entries in SLB */
mtctr r0
li r6,0
addi r7,r9,VCPU_SLB
1: slbmfee r8,r6
andis. r0,r8,SLB_ESID_V@h
beq 2f
add r8,r8,r6 /* put index in */
slbmfev r3,r6
std r8,VCPU_SLB_E(r7)
std r3,VCPU_SLB_V(r7)
addi r7,r7,VCPU_SLB_SIZE
addi r5,r5,1
2: addi r6,r6,1
bdnz 1b
/* Finally clear out the SLB */
li r0,0
slbmte r0,r0
slbia
ptesync
3: stw r5,VCPU_SLB_MAX(r9)
guest_bypass:
mr r3, r12
/* Increment exit count, poke other threads to exit */
bl kvmhv_commence_exit
......@@ -1501,31 +1578,6 @@ mc_cont:
ori r6,r6,1
mtspr SPRN_CTRLT,r6
4:
/* Check if we are running hash or radix and store it in cr2 */
ld r5, VCPU_KVM(r9)
lbz r0, KVM_RADIX(r5)
cmpwi cr2,r0,0
/* Read the guest SLB and save it away */
li r5, 0
bne cr2, 3f /* for radix, save 0 entries */
lwz r0,VCPU_SLB_NR(r9) /* number of entries in SLB */
mtctr r0
li r6,0
addi r7,r9,VCPU_SLB
1: slbmfee r8,r6
andis. r0,r8,SLB_ESID_V@h
beq 2f
add r8,r8,r6 /* put index in */
slbmfev r3,r6
std r8,VCPU_SLB_E(r7)
std r3,VCPU_SLB_V(r7)
addi r7,r7,VCPU_SLB_SIZE
addi r5,r5,1
2: addi r6,r6,1
bdnz 1b
3: stw r5,VCPU_SLB_MAX(r9)
/*
* Save the guest PURR/SPURR
*/
......@@ -1803,7 +1855,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
ld r5, VCPU_KVM(r9)
lbz r0, KVM_RADIX(r5)
cmpwi cr2, r0, 0
beq cr2, 3f
beq cr2, 4f
/* Radix: Handle the case where the guest used an illegal PID */
LOAD_REG_ADDR(r4, mmu_base_pid)
......@@ -1839,15 +1891,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
BEGIN_FTR_SECTION
PPC_INVALIDATE_ERAT
END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1)
b 4f
4:
#endif /* CONFIG_PPC_RADIX_MMU */
/* Hash: clear out SLB */
3: li r5,0
slbmte r5,r5
slbia
ptesync
4:
/*
* POWER7/POWER8 guest -> host partition switch code.
* We don't have to lock against tlbies but we do
......@@ -2745,7 +2791,32 @@ kvm_cede_prodded:
/* we've ceded but we want to give control to the host */
kvm_cede_exit:
ld r9, HSTATE_KVM_VCPU(r13)
b guest_exit_cont
#ifdef CONFIG_KVM_XICS
/* Abort if we still have a pending escalation */
lbz r5, VCPU_XIVE_ESC_ON(r9)
cmpwi r5, 0
beq 1f
li r0, 0
stb r0, VCPU_CEDED(r9)
1: /* Enable XIVE escalation */
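/*
 * A load from the ESB page at the "set PQ to 00" offset re-enables
 * (unmasks) the escalation source; use the cached virtual or real
 * address depending on whether data translation is currently on.
 */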
li r5, XIVE_ESB_SET_PQ_00
mfmsr r0
andi. r0, r0, MSR_DR /* in real mode? */
beq 1f
ld r10, VCPU_XIVE_ESC_VADDR(r9)
cmpdi r10, 0
beq 3f
ldx r0, r10, r5
b 2f
1: ld r10, VCPU_XIVE_ESC_RADDR(r9)
cmpdi r10, 0
beq 3f
ldcix r0, r10, r5
2: sync
li r0, 1
stb r0, VCPU_XIVE_ESC_ON(r9)
#endif /* CONFIG_KVM_XICS */
3: b guest_exit_cont
/* Try to handle a machine check in real mode */
machine_check_realmode:
......
......@@ -96,7 +96,7 @@ kvm_start_entry:
kvm_start_lightweight:
/* Copy registers into shadow vcpu so we can access them in real mode */
GET_SHADOW_VCPU(r3)
mr r3, r4
bl FUNC(kvmppc_copy_to_svcpu)
nop
REST_GPR(4, r1)
......@@ -165,9 +165,7 @@ after_sprg3_load:
stw r12, VCPU_TRAP(r3)
/* Transfer reg values from shadow vcpu back to vcpu struct */
/* On 64-bit, interrupts are still off at this point */
GET_SHADOW_VCPU(r4)
bl FUNC(kvmppc_copy_from_svcpu)
nop
......
......@@ -121,7 +121,7 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
#ifdef CONFIG_PPC_BOOK3S_64
struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
if (svcpu->in_use) {
kvmppc_copy_from_svcpu(vcpu, svcpu);
kvmppc_copy_from_svcpu(vcpu);
}
memcpy(to_book3s(vcpu)->slb_shadow, svcpu->slb, sizeof(svcpu->slb));
to_book3s(vcpu)->slb_shadow_max = svcpu->slb_max;
......@@ -143,9 +143,10 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
}
/* Copy data needed by real-mode code from vcpu to shadow vcpu */
void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
struct kvm_vcpu *vcpu)
void kvmppc_copy_to_svcpu(struct kvm_vcpu *vcpu)
{
struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
svcpu->gpr[0] = vcpu->arch.gpr[0];
svcpu->gpr[1] = vcpu->arch.gpr[1];
svcpu->gpr[2] = vcpu->arch.gpr[2];
......@@ -177,17 +178,14 @@ void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
if (cpu_has_feature(CPU_FTR_ARCH_207S))
vcpu->arch.entry_ic = mfspr(SPRN_IC);
svcpu->in_use = true;
svcpu_put(svcpu);
}
/* Copy data touched by real-mode code from shadow vcpu back to vcpu */
void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
struct kvmppc_book3s_shadow_vcpu *svcpu)
void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu)
{
/*
* vcpu_put would just call us again because in_use hasn't
* been updated yet.
*/
preempt_disable();
struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
/*
* Maybe we were already preempted and synced the svcpu from
......@@ -233,7 +231,7 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
svcpu->in_use = false;
out:
preempt_enable();
svcpu_put(svcpu);
}
static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
......
......@@ -84,12 +84,22 @@ static irqreturn_t xive_esc_irq(int irq, void *data)
{
struct kvm_vcpu *vcpu = data;
/* We use the existing H_PROD mechanism to wake up the target */
vcpu->arch.prodded = 1;
vcpu->arch.irq_pending = 1;
smp_mb();
if (vcpu->arch.ceded)
kvmppc_fast_vcpu_kick(vcpu);
/* Since we have the no-EOI flag, the interrupt is effectively
* disabled now. Clearing xive_esc_on means we won't bother
* doing so on the next entry.
*
* This also allows the entry code to know that if a PQ combination
* of 10 is observed while xive_esc_on is true, it means the queue
* contains an unprocessed escalation interrupt. We don't make use of
* that knowledge today but might (see comment in book3s_hv_rmhandlers.S).
*/
vcpu->arch.xive_esc_on = false;
return IRQ_HANDLED;
}
......@@ -112,19 +122,21 @@ static int xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio)
return -EIO;
}
/*
* Future improvement: start with them disabled
* and handle DD2 and later scheme of merged escalation
* interrupts
*/
name = kasprintf(GFP_KERNEL, "kvm-%d-%d-%d",
vcpu->kvm->arch.lpid, xc->server_num, prio);
if (xc->xive->single_escalation)
name = kasprintf(GFP_KERNEL, "kvm-%d-%d",
vcpu->kvm->arch.lpid, xc->server_num);
else
name = kasprintf(GFP_KERNEL, "kvm-%d-%d-%d",
vcpu->kvm->arch.lpid, xc->server_num, prio);
if (!name) {
pr_err("Failed to allocate escalation irq name for queue %d of VCPU %d\n",
prio, xc->server_num);
rc = -ENOMEM;
goto error;
}
pr_devel("Escalation %s irq %d (prio %d)\n", name, xc->esc_virq[prio], prio);
rc = request_irq(xc->esc_virq[prio], xive_esc_irq,
IRQF_NO_THREAD, name, vcpu);
if (rc) {
......@@ -133,6 +145,25 @@ static int xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio)
goto error;
}
xc->esc_virq_names[prio] = name;
/* In single escalation mode, we grab the ESB MMIO of the
* interrupt and mask it. Also populate the VCPU v/raddr
* of the ESB page for use by asm entry/exit code. Finally
* set the XIVE_IRQ_NO_EOI flag which will prevent the
* core code from performing an EOI on the escalation
* interrupt, thus leaving it effectively masked after
* it fires once.
*/
if (xc->xive->single_escalation) {
struct irq_data *d = irq_get_irq_data(xc->esc_virq[prio]);
struct xive_irq_data *xd = irq_data_get_irq_handler_data(d);
xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_01);
vcpu->arch.xive_esc_raddr = xd->eoi_page;
vcpu->arch.xive_esc_vaddr = (__force u64)xd->eoi_mmio;
xd->flags |= XIVE_IRQ_NO_EOI;
}
return 0;
error:
irq_dispose_mapping(xc->esc_virq[prio]);
......@@ -191,12 +222,12 @@ static int xive_check_provisioning(struct kvm *kvm, u8 prio)
pr_devel("Provisioning prio... %d\n", prio);
/* Provision each VCPU and enable escalations */
/* Provision each VCPU and enable escalations if needed */
kvm_for_each_vcpu(i, vcpu, kvm) {
if (!vcpu->arch.xive_vcpu)
continue;
rc = xive_provision_queue(vcpu, prio);
if (rc == 0)
if (rc == 0 && !xive->single_escalation)
xive_attach_escalation(vcpu, prio);
if (rc)
return rc;
......@@ -1082,6 +1113,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
/* Allocate IPI */
xc->vp_ipi = xive_native_alloc_irq();
if (!xc->vp_ipi) {
pr_err("Failed to allocate xive irq for VCPU IPI\n");
r = -EIO;
goto bail;
}
......@@ -1091,19 +1123,34 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
if (r)
goto bail;
/*
* Enable the VP first as the single escalation mode will
* affect the escalation interrupt numbering
*/
r = xive_native_enable_vp(xc->vp_id, xive->single_escalation);
if (r) {
pr_err("Failed to enable VP in OPAL, err %d\n", r);
goto bail;
}
/*
* Initialize queues. Initially we set them all for no queueing
* and we enable escalation for queue 0 only which we'll use for
* our mfrr change notifications. If the VCPU is hot-plugged, we
* do handle provisioning however.
* do handle provisioning however based on the existing "map"
* of enabled queues.
*/
for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
struct xive_q *q = &xc->queues[i];
/* Single escalation, no queue 7 */
if (i == 7 && xive->single_escalation)
break;
/* Is queue already enabled ? Provision it */
if (xive->qmap & (1 << i)) {
r = xive_provision_queue(vcpu, i);
if (r == 0)
if (r == 0 && !xive->single_escalation)
xive_attach_escalation(vcpu, i);
if (r)
goto bail;
......@@ -1123,11 +1170,6 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
if (r)
goto bail;
/* Enable the VP */
r = xive_native_enable_vp(xc->vp_id);
if (r)
goto bail;
/* Route the IPI */
r = xive_native_configure_irq(xc->vp_ipi, xc->vp_id, 0, XICS_IPI);
if (!r)
......@@ -1474,6 +1516,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr)
pr_devel(" val=0x016%llx (server=0x%x, guest_prio=%d)\n",
val, server, guest_prio);
/*
* If the source doesn't already have an IPI, allocate
* one and get the corresponding data
......@@ -1762,6 +1805,8 @@ static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
if (xive->vp_base == XIVE_INVALID_VP)
ret = -ENOMEM;
xive->single_escalation = xive_native_has_single_escalation();
if (ret) {
kfree(xive);
return ret;
......@@ -1795,6 +1840,7 @@ static int xive_debug_show(struct seq_file *m, void *private)
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
unsigned int i;
if (!xc)
continue;
......@@ -1804,6 +1850,33 @@ static int xive_debug_show(struct seq_file *m, void *private)
xc->server_num, xc->cppr, xc->hw_cppr,
xc->mfrr, xc->pending,
xc->stat_rm_h_xirr, xc->stat_vm_h_xirr);
for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
struct xive_q *q = &xc->queues[i];
u32 i0, i1, idx;
if (!q->qpage && !xc->esc_virq[i])
continue;
seq_printf(m, " [q%d]: ", i);
if (q->qpage) {
idx = q->idx;
i0 = be32_to_cpup(q->qpage + idx);
idx = (idx + 1) & q->msk;
i1 = be32_to_cpup(q->qpage + idx);
seq_printf(m, "T=%d %08x %08x... \n", q->toggle, i0, i1);
}
if (xc->esc_virq[i]) {
struct irq_data *d = irq_get_irq_data(xc->esc_virq[i]);
struct xive_irq_data *xd = irq_data_get_irq_handler_data(d);
u64 pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
seq_printf(m, "E:%c%c I(%d:%llx:%llx)",
(pq & XIVE_ESB_VAL_P) ? 'P' : 'p',
(pq & XIVE_ESB_VAL_Q) ? 'Q' : 'q',
xc->esc_virq[i], pq, xd->eoi_page);
seq_printf(m, "\n");
}
}
t_rm_h_xirr += xc->stat_rm_h_xirr;
t_rm_h_ipoll += xc->stat_rm_h_ipoll;
......
......@@ -120,6 +120,8 @@ struct kvmppc_xive {
u32 q_order;
u32 q_page_order;
/* Flags */
u8 single_escalation;
};
#define KVMPPC_XIVE_Q_COUNT 8
......@@ -201,25 +203,20 @@ static inline struct kvmppc_xive_src_block *kvmppc_xive_find_source(struct kvmpp
* is as follows.
*
* Guest requests for 0...6 are honored. A guest request for anything
* higher results in a priority of 7 being applied.
*
* However, when XIRR is returned via H_XIRR, 7 is translated to 0xb
* in order to match AIX expectations
* higher results in a priority of 6 being applied.
*
* Similar mapping is done for CPPR values
*/
static inline u8 xive_prio_from_guest(u8 prio)
{
if (prio == 0xff || prio < 8)
if (prio == 0xff || prio < 6)
return prio;
return 7;
return 6;
}
static inline u8 xive_prio_to_guest(u8 prio)
{
if (prio == 0xff || prio < 7)
return prio;
return 0xb;
return prio;
}
static inline u32 __xive_read_eq(__be32 *qpage, u32 msk, u32 *idx, u32 *toggle)
......
......@@ -1431,6 +1431,8 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
vcpu_load(vcpu);
regs->pc = vcpu->arch.pc;
regs->cr = kvmppc_get_cr(vcpu);
regs->ctr = vcpu->arch.ctr;
......@@ -1452,6 +1454,7 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
vcpu_put(vcpu);
return 0;
}
......@@ -1459,6 +1462,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
vcpu_load(vcpu);
vcpu->arch.pc = regs->pc;
kvmppc_set_cr(vcpu, regs->cr);
vcpu->arch.ctr = regs->ctr;
......@@ -1480,6 +1485,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
kvmppc_set_gpr(vcpu, i, regs->gpr[i]);
vcpu_put(vcpu);
return 0;
}
......@@ -1607,30 +1613,42 @@ int kvmppc_set_sregs_ivor(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
int ret;
vcpu_load(vcpu);
sregs->pvr = vcpu->arch.pvr;
get_sregs_base(vcpu, sregs);
get_sregs_arch206(vcpu, sregs);
return vcpu->kvm->arch.kvm_ops->get_sregs(vcpu, sregs);
ret = vcpu->kvm->arch.kvm_ops->get_sregs(vcpu, sregs);
vcpu_put(vcpu);
return ret;
}
int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
int ret;
int ret = -EINVAL;
vcpu_load(vcpu);
if (vcpu->arch.pvr != sregs->pvr)
return -EINVAL;
goto out;
ret = set_sregs_base(vcpu, sregs);
if (ret < 0)
return ret;
goto out;
ret = set_sregs_arch206(vcpu, sregs);
if (ret < 0)
return ret;
goto out;
return vcpu->kvm->arch.kvm_ops->set_sregs(vcpu, sregs);
ret = vcpu->kvm->arch.kvm_ops->set_sregs(vcpu, sregs);
out:
vcpu_put(vcpu);
return ret;
}
int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
......@@ -1773,7 +1791,9 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
{
int r;
vcpu_load(vcpu);
r = kvmppc_core_vcpu_translate(vcpu, tr);
vcpu_put(vcpu);
return r;
}
......@@ -1996,12 +2016,15 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
{
struct debug_reg *dbg_reg;
int n, b = 0, w = 0;
int ret = 0;
vcpu_load(vcpu);
if (!(dbg->control & KVM_GUESTDBG_ENABLE)) {
vcpu->arch.dbg_reg.dbcr0 = 0;
vcpu->guest_debug = 0;
kvm_guest_protect_msr(vcpu, MSR_DE, false);
return 0;
goto out;
}
kvm_guest_protect_msr(vcpu, MSR_DE, true);
......@@ -2033,8 +2056,9 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
#endif
if (!(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP))
return 0;
goto out;
ret = -EINVAL;
for (n = 0; n < (KVMPPC_BOOKE_IAC_NUM + KVMPPC_BOOKE_DAC_NUM); n++) {
uint64_t addr = dbg->arch.bp[n].addr;
uint32_t type = dbg->arch.bp[n].type;
......@@ -2045,21 +2069,24 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
if (type & ~(KVMPPC_DEBUG_WATCH_READ |
KVMPPC_DEBUG_WATCH_WRITE |
KVMPPC_DEBUG_BREAKPOINT))
return -EINVAL;
goto out;
if (type & KVMPPC_DEBUG_BREAKPOINT) {
/* Setting H/W breakpoint */
if (kvmppc_booke_add_breakpoint(dbg_reg, addr, b++))
return -EINVAL;
goto out;
} else {
/* Setting H/W watchpoint */
if (kvmppc_booke_add_watchpoint(dbg_reg, addr,
type, w++))
return -EINVAL;
goto out;
}
}
return 0;
ret = 0;
out:
vcpu_put(vcpu);
return ret;
}
void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
......
......@@ -58,6 +58,18 @@ static bool kvmppc_check_vsx_disabled(struct kvm_vcpu *vcpu)
}
#endif /* CONFIG_VSX */
#ifdef CONFIG_ALTIVEC
static bool kvmppc_check_altivec_disabled(struct kvm_vcpu *vcpu)
{
if (!(kvmppc_get_msr(vcpu) & MSR_VEC)) {
kvmppc_core_queue_vec_unavail(vcpu);
return true;
}
return false;
}
#endif /* CONFIG_ALTIVEC */
/*
* XXX to do:
* lfiwax, lfiwzx
......@@ -98,6 +110,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_NONE;
vcpu->arch.mmio_sp64_extend = 0;
vcpu->arch.mmio_sign_extend = 0;
vcpu->arch.mmio_vmx_copy_nums = 0;
switch (get_op(inst)) {
case 31:
......@@ -459,6 +472,29 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
rs, 4, 1);
break;
#endif /* CONFIG_VSX */
#ifdef CONFIG_ALTIVEC
case OP_31_XOP_LVX:
if (kvmppc_check_altivec_disabled(vcpu))
return EMULATE_DONE;
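/*
 * lvx ignores the low four bits of the effective address, so force
 * a 16-byte aligned access and split it into two 8-byte MMIO
 * operations.
 */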
vcpu->arch.vaddr_accessed &= ~0xFULL;
vcpu->arch.paddr_accessed &= ~0xFULL;
vcpu->arch.mmio_vmx_copy_nums = 2;
emulated = kvmppc_handle_load128_by2x64(run, vcpu,
KVM_MMIO_REG_VMX|rt, 1);
break;
case OP_31_XOP_STVX:
if (kvmppc_check_altivec_disabled(vcpu))
return EMULATE_DONE;
vcpu->arch.vaddr_accessed &= ~0xFULL;
vcpu->arch.paddr_accessed &= ~0xFULL;
vcpu->arch.mmio_vmx_copy_nums = 2;
emulated = kvmppc_handle_store128_by2x64(run, vcpu,
rs, 1);
break;
#endif /* CONFIG_ALTIVEC */
default:
emulated = EMULATE_FAIL;
break;
......
......@@ -638,8 +638,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = 1;
break;
case KVM_CAP_SPAPR_RESIZE_HPT:
/* Disable this on POWER9 until code handles new HPTE format */
r = !!hv_enabled && !cpu_has_feature(CPU_FTR_ARCH_300);
r = !!hv_enabled;
break;
#endif
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
......@@ -763,7 +762,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
vcpu->arch.dec_expires = ~(u64)0;
vcpu->arch.dec_expires = get_tb();
#ifdef CONFIG_KVM_EXIT_TIMING
mutex_init(&vcpu->arch.exit_timing_lock);
......@@ -930,6 +929,34 @@ static inline void kvmppc_set_vsr_word(struct kvm_vcpu *vcpu,
}
#endif /* CONFIG_VSX */
#ifdef CONFIG_ALTIVEC
static inline void kvmppc_set_vmx_dword(struct kvm_vcpu *vcpu,
u64 gpr)
{
int index = vcpu->arch.io_gpr & KVM_MMIO_REG_MASK;
u32 hi, lo;
u32 di;
#ifdef __BIG_ENDIAN
hi = gpr >> 32;
lo = gpr & 0xffffffff;
#else
lo = gpr >> 32;
hi = gpr & 0xffffffff;
#endif
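/*
 * mmio_vmx_copy_nums counts down from 2 as the two halves of the
 * quadword complete, so this picks the first or second doubleword of
 * the target VR; the order is reversed if the host access was
 * byte-swapped.
 */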
di = 2 - vcpu->arch.mmio_vmx_copy_nums; /* doubleword index */
if (di > 1)
return;
if (vcpu->arch.mmio_host_swabbed)
di = 1 - di;
VCPU_VSX_VR(vcpu, index).u[di * 2] = hi;
VCPU_VSX_VR(vcpu, index).u[di * 2 + 1] = lo;
}
#endif /* CONFIG_ALTIVEC */
#ifdef CONFIG_PPC_FPU
static inline u64 sp_to_dp(u32 fprs)
{
......@@ -1032,6 +1059,11 @@ static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
KVMPPC_VSX_COPY_DWORD_LOAD_DUMP)
kvmppc_set_vsr_dword_dump(vcpu, gpr);
break;
#endif
#ifdef CONFIG_ALTIVEC
case KVM_MMIO_REG_VMX:
kvmppc_set_vmx_dword(vcpu, gpr);
break;
#endif
default:
BUG();
......@@ -1106,11 +1138,9 @@ int kvmppc_handle_vsx_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
{
enum emulation_result emulated = EMULATE_DONE;
/* Currently, mmio_vsx_copy_nums only allowed to be less than 4 */
if ( (vcpu->arch.mmio_vsx_copy_nums > 4) ||
(vcpu->arch.mmio_vsx_copy_nums < 0) ) {
/* Currently, mmio_vsx_copy_nums only allowed to be 4 or less */
if (vcpu->arch.mmio_vsx_copy_nums > 4)
return EMULATE_FAIL;
}
while (vcpu->arch.mmio_vsx_copy_nums) {
emulated = __kvmppc_handle_load(run, vcpu, rt, bytes,
......@@ -1252,11 +1282,9 @@ int kvmppc_handle_vsx_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
vcpu->arch.io_gpr = rs;
/* Currently, mmio_vsx_copy_nums only allowed to be less than 4 */
if ( (vcpu->arch.mmio_vsx_copy_nums > 4) ||
(vcpu->arch.mmio_vsx_copy_nums < 0) ) {
/* Currently, mmio_vsx_copy_nums only allowed to be 4 or less */
if (vcpu->arch.mmio_vsx_copy_nums > 4)
return EMULATE_FAIL;
}
while (vcpu->arch.mmio_vsx_copy_nums) {
if (kvmppc_get_vsr_data(vcpu, rs, &val) == -1)
......@@ -1312,6 +1340,111 @@ static int kvmppc_emulate_mmio_vsx_loadstore(struct kvm_vcpu *vcpu,
}
#endif /* CONFIG_VSX */
#ifdef CONFIG_ALTIVEC
/* handle quadword load access in two halves */
int kvmppc_handle_load128_by2x64(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int rt, int is_default_endian)
{
enum emulation_result emulated;
while (vcpu->arch.mmio_vmx_copy_nums) {
emulated = __kvmppc_handle_load(run, vcpu, rt, 8,
is_default_endian, 0);
if (emulated != EMULATE_DONE)
break;
vcpu->arch.paddr_accessed += run->mmio.len;
vcpu->arch.mmio_vmx_copy_nums--;
}
return emulated;
}
static inline int kvmppc_get_vmx_data(struct kvm_vcpu *vcpu, int rs, u64 *val)
{
vector128 vrs = VCPU_VSX_VR(vcpu, rs);
u32 di;
u64 w0, w1;
di = 2 - vcpu->arch.mmio_vmx_copy_nums; /* doubleword index */
if (di > 1)
return -1;
if (vcpu->arch.mmio_host_swabbed)
di = 1 - di;
w0 = vrs.u[di * 2];
w1 = vrs.u[di * 2 + 1];
#ifdef __BIG_ENDIAN
*val = (w0 << 32) | w1;
#else
*val = (w1 << 32) | w0;
#endif
return 0;
}
/* handle quadword store in two halves */
int kvmppc_handle_store128_by2x64(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int rs, int is_default_endian)
{
u64 val = 0;
enum emulation_result emulated = EMULATE_DONE;
vcpu->arch.io_gpr = rs;
while (vcpu->arch.mmio_vmx_copy_nums) {
if (kvmppc_get_vmx_data(vcpu, rs, &val) == -1)
return EMULATE_FAIL;
emulated = kvmppc_handle_store(run, vcpu, val, 8,
is_default_endian);
if (emulated != EMULATE_DONE)
break;
vcpu->arch.paddr_accessed += run->mmio.len;
vcpu->arch.mmio_vmx_copy_nums--;
}
return emulated;
}
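/*
 * Re-drive the remaining half of a VMX quadword access once the
 * previous MMIO half has been completed by userspace.
 */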
static int kvmppc_emulate_mmio_vmx_loadstore(struct kvm_vcpu *vcpu,
struct kvm_run *run)
{
enum emulation_result emulated = EMULATE_FAIL;
int r;
vcpu->arch.paddr_accessed += run->mmio.len;
if (!vcpu->mmio_is_write) {
emulated = kvmppc_handle_load128_by2x64(run, vcpu,
vcpu->arch.io_gpr, 1);
} else {
emulated = kvmppc_handle_store128_by2x64(run, vcpu,
vcpu->arch.io_gpr, 1);
}
switch (emulated) {
case EMULATE_DO_MMIO:
run->exit_reason = KVM_EXIT_MMIO;
r = RESUME_HOST;
break;
case EMULATE_FAIL:
pr_info("KVM: MMIO emulation failed (VMX repeat)\n");
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
r = RESUME_HOST;
break;
default:
r = RESUME_GUEST;
break;
}
return r;
}
#endif /* CONFIG_ALTIVEC */
int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
{
int r = 0;
......@@ -1413,6 +1546,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
int r;
vcpu_load(vcpu);
if (vcpu->mmio_needed) {
vcpu->mmio_needed = 0;
if (!vcpu->mmio_is_write)
......@@ -1427,7 +1562,19 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
r = kvmppc_emulate_mmio_vsx_loadstore(vcpu, run);
if (r == RESUME_HOST) {
vcpu->mmio_needed = 1;
return r;
goto out;
}
}
#endif
#ifdef CONFIG_ALTIVEC
if (vcpu->arch.mmio_vmx_copy_nums > 0)
vcpu->arch.mmio_vmx_copy_nums--;
if (vcpu->arch.mmio_vmx_copy_nums > 0) {
r = kvmppc_emulate_mmio_vmx_loadstore(vcpu, run);
if (r == RESUME_HOST) {
vcpu->mmio_needed = 1;
goto out;
}
}
#endif
......@@ -1461,6 +1608,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
kvm_sigset_deactivate(vcpu);
out:
vcpu_put(vcpu);
return r;
}
......@@ -1608,23 +1757,31 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
return -EINVAL;
}
long kvm_arch_vcpu_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
long kvm_arch_vcpu_async_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
struct kvm_vcpu *vcpu = filp->private_data;
void __user *argp = (void __user *)arg;
long r;
switch (ioctl) {
case KVM_INTERRUPT: {
if (ioctl == KVM_INTERRUPT) {
struct kvm_interrupt irq;
r = -EFAULT;
if (copy_from_user(&irq, argp, sizeof(irq)))
goto out;
r = kvm_vcpu_ioctl_interrupt(vcpu, &irq);
goto out;
return -EFAULT;
return kvm_vcpu_ioctl_interrupt(vcpu, &irq);
}
return -ENOIOCTLCMD;
}
long kvm_arch_vcpu_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
struct kvm_vcpu *vcpu = filp->private_data;
void __user *argp = (void __user *)arg;
long r;
vcpu_load(vcpu);
switch (ioctl) {
case KVM_ENABLE_CAP:
{
struct kvm_enable_cap cap;
......@@ -1664,6 +1821,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
}
out:
vcpu_put(vcpu);
return r;
}
......
......@@ -143,8 +143,7 @@ static int kvmppc_exit_timing_show(struct seq_file *m, void *private)
int i;
u64 min, max, sum, sum_quad;
seq_printf(m, "%s", "type count min max sum sum_squared\n");
seq_puts(m, "type count min max sum sum_squared\n");
for (i = 0; i < __NUMBER_OF_KVM_EXIT_TYPES; i++) {
......
......@@ -42,6 +42,7 @@ static u32 xive_provision_chip_count;
static u32 xive_queue_shift;
static u32 xive_pool_vps = XIVE_INVALID_VP;
static struct kmem_cache *xive_provision_cache;
static bool xive_has_single_esc;
int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
{
......@@ -571,6 +572,10 @@ bool __init xive_native_init(void)
break;
}
/* Do we support single escalation */
if (of_get_property(np, "single-escalation-support", NULL) != NULL)
xive_has_single_esc = true;
/* Configure Thread Management areas for KVM */
for_each_possible_cpu(cpu)
kvmppc_set_xive_tima(cpu, r.start, tima);
......@@ -667,12 +672,15 @@ void xive_native_free_vp_block(u32 vp_base)
}
EXPORT_SYMBOL_GPL(xive_native_free_vp_block);
int xive_native_enable_vp(u32 vp_id)
int xive_native_enable_vp(u32 vp_id, bool single_escalation)
{
s64 rc;
u64 flags = OPAL_XIVE_VP_ENABLED;
if (single_escalation)
flags |= OPAL_XIVE_VP_SINGLE_ESCALATION;
for (;;) {
rc = opal_xive_set_vp_info(vp_id, OPAL_XIVE_VP_ENABLED, 0);
rc = opal_xive_set_vp_info(vp_id, flags, 0);
if (rc != OPAL_BUSY)
break;
msleep(1);
......@@ -710,3 +718,9 @@ int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id)
return 0;
}
EXPORT_SYMBOL_GPL(xive_native_get_vp_info);
bool xive_native_has_single_escalation(void)
{
return xive_has_single_esc;
}
EXPORT_SYMBOL_GPL(xive_native_has_single_escalation);
......@@ -261,6 +261,11 @@ static inline void clear_bit_inv(unsigned long nr, volatile unsigned long *ptr)
return clear_bit(nr ^ (BITS_PER_LONG - 1), ptr);
}
static inline int test_and_clear_bit_inv(unsigned long nr, volatile unsigned long *ptr)
{
return test_and_clear_bit(nr ^ (BITS_PER_LONG - 1), ptr);
}
static inline void __set_bit_inv(unsigned long nr, volatile unsigned long *ptr)
{
return __set_bit(nr ^ (BITS_PER_LONG - 1), ptr);
......
......@@ -20,7 +20,9 @@ struct css_general_char {
u32 aif_tdd : 1; /* bit 56 */
u32 : 1;
u32 qebsm : 1; /* bit 58 */
u32 : 8;
u32 : 2;
u32 aiv : 1; /* bit 61 */
u32 : 5;
u32 aif_osa : 1; /* bit 67 */
u32 : 12;
u32 eadm_rf : 1; /* bit 80 */
......
......@@ -2,7 +2,7 @@
/*
* definition for kernel virtual machines on s390
*
* Copyright IBM Corp. 2008, 2009
* Copyright IBM Corp. 2008, 2018
*
* Author(s): Carsten Otte <cotte@de.ibm.com>
*/
......@@ -183,6 +183,7 @@ struct kvm_s390_sie_block {
#define ECA_IB 0x40000000
#define ECA_SIGPI 0x10000000
#define ECA_MVPGI 0x01000000
#define ECA_AIV 0x00200000
#define ECA_VX 0x00020000
#define ECA_PROTEXCI 0x00002000
#define ECA_SII 0x00000001
......@@ -228,7 +229,9 @@ struct kvm_s390_sie_block {
__u8 epdx; /* 0x0069 */
__u8 reserved6a[2]; /* 0x006a */
__u32 todpr; /* 0x006c */
__u8 reserved70[16]; /* 0x0070 */
#define GISA_FORMAT1 0x00000001
__u32 gd; /* 0x0070 */
__u8 reserved74[12]; /* 0x0074 */
__u64 mso; /* 0x0080 */
__u64 msl; /* 0x0088 */
psw_t gpsw; /* 0x0090 */
......@@ -317,18 +320,30 @@ struct kvm_vcpu_stat {
u64 deliver_program_int;
u64 deliver_io_int;
u64 exit_wait_state;
u64 instruction_epsw;
u64 instruction_gs;
u64 instruction_io_other;
u64 instruction_lpsw;
u64 instruction_lpswe;
u64 instruction_pfmf;
u64 instruction_ptff;
u64 instruction_sck;
u64 instruction_sckpf;
u64 instruction_stidp;
u64 instruction_spx;
u64 instruction_stpx;
u64 instruction_stap;
u64 instruction_storage_key;
u64 instruction_iske;
u64 instruction_ri;
u64 instruction_rrbe;
u64 instruction_sske;
u64 instruction_ipte_interlock;
u64 instruction_stsch;
u64 instruction_chsc;
u64 instruction_stsi;
u64 instruction_stfl;
u64 instruction_tb;
u64 instruction_tpi;
u64 instruction_tprot;
u64 instruction_tsch;
u64 instruction_sie;
u64 instruction_essa;
u64 instruction_sthyi;
......@@ -354,6 +369,7 @@ struct kvm_vcpu_stat {
u64 diagnose_258;
u64 diagnose_308;
u64 diagnose_500;
u64 diagnose_other;
};
#define PGM_OPERATION 0x01
......@@ -410,35 +426,35 @@ struct kvm_vcpu_stat {
#define PGM_PER 0x80
#define PGM_CRYPTO_OPERATION 0x119
/* irq types in order of priority */
/* irq types in ascending order of priority */
enum irq_types {
IRQ_PEND_MCHK_EX = 0,
IRQ_PEND_SVC,
IRQ_PEND_PROG,
IRQ_PEND_MCHK_REP,
IRQ_PEND_EXT_IRQ_KEY,
IRQ_PEND_EXT_MALFUNC,
IRQ_PEND_EXT_EMERGENCY,
IRQ_PEND_EXT_EXTERNAL,
IRQ_PEND_EXT_CLOCK_COMP,
IRQ_PEND_EXT_CPU_TIMER,
IRQ_PEND_EXT_TIMING,
IRQ_PEND_EXT_SERVICE,
IRQ_PEND_EXT_HOST,
IRQ_PEND_PFAULT_INIT,
IRQ_PEND_PFAULT_DONE,
IRQ_PEND_VIRTIO,
IRQ_PEND_IO_ISC_0,
IRQ_PEND_IO_ISC_1,
IRQ_PEND_IO_ISC_2,
IRQ_PEND_IO_ISC_3,
IRQ_PEND_IO_ISC_4,
IRQ_PEND_IO_ISC_5,
IRQ_PEND_IO_ISC_6,
IRQ_PEND_IO_ISC_7,
IRQ_PEND_SIGP_STOP,
IRQ_PEND_SET_PREFIX = 0,
IRQ_PEND_RESTART,
IRQ_PEND_SET_PREFIX,
IRQ_PEND_SIGP_STOP,
IRQ_PEND_IO_ISC_7,
IRQ_PEND_IO_ISC_6,
IRQ_PEND_IO_ISC_5,
IRQ_PEND_IO_ISC_4,
IRQ_PEND_IO_ISC_3,
IRQ_PEND_IO_ISC_2,
IRQ_PEND_IO_ISC_1,
IRQ_PEND_IO_ISC_0,
IRQ_PEND_VIRTIO,
IRQ_PEND_PFAULT_DONE,
IRQ_PEND_PFAULT_INIT,
IRQ_PEND_EXT_HOST,
IRQ_PEND_EXT_SERVICE,
IRQ_PEND_EXT_TIMING,
IRQ_PEND_EXT_CPU_TIMER,
IRQ_PEND_EXT_CLOCK_COMP,
IRQ_PEND_EXT_EXTERNAL,
IRQ_PEND_EXT_EMERGENCY,
IRQ_PEND_EXT_MALFUNC,
IRQ_PEND_EXT_IRQ_KEY,
IRQ_PEND_MCHK_REP,
IRQ_PEND_PROG,
IRQ_PEND_SVC,
IRQ_PEND_MCHK_EX,
IRQ_PEND_COUNT
};
......@@ -516,9 +532,6 @@ struct kvm_s390_irq_payload {
struct kvm_s390_local_interrupt {
spinlock_t lock;
struct kvm_s390_float_interrupt *float_int;
struct swait_queue_head *wq;
atomic_t *cpuflags;
DECLARE_BITMAP(sigp_emerg_pending, KVM_MAX_VCPUS);
struct kvm_s390_irq_payload irq;
unsigned long pending_irqs;
......@@ -707,14 +720,50 @@ struct kvm_s390_crypto_cb {
struct kvm_s390_apcb1 apcb1; /* 0x0080 */
};
struct kvm_s390_gisa {
union {
struct { /* common to all formats */
u32 next_alert;
u8 ipm;
u8 reserved01[2];
u8 iam;
};
struct { /* format 0 */
u32 next_alert;
u8 ipm;
u8 reserved01;
u8 : 6;
u8 g : 1;
u8 c : 1;
u8 iam;
u8 reserved02[4];
u32 airq_count;
} g0;
struct { /* format 1 */
u32 next_alert;
u8 ipm;
u8 simm;
u8 nimm;
u8 iam;
u8 aism[8];
u8 : 6;
u8 g : 1;
u8 c : 1;
u8 reserved03[11];
u32 airq_count;
} g1;
};
};
/*
* sie_page2 has to be allocated as DMA because fac_list and crycb need
* 31bit addresses in the sie control block.
* sie_page2 has to be allocated as DMA because fac_list, crycb and
* gisa need 31bit addresses in the sie control block.
*/
struct sie_page2 {
__u64 fac_list[S390_ARCH_FAC_LIST_SIZE_U64]; /* 0x0000 */
struct kvm_s390_crypto_cb crycb; /* 0x0800 */
u8 reserved900[0x1000 - 0x900]; /* 0x0900 */
struct kvm_s390_gisa gisa; /* 0x0900 */
u8 reserved920[0x1000 - 0x920]; /* 0x0920 */
};
struct kvm_s390_vsie {
......@@ -761,6 +810,7 @@ struct kvm_arch{
struct kvm_s390_migration_state *migration_state;
/* subset of available cpu features enabled by user space */
DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
struct kvm_s390_gisa *gisa;
};
#define KVM_HVA_ERR_BAD (-1UL)
......
......@@ -77,6 +77,7 @@ struct sclp_info {
unsigned char has_ibs : 1;
unsigned char has_skey : 1;
unsigned char has_kss : 1;
unsigned char has_gisaf : 1;
unsigned int ibc;
unsigned int mtid;
unsigned int mtid_cp;
......
......@@ -23,6 +23,7 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_VCPU_ASYNC_IOCTL
select HAVE_KVM_EVENTFD
select KVM_ASYNC_PF
select KVM_ASYNC_PF_SYNC
......
......@@ -257,6 +257,7 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
case 0x500:
return __diag_virtio_hypercall(vcpu);
default:
vcpu->stat.diagnose_other++;
return -EOPNOTSUPP;
}
}
......@@ -36,7 +36,7 @@ static int sca_ext_call_pending(struct kvm_vcpu *vcpu, int *src_id)
{
int c, scn;
if (!(atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_ECALL_PEND))
if (!kvm_s390_test_cpuflags(vcpu, CPUSTAT_ECALL_PEND))
return 0;
BUG_ON(!kvm_s390_use_sca_entries());
......@@ -101,18 +101,17 @@ static int sca_inject_ext_call(struct kvm_vcpu *vcpu, int src_id)
/* another external call is pending */
return -EBUSY;
}
atomic_or(CPUSTAT_ECALL_PEND, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_ECALL_PEND);
return 0;
}
static void sca_clear_ext_call(struct kvm_vcpu *vcpu)
{
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
int rc, expect;
if (!kvm_s390_use_sca_entries())
return;
atomic_andnot(CPUSTAT_ECALL_PEND, li->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_ECALL_PEND);
read_lock(&vcpu->kvm->arch.sca_lock);
if (vcpu->kvm->arch.use_esca) {
struct esca_block *sca = vcpu->kvm->arch.sca;
......@@ -190,8 +189,8 @@ static int cpu_timer_irq_pending(struct kvm_vcpu *vcpu)
static inline int is_ioirq(unsigned long irq_type)
{
return ((irq_type >= IRQ_PEND_IO_ISC_0) &&
(irq_type <= IRQ_PEND_IO_ISC_7));
return ((irq_type >= IRQ_PEND_IO_ISC_7) &&
(irq_type <= IRQ_PEND_IO_ISC_0));
}
static uint64_t isc_to_isc_bits(int isc)
......@@ -199,25 +198,59 @@ static uint64_t isc_to_isc_bits(int isc)
return (0x80 >> isc) << 24;
}
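/*
 * Adapter interrupts carry the ISC in bits 2-4 of the I/O interruption
 * word, with the top bit flagging an adapter interruption.
 */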
static inline u32 isc_to_int_word(u8 isc)
{
return ((u32)isc << 27) | 0x80000000;
}
static inline u8 int_word_to_isc(u32 int_word)
{
return (int_word & 0x38000000) >> 27;
}
/*
* To use atomic bitmap functions, we have to provide a bitmap address
* that is u64 aligned. However, the ipm might be u32 aligned.
* Therefore, we logically start the bitmap at the very beginning of the
* struct and fixup the bit number.
*/
#define IPM_BIT_OFFSET (offsetof(struct kvm_s390_gisa, ipm) * BITS_PER_BYTE)
static inline void kvm_s390_gisa_set_ipm_gisc(struct kvm_s390_gisa *gisa, u32 gisc)
{
set_bit_inv(IPM_BIT_OFFSET + gisc, (unsigned long *) gisa);
}
static inline u8 kvm_s390_gisa_get_ipm(struct kvm_s390_gisa *gisa)
{
return READ_ONCE(gisa->ipm);
}
static inline void kvm_s390_gisa_clear_ipm_gisc(struct kvm_s390_gisa *gisa, u32 gisc)
{
clear_bit_inv(IPM_BIT_OFFSET + gisc, (unsigned long *) gisa);
}
static inline int kvm_s390_gisa_tac_ipm_gisc(struct kvm_s390_gisa *gisa, u32 gisc)
{
return test_and_clear_bit_inv(IPM_BIT_OFFSET + gisc, (unsigned long *) gisa);
}
static inline unsigned long pending_irqs(struct kvm_vcpu *vcpu)
{
return vcpu->kvm->arch.float_int.pending_irqs |
vcpu->arch.local_int.pending_irqs;
vcpu->arch.local_int.pending_irqs |
kvm_s390_gisa_get_ipm(vcpu->kvm->arch.gisa) << IRQ_PEND_IO_ISC_7;
}
static inline int isc_to_irq_type(unsigned long isc)
{
return IRQ_PEND_IO_ISC_0 + isc;
return IRQ_PEND_IO_ISC_0 - isc;
}
static inline int irq_type_to_isc(unsigned long irq_type)
{
return irq_type - IRQ_PEND_IO_ISC_0;
return IRQ_PEND_IO_ISC_0 - irq_type;
}
static unsigned long disable_iscs(struct kvm_vcpu *vcpu,
......@@ -278,20 +311,20 @@ static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
static void __set_cpu_idle(struct kvm_vcpu *vcpu)
{
atomic_or(CPUSTAT_WAIT, &vcpu->arch.sie_block->cpuflags);
set_bit(vcpu->vcpu_id, vcpu->arch.local_int.float_int->idle_mask);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_WAIT);
set_bit(vcpu->vcpu_id, vcpu->kvm->arch.float_int.idle_mask);
}
static void __unset_cpu_idle(struct kvm_vcpu *vcpu)
{
atomic_andnot(CPUSTAT_WAIT, &vcpu->arch.sie_block->cpuflags);
clear_bit(vcpu->vcpu_id, vcpu->arch.local_int.float_int->idle_mask);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
clear_bit(vcpu->vcpu_id, vcpu->kvm->arch.float_int.idle_mask);
}
static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
{
atomic_andnot(CPUSTAT_IO_INT | CPUSTAT_EXT_INT | CPUSTAT_STOP_INT,
&vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_IO_INT | CPUSTAT_EXT_INT |
CPUSTAT_STOP_INT);
vcpu->arch.sie_block->lctl = 0x0000;
vcpu->arch.sie_block->ictl &= ~(ICTL_LPSW | ICTL_STCTL | ICTL_PINT);
......@@ -302,17 +335,12 @@ static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
}
}
static void __set_cpuflag(struct kvm_vcpu *vcpu, u32 flag)
{
atomic_or(flag, &vcpu->arch.sie_block->cpuflags);
}
static void set_intercept_indicators_io(struct kvm_vcpu *vcpu)
{
if (!(pending_irqs(vcpu) & IRQ_PEND_IO_MASK))
return;
else if (psw_ioint_disabled(vcpu))
__set_cpuflag(vcpu, CPUSTAT_IO_INT);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_IO_INT);
else
vcpu->arch.sie_block->lctl |= LCTL_CR6;
}
......@@ -322,7 +350,7 @@ static void set_intercept_indicators_ext(struct kvm_vcpu *vcpu)
if (!(pending_irqs(vcpu) & IRQ_PEND_EXT_MASK))
return;
if (psw_extint_disabled(vcpu))
__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_EXT_INT);
else
vcpu->arch.sie_block->lctl |= LCTL_CR0;
}
......@@ -340,7 +368,7 @@ static void set_intercept_indicators_mchk(struct kvm_vcpu *vcpu)
static void set_intercept_indicators_stop(struct kvm_vcpu *vcpu)
{
if (kvm_s390_is_stop_irq_pending(vcpu))
__set_cpuflag(vcpu, CPUSTAT_STOP_INT);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOP_INT);
}
/* Set interception request for non-deliverable interrupts */
......@@ -897,18 +925,38 @@ static int __must_check __deliver_virtio(struct kvm_vcpu *vcpu)
return rc ? -EFAULT : 0;
}
static int __do_deliver_io(struct kvm_vcpu *vcpu, struct kvm_s390_io_info *io)
{
int rc;
rc = put_guest_lc(vcpu, io->subchannel_id, (u16 *)__LC_SUBCHANNEL_ID);
rc |= put_guest_lc(vcpu, io->subchannel_nr, (u16 *)__LC_SUBCHANNEL_NR);
rc |= put_guest_lc(vcpu, io->io_int_parm, (u32 *)__LC_IO_INT_PARM);
rc |= put_guest_lc(vcpu, io->io_int_word, (u32 *)__LC_IO_INT_WORD);
rc |= write_guest_lc(vcpu, __LC_IO_OLD_PSW,
&vcpu->arch.sie_block->gpsw,
sizeof(psw_t));
rc |= read_guest_lc(vcpu, __LC_IO_NEW_PSW,
&vcpu->arch.sie_block->gpsw,
sizeof(psw_t));
return rc ? -EFAULT : 0;
}
static int __must_check __deliver_io(struct kvm_vcpu *vcpu,
unsigned long irq_type)
{
struct list_head *isc_list;
struct kvm_s390_float_interrupt *fi;
struct kvm_s390_interrupt_info *inti = NULL;
struct kvm_s390_io_info io;
u32 isc;
int rc = 0;
fi = &vcpu->kvm->arch.float_int;
spin_lock(&fi->lock);
isc_list = &fi->lists[irq_type_to_isc(irq_type)];
isc = irq_type_to_isc(irq_type);
isc_list = &fi->lists[isc];
inti = list_first_entry_or_null(isc_list,
struct kvm_s390_interrupt_info,
list);
......@@ -936,24 +984,31 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu,
spin_unlock(&fi->lock);
if (inti) {
rc = put_guest_lc(vcpu, inti->io.subchannel_id,
(u16 *)__LC_SUBCHANNEL_ID);
rc |= put_guest_lc(vcpu, inti->io.subchannel_nr,
(u16 *)__LC_SUBCHANNEL_NR);
rc |= put_guest_lc(vcpu, inti->io.io_int_parm,
(u32 *)__LC_IO_INT_PARM);
rc |= put_guest_lc(vcpu, inti->io.io_int_word,
(u32 *)__LC_IO_INT_WORD);
rc |= write_guest_lc(vcpu, __LC_IO_OLD_PSW,
&vcpu->arch.sie_block->gpsw,
sizeof(psw_t));
rc |= read_guest_lc(vcpu, __LC_IO_NEW_PSW,
&vcpu->arch.sie_block->gpsw,
sizeof(psw_t));
rc = __do_deliver_io(vcpu, &(inti->io));
kfree(inti);
goto out;
}
return rc ? -EFAULT : 0;
if (vcpu->kvm->arch.gisa &&
kvm_s390_gisa_tac_ipm_gisc(vcpu->kvm->arch.gisa, isc)) {
/*
* in case an adapter interrupt was not delivered
* in SIE context, KVM will handle the delivery
*/
VCPU_EVENT(vcpu, 4, "%s isc %u", "deliver: I/O (AI/gisa)", isc);
memset(&io, 0, sizeof(io));
io.io_int_word = isc_to_int_word(isc);
vcpu->stat.deliver_io_int++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
KVM_S390_INT_IO(1, 0, 0, 0),
((__u32)io.subchannel_id << 16) |
io.subchannel_nr,
((__u64)io.io_int_parm << 32) |
io.io_int_word);
rc = __do_deliver_io(vcpu, &io);
}
out:
return rc;
}
typedef int (*deliver_irq_t)(struct kvm_vcpu *vcpu);
......@@ -1155,8 +1210,8 @@ int __must_check kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
set_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs);
while ((irqs = deliverable_irqs(vcpu)) && !rc) {
/* bits are in the order of interrupt priority */
irq_type = find_first_bit(&irqs, IRQ_PEND_COUNT);
/* bits are in the reverse order of interrupt priority */
irq_type = find_last_bit(&irqs, IRQ_PEND_COUNT);
if (is_ioirq(irq_type)) {
rc = __deliver_io(vcpu, irq_type);
} else {
......@@ -1228,7 +1283,7 @@ static int __inject_pfault_init(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
li->irq.ext = irq->u.ext;
set_bit(IRQ_PEND_PFAULT_INIT, &li->pending_irqs);
atomic_or(CPUSTAT_EXT_INT, li->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_EXT_INT);
return 0;
}
......@@ -1253,7 +1308,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs))
return -EBUSY;
*extcall = irq->u.extcall;
atomic_or(CPUSTAT_EXT_INT, li->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_EXT_INT);
return 0;
}
......@@ -1297,7 +1352,7 @@ static int __inject_sigp_stop(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
if (test_and_set_bit(IRQ_PEND_SIGP_STOP, &li->pending_irqs))
return -EBUSY;
stop->flags = irq->u.stop.flags;
__set_cpuflag(vcpu, CPUSTAT_STOP_INT);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOP_INT);
return 0;
}
......@@ -1329,7 +1384,7 @@ static int __inject_sigp_emergency(struct kvm_vcpu *vcpu,
set_bit(irq->u.emerg.code, li->sigp_emerg_pending);
set_bit(IRQ_PEND_EXT_EMERGENCY, &li->pending_irqs);
atomic_or(CPUSTAT_EXT_INT, li->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_EXT_INT);
return 0;
}
......@@ -1373,7 +1428,7 @@ static int __inject_ckc(struct kvm_vcpu *vcpu)
0, 0);
set_bit(IRQ_PEND_EXT_CLOCK_COMP, &li->pending_irqs);
atomic_or(CPUSTAT_EXT_INT, li->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_EXT_INT);
return 0;
}
......@@ -1386,7 +1441,7 @@ static int __inject_cpu_timer(struct kvm_vcpu *vcpu)
0, 0);
set_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs);
atomic_or(CPUSTAT_EXT_INT, li->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_EXT_INT);
return 0;
}
......@@ -1416,20 +1471,86 @@ static struct kvm_s390_interrupt_info *get_io_int(struct kvm *kvm,
return NULL;
}
static struct kvm_s390_interrupt_info *get_top_io_int(struct kvm *kvm,
u64 isc_mask, u32 schid)
{
struct kvm_s390_interrupt_info *inti = NULL;
int isc;
for (isc = 0; isc <= MAX_ISC && !inti; isc++) {
if (isc_mask & isc_to_isc_bits(isc))
inti = get_io_int(kvm, isc, schid);
}
return inti;
}
static int get_top_gisa_isc(struct kvm *kvm, u64 isc_mask, u32 schid)
{
unsigned long active_mask;
int isc;
if (schid)
goto out;
if (!kvm->arch.gisa)
goto out;
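/*
 * Line the IPM byte up with the ISC bits of the mask and move the
 * result to the top of the word, so the most significant set bit is
 * the lowest-numbered (highest-priority) pending and enabled ISC;
 * the XOR with BITS_PER_LONG - 1 turns the bit position back into
 * the ISC number.
 */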
active_mask = (isc_mask & kvm_s390_gisa_get_ipm(kvm->arch.gisa) << 24) << 32;
while (active_mask) {
isc = __fls(active_mask) ^ (BITS_PER_LONG - 1);
if (kvm_s390_gisa_tac_ipm_gisc(kvm->arch.gisa, isc))
return isc;
clear_bit_inv(isc, &active_mask);
}
out:
return -EINVAL;
}
/*
* Dequeue and return an I/O interrupt matching any of the interruption
* subclasses as designated by the isc mask in cr6 and the schid (if != 0).
* Take into account the interrupts pending in the interrupt list and in GISA.
*
* Note that for a guest that does not enable I/O interrupts
* but relies on TPI, a flood of classic interrupts may starve
* out adapter interrupts on the same isc. Linux does not do
* that, and it is possible to work around the issue by configuring
* different iscs for classic and adapter interrupts in the guest,
* but we may want to revisit this in the future.
*/
struct kvm_s390_interrupt_info *kvm_s390_get_io_int(struct kvm *kvm,
u64 isc_mask, u32 schid)
{
struct kvm_s390_interrupt_info *inti = NULL;
struct kvm_s390_interrupt_info *inti, *tmp_inti;
int isc;
for (isc = 0; isc <= MAX_ISC && !inti; isc++) {
if (isc_mask & isc_to_isc_bits(isc))
inti = get_io_int(kvm, isc, schid);
inti = get_top_io_int(kvm, isc_mask, schid);
isc = get_top_gisa_isc(kvm, isc_mask, schid);
if (isc < 0)
/* no AI in GISA */
goto out;
if (!inti)
/* AI in GISA but no classical IO int */
goto gisa_out;
/* both types of interrupts present */
if (int_word_to_isc(inti->io.io_int_word) <= isc) {
/* classical IO int with higher priority */
kvm_s390_gisa_set_ipm_gisc(kvm->arch.gisa, isc);
goto out;
}
gisa_out:
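/*
 * The GISA adapter interrupt wins (or is the only one pending):
 * build a transient interrupt info for it and, if a classical
 * interrupt was dequeued above, put that one back on its list.
 */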
tmp_inti = kzalloc(sizeof(*inti), GFP_KERNEL);
if (tmp_inti) {
tmp_inti->type = KVM_S390_INT_IO(1, 0, 0, 0);
tmp_inti->io.io_int_word = isc_to_int_word(isc);
if (inti)
kvm_s390_reinject_io_int(kvm, inti);
inti = tmp_inti;
} else
kvm_s390_gisa_set_ipm_gisc(kvm->arch.gisa, isc);
out:
return inti;
}
......@@ -1517,6 +1638,15 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
struct list_head *list;
int isc;
isc = int_word_to_isc(inti->io.io_int_word);
if (kvm->arch.gisa && inti->type & KVM_S390_INT_IO_AI_MASK) {
VM_EVENT(kvm, 4, "%s isc %1u", "inject: I/O (AI/gisa)", isc);
kvm_s390_gisa_set_ipm_gisc(kvm->arch.gisa, isc);
kfree(inti);
return 0;
}
fi = &kvm->arch.float_int;
spin_lock(&fi->lock);
if (fi->counters[FIRQ_CNTR_IO] >= KVM_S390_MAX_FLOAT_IRQS) {
......@@ -1532,7 +1662,6 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
inti->io.subchannel_id >> 8,
inti->io.subchannel_id >> 1 & 0x3,
inti->io.subchannel_nr);
isc = int_word_to_isc(inti->io.io_int_word);
list = &fi->lists[FIRQ_LIST_IO_ISC_0 + isc];
list_add_tail(&inti->list, list);
set_bit(isc_to_irq_type(isc), &fi->pending_irqs);
......@@ -1546,7 +1675,6 @@ static int __inject_io(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
static void __floating_irq_kick(struct kvm *kvm, u64 type)
{
struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
struct kvm_s390_local_interrupt *li;
struct kvm_vcpu *dst_vcpu;
int sigcpu, online_vcpus, nr_tries = 0;
......@@ -1568,20 +1696,17 @@ static void __floating_irq_kick(struct kvm *kvm, u64 type)
dst_vcpu = kvm_get_vcpu(kvm, sigcpu);
/* make the VCPU drop out of the SIE, or wake it up if sleeping */
li = &dst_vcpu->arch.local_int;
spin_lock(&li->lock);
switch (type) {
case KVM_S390_MCHK:
atomic_or(CPUSTAT_STOP_INT, li->cpuflags);
kvm_s390_set_cpuflags(dst_vcpu, CPUSTAT_STOP_INT);
break;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
atomic_or(CPUSTAT_IO_INT, li->cpuflags);
kvm_s390_set_cpuflags(dst_vcpu, CPUSTAT_IO_INT);
break;
default:
atomic_or(CPUSTAT_EXT_INT, li->cpuflags);
kvm_s390_set_cpuflags(dst_vcpu, CPUSTAT_EXT_INT);
break;
}
spin_unlock(&li->lock);
kvm_s390_vcpu_wakeup(dst_vcpu);
}
......@@ -1820,6 +1945,7 @@ void kvm_s390_clear_float_irqs(struct kvm *kvm)
for (i = 0; i < FIRQ_MAX_COUNT; i++)
fi->counters[i] = 0;
spin_unlock(&fi->lock);
kvm_s390_gisa_clear(kvm);
};
static int get_all_floating_irqs(struct kvm *kvm, u8 __user *usrbuf, u64 len)
......@@ -1847,6 +1973,22 @@ static int get_all_floating_irqs(struct kvm *kvm, u8 __user *usrbuf, u64 len)
max_irqs = len / sizeof(struct kvm_s390_irq);
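/*
 * Drain adapter interrupts pending in the GISA into the buffer first,
 * synthesizing one I/O irq per pending ISC, before walking the
 * classical floating interrupt lists.
 */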
if (kvm->arch.gisa &&
kvm_s390_gisa_get_ipm(kvm->arch.gisa)) {
for (i = 0; i <= MAX_ISC; i++) {
if (n == max_irqs) {
/* signal userspace to try again */
ret = -ENOMEM;
goto out_nolock;
}
if (kvm_s390_gisa_tac_ipm_gisc(kvm->arch.gisa, i)) {
irq = (struct kvm_s390_irq *) &buf[n];
irq->type = KVM_S390_INT_IO(1, 0, 0, 0);
irq->u.io.io_int_word = isc_to_int_word(i);
n++;
}
}
}
fi = &kvm->arch.float_int;
spin_lock(&fi->lock);
for (i = 0; i < FIRQ_LIST_COUNT; i++) {
......@@ -1885,6 +2027,7 @@ static int get_all_floating_irqs(struct kvm *kvm, u8 __user *usrbuf, u64 len)
out:
spin_unlock(&fi->lock);
out_nolock:
if (!ret && n > 0) {
if (copy_to_user(usrbuf, buf, sizeof(struct kvm_s390_irq) * n))
ret = -EFAULT;
......@@ -2245,7 +2388,7 @@ static int kvm_s390_inject_airq(struct kvm *kvm,
struct kvm_s390_interrupt s390int = {
.type = KVM_S390_INT_IO(1, 0, 0, 0),
.parm = 0,
.parm64 = (adapter->isc << 27) | 0x80000000,
.parm64 = isc_to_int_word(adapter->isc),
};
int ret = 0;
......@@ -2687,3 +2830,28 @@ int kvm_s390_get_irq_state(struct kvm_vcpu *vcpu, __u8 __user *buf, int len)
return n;
}
void kvm_s390_gisa_clear(struct kvm *kvm)
{
if (kvm->arch.gisa) {
memset(kvm->arch.gisa, 0, sizeof(struct kvm_s390_gisa));
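/* A next_alert pointing at the GISA itself means it is not on any alert list. */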
kvm->arch.gisa->next_alert = (u32)(u64)kvm->arch.gisa;
VM_EVENT(kvm, 3, "gisa 0x%pK cleared", kvm->arch.gisa);
}
}
void kvm_s390_gisa_init(struct kvm *kvm)
{
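/* Only use a GISA if the machine provides adapter-interruption virtualization (AIV). */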
if (css_general_characteristics.aiv) {
kvm->arch.gisa = &kvm->arch.sie_page2->gisa;
VM_EVENT(kvm, 3, "gisa 0x%pK initialized", kvm->arch.gisa);
kvm_s390_gisa_clear(kvm);
}
}
void kvm_s390_gisa_destroy(struct kvm *kvm)
{
if (!kvm->arch.gisa)
return;
kvm->arch.gisa = NULL;
}
......@@ -2,7 +2,7 @@
/*
* hosting IBM Z kernel virtual machines (s390x)
*
* Copyright IBM Corp. 2008, 2017
* Copyright IBM Corp. 2008, 2018
*
* Author(s): Carsten Otte <cotte@de.ibm.com>
* Christian Borntraeger <borntraeger@de.ibm.com>
......@@ -87,19 +87,31 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "deliver_restart_signal", VCPU_STAT(deliver_restart_signal) },
{ "deliver_program_interruption", VCPU_STAT(deliver_program_int) },
{ "exit_wait_state", VCPU_STAT(exit_wait_state) },
{ "instruction_epsw", VCPU_STAT(instruction_epsw) },
{ "instruction_gs", VCPU_STAT(instruction_gs) },
{ "instruction_io_other", VCPU_STAT(instruction_io_other) },
{ "instruction_lpsw", VCPU_STAT(instruction_lpsw) },
{ "instruction_lpswe", VCPU_STAT(instruction_lpswe) },
{ "instruction_pfmf", VCPU_STAT(instruction_pfmf) },
{ "instruction_ptff", VCPU_STAT(instruction_ptff) },
{ "instruction_stidp", VCPU_STAT(instruction_stidp) },
{ "instruction_sck", VCPU_STAT(instruction_sck) },
{ "instruction_sckpf", VCPU_STAT(instruction_sckpf) },
{ "instruction_spx", VCPU_STAT(instruction_spx) },
{ "instruction_stpx", VCPU_STAT(instruction_stpx) },
{ "instruction_stap", VCPU_STAT(instruction_stap) },
{ "instruction_storage_key", VCPU_STAT(instruction_storage_key) },
{ "instruction_iske", VCPU_STAT(instruction_iske) },
{ "instruction_ri", VCPU_STAT(instruction_ri) },
{ "instruction_rrbe", VCPU_STAT(instruction_rrbe) },
{ "instruction_sske", VCPU_STAT(instruction_sske) },
{ "instruction_ipte_interlock", VCPU_STAT(instruction_ipte_interlock) },
{ "instruction_stsch", VCPU_STAT(instruction_stsch) },
{ "instruction_chsc", VCPU_STAT(instruction_chsc) },
{ "instruction_essa", VCPU_STAT(instruction_essa) },
{ "instruction_stsi", VCPU_STAT(instruction_stsi) },
{ "instruction_stfl", VCPU_STAT(instruction_stfl) },
{ "instruction_tb", VCPU_STAT(instruction_tb) },
{ "instruction_tpi", VCPU_STAT(instruction_tpi) },
{ "instruction_tprot", VCPU_STAT(instruction_tprot) },
{ "instruction_tsch", VCPU_STAT(instruction_tsch) },
{ "instruction_sthyi", VCPU_STAT(instruction_sthyi) },
{ "instruction_sie", VCPU_STAT(instruction_sie) },
{ "instruction_sigp_sense", VCPU_STAT(instruction_sigp_sense) },
......@@ -118,12 +130,13 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "instruction_sigp_cpu_reset", VCPU_STAT(instruction_sigp_cpu_reset) },
{ "instruction_sigp_init_cpu_reset", VCPU_STAT(instruction_sigp_init_cpu_reset) },
{ "instruction_sigp_unknown", VCPU_STAT(instruction_sigp_unknown) },
{ "diagnose_10", VCPU_STAT(diagnose_10) },
{ "diagnose_44", VCPU_STAT(diagnose_44) },
{ "diagnose_9c", VCPU_STAT(diagnose_9c) },
{ "diagnose_258", VCPU_STAT(diagnose_258) },
{ "diagnose_308", VCPU_STAT(diagnose_308) },
{ "diagnose_500", VCPU_STAT(diagnose_500) },
{ "instruction_diag_10", VCPU_STAT(diagnose_10) },
{ "instruction_diag_44", VCPU_STAT(diagnose_44) },
{ "instruction_diag_9c", VCPU_STAT(diagnose_9c) },
{ "instruction_diag_258", VCPU_STAT(diagnose_258) },
{ "instruction_diag_308", VCPU_STAT(diagnose_308) },
{ "instruction_diag_500", VCPU_STAT(diagnose_500) },
{ "instruction_diag_other", VCPU_STAT(diagnose_other) },
{ NULL }
};
......@@ -576,7 +589,7 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
case KVM_CAP_S390_GS:
r = -EINVAL;
mutex_lock(&kvm->lock);
if (atomic_read(&kvm->online_vcpus)) {
if (kvm->created_vcpus) {
r = -EBUSY;
} else if (test_facility(133)) {
set_kvm_facility(kvm->arch.model.fac_mask, 133);
......@@ -1088,7 +1101,6 @@ static int kvm_s390_set_processor_feat(struct kvm *kvm,
struct kvm_device_attr *attr)
{
struct kvm_s390_vm_cpu_feat data;
int ret = -EBUSY;
if (copy_from_user(&data, (void __user *)attr->addr, sizeof(data)))
return -EFAULT;
......@@ -1098,13 +1110,18 @@ static int kvm_s390_set_processor_feat(struct kvm *kvm,
return -EINVAL;
mutex_lock(&kvm->lock);
if (!atomic_read(&kvm->online_vcpus)) {
bitmap_copy(kvm->arch.cpu_feat, (unsigned long *) data.feat,
KVM_S390_VM_CPU_FEAT_NR_BITS);
ret = 0;
if (kvm->created_vcpus) {
mutex_unlock(&kvm->lock);
return -EBUSY;
}
bitmap_copy(kvm->arch.cpu_feat, (unsigned long *) data.feat,
KVM_S390_VM_CPU_FEAT_NR_BITS);
mutex_unlock(&kvm->lock);
return ret;
VM_EVENT(kvm, 3, "SET: guest feat: 0x%16.16llx.0x%16.16llx.0x%16.16llx",
data.feat[0],
data.feat[1],
data.feat[2]);
return 0;
}
static int kvm_s390_set_processor_subfunc(struct kvm *kvm,
......@@ -1206,6 +1223,10 @@ static int kvm_s390_get_processor_feat(struct kvm *kvm,
KVM_S390_VM_CPU_FEAT_NR_BITS);
if (copy_to_user((void __user *)attr->addr, &data, sizeof(data)))
return -EFAULT;
VM_EVENT(kvm, 3, "GET: guest feat: 0x%16.16llx.0x%16.16llx.0x%16.16llx",
data.feat[0],
data.feat[1],
data.feat[2]);
return 0;
}
......@@ -1219,6 +1240,10 @@ static int kvm_s390_get_machine_feat(struct kvm *kvm,
KVM_S390_VM_CPU_FEAT_NR_BITS);
if (copy_to_user((void __user *)attr->addr, &data, sizeof(data)))
return -EFAULT;
VM_EVENT(kvm, 3, "GET: host feat: 0x%16.16llx.0x%16.16llx.0x%16.16llx",
data.feat[0],
data.feat[1],
data.feat[2]);
return 0;
}
......@@ -1911,6 +1936,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (!kvm->arch.dbf)
goto out_err;
BUILD_BUG_ON(sizeof(struct sie_page2) != 4096);
kvm->arch.sie_page2 =
(struct sie_page2 *) get_zeroed_page(GFP_KERNEL | GFP_DMA);
if (!kvm->arch.sie_page2)
......@@ -1981,6 +2007,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
spin_lock_init(&kvm->arch.start_stop_lock);
kvm_s390_vsie_init(kvm);
kvm_s390_gisa_init(kvm);
KVM_EVENT(3, "vm 0x%pK created by pid %u", kvm, current->pid);
return 0;
......@@ -2043,6 +2070,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kvm_free_vcpus(kvm);
sca_dispose(kvm);
debug_unregister(kvm->arch.dbf);
kvm_s390_gisa_destroy(kvm);
free_page((unsigned long)kvm->arch.sie_page2);
if (!kvm_is_ucontrol(kvm))
gmap_remove(kvm->arch.gmap);
......@@ -2314,7 +2342,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
gmap_enable(vcpu->arch.enabled_gmap);
atomic_or(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_RUNNING);
if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
__start_cpu_timer_accounting(vcpu);
vcpu->cpu = cpu;
......@@ -2325,7 +2353,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
vcpu->cpu = -1;
if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
__stop_cpu_timer_accounting(vcpu);
atomic_andnot(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_RUNNING);
vcpu->arch.enabled_gmap = gmap_get_enabled();
gmap_disable(vcpu->arch.enabled_gmap);
......@@ -2422,9 +2450,9 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
CPUSTAT_STOPPED);
if (test_kvm_facility(vcpu->kvm, 78))
atomic_or(CPUSTAT_GED2, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_GED2);
else if (test_kvm_facility(vcpu->kvm, 8))
atomic_or(CPUSTAT_GED, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_GED);
kvm_s390_vcpu_setup_model(vcpu);
......@@ -2456,12 +2484,17 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
if (test_kvm_facility(vcpu->kvm, 139))
vcpu->arch.sie_block->ecd |= ECD_MEF;
if (vcpu->arch.sie_block->gd) {
vcpu->arch.sie_block->eca |= ECA_AIV;
VCPU_EVENT(vcpu, 3, "AIV gisa format-%u enabled for cpu %03u",
vcpu->arch.sie_block->gd & 0x3, vcpu->vcpu_id);
}
vcpu->arch.sie_block->sdnxo = ((unsigned long) &vcpu->run->s.regs.sdnx)
| SDNXC;
vcpu->arch.sie_block->riccbd = (unsigned long) &vcpu->run->s.regs.riccb;
if (sclp.has_kss)
atomic_or(CPUSTAT_KSS, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_KSS);
else
vcpu->arch.sie_block->ictl |= ICTL_ISKE | ICTL_SSKE | ICTL_RRBE;
......@@ -2508,9 +2541,9 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
vcpu->arch.sie_block->icpua = id;
spin_lock_init(&vcpu->arch.local_int.lock);
vcpu->arch.local_int.float_int = &kvm->arch.float_int;
vcpu->arch.local_int.wq = &vcpu->wq;
vcpu->arch.local_int.cpuflags = &vcpu->arch.sie_block->cpuflags;
vcpu->arch.sie_block->gd = (u32)(u64)kvm->arch.gisa;
if (vcpu->arch.sie_block->gd && sclp.has_gisaf)
vcpu->arch.sie_block->gd |= GISA_FORMAT1;
seqcount_init(&vcpu->arch.cputm_seqcount);
rc = kvm_vcpu_init(vcpu, kvm, id);
......@@ -2567,7 +2600,7 @@ static void kvm_s390_vcpu_request_handled(struct kvm_vcpu *vcpu)
* return immediately. */
void exit_sie(struct kvm_vcpu *vcpu)
{
atomic_or(CPUSTAT_STOP_INT, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOP_INT);
while (vcpu->arch.sie_block->prog0c & PROG_IN_SIE)
cpu_relax();
}
......@@ -2720,47 +2753,70 @@ static int kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
vcpu_load(vcpu);
memcpy(&vcpu->run->s.regs.gprs, &regs->gprs, sizeof(regs->gprs));
vcpu_put(vcpu);
return 0;
}
int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
vcpu_load(vcpu);
memcpy(&regs->gprs, &vcpu->run->s.regs.gprs, sizeof(regs->gprs));
vcpu_put(vcpu);
return 0;
}
int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
vcpu_load(vcpu);
memcpy(&vcpu->run->s.regs.acrs, &sregs->acrs, sizeof(sregs->acrs));
memcpy(&vcpu->arch.sie_block->gcr, &sregs->crs, sizeof(sregs->crs));
vcpu_put(vcpu);
return 0;
}
int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
vcpu_load(vcpu);
memcpy(&sregs->acrs, &vcpu->run->s.regs.acrs, sizeof(sregs->acrs));
memcpy(&sregs->crs, &vcpu->arch.sie_block->gcr, sizeof(sregs->crs));
vcpu_put(vcpu);
return 0;
}
int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
{
if (test_fp_ctl(fpu->fpc))
return -EINVAL;
int ret = 0;
vcpu_load(vcpu);
if (test_fp_ctl(fpu->fpc)) {
ret = -EINVAL;
goto out;
}
vcpu->run->s.regs.fpc = fpu->fpc;
if (MACHINE_HAS_VX)
convert_fp_to_vx((__vector128 *) vcpu->run->s.regs.vrs,
(freg_t *) fpu->fprs);
else
memcpy(vcpu->run->s.regs.fprs, &fpu->fprs, sizeof(fpu->fprs));
return 0;
out:
vcpu_put(vcpu);
return ret;
}
int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
{
vcpu_load(vcpu);
/* make sure we have the latest values */
save_fpu_regs();
if (MACHINE_HAS_VX)
......@@ -2769,6 +2825,8 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
else
memcpy(fpu->fprs, vcpu->run->s.regs.fprs, sizeof(fpu->fprs));
fpu->fpc = vcpu->run->s.regs.fpc;
vcpu_put(vcpu);
return 0;
}
......@@ -2800,41 +2858,56 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
{
int rc = 0;
vcpu_load(vcpu);
vcpu->guest_debug = 0;
kvm_s390_clear_bp_data(vcpu);
if (dbg->control & ~VALID_GUESTDBG_FLAGS)
return -EINVAL;
if (!sclp.has_gpere)
return -EINVAL;
if (dbg->control & ~VALID_GUESTDBG_FLAGS) {
rc = -EINVAL;
goto out;
}
if (!sclp.has_gpere) {
rc = -EINVAL;
goto out;
}
if (dbg->control & KVM_GUESTDBG_ENABLE) {
vcpu->guest_debug = dbg->control;
/* enforce guest PER */
atomic_or(CPUSTAT_P, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_P);
if (dbg->control & KVM_GUESTDBG_USE_HW_BP)
rc = kvm_s390_import_bp_data(vcpu, dbg);
} else {
atomic_andnot(CPUSTAT_P, &vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_P);
vcpu->arch.guestdbg.last_bp = 0;
}
if (rc) {
vcpu->guest_debug = 0;
kvm_s390_clear_bp_data(vcpu);
atomic_andnot(CPUSTAT_P, &vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_P);
}
out:
vcpu_put(vcpu);
return rc;
}
int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
{
int ret;
vcpu_load(vcpu);
/* CHECK_STOP and LOAD are not supported yet */
return is_vcpu_stopped(vcpu) ? KVM_MP_STATE_STOPPED :
KVM_MP_STATE_OPERATING;
ret = is_vcpu_stopped(vcpu) ? KVM_MP_STATE_STOPPED :
KVM_MP_STATE_OPERATING;
vcpu_put(vcpu);
return ret;
}
int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
......@@ -2842,6 +2915,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
{
int rc = 0;
vcpu_load(vcpu);
/* user space knows about this interface - let it control the state */
vcpu->kvm->arch.user_cpu_state_ctrl = 1;
......@@ -2859,12 +2934,13 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
rc = -ENXIO;
}
vcpu_put(vcpu);
return rc;
}
static bool ibs_enabled(struct kvm_vcpu *vcpu)
{
return atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_IBS;
return kvm_s390_test_cpuflags(vcpu, CPUSTAT_IBS);
}
static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
......@@ -2900,8 +2976,7 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_ENABLE_IBS, vcpu)) {
if (!ibs_enabled(vcpu)) {
trace_kvm_s390_enable_disable_ibs(vcpu->vcpu_id, 1);
atomic_or(CPUSTAT_IBS,
&vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_IBS);
}
goto retry;
}
......@@ -2909,8 +2984,7 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_DISABLE_IBS, vcpu)) {
if (ibs_enabled(vcpu)) {
trace_kvm_s390_enable_disable_ibs(vcpu->vcpu_id, 0);
atomic_andnot(CPUSTAT_IBS,
&vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_IBS);
}
goto retry;
}
......@@ -3390,9 +3464,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
if (kvm_run->immediate_exit)
return -EINTR;
vcpu_load(vcpu);
if (guestdbg_exit_pending(vcpu)) {
kvm_s390_prepare_debug_exit(vcpu);
return 0;
rc = 0;
goto out;
}
kvm_sigset_activate(vcpu);
......@@ -3402,7 +3479,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
} else if (is_vcpu_stopped(vcpu)) {
pr_err_ratelimited("can't run stopped vcpu %d\n",
vcpu->vcpu_id);
return -EINVAL;
rc = -EINVAL;
goto out;
}
sync_regs(vcpu, kvm_run);
......@@ -3432,6 +3510,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
kvm_sigset_deactivate(vcpu);
vcpu->stat.exit_userspace++;
out:
vcpu_put(vcpu);
return rc;
}
......@@ -3560,7 +3640,7 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
__disable_ibs_on_all_vcpus(vcpu->kvm);
}
atomic_andnot(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED);
/*
* Another VCPU might have used IBS while we were offline.
* Let's play safe and flush the VCPU at startup.
......@@ -3586,7 +3666,7 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
/* SIGP STOP and SIGP STOP AND STORE STATUS has been fully processed */
kvm_s390_clear_stop_irq(vcpu);
atomic_or(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
__disable_ibs_on_vcpu(vcpu);
for (i = 0; i < online_vcpus; i++) {
......@@ -3693,36 +3773,45 @@ static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu,
return r;
}
long kvm_arch_vcpu_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
long kvm_arch_vcpu_async_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
struct kvm_vcpu *vcpu = filp->private_data;
void __user *argp = (void __user *)arg;
int idx;
long r;
switch (ioctl) {
case KVM_S390_IRQ: {
struct kvm_s390_irq s390irq;
r = -EFAULT;
if (copy_from_user(&s390irq, argp, sizeof(s390irq)))
break;
r = kvm_s390_inject_vcpu(vcpu, &s390irq);
break;
return -EFAULT;
return kvm_s390_inject_vcpu(vcpu, &s390irq);
}
case KVM_S390_INTERRUPT: {
struct kvm_s390_interrupt s390int;
struct kvm_s390_irq s390irq;
r = -EFAULT;
if (copy_from_user(&s390int, argp, sizeof(s390int)))
break;
return -EFAULT;
if (s390int_to_s390irq(&s390int, &s390irq))
return -EINVAL;
r = kvm_s390_inject_vcpu(vcpu, &s390irq);
break;
return kvm_s390_inject_vcpu(vcpu, &s390irq);
}
}
return -ENOIOCTLCMD;
}
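/*
 * Editor's note: interrupt injection is handled in this "async" ioctl path,
 * i.e. before the generic code takes the vcpu mutex via vcpu_load(),
 * presumably so that interrupts can still be injected while the VCPU thread
 * sits in KVM_RUN.
 */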
long kvm_arch_vcpu_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
struct kvm_vcpu *vcpu = filp->private_data;
void __user *argp = (void __user *)arg;
int idx;
long r;
vcpu_load(vcpu);
switch (ioctl) {
case KVM_S390_STORE_STATUS:
idx = srcu_read_lock(&vcpu->kvm->srcu);
r = kvm_s390_vcpu_store_status(vcpu, arg);
......@@ -3847,6 +3936,8 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
default:
r = -ENOTTY;
}
vcpu_put(vcpu);
return r;
}
......
......@@ -47,14 +47,29 @@ do { \
d_args); \
} while (0)
static inline void kvm_s390_set_cpuflags(struct kvm_vcpu *vcpu, u32 flags)
{
atomic_or(flags, &vcpu->arch.sie_block->cpuflags);
}
static inline void kvm_s390_clear_cpuflags(struct kvm_vcpu *vcpu, u32 flags)
{
atomic_andnot(flags, &vcpu->arch.sie_block->cpuflags);
}
static inline bool kvm_s390_test_cpuflags(struct kvm_vcpu *vcpu, u32 flags)
{
return (atomic_read(&vcpu->arch.sie_block->cpuflags) & flags) == flags;
}
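/*
 * Editor's note: these wrappers replace the open-coded atomics used in the
 * hunks above, e.g. (old form vs. new form):
 *
 *	atomic_or(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
 *	kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
 */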
static inline int is_vcpu_stopped(struct kvm_vcpu *vcpu)
{
return atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_STOPPED;
return kvm_s390_test_cpuflags(vcpu, CPUSTAT_STOPPED);
}
static inline int is_vcpu_idle(struct kvm_vcpu *vcpu)
{
return test_bit(vcpu->vcpu_id, vcpu->arch.local_int.float_int->idle_mask);
return test_bit(vcpu->vcpu_id, vcpu->kvm->arch.float_int.idle_mask);
}
static inline int kvm_is_ucontrol(struct kvm *kvm)
......@@ -367,6 +382,9 @@ int kvm_s390_set_irq_state(struct kvm_vcpu *vcpu,
void __user *buf, int len);
int kvm_s390_get_irq_state(struct kvm_vcpu *vcpu,
__u8 __user *buf, int len);
void kvm_s390_gisa_init(struct kvm *kvm);
void kvm_s390_gisa_clear(struct kvm *kvm);
void kvm_s390_gisa_destroy(struct kvm *kvm);
/* implemented in guestdbg.c */
void kvm_s390_backup_guest_per_regs(struct kvm_vcpu *vcpu);
......
......@@ -2,7 +2,7 @@
/*
* handling privileged instructions
*
* Copyright IBM Corp. 2008, 2013
* Copyright IBM Corp. 2008, 2018
*
* Author(s): Carsten Otte <cotte@de.ibm.com>
* Christian Borntraeger <borntraeger@de.ibm.com>
......@@ -34,6 +34,8 @@
static int handle_ri(struct kvm_vcpu *vcpu)
{
vcpu->stat.instruction_ri++;
if (test_kvm_facility(vcpu->kvm, 64)) {
VCPU_EVENT(vcpu, 3, "%s", "ENABLE: RI (lazy)");
vcpu->arch.sie_block->ecb3 |= ECB3_RI;
......@@ -53,6 +55,8 @@ int kvm_s390_handle_aa(struct kvm_vcpu *vcpu)
static int handle_gs(struct kvm_vcpu *vcpu)
{
vcpu->stat.instruction_gs++;
if (test_kvm_facility(vcpu->kvm, 133)) {
VCPU_EVENT(vcpu, 3, "%s", "ENABLE: GS (lazy)");
preempt_disable();
......@@ -85,6 +89,8 @@ static int handle_set_clock(struct kvm_vcpu *vcpu)
u8 ar;
u64 op2, val;
vcpu->stat.instruction_sck++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -203,14 +209,14 @@ int kvm_s390_skey_check_enable(struct kvm_vcpu *vcpu)
trace_kvm_s390_skey_related_inst(vcpu);
if (!(sie_block->ictl & (ICTL_ISKE | ICTL_SSKE | ICTL_RRBE)) &&
!(atomic_read(&sie_block->cpuflags) & CPUSTAT_KSS))
!kvm_s390_test_cpuflags(vcpu, CPUSTAT_KSS))
return rc;
rc = s390_enable_skey();
VCPU_EVENT(vcpu, 3, "enabling storage keys for guest: %d", rc);
if (!rc) {
if (atomic_read(&sie_block->cpuflags) & CPUSTAT_KSS)
atomic_andnot(CPUSTAT_KSS, &sie_block->cpuflags);
if (kvm_s390_test_cpuflags(vcpu, CPUSTAT_KSS))
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_KSS);
else
sie_block->ictl &= ~(ICTL_ISKE | ICTL_SSKE |
ICTL_RRBE);
......@@ -222,7 +228,6 @@ static int try_handle_skey(struct kvm_vcpu *vcpu)
{
int rc;
vcpu->stat.instruction_storage_key++;
rc = kvm_s390_skey_check_enable(vcpu);
if (rc)
return rc;
......@@ -242,6 +247,8 @@ static int handle_iske(struct kvm_vcpu *vcpu)
int reg1, reg2;
int rc;
vcpu->stat.instruction_iske++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -274,6 +281,8 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
int reg1, reg2;
int rc;
vcpu->stat.instruction_rrbe++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -312,6 +321,8 @@ static int handle_sske(struct kvm_vcpu *vcpu)
int reg1, reg2;
int rc;
vcpu->stat.instruction_sske++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -392,6 +403,8 @@ static int handle_test_block(struct kvm_vcpu *vcpu)
gpa_t addr;
int reg2;
vcpu->stat.instruction_tb++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -424,6 +437,8 @@ static int handle_tpi(struct kvm_vcpu *vcpu)
u64 addr;
u8 ar;
vcpu->stat.instruction_tpi++;
addr = kvm_s390_get_base_disp_s(vcpu, &ar);
if (addr & 3)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
......@@ -484,6 +499,8 @@ static int handle_tsch(struct kvm_vcpu *vcpu)
struct kvm_s390_interrupt_info *inti = NULL;
const u64 isc_mask = 0xffUL << 24; /* all iscs set */
vcpu->stat.instruction_tsch++;
/* a valid schid has at least one bit set */
if (vcpu->run->s.regs.gprs[1])
inti = kvm_s390_get_io_int(vcpu->kvm, isc_mask,
......@@ -527,6 +544,7 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
if (vcpu->arch.sie_block->ipa == 0xb235)
return handle_tsch(vcpu);
/* Handle in userspace. */
vcpu->stat.instruction_io_other++;
return -EOPNOTSUPP;
} else {
/*
......@@ -592,6 +610,8 @@ int kvm_s390_handle_lpsw(struct kvm_vcpu *vcpu)
int rc;
u8 ar;
vcpu->stat.instruction_lpsw++;
if (gpsw->mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -619,6 +639,8 @@ static int handle_lpswe(struct kvm_vcpu *vcpu)
int rc;
u8 ar;
vcpu->stat.instruction_lpswe++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -828,6 +850,8 @@ static int handle_epsw(struct kvm_vcpu *vcpu)
{
int reg1, reg2;
vcpu->stat.instruction_epsw++;
kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
/* This basically extracts the mask half of the psw. */
......@@ -1332,6 +1356,8 @@ static int handle_sckpf(struct kvm_vcpu *vcpu)
{
u32 value;
vcpu->stat.instruction_sckpf++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -1347,6 +1373,8 @@ static int handle_sckpf(struct kvm_vcpu *vcpu)
static int handle_ptff(struct kvm_vcpu *vcpu)
{
vcpu->stat.instruction_ptff++;
/* we don't emulate any control instructions yet */
kvm_s390_set_psw_cc(vcpu, 3);
return 0;
......
......@@ -20,22 +20,18 @@
static int __sigp_sense(struct kvm_vcpu *vcpu, struct kvm_vcpu *dst_vcpu,
u64 *reg)
{
struct kvm_s390_local_interrupt *li;
int cpuflags;
const bool stopped = kvm_s390_test_cpuflags(dst_vcpu, CPUSTAT_STOPPED);
int rc;
int ext_call_pending;
li = &dst_vcpu->arch.local_int;
cpuflags = atomic_read(li->cpuflags);
ext_call_pending = kvm_s390_ext_call_pending(dst_vcpu);
if (!(cpuflags & CPUSTAT_STOPPED) && !ext_call_pending)
if (!stopped && !ext_call_pending)
rc = SIGP_CC_ORDER_CODE_ACCEPTED;
else {
*reg &= 0xffffffff00000000UL;
if (ext_call_pending)
*reg |= SIGP_STATUS_EXT_CALL_PENDING;
if (cpuflags & CPUSTAT_STOPPED)
if (stopped)
*reg |= SIGP_STATUS_STOPPED;
rc = SIGP_CC_STATUS_STORED;
}
......@@ -208,11 +204,9 @@ static int __sigp_store_status_at_addr(struct kvm_vcpu *vcpu,
struct kvm_vcpu *dst_vcpu,
u32 addr, u64 *reg)
{
int flags;
int rc;
flags = atomic_read(dst_vcpu->arch.local_int.cpuflags);
if (!(flags & CPUSTAT_STOPPED)) {
if (!kvm_s390_test_cpuflags(dst_vcpu, CPUSTAT_STOPPED)) {
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_INCORRECT_STATE;
return SIGP_CC_STATUS_STORED;
......@@ -231,7 +225,6 @@ static int __sigp_store_status_at_addr(struct kvm_vcpu *vcpu,
static int __sigp_sense_running(struct kvm_vcpu *vcpu,
struct kvm_vcpu *dst_vcpu, u64 *reg)
{
struct kvm_s390_local_interrupt *li;
int rc;
if (!test_kvm_facility(vcpu->kvm, 9)) {
......@@ -240,8 +233,7 @@ static int __sigp_sense_running(struct kvm_vcpu *vcpu,
return SIGP_CC_STATUS_STORED;
}
li = &dst_vcpu->arch.local_int;
if (atomic_read(li->cpuflags) & CPUSTAT_RUNNING) {
if (kvm_s390_test_cpuflags(dst_vcpu, CPUSTAT_RUNNING)) {
/* running */
rc = SIGP_CC_ORDER_CODE_ACCEPTED;
} else {
......
......@@ -28,13 +28,23 @@ struct vsie_page {
* the same offset as that in struct sie_page!
*/
struct mcck_volatile_info mcck_info; /* 0x0200 */
/* the pinned original scb */
/*
* The pinned original scb. Be aware that other VCPUs can modify
* it while we read from it. Values that are used for conditions or
* are reused conditionally, should be accessed via READ_ONCE.
*/
struct kvm_s390_sie_block *scb_o; /* 0x0218 */
/* the shadow gmap in use by the vsie_page */
struct gmap *gmap; /* 0x0220 */
/* address of the last reported fault to guest2 */
unsigned long fault_addr; /* 0x0228 */
__u8 reserved[0x0700 - 0x0230]; /* 0x0230 */
/* calculated guest addresses of satellite control blocks */
gpa_t sca_gpa; /* 0x0230 */
gpa_t itdba_gpa; /* 0x0238 */
gpa_t gvrd_gpa; /* 0x0240 */
gpa_t riccbd_gpa; /* 0x0248 */
gpa_t sdnx_gpa; /* 0x0250 */
__u8 reserved[0x0700 - 0x0258]; /* 0x0258 */
struct kvm_s390_crypto_cb crycb; /* 0x0700 */
__u8 fac[S390_ARCH_FAC_LIST_SIZE_BYTE]; /* 0x0800 */
};
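/*
 * Editor's note (illustrative, taken from the hunks below): because scb_o can
 * change under us, a field that feeds both a check and a later use is read
 * exactly once and only the snapshot is used afterwards, e.g.:
 *
 *	const uint32_t crycbd_o = READ_ONCE(scb_o->crycbd);
 *	const u32 crycb_addr = crycbd_o & 0x7ffffff8U;
 *	if (!(crycbd_o & vcpu->arch.sie_block->crycbd & CRYCB_FORMAT1))
 *		return 0;
 */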
......@@ -140,12 +150,13 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
struct kvm_s390_sie_block *scb_o = vsie_page->scb_o;
u32 crycb_addr = scb_o->crycbd & 0x7ffffff8U;
const uint32_t crycbd_o = READ_ONCE(scb_o->crycbd);
const u32 crycb_addr = crycbd_o & 0x7ffffff8U;
unsigned long *b1, *b2;
u8 ecb3_flags;
scb_s->crycbd = 0;
if (!(scb_o->crycbd & vcpu->arch.sie_block->crycbd & CRYCB_FORMAT1))
if (!(crycbd_o & vcpu->arch.sie_block->crycbd & CRYCB_FORMAT1))
return 0;
/* format-1 is supported with message-security-assist extension 3 */
if (!test_kvm_facility(vcpu->kvm, 76))
......@@ -183,12 +194,15 @@ static void prepare_ibc(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
struct kvm_s390_sie_block *scb_o = vsie_page->scb_o;
/* READ_ONCE does not work on bitfields - use a temporary variable */
const uint32_t __new_ibc = scb_o->ibc;
const uint32_t new_ibc = READ_ONCE(__new_ibc) & 0x0fffU;
__u64 min_ibc = (sclp.ibc >> 16) & 0x0fffU;
scb_s->ibc = 0;
/* ibc installed in g2 and requested for g3 */
if (vcpu->kvm->arch.model.ibc && (scb_o->ibc & 0x0fffU)) {
scb_s->ibc = scb_o->ibc & 0x0fffU;
if (vcpu->kvm->arch.model.ibc && new_ibc) {
scb_s->ibc = new_ibc;
/* take care of the minimum ibc level of the machine */
if (scb_s->ibc < min_ibc)
scb_s->ibc = min_ibc;
......@@ -259,6 +273,10 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_o = vsie_page->scb_o;
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
/* READ_ONCE does not work on bitfields - use a temporary variable */
const uint32_t __new_prefix = scb_o->prefix;
const uint32_t new_prefix = READ_ONCE(__new_prefix);
const bool wants_tx = READ_ONCE(scb_o->ecb) & ECB_TE;
bool had_tx = scb_s->ecb & ECB_TE;
unsigned long new_mso = 0;
int rc;
......@@ -306,14 +324,14 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
scb_s->icpua = scb_o->icpua;
if (!(atomic_read(&scb_s->cpuflags) & CPUSTAT_SM))
new_mso = scb_o->mso & 0xfffffffffff00000UL;
new_mso = READ_ONCE(scb_o->mso) & 0xfffffffffff00000UL;
/* if the hva of the prefix changes, we have to remap the prefix */
if (scb_s->mso != new_mso || scb_s->prefix != scb_o->prefix)
if (scb_s->mso != new_mso || scb_s->prefix != new_prefix)
prefix_unmapped(vsie_page);
/* SIE will do mso/msl validity and exception checks for us */
scb_s->msl = scb_o->msl & 0xfffffffffff00000UL;
scb_s->mso = new_mso;
scb_s->prefix = scb_o->prefix;
scb_s->prefix = new_prefix;
/* We have to definitely flush the tlb if this scb never ran */
if (scb_s->ihcpu != 0xffffU)
......@@ -325,11 +343,11 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
/* transactional execution */
if (test_kvm_facility(vcpu->kvm, 73)) {
if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
/* remap the prefix if tx is toggled on */
if ((scb_o->ecb & ECB_TE) && !had_tx)
if (!had_tx)
prefix_unmapped(vsie_page);
scb_s->ecb |= scb_o->ecb & ECB_TE;
scb_s->ecb |= ECB_TE;
}
/* branch prediction */
if (test_kvm_facility(vcpu->kvm, 82))
......@@ -473,46 +491,42 @@ static void unpin_guest_page(struct kvm *kvm, gpa_t gpa, hpa_t hpa)
/* unpin all blocks previously pinned by pin_blocks(), marking them dirty */
static void unpin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_o = vsie_page->scb_o;
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
hpa_t hpa;
gpa_t gpa;
hpa = (u64) scb_s->scaoh << 32 | scb_s->scaol;
if (hpa) {
gpa = scb_o->scaol & ~0xfUL;
if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_64BSCAO))
gpa |= (u64) scb_o->scaoh << 32;
unpin_guest_page(vcpu->kvm, gpa, hpa);
unpin_guest_page(vcpu->kvm, vsie_page->sca_gpa, hpa);
vsie_page->sca_gpa = 0;
scb_s->scaol = 0;
scb_s->scaoh = 0;
}
hpa = scb_s->itdba;
if (hpa) {
gpa = scb_o->itdba & ~0xffUL;
unpin_guest_page(vcpu->kvm, gpa, hpa);
unpin_guest_page(vcpu->kvm, vsie_page->itdba_gpa, hpa);
vsie_page->itdba_gpa = 0;
scb_s->itdba = 0;
}
hpa = scb_s->gvrd;
if (hpa) {
gpa = scb_o->gvrd & ~0x1ffUL;
unpin_guest_page(vcpu->kvm, gpa, hpa);
unpin_guest_page(vcpu->kvm, vsie_page->gvrd_gpa, hpa);
vsie_page->gvrd_gpa = 0;
scb_s->gvrd = 0;
}
hpa = scb_s->riccbd;
if (hpa) {
gpa = scb_o->riccbd & ~0x3fUL;
unpin_guest_page(vcpu->kvm, gpa, hpa);
unpin_guest_page(vcpu->kvm, vsie_page->riccbd_gpa, hpa);
vsie_page->riccbd_gpa = 0;
scb_s->riccbd = 0;
}
hpa = scb_s->sdnxo;
if (hpa) {
gpa = scb_o->sdnxo;
unpin_guest_page(vcpu->kvm, gpa, hpa);
unpin_guest_page(vcpu->kvm, vsie_page->sdnx_gpa, hpa);
vsie_page->sdnx_gpa = 0;
scb_s->sdnxo = 0;
}
}
......@@ -539,9 +553,9 @@ static int pin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
gpa_t gpa;
int rc = 0;
gpa = scb_o->scaol & ~0xfUL;
gpa = READ_ONCE(scb_o->scaol) & ~0xfUL;
if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_64BSCAO))
gpa |= (u64) scb_o->scaoh << 32;
gpa |= (u64) READ_ONCE(scb_o->scaoh) << 32;
if (gpa) {
if (!(gpa & ~0x1fffUL))
rc = set_validity_icpt(scb_s, 0x0038U);
......@@ -557,11 +571,12 @@ static int pin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
}
if (rc)
goto unpin;
vsie_page->sca_gpa = gpa;
scb_s->scaoh = (u32)((u64)hpa >> 32);
scb_s->scaol = (u32)(u64)hpa;
}
gpa = scb_o->itdba & ~0xffUL;
gpa = READ_ONCE(scb_o->itdba) & ~0xffUL;
if (gpa && (scb_s->ecb & ECB_TE)) {
if (!(gpa & ~0x1fffU)) {
rc = set_validity_icpt(scb_s, 0x0080U);
......@@ -573,10 +588,11 @@ static int pin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
rc = set_validity_icpt(scb_s, 0x0080U);
goto unpin;
}
vsie_page->itdba_gpa = gpa;
scb_s->itdba = hpa;
}
gpa = scb_o->gvrd & ~0x1ffUL;
gpa = READ_ONCE(scb_o->gvrd) & ~0x1ffUL;
if (gpa && (scb_s->eca & ECA_VX) && !(scb_s->ecd & ECD_HOSTREGMGMT)) {
if (!(gpa & ~0x1fffUL)) {
rc = set_validity_icpt(scb_s, 0x1310U);
......@@ -591,10 +607,11 @@ static int pin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
rc = set_validity_icpt(scb_s, 0x1310U);
goto unpin;
}
vsie_page->gvrd_gpa = gpa;
scb_s->gvrd = hpa;
}
gpa = scb_o->riccbd & ~0x3fUL;
gpa = READ_ONCE(scb_o->riccbd) & ~0x3fUL;
if (gpa && (scb_s->ecb3 & ECB3_RI)) {
if (!(gpa & ~0x1fffUL)) {
rc = set_validity_icpt(scb_s, 0x0043U);
......@@ -607,13 +624,14 @@ static int pin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
goto unpin;
}
/* Validity 0x0044 will be checked by SIE */
vsie_page->riccbd_gpa = gpa;
scb_s->riccbd = hpa;
}
if ((scb_s->ecb & ECB_GS) && !(scb_s->ecd & ECD_HOSTREGMGMT)) {
unsigned long sdnxc;
gpa = scb_o->sdnxo & ~0xfUL;
sdnxc = scb_o->sdnxo & 0xfUL;
gpa = READ_ONCE(scb_o->sdnxo) & ~0xfUL;
sdnxc = READ_ONCE(scb_o->sdnxo) & 0xfUL;
if (!gpa || !(gpa & ~0x1fffUL)) {
rc = set_validity_icpt(scb_s, 0x10b0U);
goto unpin;
......@@ -634,6 +652,7 @@ static int pin_blocks(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
rc = set_validity_icpt(scb_s, 0x10b0U);
goto unpin;
}
vsie_page->sdnx_gpa = gpa;
scb_s->sdnxo = hpa | sdnxc;
}
return 0;
......@@ -778,7 +797,7 @@ static void retry_vsie_icpt(struct vsie_page *vsie_page)
static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
__u32 fac = vsie_page->scb_o->fac & 0x7ffffff8U;
__u32 fac = READ_ONCE(vsie_page->scb_o->fac) & 0x7ffffff8U;
if (fac && test_kvm_facility(vcpu->kvm, 7)) {
retry_vsie_icpt(vsie_page);
......@@ -904,7 +923,7 @@ static void register_shadow_scb(struct kvm_vcpu *vcpu,
* External calls have to lead to a kick of the vcpu and
* therefore the vsie -> Simulate Wait state.
*/
atomic_or(CPUSTAT_WAIT, &vcpu->arch.sie_block->cpuflags);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_WAIT);
/*
* We have to adjust the g3 epoch by the g2 epoch. The epoch will
* automatically be adjusted on tod clock changes via kvm_sync_clock.
......@@ -926,7 +945,7 @@ static void register_shadow_scb(struct kvm_vcpu *vcpu,
*/
static void unregister_shadow_scb(struct kvm_vcpu *vcpu)
{
atomic_andnot(CPUSTAT_WAIT, &vcpu->arch.sie_block->cpuflags);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
WRITE_ONCE(vcpu->arch.vsie_block, NULL);
}
......
......@@ -815,27 +815,17 @@ static inline unsigned long *gmap_table_walk(struct gmap *gmap,
* @ptl: pointer to the spinlock pointer
*
* Returns a pointer to the locked pte for a guest address, or NULL
*
* Note: Can also be called for shadow gmaps.
*/
static pte_t *gmap_pte_op_walk(struct gmap *gmap, unsigned long gaddr,
spinlock_t **ptl)
{
unsigned long *table;
if (gmap_is_shadow(gmap))
spin_lock(&gmap->guest_table_lock);
BUG_ON(gmap_is_shadow(gmap));
/* Walk the gmap page table, lock and get pte pointer */
table = gmap_table_walk(gmap, gaddr, 1); /* get segment pointer */
if (!table || *table & _SEGMENT_ENTRY_INVALID) {
if (gmap_is_shadow(gmap))
spin_unlock(&gmap->guest_table_lock);
if (!table || *table & _SEGMENT_ENTRY_INVALID)
return NULL;
}
if (gmap_is_shadow(gmap)) {
*ptl = &gmap->guest_table_lock;
return pte_offset_map((pmd_t *) table, gaddr);
}
return pte_alloc_map_lock(gmap->mm, (pmd_t *) table, gaddr, ptl);
}
......@@ -889,8 +879,6 @@ static void gmap_pte_op_end(spinlock_t *ptl)
* -EFAULT if gaddr is invalid (or mapping for shadows is missing).
*
* Called with sg->mm->mmap_sem in read.
*
* Note: Can also be called for shadow gmaps.
*/
static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
unsigned long len, int prot, unsigned long bits)
......@@ -900,6 +888,7 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
pte_t *ptep;
int rc;
BUG_ON(gmap_is_shadow(gmap));
while (len) {
rc = -EAGAIN;
ptep = gmap_pte_op_walk(gmap, gaddr, &ptl);
......@@ -960,7 +949,8 @@ EXPORT_SYMBOL_GPL(gmap_mprotect_notify);
* @val: pointer to the unsigned long value to return
*
* Returns 0 if the value was read, -ENOMEM if out of memory and -EFAULT
* if reading using the virtual address failed.
* if reading using the virtual address failed. -EINVAL if called on a gmap
* shadow.
*
* Called with gmap->mm->mmap_sem in read.
*/
......@@ -971,6 +961,9 @@ int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val)
pte_t *ptep, pte;
int rc;
if (gmap_is_shadow(gmap))
return -EINVAL;
while (1) {
rc = -EAGAIN;
ptep = gmap_pte_op_walk(gmap, gaddr, &ptl);
......@@ -1028,18 +1021,17 @@ static inline void gmap_insert_rmap(struct gmap *sg, unsigned long vmaddr,
}
/**
* gmap_protect_rmap - modify access rights to memory and create an rmap
* gmap_protect_rmap - restrict access rights to memory (RO) and create an rmap
* @sg: pointer to the shadow guest address space structure
* @raddr: rmap address in the shadow gmap
* @paddr: address in the parent guest address space
* @len: length of the memory area to protect
* @prot: indicates access rights: none, read-only or read-write
*
* Returns 0 if successfully protected and the rmap was created, -ENOMEM
* if out of memory and -EFAULT if paddr is invalid.
*/
static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr,
unsigned long paddr, unsigned long len, int prot)
unsigned long paddr, unsigned long len)
{
struct gmap *parent;
struct gmap_rmap *rmap;
......@@ -1067,7 +1059,7 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr,
ptep = gmap_pte_op_walk(parent, paddr, &ptl);
if (ptep) {
spin_lock(&sg->guest_table_lock);
rc = ptep_force_prot(parent->mm, paddr, ptep, prot,
rc = ptep_force_prot(parent->mm, paddr, ptep, PROT_READ,
PGSTE_VSIE_BIT);
if (!rc)
gmap_insert_rmap(sg, vmaddr, rmap);
......@@ -1077,7 +1069,7 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr,
radix_tree_preload_end();
if (rc) {
kfree(rmap);
rc = gmap_pte_op_fixup(parent, paddr, vmaddr, prot);
rc = gmap_pte_op_fixup(parent, paddr, vmaddr, PROT_READ);
if (rc)
return rc;
continue;
......@@ -1616,7 +1608,7 @@ int gmap_shadow_r2t(struct gmap *sg, unsigned long saddr, unsigned long r2t,
origin = r2t & _REGION_ENTRY_ORIGIN;
offset = ((r2t & _REGION_ENTRY_OFFSET) >> 6) * PAGE_SIZE;
len = ((r2t & _REGION_ENTRY_LENGTH) + 1) * PAGE_SIZE - offset;
rc = gmap_protect_rmap(sg, raddr, origin + offset, len, PROT_READ);
rc = gmap_protect_rmap(sg, raddr, origin + offset, len);
spin_lock(&sg->guest_table_lock);
if (!rc) {
table = gmap_table_walk(sg, saddr, 4);
......@@ -1699,7 +1691,7 @@ int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,
origin = r3t & _REGION_ENTRY_ORIGIN;
offset = ((r3t & _REGION_ENTRY_OFFSET) >> 6) * PAGE_SIZE;
len = ((r3t & _REGION_ENTRY_LENGTH) + 1) * PAGE_SIZE - offset;
rc = gmap_protect_rmap(sg, raddr, origin + offset, len, PROT_READ);
rc = gmap_protect_rmap(sg, raddr, origin + offset, len);
spin_lock(&sg->guest_table_lock);
if (!rc) {
table = gmap_table_walk(sg, saddr, 3);
......@@ -1783,7 +1775,7 @@ int gmap_shadow_sgt(struct gmap *sg, unsigned long saddr, unsigned long sgt,
origin = sgt & _REGION_ENTRY_ORIGIN;
offset = ((sgt & _REGION_ENTRY_OFFSET) >> 6) * PAGE_SIZE;
len = ((sgt & _REGION_ENTRY_LENGTH) + 1) * PAGE_SIZE - offset;
rc = gmap_protect_rmap(sg, raddr, origin + offset, len, PROT_READ);
rc = gmap_protect_rmap(sg, raddr, origin + offset, len);
spin_lock(&sg->guest_table_lock);
if (!rc) {
table = gmap_table_walk(sg, saddr, 2);
......@@ -1902,7 +1894,7 @@ int gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt,
/* Make pgt read-only in parent gmap page table (not the pgste) */
raddr = (saddr & _SEGMENT_MASK) | _SHADOW_RMAP_SEGMENT;
origin = pgt & _SEGMENT_ENTRY_ORIGIN & PAGE_MASK;
rc = gmap_protect_rmap(sg, raddr, origin, PAGE_SIZE, PROT_READ);
rc = gmap_protect_rmap(sg, raddr, origin, PAGE_SIZE);
spin_lock(&sg->guest_table_lock);
if (!rc) {
table = gmap_table_walk(sg, saddr, 1);
......@@ -2005,7 +1997,7 @@ EXPORT_SYMBOL_GPL(gmap_shadow_page);
* Called with sg->parent->shadow_lock.
*/
static void gmap_shadow_notify(struct gmap *sg, unsigned long vmaddr,
unsigned long gaddr, pte_t *pte)
unsigned long gaddr)
{
struct gmap_rmap *rmap, *rnext, *head;
unsigned long start, end, bits, raddr;
......@@ -2090,7 +2082,7 @@ void ptep_notify(struct mm_struct *mm, unsigned long vmaddr,
spin_lock(&gmap->shadow_lock);
list_for_each_entry_safe(sg, next,
&gmap->children, list)
gmap_shadow_notify(sg, vmaddr, gaddr, pte);
gmap_shadow_notify(sg, vmaddr, gaddr);
spin_unlock(&gmap->shadow_lock);
}
if (bits & PGSTE_IN_BIT)
......
......@@ -900,6 +900,9 @@ BUILD_INTERRUPT3(xen_hvm_callback_vector, HYPERVISOR_CALLBACK_VECTOR,
BUILD_INTERRUPT3(hyperv_callback_vector, HYPERVISOR_CALLBACK_VECTOR,
hyperv_vector_handler)
BUILD_INTERRUPT3(hyperv_reenlightenment_vector, HYPERV_REENLIGHTENMENT_VECTOR,
hyperv_reenlightenment_intr)
#endif /* CONFIG_HYPERV */
ENTRY(page_fault)
......
......@@ -1136,6 +1136,9 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
#if IS_ENABLED(CONFIG_HYPERV)
apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
hyperv_callback_vector hyperv_vector_handler
apicinterrupt3 HYPERV_REENLIGHTENMENT_VECTOR \
hyperv_reenlightenment_vector hyperv_reenlightenment_intr
#endif /* CONFIG_HYPERV */
idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
......
......@@ -18,6 +18,8 @@
*/
#include <linux/types.h>
#include <asm/apic.h>
#include <asm/desc.h>
#include <asm/hypervisor.h>
#include <asm/hyperv.h>
#include <asm/mshyperv.h>
......@@ -37,6 +39,7 @@ struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
return tsc_pg;
}
EXPORT_SYMBOL_GPL(hv_get_tsc_page);
static u64 read_hv_clock_tsc(struct clocksource *arg)
{
......@@ -101,6 +104,115 @@ static int hv_cpu_init(unsigned int cpu)
return 0;
}
static void (*hv_reenlightenment_cb)(void);
static void hv_reenlightenment_notify(struct work_struct *dummy)
{
struct hv_tsc_emulation_status emu_status;
rdmsrl(HV_X64_MSR_TSC_EMULATION_STATUS, *(u64 *)&emu_status);
/* Don't issue the callback if TSC accesses are not emulated */
if (hv_reenlightenment_cb && emu_status.inprogress)
hv_reenlightenment_cb();
}
static DECLARE_DELAYED_WORK(hv_reenlightenment_work, hv_reenlightenment_notify);
void hyperv_stop_tsc_emulation(void)
{
u64 freq;
struct hv_tsc_emulation_status emu_status;
rdmsrl(HV_X64_MSR_TSC_EMULATION_STATUS, *(u64 *)&emu_status);
emu_status.inprogress = 0;
wrmsrl(HV_X64_MSR_TSC_EMULATION_STATUS, *(u64 *)&emu_status);
rdmsrl(HV_X64_MSR_TSC_FREQUENCY, freq);
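/* HV_X64_MSR_TSC_FREQUENCY presumably reports the frequency in Hz, hence the division down to kHz */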
tsc_khz = div64_u64(freq, 1000);
}
EXPORT_SYMBOL_GPL(hyperv_stop_tsc_emulation);
static inline bool hv_reenlightenment_available(void)
{
/*
* Check for required features and privileges to make TSC frequency
* change notifications work.
*/
return ms_hyperv.features & HV_X64_ACCESS_FREQUENCY_MSRS &&
ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE &&
ms_hyperv.features & HV_X64_ACCESS_REENLIGHTENMENT;
}
__visible void __irq_entry hyperv_reenlightenment_intr(struct pt_regs *regs)
{
entering_ack_irq();
inc_irq_stat(irq_hv_reenlightenment_count);
schedule_delayed_work(&hv_reenlightenment_work, HZ/10);
exiting_irq();
}
void set_hv_tscchange_cb(void (*cb)(void))
{
struct hv_reenlightenment_control re_ctrl = {
.vector = HYPERV_REENLIGHTENMENT_VECTOR,
.enabled = 1,
.target_vp = hv_vp_index[smp_processor_id()]
};
struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
if (!hv_reenlightenment_available()) {
pr_warn("Hyper-V: reenlightenment support is unavailable\n");
return;
}
hv_reenlightenment_cb = cb;
/* Make sure callback is registered before we write to MSRs */
wmb();
wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
wrmsrl(HV_X64_MSR_TSC_EMULATION_CONTROL, *((u64 *)&emu_ctrl));
}
EXPORT_SYMBOL_GPL(set_hv_tscchange_cb);
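/*
 * Editor's note (hypothetical usage sketch, not part of the patch): a client
 * such as KVM would register a callback so it can refresh its clock sources
 * once Hyper-V starts emulating TSC accesses around a migration, roughly:
 *
 *	static void my_tsc_notifier(void)		// hypothetical name
 *	{
 *		recompute_guest_clocks();		// hypothetical helper
 *		hyperv_stop_tsc_emulation();		// resume using the hardware TSC
 *	}
 *
 *	set_hv_tscchange_cb(my_tsc_notifier);
 *	...
 *	clear_hv_tscchange_cb();
 */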
void clear_hv_tscchange_cb(void)
{
struct hv_reenlightenment_control re_ctrl;
if (!hv_reenlightenment_available())
return;
rdmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *(u64 *)&re_ctrl);
re_ctrl.enabled = 0;
wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *(u64 *)&re_ctrl);
hv_reenlightenment_cb = NULL;
}
EXPORT_SYMBOL_GPL(clear_hv_tscchange_cb);
static int hv_cpu_die(unsigned int cpu)
{
struct hv_reenlightenment_control re_ctrl;
unsigned int new_cpu;
if (hv_reenlightenment_cb == NULL)
return 0;
rdmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
if (re_ctrl.target_vp == hv_vp_index[cpu]) {
/* Reassign to some other online CPU */
new_cpu = cpumask_any_but(cpu_online_mask, cpu);
re_ctrl.target_vp = hv_vp_index[new_cpu];
wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
}
return 0;
}
/*
* This function is to be invoked early in the boot sequence after the
* hypervisor has been detected.
......@@ -110,12 +222,19 @@ static int hv_cpu_init(unsigned int cpu)
*/
void hyperv_init(void)
{
u64 guest_id;
u64 guest_id, required_msrs;
union hv_x64_msr_hypercall_contents hypercall_msr;
if (x86_hyper_type != X86_HYPER_MS_HYPERV)
return;
/* Absolutely required MSRs */
required_msrs = HV_X64_MSR_HYPERCALL_AVAILABLE |
HV_X64_MSR_VP_INDEX_AVAILABLE;
if ((ms_hyperv.features & required_msrs) != required_msrs)
return;
/* Allocate percpu VP index */
hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index),
GFP_KERNEL);
......@@ -123,7 +242,7 @@ void hyperv_init(void)
return;
if (cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
hv_cpu_init, NULL) < 0)
hv_cpu_init, hv_cpu_die) < 0)
goto free_vp_index;
/*
......
......@@ -210,6 +210,7 @@
#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */
#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* "" Fill RSB on context switches */
#define X86_FEATURE_SEV ( 7*32+20) /* AMD Secure Encrypted Virtualization */
#define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
......
......@@ -38,6 +38,9 @@ typedef struct {
#if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
unsigned int irq_hv_callback_count;
#endif
#if IS_ENABLED(CONFIG_HYPERV)
unsigned int irq_hv_reenlightenment_count;
#endif
} ____cacheline_aligned irq_cpustat_t;
DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
......
......@@ -103,7 +103,12 @@
#endif
#define MANAGED_IRQ_SHUTDOWN_VECTOR 0xef
#define LOCAL_TIMER_VECTOR 0xee
#if IS_ENABLED(CONFIG_HYPERV)
#define HYPERV_REENLIGHTENMENT_VECTOR 0xee
#endif
#define LOCAL_TIMER_VECTOR 0xed
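/* LOCAL_TIMER_VECTOR moves down from 0xee to 0xed so that 0xee can be reused as HYPERV_REENLIGHTENMENT_VECTOR */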
#define NR_VECTORS 256
......
......@@ -86,7 +86,7 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
| X86_CR4_SMAP | X86_CR4_PKE))
| X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
......@@ -504,6 +504,7 @@ struct kvm_vcpu_arch {
int mp_state;
u64 ia32_misc_enable_msr;
u64 smbase;
u64 smi_count;
bool tpr_access_reporting;
u64 ia32_xss;
......@@ -760,6 +761,15 @@ enum kvm_irqchip_mode {
KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */
};
struct kvm_sev_info {
bool active; /* SEV enabled guest */
unsigned int asid; /* ASID used for this guest */
unsigned int handle; /* SEV firmware handle */
int fd; /* SEV device fd */
unsigned long pages_locked; /* Number of pages locked */
struct list_head regions_list; /* List of registered regions */
};
struct kvm_arch {
unsigned int n_used_mmu_pages;
unsigned int n_requested_mmu_pages;
......@@ -847,6 +857,8 @@ struct kvm_arch {
bool x2apic_format;
bool x2apic_broadcast_quirk_disabled;
struct kvm_sev_info sev_info;
};
struct kvm_vm_stat {
......@@ -883,7 +895,6 @@ struct kvm_vcpu_stat {
u64 request_irq_exits;
u64 irq_exits;
u64 host_state_reload;
u64 efer_reload;
u64 fpu_reload;
u64 insn_emulation;
u64 insn_emulation_fail;
......@@ -965,7 +976,7 @@ struct kvm_x86_ops {
unsigned long (*get_rflags)(struct kvm_vcpu *vcpu);
void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
void (*tlb_flush)(struct kvm_vcpu *vcpu);
void (*tlb_flush)(struct kvm_vcpu *vcpu, bool invalidate_gpa);
void (*run)(struct kvm_vcpu *vcpu);
int (*handle_exit)(struct kvm_vcpu *vcpu);
......@@ -1017,6 +1028,7 @@ struct kvm_x86_ops {
void (*handle_external_intr)(struct kvm_vcpu *vcpu);
bool (*mpx_supported)(void);
bool (*xsaves_supported)(void);
bool (*umip_emulated)(void);
int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
......@@ -1079,6 +1091,10 @@ struct kvm_x86_ops {
int (*pre_enter_smm)(struct kvm_vcpu *vcpu, char *smstate);
int (*pre_leave_smm)(struct kvm_vcpu *vcpu, u64 smbase);
int (*enable_smi_window)(struct kvm_vcpu *vcpu);
int (*mem_enc_op)(struct kvm *kvm, void __user *argp);
int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
};
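/*
 * Editor's note (sketch, assuming the collapsed x86.c hunk wires these up
 * like other optional kvm_x86_ops hooks): the new memory-encryption ioctls
 * would be dispatched roughly as
 *
 *	case KVM_MEMORY_ENCRYPT_OP:
 *		r = -ENOTTY;
 *		if (kvm_x86_ops->mem_enc_op)
 *			r = kvm_x86_ops->mem_enc_op(kvm, argp);
 *		break;
 */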
struct kvm_arch_async_pf {
......
(diff collapsed)
......@@ -397,6 +397,8 @@
#define MSR_K7_PERFCTR3 0xc0010007
#define MSR_K7_CLK_CTL 0xc001001b
#define MSR_K7_HWCR 0xc0010015
#define MSR_K7_HWCR_SMMLOCK_BIT 0
#define MSR_K7_HWCR_SMMLOCK BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
#define MSR_K7_FID_VID_CTL 0xc0010041
#define MSR_K7_FID_VID_STATUS 0xc0010042
......
......@@ -22,4 +22,6 @@ int io_reserve_memtype(resource_size_t start, resource_size_t end,
void io_free_memtype(resource_size_t start, resource_size_t end);
bool pat_pfn_immune_to_uc_mtrr(unsigned long pfn);
#endif /* _ASM_X86_PAT_H */
......@@ -146,6 +146,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL
#define SVM_VM_CR_SVM_DIS_MASK 0x0010ULL
#define SVM_NESTED_CTL_NP_ENABLE BIT(0)
#define SVM_NESTED_CTL_SEV_ENABLE BIT(1)
struct __attribute__ ((__packed__)) vmcb_seg {
u16 selector;
u16 attrib;
......
(3 diffs collapsed)
......@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_SME, CPUID_EAX, 0, 0x8000001f, 0 },
{ X86_FEATURE_SEV, CPUID_EAX, 1, 0x8000001f, 0 },
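/* SEV is advertised in CPUID leaf 0x8000001f, EAX bit 1 (SME is bit 0, just above) */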
{ 0, 0, 0, 0, 0 }
};
......
(43 further diffs collapsed)