提交 66cdd0ce 编写于 作者: L Linus Torvalds

Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Marcelo Tosatti:
 "Considerable KVM/PPC work, x86 kvmclock vsyscall support,
  IA32_TSC_ADJUST MSR emulation, amongst others."

Fix up trivial conflict in kernel/sched/core.c due to cross-cpu
migration notifier added next to rq migration call-back.

* tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (156 commits)
  KVM: emulator: fix real mode segment checks in address linearization
  VMX: remove unneeded enable_unrestricted_guest check
  KVM: VMX: fix DPL during entry to protected mode
  x86/kexec: crash_vmclear_local_vmcss needs __rcu
  kvm: Fix irqfd resampler list walk
  KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump
  x86/kexec: VMCLEAR VMCSs loaded on all cpus if necessary
  KVM: MMU: optimize for set_spte
  KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
  KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
  KVM: PPC: bookehv: Add guest computation mode for irq delivery
  KVM: PPC: Make EPCR a valid field for booke64 and bookehv
  KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
  KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
  KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
  KVM: PPC: e500: Add emulation helper for getting instruction ea
  KVM: PPC: bookehv64: Add support for interrupt handling
  KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
  KVM: PPC: booke: Fix get_tb() compile error on 64-bit
  KVM: PPC: e500: Silence bogus GCC warning in tlb code
  ...
......@@ -1194,12 +1194,15 @@ struct kvm_ppc_pvinfo {
This ioctl fetches PV specific information that need to be passed to the guest
using the device tree or other means from vm context.
For now the only implemented piece of information distributed here is an array
of 4 instructions that make up a hypercall.
The hcall array defines 4 instructions that make up a hypercall.
If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.
The flags bitmap is defined as:
/* the host supports the ePAPR idle hcall
#define KVM_PPC_PVINFO_FLAGS_EV_IDLE (1<<0)
4.48 KVM_ASSIGN_PCI_DEVICE
......@@ -1731,7 +1734,46 @@ registers, find a list below:
Arch | Register | Width (bits)
| |
PPC | KVM_REG_PPC_HIOR | 64
PPC | KVM_REG_PPC_IAC1 | 64
PPC | KVM_REG_PPC_IAC2 | 64
PPC | KVM_REG_PPC_IAC3 | 64
PPC | KVM_REG_PPC_IAC4 | 64
PPC | KVM_REG_PPC_DAC1 | 64
PPC | KVM_REG_PPC_DAC2 | 64
PPC | KVM_REG_PPC_DABR | 64
PPC | KVM_REG_PPC_DSCR | 64
PPC | KVM_REG_PPC_PURR | 64
PPC | KVM_REG_PPC_SPURR | 64
PPC | KVM_REG_PPC_DAR | 64
PPC | KVM_REG_PPC_DSISR | 32
PPC | KVM_REG_PPC_AMR | 64
PPC | KVM_REG_PPC_UAMOR | 64
PPC | KVM_REG_PPC_MMCR0 | 64
PPC | KVM_REG_PPC_MMCR1 | 64
PPC | KVM_REG_PPC_MMCRA | 64
PPC | KVM_REG_PPC_PMC1 | 32
PPC | KVM_REG_PPC_PMC2 | 32
PPC | KVM_REG_PPC_PMC3 | 32
PPC | KVM_REG_PPC_PMC4 | 32
PPC | KVM_REG_PPC_PMC5 | 32
PPC | KVM_REG_PPC_PMC6 | 32
PPC | KVM_REG_PPC_PMC7 | 32
PPC | KVM_REG_PPC_PMC8 | 32
PPC | KVM_REG_PPC_FPR0 | 64
...
PPC | KVM_REG_PPC_FPR31 | 64
PPC | KVM_REG_PPC_VR0 | 128
...
PPC | KVM_REG_PPC_VR31 | 128
PPC | KVM_REG_PPC_VSR0 | 128
...
PPC | KVM_REG_PPC_VSR31 | 128
PPC | KVM_REG_PPC_FPSCR | 64
PPC | KVM_REG_PPC_VSCR | 32
PPC | KVM_REG_PPC_VPA_ADDR | 64
PPC | KVM_REG_PPC_VPA_SLB | 128
PPC | KVM_REG_PPC_VPA_DTL | 128
PPC | KVM_REG_PPC_EPCR | 32
4.69 KVM_GET_ONE_REG
......@@ -1747,7 +1789,7 @@ kvm_one_reg struct passed in. On success, the register value can be found
at the memory location pointed to by "addr".
The list of registers accessible using this interface is identical to the
list in 4.64.
list in 4.68.
4.70 KVM_KVMCLOCK_CTRL
......@@ -1997,6 +2039,93 @@ return the hash table order in the parameter. (If the guest is using
the virtualized real-mode area (VRMA) facility, the kernel will
re-create the VMRA HPTEs on the next KVM_RUN of any vcpu.)
4.77 KVM_S390_INTERRUPT
Capability: basic
Architectures: s390
Type: vm ioctl, vcpu ioctl
Parameters: struct kvm_s390_interrupt (in)
Returns: 0 on success, -1 on error
Allows to inject an interrupt to the guest. Interrupts can be floating
(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.
Interrupt parameters are passed via kvm_s390_interrupt:
struct kvm_s390_interrupt {
__u32 type;
__u32 parm;
__u64 parm64;
};
type can be one of the following:
KVM_S390_SIGP_STOP (vcpu) - sigp restart
KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu) - restart
KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
parameters in parm and parm64
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
Note that the vcpu ioctl is asynchronous to vcpu execution.
4.78 KVM_PPC_GET_HTAB_FD
Capability: KVM_CAP_PPC_HTAB_FD
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to struct kvm_get_htab_fd (in)
Returns: file descriptor number (>= 0) on success, -1 on error
This returns a file descriptor that can be used either to read out the
entries in the guest's hashed page table (HPT), or to write entries to
initialize the HPT. The returned fd can only be written to if the
KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
can only be read if that bit is clear. The argument struct looks like
this:
/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
__u64 flags;
__u64 start_index;
__u64 reserved[2];
};
/* Values for kvm_get_htab_fd.flags */
#define KVM_GET_HTAB_BOLTED_ONLY ((__u64)0x1)
#define KVM_GET_HTAB_WRITE ((__u64)0x2)
The `start_index' field gives the index in the HPT of the entry at
which to start reading. It is ignored when writing.
Reads on the fd will initially supply information about all
"interesting" HPT entries. Interesting entries are those with the
bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
all entries. When the end of the HPT is reached, the read() will
return. If read() is called again on the fd, it will start again from
the beginning of the HPT, but will only return HPT entries that have
changed since they were last read.
Data read or written is structured as a header (8 bytes) followed by a
series of valid HPT entries (16 bytes) each. The header indicates how
many valid HPT entries there are and how many invalid entries follow
the valid entries. The invalid entries are not represented explicitly
in the stream. The header format is:
struct kvm_get_htab_header {
__u32 index;
__u16 n_valid;
__u16 n_invalid;
};
Writes to the fd create HPT entries starting at the index given in the
header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.
5. The kvm_run structure
------------------------
......@@ -2109,7 +2238,8 @@ executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR
and KVM_EXIT_PAPR the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN. The kernel side will first finish
incomplete operations and then check for pending signals. Userspace
......
......@@ -4314,10 +4314,10 @@ F: include/linux/kvm*
F: virt/kvm/
KERNEL VIRTUAL MACHINE (KVM) FOR AMD-V
M: Joerg Roedel <joerg.roedel@amd.com>
M: Joerg Roedel <joro@8bytes.org>
L: kvm@vger.kernel.org
W: http://kvm.qumranet.com
S: Supported
S: Maintained
F: arch/x86/include/asm/svm.h
F: arch/x86/kvm/svm.c
......@@ -4325,6 +4325,7 @@ KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC
M: Alexander Graf <agraf@suse.de>
L: kvm-ppc@vger.kernel.org
W: http://kvm.qumranet.com
T: git git://github.com/agraf/linux-2.6.git
S: Supported
F: arch/powerpc/include/asm/kvm*
F: arch/powerpc/kvm/
......
......@@ -1330,6 +1330,11 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
return 0;
}
int kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
{
return 0;
}
int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
{
return -EINVAL;
......@@ -1362,11 +1367,9 @@ static void kvm_release_vm_pages(struct kvm *kvm)
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int j;
unsigned long base_gfn;
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots) {
base_gfn = memslot->base_gfn;
for (j = 0; j < memslot->npages; j++) {
if (memslot->rmap[j])
put_page((struct page *)memslot->rmap[j]);
......
generic-y += clkdev.h
generic-y += rwsem.h
generic-y += trace_clock.h
......@@ -50,64 +50,13 @@
#ifndef _EPAPR_HCALLS_H
#define _EPAPR_HCALLS_H
#include <uapi/asm/epapr_hcalls.h>
#ifndef __ASSEMBLY__
#include <linux/types.h>
#include <linux/errno.h>
#include <asm/byteorder.h>
#define EV_BYTE_CHANNEL_SEND 1
#define EV_BYTE_CHANNEL_RECEIVE 2
#define EV_BYTE_CHANNEL_POLL 3
#define EV_INT_SET_CONFIG 4
#define EV_INT_GET_CONFIG 5
#define EV_INT_SET_MASK 6
#define EV_INT_GET_MASK 7
#define EV_INT_IACK 9
#define EV_INT_EOI 10
#define EV_INT_SEND_IPI 11
#define EV_INT_SET_TASK_PRIORITY 12
#define EV_INT_GET_TASK_PRIORITY 13
#define EV_DOORBELL_SEND 14
#define EV_MSGSND 15
#define EV_IDLE 16
/* vendor ID: epapr */
#define EV_LOCAL_VENDOR_ID 0 /* for private use */
#define EV_EPAPR_VENDOR_ID 1
#define EV_FSL_VENDOR_ID 2 /* Freescale Semiconductor */
#define EV_IBM_VENDOR_ID 3 /* IBM */
#define EV_GHS_VENDOR_ID 4 /* Green Hills Software */
#define EV_ENEA_VENDOR_ID 5 /* Enea */
#define EV_WR_VENDOR_ID 6 /* Wind River Systems */
#define EV_AMCC_VENDOR_ID 7 /* Applied Micro Circuits */
#define EV_KVM_VENDOR_ID 42 /* KVM */
/* The max number of bytes that a byte channel can send or receive per call */
#define EV_BYTE_CHANNEL_MAX_BYTES 16
#define _EV_HCALL_TOKEN(id, num) (((id) << 16) | (num))
#define EV_HCALL_TOKEN(hcall_num) _EV_HCALL_TOKEN(EV_EPAPR_VENDOR_ID, hcall_num)
/* epapr error codes */
#define EV_EPERM 1 /* Operation not permitted */
#define EV_ENOENT 2 /* Entry Not Found */
#define EV_EIO 3 /* I/O error occured */
#define EV_EAGAIN 4 /* The operation had insufficient
* resources to complete and should be
* retried
*/
#define EV_ENOMEM 5 /* There was insufficient memory to
* complete the operation */
#define EV_EFAULT 6 /* Bad guest address */
#define EV_ENODEV 7 /* No such device */
#define EV_EINVAL 8 /* An argument supplied to the hcall
was out of range or invalid */
#define EV_INTERNAL 9 /* An internal error occured */
#define EV_CONFIG 10 /* A configuration error was detected */
#define EV_INVALID_STATE 11 /* The object is in an invalid state */
#define EV_UNIMPLEMENTED 12 /* Unimplemented hypercall */
#define EV_BUFFER_OVERFLOW 13 /* Caller-supplied buffer too small */
/*
* Hypercall register clobber list
*
......@@ -193,7 +142,7 @@ static inline unsigned int ev_int_set_config(unsigned int interrupt,
r5 = priority;
r6 = destination;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6)
: : EV_HCALL_CLOBBERS4
);
......@@ -222,7 +171,7 @@ static inline unsigned int ev_int_get_config(unsigned int interrupt,
r11 = EV_HCALL_TOKEN(EV_INT_GET_CONFIG);
r3 = interrupt;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "=r" (r4), "=r" (r5), "=r" (r6)
: : EV_HCALL_CLOBBERS4
);
......@@ -252,7 +201,7 @@ static inline unsigned int ev_int_set_mask(unsigned int interrupt,
r3 = interrupt;
r4 = mask;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -277,7 +226,7 @@ static inline unsigned int ev_int_get_mask(unsigned int interrupt,
r11 = EV_HCALL_TOKEN(EV_INT_GET_MASK);
r3 = interrupt;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "=r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -305,7 +254,7 @@ static inline unsigned int ev_int_eoi(unsigned int interrupt)
r11 = EV_HCALL_TOKEN(EV_INT_EOI);
r3 = interrupt;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -344,7 +293,7 @@ static inline unsigned int ev_byte_channel_send(unsigned int handle,
r7 = be32_to_cpu(p[2]);
r8 = be32_to_cpu(p[3]);
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3),
"+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7), "+r" (r8)
: : EV_HCALL_CLOBBERS6
......@@ -383,7 +332,7 @@ static inline unsigned int ev_byte_channel_receive(unsigned int handle,
r3 = handle;
r4 = *count;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4),
"=r" (r5), "=r" (r6), "=r" (r7), "=r" (r8)
: : EV_HCALL_CLOBBERS6
......@@ -421,7 +370,7 @@ static inline unsigned int ev_byte_channel_poll(unsigned int handle,
r11 = EV_HCALL_TOKEN(EV_BYTE_CHANNEL_POLL);
r3 = handle;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "=r" (r4), "=r" (r5)
: : EV_HCALL_CLOBBERS3
);
......@@ -454,7 +403,7 @@ static inline unsigned int ev_int_iack(unsigned int handle,
r11 = EV_HCALL_TOKEN(EV_INT_IACK);
r3 = handle;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "=r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -478,7 +427,7 @@ static inline unsigned int ev_doorbell_send(unsigned int handle)
r11 = EV_HCALL_TOKEN(EV_DOORBELL_SEND);
r3 = handle;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -498,12 +447,12 @@ static inline unsigned int ev_idle(void)
r11 = EV_HCALL_TOKEN(EV_IDLE);
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "=r" (r3)
: : EV_HCALL_CLOBBERS1
);
return r3;
}
#endif
#endif /* !__ASSEMBLY__ */
#endif /* _EPAPR_HCALLS_H */
......@@ -96,7 +96,7 @@ static inline unsigned int fh_send_nmi(unsigned int vcpu_mask)
r11 = FH_HCALL_TOKEN(FH_SEND_NMI);
r3 = vcpu_mask;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -151,7 +151,7 @@ static inline unsigned int fh_partition_get_dtprop(int handle,
r9 = (uint32_t)propvalue_addr;
r10 = *propvalue_len;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11),
"+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7),
"+r" (r8), "+r" (r9), "+r" (r10)
......@@ -205,7 +205,7 @@ static inline unsigned int fh_partition_set_dtprop(int handle,
r9 = (uint32_t)propvalue_addr;
r10 = propvalue_len;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11),
"+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7),
"+r" (r8), "+r" (r9), "+r" (r10)
......@@ -229,7 +229,7 @@ static inline unsigned int fh_partition_restart(unsigned int partition)
r11 = FH_HCALL_TOKEN(FH_PARTITION_RESTART);
r3 = partition;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -262,7 +262,7 @@ static inline unsigned int fh_partition_get_status(unsigned int partition,
r11 = FH_HCALL_TOKEN(FH_PARTITION_GET_STATUS);
r3 = partition;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "=r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -295,7 +295,7 @@ static inline unsigned int fh_partition_start(unsigned int partition,
r4 = entry_point;
r5 = load;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5)
: : EV_HCALL_CLOBBERS3
);
......@@ -317,7 +317,7 @@ static inline unsigned int fh_partition_stop(unsigned int partition)
r11 = FH_HCALL_TOKEN(FH_PARTITION_STOP);
r3 = partition;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -376,7 +376,7 @@ static inline unsigned int fh_partition_memcpy(unsigned int source,
#endif
r7 = count;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11),
"+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7)
: : EV_HCALL_CLOBBERS5
......@@ -399,7 +399,7 @@ static inline unsigned int fh_dma_enable(unsigned int liodn)
r11 = FH_HCALL_TOKEN(FH_DMA_ENABLE);
r3 = liodn;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -421,7 +421,7 @@ static inline unsigned int fh_dma_disable(unsigned int liodn)
r11 = FH_HCALL_TOKEN(FH_DMA_DISABLE);
r3 = liodn;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -447,7 +447,7 @@ static inline unsigned int fh_vmpic_get_msir(unsigned int interrupt,
r11 = FH_HCALL_TOKEN(FH_VMPIC_GET_MSIR);
r3 = interrupt;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "=r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -469,7 +469,7 @@ static inline unsigned int fh_system_reset(void)
r11 = FH_HCALL_TOKEN(FH_SYSTEM_RESET);
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "=r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -506,7 +506,7 @@ static inline unsigned int fh_err_get_info(int queue, uint32_t *bufsize,
r6 = addr_lo;
r7 = peek;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6),
"+r" (r7)
: : EV_HCALL_CLOBBERS5
......@@ -542,7 +542,7 @@ static inline unsigned int fh_get_core_state(unsigned int handle,
r3 = handle;
r4 = vcpu;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -572,7 +572,7 @@ static inline unsigned int fh_enter_nap(unsigned int handle, unsigned int vcpu)
r3 = handle;
r4 = vcpu;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -597,7 +597,7 @@ static inline unsigned int fh_exit_nap(unsigned int handle, unsigned int vcpu)
r3 = handle;
r4 = vcpu;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3), "+r" (r4)
: : EV_HCALL_CLOBBERS2
);
......@@ -618,7 +618,7 @@ static inline unsigned int fh_claim_device(unsigned int handle)
r11 = FH_HCALL_TOKEN(FH_CLAIM_DEVICE);
r3 = handle;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......@@ -645,7 +645,7 @@ static inline unsigned int fh_partition_stop_dma(unsigned int handle)
r11 = FH_HCALL_TOKEN(FH_PARTITION_STOP_DMA);
r3 = handle;
__asm__ __volatile__ ("sc 1"
asm volatile("bl epapr_hypercall_start"
: "+r" (r11), "+r" (r3)
: : EV_HCALL_CLOBBERS1
);
......
......@@ -118,6 +118,7 @@
#define RESUME_FLAG_NV (1<<0) /* Reload guest nonvolatile state? */
#define RESUME_FLAG_HOST (1<<1) /* Resume host? */
#define RESUME_FLAG_ARCH1 (1<<2)
#define RESUME_GUEST 0
#define RESUME_GUEST_NV RESUME_FLAG_NV
......
......@@ -81,6 +81,8 @@ struct kvmppc_vcpu_book3s {
u64 sdr1;
u64 hior;
u64 msr_mask;
u64 purr_offset;
u64 spurr_offset;
#ifdef CONFIG_PPC_BOOK3S_32
u32 vsid_pool[VSID_POOL_SIZE];
u32 vsid_next;
......@@ -157,10 +159,14 @@ extern void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long addr,
extern void kvmppc_unpin_guest_page(struct kvm *kvm, void *addr);
extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel);
extern long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel);
extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel,
pgd_t *pgdir, bool realmode, unsigned long *idx_ret);
extern long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
unsigned long pte_index, unsigned long avpn,
unsigned long *hpret);
extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
struct kvm_memory_slot *memslot);
struct kvm_memory_slot *memslot, unsigned long *map);
extern void kvmppc_entry_trampoline(void);
extern void kvmppc_hv_entry_trampoline(void);
......
......@@ -50,6 +50,15 @@ extern int kvm_hpt_order; /* order of preallocated HPTs */
#define HPTE_V_HVLOCK 0x40UL
#define HPTE_V_ABSENT 0x20UL
/*
* We use this bit in the guest_rpte field of the revmap entry
* to indicate a modified HPTE.
*/
#define HPTE_GR_MODIFIED (1ul << 62)
/* These bits are reserved in the guest view of the HPTE */
#define HPTE_GR_RESERVED HPTE_GR_MODIFIED
static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
{
unsigned long tmp, old;
......@@ -60,7 +69,7 @@ static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
" ori %0,%0,%4\n"
" stdcx. %0,0,%2\n"
" beq+ 2f\n"
" li %1,%3\n"
" mr %1,%3\n"
"2: isync"
: "=&r" (tmp), "=&r" (old)
: "r" (hpte), "r" (bits), "i" (HPTE_V_HVLOCK)
......@@ -237,4 +246,26 @@ static inline bool slot_is_aligned(struct kvm_memory_slot *memslot,
return !(memslot->base_gfn & mask) && !(memslot->npages & mask);
}
/*
* This works for 4k, 64k and 16M pages on POWER7,
* and 4k and 16M pages on PPC970.
*/
static inline unsigned long slb_pgsize_encoding(unsigned long psize)
{
unsigned long senc = 0;
if (psize > 0x1000) {
senc = SLB_VSID_L;
if (psize == 0x10000)
senc |= SLB_VSID_LP_01;
}
return senc;
}
static inline int is_vrma_hpte(unsigned long hpte_v)
{
return (hpte_v & ~0xffffffUL) ==
(HPTE_V_1TB_SEG | (VRMA_VSID << (40 - 16)));
}
#endif /* __ASM_KVM_BOOK3S_64_H__ */
......@@ -17,6 +17,7 @@
* there are no exceptions for which we fall through directly to
* the normal host handler.
*
* 32-bit host
* Expected inputs (normal exceptions):
* SCRATCH0 = saved r10
* r10 = thread struct
......@@ -33,14 +34,38 @@
* *(r8 + GPR9) = saved r9
* *(r8 + GPR10) = saved r10 (r10 not yet clobbered)
* *(r8 + GPR11) = saved r11
*
* 64-bit host
* Expected inputs (GEN/GDBELL/DBG/MC exception types):
* r10 = saved CR
* r13 = PACA_POINTER
* *(r13 + PACA_EX##type + EX_R10) = saved r10
* *(r13 + PACA_EX##type + EX_R11) = saved r11
* SPRN_SPRG_##type##_SCRATCH = saved r13
*
* Expected inputs (CRIT exception type):
* r10 = saved CR
* r13 = PACA_POINTER
* *(r13 + PACA_EX##type + EX_R10) = saved r10
* *(r13 + PACA_EX##type + EX_R11) = saved r11
* *(r13 + PACA_EX##type + EX_R13) = saved r13
*
* Expected inputs (TLB exception type):
* r10 = saved CR
* r13 = PACA_POINTER
* *(r13 + PACA_EX##type + EX_TLB_R10) = saved r10
* *(r13 + PACA_EX##type + EX_TLB_R11) = saved r11
* SPRN_SPRG_GEN_SCRATCH = saved r13
*
* Only the bolted version of TLB miss exception handlers is supported now.
*/
.macro DO_KVM intno srr1
#ifdef CONFIG_KVM_BOOKE_HV
BEGIN_FTR_SECTION
mtocrf 0x80, r11 /* check MSR[GS] without clobbering reg */
bf 3, kvmppc_resume_\intno\()_\srr1
bf 3, 1975f
b kvmppc_handler_\intno\()_\srr1
kvmppc_resume_\intno\()_\srr1:
1975:
END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
#endif
.endm
......
......@@ -46,7 +46,7 @@
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
#endif
#ifdef CONFIG_KVM_BOOK3S_64_HV
#if !defined(CONFIG_KVM_440)
#include <linux/mmu_notifier.h>
#define KVM_ARCH_WANT_MMU_NOTIFIER
......@@ -204,7 +204,7 @@ struct revmap_entry {
};
/*
* We use the top bit of each memslot->rmap entry as a lock bit,
* We use the top bit of each memslot->arch.rmap entry as a lock bit,
* and bit 32 as a present flag. The bottom 32 bits are the
* index in the guest HPT of a HPTE that points to the page.
*/
......@@ -215,14 +215,17 @@ struct revmap_entry {
#define KVMPPC_RMAP_PRESENT 0x100000000ul
#define KVMPPC_RMAP_INDEX 0xfffffffful
/* Low-order bits in kvm->arch.slot_phys[][] */
/* Low-order bits in memslot->arch.slot_phys[] */
#define KVMPPC_PAGE_ORDER_MASK 0x1f
#define KVMPPC_PAGE_NO_CACHE HPTE_R_I /* 0x20 */
#define KVMPPC_PAGE_WRITETHRU HPTE_R_W /* 0x40 */
#define KVMPPC_GOT_PAGE 0x80
struct kvm_arch_memory_slot {
#ifdef CONFIG_KVM_BOOK3S_64_HV
unsigned long *rmap;
unsigned long *slot_phys;
#endif /* CONFIG_KVM_BOOK3S_64_HV */
};
struct kvm_arch {
......@@ -243,12 +246,12 @@ struct kvm_arch {
int using_mmu_notifiers;
u32 hpt_order;
atomic_t vcpus_running;
u32 online_vcores;
unsigned long hpt_npte;
unsigned long hpt_mask;
atomic_t hpte_mod_interest;
spinlock_t slot_phys_lock;
unsigned long *slot_phys[KVM_MEM_SLOTS_NUM];
int slot_npages[KVM_MEM_SLOTS_NUM];
unsigned short last_vcpu[NR_CPUS];
cpumask_t need_tlb_flush;
struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
struct kvmppc_linear_info *hpt_li;
#endif /* CONFIG_KVM_BOOK3S_64_HV */
......@@ -273,6 +276,7 @@ struct kvmppc_vcore {
int nap_count;
int napping_threads;
u16 pcpu;
u16 last_cpu;
u8 vcore_state;
u8 in_guest;
struct list_head runnable_threads;
......@@ -288,9 +292,10 @@ struct kvmppc_vcore {
/* Values for vcore_state */
#define VCORE_INACTIVE 0
#define VCORE_RUNNING 1
#define VCORE_EXITING 2
#define VCORE_SLEEPING 3
#define VCORE_SLEEPING 1
#define VCORE_STARTING 2
#define VCORE_RUNNING 3
#define VCORE_EXITING 4
/*
* Struct used to manage memory for a virtual processor area
......@@ -346,6 +351,27 @@ struct kvmppc_slb {
bool class : 1;
};
# ifdef CONFIG_PPC_FSL_BOOK3E
#define KVMPPC_BOOKE_IAC_NUM 2
#define KVMPPC_BOOKE_DAC_NUM 2
# else
#define KVMPPC_BOOKE_IAC_NUM 4
#define KVMPPC_BOOKE_DAC_NUM 2
# endif
#define KVMPPC_BOOKE_MAX_IAC 4
#define KVMPPC_BOOKE_MAX_DAC 2
struct kvmppc_booke_debug_reg {
u32 dbcr0;
u32 dbcr1;
u32 dbcr2;
#ifdef CONFIG_KVM_E500MC
u32 dbcr4;
#endif
u64 iac[KVMPPC_BOOKE_MAX_IAC];
u64 dac[KVMPPC_BOOKE_MAX_DAC];
};
struct kvm_vcpu_arch {
ulong host_stack;
u32 host_pid;
......@@ -380,13 +406,18 @@ struct kvm_vcpu_arch {
u32 host_mas4;
u32 host_mas6;
u32 shadow_epcr;
u32 epcr;
u32 shadow_msrp;
u32 eplc;
u32 epsc;
u32 oldpir;
#endif
#if defined(CONFIG_BOOKE)
#if defined(CONFIG_KVM_BOOKE_HV) || defined(CONFIG_64BIT)
u32 epcr;
#endif
#endif
#ifdef CONFIG_PPC_BOOK3S
/* For Gekko paired singles */
u32 qpr[32];
......@@ -440,8 +471,6 @@ struct kvm_vcpu_arch {
u32 ccr0;
u32 ccr1;
u32 dbcr0;
u32 dbcr1;
u32 dbsr;
u64 mmcr[3];
......@@ -471,9 +500,12 @@ struct kvm_vcpu_arch {
ulong fault_esr;
ulong queued_dear;
ulong queued_esr;
spinlock_t wdt_lock;
struct timer_list wdt_timer;
u32 tlbcfg[4];
u32 mmucfg;
u32 epr;
struct kvmppc_booke_debug_reg dbg_reg;
#endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
......@@ -486,6 +518,7 @@ struct kvm_vcpu_arch {
u8 osi_needed;
u8 osi_enabled;
u8 papr_enabled;
u8 watchdog_enabled;
u8 sane;
u8 cpu_type;
u8 hcall_needed;
......@@ -497,7 +530,6 @@ struct kvm_vcpu_arch {
u64 dec_jiffies;
u64 dec_expires;
unsigned long pending_exceptions;
u16 last_cpu;
u8 ceded;
u8 prodded;
u32 last_inst;
......@@ -534,13 +566,17 @@ struct kvm_vcpu_arch {
unsigned long dtl_index;
u64 stolen_logged;
struct kvmppc_vpa slb_shadow;
spinlock_t tbacct_lock;
u64 busy_stolen;
u64 busy_preempt;
#endif
};
/* Values for vcpu->arch.state */
#define KVMPPC_VCPU_STOPPED 0
#define KVMPPC_VCPU_BUSY_IN_HOST 1
#define KVMPPC_VCPU_RUNNABLE 2
#define KVMPPC_VCPU_NOTREADY 0
#define KVMPPC_VCPU_RUNNABLE 1
#define KVMPPC_VCPU_BUSY_IN_HOST 2
/* Values for vcpu->arch.io_gpr */
#define KVM_MMIO_REG_MASK 0x001f
......
......@@ -21,7 +21,6 @@
#include <uapi/asm/kvm_para.h>
#ifdef CONFIG_KVM_GUEST
#include <linux/of.h>
......@@ -55,7 +54,7 @@ static unsigned long kvm_hypercall(unsigned long *in,
unsigned long *out,
unsigned long nr)
{
return HC_EV_UNIMPLEMENTED;
return EV_UNIMPLEMENTED;
}
#endif
......@@ -66,7 +65,7 @@ static inline long kvm_hypercall0_1(unsigned int nr, unsigned long *r2)
unsigned long out[8];
unsigned long r;
r = kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
r = kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
*r2 = out[0];
return r;
......@@ -77,7 +76,7 @@ static inline long kvm_hypercall0(unsigned int nr)
unsigned long in[8];
unsigned long out[8];
return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
}
static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
......@@ -86,7 +85,7 @@ static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
unsigned long out[8];
in[0] = p1;
return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
}
static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
......@@ -97,7 +96,7 @@ static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
in[0] = p1;
in[1] = p2;
return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
}
static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
......@@ -109,7 +108,7 @@ static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
in[0] = p1;
in[1] = p2;
in[2] = p3;
return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
}
static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
......@@ -123,7 +122,7 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
in[1] = p2;
in[2] = p3;
in[3] = p4;
return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
}
......
......@@ -28,6 +28,7 @@
#include <linux/types.h>
#include <linux/kvm_types.h>
#include <linux/kvm_host.h>
#include <linux/bug.h>
#ifdef CONFIG_PPC_BOOK3S
#include <asm/kvm_book3s.h>
#else
......@@ -68,6 +69,8 @@ extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
extern void kvmppc_decrementer_func(unsigned long data);
extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
/* Core-specific hooks */
......@@ -104,6 +107,7 @@ extern void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq);
extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq);
extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
extern int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int op, int *advance);
......@@ -111,6 +115,7 @@ extern int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn,
ulong val);
extern int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn,
ulong *val);
extern int kvmppc_core_check_requests(struct kvm_vcpu *vcpu);
extern int kvmppc_booke_init(void);
extern void kvmppc_booke_exit(void);
......@@ -139,16 +144,28 @@ extern struct kvmppc_linear_info *kvm_alloc_hpt(void);
extern void kvm_release_hpt(struct kvmppc_linear_info *li);
extern int kvmppc_core_init_vm(struct kvm *kvm);
extern void kvmppc_core_destroy_vm(struct kvm *kvm);
extern void kvmppc_core_free_memslot(struct kvm_memory_slot *free,
struct kvm_memory_slot *dont);
extern int kvmppc_core_create_memslot(struct kvm_memory_slot *slot,
unsigned long npages);
extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_userspace_memory_region *mem);
extern void kvmppc_core_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem);
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old);
extern int kvm_vm_ioctl_get_smmu_info(struct kvm *kvm,
struct kvm_ppc_smmu_info *info);
extern void kvmppc_core_flush_memslot(struct kvm *kvm,
struct kvm_memory_slot *memslot);
extern int kvmppc_bookehv_init(void);
extern void kvmppc_bookehv_exit(void);
extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
/*
* Cuts out inst bits with ordering according to spec.
* That means the leftmost bit is zero. All given bits are included.
......@@ -182,6 +199,41 @@ static inline u32 kvmppc_set_field(u64 inst, int msb, int lsb, int value)
return r;
}
union kvmppc_one_reg {
u32 wval;
u64 dval;
vector128 vval;
u64 vsxval[2];
struct {
u64 addr;
u64 length;
} vpaval;
};
#define one_reg_size(id) \
(1ul << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
#define get_reg_val(id, reg) ({ \
union kvmppc_one_reg __u; \
switch (one_reg_size(id)) { \
case 4: __u.wval = (reg); break; \
case 8: __u.dval = (reg); break; \
default: BUG(); \
} \
__u; \
})
#define set_reg_val(id, val) ({ \
u64 __v; \
switch (one_reg_size(id)) { \
case 4: __v = (val).wval; break; \
case 8: __v = (val).dval; break; \
default: BUG(); \
} \
__v; \
})
void kvmppc_core_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
......@@ -190,6 +242,8 @@ int kvmppc_set_sregs_ivor(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg);
int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg);
int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
......@@ -230,5 +284,36 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
}
}
/* Please call after prepare_to_enter. This function puts the lazy ee state
back to normal mode, without actually enabling interrupts. */
static inline void kvmppc_lazy_ee_enable(void)
{
#ifdef CONFIG_PPC64
/* Only need to enable IRQs by hard enabling them after this */
local_paca->irq_happened = 0;
local_paca->soft_enabled = 1;
#endif
}
static inline ulong kvmppc_get_ea_indexed(struct kvm_vcpu *vcpu, int ra, int rb)
{
ulong ea;
ulong msr_64bit = 0;
ea = kvmppc_get_gpr(vcpu, rb);
if (ra)
ea += kvmppc_get_gpr(vcpu, ra);
#if defined(CONFIG_PPC_BOOK3E_64)
msr_64bit = MSR_CM;
#elif defined(CONFIG_PPC_BOOK3S_64)
msr_64bit = MSR_SF;
#endif
if (!(vcpu->arch.shared->msr & msr_64bit))
ea = (uint32_t)ea;
return ea;
}
#endif /* __POWERPC_KVM_PPC_H__ */
......@@ -59,7 +59,7 @@
#define MAS1_TSIZE_SHIFT 7
#define MAS1_TSIZE(x) (((x) << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK)
#define MAS2_EPN 0xFFFFF000
#define MAS2_EPN (~0xFFFUL)
#define MAS2_X0 0x00000040
#define MAS2_X1 0x00000020
#define MAS2_W 0x00000010
......
......@@ -121,6 +121,16 @@ extern char initial_stab[];
#define PP_RXRX 3 /* Supervisor read, User read */
#define PP_RXXX (HPTE_R_PP0 | 2) /* Supervisor read, user none */
/* Fields for tlbiel instruction in architecture 2.06 */
#define TLBIEL_INVAL_SEL_MASK 0xc00 /* invalidation selector */
#define TLBIEL_INVAL_PAGE 0x000 /* invalidate a single page */
#define TLBIEL_INVAL_SET_LPID 0x800 /* invalidate a set for current LPID */
#define TLBIEL_INVAL_SET 0xc00 /* invalidate a set for all LPIDs */
#define TLBIEL_INVAL_SET_MASK 0xfff000 /* set number to inval. */
#define TLBIEL_INVAL_SET_SHIFT 12
#define POWER7_TLB_SETS 128 /* # sets in POWER7 TLB */
#ifndef __ASSEMBLY__
struct hash_pte {
......
......@@ -518,6 +518,7 @@
#define SRR1_WS_DEEPER 0x00020000 /* Some resources not maintained */
#define SRR1_WS_DEEP 0x00010000 /* All resources maintained */
#define SRR1_PROGFPE 0x00100000 /* Floating Point Enabled */
#define SRR1_PROGILL 0x00080000 /* Illegal instruction */
#define SRR1_PROGPRIV 0x00040000 /* Privileged instruction */
#define SRR1_PROGTRAP 0x00020000 /* Trap */
#define SRR1_PROGADDR 0x00010000 /* SRR0 contains subsequent addr */
......
......@@ -539,6 +539,13 @@
#define TCR_FIE 0x00800000 /* FIT Interrupt Enable */
#define TCR_ARE 0x00400000 /* Auto Reload Enable */
#ifdef CONFIG_E500
#define TCR_GET_WP(tcr) ((((tcr) & 0xC0000000) >> 30) | \
(((tcr) & 0x1E0000) >> 15))
#else
#define TCR_GET_WP(tcr) (((tcr) & 0xC0000000) >> 30)
#endif
/* Bit definitions for the TSR. */
#define TSR_ENW 0x80000000 /* Enable Next Watchdog */
#define TSR_WIS 0x40000000 /* WDT Interrupt Status */
......
......@@ -67,6 +67,14 @@ void generic_mach_cpu_die(void);
void generic_set_cpu_dead(unsigned int cpu);
void generic_set_cpu_up(unsigned int cpu);
int generic_check_cpu_restart(unsigned int cpu);
extern void inhibit_secondary_onlining(void);
extern void uninhibit_secondary_onlining(void);
#else /* HOTPLUG_CPU */
static inline void inhibit_secondary_onlining(void) {}
static inline void uninhibit_secondary_onlining(void) {}
#endif
#ifdef CONFIG_PPC64
......
......@@ -7,6 +7,7 @@ header-y += bootx.h
header-y += byteorder.h
header-y += cputable.h
header-y += elf.h
header-y += epapr_hcalls.h
header-y += errno.h
header-y += fcntl.h
header-y += ioctl.h
......
/*
* ePAPR hcall interface
*
* Copyright 2008-2011 Freescale Semiconductor, Inc.
*
* Author: Timur Tabi <timur@freescale.com>
*
* This file is provided under a dual BSD/GPL license. When using or
* redistributing this file, you may do so under either license.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Freescale Semiconductor nor the
* names of its contributors may be used to endorse or promote products
* derived from this software without specific prior written permission.
*
*
* ALTERNATIVELY, this software may be distributed under the terms of the
* GNU General Public License ("GPL") as published by the Free Software
* Foundation, either version 2 of that License or (at your option) any
* later version.
*
* THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _UAPI_ASM_POWERPC_EPAPR_HCALLS_H
#define _UAPI_ASM_POWERPC_EPAPR_HCALLS_H
#define EV_BYTE_CHANNEL_SEND 1
#define EV_BYTE_CHANNEL_RECEIVE 2
#define EV_BYTE_CHANNEL_POLL 3
#define EV_INT_SET_CONFIG 4
#define EV_INT_GET_CONFIG 5
#define EV_INT_SET_MASK 6
#define EV_INT_GET_MASK 7
#define EV_INT_IACK 9
#define EV_INT_EOI 10
#define EV_INT_SEND_IPI 11
#define EV_INT_SET_TASK_PRIORITY 12
#define EV_INT_GET_TASK_PRIORITY 13
#define EV_DOORBELL_SEND 14
#define EV_MSGSND 15
#define EV_IDLE 16
/* vendor ID: epapr */
#define EV_LOCAL_VENDOR_ID 0 /* for private use */
#define EV_EPAPR_VENDOR_ID 1
#define EV_FSL_VENDOR_ID 2 /* Freescale Semiconductor */
#define EV_IBM_VENDOR_ID 3 /* IBM */
#define EV_GHS_VENDOR_ID 4 /* Green Hills Software */
#define EV_ENEA_VENDOR_ID 5 /* Enea */
#define EV_WR_VENDOR_ID 6 /* Wind River Systems */
#define EV_AMCC_VENDOR_ID 7 /* Applied Micro Circuits */
#define EV_KVM_VENDOR_ID 42 /* KVM */
/* The max number of bytes that a byte channel can send or receive per call */
#define EV_BYTE_CHANNEL_MAX_BYTES 16
#define _EV_HCALL_TOKEN(id, num) (((id) << 16) | (num))
#define EV_HCALL_TOKEN(hcall_num) _EV_HCALL_TOKEN(EV_EPAPR_VENDOR_ID, hcall_num)
/* epapr return codes */
#define EV_SUCCESS 0
#define EV_EPERM 1 /* Operation not permitted */
#define EV_ENOENT 2 /* Entry Not Found */
#define EV_EIO 3 /* I/O error occured */
#define EV_EAGAIN 4 /* The operation had insufficient
* resources to complete and should be
* retried
*/
#define EV_ENOMEM 5 /* There was insufficient memory to
* complete the operation */
#define EV_EFAULT 6 /* Bad guest address */
#define EV_ENODEV 7 /* No such device */
#define EV_EINVAL 8 /* An argument supplied to the hcall
was out of range or invalid */
#define EV_INTERNAL 9 /* An internal error occured */
#define EV_CONFIG 10 /* A configuration error was detected */
#define EV_INVALID_STATE 11 /* The object is in an invalid state */
#define EV_UNIMPLEMENTED 12 /* Unimplemented hypercall */
#define EV_BUFFER_OVERFLOW 13 /* Caller-supplied buffer too small */
#endif /* _UAPI_ASM_POWERPC_EPAPR_HCALLS_H */
......@@ -221,6 +221,12 @@ struct kvm_sregs {
__u32 dbsr; /* KVM_SREGS_E_UPDATE_DBSR */
__u32 dbcr[3];
/*
* iac/dac registers are 64bit wide, while this API
* interface provides only lower 32 bits on 64 bit
* processors. ONE_REG interface is added for 64bit
* iac/dac registers.
*/
__u32 iac[4];
__u32 dac[2];
__u32 dvc[2];
......@@ -325,6 +331,86 @@ struct kvm_book3e_206_tlb_params {
__u32 reserved[8];
};
/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
__u64 flags;
__u64 start_index;
__u64 reserved[2];
};
/* Values for kvm_get_htab_fd.flags */
#define KVM_GET_HTAB_BOLTED_ONLY ((__u64)0x1)
#define KVM_GET_HTAB_WRITE ((__u64)0x2)
/*
* Data read on the file descriptor is formatted as a series of
* records, each consisting of a header followed by a series of
* `n_valid' HPTEs (16 bytes each), which are all valid. Following
* those valid HPTEs there are `n_invalid' invalid HPTEs, which
* are not represented explicitly in the stream. The same format
* is used for writing.
*/
struct kvm_get_htab_header {
__u32 index;
__u16 n_valid;
__u16 n_invalid;
};
#define KVM_REG_PPC_HIOR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
#define KVM_REG_PPC_IAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2)
#define KVM_REG_PPC_IAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3)
#define KVM_REG_PPC_IAC3 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x4)
#define KVM_REG_PPC_IAC4 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x5)
#define KVM_REG_PPC_DAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x6)
#define KVM_REG_PPC_DAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x7)
#define KVM_REG_PPC_DABR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8)
#define KVM_REG_PPC_DSCR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9)
#define KVM_REG_PPC_PURR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa)
#define KVM_REG_PPC_SPURR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb)
#define KVM_REG_PPC_DAR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc)
#define KVM_REG_PPC_DSISR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xd)
#define KVM_REG_PPC_AMR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xe)
#define KVM_REG_PPC_UAMOR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xf)
#define KVM_REG_PPC_MMCR0 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x10)
#define KVM_REG_PPC_MMCR1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x11)
#define KVM_REG_PPC_MMCRA (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x12)
#define KVM_REG_PPC_PMC1 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x18)
#define KVM_REG_PPC_PMC2 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x19)
#define KVM_REG_PPC_PMC3 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x1a)
#define KVM_REG_PPC_PMC4 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x1b)
#define KVM_REG_PPC_PMC5 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x1c)
#define KVM_REG_PPC_PMC6 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x1d)
#define KVM_REG_PPC_PMC7 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x1e)
#define KVM_REG_PPC_PMC8 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x1f)
/* 32 floating-point registers */
#define KVM_REG_PPC_FPR0 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x20)
#define KVM_REG_PPC_FPR(n) (KVM_REG_PPC_FPR0 + (n))
#define KVM_REG_PPC_FPR31 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3f)
/* 32 VMX/Altivec vector registers */
#define KVM_REG_PPC_VR0 (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x40)
#define KVM_REG_PPC_VR(n) (KVM_REG_PPC_VR0 + (n))
#define KVM_REG_PPC_VR31 (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x5f)
/* 32 double-width FP registers for VSX */
/* High-order halves overlap with FP regs */
#define KVM_REG_PPC_VSR0 (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x60)
#define KVM_REG_PPC_VSR(n) (KVM_REG_PPC_VSR0 + (n))
#define KVM_REG_PPC_VSR31 (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x7f)
/* FP and vector status/control registers */
#define KVM_REG_PPC_FPSCR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x80)
#define KVM_REG_PPC_VSCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x81)
/* Virtual processor areas */
/* For SLB & DTL, address in high (first) half, length in low half */
#define KVM_REG_PPC_VPA_ADDR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x82)
#define KVM_REG_PPC_VPA_SLB (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x83)
#define KVM_REG_PPC_VPA_DTL (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x84)
#define KVM_REG_PPC_EPCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x85)
#endif /* __LINUX_KVM_POWERPC_H */
......@@ -75,9 +75,10 @@ struct kvm_vcpu_arch_shared {
};
#define KVM_SC_MAGIC_R0 0x4b564d21 /* "KVM!" */
#define HC_VENDOR_KVM (42 << 16)
#define HC_EV_SUCCESS 0
#define HC_EV_UNIMPLEMENTED 12
#define KVM_HCALL_TOKEN(num) _EV_HCALL_TOKEN(EV_KVM_VENDOR_ID, num)
#include <uapi/asm/epapr_hcalls.h>
#define KVM_FEATURE_MAGIC_PAGE 1
......
......@@ -441,8 +441,7 @@ int main(void)
DEFINE(KVM_HOST_LPCR, offsetof(struct kvm, arch.host_lpcr));
DEFINE(KVM_HOST_SDR1, offsetof(struct kvm, arch.host_sdr1));
DEFINE(KVM_TLBIE_LOCK, offsetof(struct kvm, arch.tlbie_lock));
DEFINE(KVM_ONLINE_CPUS, offsetof(struct kvm, online_vcpus.counter));
DEFINE(KVM_LAST_VCPU, offsetof(struct kvm, arch.last_vcpu));
DEFINE(KVM_NEED_FLUSH, offsetof(struct kvm, arch.need_tlb_flush.bits));
DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
DEFINE(KVM_RMOR, offsetof(struct kvm, arch.rmor));
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
......@@ -470,7 +469,6 @@ int main(void)
DEFINE(VCPU_SLB, offsetof(struct kvm_vcpu, arch.slb));
DEFINE(VCPU_SLB_MAX, offsetof(struct kvm_vcpu, arch.slb_max));
DEFINE(VCPU_SLB_NR, offsetof(struct kvm_vcpu, arch.slb_nr));
DEFINE(VCPU_LAST_CPU, offsetof(struct kvm_vcpu, arch.last_cpu));
DEFINE(VCPU_FAULT_DSISR, offsetof(struct kvm_vcpu, arch.fault_dsisr));
DEFINE(VCPU_FAULT_DAR, offsetof(struct kvm_vcpu, arch.fault_dar));
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
......
......@@ -8,13 +8,41 @@
*/
#include <linux/threads.h>
#include <asm/epapr_hcalls.h>
#include <asm/reg.h>
#include <asm/page.h>
#include <asm/cputable.h>
#include <asm/thread_info.h>
#include <asm/ppc_asm.h>
#include <asm/asm-compat.h>
#include <asm/asm-offsets.h>
/* epapr_ev_idle() was derived from e500_idle() */
_GLOBAL(epapr_ev_idle)
CURRENT_THREAD_INFO(r3, r1)
PPC_LL r4, TI_LOCAL_FLAGS(r3) /* set napping bit */
ori r4, r4,_TLF_NAPPING /* so when we take an exception */
PPC_STL r4, TI_LOCAL_FLAGS(r3) /* it will return to our caller */
wrteei 1
idle_loop:
LOAD_REG_IMMEDIATE(r11, EV_HCALL_TOKEN(EV_IDLE))
.global epapr_ev_idle_start
epapr_ev_idle_start:
li r3, -1
nop
nop
nop
/*
* Guard against spurious wakeups from a hypervisor --
* only interrupt will cause us to return to LR due to
* _TLF_NAPPING.
*/
b idle_loop
/* Hypercall entry point. Will be patched with device tree instructions. */
.global epapr_hypercall_start
epapr_hypercall_start:
......
......@@ -21,6 +21,10 @@
#include <asm/epapr_hcalls.h>
#include <asm/cacheflush.h>
#include <asm/code-patching.h>
#include <asm/machdep.h>
extern void epapr_ev_idle(void);
extern u32 epapr_ev_idle_start[];
bool epapr_paravirt_enabled;
......@@ -41,8 +45,13 @@ static int __init epapr_paravirt_init(void)
if (len % 4 || len > (4 * 4))
return -ENODEV;
for (i = 0; i < (len / 4); i++)
for (i = 0; i < (len / 4); i++) {
patch_instruction(epapr_hypercall_start + i, insts[i]);
patch_instruction(epapr_ev_idle_start + i, insts[i]);
}
if (of_get_property(hyper_node, "has-idle", NULL))
ppc_md.power_save = epapr_ev_idle;
epapr_paravirt_enabled = true;
......
......@@ -419,7 +419,7 @@ static void kvm_map_magic_page(void *data)
in[0] = KVM_MAGIC_PAGE;
in[1] = KVM_MAGIC_PAGE;
kvm_hypercall(in, out, HC_VENDOR_KVM | KVM_HC_PPC_MAP_MAGIC_PAGE);
kvm_hypercall(in, out, KVM_HCALL_TOKEN(KVM_HC_PPC_MAP_MAGIC_PAGE));
*features = out[0];
}
......
......@@ -43,6 +43,7 @@
#include <asm/dcr.h>
#include <asm/ftrace.h>
#include <asm/switch_to.h>
#include <asm/epapr_hcalls.h>
#ifdef CONFIG_PPC32
extern void transfer_to_handler(void);
......@@ -191,3 +192,7 @@ EXPORT_SYMBOL(__arch_hweight64);
#ifdef CONFIG_PPC_BOOK3S_64
EXPORT_SYMBOL_GPL(mmu_psize_defs);
#endif
#ifdef CONFIG_EPAPR_PARAVIRT
EXPORT_SYMBOL(epapr_hypercall_start);
#endif
......@@ -427,6 +427,45 @@ int generic_check_cpu_restart(unsigned int cpu)
{
return per_cpu(cpu_state, cpu) == CPU_UP_PREPARE;
}
static atomic_t secondary_inhibit_count;
/*
* Don't allow secondary CPU threads to come online
*/
void inhibit_secondary_onlining(void)
{
/*
* This makes secondary_inhibit_count stable during cpu
* online/offline operations.
*/
get_online_cpus();
atomic_inc(&secondary_inhibit_count);
put_online_cpus();
}
EXPORT_SYMBOL_GPL(inhibit_secondary_onlining);
/*
* Allow secondary CPU threads to come online again
*/
void uninhibit_secondary_onlining(void)
{
get_online_cpus();
atomic_dec(&secondary_inhibit_count);
put_online_cpus();
}
EXPORT_SYMBOL_GPL(uninhibit_secondary_onlining);
static int secondaries_inhibited(void)
{
return atomic_read(&secondary_inhibit_count);
}
#else /* HOTPLUG_CPU */
#define secondaries_inhibited() 0
#endif
static void cpu_idle_thread_init(unsigned int cpu, struct task_struct *idle)
......@@ -445,6 +484,13 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
int rc, c;
/*
* Don't allow secondary threads to come online if inhibited
*/
if (threads_per_core > 1 && secondaries_inhibited() &&
cpu % threads_per_core != 0)
return -EBUSY;
if (smp_ops == NULL ||
(smp_ops->cpu_bootable && !smp_ops->cpu_bootable(cpu)))
return -EINVAL;
......
......@@ -83,6 +83,7 @@ int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu)
vcpu_44x->shadow_refs[i].gtlb_index = -1;
vcpu->arch.cpu_type = KVM_CPU_440;
vcpu->arch.pvr = mfspr(SPRN_PVR);
return 0;
}
......
......@@ -27,12 +27,70 @@
#include "booke.h"
#include "44x_tlb.h"
#define XOP_MFDCRX 259
#define XOP_MFDCR 323
#define XOP_MTDCRX 387
#define XOP_MTDCR 451
#define XOP_TLBSX 914
#define XOP_ICCCI 966
#define XOP_TLBWE 978
static int emulate_mtdcr(struct kvm_vcpu *vcpu, int rs, int dcrn)
{
/* emulate some access in kernel */
switch (dcrn) {
case DCRN_CPR0_CONFIG_ADDR:
vcpu->arch.cpr0_cfgaddr = kvmppc_get_gpr(vcpu, rs);
return EMULATE_DONE;
default:
vcpu->run->dcr.dcrn = dcrn;
vcpu->run->dcr.data = kvmppc_get_gpr(vcpu, rs);
vcpu->run->dcr.is_write = 1;
vcpu->arch.dcr_is_write = 1;
vcpu->arch.dcr_needed = 1;
kvmppc_account_exit(vcpu, DCR_EXITS);
return EMULATE_DO_DCR;
}
}
static int emulate_mfdcr(struct kvm_vcpu *vcpu, int rt, int dcrn)
{
/* The guest may access CPR0 registers to determine the timebase
* frequency, and it must know the real host frequency because it
* can directly access the timebase registers.
*
* It would be possible to emulate those accesses in userspace,
* but userspace can really only figure out the end frequency.
* We could decompose that into the factors that compute it, but
* that's tricky math, and it's easier to just report the real
* CPR0 values.
*/
switch (dcrn) {
case DCRN_CPR0_CONFIG_ADDR:
kvmppc_set_gpr(vcpu, rt, vcpu->arch.cpr0_cfgaddr);
break;
case DCRN_CPR0_CONFIG_DATA:
local_irq_disable();
mtdcr(DCRN_CPR0_CONFIG_ADDR,
vcpu->arch.cpr0_cfgaddr);
kvmppc_set_gpr(vcpu, rt,
mfdcr(DCRN_CPR0_CONFIG_DATA));
local_irq_enable();
break;
default:
vcpu->run->dcr.dcrn = dcrn;
vcpu->run->dcr.data = 0;
vcpu->run->dcr.is_write = 0;
vcpu->arch.dcr_is_write = 0;
vcpu->arch.io_gpr = rt;
vcpu->arch.dcr_needed = 1;
kvmppc_account_exit(vcpu, DCR_EXITS);
return EMULATE_DO_DCR;
}
return EMULATE_DONE;
}
int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int inst, int *advance)
{
......@@ -50,55 +108,21 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
switch (get_xop(inst)) {
case XOP_MFDCR:
/* The guest may access CPR0 registers to determine the timebase
* frequency, and it must know the real host frequency because it
* can directly access the timebase registers.
*
* It would be possible to emulate those accesses in userspace,
* but userspace can really only figure out the end frequency.
* We could decompose that into the factors that compute it, but
* that's tricky math, and it's easier to just report the real
* CPR0 values.
*/
switch (dcrn) {
case DCRN_CPR0_CONFIG_ADDR:
kvmppc_set_gpr(vcpu, rt, vcpu->arch.cpr0_cfgaddr);
break;
case DCRN_CPR0_CONFIG_DATA:
local_irq_disable();
mtdcr(DCRN_CPR0_CONFIG_ADDR,
vcpu->arch.cpr0_cfgaddr);
kvmppc_set_gpr(vcpu, rt,
mfdcr(DCRN_CPR0_CONFIG_DATA));
local_irq_enable();
break;
default:
run->dcr.dcrn = dcrn;
run->dcr.data = 0;
run->dcr.is_write = 0;
vcpu->arch.io_gpr = rt;
vcpu->arch.dcr_needed = 1;
kvmppc_account_exit(vcpu, DCR_EXITS);
emulated = EMULATE_DO_DCR;
}
emulated = emulate_mfdcr(vcpu, rt, dcrn);
break;
case XOP_MFDCRX:
emulated = emulate_mfdcr(vcpu, rt,
kvmppc_get_gpr(vcpu, ra));
break;
case XOP_MTDCR:
/* emulate some access in kernel */
switch (dcrn) {
case DCRN_CPR0_CONFIG_ADDR:
vcpu->arch.cpr0_cfgaddr = kvmppc_get_gpr(vcpu, rs);
break;
default:
run->dcr.dcrn = dcrn;
run->dcr.data = kvmppc_get_gpr(vcpu, rs);
run->dcr.is_write = 1;
vcpu->arch.dcr_needed = 1;
kvmppc_account_exit(vcpu, DCR_EXITS);
emulated = EMULATE_DO_DCR;
}
emulated = emulate_mtdcr(vcpu, rs, dcrn);
break;
case XOP_MTDCRX:
emulated = emulate_mtdcr(vcpu, rs,
kvmppc_get_gpr(vcpu, ra));
break;
case XOP_TLBWE:
......
......@@ -20,6 +20,7 @@ config KVM
bool
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_EVENTFD
config KVM_BOOK3S_HANDLER
bool
......@@ -36,6 +37,7 @@ config KVM_BOOK3S_64_HANDLER
config KVM_BOOK3S_PR
bool
select KVM_MMIO
select MMU_NOTIFIER
config KVM_BOOK3S_32
tristate "KVM support for PowerPC book3s_32 processors"
......@@ -123,6 +125,7 @@ config KVM_E500V2
depends on EXPERIMENTAL && E500 && !PPC_E500MC
select KVM
select KVM_MMIO
select MMU_NOTIFIER
---help---
Support running unmodified E500 guest kernels in virtual machines on
E500v2 host processors.
......@@ -138,6 +141,7 @@ config KVM_E500MC
select KVM
select KVM_MMIO
select KVM_BOOKE_HV
select MMU_NOTIFIER
---help---
Support running unmodified E500MC/E5500 (32-bit) guest kernels in
virtual machines on E500MC/E5500 host processors.
......
......@@ -6,7 +6,8 @@ subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
ccflags-y := -Ivirt/kvm -Iarch/powerpc/kvm
common-objs-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
common-objs-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o \
eventfd.o)
CFLAGS_44x_tlb.o := -I.
CFLAGS_e500_tlb.o := -I.
......@@ -72,10 +73,12 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
book3s_hv_rmhandlers.o \
book3s_hv_rm_mmu.o \
book3s_64_vio_hv.o \
book3s_hv_ras.o \
book3s_hv_builtin.o
kvm-book3s_64-module-objs := \
../../../virt/kvm/kvm_main.o \
../../../virt/kvm/eventfd.o \
powerpc.o \
emulate.o \
book3s.o \
......
......@@ -411,6 +411,15 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
return 0;
}
int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu)
{
return 0;
}
void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu)
{
}
int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
{
int i;
......@@ -476,6 +485,122 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
return -ENOTSUPP;
}
int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
{
int r;
union kvmppc_one_reg val;
int size;
long int i;
size = one_reg_size(reg->id);
if (size > sizeof(val))
return -EINVAL;
r = kvmppc_get_one_reg(vcpu, reg->id, &val);
if (r == -EINVAL) {
r = 0;
switch (reg->id) {
case KVM_REG_PPC_DAR:
val = get_reg_val(reg->id, vcpu->arch.shared->dar);
break;
case KVM_REG_PPC_DSISR:
val = get_reg_val(reg->id, vcpu->arch.shared->dsisr);
break;
case KVM_REG_PPC_FPR0 ... KVM_REG_PPC_FPR31:
i = reg->id - KVM_REG_PPC_FPR0;
val = get_reg_val(reg->id, vcpu->arch.fpr[i]);
break;
case KVM_REG_PPC_FPSCR:
val = get_reg_val(reg->id, vcpu->arch.fpscr);
break;
#ifdef CONFIG_ALTIVEC
case KVM_REG_PPC_VR0 ... KVM_REG_PPC_VR31:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
val.vval = vcpu->arch.vr[reg->id - KVM_REG_PPC_VR0];
break;
case KVM_REG_PPC_VSCR:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
val = get_reg_val(reg->id, vcpu->arch.vscr.u[3]);
break;
#endif /* CONFIG_ALTIVEC */
default:
r = -EINVAL;
break;
}
}
if (r)
return r;
if (copy_to_user((char __user *)(unsigned long)reg->addr, &val, size))
r = -EFAULT;
return r;
}
int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
{
int r;
union kvmppc_one_reg val;
int size;
long int i;
size = one_reg_size(reg->id);
if (size > sizeof(val))
return -EINVAL;
if (copy_from_user(&val, (char __user *)(unsigned long)reg->addr, size))
return -EFAULT;
r = kvmppc_set_one_reg(vcpu, reg->id, &val);
if (r == -EINVAL) {
r = 0;
switch (reg->id) {
case KVM_REG_PPC_DAR:
vcpu->arch.shared->dar = set_reg_val(reg->id, val);
break;
case KVM_REG_PPC_DSISR:
vcpu->arch.shared->dsisr = set_reg_val(reg->id, val);
break;
case KVM_REG_PPC_FPR0 ... KVM_REG_PPC_FPR31:
i = reg->id - KVM_REG_PPC_FPR0;
vcpu->arch.fpr[i] = set_reg_val(reg->id, val);
break;
case KVM_REG_PPC_FPSCR:
vcpu->arch.fpscr = set_reg_val(reg->id, val);
break;
#ifdef CONFIG_ALTIVEC
case KVM_REG_PPC_VR0 ... KVM_REG_PPC_VR31:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
vcpu->arch.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
break;
case KVM_REG_PPC_VSCR:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
vcpu->arch.vscr.u[3] = set_reg_val(reg->id, val);
break;
#endif /* CONFIG_ALTIVEC */
default:
r = -EINVAL;
break;
}
}
return r;
}
int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
struct kvm_translation *tr)
{
......
......@@ -155,7 +155,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
/* Get host physical address for gpa */
hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
if (is_error_pfn(hpaddr)) {
if (is_error_noslot_pfn(hpaddr)) {
printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
orig_pte->eaddr);
r = -EINVAL;
......@@ -254,6 +254,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
kvmppc_mmu_hpte_cache_map(vcpu, pte);
kvm_release_pfn_clean(hpaddr >> PAGE_SHIFT);
out:
return r;
}
......
......@@ -93,7 +93,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
/* Get host physical address for gpa */
hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
if (is_error_pfn(hpaddr)) {
if (is_error_noslot_pfn(hpaddr)) {
printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n", orig_pte->eaddr);
r = -EINVAL;
goto out;
......@@ -171,6 +171,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
kvmppc_mmu_hpte_cache_map(vcpu, pte);
}
kvm_release_pfn_clean(hpaddr >> PAGE_SHIFT);
out:
return r;
......
......@@ -24,6 +24,9 @@
#include <linux/slab.h>
#include <linux/hugetlb.h>
#include <linux/vmalloc.h>
#include <linux/srcu.h>
#include <linux/anon_inodes.h>
#include <linux/file.h>
#include <asm/tlbflush.h>
#include <asm/kvm_ppc.h>
......@@ -40,6 +43,11 @@
/* Power architecture requires HPT is at least 256kB */
#define PPC_MIN_HPT_ORDER 18
static long kvmppc_virtmode_do_h_enter(struct kvm *kvm, unsigned long flags,
long pte_index, unsigned long pteh,
unsigned long ptel, unsigned long *pte_idx_ret);
static void kvmppc_rmap_reset(struct kvm *kvm);
long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
{
unsigned long hpt;
......@@ -137,10 +145,11 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, u32 *htab_orderp)
/* Set the entire HPT to 0, i.e. invalid HPTEs */
memset((void *)kvm->arch.hpt_virt, 0, 1ul << order);
/*
* Set the whole last_vcpu array to an invalid vcpu number.
* This ensures that each vcpu will flush its TLB on next entry.
* Reset all the reverse-mapping chains for all memslots
*/
memset(kvm->arch.last_vcpu, 0xff, sizeof(kvm->arch.last_vcpu));
kvmppc_rmap_reset(kvm);
/* Ensure that each vcpu will flush its TLB on next entry. */
cpumask_setall(&kvm->arch.need_tlb_flush);
*htab_orderp = order;
err = 0;
} else {
......@@ -184,6 +193,7 @@ void kvmppc_map_vrma(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
unsigned long addr, hash;
unsigned long psize;
unsigned long hp0, hp1;
unsigned long idx_ret;
long ret;
struct kvm *kvm = vcpu->kvm;
......@@ -215,7 +225,8 @@ void kvmppc_map_vrma(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
hash = (hash << 3) + 7;
hp_v = hp0 | ((addr >> 16) & ~0x7fUL);
hp_r = hp1 | addr;
ret = kvmppc_virtmode_h_enter(vcpu, H_EXACT, hash, hp_v, hp_r);
ret = kvmppc_virtmode_do_h_enter(kvm, H_EXACT, hash, hp_v, hp_r,
&idx_ret);
if (ret != H_SUCCESS) {
pr_err("KVM: map_vrma at %lx failed, ret=%ld\n",
addr, ret);
......@@ -260,7 +271,7 @@ static void kvmppc_mmu_book3s_64_hv_reset_msr(struct kvm_vcpu *vcpu)
/*
* This is called to get a reference to a guest page if there isn't
* one already in the kvm->arch.slot_phys[][] arrays.
* one already in the memslot->arch.slot_phys[] array.
*/
static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn,
struct kvm_memory_slot *memslot,
......@@ -275,7 +286,7 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn,
struct vm_area_struct *vma;
unsigned long pfn, i, npages;
physp = kvm->arch.slot_phys[memslot->id];
physp = memslot->arch.slot_phys;
if (!physp)
return -EINVAL;
if (physp[gfn - memslot->base_gfn])
......@@ -353,15 +364,10 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn,
return err;
}
/*
* We come here on a H_ENTER call from the guest when we are not
* using mmu notifiers and we don't have the requested page pinned
* already.
*/
long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel)
long kvmppc_virtmode_do_h_enter(struct kvm *kvm, unsigned long flags,
long pte_index, unsigned long pteh,
unsigned long ptel, unsigned long *pte_idx_ret)
{
struct kvm *kvm = vcpu->kvm;
unsigned long psize, gpa, gfn;
struct kvm_memory_slot *memslot;
long ret;
......@@ -389,8 +395,8 @@ long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
do_insert:
/* Protect linux PTE lookup from page table destruction */
rcu_read_lock_sched(); /* this disables preemption too */
vcpu->arch.pgdir = current->mm->pgd;
ret = kvmppc_h_enter(vcpu, flags, pte_index, pteh, ptel);
ret = kvmppc_do_h_enter(kvm, flags, pte_index, pteh, ptel,
current->mm->pgd, false, pte_idx_ret);
rcu_read_unlock_sched();
if (ret == H_TOO_HARD) {
/* this can't happen */
......@@ -401,6 +407,19 @@ long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
}
/*
* We come here on a H_ENTER call from the guest when we are not
* using mmu notifiers and we don't have the requested page pinned
* already.
*/
long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh,
unsigned long ptel)
{
return kvmppc_virtmode_do_h_enter(vcpu->kvm, flags, pte_index,
pteh, ptel, &vcpu->arch.gpr[4]);
}
static struct kvmppc_slb *kvmppc_mmu_book3s_hv_find_slbe(struct kvm_vcpu *vcpu,
gva_t eaddr)
{
......@@ -570,7 +589,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
struct kvm *kvm = vcpu->kvm;
unsigned long *hptep, hpte[3], r;
unsigned long mmu_seq, psize, pte_size;
unsigned long gfn, hva, pfn;
unsigned long gpa, gfn, hva, pfn;
struct kvm_memory_slot *memslot;
unsigned long *rmap;
struct revmap_entry *rev;
......@@ -608,15 +627,14 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
/* Translate the logical address and get the page */
psize = hpte_page_size(hpte[0], r);
gfn = hpte_rpn(r, psize);
gpa = (r & HPTE_R_RPN & ~(psize - 1)) | (ea & (psize - 1));
gfn = gpa >> PAGE_SHIFT;
memslot = gfn_to_memslot(kvm, gfn);
/* No memslot means it's an emulated MMIO region */
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID)) {
unsigned long gpa = (gfn << PAGE_SHIFT) | (ea & (psize - 1));
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID))
return kvmppc_hv_emulate_mmio(run, vcpu, gpa, ea,
dsisr & DSISR_ISSTORE);
}
if (!kvm->arch.using_mmu_notifiers)
return -EFAULT; /* should never get here */
......@@ -710,7 +728,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
/* Check if we might have been invalidated; let the guest retry if so */
ret = RESUME_GUEST;
if (mmu_notifier_retry(vcpu, mmu_seq)) {
if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) {
unlock_rmap(rmap);
goto out_unlock;
}
......@@ -756,6 +774,25 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
goto out_put;
}
static void kvmppc_rmap_reset(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int srcu_idx;
srcu_idx = srcu_read_lock(&kvm->srcu);
slots = kvm->memslots;
kvm_for_each_memslot(memslot, slots) {
/*
* This assumes it is acceptable to lose reference and
* change bits across a reset.
*/
memset(memslot->arch.rmap, 0,
memslot->npages * sizeof(*memslot->arch.rmap));
}
srcu_read_unlock(&kvm->srcu, srcu_idx);
}
static int kvm_handle_hva_range(struct kvm *kvm,
unsigned long start,
unsigned long end,
......@@ -850,7 +887,8 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
psize = hpte_page_size(hptep[0], ptel);
if ((hptep[0] & HPTE_V_VALID) &&
hpte_rpn(ptel, psize) == gfn) {
hptep[0] |= HPTE_V_ABSENT;
if (kvm->arch.using_mmu_notifiers)
hptep[0] |= HPTE_V_ABSENT;
kvmppc_invalidate_hpte(kvm, hptep, i);
/* Harvest R and C */
rcbits = hptep[1] & (HPTE_R_R | HPTE_R_C);
......@@ -877,6 +915,28 @@ int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
return 0;
}
void kvmppc_core_flush_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
{
unsigned long *rmapp;
unsigned long gfn;
unsigned long n;
rmapp = memslot->arch.rmap;
gfn = memslot->base_gfn;
for (n = memslot->npages; n; --n) {
/*
* Testing the present bit without locking is OK because
* the memslot has been marked invalid already, and hence
* no new HPTEs referencing this page can be created,
* thus the present bit can't go from 0 to 1.
*/
if (*rmapp & KVMPPC_RMAP_PRESENT)
kvm_unmap_rmapp(kvm, rmapp, gfn);
++rmapp;
++gfn;
}
}
static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
unsigned long gfn)
{
......@@ -1030,16 +1090,16 @@ static int kvm_test_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
return ret;
}
long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot,
unsigned long *map)
{
unsigned long i;
unsigned long *rmapp, *map;
unsigned long *rmapp;
preempt_disable();
rmapp = memslot->arch.rmap;
map = memslot->dirty_bitmap;
for (i = 0; i < memslot->npages; ++i) {
if (kvm_test_clear_dirty(kvm, rmapp))
if (kvm_test_clear_dirty(kvm, rmapp) && map)
__set_bit_le(i, map);
++rmapp;
}
......@@ -1057,20 +1117,22 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
unsigned long hva, psize, offset;
unsigned long pa;
unsigned long *physp;
int srcu_idx;
srcu_idx = srcu_read_lock(&kvm->srcu);
memslot = gfn_to_memslot(kvm, gfn);
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID))
return NULL;
goto err;
if (!kvm->arch.using_mmu_notifiers) {
physp = kvm->arch.slot_phys[memslot->id];
physp = memslot->arch.slot_phys;
if (!physp)
return NULL;
goto err;
physp += gfn - memslot->base_gfn;
pa = *physp;
if (!pa) {
if (kvmppc_get_guest_page(kvm, gfn, memslot,
PAGE_SIZE) < 0)
return NULL;
goto err;
pa = *physp;
}
page = pfn_to_page(pa >> PAGE_SHIFT);
......@@ -1079,9 +1141,11 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
hva = gfn_to_hva_memslot(memslot, gfn);
npages = get_user_pages_fast(hva, 1, 1, pages);
if (npages < 1)
return NULL;
goto err;
page = pages[0];
}
srcu_read_unlock(&kvm->srcu, srcu_idx);
psize = PAGE_SIZE;
if (PageHuge(page)) {
page = compound_head(page);
......@@ -1091,6 +1155,10 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
if (nb_ret)
*nb_ret = psize - offset;
return page_address(page) + offset;
err:
srcu_read_unlock(&kvm->srcu, srcu_idx);
return NULL;
}
void kvmppc_unpin_guest_page(struct kvm *kvm, void *va)
......@@ -1100,6 +1168,348 @@ void kvmppc_unpin_guest_page(struct kvm *kvm, void *va)
put_page(page);
}
/*
* Functions for reading and writing the hash table via reads and
* writes on a file descriptor.
*
* Reads return the guest view of the hash table, which has to be
* pieced together from the real hash table and the guest_rpte
* values in the revmap array.
*
* On writes, each HPTE written is considered in turn, and if it
* is valid, it is written to the HPT as if an H_ENTER with the
* exact flag set was done. When the invalid count is non-zero
* in the header written to the stream, the kernel will make
* sure that that many HPTEs are invalid, and invalidate them
* if not.
*/
struct kvm_htab_ctx {
unsigned long index;
unsigned long flags;
struct kvm *kvm;
int first_pass;
};
#define HPTE_SIZE (2 * sizeof(unsigned long))
static long record_hpte(unsigned long flags, unsigned long *hptp,
unsigned long *hpte, struct revmap_entry *revp,
int want_valid, int first_pass)
{
unsigned long v, r;
int ok = 1;
int valid, dirty;
/* Unmodified entries are uninteresting except on the first pass */
dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
if (!first_pass && !dirty)
return 0;
valid = 0;
if (hptp[0] & (HPTE_V_VALID | HPTE_V_ABSENT)) {
valid = 1;
if ((flags & KVM_GET_HTAB_BOLTED_ONLY) &&
!(hptp[0] & HPTE_V_BOLTED))
valid = 0;
}
if (valid != want_valid)
return 0;
v = r = 0;
if (valid || dirty) {
/* lock the HPTE so it's stable and read it */
preempt_disable();
while (!try_lock_hpte(hptp, HPTE_V_HVLOCK))
cpu_relax();
v = hptp[0];
if (v & HPTE_V_ABSENT) {
v &= ~HPTE_V_ABSENT;
v |= HPTE_V_VALID;
}
/* re-evaluate valid and dirty from synchronized HPTE value */
valid = !!(v & HPTE_V_VALID);
if ((flags & KVM_GET_HTAB_BOLTED_ONLY) && !(v & HPTE_V_BOLTED))
valid = 0;
r = revp->guest_rpte | (hptp[1] & (HPTE_R_R | HPTE_R_C));
dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
/* only clear modified if this is the right sort of entry */
if (valid == want_valid && dirty) {
r &= ~HPTE_GR_MODIFIED;
revp->guest_rpte = r;
}
asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
hptp[0] &= ~HPTE_V_HVLOCK;
preempt_enable();
if (!(valid == want_valid && (first_pass || dirty)))
ok = 0;
}
hpte[0] = v;
hpte[1] = r;
return ok;
}
static ssize_t kvm_htab_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
struct kvm_htab_ctx *ctx = file->private_data;
struct kvm *kvm = ctx->kvm;
struct kvm_get_htab_header hdr;
unsigned long *hptp;
struct revmap_entry *revp;
unsigned long i, nb, nw;
unsigned long __user *lbuf;
struct kvm_get_htab_header __user *hptr;
unsigned long flags;
int first_pass;
unsigned long hpte[2];
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
first_pass = ctx->first_pass;
flags = ctx->flags;
i = ctx->index;
hptp = (unsigned long *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
revp = kvm->arch.revmap + i;
lbuf = (unsigned long __user *)buf;
nb = 0;
while (nb + sizeof(hdr) + HPTE_SIZE < count) {
/* Initialize header */
hptr = (struct kvm_get_htab_header __user *)buf;
hdr.n_valid = 0;
hdr.n_invalid = 0;
nw = nb;
nb += sizeof(hdr);
lbuf = (unsigned long __user *)(buf + sizeof(hdr));
/* Skip uninteresting entries, i.e. clean on not-first pass */
if (!first_pass) {
while (i < kvm->arch.hpt_npte &&
!(revp->guest_rpte & HPTE_GR_MODIFIED)) {
++i;
hptp += 2;
++revp;
}
}
hdr.index = i;
/* Grab a series of valid entries */
while (i < kvm->arch.hpt_npte &&
hdr.n_valid < 0xffff &&
nb + HPTE_SIZE < count &&
record_hpte(flags, hptp, hpte, revp, 1, first_pass)) {
/* valid entry, write it out */
++hdr.n_valid;
if (__put_user(hpte[0], lbuf) ||
__put_user(hpte[1], lbuf + 1))
return -EFAULT;
nb += HPTE_SIZE;
lbuf += 2;
++i;
hptp += 2;
++revp;
}
/* Now skip invalid entries while we can */
while (i < kvm->arch.hpt_npte &&
hdr.n_invalid < 0xffff &&
record_hpte(flags, hptp, hpte, revp, 0, first_pass)) {
/* found an invalid entry */
++hdr.n_invalid;
++i;
hptp += 2;
++revp;
}
if (hdr.n_valid || hdr.n_invalid) {
/* write back the header */
if (__copy_to_user(hptr, &hdr, sizeof(hdr)))
return -EFAULT;
nw = nb;
buf = (char __user *)lbuf;
} else {
nb = nw;
}
/* Check if we've wrapped around the hash table */
if (i >= kvm->arch.hpt_npte) {
i = 0;
ctx->first_pass = 0;
break;
}
}
ctx->index = i;
return nb;
}
static ssize_t kvm_htab_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
struct kvm_htab_ctx *ctx = file->private_data;
struct kvm *kvm = ctx->kvm;
struct kvm_get_htab_header hdr;
unsigned long i, j;
unsigned long v, r;
unsigned long __user *lbuf;
unsigned long *hptp;
unsigned long tmp[2];
ssize_t nb;
long int err, ret;
int rma_setup;
if (!access_ok(VERIFY_READ, buf, count))
return -EFAULT;
/* lock out vcpus from running while we're doing this */
mutex_lock(&kvm->lock);
rma_setup = kvm->arch.rma_setup_done;
if (rma_setup) {
kvm->arch.rma_setup_done = 0; /* temporarily */
/* order rma_setup_done vs. vcpus_running */
smp_mb();
if (atomic_read(&kvm->arch.vcpus_running)) {
kvm->arch.rma_setup_done = 1;
mutex_unlock(&kvm->lock);
return -EBUSY;
}
}
err = 0;
for (nb = 0; nb + sizeof(hdr) <= count; ) {
err = -EFAULT;
if (__copy_from_user(&hdr, buf, sizeof(hdr)))
break;
err = 0;
if (nb + hdr.n_valid * HPTE_SIZE > count)
break;
nb += sizeof(hdr);
buf += sizeof(hdr);
err = -EINVAL;
i = hdr.index;
if (i >= kvm->arch.hpt_npte ||
i + hdr.n_valid + hdr.n_invalid > kvm->arch.hpt_npte)
break;
hptp = (unsigned long *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
lbuf = (unsigned long __user *)buf;
for (j = 0; j < hdr.n_valid; ++j) {
err = -EFAULT;
if (__get_user(v, lbuf) || __get_user(r, lbuf + 1))
goto out;
err = -EINVAL;
if (!(v & HPTE_V_VALID))
goto out;
lbuf += 2;
nb += HPTE_SIZE;
if (hptp[0] & (HPTE_V_VALID | HPTE_V_ABSENT))
kvmppc_do_h_remove(kvm, 0, i, 0, tmp);
err = -EIO;
ret = kvmppc_virtmode_do_h_enter(kvm, H_EXACT, i, v, r,
tmp);
if (ret != H_SUCCESS) {
pr_err("kvm_htab_write ret %ld i=%ld v=%lx "
"r=%lx\n", ret, i, v, r);
goto out;
}
if (!rma_setup && is_vrma_hpte(v)) {
unsigned long psize = hpte_page_size(v, r);
unsigned long senc = slb_pgsize_encoding(psize);
unsigned long lpcr;
kvm->arch.vrma_slb_v = senc | SLB_VSID_B_1T |
(VRMA_VSID << SLB_VSID_SHIFT_1T);
lpcr = kvm->arch.lpcr & ~LPCR_VRMASD;
lpcr |= senc << (LPCR_VRMASD_SH - 4);
kvm->arch.lpcr = lpcr;
rma_setup = 1;
}
++i;
hptp += 2;
}
for (j = 0; j < hdr.n_invalid; ++j) {
if (hptp[0] & (HPTE_V_VALID | HPTE_V_ABSENT))
kvmppc_do_h_remove(kvm, 0, i, 0, tmp);
++i;
hptp += 2;
}
err = 0;
}
out:
/* Order HPTE updates vs. rma_setup_done */
smp_wmb();
kvm->arch.rma_setup_done = rma_setup;
mutex_unlock(&kvm->lock);
if (err)
return err;
return nb;
}
static int kvm_htab_release(struct inode *inode, struct file *filp)
{
struct kvm_htab_ctx *ctx = filp->private_data;
filp->private_data = NULL;
if (!(ctx->flags & KVM_GET_HTAB_WRITE))
atomic_dec(&ctx->kvm->arch.hpte_mod_interest);
kvm_put_kvm(ctx->kvm);
kfree(ctx);
return 0;
}
static struct file_operations kvm_htab_fops = {
.read = kvm_htab_read,
.write = kvm_htab_write,
.llseek = default_llseek,
.release = kvm_htab_release,
};
int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf)
{
int ret;
struct kvm_htab_ctx *ctx;
int rwflag;
/* reject flags we don't recognize */
if (ghf->flags & ~(KVM_GET_HTAB_BOLTED_ONLY | KVM_GET_HTAB_WRITE))
return -EINVAL;
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx)
return -ENOMEM;
kvm_get_kvm(kvm);
ctx->kvm = kvm;
ctx->index = ghf->start_index;
ctx->flags = ghf->flags;
ctx->first_pass = 1;
rwflag = (ghf->flags & KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag);
if (ret < 0) {
kvm_put_kvm(kvm);
return ret;
}
if (rwflag == O_RDONLY) {
mutex_lock(&kvm->slots_lock);
atomic_inc(&kvm->arch.hpte_mod_interest);
/* make sure kvmppc_do_h_enter etc. see the increment */
synchronize_srcu_expedited(&kvm->srcu);
mutex_unlock(&kvm->slots_lock);
}
return ret;
}
void kvmppc_mmu_book3s_hv_init(struct kvm_vcpu *vcpu)
{
struct kvmppc_mmu *mmu = &vcpu->arch.mmu;
......
......@@ -22,6 +22,7 @@
#include <asm/kvm_book3s.h>
#include <asm/reg.h>
#include <asm/switch_to.h>
#include <asm/time.h>
#define OP_19_XOP_RFID 18
#define OP_19_XOP_RFI 50
......@@ -395,6 +396,12 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
(mfmsr() & MSR_HV))
vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
break;
case SPRN_PURR:
to_book3s(vcpu)->purr_offset = spr_val - get_tb();
break;
case SPRN_SPURR:
to_book3s(vcpu)->spurr_offset = spr_val - get_tb();
break;
case SPRN_GQR0:
case SPRN_GQR1:
case SPRN_GQR2:
......@@ -412,6 +419,7 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
case SPRN_CTRLF:
case SPRN_CTRLT:
case SPRN_L2CR:
case SPRN_DSCR:
case SPRN_MMCR0_GEKKO:
case SPRN_MMCR1_GEKKO:
case SPRN_PMC1_GEKKO:
......@@ -483,9 +491,15 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
*spr_val = to_book3s(vcpu)->hid[5];
break;
case SPRN_CFAR:
case SPRN_PURR:
case SPRN_DSCR:
*spr_val = 0;
break;
case SPRN_PURR:
*spr_val = get_tb() + to_book3s(vcpu)->purr_offset;
break;
case SPRN_SPURR:
*spr_val = get_tb() + to_book3s(vcpu)->purr_offset;
break;
case SPRN_GQR0:
case SPRN_GQR1:
case SPRN_GQR2:
......
......@@ -28,8 +28,5 @@ EXPORT_SYMBOL_GPL(kvmppc_load_up_fpu);
#ifdef CONFIG_ALTIVEC
EXPORT_SYMBOL_GPL(kvmppc_load_up_altivec);
#endif
#ifdef CONFIG_VSX
EXPORT_SYMBOL_GPL(kvmppc_load_up_vsx);
#endif
#endif
此差异已折叠。
......@@ -157,8 +157,8 @@ static void __init kvm_linear_init_one(ulong size, int count, int type)
linear_info = alloc_bootmem(count * sizeof(struct kvmppc_linear_info));
for (i = 0; i < count; ++i) {
linear = alloc_bootmem_align(size, size);
pr_info("Allocated KVM %s at %p (%ld MB)\n", typestr, linear,
size >> 20);
pr_debug("Allocated KVM %s at %p (%ld MB)\n", typestr, linear,
size >> 20);
linear_info[i].base_virt = linear;
linear_info[i].base_pfn = __pa(linear) >> PAGE_SHIFT;
linear_info[i].npages = npages;
......
/*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License, version 2, as
* published by the Free Software Foundation.
*
* Copyright 2012 Paul Mackerras, IBM Corp. <paulus@au1.ibm.com>
*/
#include <linux/types.h>
#include <linux/string.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include <linux/kernel.h>
#include <asm/opal.h>
/* SRR1 bits for machine check on POWER7 */
#define SRR1_MC_LDSTERR (1ul << (63-42))
#define SRR1_MC_IFETCH_SH (63-45)
#define SRR1_MC_IFETCH_MASK 0x7
#define SRR1_MC_IFETCH_SLBPAR 2 /* SLB parity error */
#define SRR1_MC_IFETCH_SLBMULTI 3 /* SLB multi-hit */
#define SRR1_MC_IFETCH_SLBPARMULTI 4 /* SLB parity + multi-hit */
#define SRR1_MC_IFETCH_TLBMULTI 5 /* I-TLB multi-hit */
/* DSISR bits for machine check on POWER7 */
#define DSISR_MC_DERAT_MULTI 0x800 /* D-ERAT multi-hit */
#define DSISR_MC_TLB_MULTI 0x400 /* D-TLB multi-hit */
#define DSISR_MC_SLB_PARITY 0x100 /* SLB parity error */
#define DSISR_MC_SLB_MULTI 0x080 /* SLB multi-hit */
#define DSISR_MC_SLB_PARMULTI 0x040 /* SLB parity + multi-hit */
/* POWER7 SLB flush and reload */
static void reload_slb(struct kvm_vcpu *vcpu)
{
struct slb_shadow *slb;
unsigned long i, n;
/* First clear out SLB */
asm volatile("slbmte %0,%0; slbia" : : "r" (0));
/* Do they have an SLB shadow buffer registered? */
slb = vcpu->arch.slb_shadow.pinned_addr;
if (!slb)
return;
/* Sanity check */
n = min_t(u32, slb->persistent, SLB_MIN_SIZE);
if ((void *) &slb->save_area[n] > vcpu->arch.slb_shadow.pinned_end)
return;
/* Load up the SLB from that */
for (i = 0; i < n; ++i) {
unsigned long rb = slb->save_area[i].esid;
unsigned long rs = slb->save_area[i].vsid;
rb = (rb & ~0xFFFul) | i; /* insert entry number */
asm volatile("slbmte %0,%1" : : "r" (rs), "r" (rb));
}
}
/* POWER7 TLB flush */
static void flush_tlb_power7(struct kvm_vcpu *vcpu)
{
unsigned long i, rb;
rb = TLBIEL_INVAL_SET_LPID;
for (i = 0; i < POWER7_TLB_SETS; ++i) {
asm volatile("tlbiel %0" : : "r" (rb));
rb += 1 << TLBIEL_INVAL_SET_SHIFT;
}
}
/*
* On POWER7, see if we can handle a machine check that occurred inside
* the guest in real mode, without switching to the host partition.
*
* Returns: 0 => exit guest, 1 => deliver machine check to guest
*/
static long kvmppc_realmode_mc_power7(struct kvm_vcpu *vcpu)
{
unsigned long srr1 = vcpu->arch.shregs.msr;
struct opal_machine_check_event *opal_evt;
long handled = 1;
if (srr1 & SRR1_MC_LDSTERR) {
/* error on load/store */
unsigned long dsisr = vcpu->arch.shregs.dsisr;
if (dsisr & (DSISR_MC_SLB_PARMULTI | DSISR_MC_SLB_MULTI |
DSISR_MC_SLB_PARITY | DSISR_MC_DERAT_MULTI)) {
/* flush and reload SLB; flushes D-ERAT too */
reload_slb(vcpu);
dsisr &= ~(DSISR_MC_SLB_PARMULTI | DSISR_MC_SLB_MULTI |
DSISR_MC_SLB_PARITY | DSISR_MC_DERAT_MULTI);
}
if (dsisr & DSISR_MC_TLB_MULTI) {
flush_tlb_power7(vcpu);
dsisr &= ~DSISR_MC_TLB_MULTI;
}
/* Any other errors we don't understand? */
if (dsisr & 0xffffffffUL)
handled = 0;
}
switch ((srr1 >> SRR1_MC_IFETCH_SH) & SRR1_MC_IFETCH_MASK) {
case 0:
break;
case SRR1_MC_IFETCH_SLBPAR:
case SRR1_MC_IFETCH_SLBMULTI:
case SRR1_MC_IFETCH_SLBPARMULTI:
reload_slb(vcpu);
break;
case SRR1_MC_IFETCH_TLBMULTI:
flush_tlb_power7(vcpu);
break;
default:
handled = 0;
}
/*
* See if OPAL has already handled the condition.
* We assume that if the condition is recovered then OPAL
* will have generated an error log event that we will pick
* up and log later.
*/
opal_evt = local_paca->opal_mc_evt;
if (opal_evt->version == OpalMCE_V1 &&
(opal_evt->severity == OpalMCE_SEV_NO_ERROR ||
opal_evt->disposition == OpalMCE_DISPOSITION_RECOVERED))
handled = 1;
if (handled)
opal_evt->in_use = 0;
return handled;
}
long kvmppc_realmode_machine_check(struct kvm_vcpu *vcpu)
{
if (cpu_has_feature(CPU_FTR_ARCH_206))
return kvmppc_realmode_mc_power7(vcpu);
return 0;
}
......@@ -35,6 +35,37 @@ static void *real_vmalloc_addr(void *x)
return __va(addr);
}
/* Return 1 if we need to do a global tlbie, 0 if we can use tlbiel */
static int global_invalidates(struct kvm *kvm, unsigned long flags)
{
int global;
/*
* If there is only one vcore, and it's currently running,
* we can use tlbiel as long as we mark all other physical
* cores as potentially having stale TLB entries for this lpid.
* If we're not using MMU notifiers, we never take pages away
* from the guest, so we can use tlbiel if requested.
* Otherwise, don't use tlbiel.
*/
if (kvm->arch.online_vcores == 1 && local_paca->kvm_hstate.kvm_vcore)
global = 0;
else if (kvm->arch.using_mmu_notifiers)
global = 1;
else
global = !(flags & H_LOCAL);
if (!global) {
/* any other core might now have stale TLB entries... */
smp_wmb();
cpumask_setall(&kvm->arch.need_tlb_flush);
cpumask_clear_cpu(local_paca->kvm_hstate.kvm_vcore->pcpu,
&kvm->arch.need_tlb_flush);
}
return global;
}
/*
* Add this HPTE into the chain for the real page.
* Must be called with the chain locked; it unlocks the chain.
......@@ -59,13 +90,24 @@ void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
head->back = pte_index;
} else {
rev->forw = rev->back = pte_index;
i = pte_index;
*rmap = (*rmap & ~KVMPPC_RMAP_INDEX) |
pte_index | KVMPPC_RMAP_PRESENT;
}
smp_wmb();
*rmap = i | KVMPPC_RMAP_REFERENCED | KVMPPC_RMAP_PRESENT; /* unlock */
unlock_rmap(rmap);
}
EXPORT_SYMBOL_GPL(kvmppc_add_revmap_chain);
/*
* Note modification of an HPTE; set the HPTE modified bit
* if anyone is interested.
*/
static inline void note_hpte_modification(struct kvm *kvm,
struct revmap_entry *rev)
{
if (atomic_read(&kvm->arch.hpte_mod_interest))
rev->guest_rpte |= HPTE_GR_MODIFIED;
}
/* Remove this HPTE from the chain for a real page */
static void remove_revmap_chain(struct kvm *kvm, long pte_index,
struct revmap_entry *rev,
......@@ -81,7 +123,7 @@ static void remove_revmap_chain(struct kvm *kvm, long pte_index,
ptel = rev->guest_rpte |= rcbits;
gfn = hpte_rpn(ptel, hpte_page_size(hpte_v, ptel));
memslot = __gfn_to_memslot(kvm_memslots(kvm), gfn);
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID))
if (!memslot)
return;
rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
......@@ -103,14 +145,14 @@ static void remove_revmap_chain(struct kvm *kvm, long pte_index,
unlock_rmap(rmap);
}
static pte_t lookup_linux_pte(struct kvm_vcpu *vcpu, unsigned long hva,
static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
int writing, unsigned long *pte_sizep)
{
pte_t *ptep;
unsigned long ps = *pte_sizep;
unsigned int shift;
ptep = find_linux_pte_or_hugepte(vcpu->arch.pgdir, hva, &shift);
ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
if (!ptep)
return __pte(0);
if (shift)
......@@ -130,15 +172,15 @@ static inline void unlock_hpte(unsigned long *hpte, unsigned long hpte_v)
hpte[0] = hpte_v;
}
long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel)
long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel,
pgd_t *pgdir, bool realmode, unsigned long *pte_idx_ret)
{
struct kvm *kvm = vcpu->kvm;
unsigned long i, pa, gpa, gfn, psize;
unsigned long slot_fn, hva;
unsigned long *hpte;
struct revmap_entry *rev;
unsigned long g_ptel = ptel;
unsigned long g_ptel;
struct kvm_memory_slot *memslot;
unsigned long *physp, pte_size;
unsigned long is_io;
......@@ -147,13 +189,14 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned int writing;
unsigned long mmu_seq;
unsigned long rcbits;
bool realmode = vcpu->arch.vcore->vcore_state == VCORE_RUNNING;
psize = hpte_page_size(pteh, ptel);
if (!psize)
return H_PARAMETER;
writing = hpte_is_writable(ptel);
pteh &= ~(HPTE_V_HVLOCK | HPTE_V_ABSENT | HPTE_V_VALID);
ptel &= ~HPTE_GR_RESERVED;
g_ptel = ptel;
/* used later to detect if we might have been invalidated */
mmu_seq = kvm->mmu_notifier_seq;
......@@ -183,7 +226,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
rmap = &memslot->arch.rmap[slot_fn];
if (!kvm->arch.using_mmu_notifiers) {
physp = kvm->arch.slot_phys[memslot->id];
physp = memslot->arch.slot_phys;
if (!physp)
return H_PARAMETER;
physp += slot_fn;
......@@ -201,7 +244,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
/* Look up the Linux PTE for the backing page */
pte_size = psize;
pte = lookup_linux_pte(vcpu, hva, writing, &pte_size);
pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
if (pte_present(pte)) {
if (writing && !pte_write(pte))
/* make the actual HPTE be read-only */
......@@ -210,6 +253,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
pa = pte_pfn(pte) << PAGE_SHIFT;
}
}
if (pte_size < psize)
return H_PARAMETER;
if (pa && pte_size > psize)
......@@ -287,8 +331,10 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
rev = &kvm->arch.revmap[pte_index];
if (realmode)
rev = real_vmalloc_addr(rev);
if (rev)
if (rev) {
rev->guest_rpte = g_ptel;
note_hpte_modification(kvm, rev);
}
/* Link HPTE into reverse-map chain */
if (pteh & HPTE_V_VALID) {
......@@ -297,7 +343,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
lock_rmap(rmap);
/* Check for pending invalidations under the rmap chain lock */
if (kvm->arch.using_mmu_notifiers &&
mmu_notifier_retry(vcpu, mmu_seq)) {
mmu_notifier_retry(kvm, mmu_seq)) {
/* inval in progress, write a non-present HPTE */
pteh |= HPTE_V_ABSENT;
pteh &= ~HPTE_V_VALID;
......@@ -318,10 +364,17 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
hpte[0] = pteh;
asm volatile("ptesync" : : : "memory");
vcpu->arch.gpr[4] = pte_index;
*pte_idx_ret = pte_index;
return H_SUCCESS;
}
EXPORT_SYMBOL_GPL(kvmppc_h_enter);
EXPORT_SYMBOL_GPL(kvmppc_do_h_enter);
long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel)
{
return kvmppc_do_h_enter(vcpu->kvm, flags, pte_index, pteh, ptel,
vcpu->arch.pgdir, true, &vcpu->arch.gpr[4]);
}
#define LOCK_TOKEN (*(u32 *)(&get_paca()->lock_token))
......@@ -343,11 +396,10 @@ static inline int try_lock_tlbie(unsigned int *lock)
return old == 0;
}
long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned long pte_index, unsigned long avpn,
unsigned long va)
long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
unsigned long pte_index, unsigned long avpn,
unsigned long *hpret)
{
struct kvm *kvm = vcpu->kvm;
unsigned long *hpte;
unsigned long v, r, rb;
struct revmap_entry *rev;
......@@ -369,7 +421,7 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
if (v & HPTE_V_VALID) {
hpte[0] &= ~HPTE_V_VALID;
rb = compute_tlbie_rb(v, hpte[1], pte_index);
if (!(flags & H_LOCAL) && atomic_read(&kvm->online_vcpus) > 1) {
if (global_invalidates(kvm, flags)) {
while (!try_lock_tlbie(&kvm->arch.tlbie_lock))
cpu_relax();
asm volatile("ptesync" : : : "memory");
......@@ -385,13 +437,22 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
/* Read PTE low word after tlbie to get final R/C values */
remove_revmap_chain(kvm, pte_index, rev, v, hpte[1]);
}
r = rev->guest_rpte;
r = rev->guest_rpte & ~HPTE_GR_RESERVED;
note_hpte_modification(kvm, rev);
unlock_hpte(hpte, 0);
vcpu->arch.gpr[4] = v;
vcpu->arch.gpr[5] = r;
hpret[0] = v;
hpret[1] = r;
return H_SUCCESS;
}
EXPORT_SYMBOL_GPL(kvmppc_do_h_remove);
long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned long pte_index, unsigned long avpn)
{
return kvmppc_do_h_remove(vcpu->kvm, flags, pte_index, avpn,
&vcpu->arch.gpr[4]);
}
long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
{
......@@ -459,6 +520,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
args[j] = ((0x80 | flags) << 56) + pte_index;
rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
note_hpte_modification(kvm, rev);
if (!(hp[0] & HPTE_V_VALID)) {
/* insert R and C bits from PTE */
......@@ -534,8 +596,6 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
return H_NOT_FOUND;
}
if (atomic_read(&kvm->online_vcpus) == 1)
flags |= H_LOCAL;
v = hpte[0];
bits = (flags << 55) & HPTE_R_PP0;
bits |= (flags << 48) & HPTE_R_KEY_HI;
......@@ -548,6 +608,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
if (rev) {
r = (rev->guest_rpte & ~mask) | bits;
rev->guest_rpte = r;
note_hpte_modification(kvm, rev);
}
r = (hpte[1] & ~mask) | bits;
......@@ -555,7 +616,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
if (v & HPTE_V_VALID) {
rb = compute_tlbie_rb(v, r, pte_index);
hpte[0] = v & ~HPTE_V_VALID;
if (!(flags & H_LOCAL)) {
if (global_invalidates(kvm, flags)) {
while(!try_lock_tlbie(&kvm->arch.tlbie_lock))
cpu_relax();
asm volatile("ptesync" : : : "memory");
......@@ -568,6 +629,28 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
asm volatile("tlbiel %0" : : "r" (rb));
asm volatile("ptesync" : : : "memory");
}
/*
* If the host has this page as readonly but the guest
* wants to make it read/write, reduce the permissions.
* Checking the host permissions involves finding the
* memslot and then the Linux PTE for the page.
*/
if (hpte_is_writable(r) && kvm->arch.using_mmu_notifiers) {
unsigned long psize, gfn, hva;
struct kvm_memory_slot *memslot;
pgd_t *pgdir = vcpu->arch.pgdir;
pte_t pte;
psize = hpte_page_size(v, r);
gfn = ((r & HPTE_R_RPN) & ~(psize - 1)) >> PAGE_SHIFT;
memslot = __gfn_to_memslot(kvm_memslots(kvm), gfn);
if (memslot) {
hva = __gfn_to_hva_memslot(memslot, gfn);
pte = lookup_linux_pte(pgdir, hva, 1, &psize);
if (pte_present(pte) && !pte_write(pte))
r = hpte_make_readonly(r);
}
}
}
hpte[1] = r;
eieio();
......@@ -599,8 +682,10 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
v &= ~HPTE_V_ABSENT;
v |= HPTE_V_VALID;
}
if (v & HPTE_V_VALID)
if (v & HPTE_V_VALID) {
r = rev[i].guest_rpte | (r & (HPTE_R_R | HPTE_R_C));
r &= ~HPTE_GR_RESERVED;
}
vcpu->arch.gpr[4 + i * 2] = v;
vcpu->arch.gpr[5 + i * 2] = r;
}
......
......@@ -27,6 +27,7 @@
#include <asm/asm-offsets.h>
#include <asm/exception-64s.h>
#include <asm/kvm_book3s_asm.h>
#include <asm/mmu-hash64.h>
/*****************************************************************************
* *
......@@ -134,8 +135,11 @@ kvm_start_guest:
27: /* XXX should handle hypervisor maintenance interrupts etc. here */
/* reload vcpu pointer after clearing the IPI */
ld r4,HSTATE_KVM_VCPU(r13)
cmpdi r4,0
/* if we have no vcpu to run, go back to sleep */
beq cr1,kvm_no_guest
beq kvm_no_guest
/* were we napping due to cede? */
lbz r0,HSTATE_NAPPING(r13)
......@@ -310,7 +314,33 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
mtspr SPRN_SDR1,r6 /* switch to partition page table */
mtspr SPRN_LPID,r7
isync
/* See if we need to flush the TLB */
lhz r6,PACAPACAINDEX(r13) /* test_bit(cpu, need_tlb_flush) */
clrldi r7,r6,64-6 /* extract bit number (6 bits) */
srdi r6,r6,6 /* doubleword number */
sldi r6,r6,3 /* address offset */
add r6,r6,r9
addi r6,r6,KVM_NEED_FLUSH /* dword in kvm->arch.need_tlb_flush */
li r0,1
sld r0,r0,r7
ld r7,0(r6)
and. r7,r7,r0
beq 22f
23: ldarx r7,0,r6 /* if set, clear the bit */
andc r7,r7,r0
stdcx. r7,0,r6
bne 23b
li r6,128 /* and flush the TLB */
mtctr r6
li r7,0x800 /* IS field = 0b10 */
ptesync
28: tlbiel r7
addi r7,r7,0x1000
bdnz 28b
ptesync
22: li r0,1
stb r0,VCORE_IN_GUEST(r5) /* signal secondaries to continue */
b 10f
......@@ -333,36 +363,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
mr r9,r4
blt hdec_soon
/*
* Invalidate the TLB if we could possibly have stale TLB
* entries for this partition on this core due to the use
* of tlbiel.
* XXX maybe only need this on primary thread?
*/
ld r9,VCPU_KVM(r4) /* pointer to struct kvm */
lwz r5,VCPU_VCPUID(r4)
lhz r6,PACAPACAINDEX(r13)
rldimi r6,r5,0,62 /* XXX map as if threads 1:1 p:v */
lhz r8,VCPU_LAST_CPU(r4)
sldi r7,r6,1 /* see if this is the same vcpu */
add r7,r7,r9 /* as last ran on this pcpu */
lhz r0,KVM_LAST_VCPU(r7)
cmpw r6,r8 /* on the same cpu core as last time? */
bne 3f
cmpw r0,r5 /* same vcpu as this core last ran? */
beq 1f
3: sth r6,VCPU_LAST_CPU(r4) /* if not, invalidate partition TLB */
sth r5,KVM_LAST_VCPU(r7)
li r6,128
mtctr r6
li r7,0x800 /* IS field = 0b10 */
ptesync
2: tlbiel r7
addi r7,r7,0x1000
bdnz 2b
ptesync
1:
/* Save purr/spurr */
mfspr r5,SPRN_PURR
mfspr r6,SPRN_SPURR
......@@ -679,8 +679,7 @@ BEGIN_FTR_SECTION
1:
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
nohpte_cont:
hcall_real_cont: /* r9 = vcpu, r12 = trap, r13 = paca */
guest_exit_cont: /* r9 = vcpu, r12 = trap, r13 = paca */
/* Save DEC */
mfspr r5,SPRN_DEC
mftb r6
......@@ -701,6 +700,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
std r6, VCPU_FAULT_DAR(r9)
stw r7, VCPU_FAULT_DSISR(r9)
/* See if it is a machine check */
cmpwi r12, BOOK3S_INTERRUPT_MACHINE_CHECK
beq machine_check_realmode
mc_cont:
/* Save guest CTRL register, set runlatch to 1 */
6: mfspr r6,SPRN_CTRLF
stw r6,VCPU_CTRL(r9)
......@@ -1113,38 +1117,41 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
/*
* For external and machine check interrupts, we need
* to call the Linux handler to process the interrupt.
* We do that by jumping to the interrupt vector address
* which we have in r12. The [h]rfid at the end of the
* We do that by jumping to absolute address 0x500 for
* external interrupts, or the machine_check_fwnmi label
* for machine checks (since firmware might have patched
* the vector area at 0x200). The [h]rfid at the end of the
* handler will return to the book3s_hv_interrupts.S code.
* For other interrupts we do the rfid to get back
* to the book3s_interrupts.S code here.
* to the book3s_hv_interrupts.S code here.
*/
ld r8, HSTATE_VMHANDLER(r13)
ld r7, HSTATE_HOST_MSR(r13)
cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL
BEGIN_FTR_SECTION
beq 11f
cmpwi r12, BOOK3S_INTERRUPT_MACHINE_CHECK
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
/* RFI into the highmem handler, or branch to interrupt handler */
12: mfmsr r6
mtctr r12
mfmsr r6
li r0, MSR_RI
andc r6, r6, r0
mtmsrd r6, 1 /* Clear RI in MSR */
mtsrr0 r8
mtsrr1 r7
beqctr
beqa 0x500 /* external interrupt (PPC970) */
beq cr1, 13f /* machine check */
RFI
11:
BEGIN_FTR_SECTION
b 12b
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
mtspr SPRN_HSRR0, r8
/* On POWER7, we have external interrupts set to use HSRR0/1 */
11: mtspr SPRN_HSRR0, r8
mtspr SPRN_HSRR1, r7
ba 0x500
13: b machine_check_fwnmi
/*
* Check whether an HDSI is an HPTE not found fault or something else.
* If it is an HPTE not found fault that is due to the guest accessing
......@@ -1177,7 +1184,7 @@ kvmppc_hdsi:
cmpdi r3, 0 /* retry the instruction */
beq 6f
cmpdi r3, -1 /* handle in kernel mode */
beq nohpte_cont
beq guest_exit_cont
cmpdi r3, -2 /* MMIO emulation; need instr word */
beq 2f
......@@ -1191,6 +1198,7 @@ kvmppc_hdsi:
li r10, BOOK3S_INTERRUPT_DATA_STORAGE
li r11, (MSR_ME << 1) | 1 /* synthesize MSR_SF | MSR_ME */
rotldi r11, r11, 63
fast_interrupt_c_return:
6: ld r7, VCPU_CTR(r9)
lwz r8, VCPU_XER(r9)
mtctr r7
......@@ -1223,7 +1231,7 @@ kvmppc_hdsi:
/* Unset guest mode. */
li r0, KVM_GUEST_MODE_NONE
stb r0, HSTATE_IN_GUEST(r13)
b nohpte_cont
b guest_exit_cont
/*
* Similarly for an HISI, reflect it to the guest as an ISI unless
......@@ -1249,9 +1257,9 @@ kvmppc_hisi:
ld r11, VCPU_MSR(r9)
li r12, BOOK3S_INTERRUPT_H_INST_STORAGE
cmpdi r3, 0 /* retry the instruction */
beq 6f
beq fast_interrupt_c_return
cmpdi r3, -1 /* handle in kernel mode */
beq nohpte_cont
beq guest_exit_cont
/* Synthesize an ISI for the guest */
mr r11, r3
......@@ -1260,12 +1268,7 @@ kvmppc_hisi:
li r10, BOOK3S_INTERRUPT_INST_STORAGE
li r11, (MSR_ME << 1) | 1 /* synthesize MSR_SF | MSR_ME */
rotldi r11, r11, 63
6: ld r7, VCPU_CTR(r9)
lwz r8, VCPU_XER(r9)
mtctr r7
mtxer r8
mr r4, r9
b fast_guest_return
b fast_interrupt_c_return
3: ld r6, VCPU_KVM(r9) /* not relocated, use VRMA */
ld r5, KVM_VRMA_SLB_V(r6)
......@@ -1281,14 +1284,14 @@ kvmppc_hisi:
hcall_try_real_mode:
ld r3,VCPU_GPR(R3)(r9)
andi. r0,r11,MSR_PR
bne hcall_real_cont
bne guest_exit_cont
clrrdi r3,r3,2
cmpldi r3,hcall_real_table_end - hcall_real_table
bge hcall_real_cont
bge guest_exit_cont
LOAD_REG_ADDR(r4, hcall_real_table)
lwzx r3,r3,r4
cmpwi r3,0
beq hcall_real_cont
beq guest_exit_cont
add r3,r3,r4
mtctr r3
mr r3,r9 /* get vcpu pointer */
......@@ -1309,7 +1312,7 @@ hcall_real_fallback:
li r12,BOOK3S_INTERRUPT_SYSCALL
ld r9, HSTATE_KVM_VCPU(r13)
b hcall_real_cont
b guest_exit_cont
.globl hcall_real_table
hcall_real_table:
......@@ -1568,6 +1571,21 @@ kvm_cede_exit:
li r3,H_TOO_HARD
blr
/* Try to handle a machine check in real mode */
machine_check_realmode:
mr r3, r9 /* get vcpu pointer */
bl .kvmppc_realmode_machine_check
nop
cmpdi r3, 0 /* continue exiting from guest? */
ld r9, HSTATE_KVM_VCPU(r13)
li r12, BOOK3S_INTERRUPT_MACHINE_CHECK
beq mc_cont
/* If not, deliver a machine check. SRR0/1 are already set */
li r10, BOOK3S_INTERRUPT_MACHINE_CHECK
li r11, (MSR_ME << 1) | 1 /* synthesize MSR_SF | MSR_ME */
rotldi r11, r11, 63
b fast_interrupt_c_return
secondary_too_late:
ld r5,HSTATE_KVM_VCORE(r13)
HMT_LOW
......@@ -1587,6 +1605,10 @@ secondary_too_late:
.endr
secondary_nap:
/* Clear our vcpu pointer so we don't come back in early */
li r0, 0
std r0, HSTATE_KVM_VCPU(r13)
lwsync
/* Clear any pending IPI - assume we're a secondary thread */
ld r5, HSTATE_XICS_PHYS(r13)
li r7, XICS_XIRR
......@@ -1612,8 +1634,6 @@ secondary_nap:
kvm_no_guest:
li r0, KVM_HWTHREAD_IN_NAP
stb r0, HSTATE_HWTHREAD_STATE(r13)
li r0, 0
std r0, HSTATE_KVM_VCPU(r13)
li r3, LPCR_PECE0
mfspr r4, SPRN_LPCR
......
......@@ -114,11 +114,6 @@ static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
hlist_del_init_rcu(&pte->list_vpte);
hlist_del_init_rcu(&pte->list_vpte_long);
if (pte->pte.may_write)
kvm_release_pfn_dirty(pte->pfn);
else
kvm_release_pfn_clean(pte->pfn);
spin_unlock(&vcpu3s->mmu_lock);
vcpu3s->hpte_cache_count--;
......
......@@ -52,8 +52,6 @@ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
#define MSR_USER32 MSR_USER
#define MSR_USER64 MSR_USER
#define HW_PAGE_SIZE PAGE_SIZE
#define __hard_irq_disable local_irq_disable
#define __hard_irq_enable local_irq_enable
#endif
void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
......@@ -66,7 +64,7 @@ void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
svcpu->slb_max = to_book3s(vcpu)->slb_shadow_max;
svcpu_put(svcpu);
#endif
vcpu->cpu = smp_processor_id();
#ifdef CONFIG_PPC_BOOK3S_32
current->thread.kvm_shadow_vcpu = to_book3s(vcpu)->shadow_vcpu;
#endif
......@@ -83,17 +81,71 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
svcpu_put(svcpu);
#endif
kvmppc_giveup_ext(vcpu, MSR_FP);
kvmppc_giveup_ext(vcpu, MSR_VEC);
kvmppc_giveup_ext(vcpu, MSR_VSX);
kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
vcpu->cpu = -1;
}
int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
{
int r = 1; /* Indicate we want to get back into the guest */
/* We misuse TLB_FLUSH to indicate that we want to clear
all shadow cache entries */
if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
kvmppc_mmu_pte_flush(vcpu, 0, 0);
return r;
}
/************* MMU Notifiers *************/
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
{
trace_kvm_unmap_hva(hva);
/*
* Flush all shadow tlb entries everywhere. This is slow, but
* we are 100% sure that we catch the to be unmapped page
*/
kvm_flush_remote_tlbs(kvm);
return 0;
}
int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
{
/* kvm_unmap_hva flushes everything anyways */
kvm_unmap_hva(kvm, start);
return 0;
}
int kvm_age_hva(struct kvm *kvm, unsigned long hva)
{
/* XXX could be more clever ;) */
return 0;
}
int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
{
/* XXX could be more clever ;) */
return 0;
}
void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
{
/* The page will get remapped properly on its next fault */
kvm_unmap_hva(kvm, hva);
}
/*****************************************/
static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
{
ulong smsr = vcpu->arch.shared->msr;
/* Guest MSR values */
smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_DE;
smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE;
/* Process MSR values */
smsr |= MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_PR | MSR_EE;
/* External providers the guest reserved */
......@@ -379,10 +431,7 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
static inline int get_fpr_index(int i)
{
#ifdef CONFIG_VSX
i *= 2;
#endif
return i;
return i * TS_FPRWIDTH;
}
/* Give up external provider (FPU, Altivec, VSX) */
......@@ -396,41 +445,49 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
u64 *thread_fpr = (u64*)t->fpr;
int i;
if (!(vcpu->arch.guest_owned_ext & msr))
/*
* VSX instructions can access FP and vector registers, so if
* we are giving up VSX, make sure we give up FP and VMX as well.
*/
if (msr & MSR_VSX)
msr |= MSR_FP | MSR_VEC;
msr &= vcpu->arch.guest_owned_ext;
if (!msr)
return;
#ifdef DEBUG_EXT
printk(KERN_INFO "Giving up ext 0x%lx\n", msr);
#endif
switch (msr) {
case MSR_FP:
if (msr & MSR_FP) {
/*
* Note that on CPUs with VSX, giveup_fpu stores
* both the traditional FP registers and the added VSX
* registers into thread.fpr[].
*/
giveup_fpu(current);
for (i = 0; i < ARRAY_SIZE(vcpu->arch.fpr); i++)
vcpu_fpr[i] = thread_fpr[get_fpr_index(i)];
vcpu->arch.fpscr = t->fpscr.val;
break;
case MSR_VEC:
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX))
for (i = 0; i < ARRAY_SIZE(vcpu->arch.vsr) / 2; i++)
vcpu_vsx[i] = thread_fpr[get_fpr_index(i) + 1];
#endif
}
#ifdef CONFIG_ALTIVEC
if (msr & MSR_VEC) {
giveup_altivec(current);
memcpy(vcpu->arch.vr, t->vr, sizeof(vcpu->arch.vr));
vcpu->arch.vscr = t->vscr;
#endif
break;
case MSR_VSX:
#ifdef CONFIG_VSX
__giveup_vsx(current);
for (i = 0; i < ARRAY_SIZE(vcpu->arch.vsr); i++)
vcpu_vsx[i] = thread_fpr[get_fpr_index(i) + 1];
#endif
break;
default:
BUG();
}
#endif
vcpu->arch.guest_owned_ext &= ~msr;
current->thread.regs->msr &= ~msr;
vcpu->arch.guest_owned_ext &= ~(msr | MSR_VSX);
kvmppc_recalc_shadow_msr(vcpu);
}
......@@ -490,47 +547,56 @@ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
return RESUME_GUEST;
}
/* We already own the ext */
if (vcpu->arch.guest_owned_ext & msr) {
return RESUME_GUEST;
if (msr == MSR_VSX) {
/* No VSX? Give an illegal instruction interrupt */
#ifdef CONFIG_VSX
if (!cpu_has_feature(CPU_FTR_VSX))
#endif
{
kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
return RESUME_GUEST;
}
/*
* We have to load up all the FP and VMX registers before
* we can let the guest use VSX instructions.
*/
msr = MSR_FP | MSR_VEC | MSR_VSX;
}
/* See if we already own all the ext(s) needed */
msr &= ~vcpu->arch.guest_owned_ext;
if (!msr)
return RESUME_GUEST;
#ifdef DEBUG_EXT
printk(KERN_INFO "Loading up ext 0x%lx\n", msr);
#endif
current->thread.regs->msr |= msr;
switch (msr) {
case MSR_FP:
if (msr & MSR_FP) {
for (i = 0; i < ARRAY_SIZE(vcpu->arch.fpr); i++)
thread_fpr[get_fpr_index(i)] = vcpu_fpr[i];
#ifdef CONFIG_VSX
for (i = 0; i < ARRAY_SIZE(vcpu->arch.vsr) / 2; i++)
thread_fpr[get_fpr_index(i) + 1] = vcpu_vsx[i];
#endif
t->fpscr.val = vcpu->arch.fpscr;
t->fpexc_mode = 0;
kvmppc_load_up_fpu();
break;
case MSR_VEC:
}
if (msr & MSR_VEC) {
#ifdef CONFIG_ALTIVEC
memcpy(t->vr, vcpu->arch.vr, sizeof(vcpu->arch.vr));
t->vscr = vcpu->arch.vscr;
t->vrsave = -1;
kvmppc_load_up_altivec();
#endif
break;
case MSR_VSX:
#ifdef CONFIG_VSX
for (i = 0; i < ARRAY_SIZE(vcpu->arch.vsr); i++)
thread_fpr[get_fpr_index(i) + 1] = vcpu_vsx[i];
kvmppc_load_up_vsx();
#endif
break;
default:
BUG();
}
vcpu->arch.guest_owned_ext |= msr;
kvmppc_recalc_shadow_msr(vcpu);
return RESUME_GUEST;
......@@ -540,18 +606,18 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
unsigned int exit_nr)
{
int r = RESUME_HOST;
int s;
vcpu->stat.sum_exits++;
run->exit_reason = KVM_EXIT_UNKNOWN;
run->ready_for_interrupt_injection = 1;
/* We get here with MSR.EE=0, so enable it to be a nice citizen */
__hard_irq_enable();
/* We get here with MSR.EE=1 */
trace_kvm_exit(exit_nr, vcpu);
kvm_guest_exit();
trace_kvm_book3s_exit(exit_nr, vcpu);
preempt_enable();
kvm_resched(vcpu);
switch (exit_nr) {
case BOOK3S_INTERRUPT_INST_STORAGE:
{
......@@ -802,7 +868,6 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
}
}
preempt_disable();
if (!(r & RESUME_HOST)) {
/* To avoid clobbering exit_reason, only check for signals if
* we aren't already exiting to userspace for some other
......@@ -814,20 +879,13 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
* and if we really did time things so badly, then we just exit
* again due to a host external interrupt.
*/
__hard_irq_disable();
if (signal_pending(current)) {
__hard_irq_enable();
#ifdef EXIT_DEBUG
printk(KERN_EMERG "KVM: Going back to host\n");
#endif
vcpu->stat.signal_exits++;
run->exit_reason = KVM_EXIT_INTR;
r = -EINTR;
local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
if (s <= 0) {
local_irq_enable();
r = s;
} else {
/* In case an interrupt came in that was triggered
* from userspace (like DEC), we need to check what
* to inject now! */
kvmppc_core_prepare_to_enter(vcpu);
kvmppc_lazy_ee_enable();
}
}
......@@ -899,34 +957,59 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return 0;
}
int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *val)
{
int r = -EINVAL;
int r = 0;
switch (reg->id) {
switch (id) {
case KVM_REG_PPC_HIOR:
r = copy_to_user((u64 __user *)(long)reg->addr,
&to_book3s(vcpu)->hior, sizeof(u64));
*val = get_reg_val(id, to_book3s(vcpu)->hior);
break;
#ifdef CONFIG_VSX
case KVM_REG_PPC_VSR0 ... KVM_REG_PPC_VSR31: {
long int i = id - KVM_REG_PPC_VSR0;
if (!cpu_has_feature(CPU_FTR_VSX)) {
r = -ENXIO;
break;
}
val->vsxval[0] = vcpu->arch.fpr[i];
val->vsxval[1] = vcpu->arch.vsr[i];
break;
}
#endif /* CONFIG_VSX */
default:
r = -EINVAL;
break;
}
return r;
}
int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *val)
{
int r = -EINVAL;
int r = 0;
switch (reg->id) {
switch (id) {
case KVM_REG_PPC_HIOR:
r = copy_from_user(&to_book3s(vcpu)->hior,
(u64 __user *)(long)reg->addr, sizeof(u64));
if (!r)
to_book3s(vcpu)->hior_explicit = true;
to_book3s(vcpu)->hior = set_reg_val(id, *val);
to_book3s(vcpu)->hior_explicit = true;
break;
#ifdef CONFIG_VSX
case KVM_REG_PPC_VSR0 ... KVM_REG_PPC_VSR31: {
long int i = id - KVM_REG_PPC_VSR0;
if (!cpu_has_feature(CPU_FTR_VSX)) {
r = -ENXIO;
break;
}
vcpu->arch.fpr[i] = val->vsxval[0];
vcpu->arch.vsr[i] = val->vsxval[1];
break;
}
#endif /* CONFIG_VSX */
default:
r = -EINVAL;
break;
}
......@@ -1020,8 +1103,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
#endif
ulong ext_msr;
preempt_disable();
/* Check if we can run the vcpu at all */
if (!vcpu->arch.sane) {
kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
......@@ -1029,21 +1110,16 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
goto out;
}
kvmppc_core_prepare_to_enter(vcpu);
/*
* Interrupts could be timers for the guest which we have to inject
* again, so let's postpone them until we're in the guest and if we
* really did time things so badly, then we just exit again due to
* a host external interrupt.
*/
__hard_irq_disable();
/* No need to go into the guest when all we do is going out */
if (signal_pending(current)) {
__hard_irq_enable();
kvm_run->exit_reason = KVM_EXIT_INTR;
ret = -EINTR;
local_irq_disable();
ret = kvmppc_prepare_to_enter(vcpu);
if (ret <= 0) {
local_irq_enable();
goto out;
}
......@@ -1070,7 +1146,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
/* Save VSX state in stack */
used_vsr = current->thread.used_vsr;
if (used_vsr && (current->thread.regs->msr & MSR_VSX))
__giveup_vsx(current);
__giveup_vsx(current);
#endif
/* Remember the MSR with disabled extensions */
......@@ -1080,20 +1156,19 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
if (vcpu->arch.shared->msr & MSR_FP)
kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
kvm_guest_enter();
kvmppc_lazy_ee_enable();
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
kvm_guest_exit();
current->thread.regs->msr = ext_msr;
/* No need for kvm_guest_exit. It's done in handle_exit.
We also get here with interrupts enabled. */
/* Make sure we save the guest FPU/Altivec/VSX state */
kvmppc_giveup_ext(vcpu, MSR_FP);
kvmppc_giveup_ext(vcpu, MSR_VEC);
kvmppc_giveup_ext(vcpu, MSR_VSX);
kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
current->thread.regs->msr = ext_msr;
/* Restore FPU state from stack */
/* Restore FPU/VSX state from stack */
memcpy(current->thread.fpr, fpr, sizeof(current->thread.fpr));
current->thread.fpscr.val = fpscr;
current->thread.fpexc_mode = fpexc_mode;
......@@ -1113,7 +1188,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
#endif
out:
preempt_enable();
vcpu->mode = OUTSIDE_GUEST_MODE;
return ret;
}
......@@ -1181,14 +1256,31 @@ int kvm_vm_ioctl_get_smmu_info(struct kvm *kvm, struct kvm_ppc_smmu_info *info)
}
#endif /* CONFIG_PPC64 */
void kvmppc_core_free_memslot(struct kvm_memory_slot *free,
struct kvm_memory_slot *dont)
{
}
int kvmppc_core_create_memslot(struct kvm_memory_slot *slot,
unsigned long npages)
{
return 0;
}
int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_userspace_memory_region *mem)
{
return 0;
}
void kvmppc_core_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem)
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old)
{
}
void kvmppc_core_flush_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
{
}
......
......@@ -170,20 +170,21 @@ kvmppc_handler_skip_ins:
* Call kvmppc_handler_trampoline_enter in real mode
*
* On entry, r4 contains the guest shadow MSR
* MSR.EE has to be 0 when calling this function
*/
_GLOBAL(kvmppc_entry_trampoline)
mfmsr r5
LOAD_REG_ADDR(r7, kvmppc_handler_trampoline_enter)
toreal(r7)
li r9, MSR_RI
ori r9, r9, MSR_EE
andc r9, r5, r9 /* Clear EE and RI in MSR value */
li r6, MSR_IR | MSR_DR
ori r6, r6, MSR_EE
andc r6, r5, r6 /* Clear EE, DR and IR in MSR value */
MTMSR_EERI(r9) /* Clear EE and RI in MSR */
mtsrr0 r7 /* before we set srr0/1 */
andc r6, r5, r6 /* Clear DR and IR in MSR value */
/*
* Set EE in HOST_MSR so that it's enabled when we get into our
* C exit handler function
*/
ori r5, r5, MSR_EE
mtsrr0 r7
mtsrr1 r6
RFI
......@@ -233,8 +234,5 @@ define_load_up(fpu)
#ifdef CONFIG_ALTIVEC
define_load_up(altivec)
#endif
#ifdef CONFIG_VSX
define_load_up(vsx)
#endif
#include "book3s_segment.S"
此差异已折叠。
......@@ -69,6 +69,7 @@ extern unsigned long kvmppc_booke_handlers;
void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr);
void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr);
void kvmppc_set_epcr(struct kvm_vcpu *vcpu, u32 new_epcr);
void kvmppc_set_tcr(struct kvm_vcpu *vcpu, u32 new_tcr);
void kvmppc_set_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits);
void kvmppc_clr_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits);
......
......@@ -133,10 +133,10 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
vcpu->arch.csrr1 = spr_val;
break;
case SPRN_DBCR0:
vcpu->arch.dbcr0 = spr_val;
vcpu->arch.dbg_reg.dbcr0 = spr_val;
break;
case SPRN_DBCR1:
vcpu->arch.dbcr1 = spr_val;
vcpu->arch.dbg_reg.dbcr1 = spr_val;
break;
case SPRN_DBSR:
vcpu->arch.dbsr &= ~spr_val;
......@@ -145,6 +145,14 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
kvmppc_clr_tsr_bits(vcpu, spr_val);
break;
case SPRN_TCR:
/*
* WRC is a 2-bit field that is supposed to preserve its
* value once written to non-zero.
*/
if (vcpu->arch.tcr & TCR_WRC_MASK) {
spr_val &= ~TCR_WRC_MASK;
spr_val |= vcpu->arch.tcr & TCR_WRC_MASK;
}
kvmppc_set_tcr(vcpu, spr_val);
break;
......@@ -229,7 +237,17 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
case SPRN_IVOR15:
vcpu->arch.ivor[BOOKE_IRQPRIO_DEBUG] = spr_val;
break;
case SPRN_MCSR:
vcpu->arch.mcsr &= ~spr_val;
break;
#if defined(CONFIG_64BIT)
case SPRN_EPCR:
kvmppc_set_epcr(vcpu, spr_val);
#ifdef CONFIG_KVM_BOOKE_HV
mtspr(SPRN_EPCR, vcpu->arch.shadow_epcr);
#endif
break;
#endif
default:
emulated = EMULATE_FAIL;
}
......@@ -258,10 +276,10 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
*spr_val = vcpu->arch.csrr1;
break;
case SPRN_DBCR0:
*spr_val = vcpu->arch.dbcr0;
*spr_val = vcpu->arch.dbg_reg.dbcr0;
break;
case SPRN_DBCR1:
*spr_val = vcpu->arch.dbcr1;
*spr_val = vcpu->arch.dbg_reg.dbcr1;
break;
case SPRN_DBSR:
*spr_val = vcpu->arch.dbsr;
......@@ -321,6 +339,14 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
case SPRN_IVOR15:
*spr_val = vcpu->arch.ivor[BOOKE_IRQPRIO_DEBUG];
break;
case SPRN_MCSR:
*spr_val = vcpu->arch.mcsr;
break;
#if defined(CONFIG_64BIT)
case SPRN_EPCR:
*spr_val = vcpu->arch.epcr;
break;
#endif
default:
emulated = EMULATE_FAIL;
......
......@@ -27,8 +27,7 @@
#define E500_TLB_NUM 2
#define E500_TLB_VALID 1
#define E500_TLB_DIRTY 2
#define E500_TLB_BITMAP 4
#define E500_TLB_BITMAP 2
struct tlbe_ref {
pfn_t pfn;
......@@ -130,9 +129,9 @@ int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *vcpu_e500,
ulong value);
int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu);
int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu);
int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb);
int kvmppc_e500_emul_tlbilx(struct kvm_vcpu *vcpu, int rt, int ra, int rb);
int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb);
int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, gva_t ea);
int kvmppc_e500_emul_tlbilx(struct kvm_vcpu *vcpu, int type, gva_t ea);
int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, gva_t ea);
int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500);
void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500);
......@@ -155,7 +154,7 @@ get_tlb_size(const struct kvm_book3e_206_tlb_entry *tlbe)
static inline gva_t get_tlb_eaddr(const struct kvm_book3e_206_tlb_entry *tlbe)
{
return tlbe->mas2 & 0xfffff000;
return tlbe->mas2 & MAS2_EPN;
}
static inline u64 get_tlb_bytes(const struct kvm_book3e_206_tlb_entry *tlbe)
......
......@@ -89,6 +89,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
int ra = get_ra(inst);
int rb = get_rb(inst);
int rt = get_rt(inst);
gva_t ea;
switch (get_op(inst)) {
case 31:
......@@ -113,15 +114,20 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
break;
case XOP_TLBSX:
emulated = kvmppc_e500_emul_tlbsx(vcpu,rb);
ea = kvmppc_get_ea_indexed(vcpu, ra, rb);
emulated = kvmppc_e500_emul_tlbsx(vcpu, ea);
break;
case XOP_TLBILX:
emulated = kvmppc_e500_emul_tlbilx(vcpu, rt, ra, rb);
case XOP_TLBILX: {
int type = rt & 0x3;
ea = kvmppc_get_ea_indexed(vcpu, ra, rb);
emulated = kvmppc_e500_emul_tlbilx(vcpu, type, ea);
break;
}
case XOP_TLBIVAX:
emulated = kvmppc_e500_emul_tlbivax(vcpu, ra, rb);
ea = kvmppc_get_ea_indexed(vcpu, ra, rb);
emulated = kvmppc_e500_emul_tlbivax(vcpu, ea);
break;
default:
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
......@@ -90,6 +90,7 @@ config MPIC
config PPC_EPAPR_HV_PIC
bool
default n
select EPAPR_PARAVIRT
config MPIC_WEIRD
bool
......
此差异已折叠。
......@@ -253,6 +253,7 @@ struct platform_diu_data_ops diu_ops;
EXPORT_SYMBOL(diu_ops);
#endif
#ifdef CONFIG_EPAPR_PARAVIRT
/*
* Restart the current partition
*
......@@ -278,3 +279,4 @@ void fsl_hv_halt(void)
pr_info("hv exit\n");
fh_partition_stop(-1);
}
#endif
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册