1. 30 May, 2014 11 commits
    • A
      KVM: PPC: Book3S PR: PAPR: Access RTAS in big endian · b59d9d26
      Alexander Graf committed
      When the guest does an RTAS hypercall it keeps all RTAS variables inside a
      big endian data structure.

      To make sure we don't have to bother about endianness inside the actual RTAS
      handlers, let's just convert the whole structure to host endian before we
      call our RTAS handlers and back to big endian when we return to the guest.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      b59d9d26
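
The convert-on-entry, convert-back-on-return idea described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct rtas_args_sketch`, `be32_swap()` and `rtas_args_swap()` are hypothetical names, and the structure layout is a simplified stand-in, not the kernel's actual definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the guest's RTAS argument buffer. */
struct rtas_args_sketch {
    uint32_t token;
    uint32_t nargs;
    uint32_t nret;
    uint32_t args[16];
};

/*
 * Read a 32-bit value as big endian. On a little endian host this swaps
 * the bytes; on a big endian host it is the identity. Either way it is
 * its own inverse, so it also converts host order back to big endian.
 */
static uint32_t be32_swap(uint32_t v)
{
    const uint8_t *b = (const uint8_t *)&v;

    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Convert every field of the structure in one pass. */
static void rtas_args_swap(struct rtas_args_sketch *a)
{
    size_t i;

    a->token = be32_swap(a->token);
    a->nargs = be32_swap(a->nargs);
    a->nret = be32_swap(a->nret);
    for (i = 0; i < sizeof(a->args) / sizeof(a->args[0]); i++)
        a->args[i] = be32_swap(a->args[i]);
}
```

Calling `rtas_args_swap()` once before the handler and once after it keeps the handler itself free of endianness concerns, which is the design choice the commit message argues for.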
    • A
      KVM: PPC: Book3S PR: PAPR: Access HTAB in big endian · 1692aa3f
      Alexander Graf committed
      The HTAB on PPC is always in big endian. When we access it via hypercalls
      on behalf of the guest and we're running on a little endian host, we need
      to make sure we swap the bytes accordingly.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      1692aa3f
    • A
      KVM: PPC: Book3S PR: Default to big endian guest · 94810ba4
      Alexander Graf committed
      The default MSR when user space does not define anything should be identical
      on little and big endian hosts, so remove MSR_LE from it.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      94810ba4
    • A
      KVM: PPC: Book3S_64 PR: Access shadow slb in big endian · 14a7d41d
      Alexander Graf committed
      The "shadow SLB" in the PACA is shared with the hypervisor, so it has to
      be big endian. We access the shadow SLB during world switch, so let's make
      sure we access it in big endian even when we're on a little endian host.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      14a7d41d
    • A
      KVM: PPC: Book3S_64 PR: Access HTAB in big endian · 4e509af9
      Alexander Graf committed
      The HTAB is always big endian. We access the guest's HTAB using
      copy_from/to_user, but don't yet take care of the fact that we might
      be running on an LE host.

      Wrap all accesses to the guest HTAB with big endian accessors.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      4e509af9
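
As a rough illustration of what a "big endian accessor" does, the sketch below reads and writes 64-bit big endian words from a byte buffer in a host-independent way. `load_be64()` and `store_be64()` are hypothetical helper names chosen for this example; the kernel has its own conversion helpers, which this sketch does not reproduce.

```c
#include <stdint.h>

/* Load a 64-bit big endian word from a byte buffer, on any host. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Store a host-order 64-bit value as a big endian word. */
static void store_be64(uint8_t *p, uint64_t v)
{
    int i;

    for (i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}
```

Because the helpers work byte by byte, the same code gives the same result on little and big endian hosts, which is the property the HTAB accesses need.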
    • A
      KVM: PPC: Book3S_32: PR: Access HTAB in big endian · 860540bc
      Alexander Graf committed
      The HTAB is always big endian. We access the guest's HTAB using
      copy_from/to_user, but don't yet take care of the fact that we might
      be running on an LE host.

      Wrap all accesses to the guest HTAB with big endian accessors.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      860540bc
    • A
      KVM: PPC: Book3S: PR: Fix C/R bit setting · 740f834e
      Alexander Graf committed
      Commit 9308ab8e made C/R HTAB updates go byte-wise into the target HTAB.
      However, it didn't update the guest's copy of the HTAB, but instead the
      host local copy of it.

      Write to the guest's HTAB instead.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      CC: Paul Mackerras <paulus@samba.org>
      Acked-by: Paul Mackerras <paulus@samba.org>
      740f834e
    • A
      KVM: PPC: BOOK3S: PR: Fix WARN_ON with debug options on · 7562c4fd
      Aneesh Kumar K.V committed
      With debug option "sleep inside atomic section checking" enabled we get
      the below WARN_ON during a PR KVM boot. This is because upstream now
      has PREEMPT_COUNT enabled even if we have preempt disabled. Fix the
      warning by adding preempt_disable/enable around floating point and altivec
      enable.
      
      WARNING: at arch/powerpc/kernel/process.c:156
      Modules linked in: kvm_pr kvm
      CPU: 1 PID: 3990 Comm: qemu-system-ppc Tainted: G        W     3.15.0-rc1+ #4
      task: c0000000eb85b3a0 ti: c0000000ec59c000 task.ti: c0000000ec59c000
      NIP: c000000000015c84 LR: d000000003334644 CTR: c000000000015c00
      REGS: c0000000ec59f140 TRAP: 0700   Tainted: G        W      (3.15.0-rc1+)
      MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 42000024  XER: 20000000
      CFAR: c000000000015c24 SOFTE: 1
      GPR00: d000000003334644 c0000000ec59f3c0 c000000000e2fa40 c0000000e2f80000
      GPR04: 0000000000000800 0000000000002000 0000000000000001 8000000000000000
      GPR08: 0000000000000001 0000000000000001 0000000000002000 c000000000015c00
      GPR12: d00000000333da18 c00000000fb80900 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000000 00003fffce4e0fa1
      GPR20: 0000000000000010 0000000000000001 0000000000000002 00000000100b9a38
      GPR24: 0000000000000002 0000000000000000 0000000000000000 0000000000000013
      GPR28: 0000000000000000 c0000000eb85b3a0 0000000000002000 c0000000e2f80000
      NIP [c000000000015c84] .enable_kernel_fp+0x84/0x90
      LR [d000000003334644] .kvmppc_handle_ext+0x134/0x190 [kvm_pr]
      Call Trace:
      [c0000000ec59f3c0] [0000000000000010] 0x10 (unreliable)
      [c0000000ec59f430] [d000000003334644] .kvmppc_handle_ext+0x134/0x190 [kvm_pr]
      [c0000000ec59f4c0] [d00000000324b380] .kvmppc_set_msr+0x30/0x50 [kvm]
      [c0000000ec59f530] [d000000003337cac] .kvmppc_core_emulate_op_pr+0x16c/0x5e0 [kvm_pr]
      [c0000000ec59f5f0] [d00000000324a944] .kvmppc_emulate_instruction+0x284/0xa80 [kvm]
      [c0000000ec59f6c0] [d000000003336888] .kvmppc_handle_exit_pr+0x488/0xb70 [kvm_pr]
      [c0000000ec59f790] [d000000003338d34] kvm_start_lightweight+0xcc/0xdc [kvm_pr]
      [c0000000ec59f960] [d000000003336288] .kvmppc_vcpu_run_pr+0xc8/0x190 [kvm_pr]
      [c0000000ec59f9f0] [d00000000324c880] .kvmppc_vcpu_run+0x30/0x50 [kvm]
      [c0000000ec59fa60] [d000000003249e74] .kvm_arch_vcpu_ioctl_run+0x54/0x1b0 [kvm]
      [c0000000ec59faf0] [d000000003244948] .kvm_vcpu_ioctl+0x478/0x760 [kvm]
      [c0000000ec59fcb0] [c000000000224e34] .do_vfs_ioctl+0x4d4/0x790
      [c0000000ec59fd90] [c000000000225148] .SyS_ioctl+0x58/0xb0
      [c0000000ec59fe30] [c00000000000a1e4] syscall_exit+0x0/0x98
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      7562c4fd
    • A
      KVM: PPC: BOOK3S: PR: Enable Little Endian PR guest · e5ee5422
      Aneesh Kumar K.V committed
      This patch makes sure we inherit the LE bit correctly in the different cases
      so that we can run a Little Endian distro in PR mode.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      e5ee5422
    • A
      KVM: PPC: E500: Add dcbtls emulation · 8f20a3ab
      Alexander Graf committed
      The dcbtls instruction is able to lock data inside the L1 cache.

      We don't want to give the guest actual access to hardware cache locks,
      as that could influence other VMs on the same system. But we can tell
      the guest that its locking attempt failed.

      By implementing the instruction we at least don't give the guest a
      program exception which it definitely does not expect.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      8f20a3ab
    • A
      KVM: PPC: E500: Ignore L1CSR1_ICFI,ICLFR · 07fec1c2
      Alexander Graf committed
      The L1 instruction cache control register contains bits that indicate
      that we're still handling a request. Mask those out when we set the SPR
      so that a read doesn't assume we're still doing something.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      07fec1c2
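
The masking described above might look roughly like the following. The bit values given for `L1CSR1_ICFI` and `L1CSR1_ICLFR` are assumptions made for this sketch and should be checked against the kernel headers; `l1csr1_sanitize()` is a hypothetical name.

```c
#include <stdint.h>

/* Assumed bit positions for this sketch; verify against the real headers. */
#define L1CSR1_ICFI  0x00000002u /* flash invalidate, reads back as "in progress" */
#define L1CSR1_ICLFR 0x00000100u /* lock bits flash reset, same behaviour */

/* Drop the "request still in progress" bits before storing the guest value. */
static uint32_t l1csr1_sanitize(uint32_t guest_val)
{
    return guest_val & ~(L1CSR1_ICFI | L1CSR1_ICLFR);
}
```

With the bits cleared at write time, a later guest read of the register sees the operation as already finished.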
  2. 29 March, 2014 7 commits
    • P
      KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8 · 72cde5a8
      Paul Mackerras committed
      Currently we save the host PMU configuration, counter values, etc.,
      when entering a guest, and restore it on return from the guest.
      (We have to do this because the guest has control of the PMU while
      it is executing.)  However, we missed saving/restoring the SIAR and
      SDAR registers, as well as the registers which are new on POWER8,
      namely SIER and MMCR2.
      
      This adds code to save the values of these registers when entering
      the guest and restore them on exit.  This also works around the bug
      in POWER8 where setting PMAE with a counter already negative doesn't
      generate an interrupt.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      72cde5a8
    • P
      KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset · c5fb80d3
      Paul Mackerras committed
      Commit c7699822bc21 ("KVM: PPC: Book3S HV: Make physical thread 0 do
      the MMU switching") reordered the guest entry/exit code so that most
      of the guest register save/restore code happened in guest MMU context.
      A side effect of that is that the timebase still contains the guest
      timebase value at the point where we compute and use vcpu->arch.dec_expires,
      and therefore that is now a guest timebase value rather than a host
      timebase value.  That in turn means that the timeouts computed in
      kvmppc_set_timer() are wrong if the timebase offset for the guest is
      non-zero.  The consequence of that is things such as "sleep 1" in a
      guest after migration may sleep for much longer than they should.
      
      This fixes the problem by converting between guest and host timebase
      values as necessary, by adding or subtracting the timebase offset.
      This also fixes an incorrect comment.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      c5fb80d3
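
The conversion between guest and host timebase values can be illustrated as below. The sketch assumes that the guest timebase equals the host timebase plus the per-guest offset; the function names are hypothetical and the arithmetic deliberately wraps modulo 2^64.

```c
#include <stdint.h>

/* Assumption for this sketch: guest_tb = host_tb + tb_offset (mod 2^64). */
static uint64_t host_to_guest_tb(uint64_t host_tb, uint64_t tb_offset)
{
    return host_tb + tb_offset;
}

/* Inverse conversion, used when an expiry time was computed in guest time. */
static uint64_t guest_to_host_tb(uint64_t guest_tb, uint64_t tb_offset)
{
    return guest_tb - tb_offset;
}
```

A decrementer expiry computed while the timebase register still holds the guest value would be passed through `guest_to_host_tb()` before being compared with host time, so that a non-zero offset no longer stretches the timeout.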
    • P
      KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode · 797f9c07
      Paul Mackerras committed
      With HV KVM, some high-frequency hypercalls such as H_ENTER are handled
      in real mode, and need to access the memslots array for the guest.
      Accessing the memslots array is safe, because we hold the SRCU read
      lock for the whole time that a guest vcpu is running.  However, the
      checks that kvm_memslots() does when lockdep is enabled are potentially
      unsafe in real mode, when only the linear mapping is available.
      Furthermore, kvm_memslots() can be called from a secondary CPU thread,
      which is an offline CPU from the point of view of the host kernel,
      and is not running the task which holds the SRCU read lock.
      
      To avoid false positives in the checks in kvm_memslots(), and to avoid
      possible side effects from doing the checks in real mode, this replaces
      kvm_memslots() with kvm_memslots_raw() in all the places that execute
      in real mode.  kvm_memslots_raw() is a new function that is like
      kvm_memslots() but uses rcu_dereference_raw_notrace() instead of
      kvm_dereference_check().
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      797f9c07
    • P
      KVM: PPC: Book3S HV: Return ENODEV error rather than EIO · 739e2425
      Paul Mackerras committed
      If an attempt is made to load the kvm-hv module on a machine which
      doesn't have hypervisor mode available, return an ENODEV error,
      which is the conventional thing to return to indicate that this
      module is not applicable to the hardware of the current machine,
      rather than EIO, which causes a warning to be printed.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      739e2425
    • P
      KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code · b24f36f3
      Paul Mackerras committed
      The in-kernel emulation of RTAS functions needs to read the argument
      buffer from guest memory in order to find out what function is being
      requested.  The guest supplies the guest physical address of the buffer,
      and on a real system the code that reads that buffer would run in guest
      real mode.  In guest real mode, the processor ignores the top 4 bits
      of the address specified in load and store instructions.  In order to
      emulate that behaviour correctly, we need to mask off those bits
      before calling kvm_read_guest() or kvm_write_guest().  This adds that
      masking.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      b24f36f3
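
Masking off the top 4 bits of a 64-bit guest physical address amounts to a single AND. The sketch below shows the idea; the macro and function names are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Keep the low 60 bits; the top 4 bits are ignored in guest real mode. */
#define REAL_MODE_ADDR_MASK 0x0fffffffffffffffULL

static uint64_t guest_real_mode_addr(uint64_t addr)
{
    return addr & REAL_MODE_ADDR_MASK;
}
```

The masked value is what would then be handed to the guest memory read or write routine, so that an address such as `0xc000000000001000` resolves to guest physical `0x1000`.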
    • M
      KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state · a7d80d01
      Michael Neuling committed
      This adds code to get/set_one_reg to read and write the new transactional
      memory (TM) state.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      a7d80d01
    • M
      KVM: PPC: Book3S HV: Add transactional memory support · e4e38121
      Michael Neuling committed
      This adds saving of the transactional memory (TM) checkpointed state
      on guest entry and exit.  We only do this if we see that the guest has
      an active transaction.
      
      It also adds emulation of the TM state changes when delivering IRQs
      into the guest.  According to the architecture, if we are
      transactional when an IRQ occurs, the TM state is changed to
      suspended, otherwise it's left unchanged.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Scott Wood <scottwood@freescale.com>
      e4e38121
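
The TM state rule stated above (transactional becomes suspended on interrupt delivery, anything else is left alone) can be written as a small state function. The enum and function names are hypothetical; the real code operates on bits in the guest MSR rather than on an enum.

```c
/* Hypothetical model of the three TM states a vcpu can be in. */
enum tm_state_sketch {
    TM_NON_TRANSACTIONAL,
    TM_SUSPENDED,
    TM_TRANSACTIONAL
};

/* State the guest should observe after an interrupt has been delivered. */
static enum tm_state_sketch tm_state_on_irq(enum tm_state_sketch s)
{
    return s == TM_TRANSACTIONAL ? TM_SUSPENDED : s;
}
```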
  3. 26 March, 2014 3 commits
  4. 20 March, 2014 2 commits
    • S
      powerpc/booke64: Use SPRG_TLB_EXFRAME on bolted handlers · a3dc6207
      Scott Wood committed
      While bolted handlers (including e6500) do not need to deal with a TLB
      miss recursively causing another TLB miss, nested TLB misses can still
      happen with crit/mc/debug exceptions -- so we still need to honor
      SPRG_TLB_EXFRAME.
      
      We don't need to spend time modifying it in the TLB miss fastpath,
      though -- the special level exception will handle that.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: kvm-ppc@vger.kernel.org
      a3dc6207
    • S
      powerpc/booke64: Use SPRG7 for VDSO · 9d378dfa
      Scott Wood committed
      Previously SPRG3 was marked for use by both VDSO and critical
      interrupts (though critical interrupts were not fully implemented).
      
      In commit 8b64a9df ("powerpc/booke64:
      Use SPRG0/3 scratch for bolted TLB miss & crit int"), Mihai Caraman
      made an attempt to resolve this conflict by restoring the VDSO value
      early in the critical interrupt, but this has some issues:
      
       - It's incompatible with EXCEPTION_COMMON which restores r13 from the
         by-then-overwritten scratch (this cost me some debugging time).
       - It forces critical exceptions to be a special case handled
         differently from even machine check and debug level exceptions.
       - It didn't occur to me that it was possible to make this work at all
         (by doing a final "ld r13, PACA_EXCRIT+EX_R13(r13)") until after
         I made (most of) this patch. :-)
      
      It might be worth investigating using a load rather than SPRG on return
      from all exceptions (except TLB misses where the scratch never leaves
      the SPRG) -- it could save a few cycles.  Until then, let's stick with
      SPRG for all exceptions.
      
      Since we cannot use SPRG4-7 for scratch without corrupting the state of
      a KVM guest, move VDSO to SPRG7 on book3e.  Since neither SPRG4-7 nor
      critical interrupts exist on book3s, SPRG3 is still used for VDSO
      there.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: kvm-ppc@vger.kernel.org
      9d378dfa
  5. 13 March, 2014 2 commits
  6. 27 January, 2014 15 commits
    • P
      KVM: PPC: Book3S PR: Cope with doorbell interrupts · 40688909
      Paul Mackerras committed
      When the PR host is running on a POWER8 machine in POWER8 mode, it
      will use doorbell interrupts for IPIs.  If one of them arrives while
      we are in the guest, we pop out of the guest with trap number 0xA00,
      which isn't handled by kvmppc_handle_exit_pr, leading to the following
      BUG_ON:
      
      [  331.436215] exit_nr=0xa00 | pc=0x1d2c | msr=0x800000000000d032
      [  331.437522] ------------[ cut here ]------------
      [  331.438296] kernel BUG at arch/powerpc/kvm/book3s_pr.c:982!
      [  331.439063] Oops: Exception in kernel mode, sig: 5 [#2]
      [  331.439819] SMP NR_CPUS=1024 NUMA pSeries
      [  331.440552] Modules linked in: tun nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw virtio_net kvm binfmt_misc ibmvscsi scsi_transport_srp scsi_tgt virtio_blk
      [  331.447614] CPU: 11 PID: 1296 Comm: qemu-system-ppc Tainted: G      D      3.11.7-200.2.fc19.ppc64p7 #1
      [  331.448920] task: c0000003bdc8c000 ti: c0000003bd32c000 task.ti: c0000003bd32c000
      [  331.450088] NIP: d0000000025d6b9c LR: d0000000025d6b98 CTR: c0000000004cfdd0
      [  331.451042] REGS: c0000003bd32f420 TRAP: 0700   Tainted: G      D       (3.11.7-200.2.fc19.ppc64p7)
      [  331.452331] MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI>  CR: 28004824  XER: 20000000
      [  331.454616] SOFTE: 1
      [  331.455106] CFAR: c000000000848bb8
      [  331.455726]
      GPR00: d0000000025d6b98 c0000003bd32f6a0 d0000000026017b8 0000000000000032
      GPR04: c0000000018627f8 c000000001873208 320d0a3030303030 3030303030643033
      GPR08: c000000000c490a8 0000000000000000 0000000000000000 0000000000000002
      GPR12: 0000000028004822 c00000000fdc6300 0000000000000000 00000100076ec310
      GPR16: 000000002ae343b8 00003ffffd397398 0000000000000000 0000000000000000
      GPR20: 00000100076f16f4 00000100076ebe60 0000000000000008 ffffffffffffffff
      GPR24: 0000000000000000 0000008001041e60 0000000000000000 0000008001040ce8
      GPR28: c0000003a2d80000 0000000000000a00 0000000000000001 c0000003a2681810
      [  331.466504] NIP [d0000000025d6b9c] .kvmppc_handle_exit_pr+0x75c/0xa80 [kvm]
      [  331.466999] LR [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm]
      [  331.467517] Call Trace:
      [  331.467909] [c0000003bd32f6a0] [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] (unreliable)
      [  331.468553] [c0000003bd32f750] [d0000000025d98f0] kvm_start_lightweight+0xb4/0xc4 [kvm]
      [  331.469189] [c0000003bd32f920] [d0000000025d7648] .kvmppc_vcpu_run_pr+0xd8/0x270 [kvm]
      [  331.469838] [c0000003bd32f9c0] [d0000000025cf748] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
      [  331.470790] [c0000003bd32fa50] [d0000000025cc19c] .kvm_arch_vcpu_ioctl_run+0x5c/0x1b0 [kvm]
      [  331.471401] [c0000003bd32fae0] [d0000000025c4888] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
      [  331.472026] [c0000003bd32fc90] [c00000000026192c] .do_vfs_ioctl+0x4dc/0x7a0
      [  331.472561] [c0000003bd32fd80] [c000000000261cc4] .SyS_ioctl+0xd4/0xf0
      [  331.473095] [c0000003bd32fe30] [c000000000009ed8] syscall_exit+0x0/0x98
      [  331.473633] Instruction dump:
      [  331.473766] 4bfff9b4 2b9d0800 419efc18 60000000 60420000 3d220000 e8bf11a0 e8df12a8
      [  331.474733] 7fa4eb78 e8698660 48015165 e8410028 <0fe00000> 813f00e4 3ba00000 39290001
      [  331.475386] ---[ end trace 49fc47d994c1f8f2 ]---
      [  331.479817]
      
      This fixes the problem by making kvmppc_handle_exit_pr() recognize the
      interrupt.  We also need to jump to the doorbell interrupt handler in
      book3s_segment.S to handle the interrupt on the way out of the guest.
      Having done that, there's nothing further to be done in
      kvmppc_handle_exit_pr().
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      40688909
    • M
      KVM: PPC: Book3S HV: Add new state for transactional memory · 7b490411
      Michael Neuling committed
      Add new state for transactional memory (TM) to kvm_vcpu_arch.  Also add
      asm-offset bits that are going to be required.
      
      This also moves the existing TFHAR, TFIAR and TEXASR SPRs into a
      CONFIG_PPC_TRANSACTIONAL_MEM section.  This requires some code changes to
      ensure we still compile with CONFIG_PPC_TRANSACTIONAL_MEM=N.  Many of the
      added #ifdefs are removed in a later patch when the bulk of the TM code is
      added.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      [agraf: fix merge conflict]
      Signed-off-by: Alexander Graf <agraf@suse.de>
      7b490411
    • A
      KVM: PPC: Book3S HV: Basic little-endian guest support · d682916a
      Anton Blanchard committed
      We create a guest MSR from scratch when delivering exceptions in
      a few places.  Instead of extracting LPCR[ILE] and inserting it
      into MSR_LE each time, we simply create a new variable intr_msr which
      contains the entire MSR to use.  For a little-endian guest, userspace
      needs to set the ILE (interrupt little-endian) bit in the LPCR for
      each vcpu (or at least one vcpu in each virtual core).
      
      [paulus@samba.org - removed H_SET_MODE implementation from original
      version of the patch, and made kvmppc_set_lpcr update vcpu->arch.intr_msr.]
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      d682916a
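
The idea of computing the interrupt MSR once, whenever the LPCR changes, instead of re-deriving the LE bit at every exception delivery, can be sketched as follows. The bit positions and the choice of base MSR bits (SF and ME) are assumptions made for illustration, and `compute_intr_msr()` is a hypothetical name.

```c
#include <stdint.h>

/* Assumed bit positions for this sketch; verify against the real headers. */
#define SK_MSR_SF   (1ULL << 63) /* 64-bit mode */
#define SK_MSR_ME   (1ULL << 12) /* machine check enable */
#define SK_MSR_LE   (1ULL << 0)  /* little endian */
#define SK_LPCR_ILE (1ULL << 25) /* interrupt little endian */

/* Derive the MSR used for interrupt delivery from the current LPCR value. */
static uint64_t compute_intr_msr(uint64_t lpcr)
{
    uint64_t msr = SK_MSR_SF | SK_MSR_ME;

    if (lpcr & SK_LPCR_ILE)
        msr |= SK_MSR_LE;
    return msr;
}
```

The result would be cached per vcpu and refreshed only when userspace updates the LPCR, so the exception paths can load it directly.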
    • P
      KVM: PPC: Book3S HV: Add support for DABRX register on POWER7 · 8563bf52
      Paul Mackerras committed
      The DABRX (DABR extension) register on POWER7 processors provides finer
      control over which accesses cause a data breakpoint interrupt.  It
      contains 3 bits which indicate whether to enable accesses in user,
      kernel and hypervisor modes respectively to cause data breakpoint
      interrupts, plus one bit that enables both real mode and virtual mode
      accesses to cause interrupts.  Currently, KVM sets DABRX to allow
      both kernel and user accesses to cause interrupts while in the guest.
      
      This adds support for the guest to specify other values for DABRX.
      PAPR defines a H_SET_XDABR hcall to allow the guest to set both DABR
      and DABRX with one call.  This adds a real-mode implementation of
      H_SET_XDABR, which shares most of its code with the existing H_SET_DABR
      implementation.  To support this, we add a per-vcpu field to store the
      DABRX value plus code to get and set it via the ONE_REG interface.
      
      For Linux guests to use this new hcall, userspace needs to add
      "hcall-xdabr" to the set of strings in the /chosen/hypertas-functions
      property in the device tree.  If userspace does this and then migrates
      the guest to a host where the kernel doesn't include this patch, then
      userspace will need to implement H_SET_XDABR by writing the specified
      DABR value to the DABR using the ONE_REG interface.  In that case, the
      old kernel will set DABRX to DABRX_USER | DABRX_KERNEL.  That should
      still work correctly, at least for Linux guests, since Linux guests
      cope with getting data breakpoint interrupts in modes that weren't
      requested by just ignoring the interrupt, and Linux guests never set
      DABRX_BTI.
      
      The other thing this does is to make H_SET_DABR and H_SET_XDABR work
      on POWER8, which has the DAWR and DAWRX instead of DABR/X.  Guests that
      know about POWER8 should use H_SET_MODE rather than H_SET_[X]DABR, but
      guests running in POWER7 compatibility mode will still use H_SET_[X]DABR.
      For them, this adds the logic to convert DABR/X values into DAWR/X values
      on POWER8.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      8563bf52
    • P
      KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells · 5d00f66b
      Paul Mackerras committed
      POWER8 has support for hypervisor doorbell interrupts.  Though the
      kernel doesn't use them for IPIs on the powernv platform yet, it
      probably will in future, so this makes KVM cope gracefully if a
      hypervisor doorbell interrupt arrives while in a guest.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      5d00f66b
    • P
      KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8 · e0622bd9
      Paul Mackerras committed
      POWER8 has a bit in the LPCR to enable or disable the PURR and SPURR
      registers to count when in the guest.  Set this bit.

      POWER8 has a field in the LPCR called AIL (Alternate Interrupt Location)
      which is used to enable relocation-on interrupts.  Allow userspace to
      set this field.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      e0622bd9
    • P
      KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs · aa31e843
      Paul Mackerras committed
      * SRR1 wake reason field for system reset interrupt on wakeup from nap
        is now a 4-bit field on P8, compared to 3 bits on P7.
      
      * Set PECEDP in LPCR when napping because of H_CEDE so guest doorbells
        will wake us up.
      
      * Waking up from nap because of a guest doorbell interrupt is not a
        reason to exit the guest.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      aa31e843
    • P
      KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap · e3bbbbfa
      Paul Mackerras committed
      Currently in book3s_hv_rmhandlers.S we have three places where we
      have woken up from nap mode and we check the reason field in SRR1
      to see what event woke us up.  This consolidates them into a new
      function, kvmppc_check_wake_reason.  It looks at the wake reason
      field in SRR1, and if it indicates that an external interrupt caused
      the wakeup, calls kvmppc_read_intr to check what sort of interrupt
      it was.
      
      This also consolidates the two places where we synthesize an external
      interrupt (0x500 vector) for the guest.  Now, if the guest exit code
      finds that there was an external interrupt which has been handled
      (i.e. it was an IPI indicating that there is now an interrupt pending
      for the guest), it jumps to deliver_guest_interrupt, which is in the
      last part of the guest entry code, where we synthesize guest external
      and decrementer interrupts.  That code has been streamlined a little
      and now clears LPCR[MER] when appropriate as well as setting it.
      
      The extra clearing of any pending IPI on a secondary, offline CPU
      thread before going back to nap mode has been removed.  It is no longer
      necessary now that we have code to read and acknowledge IPIs in the
      guest exit path.
      
      This fixes a minor bug in the H_CEDE real-mode handling - previously,
      if we found that other threads were already exiting the guest when we
      were about to go to nap mode, we would branch to the cede wakeup path
      and end up looking in SRR1 for a wakeup reason.  Now we branch to a
      point after we have checked the wakeup reason.
      
      This also fixes a minor bug in kvmppc_read_intr - previously it could
      return 0xff rather than 1, in the case where we find that a host IPI
      is pending after we have cleared the IPI.  Now it returns 1.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      e3bbbbfa
    • P
      KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8 · 5557ae0e
      Paul Mackerras committed
      This allows us to select architecture 2.05 (POWER6) or 2.06 (POWER7)
      compatibility modes on a POWER8 processor.  (Note that transactional
      memory is disabled for usermode if either or both of the PCR_TM_DIS
      and PCR_ARCH_206 bits are set.)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      5557ae0e
    • M
      KVM: PPC: Book3S HV: Add handler for HV facility unavailable · bd3048b8
      Michael Ellerman committed
      At present this should never happen, since the host kernel sets
      HFSCR to allow access to all facilities.  It's better to be prepared
      to handle it cleanly if it does ever happen, though.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      bd3048b8
    • P
      KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8 · ca252055
      Paul Mackerras committed
      POWER8 has 512 sets in the TLB, compared to 128 for POWER7, so we need
      to do more tlbiel instructions when flushing the TLB on POWER8.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      ca252055
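
A per-set flush loop that picks its iteration count by processor generation could look like the sketch below. Only the set counts (128 and 512) come from the commit message; the callback type and all names are hypothetical, and the callback stands in for the one-set invalidate that the real code issues per iteration.

```c
/* Set counts as stated in the commit message. */
#define SK_POWER7_TLB_SETS 128u
#define SK_POWER8_TLB_SETS 512u

/* Hypothetical stand-in for invalidating one TLB set. */
typedef void (*flush_set_fn)(unsigned int set, void *ctx);

/* Example callback that only counts how many sets were visited. */
static void count_flush(unsigned int set, void *ctx)
{
    (void)set;
    ++*(unsigned int *)ctx;
}

/* Flush every set once; returns the number of sets flushed. */
static unsigned int flush_tlb_sets(int is_power8, flush_set_fn flush_one, void *ctx)
{
    unsigned int nsets = is_power8 ? SK_POWER8_TLB_SETS : SK_POWER7_TLB_SETS;
    unsigned int i;

    for (i = 0; i < nsets; i++)
        flush_one(i, ctx);
    return nsets;
}
```

Passing `count_flush` with a counter shows that the POWER8 path visits four times as many sets as the POWER7 path.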
    • M
      KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs · b005255e
      Michael Neuling committed
      This adds fields to the struct kvm_vcpu_arch to store the new
      guest-accessible SPRs on POWER8, adds code to the get/set_one_reg
      functions to allow userspace to access this state, and adds code to
      the guest entry and exit to context-switch these SPRs between host
      and guest.
      
      Note that DPDES (Directed Privileged Doorbell Exception State) is
      shared between threads on a core; hence we store it in struct
      kvmppc_vcore and have the master thread save and restore it.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      b005255e
    • P
      KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers · e0b7ec05
      Paul Mackerras committed
      On a threaded processor such as POWER7, we group VCPUs into virtual
      cores and arrange that the VCPUs in a virtual core run on the same
      physical core.  Currently we don't enforce any correspondence between
      virtual thread numbers within a virtual core and physical thread
      numbers.  Physical threads are allocated starting at 0 on a first-come
      first-served basis to runnable virtual threads (VCPUs).
      
      POWER8 implements a new "msgsndp" instruction which guest kernels can
      use to interrupt other threads in the same core or sub-core.  Since
      the instruction takes the destination physical thread ID as a parameter,
      it becomes necessary to align the physical thread IDs with the virtual
      thread IDs, that is, to make sure virtual thread N within a virtual
      core always runs on physical thread N.
      
      This means that it's possible that thread 0, which is where we call
      __kvmppc_vcore_entry, may end up running some other vcpu than the
      one whose task called kvmppc_run_core(), or it may end up running
      no vcpu at all, if for example thread 0 of the virtual core is
      currently executing in userspace.  However, we do need thread 0
      to be responsible for switching the MMU -- a previous version of
      this patch that had other threads switching the MMU was found to
      be responsible for occasional memory corruption and machine check
      interrupts in the guest on POWER7 machines.
      
      To accommodate this, we no longer pass the vcpu pointer to
      __kvmppc_vcore_entry, but instead let the assembly code load it from
      the PACA.  Since the assembly code will need to know the kvm pointer
      and the thread ID for threads which don't have a vcpu, we move the
      thread ID into the PACA and we add a kvm pointer to the virtual core
      structure.
      
      In the case where thread 0 has no vcpu to run, it still calls into
      kvmppc_hv_entry in order to do the MMU switch, and then naps until
      either its vcpu is ready to run in the guest, or some other thread
      needs to exit the guest.  In the latter case, thread 0 jumps to the
      code that switches the MMU back to the host.  This control flow means
      that now we switch the MMU before loading any guest vcpu state.
      Similarly, on guest exit we now save all the guest vcpu state before
      switching the MMU back to the host.  This has required substantial
      code movement, making the diff rather large.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      e0b7ec05
    • M
      KVM: PPC: Book3S HV: Don't set DABR on POWER8 · eee7ff9d
      Michael Neuling committed
      POWER8 doesn't have the DABR and DABRX registers; instead it has
      new DAWR/DAWRX registers, which will be handled in a later patch.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      eee7ff9d
    • S
      kvm/ppc: IRQ disabling cleanup · 6c85f52b
      Scott Wood committed
      Simplify the handling of lazy EE by going directly from fully-enabled
      to hard-disabled.  This replaces the lazy_irq_pending() check
      (including its misplaced kvm_guest_exit() call).
      
      As suggested by Tiejun Chen, move the interrupt disabling into
      kvmppc_prepare_to_enter() rather than have each caller do it.  Also
      move the IRQ enabling on heavyweight exit into
      kvmppc_prepare_to_enter().
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      6c85f52b