1. 11 2月, 2014 9 次提交
    • M
      powerpc/xmon: Don't loop forever in get_output_lock() · 730efb61
      Michael Ellerman 提交于
      If we enter with xmon_speaker != 0 we skip the first cmpxchg(), we also
      skip the while loop because xmon_speaker != last_speaker (0) - meaning we
      skip the second cmpxchg() also.
      
      Following that code path the compiler sees no memory barriers and so is
      within its rights to never reload xmon_speaker. The end result is we loop
      forever.
      
      This manifests as all cpus being in xmon ('c' command), but they refuse
      to take control when you switch to them ('c x' for cpu # x).
      
      I have seen this deadlock in practice and also checked the generated code to
      confirm this is what's happening.
      
      The simplest fix is just to always try the cmpxchg().
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      730efb61
    • A
      powerpc/perf: Configure BHRB filter before enabling PMU interrupts · b4d6c06c
      Anshuman Khandual 提交于
      Right now the config_bhrb() PMU specific call happens after
      write_mmcr0(), which actually enables the PMU for event counting and
      interrupts. So there is a small window of time where the PMU and BHRB
      runs without the required HW branch filter (if any) enabled in BHRB.
      
      This can cause some of the branch samples to be collected through BHRB
      without any filter applied and hence affects the correctness of
      the results. This patch moves the BHRB config function call before
      enabling interrupts.
      
      Here are some data points captured via trace prints which depicts how we
      could get PMU interrupts with BHRB filter NOT enabled with a standard
      perf record command line (asking for branch record information as well).
      
          $ perf record -j any_call ls
      
      Before the patch:-
      
          ls-1962  [003] d...  2065.299590: .perf_event_interrupt: MMCRA: 40000000000
          ls-1962  [003] d...  2065.299603: .perf_event_interrupt: MMCRA: 40000000000
          ...
      
          All the PMU interrupts before this point did not have the requested
          HW branch filter enabled in the MMCRA.
      
          ls-1962  [003] d...  2065.299647: .perf_event_interrupt: MMCRA: 40040000000
          ls-1962  [003] d...  2065.299662: .perf_event_interrupt: MMCRA: 40040000000
      
      After the patch:-
      
          ls-1850  [008] d...   190.311828: .perf_event_interrupt: MMCRA: 40040000000
          ls-1850  [008] d...   190.311848: .perf_event_interrupt: MMCRA: 40040000000
      
          All the PMU interrupts have the requested HW BHRB branch filter
          enabled in MMCRA.
      Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
      [mpe: Fixed up whitespace and cleaned up changelog]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b4d6c06c
    • M
      powerpc/pseries: Select ARCH_RANDOM on pseries · 8d4887ee
      Michael Ellerman 提交于
      We have a driver for the ARCH_RANDOM hook in rng.c, so we should select
      ARCH_RANDOM on pseries.
      
      Without this the build breaks if you turn ARCH_RANDOM off.
      
      This hasn't broken the build because pseries_defconfig doesn't specify a
      value for PPC_POWERNV, which is default y, and selects ARCH_RANDOM.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8d4887ee
    • M
    • L
      powerpc/relocate fix relocate processing in LE mode · 3b830c82
      Laurent Dufour 提交于
      Relocation's code is not working in little endian mode because the r_info
      field, which is a 64 bits value, should be read from the right offset.
      
      The current code is optimized to read the r_info field as a 32 bits value
      starting at the middle of the double word (offset 12). When running in LE
      mode, the read value is not correct since only the MSB is read.
      
      This patch removes this optimization which consist to deal with a 32 bits
      value instead of a 64 bits one. This way it works in big and little endian
      mode.
      Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3b830c82
    • M
      powerpc: Fix kdump hang issue on p8 with relocation on exception enabled. · 429d2e83
      Mahesh Salgaonkar 提交于
      On p8 systems, with relocation on exception feature enabled we are seeing
      kdump kernel hang at interrupt vector 0xc*4400. The reason is, with this
      feature enabled, exception are raised with MMU (IR=DR=1) ON with the
      default offset of 0xc*4000. Since exception is raised in virtual mode it
      requires the vector region to be executable without which it fails to
      fetch and execute instruction at 0xc*4xxx. For default kernel since kernel
      is loaded at real 0, the htab mappings sets the entire kernel text region
      executable. But for relocatable kernel (e.g. kdump case) we only copy
      interrupt vectors down to real 0 and never marked that region as
      executable because in p7 and below we always get exception in real mode.
      
      This patch fixes this issue by marking htab mapping range as executable
      that overlaps with the interrupt vector region for relocatable kernel.
      
      Thanks to Ben who helped me to debug this issue and find the root cause.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      429d2e83
    • M
      powerpc/pseries: Disable relocation on exception while going down during crash. · 3ec8b78f
      Mahesh Salgaonkar 提交于
      Disable relocation on exception while going down even in kdump case. This
      is because we are about clear htab mappings while kexec-ing into kdump
      kernel and we may run into issues if we still have AIL ON.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3ec8b78f
    • T
      powerpc/eeh: Drop taken reference to driver on eeh_rmv_device · 8cc6b6cd
      Thadeu Lima de Souza Cascardo 提交于
      Commit f5c57710 ("powerpc/eeh: Use
      partial hotplug for EEH unaware drivers") introduces eeh_rmv_device,
      which may grab a reference to a driver, but not release it.
      
      That prevents a driver from being removed after it has gone through EEH
      recovery.
      
      This patch drops the reference if it was taken.
      Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Acked-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8cc6b6cd
    • P
      powerpc: Fix build failure in sysdev/mpic.c for MPIC_WEIRD=y · 0215b4aa
      Paul Gortmaker 提交于
      Commit 446f6d06 ("powerpc/mpic: Properly
      set default triggers") breaks the mpc7447_hpc_defconfig as follows:
      
        CC      arch/powerpc/sysdev/mpic.o
      arch/powerpc/sysdev/mpic.c: In function 'mpic_set_irq_type':
      arch/powerpc/sysdev/mpic.c:886:9: error: case label does not reduce to an integer constant
      arch/powerpc/sysdev/mpic.c:890:9: error: case label does not reduce to an integer constant
      arch/powerpc/sysdev/mpic.c:894:9: error: case label does not reduce to an integer constant
      arch/powerpc/sysdev/mpic.c:898:9: error: case label does not reduce to an integer constant
      
      Looking at the cpp output (gcc 4.7.3), I see:
      
         case mpic->hw_set[MPIC_IDX_VECPRI_SENSE_EDGE] |
              mpic->hw_set[MPIC_IDX_VECPRI_POLARITY_POSITIVE]:
      
      The pointer into an array appears because CONFIG_MPIC_WEIRD=y is set
      for this platform, thus enabling the following:
      
        -------------------
        #ifdef CONFIG_MPIC_WEIRD
        static u32 mpic_infos[][MPIC_IDX_END] = {
              [0] = { /* Original OpenPIC compatible MPIC */
      
        [...]
      
        #define MPIC_INFO(name) mpic->hw_set[MPIC_IDX_##name]
      
        #else /* CONFIG_MPIC_WEIRD */
      
        #define MPIC_INFO(name) MPIC_##name
      
        #endif /* CONFIG_MPIC_WEIRD */
        -------------------
      
      Here we convert the case section to if/else if, and also add
      the equivalent of a default case to warn about unknown types.
      Boot tested on sbc8548, build tested on all defconfigs.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0215b4aa
  2. 29 1月, 2014 12 次提交
  3. 28 1月, 2014 2 次提交
  4. 27 1月, 2014 17 次提交
    • P
      KVM: PPC: Book3S PR: Cope with doorbell interrupts · 40688909
      Paul Mackerras 提交于
      When the PR host is running on a POWER8 machine in POWER8 mode, it
      will use doorbell interrupts for IPIs.  If one of them arrives while
      we are in the guest, we pop out of the guest with trap number 0xA00,
      which isn't handled by kvmppc_handle_exit_pr, leading to the following
      BUG_ON:
      
      [  331.436215] exit_nr=0xa00 | pc=0x1d2c | msr=0x800000000000d032
      [  331.437522] ------------[ cut here ]------------
      [  331.438296] kernel BUG at arch/powerpc/kvm/book3s_pr.c:982!
      [  331.439063] Oops: Exception in kernel mode, sig: 5 [#2]
      [  331.439819] SMP NR_CPUS=1024 NUMA pSeries
      [  331.440552] Modules linked in: tun nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw virtio_net kvm binfmt_misc ibmvscsi scsi_transport_srp scsi_tgt virtio_blk
      [  331.447614] CPU: 11 PID: 1296 Comm: qemu-system-ppc Tainted: G      D      3.11.7-200.2.fc19.ppc64p7 #1
      [  331.448920] task: c0000003bdc8c000 ti: c0000003bd32c000 task.ti: c0000003bd32c000
      [  331.450088] NIP: d0000000025d6b9c LR: d0000000025d6b98 CTR: c0000000004cfdd0
      [  331.451042] REGS: c0000003bd32f420 TRAP: 0700   Tainted: G      D       (3.11.7-200.2.fc19.ppc64p7)
      [  331.452331] MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI>  CR: 28004824  XER: 20000000
      [  331.454616] SOFTE: 1
      [  331.455106] CFAR: c000000000848bb8
      [  331.455726]
      GPR00: d0000000025d6b98 c0000003bd32f6a0 d0000000026017b8 0000000000000032
      GPR04: c0000000018627f8 c000000001873208 320d0a3030303030 3030303030643033
      GPR08: c000000000c490a8 0000000000000000 0000000000000000 0000000000000002
      GPR12: 0000000028004822 c00000000fdc6300 0000000000000000 00000100076ec310
      GPR16: 000000002ae343b8 00003ffffd397398 0000000000000000 0000000000000000
      GPR20: 00000100076f16f4 00000100076ebe60 0000000000000008 ffffffffffffffff
      GPR24: 0000000000000000 0000008001041e60 0000000000000000 0000008001040ce8
      GPR28: c0000003a2d80000 0000000000000a00 0000000000000001 c0000003a2681810
      [  331.466504] NIP [d0000000025d6b9c] .kvmppc_handle_exit_pr+0x75c/0xa80 [kvm]
      [  331.466999] LR [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm]
      [  331.467517] Call Trace:
      [  331.467909] [c0000003bd32f6a0] [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] (unreliable)
      [  331.468553] [c0000003bd32f750] [d0000000025d98f0] kvm_start_lightweight+0xb4/0xc4 [kvm]
      [  331.469189] [c0000003bd32f920] [d0000000025d7648] .kvmppc_vcpu_run_pr+0xd8/0x270 [kvm]
      [  331.469838] [c0000003bd32f9c0] [d0000000025cf748] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
      [  331.470790] [c0000003bd32fa50] [d0000000025cc19c] .kvm_arch_vcpu_ioctl_run+0x5c/0x1b0 [kvm]
      [  331.471401] [c0000003bd32fae0] [d0000000025c4888] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
      [  331.472026] [c0000003bd32fc90] [c00000000026192c] .do_vfs_ioctl+0x4dc/0x7a0
      [  331.472561] [c0000003bd32fd80] [c000000000261cc4] .SyS_ioctl+0xd4/0xf0
      [  331.473095] [c0000003bd32fe30] [c000000000009ed8] syscall_exit+0x0/0x98
      [  331.473633] Instruction dump:
      [  331.473766] 4bfff9b4 2b9d0800 419efc18 60000000 60420000 3d220000 e8bf11a0 e8df12a8
      [  331.474733] 7fa4eb78 e8698660 48015165 e8410028 <0fe00000> 813f00e4 3ba00000 39290001
      [  331.475386] ---[ end trace 49fc47d994c1f8f2 ]---
      [  331.479817]
      
      This fixes the problem by making kvmppc_handle_exit_pr() recognize the
      interrupt.  We also need to jump to the doorbell interrupt handler in
      book3s_segment.S to handle the interrupt on the way out of the guest.
      Having done that, there's nothing further to be done in
      kvmppc_handle_exit_pr().
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      40688909
    • M
      KVM: PPC: Book3S HV: Add software abort codes for transactional memory · b17dfec0
      Michael Neuling 提交于
      This adds the software abort code defines for transactional memory (TM).
      These values are from PAPR.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      b17dfec0
    • M
      KVM: PPC: Book3S HV: Add new state for transactional memory · 7b490411
      Michael Neuling 提交于
      Add new state for transactional memory (TM) to kvm_vcpu_arch.  Also add
      asm-offset bits that are going to be required.
      
      This also moves the existing TFHAR, TFIAR and TEXASR SPRs into a
      CONFIG_PPC_TRANSACTIONAL_MEM section.  This requires some code changes to
      ensure we still compile with CONFIG_PPC_TRANSACTIONAL_MEM=N.  Much of the added
      the added #ifdefs are removed in a later patch when the bulk of the TM code is
      added.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix merge conflict]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      7b490411
    • M
      powerpc/Kconfig: Make TM select VSX and VMX · 7b37a123
      Michael Neuling 提交于
      There are no processors in existence that have TM but no VMX or VSX.  So let's
      makes CONFIG_PPC_TRANSACTIONAL_MEM select both CONFIG_VSX and CONFIG_ALTIVEC.
      This makes the code a lot simpler by removing the need for a bunch of #ifdefs.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      7b37a123
    • A
      KVM: PPC: Book3S HV: Basic little-endian guest support · d682916a
      Anton Blanchard 提交于
      We create a guest MSR from scratch when delivering exceptions in
      a few places.  Instead of extracting LPCR[ILE] and inserting it
      into MSR_LE each time, we simply create a new variable intr_msr which
      contains the entire MSR to use.  For a little-endian guest, userspace
      needs to set the ILE (interrupt little-endian) bit in the LPCR for
      each vcpu (or at least one vcpu in each virtual core).
      
      [paulus@samba.org - removed H_SET_MODE implementation from original
      version of the patch, and made kvmppc_set_lpcr update vcpu->arch.intr_msr.]
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      d682916a
    • P
      KVM: PPC: Book3S HV: Add support for DABRX register on POWER7 · 8563bf52
      Paul Mackerras 提交于
      The DABRX (DABR extension) register on POWER7 processors provides finer
      control over which accesses cause a data breakpoint interrupt.  It
      contains 3 bits which indicate whether to enable accesses in user,
      kernel and hypervisor modes respectively to cause data breakpoint
      interrupts, plus one bit that enables both real mode and virtual mode
      accesses to cause interrupts.  Currently, KVM sets DABRX to allow
      both kernel and user accesses to cause interrupts while in the guest.
      
      This adds support for the guest to specify other values for DABRX.
      PAPR defines a H_SET_XDABR hcall to allow the guest to set both DABR
      and DABRX with one call.  This adds a real-mode implementation of
      H_SET_XDABR, which shares most of its code with the existing H_SET_DABR
      implementation.  To support this, we add a per-vcpu field to store the
      DABRX value plus code to get and set it via the ONE_REG interface.
      
      For Linux guests to use this new hcall, userspace needs to add
      "hcall-xdabr" to the set of strings in the /chosen/hypertas-functions
      property in the device tree.  If userspace does this and then migrates
      the guest to a host where the kernel doesn't include this patch, then
      userspace will need to implement H_SET_XDABR by writing the specified
      DABR value to the DABR using the ONE_REG interface.  In that case, the
      old kernel will set DABRX to DABRX_USER | DABRX_KERNEL.  That should
      still work correctly, at least for Linux guests, since Linux guests
      cope with getting data breakpoint interrupts in modes that weren't
      requested by just ignoring the interrupt, and Linux guests never set
      DABRX_BTI.
      
      The other thing this does is to make H_SET_DABR and H_SET_XDABR work
      on POWER8, which has the DAWR and DAWRX instead of DABR/X.  Guests that
      know about POWER8 should use H_SET_MODE rather than H_SET_[X]DABR, but
      guests running in POWER7 compatibility mode will still use H_SET_[X]DABR.
      For them, this adds the logic to convert DABR/X values into DAWR/X values
      on POWER8.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8563bf52
    • P
      KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells · 5d00f66b
      Paul Mackerras 提交于
      POWER8 has support for hypervisor doorbell interrupts.  Though the
      kernel doesn't use them for IPIs on the powernv platform yet, it
      probably will in future, so this makes KVM cope gracefully if a
      hypervisor doorbell interrupt arrives while in a guest.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5d00f66b
    • P
      KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8 · e0622bd9
      Paul Mackerras 提交于
      POWER8 has a bit in the LPCR to enable or disable the PURR and SPURR
      registers to count when in the guest.  Set this bit.
      
      POWER8 has a field in the LPCR called AIL (Alternate Interrupt Location)
      which is used to enable relocation-on interrupts.  Allow userspace to
      set this field.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      e0622bd9
    • P
      KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs · aa31e843
      Paul Mackerras 提交于
      * SRR1 wake reason field for system reset interrupt on wakeup from nap
        is now a 4-bit field on P8, compared to 3 bits on P7.
      
      * Set PECEDP in LPCR when napping because of H_CEDE so guest doorbells
        will wake us up.
      
      * Waking up from nap because of a guest doorbell interrupt is not a
        reason to exit the guest.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      aa31e843
    • P
      KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap · e3bbbbfa
      Paul Mackerras 提交于
      Currently in book3s_hv_rmhandlers.S we have three places where we
      have woken up from nap mode and we check the reason field in SRR1
      to see what event woke us up.  This consolidates them into a new
      function, kvmppc_check_wake_reason.  It looks at the wake reason
      field in SRR1, and if it indicates that an external interrupt caused
      the wakeup, calls kvmppc_read_intr to check what sort of interrupt
      it was.
      
      This also consolidates the two places where we synthesize an external
      interrupt (0x500 vector) for the guest.  Now, if the guest exit code
      finds that there was an external interrupt which has been handled
      (i.e. it was an IPI indicating that there is now an interrupt pending
      for the guest), it jumps to deliver_guest_interrupt, which is in the
      last part of the guest entry code, where we synthesize guest external
      and decrementer interrupts.  That code has been streamlined a little
      and now clears LPCR[MER] when appropriate as well as setting it.
      
      The extra clearing of any pending IPI on a secondary, offline CPU
      thread before going back to nap mode has been removed.  It is no longer
      necessary now that we have code to read and acknowledge IPIs in the
      guest exit path.
      
      This fixes a minor bug in the H_CEDE real-mode handling - previously,
      if we found that other threads were already exiting the guest when we
      were about to go to nap mode, we would branch to the cede wakeup path
      and end up looking in SRR1 for a wakeup reason.  Now we branch to a
      point after we have checked the wakeup reason.
      
      This also fixes a minor bug in kvmppc_read_intr - previously it could
      return 0xff rather than 1, in the case where we find that a host IPI
      is pending after we have cleared the IPI.  Now it returns 1.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      e3bbbbfa
    • P
      KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8 · 5557ae0e
      Paul Mackerras 提交于
      This allows us to select architecture 2.05 (POWER6) or 2.06 (POWER7)
      compatibility modes on a POWER8 processor.  (Note that transactional
      memory is disabled for usermode if either or both of the PCR_TM_DIS
      and PCR_ARCH_206 bits are set.)
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5557ae0e
    • M
      KVM: PPC: Book3S HV: Add handler for HV facility unavailable · bd3048b8
      Michael Ellerman 提交于
      At present this should never happen, since the host kernel sets
      HFSCR to allow access to all facilities.  It's better to be prepared
      to handle it cleanly if it does ever happen, though.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      bd3048b8
    • P
      KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8 · ca252055
      Paul Mackerras 提交于
      POWER8 has 512 sets in the TLB, compared to 128 for POWER7, so we need
      to do more tlbiel instructions when flushing the TLB on POWER8.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      ca252055
    • M
      KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs · b005255e
      Michael Neuling 提交于
      This adds fields to the struct kvm_vcpu_arch to store the new
      guest-accessible SPRs on POWER8, adds code to the get/set_one_reg
      functions to allow userspace to access this state, and adds code to
      the guest entry and exit to context-switch these SPRs between host
      and guest.
      
      Note that DPDES (Directed Privileged Doorbell Exception State) is
      shared between threads on a core; hence we store it in struct
      kvmppc_vcore and have the master thread save and restore it.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      b005255e
    • P
      KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers · e0b7ec05
      Paul Mackerras 提交于
      On a threaded processor such as POWER7, we group VCPUs into virtual
      cores and arrange that the VCPUs in a virtual core run on the same
      physical core.  Currently we don't enforce any correspondence between
      virtual thread numbers within a virtual core and physical thread
      numbers.  Physical threads are allocated starting at 0 on a first-come
      first-served basis to runnable virtual threads (VCPUs).
      
      POWER8 implements a new "msgsndp" instruction which guest kernels can
      use to interrupt other threads in the same core or sub-core.  Since
      the instruction takes the destination physical thread ID as a parameter,
      it becomes necessary to align the physical thread IDs with the virtual
      thread IDs, that is, to make sure virtual thread N within a virtual
      core always runs on physical thread N.
      
      This means that it's possible that thread 0, which is where we call
      __kvmppc_vcore_entry, may end up running some other vcpu than the
      one whose task called kvmppc_run_core(), or it may end up running
      no vcpu at all, if for example thread 0 of the virtual core is
      currently executing in userspace.  However, we do need thread 0
      to be responsible for switching the MMU -- a previous version of
      this patch that had other threads switching the MMU was found to
      be responsible for occasional memory corruption and machine check
      interrupts in the guest on POWER7 machines.
      
      To accommodate this, we no longer pass the vcpu pointer to
      __kvmppc_vcore_entry, but instead let the assembly code load it from
      the PACA.  Since the assembly code will need to know the kvm pointer
      and the thread ID for threads which don't have a vcpu, we move the
      thread ID into the PACA and we add a kvm pointer to the virtual core
      structure.
      
      In the case where thread 0 has no vcpu to run, it still calls into
      kvmppc_hv_entry in order to do the MMU switch, and then naps until
      either its vcpu is ready to run in the guest, or some other thread
      needs to exit the guest.  In the latter case, thread 0 jumps to the
      code that switches the MMU back to the host.  This control flow means
      that now we switch the MMU before loading any guest vcpu state.
      Similarly, on guest exit we now save all the guest vcpu state before
      switching the MMU back to the host.  This has required substantial
      code movement, making the diff rather large.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      e0b7ec05
    • M
      KVM: PPC: Book3S HV: Don't set DABR on POWER8 · eee7ff9d
      Michael Neuling 提交于
      POWER8 doesn't have the DABR and DABRX registers; instead it has
      new DAWR/DAWRX registers, which will be handled in a later patch.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      eee7ff9d
    • S
      kvm/ppc: IRQ disabling cleanup · 6c85f52b
      Scott Wood 提交于
      Simplify the handling of lazy EE by going directly from fully-enabled
      to hard-disabled.  This replaces the lazy_irq_pending() check
      (including its misplaced kvm_guest_exit() call).
      
      As suggested by Tiejun Chen, move the interrupt disabling into
      kvmppc_prepare_to_enter() rather than have each caller do it.  Also
      move the IRQ enabling on heavyweight exit into
      kvmppc_prepare_to_enter().
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      6c85f52b