1. 01 Sep, 2017 1 commit
    • P
      powerpc: Change analyse_instr so it doesn't modify *regs · 3cdfcbfd
      Paul Mackerras committed
      The analyse_instr function currently doesn't just work out what an
      instruction does, it also executes those instructions whose effect
      is only to update CPU registers that are stored in struct pt_regs.
      This is undesirable because optprobes uses analyse_instr to work out
      if an instruction could be successfully emulated in future.
      
      This changes analyse_instr so it doesn't modify *regs; instead it
      stores information in the instruction_op structure to indicate what
      registers (GPRs, CR, XER, LR) would be set and what value they would
      be set to.  A companion function called emulate_update_regs() can
      then use that information to update a pt_regs struct appropriately.
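
The split between analysing and applying can be illustrated with a minimal, self-contained sketch. All names here (`toy_regs`, `toy_op`, `toy_analyse`, `toy_apply`) are hypothetical stand-ins, far simpler than the kernel's `struct pt_regs`, `struct instruction_op`, `analyse_instr()` and `emulate_update_regs()`, and only a single add-immediate style operation is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified register file (stand-in for struct pt_regs). */
struct toy_regs {
	uint64_t gpr[32];
};

/* Hypothetical record of an instruction's effect (stand-in for struct instruction_op). */
struct toy_op {
	int      reg;	/* destination GPR */
	uint64_t val;	/* value that GPR would receive */
};

/* Analyse step: reads the registers but never writes them. */
static void toy_analyse(const struct toy_regs *regs, int rd, int ra,
			int64_t imm, struct toy_op *op)
{
	assert(rd >= 0 && rd < 32 && ra >= 0 && ra < 32);
	op->reg = rd;
	op->val = regs->gpr[ra] + (uint64_t)imm;
}

/* Apply step: commits the recorded effect to the register file. */
static void toy_apply(struct toy_regs *regs, const struct toy_op *op)
{
	regs->gpr[op->reg] = op->val;
}
```

A caller that only wants to know whether an instruction can be emulated would call the analyse step alone and leave the registers untouched.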
      
      As a minor cleanup, this replaces inline asm using the cntlzw and
      cntlzd instructions with calls to __builtin_clz() and __builtin_clzl().
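
As a rough illustration of the builtin-based approach, the sketch below wraps the GCC/Clang builtins in two hypothetical helpers. The explicit zero check is an assumption added here: `__builtin_clz()` is undefined for a zero argument, whereas `cntlzw` and `cntlzd` return the operand width. `__builtin_clzll()` is used instead of `__builtin_clzl()` so that the sketch also behaves on platforms where `long` is 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Leading zeros of a 32-bit value; returns 32 for zero, as cntlzw does. */
static unsigned int count_lz32(uint32_t x)
{
	/* __builtin_clz(0) is undefined, so handle zero explicitly. */
	return x ? (unsigned int)__builtin_clz(x) : 32;
}

/* Leading zeros of a 64-bit value; returns 64 for zero, as cntlzd does. */
static unsigned int count_lz64(uint64_t x)
{
	return x ? (unsigned int)__builtin_clzll(x) : 64;
}

/* Usage: the highest set bit of 1 is bit 0, leaving 31 leading zeros. */
static void count_lz_example(void)
{
	assert(count_lz32(1) == 31);
	assert(count_lz64(1) == 63);
}
```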
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 31 Aug, 2017 18 commits
    • P
      powerpc: Correct instruction code for xxlor instruction · 93b2d3cf
      Paul Mackerras committed
      The instruction code for xxlor that commit 0016a4cf ("powerpc:
      Emulate most Book I instructions in emulate_step()", 2010-06-15)
      added is actually the code for xxlnor.  It is used in get_vsr()
      and put_vsr() and the effect of the error is that if emulate_step
      is used to emulate a VSX load or store from any register other
      than vsr0, the bitwise complement of the correct value will be
      loaded or stored.  This corrects the error.
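
A small sketch of why the wrong opcode yields the bitwise complement, assuming the helpers use the instruction as a register-to-register move with the same source for both operands (an assumption; the helper bodies are not shown here). The function names are hypothetical, and plain 64-bit integers stand in for 128-bit VSX registers.

```c
#include <assert.h>
#include <stdint.h>

/* OR of a value with itself: a plain copy. */
static uint64_t copy_via_or(uint64_t x)
{
	return x | x;
}

/* NOR of a value with itself: the bitwise complement of the value. */
static uint64_t copy_via_nor(uint64_t x)
{
	return ~(x | x);
}

/* Sanity check: the two results are always bitwise complements. */
static void check_complement(uint64_t x)
{
	assert(copy_via_nor(x) == ~copy_via_or(x));
}
```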
      
      Fixes: 0016a4cf ("powerpc: Emulate most Book I instructions in emulate_step()")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • O
      powerpc/smp: Add cpu_l2_cache_map · 2a636a56
      Oliver O'Halloran committed
      We want to add an extra level to the CPU scheduler topology to account
      for cores which share a cache. To do this we need to build a cpumask
      for each CPU that indicates which CPUs share this cache to use as an
      input to the scheduler.
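
The idea of a per-CPU mask of cache siblings can be sketched as follows. This is a toy model, not the kernel's cpumask API: `TOY_NR_CPUS`, the `l2_id` array and `build_l2_masks()` are hypothetical, and a 32-bit integer stands in for a cpumask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/*
 * For each CPU, build a bitmask of the CPUs whose (hypothetical) L2 cache
 * identifier matches its own. Bit n set in masks[cpu] means that CPU n
 * shares a cache with cpu.
 */
static void build_l2_masks(const int l2_id[TOY_NR_CPUS],
			   uint32_t masks[TOY_NR_CPUS])
{
	assert(l2_id != NULL && masks != NULL);

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		masks[cpu] = 0;
		for (int other = 0; other < TOY_NR_CPUS; other++) {
			if (l2_id[other] == l2_id[cpu])
				masks[cpu] |= UINT32_C(1) << other;
		}
	}
}
```

Each CPU always appears in its own mask, so a CPU with a private cache ends up with a single bit set.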
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • T
      powerpc/asm: Convert .llong directives to .8byte · eb039161
      Tobin C. Harding committed
      .llong is an undocumented PPC specific directive. The generic
      equivalent is .quad, but even better (because it's self describing) is
      .8byte.
      
      Convert all .llong directives to .8byte.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • B
      powerpc/xmon: Add ISA v3.0 SPRs to SPR dump · d1e1b351
      Balbir Singh committed
      Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR
      in ISA 3.0 hypervisor mode.
      
      SPRN_PSSCR_PR is the privileged mode access and is used when we are
      not in hypervisor mode.
      Signed-off-by: Balbir Singh <bsingharora@gmail.com>
      [mpe: Split out of larger patch]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/powernv/vas: Define copy/paste interfaces · 2392c8c8
      Sukadev Bhattiprolu committed
      Define interfaces (wrappers) to the 'copy' and 'paste'
      instructions (which are new in PowerISA 3.0). These are intended to be
      used by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to
      the NX hardware engines.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/powernv/vas: Define vas_tx_win_open() · 5239af67
      Sukadev Bhattiprolu committed
      Define an interface to open a VAS send window. This interface is
      intended to be used by the Nest Accelerator (NX) driver(s) to open
      a send window and use it to submit compression/encryption requests
      to a VAS receive window.
      
      The receive window, identified by the [vasid, cop] parameters, must
      already be open in VAS (i.e. connected to an NX engine).
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/powernv/vas: Define vas_win_close() interface · 98271d41
      Sukadev Bhattiprolu committed
      Define the vas_win_close() interface which should be used to close a
      send or receive window.
      
      While the hardware configurations required to open send and receive
      windows differ, the configuration to close a window is the same for
      both. So we use a single interface to close the window.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/powernv/vas: Define vas_rx_win_open() interface · 62c4eda4
      Sukadev Bhattiprolu committed
      Define the vas_rx_win_open() interface. This interface is intended to
      be used by the Nest Accelerator (NX) driver(s) to set up receive
      windows for one or more NX engines (which implement compression &
      encryption algorithms in the hardware).
      
      Follow-on patches will provide an interface to close the window and to
      open a send window that kernel subsystems can use to access the NX
      engines.
      
      The interface to open a receive window is expected to be invoked for
      each instance of VAS in the system.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/powernv: Move GET_FIELD/SET_FIELD to vas.h · b6622a33
      Sukadev Bhattiprolu committed
      Move the GET_FIELD and SET_FIELD macros to vas.h, as VAS and other
      users of VAS, including NX-842, can use those macros.
      
      There is a lot of related code between the VAS/NX kernel drivers
      and skiboot. For consistency, switch the order of parameters in
      SET_FIELD to match the order in skiboot.
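
For readers unfamiliar with mask-based field accessors, here is a minimal sketch of how such macros can be written. The definitions and the mask-first parameter order are assumptions for illustration, not a copy of vas.h or skiboot; `MASK_LSH` is a hypothetical helper, and masks are assumed to be non-zero, contiguous 64-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of the lowest set bit in a non-zero 64-bit mask. */
#define MASK_LSH(m)	(__builtin_ffsll((long long)(m)) - 1)

/* Extract the field selected by mask m from value v. */
#define GET_FIELD(m, v)	(((v) & (m)) >> MASK_LSH(m))

/* Return v with the field selected by mask m replaced by val. */
#define SET_FIELD(m, v, val) \
	(((v) & ~(m)) | ((((uint64_t)(val)) << MASK_LSH(m)) & (m)))

/* Usage: write 0xab into bits 8..15, then read it back. */
static void field_roundtrip(void)
{
	uint64_t reg = SET_FIELD(UINT64_C(0xff00), UINT64_C(0), 0xab);

	assert(reg == 0xab00);
	assert(GET_FIELD(UINT64_C(0xff00), reg) == 0xab);
}
```

Deriving the shift from the mask means each register field needs only one constant, which keeps the field definitions short.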
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Reviewed-by: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/powernv/vas: Define macros, register fields and structures · 96768914
      Sukadev Bhattiprolu committed
      Define macros for the VAS hardware registers and bit-fields as well
      as a couple of data structures needed by the VAS driver.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      [mpe: Fixup include guard to use _ASM_POWERPC_VAS_H]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • A
      powerpc/pci: Remove OF node back pointer from pci_dn · f1e08232
      Alexey Kardashevskiy committed
      The check_req() helper uses pci_get_pdn() to get an OF node pointer.
      pci_get_pdn() returns a pci_dn pointer which comes either:
      1) from the OF node returned by pci_device_to_OF_node(); or
      2) from the parent child_list where entries don't have OF node pointers.
      Since check_req() does not care about 2), it can call
      pci_device_to_OF_node() directly, hence the change.
      
      The find_pe_dn() helper uses embedded pci_dn to get an OF node which is
      also stored in edev->pdev so let's take a shortcut and call
      pci_device_to_OF_node() directly.
      
      With these 2 changes, we can finally get rid of the OF node back pointer.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • A
      powerpc/eeh: Remove unnecessary config_addr from eeh_dev · 405b33a7
      Alexey Kardashevskiy committed
      The eeh_dev struct holds the config space address of an associated node,
      and the very same address is also stored in the pci_dn struct, which
      is always present during the eeh_dev lifetime.
      
      This uses bus:devfn directly from pci_dn instead of cached and packed
      config_addr.
      
      Since config_addr is made from device's bus:dev.fn, there is no point
      in keeping it in the debugfs either so remove that too.
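
To make the redundancy concrete, the sketch below derives a packed address from a bus number and devfn. The bit layout in `pack_config_addr()` is a hypothetical choice for illustration, and both function names are invented here. Only the devfn encoding (5-bit device number, 3-bit function number) is the standard PCI one.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI devfn: 5-bit device (slot) number and 3-bit function number. */
static uint8_t make_devfn(unsigned int dev, unsigned int fn)
{
	assert(dev < 32 && fn < 8);
	return (uint8_t)((dev << 3) | fn);
}

/* Hypothetical packing: bus number in bits 16..23, devfn in bits 8..15. */
static uint32_t pack_config_addr(uint8_t busno, uint8_t devfn)
{
	return ((uint32_t)busno << 16) | ((uint32_t)devfn << 8);
}
```

Because the packed value is a pure function of the bus number and devfn, caching it in a second structure stores no extra information.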
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • A
      powerpc/eeh: Remove unnecessary pointer to phb from eeh_dev · 69672bd7
      Alexey Kardashevskiy committed
      The eeh_dev struct already holds a pointer to pci_dn, without which it
      cannot exist, and pci_dn itself holds the very same phb pointer, so
      just use that.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • A
      powerpc/eeh: Reduce to one the number of places where edev is allocated · 8bae6a23
      Alexey Kardashevskiy committed
      arch/powerpc/kernel/eeh_dev.c:57 is the only legitimate place where edev
      is allocated; the other 2 places allocate it on the stack and in the heap
      for a very short period of time just to call eeh_pe_get(), as it takes
      an edev.
      
      This changes eeh_pe_get() to receive required parameters explicitly.
      
      This removes unnecessary temporary allocation of edev.
      
      This uses the "pe_no" name instead of the "pe_config_addr" name as
      it actually is a PE number and not a config space address as it seemed.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Acked-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc/powernv: Use kernel crash path for machine checks · 6fcd6baa
      Nicholas Piggin committed
      There are quite a few machine check exceptions that can be caused by
      kernel bugs. To make debugging easier, use the kernel crash path in
      cases of synchronous machine checks that occur in kernel mode, if that
      would not result in the machine going straight to panic or crash dump.
      
      There is a downside here that die()ing the process in kernel mode can
      still leave the system unstable. panic_on_oops will always force the
      system to fail-stop, so systems where that behaviour is important will
      still do the right thing.
      
      As a test, when triggering an i-side 0111b error (ifetch from foreign
      address) in kernel mode process context on POWER9, the kernel currently
      dies quickly like this:
      
        Severe Machine check interrupt [Not recovered]
          NIP [ffff000000000000]: 0xffff000000000000
          Initiator: CPU
          Error type: Real address [Instruction fetch (foreign)]
        [  127.426651616,0] OPAL: Reboot requested due to Platform error.
            Effective[  127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000
        opal: Reboot type 1 not supported
        Kernel panic - not syncing: PowerNV Unrecovered Machine Check
        CPU: 56 PID: 4425 Comm: syscall Tainted: G   M            4.12.0-rc1-13857-ga4700a26-dirty #35
        Call Trace:
        [  128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to
          buggy/mising code in OPAL for this BMC
          Rebooting in 10 seconds..
        Trying to free IRQ 496 from IRQ context!
      
      After this patch, the process is killed and the kernel continues with
      this message, which gives enough information to identify the offending
      branch (i.e., with CFAR):
      
        Severe Machine check interrupt [Not recovered]
          NIP [ffff000000000000]: 0xffff000000000000
          Initiator: CPU
          Error type: Real address [Instruction fetch (foreign)]
            Effective address: ffff000000000000
        Oops: Machine check, sig: 7 [#1]
        SMP NR_CPUS=2048
        NUMA
        PowerNV
        Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ...
        CPU: 22 PID: 4436 Comm: syscall Tainted: G   M            4.12.0-rc1-13857-ga4700a26-dirty #36
        task: c000000932300000 task.stack: c000000932380000
        NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000
        REGS: c00000000fc8fd80 TRAP: 0200   Tainted: G   M             (4.12.0-rc1-13857-ga4700a26-dirty)
        MSR: 90000000001c1003 <SF,HV,ME,RI,LE>
          CR: 24000484  XER: 20000000
        CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1
        GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000
        GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8
        GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030
        GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918
        NIP [ffff000000000000] 0xffff000000000000
        LR [00000000217706a4] 0x217706a4
        Call Trace:
        Instruction dump:
        XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
        XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc/powernv: Flush console before platform error reboot · b746e3e0
      Nicholas Piggin committed
      Unrecovered MCE and HMI errors are sent through a special restart OPAL
      call to log the platform error. The downside is that they don't go
      through normal Linux crash paths, so they don't give much information
      to the Linux console.
      
      Change this by providing a special crash function which does some of
      the console flushing from the panic() path before calling firmware to
      reboot.
      
      The downside of this is a little more code to execute before reaching
      the firmware reboot. However in practice, it's critical to get the
      Linux console messages output in order to debug a problem. So this is
      a desirable tradeoff.
      
      Note on the implementation: It is difficult to plumb a custom reboot
      handler into the panic path, because panic does a little bit too much
      work. For example, it will try to delay with the timebase, but that
      may be corrupted in some cases resulting in a hang without reaching
      the platform reboot. Another problem is that panic can invoke the
      crash dump code which is not what we want in the case of a hardware
      platform error. Long-term the best solution will be to rework the
      panic path so it can be suitable for this kind of panic, but for now
      we just duplicate a bit of the code.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc: Do not call ppc_md.panic in fadump panic notifier · a3b2cb30
      Nicholas Piggin committed
      If fadump is not registered, and no other crash or debug handlers are
      registered, the powerpc panic handler stops the guest before the
      generic panic code can push out debug information to the console.
      
      Currently, system reset injection causes the guest to silently stop.
      
      Stop calling ppc_md.panic in the panic notifier. crash_fadump already
      does rtas_os_term() to terminate the guest if fadump is registered.
      
      Remove ppc_md.panic. Move fadump panic notifier into fadump code.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc/64: Fix watchdog configuration regressions · 70412c55
      Nicholas Piggin committed
      This fixes a couple more bits of fallout from the new hard lockup watchdog
      patch.
      
      It restores the required hw_nmi_get_sample_period() function for the
      perf watchdog, and removes some function declarations on 64e that are only
      defined for 64s. This fixes the 64e build when the hardlockup detector is
      enabled.
      
      It restores the default behaviour of disabling the perf watchdog, and also
      fixes disabling the 64s watchdog when running as a guest.
      
      Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 29 Aug, 2017 2 commits
  4. 23 Aug, 2017 4 commits
  5. 18 Aug, 2017 1 commit
  6. 17 Aug, 2017 3 commits
    • A
      powerpc/mm: Don't send IPI to all cpus on THP updates · fa4531f7
      Aneesh Kumar K.V committed
      Now that we have made sure that the lockless walk of the Linux page table
      is mostly limited to the current task (current->mm->pgdir), we can update the THP
      update sequence to only send IPI to CPUs on which this task has run.
      This helps in reducing the IPI overload on systems with large number
      of CPUs.
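
The saving can be pictured with a toy model in which a 64-bit integer stands in for a cpumask. The helpers `mark_cpu()` and `ipi_targets()` are hypothetical and do not correspond to kernel functions; the sketch only counts how many CPUs a given target mask would interrupt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: record that the task has run on the given CPU. */
static uint64_t mark_cpu(uint64_t mm_cpumask, unsigned int cpu)
{
	assert(cpu < 64);
	return mm_cpumask | (UINT64_C(1) << cpu);
}

/* Number of CPUs that would receive an IPI for a given target mask. */
static unsigned int ipi_targets(uint64_t mask)
{
	return (unsigned int)__builtin_popcountll(mask);
}
```

In this model a task that has only run on CPUs 3 and 17 triggers two interrupts, whereas targeting every CPU of a 64-CPU system triggers 64.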
      
      WRT KVM, even though KVM walks the page table with vcpu->arch.pgdir,
      this is done only on secondary CPUs, and in that case the primary CPU
      is added to the task's mm cpumask. Sending an IPI to the primary will
      force the secondary to do a VM exit, and hence this mm cpumask usage
      is safe here.
      
      WRT CAPI, we still end up walking the Linux page table with the CAPI
      context MM. For now the pte lookup serialization sends an IPI to all
      CPUs if CAPI is in use. We can further improve this by adding the CAPI
      interrupt handling CPU to the task's mm cpumask. That will be done in a
      later patch.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • A
      powerpc/mm: Rename find_linux_pte_or_hugepte() · 94171b19
      Aneesh Kumar K.V committed
      Add newer helpers to make the function usage simpler. It is always
      recommended to use find_current_mm_pte() for walking the page table.
      If we cannot use find_current_mm_pte(), it should be documented why
      the said usage of __find_linux_pte() is safe against a parallel THP
      split.
      
      For now we have KVM code using __find_linux_pte(). This is because kvm
      code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but
      with PACA soft_enabled = 1. We may want to fix that later and make
      sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that
      we can switch kvm to use find_linux_pte().
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc/string: Implement optimized memset variants · 694fc88c
      Naveen N. Rao committed
      Based on Matthew Wilcox's patches for other architectures.
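
The commit message does not list the variants, so the following is only a portable reference sketch of what a fixed-width fill does, assuming the variants in question fill memory with a 32-bit or 64-bit pattern. The `toy_` names are hypothetical, and an optimized powerpc version would be written in assembly rather than as a C loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill count 32-bit words at s with the pattern v; returns s, like memset(). */
static void *toy_memset32(uint32_t *s, uint32_t v, size_t count)
{
	assert(s != NULL || count == 0);
	for (size_t i = 0; i < count; i++)
		s[i] = v;
	return s;
}

/* Fill count 64-bit words at s with the pattern v; returns s, like memset(). */
static void *toy_memset64(uint64_t *s, uint64_t v, size_t count)
{
	assert(s != NULL || count == 0);
	for (size_t i = 0; i < count; i++)
		s[i] = v;
	return s;
}
```

Unlike memset(), the count is given in elements rather than bytes, which is what lets a multi-byte pattern be written without an open-coded loop at each call site.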
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  7. 16 Aug, 2017 3 commits
  8. 15 Aug, 2017 5 commits
  9. 10 Aug, 2017 3 commits