1. 23 2月, 2018 1 次提交
  2. 10 1月, 2018 1 次提交
  3. 07 11月, 2017 1 次提交
  4. 21 10月, 2017 1 次提交
    • M
      powerpc/powernv: Enable TM without suspend if possible · 54820530
      Michael Ellerman 提交于
      Some Power9 revisions can run in a mode where TM operates without
      suspended state. If we find ourself on a CPU that might be in this
      mode, we query OPAL to check, and if so we reenable TM in CPU
      features, and enable a new user feature to signal to userspace that we
      are in this mode.
      
      We do not enable the "normal" user feature, PPC_FEATURE2_HTM, but we
      do enable PPC_FEATURE2_HTM_NOSC because that indicates to userspace
      that the kernel will abort transactions on syscall entry, which is
      true regardless of the suspend mode.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      54820530
  5. 06 10月, 2017 1 次提交
  6. 04 10月, 2017 1 次提交
    • N
      powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET · e36d0a2e
      Nicholas Piggin 提交于
      This allows MSR[EE]=0 lockups to be detected on an OPAL (bare metal)
      system similarly to the hcall NMI IPI on pseries guests, when the
      platform/firmware supports it.
      
      This is an example of CPU10 spinning with interrupts hard disabled:
      
        Watchdog CPU:32 detected Hard LOCKUP other CPUS:10
        Watchdog CPU:10 Hard LOCKUP
        CPU: 10 PID: 4410 Comm: bash Not tainted 4.13.0-rc7-00074-ge89ce1f8-dirty #34
        task: c0000003a82b4400 task.stack: c0000003af55c000
        NIP: c0000000000a7b38 LR: c000000000659044 CTR: c0000000000a7b00
        REGS: c00000000fd23d80 TRAP: 0100   Not tainted  (4.13.0-rc7-00074-ge89ce1f8-dirty)
        MSR: 90000000000c1033 <SF,HV,ME,IR,DR,RI,LE>
        CR: 28422222  XER: 20000000
        CFAR: c0000000000a7b38 SOFTE: 0
        GPR00: c000000000659044 c0000003af55fbb0 c000000001072a00 0000000000000078
        GPR04: c0000003c81b5c80 c0000003c81cc7e8 9000000000009033 0000000000000000
        GPR08: 0000000000000000 c0000000000a7b00 0000000000000001 9000000000001003
        GPR12: c0000000000a7b00 c00000000fd83200 0000000010180df8 0000000010189e60
        GPR16: 0000000010189ed8 0000000010151270 000000001018bd88 000000001018de78
        GPR20: 00000000370a0668 0000000000000001 00000000101645e0 0000000010163c10
        GPR24: 00007fffd14d6294 00007fffd14d6290 c000000000fba6f0 0000000000000004
        GPR28: c000000000f351d8 0000000000000078 c000000000f4095c 0000000000000000
        NIP [c0000000000a7b38] sysrq_handle_xmon+0x38/0x40
        LR [c000000000659044] __handle_sysrq+0xe4/0x270
        Call Trace:
        [c0000003af55fbd0] [c000000000659044] __handle_sysrq+0xe4/0x270
        [c0000003af55fc70] [c000000000659810] write_sysrq_trigger+0x70/0xa0
        [c0000003af55fca0] [c0000000003da650] proc_reg_write+0xb0/0x110
        [c0000003af55fcf0] [c0000000003423bc] __vfs_write+0x6c/0x1b0
        [c0000003af55fd90] [c000000000344398] vfs_write+0xd8/0x240
        [c0000003af55fde0] [c00000000034632c] SyS_write+0x6c/0x110
        [c0000003af55fe30] [c00000000000b220] system_call+0x58/0x6c
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      [mpe: Use kernel types for opal_signal_system_reset()]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      e36d0a2e
  7. 10 7月, 2017 1 次提交
  8. 10 4月, 2017 1 次提交
    • B
      powerpc/xive: Native exploitation of the XIVE interrupt controller · 243e2511
      Benjamin Herrenschmidt 提交于
      The XIVE interrupt controller is the new interrupt controller
      found in POWER9. It supports advanced virtualization capabilities
      among other things.
      
      Currently we use a set of firmware calls that simulate the old
      "XICS" interrupt controller but this is fairly inefficient.
      
      This adds the framework for using XIVE along with a native
      backend which OPAL for configuration. Later, a backend allowing
      the use in a KVM or PowerVM guest will also be provided.
      
      This disables some fast path for interrupts in KVM when XIVE is
      enabled as these rely on the firmware emulation code which is no
      longer available when the XIVE is used natively by Linux.
      
      A latter patch will make KVM also directly exploit the XIVE, thus
      recovering the lost performance (and more).
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings,
       tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV,
       fix build errors when SMP=n, fold in fixes from Ben:
         Don't call cpu_online() on an invalid CPU number
         Fix irq target selection returning out of bounds cpu#
         Extra sanity checks on cpu numbers
       ]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      243e2511
  9. 31 3月, 2017 1 次提交
  10. 30 11月, 2016 1 次提交
  11. 21 7月, 2016 3 次提交
  12. 23 6月, 2016 1 次提交
  13. 01 5月, 2016 1 次提交
  14. 17 12月, 2015 2 次提交
  15. 06 10月, 2015 1 次提交
  16. 20 8月, 2015 1 次提交
  17. 18 8月, 2015 1 次提交
    • A
      powerpc/powernv: move dma_get_required_mask from pnv_phb to pci_controller_ops · 53522982
      Andrew Donnellan 提交于
      Simplify the dma_get_required_mask call chain by moving it from pnv_phb to
      pci_controller_ops, similar to commit 763d2d8d ("powerpc/powernv:
      Move dma_set_mask from pnv_phb to pci_controller_ops").
      
      Previous call chain:
      
        0) call dma_get_required_mask() (kernel/dma.c)
        1) call ppc_md.dma_get_required_mask, if it exists. On powernv, that
           points to pnv_dma_get_required_mask() (platforms/powernv/setup.c)
        2) device is PCI, therefore call pnv_pci_dma_get_required_mask()
           (platforms/powernv/pci.c)
        3) call phb->dma_get_required_mask if it exists
        4) it only exists in the ioda case, where it points to
             pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c)
      
      New call chain:
      
        0) call dma_get_required_mask() (kernel/dma.c)
        1) device is PCI, therefore call pci_controller_ops.dma_get_required_mask
           if it exists
        2) in the ioda case, that points to pnv_pci_ioda_dma_get_required_mask()
           (platforms/powernv/pci-ioda.c)
      
      In the p5ioc2 case, the call chain remains the same -
      dma_get_required_mask() does not find either a ppc_md call or
      pci_controller_ops call, so it calls __dma_get_required_mask().
      Signed-off-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      53522982
  18. 02 6月, 2015 1 次提交
    • D
      powerpc/powernv: Move dma_set_mask() from pnv_phb to pci_controller_ops · 763d2d8d
      Daniel Axtens 提交于
      Previously, dma_set_mask() on powernv was convoluted:
       0) Call dma_set_mask() (a/p/kernel/dma.c)
       1) In dma_set_mask(), ppc_md.dma_set_mask() exists, so call it.
       2) On powernv, that function pointer is pnv_dma_set_mask().
          In pnv_dma_set_mask(), the device is pci, so call pnv_pci_dma_set_mask().
       3) In pnv_pci_dma_set_mask(), call pnv_phb->set_dma_mask() if it exists.
       4) It only exists in the ioda case, where it points to
          pnv_pci_ioda_dma_set_mask(), which is the final function.
      
      So the call chain is:
       dma_set_mask() ->
        pnv_dma_set_mask() ->
         pnv_pci_dma_set_mask() ->
          pnv_pci_ioda_dma_set_mask()
      
      Both ppc_md and pnv_phb function pointers are used.
      
      Rip out the ppc_md call, pnv_dma_set_mask() and pnv_pci_dma_set_mask().
      
      Instead:
       0) Call dma_set_mask() (a/p/kernel/dma.c)
       1) In dma_set_mask(), the device is pci, and pci_controller_ops.dma_set_mask()
          exists, so call pci_controller_ops.dma_set_mask()
       2) In the ioda case, that points to pnv_pci_ioda_dma_set_mask().
      
      The new call chain is
       dma_set_mask() ->
        pnv_pci_ioda_dma_set_mask()
      
      Now only the pci_controller_ops function pointer is used.
      
      The fallback paths for p5ioc2 are the same.
      
      Previously, pnv_pci_dma_set_mask() would find no pnv_phb->set_dma_mask()
      function, to it would call __set_dma_mask().
      
      Now, dma_set_mask() finds no ppc_md call or pci_controller_ops call,
      so it calls __set_dma_mask().
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      763d2d8d
  19. 22 5月, 2015 2 次提交
  20. 07 4月, 2015 1 次提交
  21. 26 3月, 2015 1 次提交
  22. 22 1月, 2015 1 次提交
    • S
      powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared · 0eb13208
      Shreyas B. Prabhu 提交于
      LPCR_PECE1 bit controls whether decrementer interrupts are allowed to
      cause exit from power-saving mode. While waking up from winkle, restoring
      LPCR with LPCR_PECE1 set (i.e Decrementer interrupts allowed) can cause
      issue in the following scenario:
      
      - All the threads in a core are offlined. The core enters deep winkle.
      - Spurious interrupt wakes up a thread in the core. Here LPCR is restored
        with LPCR_PECE1 bit set.
      - Since it was a spurious interrupt on a offline thread, the thread clears
        the interrupt and goes back to winkle.
      - Here before the thread executes winkle and puts the core into deep winkle,
        if a decrementer interrupt occurs on any of the sibling threads in the core
        that thread wakes up.
      - Since in offline loop we are flushing interrupt only in case of external
        interrupt, the decrementer interrupt does not get flushed. So at this stage
        the thread is stuck in this is loop of waking up at 0x100 due to decrementer
        interrupt, not flushing the interrupt as only external interrupts get flushed,
        entering winkle, waking up at 0x100 again.
      
      Fix this by programming PORE to restore LPCR with LPCR_PECE1 bit
      cleared when waking up from winkle.
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      0eb13208
  23. 15 12月, 2014 3 次提交
    • S
      powernv/powerpc: Add winkle support for offline cpus · 77b54e9f
      Shreyas B. Prabhu 提交于
      Winkle is a deep idle state supported in power8 chips. A core enters
      winkle when all the threads of the core enter winkle. In this state
      power supply to the entire chiplet i.e core, private L2 and private L3
      is turned off. As a result it gives higher powersavings compared to
      sleep.
      
      But entering winkle results in a total hypervisor state loss. Hence the
      hypervisor context has to be preserved before entering winkle and
      restored upon wake up.
      
      Power-on Reset Engine (PORE) is a dedicated engine which is responsible
      for powering on the chiplet during wake up. It can be programmed to
      restore the register contests of a few specific registers. This patch
      uses PORE to restore register state wherever possible and uses stack to
      save and restore rest of the necessary registers.
      
      With hypervisor state restore things fall under three categories-
      per-core state, per-subcore state and per-thread state. To manage this,
      extend the infrastructure introduced for sleep. Mainly we add a paca
      variable subcore_sibling_mask. Using this and the core_idle_state we can
      distingush first thread in core and subcore.
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      77b54e9f
    • S
      powernv/cpuidle: Redesign idle states management · 7cba160a
      Shreyas B. Prabhu 提交于
      Deep idle states like sleep and winkle are per core idle states. A core
      enters these states only when all the threads enter either the
      particular idle state or a deeper one. There are tasks like fastsleep
      hardware bug workaround and hypervisor core state save which have to be
      done only by the last thread of the core entering deep idle state and
      similarly tasks like timebase resync, hypervisor core register restore
      that have to be done only by the first thread waking up from these
      state.
      
      The current idle state management does not have a way to distinguish the
      first/last thread of the core waking/entering idle states. Tasks like
      timebase resync are done for all the threads. This is not only is
      suboptimal, but can cause functionality issues when subcores and kvm is
      involved.
      
      This patch adds the necessary infrastructure to track idle states of
      threads in a per-core structure. It uses this info to perform tasks like
      fastsleep workaround and timebase resync only once per core.
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Originally-by: NPreeti U. Murthy <preeti@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7cba160a
    • S
      powerpc/powernv: Enable Offline CPUs to enter deep idle states · 8eb8ac89
      Shreyas B. Prabhu 提交于
      The secondary threads should enter deep idle states so as to gain maximum
      powersavings when the entire core is offline. To do so the offline path
      must be made aware of the available deepest idle state. Hence probe the
      device tree for the possible idle states in powernv core code and
      expose the deepest idle state through flags.
      
      Since the  device tree is probed by the cpuidle driver as well, move
      the parameters required to discover the idle states into an appropriate
      common place to both the driver and the powernv core code.
      
      Another point is that fastsleep idle state may require workarounds in
      the kernel to function properly. This workaround is introduced in the
      subsequent patches. However neither the cpuidle driver or the hotplug
      path need be bothered about this workaround.
      
      They will be taken care of by the core powernv code.
      Originally-by: NSrivatsa S. Bhat <srivatsa@mit.edu>
      Signed-off-by: NPreeti U. Murthy <preeti@linux.vnet.ibm.com>
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Reviewed-by: NPaul Mackerras <paulus@samba.org>
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8eb8ac89
  24. 17 11月, 2014 1 次提交
    • N
      rtc/tpo: Driver to support rtc and wakeup on PowerNV platform · 16b1d26e
      Neelesh Gupta 提交于
      The patch implements the OPAL rtc driver that binds with the rtc
      driver subsystem. The driver uses the platform device infrastructure
      to probe the rtc device and register it to rtc class framework. The
      'wakeup' is supported depending upon the property 'has-tpo' present
      in the OF node. It provides a way to load the generic rtc driver in
      in the absence of an OPAL driver.
      
      The patch also moves the existing OPAL rtc get/set time interfaces to the
      new driver and exposes the necessary OPAL calls using EXPORT_SYMBOL_GPL.
      
      Test results:
      -------------
      Host:
      [root@tul169p1 ~]# ls -l /sys/class/rtc/
      total 0
      lrwxrwxrwx 1 root root 0 Oct 14 03:07 rtc0 -> ../../devices/opal-rtc/rtc/rtc0
      [root@tul169p1 ~]# cat /sys/devices/opal-rtc/rtc/rtc0/time
      08:10:07
      [root@tul169p1 ~]# echo `date '+%s' -d '+ 2 minutes'` > /sys/class/rtc/rtc0/wakealarm
      [root@tul169p1 ~]# cat /sys/class/rtc/rtc0/wakealarm
      1413274345
      [root@tul169p1 ~]#
      
      FSP:
      $ smgr mfgState
      standby
      $ rtim timeofday
      
      System time is valid: 2014/10/14 08:12:04.225115
      
      $ smgr mfgState
      ipling
      $
      
      CC: devicetree@vger.kernel.org
      CC: tglx@linutronix.de
      CC: rtc-linux@googlegroups.com
      CC: a.zummo@towertech.it
      Signed-off-by: NNeelesh Gupta <neelegup@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      16b1d26e
  25. 03 11月, 2014 1 次提交
    • A
      powerpc: Convert power off logic to pm_power_off · 9178ba29
      Alexander Graf 提交于
      The generic Linux framework to power off the machine is a function pointer
      called pm_power_off. The trick about this pointer is that device drivers can
      potentially implement it rather than board files.
      
      Today on powerpc we set pm_power_off to invoke our generic full machine power
      off logic which then calls ppc_md.power_off to invoke machine specific power
      off.
      
      However, when we want to add a power off GPIO via the "gpio-poweroff" driver,
      this card house falls apart. That driver only registers itself if pm_power_off
      is NULL to ensure it doesn't override board specific logic. However, since we
      always set pm_power_off to the generic power off logic (which will just not
      power off the machine if no ppc_md.power_off call is implemented), we can't
      implement power off via the generic GPIO power off driver.
      
      To fix this up, let's get rid of the ppc_md.power_off logic and just always use
      pm_power_off as was intended. Then individual drivers such as the GPIO power off
      driver can implement power off logic via that function pointer.
      
      With this patch set applied and a few patches on top of QEMU that implement a
      power off GPIO on the virt e500 machine, I can successfully turn off my virtual
      machine after halt.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      [mpe: Squash into one patch and update changelog based on cover letter]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9178ba29
  26. 30 9月, 2014 1 次提交
    • G
      powerpc/powernv: Override dma_get_required_mask() · fe7e85c6
      Gavin Shan 提交于
      The dma_get_required_mask() function is used by some drivers to
      query the platform about what DMA mask is needed to cover all of
      memory. This is a bit of a strange semantic when we have to choose
      between IOMMU translation or bypass, but essentially what it means
      is "what DMA mask will give best performances".
      
      Currently, our IOMMU backend always returns a 32-bit mask here, we
      don't do anything special to it when we have bypass available. This
      causes some drivers to choose a 32-bit mask, thus losing the ability
      to use the bypass window, thinking this is more efficient. The problem
      was reported from the driver of following device:
      
      0004:03:00.0 0107: 1000:0087 (rev 05)
      0004:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios \
                   Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
      
      This patch adds an override of that function in order to, instead,
      return a 64-bit mask whenever a bypass window is available in order
      for drivers to prefer this configuration.
      Reported-by: NMurali N. Iyer <mniyer@us.ibm.com>
      Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fe7e85c6
  27. 25 9月, 2014 1 次提交
  28. 05 8月, 2014 1 次提交
  29. 11 6月, 2014 2 次提交
  30. 05 6月, 2014 1 次提交
  31. 28 4月, 2014 3 次提交