1. 13 4月, 2017 2 次提交
  2. 11 4月, 2017 1 次提交
  3. 10 4月, 2017 1 次提交
    • B
      powerpc/xive: Native exploitation of the XIVE interrupt controller · 243e2511
      Benjamin Herrenschmidt 提交于
      The XIVE interrupt controller is the new interrupt controller
      found in POWER9. It supports advanced virtualization capabilities
      among other things.
      
      Currently we use a set of firmware calls that simulate the old
      "XICS" interrupt controller but this is fairly inefficient.
      
      This adds the framework for using XIVE along with a native
      backend which OPAL for configuration. Later, a backend allowing
      the use in a KVM or PowerVM guest will also be provided.
      
      This disables some fast path for interrupts in KVM when XIVE is
      enabled as these rely on the firmware emulation code which is no
      longer available when the XIVE is used natively by Linux.
      
      A latter patch will make KVM also directly exploit the XIVE, thus
      recovering the lost performance (and more).
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings,
       tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV,
       fix build errors when SMP=n, fold in fixes from Ben:
         Don't call cpu_online() on an invalid CPU number
         Fix irq target selection returning out of bounds cpu#
         Extra sanity checks on cpu numbers
       ]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      243e2511
  4. 02 3月, 2017 1 次提交
  5. 09 2月, 2017 1 次提交
  6. 31 1月, 2017 1 次提交
    • G
      powernv: Pass PSSCR value and mask to power9_idle_stop · 09206b60
      Gautham R. Shenoy 提交于
      The power9_idle_stop method currently takes only the requested stop
      level as a parameter and picks up the rest of the PSSCR bits from a
      hand-coded macro. This is not a very flexible design, especially when
      the firmware has the capability to communicate the psscr value and the
      mask associated with a particular stop state via device tree.
      
      This patch modifies the power9_idle_stop API to take as parameters the
      PSSCR value and the PSSCR mask corresponding to the stop state that
      needs to be set. These PSSCR value and mask are respectively obtained
      by parsing the "ibm,cpu-idle-state-psscr" and
      "ibm,cpu-idle-state-psscr-mask" fields from the device tree.
      
      In addition to this, the patch adds support for handling stop states
      for which ESL and EC bits in the PSSCR are zero. As per the
      architecture, a wakeup from these stop states resumes execution from
      the subsequent instruction as opposed to waking up at the System
      Vector.
      
      The older firmware sets only the Requested Level (RL) field in the
      psscr and psscr-mask exposed in the device tree. For older firmware
      where psscr-mask=0xf, this patch will set the default sane values that
      the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
      TR). For the new firmware, the patch will validate that the invariants
      required by the ISA for the psscr values are maintained by the
      firmware.
      
      This skiboot patch that exports fully populated PSSCR values and the
      mask for all the stop states can be found here:
      https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
      
      [Optimize the number of instructions before entering STOP with
      ESL=EC=0, validate the PSSCR values provided by the firimware
      maintains the invariants required as per the ISA suggested by Balbir
      Singh]
      Acked-by: NBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      09206b60
  7. 15 7月, 2016 1 次提交
  8. 17 12月, 2015 2 次提交
  9. 21 10月, 2015 1 次提交
    • P
      powerpc/powernv: Handle irq_happened flag correctly in off-line loop · 53c656c4
      Paul Mackerras 提交于
      This fixes a bug where it is possible for an off-line CPU to fail to go
      into a low-power state (nap/sleep/winkle), and to become unresponsive to
      requests from the KVM subsystem to wake up and run a VCPU. What can
      happen is that a maskable interrupt of some kind (external, decrementer,
      hypervisor doorbell, or HMI) after we have called local_irq_disable() at
      the beginning of pnv_smp_cpu_kill_self() and before interrupts are
      hard-disabled inside power7_nap/sleep/winkle(). In this situation, the
      pending event is marked in the irq_happened flag in the PACA. This
      pending event prevents power7_nap/sleep/winkle from going to the
      requested low-power state; instead they return immediately. We don't
      deal with any of these pending event flags in the off-line loop in
      pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
      so we will have srr1 == 0, and none of the processing to clear
      interrupts or doorbells will be done.
      
      Usually, the most obvious symptom of this is that a KVM guest will fail
      with a console message saying "KVM: couldn't grab cpu N".
      
      This fixes the problem by making sure we handle the irq_happened flags
      properly. First, we hard-disable before the off-line loop. Once we have
      hard-disabled, the irq_happened flags can't change underneath us. We
      unconditionally clear the DEC and HMI flags: there is no processing of
      timer interrupts while off-line, and the necessary HMI processing is all
      done in lower-level code. We leave the EE and DBELL flags alone for the
      first iteration of the loop, so that we won't fail to respond to a
      split-core request that came in just before hard-disabling. Within the
      loop, we handle external interrupts if the EE bit is set in irq_happened
      as well as if the low-power state was interrupted by an external
      interrupt. (We don't need to do the msgclr for a pending doorbell in
      irq_happened, because doorbells are edge-triggered and don't remain
      pending in hardware.) Then we clear both the EE and DBELL flags, and
      once clear, they cannot be set again (until this CPU comes online again,
      that is).
      
      This also fixes the debug check to not be done when we just ran a KVM
      guest or when the sleep didn't happen because of a pending event in
      irq_happened.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      53c656c4
  10. 07 4月, 2015 1 次提交
  11. 20 3月, 2015 1 次提交
    • P
      powerpc/powernv: Fixes for hypervisor doorbell handling · 755563bc
      Paul Mackerras 提交于
      Since we can now use hypervisor doorbells for host IPIs, this makes
      sure we clear the host IPI flag when taking a doorbell interrupt, and
      clears any pending doorbell IPI in pnv_smp_cpu_kill_self() (as we
      already do for IPIs sent via the XICS interrupt controller).  Otherwise
      if there did happen to be a leftover pending doorbell interrupt for
      an offline CPU thread for any reason, it would prevent that thread from
      going into a power-saving mode; it would instead keep waking up because
      of the interrupt.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      755563bc
  12. 18 12月, 2014 1 次提交
  13. 15 12月, 2014 3 次提交
    • S
      powernv/powerpc: Add winkle support for offline cpus · 77b54e9f
      Shreyas B. Prabhu 提交于
      Winkle is a deep idle state supported in power8 chips. A core enters
      winkle when all the threads of the core enter winkle. In this state
      power supply to the entire chiplet i.e core, private L2 and private L3
      is turned off. As a result it gives higher powersavings compared to
      sleep.
      
      But entering winkle results in a total hypervisor state loss. Hence the
      hypervisor context has to be preserved before entering winkle and
      restored upon wake up.
      
      Power-on Reset Engine (PORE) is a dedicated engine which is responsible
      for powering on the chiplet during wake up. It can be programmed to
      restore the register contests of a few specific registers. This patch
      uses PORE to restore register state wherever possible and uses stack to
      save and restore rest of the necessary registers.
      
      With hypervisor state restore things fall under three categories-
      per-core state, per-subcore state and per-thread state. To manage this,
      extend the infrastructure introduced for sleep. Mainly we add a paca
      variable subcore_sibling_mask. Using this and the core_idle_state we can
      distingush first thread in core and subcore.
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      77b54e9f
    • S
      powernv/cpuidle: Redesign idle states management · 7cba160a
      Shreyas B. Prabhu 提交于
      Deep idle states like sleep and winkle are per core idle states. A core
      enters these states only when all the threads enter either the
      particular idle state or a deeper one. There are tasks like fastsleep
      hardware bug workaround and hypervisor core state save which have to be
      done only by the last thread of the core entering deep idle state and
      similarly tasks like timebase resync, hypervisor core register restore
      that have to be done only by the first thread waking up from these
      state.
      
      The current idle state management does not have a way to distinguish the
      first/last thread of the core waking/entering idle states. Tasks like
      timebase resync are done for all the threads. This is not only is
      suboptimal, but can cause functionality issues when subcores and kvm is
      involved.
      
      This patch adds the necessary infrastructure to track idle states of
      threads in a per-core structure. It uses this info to perform tasks like
      fastsleep workaround and timebase resync only once per core.
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Originally-by: NPreeti U. Murthy <preeti@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7cba160a
    • S
      powerpc/powernv: Enable Offline CPUs to enter deep idle states · 8eb8ac89
      Shreyas B. Prabhu 提交于
      The secondary threads should enter deep idle states so as to gain maximum
      powersavings when the entire core is offline. To do so the offline path
      must be made aware of the available deepest idle state. Hence probe the
      device tree for the possible idle states in powernv core code and
      expose the deepest idle state through flags.
      
      Since the  device tree is probed by the cpuidle driver as well, move
      the parameters required to discover the idle states into an appropriate
      common place to both the driver and the powernv core code.
      
      Another point is that fastsleep idle state may require workarounds in
      the kernel to function properly. This workaround is introduced in the
      subsequent patches. However neither the cpuidle driver or the hotplug
      path need be bothered about this workaround.
      
      They will be taken care of by the core powernv code.
      Originally-by: NSrivatsa S. Bhat <srivatsa@mit.edu>
      Signed-off-by: NPreeti U. Murthy <preeti@linux.vnet.ibm.com>
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Reviewed-by: NPaul Mackerras <paulus@samba.org>
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8eb8ac89
  14. 08 12月, 2014 1 次提交
    • P
      powerpc/powernv: Return to cpu offline loop when finished in KVM guest · 56548fc0
      Paul Mackerras 提交于
      When a secondary hardware thread has finished running a KVM guest, we
      currently put that thread into nap mode using a nap instruction in
      the KVM code.  This changes the code so that instead of doing a nap
      instruction directly, we instead cause the call to power7_nap() that
      put the thread into nap mode to return.  The reason for doing this is
      to avoid having the KVM code having to know what low-power mode to
      put the thread into.
      
      In the case of a secondary thread used to run a KVM guest, the thread
      will be offline from the point of view of the host kernel, and the
      relevant power7_nap() call is the one in pnv_smp_cpu_disable().
      In this case we don't want to clear pending IPIs in the offline loop
      in that function, since that might cause us to miss the wakeup for
      the next time the thread needs to run a guest.  To tell whether or
      not to clear the interrupt, we use the SRR1 value returned from
      power7_nap(), and check if it indicates an external interrupt.  We
      arrange that the return from power7_nap() when we have finished running
      a guest returns 0, so pending interrupts don't get flushed in that
      case.
      
      Note that it is important a secondary thread that has finished
      executing in the guest, or that didn't have a guest to run, should
      not return to power7_nap's caller while the kvm_hstate.hwthread_req
      flag in the PACA is non-zero, because the return from power7_nap
      will reenable the MMU, and the MMU might still be in guest context.
      In this situation we spin at low priority in real mode waiting for
      hwthread_req to become zero.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      56548fc0
  15. 25 9月, 2014 2 次提交
    • P
      powerpc/powernv: Don't call generic code on offline cpus · d6a4f709
      Paul Mackerras 提交于
      On PowerNV platforms, when a CPU is offline, we put it into nap mode.
      It's possible that the CPU wakes up from nap mode while it is still
      offline due to a stray IPI.  A misdirected device interrupt could also
      potentially cause it to wake up.  In that circumstance, we need to clear
      the interrupt so that the CPU can go back to nap mode.
      
      In the past the clearing of the interrupt was accomplished by briefly
      enabling interrupts and allowing the normal interrupt handling code
      (do_IRQ() etc.) to handle the interrupt.  This has the problem that
      this code calls irq_enter() and irq_exit(), which call functions such
      as account_system_vtime() which use RCU internally.  Use of RCU is not
      permitted on offline CPUs and will trigger errors if RCU checking is
      enabled.
      
      To avoid calling into any generic code which might use RCU, we adopt
      a different method of clearing interrupts on offline CPUs.  Since we
      are on the PowerNV platform, we know that the system interrupt
      controller is a XICS being driven directly (i.e. not via hcalls) by
      the kernel.  Hence this adds a new icp_native_flush_interrupt()
      function to the native-mode XICS driver and arranges to call that
      when an offline CPU is woken from nap.  This new function reads the
      interrupt from the XICS.  If it is an IPI, it clears the IPI; if it
      is a device interrupt, it prints a warning and disables the source.
      Then it does the end-of-interrupt processing for the interrupt.
      
      The other thing that briefly enabling interrupts did was to check and
      clear the irq_happened flag in this CPU's PACA.  Therefore, after
      flushing the interrupt from the XICS, we also clear all bits except
      the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the
      irq_happened flag.  The PACA_IRQ_HARD_DIS flag is set by power7_nap()
      and is left set to indicate that interrupts are hard disabled.  This
      means we then have to ignore that flag in power7_nap(), which is
      reasonable since it doesn't indicate that any interrupt event needs
      servicing.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d6a4f709
    • A
      powerpc: Make a bunch of things static · e51df2c1
      Anton Blanchard 提交于
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      e51df2c1
  16. 11 6月, 2014 1 次提交
  17. 28 5月, 2014 2 次提交
  18. 28 4月, 2014 1 次提交
  19. 23 4月, 2014 1 次提交
  20. 14 8月, 2013 1 次提交
  21. 01 7月, 2013 1 次提交
    • P
      powerpc: Delete __cpuinit usage from all users · 061d19f2
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the powerpc uses of the __cpuinit macros.  There
      are no __CPUINIT users in assembly files in powerpc.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      061d19f2
  22. 20 6月, 2013 1 次提交
  23. 14 5月, 2013 1 次提交
  24. 06 5月, 2013 1 次提交
  25. 04 1月, 2013 1 次提交
    • G
      POWERPC: drivers: remove __dev* attributes. · cad5cef6
      Greg Kroah-Hartman 提交于
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      __devinitconst, and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cad5cef6
  26. 05 9月, 2012 1 次提交
    • P
      powerpc/powernv: Always go into nap mode when CPU is offline · 375f561a
      Paul Mackerras 提交于
      The CPU hotplug code for the powernv platform currently only puts
      offline CPUs into nap mode if the powersave_nap variable is set.
      However, HV-style KVM on this platform requires secondary CPU threads
      to be offline and in nap mode.  Since we know nap mode works just
      fine on all POWER7 machines, and the only machines that support the
      powernv platform are POWER7 machines, this changes the code to
      always put offline CPUs into nap mode, regardless of powersave_nap.
      Powersave_nap still controls whether or not CPUs go into nap mode
      when idle, as before.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      375f561a
  27. 29 3月, 2012 1 次提交
  28. 08 12月, 2011 1 次提交
    • P
      powerpc/powernv: Fix problems in onlining CPUs · cba313da
      Paul Mackerras 提交于
      At present, on the powernv platform, if you off-line a CPU that was
      online, and then try to on-line it again, the kernel generates a
      warning message "OPAL Error -1 starting CPU n".  Furthermore, if the
      CPU is a secondary thread that was used by KVM while it was off-line,
      the CPU fails to come online.
      
      The first problem is fixed by only calling OPAL to start the CPU the
      first time it is on-lined, as indicated by the cpu_start field of its
      PACA being zero.  The second problem is fixed by restoring the
      cpu_start field to 1 instead of 0 when using the CPU within KVM.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cba313da
  29. 20 9月, 2011 3 次提交