1. 23 Jul, 2020 3 commits
  2. 22 Jul, 2020 1 commit
    • powerpc/perf: BHRB control to disable BHRB logic when not used · 1cade527
      Athira Rajeev authored
      PowerISA v3.1 has a few updates for the Branch History Rolling
      Buffer (BHRB).
      
      BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
      bit, namely "BHRB Recording Disable (BHRBRD)". This field controls
      whether BHRB entries are written when BHRB recording is enabled by
      other bits. This patch implements support for this BHRB disable bit.
      Setting this bit to 0b1 disables the BHRB, and setting it to 0b0
      enables the BHRB. This addresses backward compatibility (for older
      OSes), since the bit is cleared and the hardware writes to the BHRB
      by default.
      
      This patch sets MMCRA (BHRBRD) at boot on power10 (so that the core
      runs faster) and enables the BHRB only at runtime, i.e. on explicit
      request from the user. It also saves and restores MMCRA in the
      restore path of state-loss idle states, to make sure the BHRB stays
      disabled if it was not enabled on request at runtime.
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1594996707-3727-12-git-send-email-atrajeev@linux.vnet.ibm.com
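The BHRBRD handling described in this commit message comes down to setting or clearing one bit in the MMCRA value and re-applying that choice after a state-loss idle state. The following is a minimal user-space sketch, not the kernel code: the bit position `EXAMPLE_MMCRA_BHRBRD` and all function names are hypothetical, and plain integers stand in for real SPR accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only; not the real BHRBRD bit. */
#define EXAMPLE_MMCRA_BHRBRD (UINT64_C(1) << 5)

/* Boot-time value in this model: BHRB recording disabled (bit = 0b1). */
static uint64_t example_mmcra_boot_value(void)
{
    return EXAMPLE_MMCRA_BHRBRD;
}

/* Clear the disable bit (0b0): the BHRB is written again. */
static uint64_t example_bhrb_enable(uint64_t mmcra)
{
    return mmcra & ~EXAMPLE_MMCRA_BHRBRD;
}

/* Set the disable bit (0b1): the BHRB is no longer written. */
static uint64_t example_bhrb_disable(uint64_t mmcra)
{
    return mmcra | EXAMPLE_MMCRA_BHRBRD;
}

/*
 * Value to write back after a state-loss idle state: keep the BHRB
 * disabled unless a user explicitly requested it at runtime.
 */
static uint64_t example_mmcra_after_idle(uint64_t saved_mmcra, bool bhrb_requested)
{
    uint64_t mmcra = bhrb_requested ? example_bhrb_enable(saved_mmcra)
                                    : example_bhrb_disable(saved_mmcra);

    /* Exactly one of "requested" and "disabled" holds. */
    assert(bhrb_requested != ((mmcra & EXAMPLE_MMCRA_BHRBRD) != 0));
    return mmcra;
}
```

Because the disable bit reads as 0b0 on an OS that never touches it, older kernels keep the pre-v3.1 behaviour without any change.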
  3. 11 May, 2020 1 commit
  4. 30 Aug, 2019 1 commit
  5. 12 Jul, 2019 1 commit
  6. 03 Jul, 2019 1 commit
  7. 19 Jun, 2019 1 commit
  8. 31 May, 2019 1 commit
  9. 30 Apr, 2019 2 commits
    • powerpc/powernv/idle: Restore AMR/UAMOR/AMOR/IAMR after idle · e9cef018
      Michael Ellerman authored
      This is an implementation of commits 53a712ba
      ("powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle") and
      a3f3072d ("powerpc/powernv/idle: Restore IAMR after idle") using
      the new C-based idle code.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      [mpe: Extract from Nick's patch]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s: Reimplement book3s idle code in C · 10d91611
      Nicholas Piggin authored
      Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
      specific HV idle code to the powernv platform code.
      
      Book3S assembly stubs are kept in common code and used only to save
      the stack frame and non-volatile GPRs before executing architected
      idle instructions, and restoring the stack and reloading GPRs then
      returning to C after waking from idle.
      
      The complex logic dealing with threads and subcores, locking, SPRs,
      HMIs, timebase resync, etc., is all done in C which makes it more
      maintainable.
      
      This is not a strict translation to C code, there are some
      significant differences:
      
      - Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs,
        but saves and restores them itself.
      
      - The optimisation where EC=ESL=0 idle modes did not have to save GPRs
        or change MSR is restored, because it's now simple to do. ESL=1
        sleeps that do not lose GPRs can use this optimization too.
      
      - KVM secondary entry and cede is now more of a call/return style
        rather than branchy. nap_state_lost is not required because KVM
        always returns via NVGPR restoring path.
      
      - KVM secondary wakeup from offline sequence is moved entirely into
        the offline wakeup, which avoids a hwsync in the normal idle wakeup
        path.
      
      Performance, measured with context switch ping-pong on different
      threads or cores, is possibly improved by a small amount: 1-3%
      depending on stop state and core vs thread test for shallow states.
      For deep states the difference is in the noise compared with other
      latencies.
      
      KVM improvements:
      
      - Idle sleepers now always return to caller rather than branch out
        to KVM first.
      
      - This allows optimisations like very fast return to caller when no
        state has been lost.
      
      - KVM no longer requires nap_state_lost because it controls NVGPR
        save/restore itself on the way in and out.
      
      - The heavy idle wakeup KVM request check can be moved out of the
        normal host idle code and into the not-performance-critical offline
        code.
      
      - KVM nap code now returns from where it is called, which makes the
        flow a bit easier to follow.
      Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      [mpe: Squash the KVM changes in]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  10. 21 Feb, 2019 1 commit
    • powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit · 19f8a5b5
      Paul Mackerras authored
      Commit 24be85a2 ("powerpc/powernv: Clear PECE1 in LPCR via stop-api
      only on Hotplug", 2017-07-21) added two calls to opal_slw_set_reg()
      inside pnv_cpu_offline(), with the aim of changing the LPCR value in
      the SLW image to disable wakeups from the decrementer while a CPU is
      offline.  However, pnv_cpu_offline() gets called each time a secondary
      CPU thread is woken up to participate in running a KVM guest, that is,
      not just when a CPU is offlined.
      
      Since opal_slw_set_reg() is a very slow operation (with observed
      execution times around 20 milliseconds), this means that an offline
      secondary CPU can often be busy doing the opal_slw_set_reg() call
      when the primary CPU wants to grab all the secondary threads so that
      it can run a KVM guest.  This leads to messages like "KVM: couldn't
      grab CPU n" being printed and guest execution failing.
      
      There is no need to reprogram the SLW image on every KVM guest entry
      and exit.  So that we do it only when a CPU is really transitioning
      between online and offline, this moves the calls to
      pnv_program_cpu_hotplug_lpcr() into pnv_smp_cpu_kill_self().
      
      Fixes: 24be85a2 ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  11. 10 Aug, 2018 1 commit
    • powerpc/powernv/idle: Fix build error · ae24ce5e
      Aneesh Kumar K.V authored
      Fix the build error below by using strlcpy instead of strncpy:
      
      In function 'pnv_parse_cpuidle_dt',
          inlined from 'pnv_init_idle_states' at arch/powerpc/platforms/powernv/idle.c:840:7,
          inlined from '__machine_initcall_powernv_pnv_init_idle_states' at arch/powerpc/platforms/powernv/idle.c:870:1:
      arch/powerpc/platforms/powernv/idle.c:820:3: error: 'strncpy' specified bound 16 equals destination size [-Werror=stringop-truncation]
         strncpy(pnv_idle_states[i].name, temp_string[i],
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          PNV_IDLE_NAME_LEN);
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
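The warning fires because strncpy with a bound equal to the destination size can leave the destination without a terminating NUL. The sketch below shows the truncate-and-terminate behaviour that strlcpy provides. Since strlcpy is not available in every C library, it uses a local stand-in named `example_strlcpy` (hypothetical), and `EXAMPLE_NAME_LEN` simply mirrors the bound of 16 reported in the error message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the "bound 16" from the compiler error; illustrative only. */
#define EXAMPLE_NAME_LEN 16

/*
 * Copy src into dst, truncating if needed and always NUL-terminating
 * when size > 0. Returns strlen(src) so the caller can detect truncation
 * (return value >= size means the string did not fit).
 */
static size_t example_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

A caller would pass `sizeof(name)` for a `char name[EXAMPLE_NAME_LEN]` buffer and compare the return value against that size if truncation matters.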
  12. 03 Aug, 2018 1 commit
  13. 31 Jul, 2018 1 commit
  14. 16 Jul, 2018 1 commit
  15. 28 May, 2018 1 commit
  16. 03 Apr, 2018 1 commit
    • powerpc/powernv: Fix SMT4 forcing idle code · a2b5e056
      Nicholas Piggin authored
      The PSSCR value is not stored to PACA_REQ_PSSCR if the CPU does not
      have the XER[SO] bug.
      
      Fix this by storing up-front, outside the workaround code. The initial
      test is not required because it is a slow path.
      
      The workaround is made to depend on CONFIG_KVM_BOOK3S_HV_POSSIBLE, to
      match pnv_power9_force_smt4_catch() where it is used. Drop the comment
      on pnv_power9_force_smt4_catch() as it's no longer true.
      
      Fixes: 7672691a ("powerpc/powernv: Provide a way to force a core into SMT4 mode")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  17. 31 Mar, 2018 1 commit
  18. 30 Mar, 2018 1 commit
  19. 23 Mar, 2018 1 commit
    • powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a
      Paul Mackerras authored
      POWER9 processors up to and including "Nimbus" v2.2 have hardware
      bugs relating to transactional memory and thread reconfiguration.
      One of these bugs has a workaround which is to get the core into
      SMT4 state temporarily.  This workaround is only needed when
      running bare-metal.
      
      This patch provides a function which gets the core into SMT4 mode
      by preventing threads from going to a stop state, and waking up
      those which are already in a stop state.  Once at least 3 threads
      are not in a stop state, the core will be in SMT4 and we can
      continue.
      
      To do this, we add a "dont_stop" flag to the paca to tell the
      thread not to go into a stop state.  If this flag is set,
      power9_idle_stop() just returns immediately with a return value
      of 0.  The pnv_power9_force_smt4_catch() function does the following:
      
      1. Set the dont_stop flag for each thread in the core, except
         ourselves (in fact we use an atomic_inc() in case more than
         one thread is calling this function concurrently).
      2. See how many threads are awake, indicated by their
         requested_psscr field in the paca being 0.  If this is at
         least 3, skip to step 5.
      3. Send a doorbell interrupt to each thread that was seen as
         being in a stop state in step 2.
      4. Until at least 3 threads are awake, scan the threads to which
         we sent a doorbell interrupt and check if they are awake now.
      
      This relies on the following properties:
      
      - Once dont_stop is non-zero, requested_psscr can't go from zero to
        non-zero, except transiently (and without the thread doing stop).
      - requested_psscr being zero guarantees that the thread isn't in
        a state-losing stop state where thread reconfiguration could occur.
      - Doing stop with a PSSCR value of 0 won't be a state-losing stop
        and thus won't allow thread reconfiguration.
      - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
        must be in SMT4 mode, since SMT modes are powers of 2.
      
      This does add a sync to power9_idle_stop(), which is necessary to
      provide the correct ordering between setting requested_psscr and
      checking dont_stop.  The overhead of the sync should be unnoticeable
      compared to the latency of going into and out of a stop state.
      
      Because some objected to incurring this extra latency on systems where
      the XER[SO] bug is not relevant, I have put the test in
      power9_idle_stop inside a feature section.  This means that
      pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
      without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
      probably hang the system.
      
      In order to cater for uses where the caller has an operation that
      has to be done while the core is in SMT4, the core continues to be
      kept in SMT4 after pnv_power9_force_smt4_catch() function returns,
      until the pnv_power9_force_smt4_release() function is called.
      It undoes the effect of step 1 above and allows the other threads
      to go into a stop state.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
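The catch/release steps listed in this commit message can be modelled in a few lines. This is a single-threaded simulation under loud assumptions, not the kernel implementation: the struct, the function names, and the "doorbell" that wakes its target instantly are all hypothetical, and the real code needs atomics and memory barriers (the sync mentioned above) that this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the per-thread paca fields named in the text. */
struct example_thread {
    int dont_stop;                 /* non-zero: thread must not enter stop */
    unsigned long requested_psscr; /* 0 means the thread is awake */
};

/* threads_per_core/2 + 1 awake threads imply the core is in SMT4. */
static size_t example_smt4_threshold(size_t threads_per_core)
{
    return threads_per_core / 2 + 1;
}

/* Count awake threads; the caller (self) is awake by definition. */
static size_t example_count_awake(const struct example_thread *core,
                                  size_t nthreads, size_t self)
{
    size_t i, awake = 1;

    for (i = 0; i < nthreads; i++)
        if (i != self && core[i].requested_psscr == 0)
            awake++;
    return awake;
}

/* Simulated doorbell: in this model the target wakes up immediately. */
static void example_send_doorbell(struct example_thread *t)
{
    t->requested_psscr = 0;
}

/* Returns the number of awake threads once the core is in SMT4. */
static size_t example_force_smt4_catch(struct example_thread *core,
                                       size_t nthreads, size_t self)
{
    size_t i;

    assert(self < nthreads);

    /* Step 1: forbid every other thread from entering a stop state. */
    for (i = 0; i < nthreads; i++)
        if (i != self)
            core[i].dont_stop++;

    /* Step 2: enough threads awake already? Then nothing more to do. */
    if (example_count_awake(core, nthreads, self) >= example_smt4_threshold(nthreads))
        return example_count_awake(core, nthreads, self);

    /* Step 3: doorbell every thread seen in a stop state. */
    for (i = 0; i < nthreads; i++)
        if (i != self && core[i].requested_psscr != 0)
            example_send_doorbell(&core[i]);

    /* Step 4: re-scan (the real code polls until the threshold is met). */
    return example_count_awake(core, nthreads, self);
}

/* Undo step 1 so the other threads may stop again. */
static void example_force_smt4_release(struct example_thread *core,
                                       size_t nthreads, size_t self)
{
    size_t i;

    for (i = 0; i < nthreads; i++)
        if (i != self)
            core[i].dont_stop--;
}
```

Using a counter rather than a boolean for `dont_stop` reflects the note in step 1 that several threads may call the catch function concurrently.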
  20. 20 Sep, 2017 1 commit
  21. 08 Aug, 2017 1 commit
    • powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails · 785a12af
      Gautham R. Shenoy authored
      Currently, we use the opal call opal_slw_set_reg() to inform the
      Sleep-Winkle Engine (SLW) to restore the contents of some of the
      Hypervisor state on wakeup from deep idle states that lose full
      hypervisor context (characterized by the flag
      OPAL_PM_LOSE_FULL_CONTEXT).
      
      However, the current code has a bug in that if opal_slw_set_reg()
      fails, we don't disable the use of these deep states (winkle on
      POWER8, stop4 onwards on POWER9).
      
      This patch fixes this bug by ensuring that if programming the
      sleep-winkle engine to restore the hypervisor states in
      pnv_save_sprs_for_deep_states() fails, then we exclude such states by
      clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from
      supported_cpuidle_states. As a result POWER8 will be prevented from
      using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to
      the default stop state when available.
      
      Further, we ensure in the initialization of the cpuidle-powernv driver
      to only include those states whose flags are present in
      supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT
      states when they have been disabled due to stop-api failure.
      
      Fixes: 1e1601b3 ("powerpc/powernv/idle: Restore SPRs for deep idle
      states via stop API.")
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
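The fix has two halves: clear the lose-full-context flag from the supported set when the stop-api call fails, and have the cpuidle driver skip any state whose flags are not all supported. A minimal sketch of that flag logic follows, with placeholder flag values and hypothetical function names (the real constants live in the OPAL headers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag values, for illustration only. */
#define EXAMPLE_PM_STOP_SHALLOW      UINT32_C(0x0001)
#define EXAMPLE_PM_LOSE_FULL_CONTEXT UINT32_C(0x4000)

/* Called when programming the stop-api fails: drop the deep states. */
static uint32_t example_disable_full_context(uint32_t supported)
{
    return supported & ~EXAMPLE_PM_LOSE_FULL_CONTEXT;
}

/*
 * Write the indices of usable states to out (which must hold n entries)
 * and return how many there are. A state is usable only if every one of
 * its flags is present in the supported set.
 */
static size_t example_usable_states(const uint32_t *flags, size_t n,
                                    uint32_t supported, size_t *out)
{
    size_t i, count = 0;

    assert(flags != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if ((flags[i] & supported) != flags[i])
            continue; /* needs a capability that was disabled */
        out[count++] = i;
    }
    return count;
}
```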
  22. 01 Aug, 2017 1 commit
    • powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug · 24be85a2
      Gautham R. Shenoy authored
      Currently we use the stop-api provided by the firmware to program the
      SLW engine to restore the values of hypervisor resources that get lost
      on deeper idle states (such as winkle). Since the deep states were
      only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
      to have the PECE1 bit cleared, since hotplugged CPUs shouldn't be
      spuriously woken up by the decrementer.
      
      On POWER9, some of the deep platform idle states such as stop4 can be
      used in cpuidle as well. In this case, we want the CPU in stop4 to be
      woken up by the decrementer when some timer on the CPU expires.
      
      In this patch, we program the stop-api for LPCR with PECE1
      bit cleared only when we are offlining the CPU and set it
      back once the CPU is online.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
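The policy in this commit message is that the LPCR value handed to the stop-api has PECE1 cleared only while a CPU is offline and set again once it is back online. The sketch below models that with a global standing in for the value the SLW engine would restore. The mask `EXAMPLE_LPCR_PECE1`, the call counter, and the function names are assumptions for illustration; the real code goes through firmware calls.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask standing in for the LPCR decrementer-wakeup enable bit. */
#define EXAMPLE_LPCR_PECE1 (UINT64_C(1) << 13)

/* LPCR value the (simulated) SLW engine would restore on wakeup. */
static uint64_t example_slw_lpcr;
/* Number of simulated stop-api calls, to make their cost visible. */
static unsigned int example_stop_api_calls;

static void example_program_slw_lpcr(uint64_t lpcr)
{
    example_slw_lpcr = lpcr;
    example_stop_api_calls++;
}

/* Offlining: the decrementer must not wake the CPU, so clear PECE1. */
static void example_cpu_going_offline(uint64_t lpcr)
{
    example_program_slw_lpcr(lpcr & ~EXAMPLE_LPCR_PECE1);
    assert((example_slw_lpcr & EXAMPLE_LPCR_PECE1) == 0);
}

/* Onlining: deep cpuidle states need timer wakeups, so set PECE1 again. */
static void example_cpu_coming_online(uint64_t lpcr)
{
    example_program_slw_lpcr(lpcr | EXAMPLE_LPCR_PECE1);
}
```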
  23. 19 Jun, 2017 4 commits
  24. 30 May, 2017 3 commits
  25. 23 Apr, 2017 1 commit
    • powerpc/64s: Stop using bit in HSPRG0 to test winkle · 544686ca
      Nicholas Piggin authored
      The POWER8 idle code has a neat trick of programming the power on engine
      to restore a low bit into HSPRG0, so idle wakeup code can test and see
      if it has been programmed this way and therefore lost all state. Restore
      time can be reduced if winkle has not been reached.
      
      However this messes with our r13 PACA pointer, and requires HSPRG0 to be
      written to. It also optimizes the slowest and most uncommon case at the
      expense of another SPR write in the common nap state wakeup.
      
      Remove this complexity and assume winkle sleeps always require a state
      restore. This speedup could be made entirely contained within the winkle
      idle code by counting per-core winkles and setting a thread bitmap when
      all have gone to winkle.
      Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  26. 11 Apr, 2017 4 commits
    • powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1 · 17ed4c8f
      Gautham R. Shenoy authored
      POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up
      from stop 0,1,2 with ESL=1 can end up being misplaced in the core. Thus
      the HSPRG0 of a thread waking up from stop can contain the paca pointer
      of its sibling.
      
      This patch implements a context recovery framework within threads of a
      core, by provisioning space in paca_struct for saving every sibling
      thread's paca pointer. Basically, we should be able to arrive at the
      right paca pointer from any of the threads' existing paca pointers.
      
      At bootup, during powernv idle-init, we save the paca address of every
      CPU in each of its siblings' paca_struct, in the slot corresponding to
      this CPU's index in the core.
      
      On wakeup from a stop, the thread will determine its index in the core
      from the TIR register and recover its PACA pointer by indexing into
      the correct slot in the provisioned space in the current PACA.
      
      Furthermore, ensure that the NVGPRs are restored from the stack on the
      way out by setting the NAPSTATELOST in paca.
      
      [Changelog written with inputs from svaidy@linux.vnet.ibm.com]
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      [mpe: Call it a bug]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
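The recovery scheme described above is a per-core lookup table that is replicated in every thread's paca, so that any sibling's paca leads to the right one. A minimal sketch, assuming a fixed four threads per core and hypothetical struct and function names, with the thread index passed in rather than read from the TIR register:

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_THREADS_PER_CORE 4 /* assumption for this sketch */

/* Cut-down stand-in for paca_struct. */
struct example_paca {
    int cpu_id;
    /* Slot j holds the paca of the thread with index j in this core. */
    struct example_paca *sibling_pacas[EXAMPLE_THREADS_PER_CORE];
};

/* Idle-init: record every thread's paca in each sibling's table. */
static void example_init_sibling_pacas(struct example_paca *core)
{
    size_t i, j;

    for (i = 0; i < EXAMPLE_THREADS_PER_CORE; i++) {
        core[i].cpu_id = (int)i;
        for (j = 0; j < EXAMPLE_THREADS_PER_CORE; j++)
            core[i].sibling_pacas[j] = &core[j];
    }
}

/*
 * Wakeup: current may be any sibling's paca (the hardware bug), and
 * thread_index is this thread's index in the core. Indexing the table
 * yields the correct paca either way.
 */
static struct example_paca *example_recover_paca(struct example_paca *current,
                                                 size_t thread_index)
{
    assert(thread_index < EXAMPLE_THREADS_PER_CORE);
    return current->sibling_pacas[thread_index];
}
```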
    • powerpc/powernv/idle: Don't override default/deepest directly in kernel · f3b3f284
      Gautham R. Shenoy authored
      Currently during idle-init on power9, if we don't find suitable stop
      states in the device tree that can be used as the
      default_stop/deepest_stop, we set stop0 (ESL=1,EC=1) as the default
      stop state psscr to be used by power9_idle and deepest stop state
      which is used by CPU-Hotplug.
      
      However, if the platform firmware has not configured or enabled a stop
      state, the kernel should not make any assumptions and fall back to a
      default choice.
      
      If the kernel uses a stop state that is not configured by the platform
      firmware, it may lead to further failures which should be avoided.
      
      In this patch, we modify the init code to ensure that the kernel uses
      only the stop states exposed by the firmware through the device
      tree. When a suitable default stop state isn't found, we disable
      ppc_md.power_save for power9. Similarly, when a suitable
      deepest_stop_state is not found in the device tree exported by the
      firmware, fall back to the default busy-wait loop in the CPU-Hotplug
      code.
      
      [Changelog written with inputs from svaidy@linux.vnet.ibm.com]
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
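The init-time decision described here can be sketched as a scan over the firmware-provided states that never invents a state of its own. Everything below is hypothetical: the struct fields, the two "usable" predicates, and the assumption that states are listed from shallowest to deepest. It only illustrates the rule that a missing default disables power-save and a missing deepest state selects the busy-wait fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One stop state as exposed by firmware (fields are illustrative). */
struct example_stop_state {
    uint64_t psscr_val;
    bool usable_as_default; /* suitable for the idle loop */
    bool usable_as_deepest; /* suitable for CPU-Hotplug */
};

enum example_offline_method {
    EXAMPLE_OFFLINE_STOP,
    EXAMPLE_OFFLINE_BUSY_WAIT,
};

struct example_idle_setup {
    bool power_save_enabled; /* false: power_save hook left disabled */
    uint64_t default_psscr;
    enum example_offline_method offline_method;
    uint64_t deepest_psscr;
};

/* Assumes states[] is ordered from shallowest to deepest. */
static struct example_idle_setup
example_idle_init(const struct example_stop_state *states, size_t n)
{
    struct example_idle_setup setup = {
        .power_save_enabled = false,
        .default_psscr = 0,
        .offline_method = EXAMPLE_OFFLINE_BUSY_WAIT,
        .deepest_psscr = 0,
    };
    size_t i;

    assert(states != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* First suitable state becomes the default idle state. */
        if (states[i].usable_as_default && !setup.power_save_enabled) {
            setup.power_save_enabled = true;
            setup.default_psscr = states[i].psscr_val;
        }
        /* Last suitable state is the deepest one for hotplug. */
        if (states[i].usable_as_deepest) {
            setup.offline_method = EXAMPLE_OFFLINE_STOP;
            setup.deepest_psscr = states[i].psscr_val;
        }
    }
    return setup;
}
```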
    • powerpc/powernv/smp: Add busy-wait loop as fall back for CPU-Hotplug · 90061231
      Gautham R. Shenoy authored
      Currently, the powernv cpu-offline function assumes that platform idle
      states such as stop on POWER9, winkle/sleep/nap on POWER8 are always
      available. On POWER8, it picks nap as the default state if other deep
      idle states like sleep/winkle are not available and enabled in the
      platform.
      
      On POWER9, nap is not available and all idle states are managed by the
      STOP instruction.  The parameters to the idle state are passed through
      the processor stop status control register (PSSCR), so executing STOP
      takes its parameters from the current PSSCR. We do not want to make
      any assumptions in the kernel about which STOP states and PSSCR
      features are configured by the platform.
      
      Ideally the platform will configure a good set of stop states that can
      be used in the kernel.  We would like to start with a clean slate if
      the platform chooses not to configure any state, or if an error in
      platform firmware leads to no stop states being configured or allowed
      to be requested.
      
      This patch adds a fallback method for CPU-Hotplug that is similar to
      snooze loop at idle where the threads are left to spin at low priority
      and hence reduce the cycles consumed.
      
      This is a safe fallback mechanism for the case where no stop state can
      be requested because the platform firmware did not configure any, most
      likely due to an error condition.
      
      Requesting a stop state when the platform has not configured them or
      enabled them would lead to further error conditions which could be
      difficult to debug.
      
      [Changelog written with inputs from svaidy@linux.vnet.ibm.com]
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv: Move CPU-Offline idle state invocation from smp.c to idle.c · a7cd88da
      Gautham R. Shenoy authored
      Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
      transitions the CPU to the deepest available platform idle state to a
      new function named pnv_cpu_offline() in powernv/idle.c. The rationale
      behind this code movement is that the data required to determine the
      deepest available platform state resides in powernv/idle.c.
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  27. 31 Jan, 2017 2 commits
    • powernv: Pass PSSCR value and mask to power9_idle_stop · 09206b60
      Gautham R. Shenoy authored
      The power9_idle_stop method currently takes only the requested stop
      level as a parameter and picks up the rest of the PSSCR bits from a
      hand-coded macro. This is not a very flexible design, especially when
      the firmware has the capability to communicate the psscr value and the
      mask associated with a particular stop state via device tree.
      
      This patch modifies the power9_idle_stop API to take as parameters the
      PSSCR value and the PSSCR mask corresponding to the stop state that
      needs to be set. These PSSCR value and mask are respectively obtained
      by parsing the "ibm,cpu-idle-state-psscr" and
      "ibm,cpu-idle-state-psscr-mask" fields from the device tree.
      
      In addition to this, the patch adds support for handling stop states
      for which ESL and EC bits in the PSSCR are zero. As per the
      architecture, a wakeup from these stop states resumes execution from
      the subsequent instruction as opposed to waking up at the System
      Vector.
      
      Older firmware sets only the Requested Level (RL) field in the psscr
      and psscr-mask exposed in the device tree. For older firmware where
      psscr-mask=0xf, this patch sets sane default values for the remaining
      PSSCR fields (i.e. PSLL, MTL, ESL, EC, and TR). For new firmware, the
      patch validates that the invariants required by the ISA for the psscr
      values are maintained by the firmware.
      
      The skiboot patch that exports fully populated PSSCR values and the
      mask for all the stop states can be found here:
      https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
      
      [Optimize the number of instructions before entering STOP with
      ESL=EC=0, validate that the PSSCR values provided by the firmware
      maintain the invariants required by the ISA, as suggested by Balbir
      Singh]
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
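Passing a value and a mask lets the caller replace only the PSSCR fields the firmware describes and leave the rest untouched. The sketch below shows that merge on plain integers; the function name is hypothetical, and `EXAMPLE_PSSCR_RL_MASK` uses the 0xf mask that the commit message attributes to older firmware.

```c
#include <assert.h>
#include <stdint.h>

/* Requested Level field only, as exposed by older firmware (psscr-mask=0xf). */
#define EXAMPLE_PSSCR_RL_MASK UINT64_C(0xf)

/*
 * Merge a stop state's value into the current PSSCR: bits covered by
 * mask come from val, all other bits keep their current contents.
 */
static uint64_t example_psscr_apply(uint64_t current, uint64_t val, uint64_t mask)
{
    /* Value bits outside the mask would leak into unrelated fields. */
    assert((val & ~mask) == 0);
    return (current & ~mask) | val;
}
```

For example, applying a requested level of 2 with `EXAMPLE_PSSCR_RL_MASK` changes only the low four bits and preserves every other field of the current value.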
    • powernv:stop: Rename pnv_arch300_idle_init to pnv_power9_idle_init · dd34c74c
      Gautham R. Shenoy authored
      Balbir pointed out that the name of the function pnv_arch300_idle_init
      was inconsistent with the names of the variables and functions
      pertaining to POWER9 features in book3s_idle.S.
      
      This patch renames pnv_arch300_idle_init to pnv_power9_idle_init.
      
      This patch does not change any behaviour.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  28. 15 Jul, 2016 1 commit