1. 17 May 2019, 1 commit
  2. 30 Jul 2018, 2 commits
  3. 18 Jul 2018, 1 commit
  4. 16 Jul 2018, 1 commit
  5. 19 Apr 2018, 1 commit
    • powerpc/kvm: Fix lockups when running KVM guests on Power8 · 56376c58
      Committed by Michael Ellerman
      When running KVM guests on Power8 we can see a lockup where one CPU
      stops responding. This often leads to a message such as:
      
        watchdog: CPU 136 detected hard LOCKUP on other CPUs 72
        Task dump for CPU 72:
        qemu-system-ppc R  running task    10560 20917  20908 0x00040004
      
      And then backtraces on other CPUs, such as:
      
        Task dump for CPU 48:
        ksmd            R  running task    10032  1519      2 0x00000804
        Call Trace:
          ...
          --- interrupt: 901 at smp_call_function_many+0x3c8/0x460
              LR = smp_call_function_many+0x37c/0x460
          pmdp_invalidate+0x100/0x1b0
          __split_huge_pmd+0x52c/0xdb0
          try_to_unmap_one+0x764/0x8b0
          rmap_walk_anon+0x15c/0x370
          try_to_unmap+0xb4/0x170
          split_huge_page_to_list+0x148/0xa30
          try_to_merge_one_page+0xc8/0x990
          try_to_merge_with_ksm_page+0x74/0xf0
          ksm_scan_thread+0x10ec/0x1ac0
          kthread+0x160/0x1a0
          ret_from_kernel_thread+0x5c/0x78
      
      This is caused by commit 8c1c7fb0 ("powerpc/64s/idle: avoid sync
      for KVM state when waking from idle"), which added a check in
      pnv_powersave_wakeup() to see if the kvm_hstate.hwthread_state is
      already set to KVM_HWTHREAD_IN_KERNEL, and if so to skip the store and
      test of kvm_hstate.hwthread_req.
      
      The problem is that the primary does not set KVM_HWTHREAD_IN_KVM when
      entering the guest, so it can then come out to cede with
      KVM_HWTHREAD_IN_KERNEL set. It can then go idle in kvm_do_nap after
      setting hwthread_req to 1, but because hwthread_state is still
      KVM_HWTHREAD_IN_KERNEL we will skip the test of hwthread_req when we
      wake up from idle and won't go to kvm_start_guest. From there the
      thread will return to a garbage address and crash.
      
      Fix it by skipping the store of hwthread_state, but not the test of
      hwthread_req, when coming out of idle. It's OK to skip the sync in
      that case because hwthread_req will have been set on the same thread,
      so there is no synchronisation required.
      
      Fixes: 8c1c7fb0 ("powerpc/64s/idle: avoid sync for KVM state when waking from idle")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  6. 05 Apr 2018, 2 commits
    • powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep · c1b25a17
      Committed by Nicholas Piggin
      POWER8 restores AMOR when waking from deep sleep, but POWER9 does not,
      because it does not go through the subcore restore.
      
      Have POWER9 restore it in core restore.
      
      Fixes: ee97b6b9 ("powerpc/mm/radix: Setup AMOR in HV mode to allow key 0")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • Revert "powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead" · a67cc594
      Committed by Michael Ellerman
      As described in that commit:
      
        When stop is executed with EC=ESL=0, it appears to execute like a
        normal instruction (resuming from NIP when woken by interrupt). So
        all the save/restore handling can be avoided completely.
      
      This is true, except in the case of an NMI interrupt (sreset or
      machine check) interrupting the instruction. In that case, the NMI
      gets an "interrupt occurred while the processor was in power-saving
      mode" indication. The power-save wakeup code uses that bit to decide
      whether to restore some registers (e.g., LR). Because these are no
      longer saved, this causes random register corruption.
      
      It may be possible to restore this optimisation by detecting the case
      of no register loss on the wakeup side, and avoid restoring in that
      case, but that's not a minor fix because the wakeup code itself uses
      some registers that would be live (e.g., LR).
      
      Fixes: b9ee31e1 ("powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  7. 04 Apr 2018, 2 commits
    • powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead · b9ee31e1
      Committed by Nicholas Piggin
      When stop is executed with EC=ESL=0, it appears to execute like a
      normal instruction (resuming from NIP when woken by interrupt). So all
      the save/restore handling can be avoided completely. In particular NV
      GPRs do not have to be saved, and MSR does not have to be switched
      back to kernel MSR.
      
      So move the test for EC=ESL=0 sleep states out to power9_idle_stop,
      and return directly to the caller after stop in that case.
      
      This improves performance for ping-pong benchmark with the stop0_lite
      idle state by 2.54% for 2 threads in the same core, and 2.57% for
      different cores. Performance increase with HV_POSSIBLE defined will be
      improved further by avoiding the hwsync.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop() · d0b791c0
      Committed by Michael Ellerman
      Commit 3d4fbffd ("powerpc/64s/idle: POWER9 implement a separate
      idle stop function for hotplug") that added power9_offline_stop() was
      written before commit 7672691a ("powerpc/powernv: Provide a way to
      force a core into SMT4 mode").
      
      When merging the former I failed to notice that it caused us to skip
      the force-SMT4 logic for offline CPUs. The result is that offlined
      CPUs will not correctly participate in the force-SMT4 logic, which
      presumably will result in badness (not tested).
      
      Reconcile the two commits by making power9_offline_stop() a pre-cursor
      to power9_idle_stop(), so that they share the force-SMT4 logic.
      
      This is based on an original commit from Nick, all breakage is my own.
      
      Fixes: 3d4fbffd ("powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
  8. 03 Apr 2018, 1 commit
    • powerpc/powernv: Fix SMT4 forcing idle code · a2b5e056
      Committed by Nicholas Piggin
      The PSSCR value is not stored to PACA_REQ_PSSCR if the CPU does not
      have the XER[SO] bug.
      
      Fix this by storing up-front, outside the workaround code. The initial
      test is not required because it is a slow path.
      
      The workaround is made to depend on CONFIG_KVM_BOOK3S_HV_POSSIBLE, to
      match pnv_power9_force_smt4_catch() where it is used. Drop the comment
      on pnv_power9_force_smt4_catch() as it's no longer true.
      
      Fixes: 7672691a ("powerpc/powernv: Provide a way to force a core into SMT4 mode")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  9. 31 Mar 2018, 2 commits
  10. 23 Mar 2018, 1 commit
    • powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a
      Committed by Paul Mackerras
      POWER9 processors up to and including "Nimbus" v2.2 have hardware
      bugs relating to transactional memory and thread reconfiguration.
      One of these bugs has a workaround which is to get the core into
      SMT4 state temporarily.  This workaround is only needed when
      running bare-metal.
      
      This patch provides a function which gets the core into SMT4 mode
      by preventing threads from going to a stop state, and waking up
      those which are already in a stop state.  Once at least 3 threads
      are not in a stop state, the core will be in SMT4 and we can
      continue.
      
      To do this, we add a "dont_stop" flag to the paca to tell the
      thread not to go into a stop state.  If this flag is set,
      power9_idle_stop() just returns immediately with a return value
      of 0.  The pnv_power9_force_smt4_catch() function does the following:
      
      1. Set the dont_stop flag for each thread in the core, except
         ourselves (in fact we use an atomic_inc() in case more than
         one thread is calling this function concurrently).
      2. See how many threads are awake, indicated by their
         requested_psscr field in the paca being 0.  If this is at
         least 3, skip to step 5.
      3. Send a doorbell interrupt to each thread that was seen as
         being in a stop state in step 2.
      4. Until at least 3 threads are awake, scan the threads to which
         we sent a doorbell interrupt and check if they are awake now.
      
      This relies on the following properties:
      
      - Once dont_stop is non-zero, requested_psccr can't go from zero to
        non-zero, except transiently (and without the thread doing stop).
      - requested_psscr being zero guarantees that the thread isn't in
        a state-losing stop state where thread reconfiguration could occur.
      - Doing stop with a PSSCR value of 0 won't be a state-losing stop
        and thus won't allow thread reconfiguration.
      - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
        must be in SMT4 mode, since SMT modes are powers of 2.
      
      This does add a sync to power9_idle_stop(), which is necessary to
      provide the correct ordering between setting requested_psscr and
      checking dont_stop.  The overhead of the sync should be unnoticeable
      compared to the latency of going into and out of a stop state.
      
      Because some objected to incurring this extra latency on systems where
      the XER[SO] bug is not relevant, I have put the test in
      power9_idle_stop inside a feature section.  This means that
      pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
      without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
      probably hang the system.
      
      In order to cater for uses where the caller has an operation that
      has to be done while the core is in SMT4, the core continues to be
      kept in SMT4 after pnv_power9_force_smt4_catch() function returns,
      until the pnv_power9_force_smt4_release() function is called.
      It undoes the effect of step 1 above and allows the other threads
      to go into a stop state.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  11. 15 Nov 2017, 1 commit
    • powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature · 3ffa9d9e
      Committed by Michael Ellerman
      Recently we added a CPU feature for Power9 DD2.0, to capture the fact
      that some workarounds are required only on Power9 DD1 and DD2.0 but
      not DD2.1 or later.
      
      Then in commit 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and
      DD2.0 ERAT workaround on DD2.1") and commit e3646330
      ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on
      DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg:
      
        BEGIN_FTR_SECTION
                PPC_INVALIDATE_ERAT
        END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20)
      
      Unfortunately although this reads as "if set DD1 or DD2.0", the or is
      a bitwise or and actually generates a mask of both bits. The code that
      does the feature patching then checks that the value of the CPU
      features masked with that mask are equal to the mask.
      
      So the end result is we're checking for DD1 and DD20 being set, which
      never happens. Yes the API is terrible.
      
      Removing the ERAT workaround on DD2.0 results in random SEGVs, the
      system tends to boot, but things randomly die including sometimes
      dhclient, udev etc.
      
      To fix the problem and hopefully avoid it in future, we remove the
      DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This
      allows us to easily express that the workarounds are required if DD2.1
      is not set.
      
      At some point we will drop the DD1 workarounds entirely and some of
      this can be cleaned up.
      
      Fixes: 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1")
      Fixes: e3646330 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  12. 06 Nov 2017, 2 commits
  13. 19 Oct 2017, 1 commit
  14. 29 Aug 2017, 5 commits
  15. 04 Aug 2017, 1 commit
  16. 01 Aug 2017, 1 commit
    • powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle · e1c1cfed
      Committed by Gautham R. Shenoy
      The stop4 idle state on POWER9 is a deep idle state which loses
      hypervisor resources, but whose latency is low enough that it can be
      exposed via cpuidle.
      
      Until now, the deep idle states which lose hypervisor resources
      (eg: winkle) were only exposed via CPU-Hotplug. Hence, currently on
      wakeup from such states, barring a few SPRs which need to be
      restored to their older values, the rest of the SPRs are
      reinitialized to the values they had at boot time.
      
      When stop4 is used in the context of cpuidle, we want these additional
      SPRs to be restored to their older value, to ensure that the context
      on the CPU coming back from idle is same as it was before going idle.
      
      In this patch, we define a SPR save area in PACA (since we have used
      up the volatile register space in the stack) and on POWER9, we restore
      SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1,
      SPRN_MMCR2 to the values they had before entering stop.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  17. 18 Jul 2017, 1 commit
  18. 28 Jun 2017, 1 commit
    • powerpc/powernv/idle: Clear r12 on wakeup from stop lite · 4d0d7c02
      Committed by Akshay Adiga
      pnv_wakeup_noloss() expects r12 to contain the SRR1 value, to determine if the wakeup
      reason is an HMI in CHECK_HMI_INTERRUPT.
      
      When we wakeup with ESL=0, SRR1 will not contain the wakeup reason, so there is
      no point setting r12 to SRR1.
      
      However, we don't set r12 at all so r12 contains garbage (likely a kernel
      pointer), and is still used to check HMI assuming that it contained SRR1. This
      causes the OPAL msglog to be filled with the following print:
      
        HMI: Received HMI interrupt: HMER = 0x0040000000000000
      
      This patch clears r12 after waking up from stop with ESL=EC=0, so that we don't
      accidentally enter the HMI handler in pnv_wakeup_noloss() if the value of
      r12[42:45] corresponds to HMI as wakeup reason.
      
      Prior to commit 9d292501 ("powerpc/64s/idle: Avoid SRR usage in idle
      sleep/wake paths") this bug existed, in that we would incorrectly look at SRR1
      to check for an HMI when SRR1 didn't contain a wakeup reason. However the SRR1
      value would just happen to never have bits 42:45 set.
      
      Fixes: 9d292501 ("powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths")
      Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      [mpe: Change log and comment massaging]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  19. 27 Jun 2017, 1 commit
  20. 19 Jun 2017, 3 commits
  21. 30 May 2017, 3 commits
    • powerpc/powernv/idle: Use Requested Level for restoring state on P9 DD1 · 22c6663d
      Committed by Gautham R. Shenoy
      On Power9 DD1 due to a hardware bug the Power-Saving Level Status
      field (PLS) of the PSSCR for a thread waking up from a deep state can
      under-report if some other thread in the core is in a shallow stop
      state. The scenario in which this can manifest is as follows:
      
         1) All the threads of the core are in deep stop.
         2) One of the threads is woken up. The PLS for this thread will
            correctly reflect that it is waking up from deep stop.
         3) The thread that has woken up now executes a shallow stop.
         4) When some other thread in the core is woken, its PLS will reflect
            the shallow stop state.
      
      Thus, the subsequent thread for which the PLS is under-reporting the
      wakeup state will not restore the hypervisor resources.
      
      Hence, on DD1 systems, use the Requested Level (RL) field as a
      workaround to restore the contents of the hypervisor resources on the
      wakeup from the stop state.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv/idle: Restore LPCR on wakeup from deep-stop · cb0be7ec
      Committed by Gautham R. Shenoy
      On wakeup from a deep stop state which is supposed to lose the
      hypervisor state, we don't restore the LPCR to the old value but set
      it to a "sane" value via cur_cpu_spec->cpu_restore().
      
      The problem is that the "sane" value doesn't include UPRT and the HR
      bits which are required to run correctly in Radix mode.
      
      Fix this on POWER9 onwards by restoring the LPCR to whatever value
      it had before executing the stop instruction.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv/idle: Decouple Timebase restore & Per-core SPRs restore · ec486735
      Committed by Gautham R. Shenoy
      On POWER8, in the case of:
         -  nap: both the timebase and hypervisor state are retained.
         -  fast-sleep: the timebase is lost, but the hypervisor state is
            retained.
         -  winkle: both the timebase and hypervisor state are lost.
      
      Hence, the current code for handling exit from an idle state
      assumes that if the timebase value is retained, then so is the
      hypervisor state. Thus, the current code doesn't restore per-core
      hypervisor state in such cases.
      
      But that is no longer the case on POWER9 where we do have stop states
      in which timebase value is retained, but the hypervisor state is
      lost. So we have to ensure that the per-core hypervisor state gets
      restored in such cases.
      
      Fix this by ensuring that even in the case when timebase is retained,
      we explicitly check if we are waking up from a deep stop that loses
      per-core hypervisor state (indicated by cr4 being eq or gt), and if
      this is the case, we restore the per-core hypervisor state.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  22. 16 May 2017, 1 commit
  23. 23 Apr 2017, 5 commits