1. 17 Sep, 2020 1 commit
  2. 26 Aug, 2020 3 commits
  3. 30 Jul, 2020 9 commits
    • cpuidle: pseries: Fixup exit latency for CEDE(0) · d947fb4c
      Gautham R. Shenoy committed
      We are currently assuming that CEDE(0) has exit latency 10us, since
      there is no way for us to query from the platform. However, if the
      wakeup latency of an Extended CEDE state is smaller than 10us, then we
      can be sure that the exit latency of CEDE(0) cannot be more than that.
      
      In this patch, we fix the exit latency of CEDE(0) if we discover an
      Extended CEDE state with wakeup latency smaller than 10us.
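The fixup described above amounts to taking a minimum over the advertised latencies. The following is a minimal user-space sketch, not the kernel's code. It assumes a timebase of 512 ticks per microsecond, and the names `tb_ticks_to_us` and `cede0_exit_latency_us` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 512 MHz timebase, i.e. 512 timebase ticks per microsecond. */
#define TB_TICKS_PER_US 512u
/* Default exit latency assumed for CEDE(0), as stated in the commit message. */
#define CEDE0_DEFAULT_EXIT_LATENCY_US 10u

/* Convert timebase ticks to microseconds, rounding up. */
static uint64_t tb_ticks_to_us(uint64_t ticks)
{
	return (ticks + TB_TICKS_PER_US - 1) / TB_TICKS_PER_US;
}

/*
 * Return the exit latency to use for CEDE(0): the 10us default, unless an
 * Extended CEDE state advertises a smaller wakeup latency, in which case
 * CEDE(0) is capped at that smaller value.
 */
static uint64_t cede0_exit_latency_us(const uint64_t *xcede_latency_ticks,
				      size_t nr_records)
{
	uint64_t latency_us = CEDE0_DEFAULT_EXIT_LATENCY_US;
	size_t i;

	assert(xcede_latency_ticks != NULL || nr_records == 0);

	for (i = 0; i < nr_records; i++) {
		uint64_t xcede_us = tb_ticks_to_us(xcede_latency_ticks[i]);

		if (xcede_us < latency_us)
			latency_us = xcede_us;
	}
	return latency_us;
}
```

Under that timebase assumption, a latency of 0x3c00 ticks converts to 30us and leaves the 10us default untouched, while a latency of 0x400 ticks converts to 2us and lowers the CEDE(0) value to 2us.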
      
      Benchmark results:
      
      On POWER8, this patch does not have any impact since the advertised
      latency of Extended CEDE (1) is 30us which is higher than the default
      latency of CEDE (0) which is 10us.
      
      On POWER9 we see an improvement in the single-threaded performance of
      ebizzy, and no regression in the wakeup latency or the number of
      context-switches.
      
      ebizzy:
      2 ebizzy threads bound to the same big-core. 25% improvement in the
      avg records/s with patch.
      
        x without_patch
        * with_patch
            N           Min           Max        Median           Avg        Stddev
        x  10       2491089       5834307       5398375       4244335     1596244.9
        *  10       2893813       5834474       5832448     5327281.3     1055941.4
      
      context_switch2:
      There is no major regression observed with this patch as seen from the
      context_switch2 benchmark.
      
      context_switch2 across CPU0 CPU1 (Both belong to same big-core, but
      different small cores). We observe a minor 0.14% regression in the
      number of context-switches (higher is better).
      
        x without_patch
        * with_patch
            N           Min           Max        Median           Avg        Stddev
        x 500        348872        362236        354712     354745.69      2711.827
        * 500        349422        361452        353942      354215.4     2576.9258
      
        Difference at 99.0% confidence
          -530.288 +/- 430.963
          -0.149484% +/- 0.121485%
          (Student's t, pooled s = 2645.24)
      
      context_switch2 across CPU0 CPU8 (Different big-cores). We observe a
      0.37% improvement in the number of context-switches (higher is
      better).
      
        x without_patch
        * with_patch
            N           Min           Max        Median           Avg        Stddev
        x 500        287956        294940        288896     288977.23     646.59295
        * 500        288300        294646        289582     290064.76     1161.9992
      
        Difference at 99.0% confidence
          1087.53 +/- 153.194
          0.376337% +/- 0.0530125%
          (Student's t, pooled s = 940.299)
      
      schbench:
      No major difference could be seen until the 99.9th percentile.
      
      Without-patch:
        Latency percentiles (usec)
              50.0th: 29
              75.0th: 39
              90.0th: 49
              95.0th: 59
              *99.0th: 13104
              99.5th: 14672
              99.9th: 15824
              min=0, max=17993
      
      With-patch:
        Latency percentiles (usec)
              50.0th: 29
              75.0th: 40
              90.0th: 50
              95.0th: 61
              *99.0th: 13648
              99.5th: 14768
              99.9th: 15664
              min=0, max=29812
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      [mpe: Minor formatting]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1596087177-30329-4-git-send-email-ego@linux.vnet.ibm.com
    • cpuidle: pseries: Add function to parse extended CEDE records · 054e44ba
      Gautham R. Shenoy committed
      Currently we use CEDE with latency-hint 0 as the only other idle state
      on a dedicated LPAR apart from the polling "snooze" state.
      
      The platform might support additional extended CEDE idle states, which
      can be discovered through the "ibm,get-system-parameter" rtas-call
      made with CEDE_LATENCY_TOKEN.
      
      This patch adds a function to obtain information about the extended
      CEDE idle states from the platform and parse the contents to populate
      an array of extended CEDE states. These idle states thus discovered
      will be added to the cpuidle framework in the next patch.
      
      The dmesg output on POWER8 and POWER9 LPARs, showing the result of
      parsing the extended CEDE latency parameters, is as follows:
      
      POWER8
      [   10.093279] xcede : xcede_record_size = 10
      [   10.093285] xcede : Record 0 : hint = 1, latency = 0x3c00 tb ticks, Wake-on-irq = 1
      [   10.093291] xcede : Record 1 : hint = 2, latency = 0x4e2000 tb ticks, Wake-on-irq = 0
      [   10.093297] cpuidle : Skipping the 2 Extended CEDE idle states
      
      POWER9
      [    5.913180] xcede : xcede_record_size = 10
      [    5.913183] xcede : Record 0 : hint = 1, latency = 0x400 tb ticks, Wake-on-irq = 1
      [    5.913188] xcede : Record 1 : hint = 2, latency = 0x3e8000 tb ticks, Wake-on-irq = 0
      [    5.913193] cpuidle : Skipping the 2 Extended CEDE idle states
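The parsing step can be illustrated with a small user-space sketch. The record layout used here is an assumption inferred from the 10-byte record size in the output above: a 1-byte hint, an 8-byte big-endian latency in timebase ticks, and a 1-byte wake-on-irq flag. The authoritative layout is defined by the platform specification, and the names `xcede_record`, `read_be64` and `parse_xcede_records` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Matches the xcede_record_size value printed in the dmesg output. */
#define XCEDE_RECORD_SIZE 10u

/* Decoded form of one record (assumed layout, see the lead-in). */
struct xcede_record {
	uint8_t  hint;
	uint64_t latency_ticks;
	uint8_t  wake_on_irq;
};

/* Read a big-endian 64-bit value from a possibly unaligned byte buffer. */
static uint64_t read_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Decode up to max_records records from buf and return how many were
 * decoded. Trailing bytes that do not form a whole record are ignored.
 */
static size_t parse_xcede_records(const uint8_t *buf, size_t len,
				  struct xcede_record *out, size_t max_records)
{
	size_t n = len / XCEDE_RECORD_SIZE;

	assert(buf != NULL && out != NULL);

	if (n > max_records)
		n = max_records;

	for (size_t i = 0; i < n; i++) {
		const uint8_t *rec = buf + i * XCEDE_RECORD_SIZE;

		out[i].hint = rec[0];
		out[i].latency_ticks = read_be64(rec + 1);
		out[i].wake_on_irq = rec[9];
	}
	return n;
}
```

Reading the latency byte by byte avoids any dependence on host endianness or alignment, which matters because the 8-byte field starts at an odd offset in this assumed layout.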
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      [mpe: Make space for 16 records, drop memset, minor cleanup & formatting]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1596087177-30329-3-git-send-email-ego@linux.vnet.ibm.com
    • cpuidle: pseries: Set the latency-hint before entering CEDE · 3af0ada7
      Gautham R. Shenoy committed
      As per the PAPR, each H_CEDE call is associated with a latency-hint to
      be passed in the VPA field "cede_latency_hint". The CEDE state that
      we have implicitly been entering so far is CEDE with latency-hint = 0.
      
      This patch explicitly sets the latency hint corresponding to the CEDE
      state that we are currently entering. While at it, we save the
      previous hint, to be restored once we wake up from CEDE. This will be
      required in the future when we expose extended-cede states through the
      cpuidle framework, where each of them will have a different
      cede-latency hint.
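The save, set and restore sequence described above can be sketched in user space as follows. This is an illustration only: `struct vpa` is a hypothetical stand-in that models just the "cede_latency_hint" field, and `fake_h_cede` is a hypothetical placeholder for the H_CEDE hypercall that merely records which hint was in effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VPA; only the field named in the text. */
struct vpa {
	uint8_t cede_latency_hint;
};

/* Hint that was in effect when the placeholder hypercall was last made. */
static uint8_t last_hint_seen_by_hcall;

/* Hypothetical placeholder for H_CEDE: records the hint and returns. */
static void fake_h_cede(const struct vpa *vpa)
{
	last_hint_seen_by_hcall = vpa->cede_latency_hint;
}

/*
 * Enter CEDE with the given latency hint, restoring the previous hint
 * once the (placeholder) hypercall returns.
 */
static void enter_cede_with_hint(struct vpa *vpa, uint8_t hint)
{
	uint8_t old_hint;

	assert(vpa != NULL);

	old_hint = vpa->cede_latency_hint;  /* save the previous hint */
	vpa->cede_latency_hint = hint;      /* set the hint for this state */
	fake_h_cede(vpa);                   /* "sleep" until wakeup */
	vpa->cede_latency_hint = old_hint;  /* restore on wakeup */
}
```

Restoring the old value keeps the hint unchanged for any other code path that cedes the processor without going through this function.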
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      [mpe: Make cede_latency_hint static]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1596087177-30329-2-git-send-email-ego@linux.vnet.ibm.com
    • cpuidle: change enter_s2idle() prototype · efe97112
      Neal Liu committed
      Control Flow Integrity (CFI) is a security mechanism that disallows
      changes to the original control flow graph of a compiled binary,
      making it significantly harder to hijack control flow.
      
      init_state_node() assigns the same function callback to function
      pointers with different declarations.
      
      static int init_state_node(struct cpuidle_state *idle_state,
                                 const struct of_device_id *matches,
                                 struct device_node *state_node)
      {
              ...
              idle_state->enter = match_id->data;
              ...
              idle_state->enter_s2idle = match_id->data;
      }
      
      Function declarations:
      
      struct cpuidle_state {
              ...
              int (*enter) (struct cpuidle_device *dev,
                            struct cpuidle_driver *drv,
                            int index);
      
              void (*enter_s2idle) (struct cpuidle_device *dev,
                                    struct cpuidle_driver *drv,
                                    int index);
      };
      
      In this case, calling either enter() or enter_s2idle() would fail the
      CFI check, since both point to the same callee but have different
      prototypes.
      
      Align the prototype of enter_s2idle() with that of enter(), since
      enter() needs a return value for some use cases. The return value of
      enter_s2idle() is currently not needed.
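The effect of the change can be sketched with simplified user-space types. Once both function pointers share one prototype, a single callback can be assigned to both without any cast, so an indirect call through either pointer matches the callee's type. The struct definitions below are reduced stand-ins for the kernel structures, and `demo_enter_idle` and `init_state` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures (illustration only). */
struct cpuidle_device { int cpu; };
struct cpuidle_driver { const char *name; };

/* After the change, both callbacks have the same int-returning prototype. */
struct cpuidle_state {
	int (*enter)(struct cpuidle_device *dev,
		     struct cpuidle_driver *drv,
		     int index);
	int (*enter_s2idle)(struct cpuidle_device *dev,
			    struct cpuidle_driver *drv,
			    int index);
};

/* Hypothetical idle callback: reports the state index it "entered". */
static int demo_enter_idle(struct cpuidle_device *dev,
			   struct cpuidle_driver *drv,
			   int index)
{
	(void)dev;
	(void)drv;
	return index;
}

/* Assign one callee to both pointers; the types match, so no cast is needed. */
static void init_state(struct cpuidle_state *state)
{
	assert(state != NULL);

	state->enter = demo_enter_idle;
	state->enter_s2idle = demo_enter_idle;
}
```

With the earlier void-returning enter_s2idle declaration, the second assignment would need a cast, and a type-checked indirect call through the mismatched pointer is the kind of call a CFI scheme rejects.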
      Signed-off-by: Neal Liu <neal.liu@mediatek.com>
      Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpuidle: psci: Prevent domain idlestates until consumers are ready · 81f94ddf
      Ulf Hansson committed
      Depending on the SoC/platform, additional devices may be part of the PSCI
      PM domain topology. This is the case with 'qcom,rpmh-rsc' device, for
      example, even if this is not yet visible in the corresponding DTS-files.
      
      Without going into too much detail, a device like the 'qcom,rpmh-rsc' may
      have HW constraints that need to be obeyed before a domain idlestate
      can be picked.
      
      Therefore, let's implement the ->sync_state() callback to receive a
      notification when all consumers of the PSCI PM domain providers have been
      attached/probed to it. In this way, we can make sure all constraints from
      all relevant devices are taken into account before allowing a domain
      idlestate to be picked.
      Acked-by: Saravana Kannan <saravanak@google.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpuidle: psci: Convert PM domain to platform driver · ee7c34ca
      Ulf Hansson committed
      To enable support for deferred probing and to allow implementation of the
      ->sync_state() callback from subsequent changes, let's convert into a
      platform driver.
      Reviewed-by: Lina Iyer <ilina@codeaurora.org>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpuidle: psci: Fix error path via converting to a platform driver · 166bf835
      Ulf Hansson committed
      The current error paths for the cpuidle-psci driver may leak memory or
      possibly leave CPU devices attached to their PM domains. These are quite
      harmless issues, but still deserve to be taken care of.
      
      However, rather than fixing them by keeping track of allocations that
      need to be freed, which tends to become a bit messy, let's convert into a
      platform driver. In this way, it gets easier to fix the memory leaks as we
      can rely on the devm_* functions.
      
      Moreover, converting to a platform driver also enables support for deferred
      probe, which subsequent changes benefit from.
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpuidle: psci: Fail cpuidle registration if set OSI mode failed · 4b072cd6
      Ulf Hansson committed
      Currently we allow the cpuidle driver registration to succeed, even if we
      failed to enable the OSI mode when the hierarchical DT layout is used. This
      means running in a degraded mode, by using the available idle states per
      CPU, while also preventing the domain idle states.
      
      Moving forward, this behaviour looks quite questionable to maintain, as
      complexity seems to grow around it, especially when trying to add support
      for deferred probe, for example.
      
      Therefore, let's make the cpuidle driver registration fail in this
      situation, thus relying on the default architectural cpuidle backend for
      WFI to be used.
      Reviewed-by: Lina Iyer <ilina@codeaurora.org>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpuidle: psci: Split into two separate build objects · 03175619
      Ulf Hansson committed
      The combined build object for the PSCI cpuidle driver and the PSCI PM
      domain is a bit messy. Therefore let's split it up by adding a new Kconfig
      ARM_PSCI_CPUIDLE_DOMAIN and convert into two separate objects.
      Reviewed-by: Lina Iyer <ilina@codeaurora.org>
      Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  4. 16 Jul, 2020 1 commit
  5. 15 Jul, 2020 1 commit
  6. 25 Jun, 2020 1 commit
  7. 23 Jun, 2020 1 commit
    • PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle · 81e67375
      Chen Yu committed
      Suspend to idle was found to not work on Goldmont CPU recently.
      
      The issue happens due to:
      
       1. On Goldmont the CPU in idle can only be woken up via IPIs,
          not POLLING mode, due to commit 08e237fa ("x86/cpu: Add
          workaround for MONITOR instruction erratum on Goldmont based
          CPUs")
      
       2. When the CPU is entering suspend to idle process, the
          _TIF_POLLING_NRFLAG remains on, because cpuidle_enter_s2idle()
          doesn't match call_cpuidle() exactly.
      
       3. Commit b2a02fc4 ("smp: Optimize send_call_function_single_ipi()")
          makes use of _TIF_POLLING_NRFLAG to avoid sending IPIs to idle
          CPUs.
      
       4. As a result, some IPI-related functions might not work
          well during suspend to idle on Goldmont. For example, one
          suspected victim:
      
          tick_unfreeze() -> timekeeping_resume() -> hrtimers_resume()
          -> clock_was_set() -> on_each_cpu() might wait forever,
          because the IPIs will not be sent to the CPUs which are
          sleeping with _TIF_POLLING_NRFLAG set, and Goldmont CPU
          could not be woken up by only setting _TIF_NEED_RESCHED
          on the monitor address.
      
      To avoid that, clear the _TIF_POLLING_NRFLAG flag before invoking
      enter_s2idle_proper() in cpuidle_enter_s2idle() in analogy with the
      call_cpuidle() code flow.
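The interaction described in the steps above can be modelled in a few lines of user-space C. This is a toy model, not kernel code: the flag bit values, `struct cpu`, `wake_cpu` and `enter_s2idle_fixed` are all hypothetical, and the model only shows why a stale polling flag suppresses the IPI that a sleeping CPU would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; the real flags live in the thread info flags. */
#define TIF_POLLING_NRFLAG (1u << 0)
#define TIF_NEED_RESCHED   (1u << 1)

/* Toy model of a target CPU: its flags and a count of IPIs delivered. */
struct cpu {
	unsigned int flags;
	unsigned int ipis_received;
};

/*
 * Wake a target CPU. If the target claims to be polling, only set the
 * need-resched flag and skip the IPI; otherwise deliver an IPI.
 * Returns true when an IPI was actually sent.
 */
static bool wake_cpu(struct cpu *target)
{
	assert(target != NULL);

	if (target->flags & TIF_POLLING_NRFLAG) {
		target->flags |= TIF_NEED_RESCHED;
		return false;
	}
	target->ipis_received++;
	return true;
}

/* Model of the fix: stop advertising polling before entering suspend to idle. */
static void enter_s2idle_fixed(struct cpu *self)
{
	assert(self != NULL);

	self->flags &= ~TIF_POLLING_NRFLAG;
}
```

In the model, a CPU that enters suspend to idle with the polling flag still set never receives an IPI, which corresponds to the hang described in step 4. Clearing the flag first makes the sender fall back to a real IPI.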
      
      Fixes: b2a02fc4 ("smp: Optimize send_call_function_single_ipi()")
      Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      [ rjw: Subject / changelog ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  8. 30 May, 2020 1 commit
  9. 26 May, 2020 1 commit
  10. 19 May, 2020 5 commits
  11. 16 May, 2020 1 commit
  12. 07 May, 2020 1 commit
  13. 30 Apr, 2020 2 commits
  14. 29 Apr, 2020 2 commits
  15. 08 Apr, 2020 1 commit
  16. 14 Mar, 2020 2 commits
  17. 13 Mar, 2020 4 commits
  18. 13 Feb, 2020 2 commits
  19. 23 Jan, 2020 1 commit