1. 19 Nov 2019, 1 commit
  2. 12 Nov 2019, 1 commit
    • cpuidle: Use nanoseconds as the unit of time · c1d51f68
      Rafael J. Wysocki authored
      Currently, the cpuidle subsystem uses microseconds as the unit of
      time which (among other things) causes the idle loop to incur some
      integer division overhead for no clear benefit.
      
      In order to allow cpuidle to measure time in nanoseconds, add two
      new fields, exit_latency_ns and target_residency_ns, to represent the
      exit latency and target residency of an idle state in nanoseconds,
      respectively, to struct cpuidle_state and initialize them with the
      help of the corresponding values in microseconds provided by drivers.
      Additionally, change cpuidle_governor_latency_req() to return the
      idle state exit latency constraint in nanoseconds.
      
      Also measure idle state residency (last_residency_ns in struct
      cpuidle_device and time_ns in struct cpuidle_driver) in nanoseconds
      and update the cpuidle core and governors accordingly.
      
      However, the menu governor still computes typical intervals in
      microseconds to avoid integer overflows.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Doug Smythies <dsmythies@telus.net>
      Tested-by: Doug Smythies <dsmythies@telus.net>
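      The µs-to-ns initialization described above can be sketched in plain C. This is a simplified model, not the kernel's actual struct cpuidle_state (which has many more members); the two _ns field names are from the commit message, while the helper name is made up for illustration:

```c
#include <stdint.h>

/* Simplified sketch of the change: legacy microsecond fields stay for
 * compatibility, and the new nanosecond fields are derived from them. */
struct cpuidle_state_sketch {
    unsigned int exit_latency;      /* legacy: microseconds */
    unsigned int target_residency;  /* legacy: microseconds */
    int64_t exit_latency_ns;        /* new: nanoseconds */
    int64_t target_residency_ns;    /* new: nanoseconds */
};

/* Initialize the nanosecond fields from the microsecond values that
 * drivers provide, as the commit message describes. */
static void init_state_ns(struct cpuidle_state_sketch *s)
{
    s->exit_latency_ns = (int64_t)s->exit_latency * 1000;
    s->target_residency_ns = (int64_t)s->target_residency * 1000;
}
```

      With 64-bit nanosecond fields, the idle loop can compare times directly instead of dividing measured nanoseconds down to microseconds on every idle entry.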
  3. 06 Nov 2019, 1 commit
    • cpuidle: Consolidate disabled state checks · 99e98d3f
      Rafael J. Wysocki authored
      There are two reasons why CPU idle states may be disabled: either
      because the driver has disabled them or because they have been
      disabled by user space via sysfs.
      
      In the former case, the state's "disabled" flag is set once during
      the initialization of the driver and never cleared later (it is
      effectively read-only).  In the latter case, the "disable" field
      of the given state's cpuidle_state_usage struct is set and it may be
      changed via sysfs.  Thus checking whether or not an idle state has
      been disabled involves reading these two flags every time.
      
      In order to avoid the additional check of the state's "disabled" flag
      (which is effectively read-only anyway), use the value of it at the
      init time to set a (new) flag in the "disable" field of that state's
      cpuidle_state_usage structure and use the sysfs interface to
      manipulate another (new) flag in it.  This way the state is disabled
      whenever the "disable" field of its cpuidle_state_usage structure is
      nonzero, whatever the reason, and it is the only place to look into
      to check whether or not the state has been disabled.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
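      The consolidated check can be sketched as two bits in one field. The flag names and values below are illustrative stand-ins, not necessarily the kernel's (the real code uses constants along the lines of CPUIDLE_STATE_DISABLED_BY_USER):

```c
/* One bit per reason; the field is the single source of truth. */
#define DISABLED_BY_DRIVER  (1U << 0)  /* set once at driver init time */
#define DISABLED_BY_USER    (1U << 1)  /* toggled via sysfs */

struct cpuidle_state_usage_sketch {
    unsigned int disable;  /* nonzero => state disabled, whatever the reason */
};

static int state_disabled(const struct cpuidle_state_usage_sketch *u)
{
    /* a single place to look, instead of two separate flags */
    return u->disable != 0;
}
```

      Governors then test one field instead of reading both the driver's "disabled" flag and the sysfs "disable" value on every state-selection pass.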
  4. 11 Sep 2019, 1 commit
  5. 10 Aug 2019, 1 commit
    • PSCI: cpuidle: Refactor CPU suspend power_state parameter handling · 9ffeb6d0
      Lorenzo Pieralisi authored
      Current PSCI code handles idle state entry through the
      psci_cpu_suspend_enter() API, which takes an idle state index as a
      parameter and converts the index into a previously initialized
      power_state parameter before calling PSCI.CPU_SUSPEND() with it.
      
      This is unwieldy, since it forces the PSCI firmware layer to keep track
      of power_state parameter for every idle state so that the
      index->power_state conversion can be made in the PSCI firmware layer
      instead of the CPUidle driver implementations.
      
      Move the power_state handling out of drivers/firmware/psci
      into the respective ACPI/DT PSCI CPUidle backends and convert
      the psci_cpu_suspend_enter() API to get the power_state
      parameter as input, which makes it closer to its firmware
      interface PSCI.CPU_SUSPEND() API.
      
      A notable side effect is that the PSCI ACPI/DT CPUidle backends
      now can directly handle (and if needed update) power_state
      parameters before handing them over to the PSCI firmware
      interface to trigger PSCI.CPU_SUSPEND() calls.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Signed-off-by: Will Deacon <will@kernel.org>
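      The index-to-power_state move can be sketched as follows. Everything here is a simplified model: the table contents are made-up values (real power_state encodings are platform specific), and the function names only loosely mirror the kernel's:

```c
#include <stdint.h>

/* The ACPI/DT CPUidle backend now owns the per-state parameters... */
static const uint32_t backend_power_state[] = {
    0x00000000,  /* state 0: e.g. WFI/standby (value is made up) */
    0x00010000,  /* state 1: e.g. retention   (value is made up) */
    0x01010000,  /* state 2: e.g. power down  (value is made up) */
};

/* ...and the firmware-layer entry point takes the raw power_state,
 * mirroring the PSCI.CPU_SUSPEND() firmware interface. */
static int psci_cpu_suspend_enter_sketch(uint32_t power_state)
{
    /* a real implementation would issue the SMC/HVC call here */
    return (int)power_state;
}

/* The backend converts index -> power_state itself before the call,
 * and could adjust the parameter here if needed. */
static int backend_enter_idle(int idx)
{
    return psci_cpu_suspend_enter_sketch(backend_power_state[idx]);
}
```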
  6. 31 Jul 2019, 1 commit
  7. 30 Jul 2019, 2 commits
  8. 10 Apr 2019, 1 commit
    • cpuidle: Export the next timer expiration for CPUs · 6f9b83ac
      Ulf Hansson authored
      To be able to predict the sleep duration for a CPU entering idle, it
      is essential to know the expiration time of the next timer.  Both the
      teo and the menu cpuidle governors already use this information for
      CPU idle state selection.
      
      Moving forward, a similar prediction needs to be made for a group of
      idle CPUs rather than for a single one and the following changes
      implement a new genpd governor for that purpose.
      
      In order to support that feature, add a new function called
      tick_nohz_get_next_hrtimer() that will return the next hrtimer
      expiration time of a given CPU to be invoked after deciding
      whether or not to stop the scheduler tick on that CPU.
      
      Make the cpuidle core call tick_nohz_get_next_hrtimer() right
      before invoking the ->enter() callback provided by the cpuidle
      driver for the given state and store its return value in the
      per-CPU struct cpuidle_device, so as to make it available to code
      outside of cpuidle.
      
      Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(),
      the governor's ->select() callback has already returned and indicated
      whether or not the tick should be stopped, so in fact the value
      returned by tick_nohz_get_next_hrtimer() always is the next hrtimer
      expiration time for the given CPU, possibly including the tick (if
      it hasn't been stopped).
      Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
      Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      [ rjw: Subject & changelog ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
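      The flow described above, record the next hrtimer expiration in the per-CPU device right before ->enter(), can be sketched like this. The struct and function names are simplified stand-ins (tick_nohz_get_next_hrtimer() is the real name from the commit; the rest is illustrative):

```c
#include <stdint.h>

/* Per-CPU device sketch; the real struct cpuidle_device is larger. */
struct cpuidle_device_sketch {
    uint64_t next_hrtimer;  /* filled in right before ->enter() */
};

/* Stand-in for tick_nohz_get_next_hrtimer(); returns a canned value
 * so the control flow can be exercised outside the kernel. */
static uint64_t fake_next_hrtimer;
static uint64_t get_next_hrtimer(void) { return fake_next_hrtimer; }

/* Stand-in driver ->enter() callback. */
static int dummy_enter(int idx) { return idx; }

static int enter_state(struct cpuidle_device_sketch *dev,
                       int (*enter)(int idx), int idx)
{
    /* make the expiration time available to code outside cpuidle
     * (e.g. the genpd governor mentioned in the changelog) */
    dev->next_hrtimer = get_next_hrtimer();
    return enter(idx);
}
```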
  9. 18 Jan 2019, 1 commit
  10. 13 Dec 2018, 1 commit
  11. 04 Oct 2018, 1 commit
    • cpuidle: menu: Fix wakeup statistics updates for polling state · 5f26bdce
      Rafael J. Wysocki authored
      If the CPU exits the "polling" state due to the time limit in the
      loop in poll_idle(), this is not a real wakeup and it just means
      that the "polling" state selection was not adequate.  The governor
      mispredicted short idle duration, but had a more suitable state been
      selected, the CPU might have spent more time in it.  In fact, there
      is no reason to expect that there would have been a wakeup event
      earlier than the next timer in that case.
      
      Handling such cases as regular wakeups in menu_update() may cause the
      menu governor to make suboptimal decisions going forward, but ignoring
      them altogether would not be correct either, because every time
      menu_select() is invoked, it makes a separate new attempt to predict
      the idle duration, taking the (each time distinct) time to the
      closest timer event as input, and the outcomes of all those
      attempts should be recorded.
      
      For this reason, make menu_update() always assume that if the
      "polling" state was exited due to the time limit, the next proper
      wakeup event for the CPU would be the next timer event (not
      including the tick).
      
      Fixes: a37b969a "cpuidle: poll_state: Add time limit to poll_idle()"
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
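      The correction described above can be sketched as a small helper. This is a drastic simplification of menu_update() (which also updates correction factors and interval buckets); the function and parameter names are made up:

```c
#include <stdint.h>

/* If the polling state was exited because of poll_idle()'s time limit,
 * that is not a real wakeup: assume the next proper wakeup would have
 * been the next timer event (not including the tick). */
static uint64_t corrected_measured_ns(int polling_time_limit_exit,
                                      uint64_t measured_ns,
                                      uint64_t next_timer_ns)
{
    if (polling_time_limit_exit)
        return next_timer_ns;
    return measured_ns;  /* a genuine wakeup: keep the measurement */
}
```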
  12. 18 Sep 2018, 1 commit
    • cpuidle: Remove unnecessary wrapper cpuidle_get_last_residency() · 6a5f95b5
      Fieah Lim authored
      cpuidle_get_last_residency() is just a wrapper for retrieving
      the last_residency member of struct cpuidle_device.  It is also,
      oddly, the only wrapper function for accessing a cpuidle_* struct
      member (my best guess is that it is a leftover from v2.x).
      
      Anyhow, since the only two users (the ladder and menu governors)
      can access dev->last_residency directly, and it's more intuitive to
      do it that way, let's just get rid of the wrapper.
      
      This patch tidies up CPU idle code a bit without functional changes.
      Signed-off-by: Fieah Lim <kw@fieahl.im>
      [ rjw: Changelog cleanup ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  13. 31 May 2018, 1 commit
  14. 06 Apr 2018, 1 commit
    • cpuidle: Return nohz hint from cpuidle_select() · 45f1ff59
      Rafael J. Wysocki authored
      Add a new pointer argument to cpuidle_select() and to the ->select
      cpuidle governor callback to allow a boolean value indicating
      whether or not the tick should be stopped before entering the
      selected state to be returned from there.
      
      Make the ladder governor ignore that pointer (to preserve its
      current behavior) and make the menu governor return 'false' through
      it if:
       (1) the idle exit latency is constrained at 0, or
       (2) the selected state is a polling one, or
       (3) the expected idle period duration is within the tick period
           range.
      
      In addition to that, the correction factor computations in the menu
      governor need to take the possibility that the tick may not be
      stopped into account to avoid artificially small correction factor
      values.  To that end, add a mechanism to record tick wakeups, as
      suggested by Peter Zijlstra, and use it to modify the menu_update()
      behavior when tick wakeup occurs.  Namely, if the CPU is woken up by
      the tick and the return value of tick_nohz_get_sleep_length() is not
      within the tick boundary, the predicted idle duration is likely too
      short, so make menu_update() try to compensate for that by updating
      the governor statistics as though the CPU was idle for a long time.
      
      Since the value returned through the new argument pointer of
      cpuidle_select() is not used by its caller yet, this change by
      itself is not expected to alter the functionality of the code.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
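      The three conditions under which the menu governor returns 'false' through the new pointer can be sketched as a predicate. The tick period value and parameter names here are illustrative (the kernel uses TICK_NSEC and richer state):

```c
#include <stdbool.h>
#include <stdint.h>

#define TICK_NS 4000000ULL  /* illustrative 4 ms tick period */

static bool menu_stop_tick(uint64_t latency_req_ns, bool polling,
                           uint64_t predicted_ns)
{
    if (latency_req_ns == 0)     /* (1) exit latency constrained at 0 */
        return false;
    if (polling)                 /* (2) a polling state was selected */
        return false;
    if (predicted_ns < TICK_NS)  /* (3) expected idle within tick range */
        return false;
    return true;                 /* otherwise the tick may be stopped */
}
```

      In the kernel this value is written through the bool pointer passed to ->select(), so the idle loop, not the governor, ultimately acts on it.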
  15. 29 Mar 2018, 1 commit
    • PM: cpuidle/suspend: Add s2idle usage and time state attributes · 64bdff69
      Rafael J. Wysocki authored
      Add a new attribute group called "s2idle" under the sysfs directory
      of each cpuidle state that supports the ->enter_s2idle callback
      and put two new attributes, "usage" and "time", into that group to
      represent the number of times the given state was requested for
      suspend-to-idle and the total time spent in suspend-to-idle after
      requesting that state, respectively.
      
      That will allow diagnostic information related to suspend-to-idle
      to be collected without enabling advanced debug features and
      analyzing dmesg output.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  16. 12 Feb 2018, 1 commit
  17. 02 Jan 2018, 1 commit
    • cpuidle: Add new macro to enter a retention idle state · db50a74d
      Prashanth Prakash authored
      If a CPU is entering a low power idle state where it doesn't lose any
      context, then there is no need to call cpu_pm_enter()/cpu_pm_exit().
      Add a new macro (CPU_PM_CPU_IDLE_ENTER_RETENTION) to be used by cpuidle
      drivers when they are entering retention state. By not calling
      cpu_pm_enter and cpu_pm_exit we reduce the latency involved in
      entering and exiting the retention idle states.
      
      CPU_PM_CPU_IDLE_ENTER_RETENTION assumes that no state is lost and
      hence the CPU PM notifiers will not be called.  We may need a broader
      change if we need to support partial retention states efficiently.
      
      On an ARM64 based Qualcomm server platform we measured the overhead
      below of calling cpu_pm_enter and cpu_pm_exit for retention states.
      
      workload: stress --hdd #CPUs --hdd-bytes 32M  -t 30
              Average overhead of cpu_pm_enter - 1.2us
              Average overhead of cpu_pm_exit  - 3.1us
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
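      The difference between the two entry paths can be sketched with plain functions and a counter. In the kernel these are the CPU_PM_CPU_IDLE_ENTER() and CPU_PM_CPU_IDLE_ENTER_RETENTION() macros; everything below is an illustrative model, not the real notifier machinery:

```c
/* Counts how many times the (stand-in) CPU PM notifier chain ran. */
static int pm_notifier_calls;

static void cpu_pm_enter_sketch(void) { pm_notifier_calls++; }
static void cpu_pm_exit_sketch(void)  { pm_notifier_calls++; }

/* Normal path: context may be lost, so notify before and after. */
static int idle_enter(int (*low_level)(int), int idx)
{
    cpu_pm_enter_sketch();
    int ret = low_level(idx);
    cpu_pm_exit_sketch();
    return ret;
}

/* Retention path: no context is lost, so the notifiers are skipped,
 * saving their latency on both entry and exit. */
static int idle_enter_retention(int (*low_level)(int), int idx)
{
    return low_level(idx);
}

/* Stand-in for a driver's low-level idle entry function. */
static int dummy_low_level(int idx) { return idx; }
```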
  18. 30 Aug 2017, 3 commits
  19. 11 Aug 2017, 1 commit
  20. 31 Jan 2017, 1 commit
  21. 29 Nov 2016, 1 commit
  22. 21 Oct 2016, 1 commit
  23. 22 Jul 2016, 1 commit
  24. 03 Jun 2016, 1 commit
    • cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE · 9bd616e3
      Catalin Marinas authored
      The cpuidle_devices per-CPU variable is only defined when CPU_IDLE is
      enabled. Commit c8cc7d4d ("sched/idle: Reorganize the idle loop")
      removed the #ifdef CONFIG_CPU_IDLE around cpuidle_idle_call() with the
      compiler optimising away __this_cpu_read(cpuidle_devices). However, with
      CONFIG_UBSAN && !CONFIG_CPU_IDLE, this optimisation no longer happens
      and the kernel fails to link since cpuidle_devices is not defined.
      
      This patch introduces an accessor function for the current CPU cpuidle
      device (returning NULL when !CONFIG_CPU_IDLE) and uses it in
      cpuidle_idle_call().
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
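      The accessor pattern can be sketched as follows. The real kernel function is cpuidle_get_device() and the selector is CONFIG_CPU_IDLE; the names below carry a _sketch suffix to mark them as a userspace model of that build-time choice:

```c
#include <stddef.h>

struct cpuidle_device_sketch2 { int cpu; };

#ifdef CONFIG_CPU_IDLE_SKETCH
/* Stand-in for the real per-CPU read of cpuidle_devices. */
static struct cpuidle_device_sketch2 this_cpu_dev;
static struct cpuidle_device_sketch2 *cpuidle_get_device_sketch(void)
{
    return &this_cpu_dev;
}
#else
/* !CONFIG_CPU_IDLE: no device exists; callers must check for NULL,
 * and the symbol cpuidle_devices is never referenced, so the kernel
 * links even when the optimizer leaves the call in place. */
static struct cpuidle_device_sketch2 *cpuidle_get_device_sketch(void)
{
    return NULL;
}
#endif
```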
  25. 28 Aug 2015, 1 commit
  26. 19 May 2015, 1 commit
  27. 15 May 2015, 2 commits
  28. 03 Apr 2015, 1 commit
  29. 06 Mar 2015, 1 commit
  30. 16 Feb 2015, 1 commit
    • PM / sleep: Make it possible to quiesce timers during suspend-to-idle · 124cf911
      Rafael J. Wysocki authored
      The efficiency of suspend-to-idle depends on being able to keep CPUs
      in the deepest available idle states for as much time as possible.
      Ideally, they should only be brought out of idle by system wakeup
      interrupts.
      
      However, timer interrupts occurring periodically prevent that from
      happening and it is not practical to chase all of the "misbehaving"
      timers in a whack-a-mole fashion.  A much more effective approach is
      to suspend the local ticks for all CPUs and the entire timekeeping
      along the lines of what is done during full suspend, which also
      helps to keep suspend-to-idle and full suspend reasonably similar.
      
      The idea is to suspend the local tick on each CPU executing
      cpuidle_enter_freeze() and to make the last of them suspend the
      entire timekeeping.  That should prevent timer interrupts from
      triggering until an IO interrupt wakes up one of the CPUs.  It
      needs to be done with interrupts disabled on all of the CPUs,
      though, because otherwise the suspended clocksource might be
      accessed by an interrupt handler which might lead to fatal
      consequences.
      
      Unfortunately, the existing ->enter callbacks provided by cpuidle
      drivers generally cannot be used for implementing that, because some
      of them re-enable interrupts temporarily and some idle entry methods
      cause interrupts to be re-enabled automatically on exit.  Also some
      of these callbacks manipulate local clock event devices of the CPUs
      which really shouldn't be done after suspending their ticks.
      
      To overcome that difficulty, introduce a new cpuidle state callback,
      ->enter_freeze, that will be guaranteed (1) to keep interrupts
      disabled all the time (and return with interrupts disabled) and (2)
      not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
      look for the deepest available idle state with ->enter_freeze present
      and to make the CPU execute that callback with suspended tick (and the
      last of the online CPUs to execute it with suspended timekeeping).
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
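      The "deepest state with ->enter_freeze" lookup can be sketched as a reverse scan over the state table. This is a simplified model (the callback was later renamed ->enter_s2idle, and the real code also honors disabled states); the names are illustrative:

```c
#include <stddef.h>

struct state_sketch {
    int (*enter)(int idx);
    int (*enter_freeze)(int idx);  /* NULL if the state lacks it */
};

/* Scan from the deepest state down, picking the first one that
 * provides ->enter_freeze, as the changelog describes. */
static int find_deepest_freeze_state(const struct state_sketch *states,
                                     int count)
{
    for (int i = count - 1; i >= 0; i--)
        if (states[i].enter_freeze)
            return i;
    return -1;  /* no state supports the interrupts-off entry */
}

/* Stand-in callbacks for exercising the scan. */
static int dummy_enter_cb(int idx)  { return idx; }
static int dummy_freeze_cb(int idx) { return idx; }
```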
  31. 14 Feb 2015, 1 commit
    • PM / sleep: Re-implement suspend-to-idle handling · 38106313
      Rafael J. Wysocki authored
      In preparation for adding support for quiescing timers in the final
      stage of suspend-to-idle transitions, rework the freeze_enter()
      function making the system wait on a wakeup event, the freeze_wake()
      function terminating the suspend-to-idle loop and the mechanism by
      which deep idle states are entered during suspend-to-idle.
      
      First of all, introduce a simple state machine for suspend-to-idle
      and make the code in question use it.
      
      Second, prevent freeze_enter() from losing wakeup events due to race
      conditions and ensure that the number of online CPUs won't change
      while it is being executed.  In addition to that, make it force
      all of the CPUs re-enter the idle loop in case they are in idle
      states already (so they can enter deeper idle states if possible).
      
      Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
      checks in cpuidle_select() and cpuidle_reflect() with a single
      suspend-to-idle state check in cpuidle_idle_call().
      
      Finally, introduce cpuidle_enter_freeze() that will simply find the
      deepest idle state available to the given CPU and enter it using
      cpuidle_enter().
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
  32. 17 Dec 2014, 1 commit
    • cpuidle / ACPI: remove unused CPUIDLE_FLAG_TIME_INVALID · 62c4cf97
      Len Brown authored
      CPUIDLE_FLAG_TIME_INVALID is no longer checked
      by menu or ladder cpuidle governors, so don't
      bother setting or defining it.
      
      It was originally invented to account for the fact that
      acpi_safe_halt() enables interrupts to invoke HLT.
      That would allow interrupt service routines to be included
      in the last_idle duration measurements made in cpuidle_enter_state(),
      potentially returning a duration much larger than reality.
      
      But menu and ladder can gracefully handle erroneously large duration
      intervals without checking for CPUIDLE_FLAG_TIME_INVALID.
      Further, if they don't check CPUIDLE_FLAG_TIME_INVALID, they
      can also benefit from the instances when the duration interval
      is not erroneously large.
      Signed-off-by: Len Brown <len.brown@intel.com>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  33. 13 Nov 2014, 1 commit
    • cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logic · b82b6cca
      Daniel Lezcano authored
      The only place where the time is invalid is when the ACPI_CSTATE_FFH entry
      method is not set. Otherwise for all the drivers, the time can be correctly
      measured.
      
      Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the
      drivers for all the states, invert the logic by replacing it with a
      CPUIDLE_FLAG_TIME_INVALID flag: set the new flag only for the ACPI
      idle driver, remove the former flag from all the drivers, and invert
      the check based on this flag in the governors.
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
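      The inverted check can be sketched in a few lines. CPUIDLE_FLAG_TIME_INVALID is the flag name from the commit; the bit value and the helper are illustrative:

```c
/* Before: every driver had to set CPUIDLE_FLAG_TIME_VALID on every
 * state.  After: only the ACPI idle driver marks the one problematic
 * case, and governors assume the time is valid unless told otherwise. */
#define CPUIDLE_FLAG_TIME_INVALID_SKETCH (1U << 7)  /* value is made up */

static int residency_measurable(unsigned int state_flags)
{
    return !(state_flags & CPUIDLE_FLAG_TIME_INVALID_SKETCH);
}
```

      Inverting the default means the common case needs no flag at all, which is why the former flag could simply be deleted from every driver.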
  34. 28 May 2014, 1 commit
  35. 07 May 2014, 1 commit
  36. 01 May 2014, 1 commit