1. 21 Mar, 2016 1 commit
    • cpuidle: menu: Fall back to polling if next timer event is near · 0c313cb2
      Committed by Rafael J. Wysocki
      Commit a9ceb78b (cpuidle,menu: use interactivity_req to disable
      polling) changed the behavior of the fallback state selection part
      of menu_select() so it looks at interactivity_req instead of
      data->next_timer_us when it makes its decision.  That effectively
      caused polling to be used more often as fallback idle which led to
      significant increases of energy consumption in some cases.
      
      Commit e132b9b3 (cpuidle: menu: use high confidence factors
      only when considering polling) changed that logic again to be more
      predictable, but that didn't help with the increased energy
      consumption problem.
      
      For this reason, go back to making decisions on which state to fall
      back to based on data->next_timer_us which is the time we know for
      sure something will happen rather than a prediction (which may be
      inaccurate and turns out to be so often enough to be problematic).
      However, take the target residency of the first proper idle state
      (C1) into account, so that state is not used as the fallback one
      if its target residency is greater than data->next_timer_us.
      
      Fixes: a9ceb78b (cpuidle,menu: use interactivity_req to disable polling)
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reported-and-tested-by: Doug Smythies <dsmythies@telus.net>
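The fallback decision described in this commit can be sketched as follows. This is a minimal illustration and not the kernel's code: `struct idle_state`, `pick_fallback()` and `POLL_CUTOFF_US` are hypothetical names, and the 20us floor and the latency check are assumptions based on the other commits in this log.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's idle state descriptor. */
struct idle_state {
	unsigned int exit_latency_us;
	unsigned int target_residency_us;
	bool disabled;
};

enum fallback { FALLBACK_POLL = 0, FALLBACK_C1 = 1 };

/* Assumed floor for the polling cut-off, in microseconds. */
#define POLL_CUTOFF_US 20u

/*
 * Choose the fallback state from the time to the next timer event,
 * which is known for sure, rather than from a predicted interval.
 */
enum fallback pick_fallback(const struct idle_state *c1,
			    unsigned int next_timer_us,
			    unsigned int latency_req_us)
{
	/* C1 only pays off if the timer is further away than its target residency. */
	unsigned int threshold = c1->target_residency_us > POLL_CUTOFF_US ?
				 c1->target_residency_us : POLL_CUTOFF_US;

	if (next_timer_us > threshold &&
	    latency_req_us > c1->exit_latency_us && !c1->disabled)
		return FALLBACK_C1;

	return FALLBACK_POLL;
}
```

With a C1 target residency of 50us, a timer due in 30us would select polling under this sketch, while a timer due in 500us would select C1.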
  2. 17 Mar, 2016 1 commit
    • cpuidle: menu: use high confidence factors only when considering polling · e132b9b3
      Committed by Rik van Riel
      The menu governor uses five different factors to pick the
      idle state:
       - the user configured latency_req
       - the time until the next timer (next_timer_us)
       - the typical sleep interval, as measured recently
       - an estimate of sleep time by dividing next_timer_us by an observed factor
       - a load corrected version of the above, divided again by load
      
      Only the first three items are known with enough confidence that
      we can use them to consider polling, instead of an actual CPU
      idle state, because the cost of being wrong about polling can be
      excessive power use.
      
      The latter two are used in the menu governor's main selection
      loop, and can result in choosing a shallower idle state when
      the system is expected to be busy again soon.
      
      This pushes a busy system in the "performance" direction of
      the performance<>power tradeoff, when choosing between idle
      states, but stays more strictly on the "power" state when
      deciding between polling and C1.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
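The split between high-confidence and estimated factors might look roughly like the sketch below. All names here (`struct menu_inputs`, `polling_interval_us()`, `selection_latency_us()`, `load_divisor`) are hypothetical, and the way the factors are combined is an assumption made for illustration, not the governor's actual code.

```c
/* Hypothetical bundle of the five factors listed in the commit message. */
struct menu_inputs {
	unsigned int latency_req_us;      /* user configured latency limit */
	unsigned int next_timer_us;       /* time until the next timer */
	unsigned int typical_interval_us; /* recently measured sleep interval */
	unsigned int predicted_us;        /* next_timer_us scaled by an observed factor */
	unsigned int load_divisor;        /* assumed load correction, must be >= 1 */
};

unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Interval used for the polling decision: timer and measured interval only. */
unsigned int polling_interval_us(const struct menu_inputs *in)
{
	return min_u(in->next_timer_us, in->typical_interval_us);
}

/* Latency limit for the main selection loop: the two estimates may tighten it. */
unsigned int selection_latency_us(const struct menu_inputs *in)
{
	unsigned int predicted = min_u(in->predicted_us, polling_interval_us(in));
	unsigned int interactivity_req = predicted / in->load_divisor;

	return min_u(in->latency_req_us, interactivity_req);
}
```

The point of the separation is that a wrong estimate can only push the main loop towards a shallower idle state, whereas the polling decision never sees the less reliable inputs.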
  3. 17 Feb, 2016 2 commits
  4. 28 Jan, 2016 1 commit
  5. 22 Jan, 2016 1 commit
  6. 19 Jan, 2016 2 commits
  7. 16 Jan, 2016 2 commits
  8. 15 Jan, 2016 1 commit
    • cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0 · 9c4b2867
      Committed by Rafael J. Wysocki
      Commit a9ceb78b (cpuidle,menu: use interactivity_req to disable
      polling) exposed a bug in menu_select() causing it to return -1
      on systems with CPUIDLE_DRIVER_STATE_START equal to zero, although
      it should have returned 0.  As a result, idle states are not entered
      by CPUs on those systems.
      
      Namely, on the systems in question data->last_state_idx is initially
      equal to -1 and the above commit modified the condition that would
      have caused it to be changed to 0 to be less likely to trigger which
      exposed the problem.  However, setting data->last_state_idx initially
      to -1 doesn't make sense at all and on the affected systems it should
      always be set to CPUIDLE_DRIVER_STATE_START (i.e. 0) unconditionally,
      so make that happen.
      
      Fixes: a9ceb78b (cpuidle,menu: use interactivity_req to disable polling)
      Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
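The initialization rule this fix establishes can be illustrated with a small sketch. `initial_state_idx()` and its `prefer_c1` parameter are hypothetical; `state_start` stands in for CPUIDLE_DRIVER_STATE_START, and the shape of the non-zero branch is an assumption.

```c
#include <stdbool.h>

/*
 * state_start stands in for CPUIDLE_DRIVER_STATE_START (0 or 1).
 * prefer_c1 is a hypothetical flag for the polling-versus-C1 decision.
 */
int initial_state_idx(int state_start, bool prefer_c1)
{
	/* With a polling state at index 0, start there unless C1 is preferred. */
	if (state_start > 0)
		return prefer_c1 ? state_start : state_start - 1;

	/* Without a polling state, never start from -1. */
	return state_start;
}
```

In this sketch the index is never negative, which is the property the commit restores for systems where CPUIDLE_DRIVER_STATE_START is zero.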
  9. 17 Dec, 2015 1 commit
  10. 15 Dec, 2015 3 commits
  11. 17 Nov, 2015 3 commits
    • cpuidle,menu: smooth out measured_us calculation · efddfd90
      Committed by Rik van Riel
      The cpuidle state tables contain the maximum exit latency for each
      cpuidle state. On x86, that is the exit latency for when the entire
      package goes into that same idle state.
      
      However, a lot of the time we only go into the core idle state,
      not the package idle state. This means we see a much smaller exit
      latency.
      
      We have no way to detect whether we went into the core or package
      idle state while idle, and that is ok.
      
      However, the current menu_update logic does have the potential to
      trip up the repeating pattern detection in get_typical_interval.
      If the system is experiencing an exit latency near the idle state's
      exit latency, some of the samples will have exit_us subtracted,
      while others will not. This turns a repeating pattern into mush,
      potentially breaking get_typical_interval.
      
      Furthermore, for smaller sleep intervals, we know the chance that
      all the cores in the package went to the same idle state are fairly
      small. Dividing the measured_us by two, instead of subtracting the
      full exit latency when hitting a small measured_us, will reduce the
      error.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
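The smoothing described above can be sketched as a single helper. The function name `deduct_exit_latency()` is hypothetical, and the cut-over point of twice the exit latency is an assumption, since the commit message only speaks of "a small measured_us".

```c
/*
 * Correct a measured idle duration for exit latency.
 * Long sleeps: subtract the state's full (worst case) exit latency.
 * Short sleeps: halve the measurement instead, so that samples near the
 * threshold do not jump between corrected and uncorrected values.
 */
unsigned int deduct_exit_latency(unsigned int measured_us,
				 unsigned int exit_latency_us)
{
	/* Assumed threshold: twice the exit latency. */
	if (measured_us > 2 * exit_latency_us)
		return measured_us - exit_latency_us;

	return measured_us / 2;
}
```

With a 10us exit latency, a 100us measurement becomes 90us and a 15us measurement becomes 7us under this sketch, so the corrected value changes smoothly around the threshold rather than in steps.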
    • cpuidle,menu: use interactivity_req to disable polling · a9ceb78b
      Committed by Rik van Riel
      The menu governor carefully figures out how much time we typically
      sleep for an estimated sleep interval, or whether there is a repeating
      pattern going on, and corrects that estimate for the CPU load.
      
      Then it proceeds to ignore that information when determining whether
      or not to consider polling. This is not a big deal on most x86 CPUs,
      which have very low C1 latencies, and the patch should not have any
      effect on those CPUs.
      
      However, certain CPUs (e.g. Atom) have much higher C1 latencies, and
      it would be good to not waste performance and power on those CPUs if
      we are expecting a very low wakeup latency.
      
      Disable polling based on the estimated interactivity requirement, not
      on the time to the next timer interrupt.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpuidle,x86: increase forced cut-off for polling to 20us · 7884084f
      Committed by Rik van Riel
      The cpuidle menu governor has a forced cut-off for polling at 5us,
      in order to deal with firmware that gives the OS bad information
      on cpuidle states, leading to the system spending way too much time
      in polling.
      
      However, at least one x86 CPU family (Atom) has chips that have
      a 20us break-even point for C1. Forcing the polling cut-off to
      less than that wastes performance and power.
      
      Increase the polling cut-off to 20us.
      
      Systems with a lower C1 latency will be found in the states table by
      the menu governor, which will pick those states as appropriate.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  12. 23 Oct, 2015 2 commits
  13. 03 Sep, 2015 1 commit
  14. 28 Aug, 2015 2 commits
  15. 03 Aug, 2015 1 commit
    • ARM: migrate to common PSCI client code · be120397
      Committed by Mark Rutland
      Now that the common PSCI client code has been factored out to
      drivers/firmware, and made safe for 32-bit use, move the 32-bit ARM code
      over to it. This results in a moderate reduction of duplicated lines,
      and will prevent further duplication as the PSCI client code is updated
      for PSCI 1.0 and beyond.
      
      The two legacy platform users of the PSCI invocation code are updated to
      account for interface changes. In both cases the power state parameter
      (which is constant) is now generated using macros, so that the
      pack/unpack logic can be killed in preparation for PSCI 1.0 power state
      changes.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  16. 21 Jul, 2015 1 commit
  17. 10 Jul, 2015 1 commit
  18. 26 Jun, 2015 1 commit
    • tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode · cc5a2f7b
      Committed by preeti
      On some archs, the local clockevent device stops in deep cpuidle states.
      The broadcast framework is used to wakeup cpus in these idle states, in
      which either an external clockevent device is used to send wakeup ipis
      or the hrtimer broadcast framework kicks in in the absence of such a
      device. One cpu is nominated as the broadcast cpu and this cpu sends
      wakeup ipis to sleeping cpus at the appropriate time. This is the
      implementation in the oneshot mode of broadcast.
      
      In periodic mode of broadcast however, the presence of such cpuidle
      states results in the cpuidle driver calling tick_broadcast_enable()
      which shuts down the local clockevent devices of all the cpus and
      appoints the tick broadcast device as the clockevent device for each of
      them. This works on those archs where the tick broadcast device is a
      real clockevent device.  But on archs which depend on the hrtimer mode
      of broadcast, the tick broadcast device happens to be a pseudo device.
      The consequence is that the local clockevent devices of all cpus are
      shut down and the kernel hangs at boot time in periodic mode.
      
      Let us thus not register the cpuidle states which have
      CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
      mode of broadcast in periodic mode. This patch takes care of doing this
      on powerpc. The cpus would not have entered into such deep cpuidle
      states in periodic mode on powerpc anyway. So there is no loss here.
      Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: 3.19+ <stable@vger.kernel.org> # 3.19+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  19. 22 Jun, 2015 1 commit
  20. 17 Jun, 2015 1 commit
    • drivers/cpuidle: Convert non-modular drivers to use builtin_platform_driver · 090d1cf1
      Committed by Paul Gortmaker
      All these drivers are configured with Kconfig options that are
      declared as bool.  Hence it is not possible for the code
      to be built as modular.  However the code is currently using the
      module_platform_driver() macro for driver registration.
      
      While this currently works, we really don't want to be including
      the module.h header in non-modular code, which we'll be forced
      to do, pending some upcoming code relocation from init.h into
      module.h.  So we fix it now by using the non-modular equivalent.
      
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: linux-pm@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  21. 30 May, 2015 1 commit
    • cpuidle: Do not use CPUIDLE_DRIVER_STATE_START in cpuidle.c · 7d51d979
      Committed by Rafael J. Wysocki
      The CPUIDLE_DRIVER_STATE_START symbol is defined as 1 only if
      CONFIG_ARCH_HAS_CPU_RELAX is set, otherwise it is defined as 0.
      However, if CONFIG_ARCH_HAS_CPU_RELAX is set, the first (index 0)
      entry in the cpuidle driver's table of states is overwritten with
      the default "poll" entry by the core.  The "state" defined by the
      "poll" entry doesn't provide ->enter_dead and ->enter_freeze
      callbacks and its exit_latency is 0.
      
      For this reason, it is not necessary to use CPUIDLE_DRIVER_STATE_START
      in cpuidle_play_dead() (->enter_dead is NULL, so the "poll state"
      will be skipped by the loop).
      
      It is also arguably not useful to return states with exit_latency
      equal to 0 from find_deepest_state(), so the function can be modified
      to start the loop from index 0 and the "poll state" will be skipped by
      it as a result of the check against latency_req.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
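The reasoning about the "poll" entry being skipped can be illustrated with a simplified search loop. This is a sketch under assumptions: `struct state_entry` and `deepest_state_index()` are hypothetical names, and the real function checks more conditions than shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of one entry in a driver's state table. */
struct state_entry {
	unsigned int exit_latency_us;
	bool disabled;
};

/*
 * Return the index of the usable state with the largest exit latency,
 * or -1 if there is none. The loop starts at index 0: an entry with
 * exit latency 0 (such as "poll") never passes the latency check,
 * so no special starting index is needed to skip it.
 */
int deepest_state_index(const struct state_entry *states, size_t count)
{
	unsigned int latency_req = 0;
	int ret = -1;

	for (size_t i = 0; i < count; i++) {
		if (states[i].disabled ||
		    states[i].exit_latency_us <= latency_req)
			continue;

		latency_req = states[i].exit_latency_us;
		ret = (int)i;
	}

	return ret;
}
```

For a table holding a poll entry followed by states with 2us and 10us exit latency, the sketch returns index 2; for a table that contains only the poll entry, it returns -1.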
  22. 19 May, 2015 1 commit
  23. 15 May, 2015 3 commits
  24. 10 May, 2015 1 commit
  25. 06 May, 2015 1 commit
  26. 05 May, 2015 1 commit
  27. 29 Apr, 2015 1 commit
  28. 17 Apr, 2015 1 commit
  29. 03 Apr, 2015 1 commit