1. 29 Apr, 2015 1 commit
  2. 03 Apr, 2015 1 commit
  3. 06 Mar, 2015 1 commit
  4. 01 Mar, 2015 2 commits
  5. 16 Feb, 2015 1 commit
    • PM / sleep: Make it possible to quiesce timers during suspend-to-idle · 124cf911
      Rafael J. Wysocki committed
      The efficiency of suspend-to-idle depends on being able to keep CPUs
      in the deepest available idle states for as much time as possible.
      Ideally, they should only be brought out of idle by system wakeup
      interrupts.
      
      However, timer interrupts occurring periodically prevent that from
      happening and it is not practical to chase all of the "misbehaving"
      timers in a whack-a-mole fashion.  A much more effective approach is
      to suspend the local ticks for all CPUs and the entire timekeeping
      along the lines of what is done during full suspend, which also
      helps to keep suspend-to-idle and full suspend reasonably similar.
      
      The idea is to suspend the local tick on each CPU executing
      cpuidle_enter_freeze() and to make the last of them suspend the
      entire timekeeping.  That should prevent timer interrupts from
      triggering until an IO interrupt wakes up one of the CPUs.  It
      needs to be done with interrupts disabled on all of the CPUs,
      though, because otherwise the suspended clocksource might be
      accessed by an interrupt handler which might lead to fatal
      consequences.
      
      Unfortunately, the existing ->enter callbacks provided by cpuidle
      drivers generally cannot be used for implementing that, because some
      of them re-enable interrupts temporarily and some idle entry methods
      cause interrupts to be re-enabled automatically on exit.  Also some
      of these callbacks manipulate local clock event devices of the CPUs
      which really shouldn't be done after suspending their ticks.
      
      To overcome that difficulty, introduce a new cpuidle state callback,
      ->enter_freeze, that will be guaranteed (1) to keep interrupts
      disabled all the time (and return with interrupts disabled) and (2)
      not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
      look for the deepest available idle state with ->enter_freeze present
      and to make the CPU execute that callback with suspended tick (and the
      last of the online CPUs to execute it with suspended timekeeping).
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
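The selection step described in 124cf911 (pick the deepest available idle state that provides ->enter_freeze) can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: `struct idle_state`, `find_deepest_freeze_state()`, `demo_enter_freeze()` and `demo_states` are hypothetical names, and using exit latency as the measure of "depth" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cpuidle state descriptor. */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;   /* assumption: larger latency = deeper state */
    int disabled;
    void (*enter_freeze)(int cpu);  /* NULL if the state lacks the callback */
};

/*
 * Return the index of the deepest enabled state that provides
 * ->enter_freeze, or -1 if no state qualifies.
 */
static int find_deepest_freeze_state(const struct idle_state *states,
                                     size_t count)
{
    int best = -1;
    unsigned int best_latency = 0;

    assert(states != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        const struct idle_state *s = &states[i];

        if (s->disabled || s->enter_freeze == NULL)
            continue;   /* unusable for suspend-to-idle */
        if (best >= 0 && s->exit_latency_us <= best_latency)
            continue;   /* not deeper than the current candidate */
        best = (int)i;
        best_latency = s->exit_latency_us;
    }
    return best;
}

/* Demo callback: a real one must keep interrupts off and leave timers alone. */
static void demo_enter_freeze(int cpu) { (void)cpu; }

/* Demo table: "deep" has no callback and "deeper" is disabled. */
static const struct idle_state demo_states[] = {
    { "shallow", 1,   0, demo_enter_freeze },
    { "medium",  50,  0, demo_enter_freeze },
    { "deep",    200, 0, NULL },
    { "deeper",  500, 1, demo_enter_freeze },
};
```

With the demo table, the helper settles on "medium": the two deeper entries are skipped because one lacks the callback and the other is disabled.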
  6. 14 Feb, 2015 1 commit
    • PM / sleep: Re-implement suspend-to-idle handling · 38106313
      Rafael J. Wysocki committed
      In preparation for adding support for quiescing timers in the final
      stage of suspend-to-idle transitions, rework the freeze_enter()
      function making the system wait on a wakeup event, the freeze_wake()
      function terminating the suspend-to-idle loop and the mechanism by
      which deep idle states are entered during suspend-to-idle.
      
      First of all, introduce a simple state machine for suspend-to-idle
      and make the code in question use it.
      
      Second, prevent freeze_enter() from losing wakeup events due to race
      conditions and ensure that the number of online CPUs won't change
      while it is being executed.  In addition to that, make it force
      all of the CPUs to re-enter the idle loop in case they are in idle
      states already (so they can enter deeper idle states if possible).
      
      Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
      checks in cpuidle_select() and cpuidle_reflect() with a single
      suspend-to-idle state check in cpuidle_idle_call().
      
      Finally, introduce cpuidle_enter_freeze() that will simply find the
      deepest idle state available to the given CPU and enter it using
      cpuidle_enter().
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
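The "simple state machine" mentioned in 38106313 can be pictured roughly as follows. This is a single-threaded sketch under stated assumptions: the names `s2i_state`, `s2i_begin()`, `s2i_wake()` and `s2i_end()` are hypothetical, and the locking and wait queue that real code needs to close the race are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states of a suspend-to-idle transition. */
enum s2i_state { S2I_NONE, S2I_ENTER, S2I_WAKE };

struct s2i {
    enum s2i_state state;
    bool wakeup_pending;   /* a wakeup arrived before we started waiting */
};

/*
 * Start waiting for a wakeup event. Returns false if a wakeup is
 * already pending, so the event is not lost by going idle anyway.
 */
static bool s2i_begin(struct s2i *s)
{
    assert(s != NULL);
    if (s->wakeup_pending) {
        s->wakeup_pending = false;
        s->state = S2I_NONE;
        return false;
    }
    s->state = S2I_ENTER;
    return true;
}

/* Report a wakeup event; it terminates the loop or is remembered. */
static void s2i_wake(struct s2i *s)
{
    assert(s != NULL);
    if (s->state == S2I_ENTER)
        s->state = S2I_WAKE;
    else if (s->state == S2I_NONE)
        s->wakeup_pending = true;
}

/* Leave the suspend-to-idle loop after a wakeup has been seen. */
static void s2i_end(struct s2i *s)
{
    assert(s != NULL);
    s->state = S2I_NONE;
}
```

The point of the sketch is the early-wakeup path: an event reported while the state is still `S2I_NONE` makes the next `s2i_begin()` return immediately instead of sleeping.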
  7. 24 Sep, 2014 1 commit
    • sched: Let the scheduler see CPU idle states · 442bf3aa
      Daniel Lezcano committed
      When the cpu enters idle, it stores the cpuidle state pointer in its
      struct rq instance which in turn could be used to make a better decision
      when balancing tasks.
      
      As soon as the cpu exits its idle state, the struct rq reference is
      cleared.
      
      There are a couple of situations where the idle state pointer could be changed
      while it is being consulted:
      
      1. For x86/acpi with dynamic c-states, when a laptop switches from battery
         to AC that could result in removing the deeper idle state. The acpi driver
         triggers:
      	'acpi_processor_cst_has_changed'
      		'cpuidle_pause_and_lock'
      			'cpuidle_uninstall_idle_handler'
      				'kick_all_cpus_sync'.
      
      All cpus will exit their idle state and the pointed object will be set to
      NULL.
      
      2. The cpuidle driver is unloaded. Logically that could happen but not
      in practice because the drivers are always compiled in and 95% of them are
      not coded to unregister themselves.  In any case, the unloading code must
      call 'cpuidle_unregister_device', that calls 'cpuidle_pause_and_lock'
      leading to 'kick_all_cpus_sync' as mentioned above.
      
      A race can happen if we use the pointer and then one of these two scenarios
      occurs at the same moment.
      
      In order to be safe, the idle state pointer stored in the rq must be
      used inside a rcu_read_lock section where we are protected with the
      'rcu_barrier' in the 'cpuidle_uninstall_idle_handler' function. The
      idle_get_state() and idle_put_state() accessors should be used to that
      effect.
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linaro-kernel@lists.linaro.org
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 19 Sep, 2014 1 commit
  9. 09 Jul, 2014 1 commit
  10. 07 May, 2014 1 commit
  11. 01 May, 2014 1 commit
  12. 12 Mar, 2014 1 commit
    • cpuidle: delay enabling interrupts until all coupled CPUs leave idle · 0b89e9aa
      Paul Burton committed
      As described by a comment at the end of cpuidle_enter_state_coupled it
      can be inefficient for coupled idle states to return with IRQs enabled
      since they may proceed to service an interrupt instead of clearing the
      coupled idle state. Until they have finished & cleared the idle state
      all CPUs coupled with them will spin rather than being able to enter a
      safe idle state.
      
      Commits e1689795 "cpuidle: Add common time keeping and irq
      enabling" and 554c06ba "cpuidle: remove en_core_tk_irqen flag" led
      to the cpuidle_enter_state enabling interrupts for all idle states,
      including coupled ones, making this inefficiency unavoidable by drivers
      & the local_irq_enable near the end of cpuidle_enter_state_coupled
      redundant. This patch avoids enabling interrupts in cpuidle_enter_state
      after a coupled state has been entered, allowing them to remain disabled
      until all coupled CPUs have exited the idle state and
      cpuidle_enter_state_coupled re-enables them.
      
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  13. 11 Mar, 2014 2 commits
  14. 07 Feb, 2014 1 commit
    • cpuidle: Handle clockevents_notify(BROADCAST_ENTER) failure · ba8f20c2
      Preeti U Murthy committed
      Some archs set the CPUIDLE_FLAG_TIMER_STOP flag for idle states in which the
      local timers stop. The cpuidle_idle_call() currently handles such idle states
      by calling into the broadcast framework so as to wakeup CPUs at their next
      wakeup event. With the hrtimer mode of broadcast, the BROADCAST_ENTER call
      into the broadcast framework can fail for archs that do not have an external
      clock device to handle wakeups and the CPU in question has thus to be made
      the standby CPU. This patch handles such cases by failing the call into
      cpuidle so that the arch can take some default action. The arch will certainly
      not enter a similar idle state because a failed cpuidle call will also implicitly
      indicate that the broadcast framework has not registered this CPU to be woken up.
      Hence we are safe if we fail the cpuidle call.
      
      In the process move the functions that trace idle statistics just before and
      after the entry and exit into idle states respectively. In other
      scenarios where the call to cpuidle fails, we end up not tracing idle
      entry and exit since a decision on an idle state could not be taken. Similarly
      when the call to broadcast framework fails, we skip tracing idle statistics
      because we are in no further position to take a decision on an alternative
      idle state to enter into.
      Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: deepthi@linux.vnet.ibm.com
      Cc: paulmck@linux.vnet.ibm.com
      Cc: fweisbec@gmail.com
      Cc: paulus@samba.org
      Cc: srivatsa.bhat@linux.vnet.ibm.com
      Cc: svaidy@linux.vnet.ibm.com
      Cc: peterz@infradead.org
      Cc: benh@kernel.crashing.org
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20140207080652.17187.66344.stgit@preeti.in.ibm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
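The control flow described in ba8f20c2 (fail the idle call when the broadcast framework cannot take over wakeups, and trace only around an idle state that is actually entered) might look like this in outline. Everything here is a hypothetical stand-in: `idle_call()`, `struct idle_ops`, `struct idle_trace` and the demo callbacks are invented for illustration, and returning `-EBUSY` is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical idle state: does the local timer stop in this state? */
struct idle_state {
    bool timer_stop;
};

/* Hypothetical hooks standing in for the broadcast framework and driver. */
struct idle_ops {
    int  (*broadcast_enter)(int cpu);   /* nonzero means failure */
    void (*broadcast_exit)(int cpu);
    void (*enter)(int cpu);
};

/* Counters standing in for the idle tracing calls. */
struct idle_trace {
    int entries;
    int exits;
};

static int idle_call(int cpu, const struct idle_state *st,
                     const struct idle_ops *ops, struct idle_trace *tr)
{
    assert(st != NULL && ops != NULL && tr != NULL);

    /* Nobody would wake this CPU up: fail and let the caller fall back. */
    if (st->timer_stop && ops->broadcast_enter(cpu) != 0)
        return -EBUSY;

    tr->entries++;          /* trace right before entering the state */
    ops->enter(cpu);
    tr->exits++;            /* trace right after leaving it */

    if (st->timer_stop)
        ops->broadcast_exit(cpu);
    return 0;
}

/* Demo callbacks used to exercise both paths. */
static int  demo_broadcast_ok(int cpu)   { (void)cpu; return 0; }
static int  demo_broadcast_fail(int cpu) { (void)cpu; return -1; }
static void demo_noop(int cpu)           { (void)cpu; }
```

On the failure path the trace counters stay untouched, which mirrors the message's point that no idle statistics are recorded when no idle state was entered.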
  15. 04 Dec, 2013 1 commit
    • cpuidle: Check for dev before deregistering it. · 813e8e3d
      Konrad Rzeszutek Wilk committed
      If not, we could end up in the unfortunate situation where
      we dereference a NULL pointer b/c we have cpuidle disabled.
      
      This is the case when booting under Xen (which uses the
      ACPI P/C states but disables the CPU idle driver) - and can
      be easily reproduced when booting with cpuidle.off=1.
      
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff8156db4a>] cpuidle_unregister_device+0x2a/0x90
      .. snip..
      Call Trace:
       [<ffffffff813b15b4>] acpi_processor_power_exit+0x3c/0x5c
       [<ffffffff813af0a9>] acpi_processor_stop+0x61/0xb6
       [<ffffffff814215bf>] __device_release_driver+0...
       [<ffffffff81421653>] device_release_driver+0x23/0x30
       [<ffffffff81420ed8>] bus_remove_device+0x108/0x180
       [<ffffffff8141d9d9>] device_del+0x129/0x1c0
       [<ffffffff813cb4b0>] ? unregister_xenbus_watch+0x1f0/0x1f0
       [<ffffffff8141da8e>] device_unregister+0x1e/0x60
       [<ffffffff814243e9>] unregister_cpu+0x39/0x60
       [<ffffffff81019e03>] arch_unregister_cpu+0x23/0x30
       [<ffffffff813c3c51>] handle_vcpu_hotplug_event+0xc1/0xe0
       [<ffffffff813cb4f5>] xenwatch_thread+0x45/0x120
       [<ffffffff810af010>] ? abort_exclusive_wait+0xb0/0xb0
       [<ffffffff8108ec42>] kthread+0xd2/0xf0
       [<ffffffff8108eb70>] ? kthread_create_on_node+0x180/0x180
       [<ffffffff816ce17c>] ret_from_fork+0x7c/0xb0
       [<ffffffff8108eb70>] ? kthread_create_on_node+0x180/0x180
      
      This problem also appears in 3.12 and could be a candidate for backport.
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
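The fix in 813e8e3d amounts to a guard at the top of the unregister path. A minimal sketch of that pattern, with a hypothetical `struct idle_device` and `unregister_device()` rather than the kernel's own definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU idle device. */
struct idle_device {
    bool registered;
    int cpu;
};

/*
 * Unregister a device. Returns true if something was unregistered.
 * A NULL or never-registered device (for example because the idle
 * framework was disabled at boot) is treated as a harmless no-op.
 */
static bool unregister_device(struct idle_device *dev)
{
    if (dev == NULL || !dev->registered)
        return false;

    dev->registered = false;
    return true;
}
```

Calling it twice on the same device, or with NULL, simply returns false instead of dereferencing an invalid pointer.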
  16. 30 Oct, 2013 7 commits
  17. 15 Jul, 2013 4 commits
  18. 11 Jun, 2013 1 commit
    • cpuidle: simplify multiple driver support · 82467a5a
      Daniel Lezcano committed
      Commit bf4d1b5d (cpuidle: support multiple drivers) introduced support
      for using multiple cpuidle drivers at the same time.  It added a
      couple of new APIs to register the driver per CPU, but that led to
      some unnecessary code complexity related to the kernel config options
      deciding whether or not the multiple driver support is enabled.  The
      code has to work as it did before when the multiple driver support is
      not enabled and the multiple driver support has to be compatible with
      the previously existing API.
      
      Remove the new API, not used by any driver in the tree yet (but
      needed for the HMP cpuidle drivers that will be submitted soon), and
      add a new cpumask pointer to the cpuidle driver structure that will
      point to the mask of CPUs handled by the given driver.  That will
      allow the cpuidle_[un]register_driver() API to be used for the
      multiple driver support along with the cpuidle_[un]register()
      functions added recently.
      
      [rjw: Changelog]
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  19. 24 Apr, 2013 1 commit
  20. 23 Apr, 2013 2 commits
    • cpuidle: make a single register function for all · 4c637b21
      Daniel Lezcano committed
      The usual scheme to initialize a cpuidle driver on a SMP is:
      
      	cpuidle_register_driver(drv);
      	for_each_possible_cpu(cpu) {
      		device = &per_cpu(cpuidle_dev, cpu);
      		cpuidle_register_device(device);
      	}
      
      This code is duplicated in each cpuidle driver.
      
      On UP systems, it is done this way:
      
      	cpuidle_register_driver(drv);
      	device = &per_cpu(cpuidle_dev, cpu);
      	cpuidle_register_device(device);
      
      On UP, the macro 'for_each_cpu' does one iteration:
      
      #define for_each_cpu(cpu, mask)                 \
              for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
      
      Hence, the initialization loop is the same for UP as for SMP.
      
      Besides, we saw various bugs, mis-initializations and unchecked return codes in
      the different drivers; the code is duplicated, bugs included. After fixing all
      of these, it appears the initialization pattern is the same for everyone.
      
      Please note, some drivers are doing dev->state_count = drv->state_count. This is
      not necessary because it is done by the cpuidle_enable_device function in the
      cpuidle framework. This holds as long as you have the same states for all your
      devices. Otherwise, the 'low level' API should be used instead with the specific
      initialization for the driver.
      
      Let's add a wrapper function doing this initialization with a cpumask parameter
      for the coupled idle states and use it for all the drivers.
      
      That will save a lot of LOC, consolidate the code, and the modifications in the
      future could be done in a single place. Another benefit is the consolidation of
      the cpuidle_device variable which is now in the cpuidle framework and no longer
      spread across the different arch specific drivers.
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
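The wrapper idea from 4c637b21 (one function that registers the driver and every per-CPU device, so the error handling lives in a single place) can be sketched in user-space C. All names below (`idle_register_all()`, `idle_unregister_all()`, `devices`, `NR_CPUS` as a fixed constant) are hypothetical, the `-EBUSY` return codes are an assumption, and the cpumask parameter for coupled idle states mentioned in the message is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4   /* hypothetical fixed CPU count for this sketch */

struct idle_driver { bool registered; };
struct idle_device { int cpu; bool registered; };

/* Per-CPU devices owned by the "framework" instead of by each driver. */
static struct idle_device devices[NR_CPUS];

static int register_driver(struct idle_driver *drv)
{
    if (drv->registered)
        return -EBUSY;
    drv->registered = true;
    return 0;
}

static void unregister_driver(struct idle_driver *drv)
{
    drv->registered = false;
}

static int register_device(struct idle_device *dev)
{
    if (dev->registered)
        return -EBUSY;
    dev->registered = true;
    return 0;
}

static void unregister_device(struct idle_device *dev)
{
    dev->registered = false;
}

/* Register the driver and all devices; undo everything on failure. */
static int idle_register_all(struct idle_driver *drv)
{
    int ret;

    assert(drv != NULL);
    ret = register_driver(drv);
    if (ret)
        return ret;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        devices[cpu].cpu = cpu;
        ret = register_device(&devices[cpu]);
        if (ret) {
            /* Roll back only the devices registered by this call. */
            while (--cpu >= 0)
                unregister_device(&devices[cpu]);
            unregister_driver(drv);
            return ret;
        }
    }
    return 0;
}

/* Counterpart that tears everything down again. */
static void idle_unregister_all(struct idle_driver *drv)
{
    assert(drv != NULL);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        unregister_device(&devices[cpu]);
    unregister_driver(drv);
}
```

Keeping the rollback inside the wrapper is the design point: a driver that calls it either ends up fully registered or not registered at all, and never has to check per-device return codes itself.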
    • cpuidle: remove en_core_tk_irqen flag · 554c06ba
      Daniel Lezcano committed
      The en_core_tk_irqen flag is set in all the cpuidle drivers, which
      means it is not necessary to specify this flag.
      
      Remove the flag and the code related to it.
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Kevin Hilman <khilman@linaro.org>  # for mach-omap2/*
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  21. 01 Apr, 2013 1 commit
  22. 26 Jan, 2013 1 commit
    • PM / tracing: remove deprecated power trace API · 43720bd6
      Paul Gortmaker committed
      The text in Documentation said it would be removed in 2.6.41;
      the text in the Kconfig said removal in the 3.1 release.  Either
      way you look at it, we are well past both, so push it off a cliff.
      
      Note that the POWER_CSTATE and the POWER_PSTATE are part of the
      legacy tracing API.  Remove all tracepoints which use these flags.
      As can be seen from context, most already have a trace entry via
      trace_cpu_idle anyways.
      
      Also, the cpufreq/cpufreq.c PSTATE one is actually unpaired, as
      compared to the CSTATE ones which all have a clear start/stop.
      As part of this, the trace_power_frequency also becomes orphaned,
      so it too is deleted.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  23. 15 Jan, 2013 1 commit
  24. 03 Jan, 2013 1 commit
  25. 27 Nov, 2012 1 commit
    • cpuidle: Measure idle state durations with monotonic clock · a474a515
      Julius Werner committed
      Many cpuidle drivers measure their time spent in an idle state by
      reading the wallclock time before and after idling and calculating the
      difference. This leads to erroneous results when the wallclock time gets
      updated by another processor in the meantime, adding that clock
      adjustment to the idle state's time counter.
      
      If the clock adjustment was negative, the result is even worse due to an
      erroneous cast from int to unsigned long long of the last_residency
      variable. The negative 32 bit integer will zero-extend and result in a
      forward time jump of roughly four billion microseconds or 1.2 hours on
      the idle state residency counter.
      
      This patch changes all affected cpuidle drivers to either use the
      monotonic clock for their measurements or make use of the generic time
      measurement wrapper in cpuidle.c, which was already working correctly.
      Some superfluous CLIs/STIs in the ACPI code are removed (interrupts
      should always already be disabled before entering the idle function, and
      not get reenabled until the generic wrapper has performed its second
      measurement). It also removes the erroneous cast, making sure that
      negative residency values are applied correctly even though they should
      not appear anymore.
      Signed-off-by: Julius Werner <jwerner@chromium.org>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Len Brown <len.brown@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
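The arithmetic behind the zero-extension problem in a474a515 can be reproduced in a few lines. This is only an illustration of the effect, assuming the residency is a 32-bit count of microseconds that is widened as an unsigned 32-bit quantity; the function names are hypothetical and the exact cast used by the affected drivers is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Problematic widening: the negative 32-bit delta is reinterpreted as an
 * unsigned 32-bit value and zero-extended, so -5 becomes 4294967291.
 */
static uint64_t accumulate_zero_extended(uint64_t total_us, int32_t delta_us)
{
    return total_us + (uint64_t)(uint32_t)delta_us;
}

/*
 * Sign-preserving widening: the delta is sign-extended to 64 bits first,
 * so adding it (modulo 2^64) really subtracts 5 from the total.
 */
static uint64_t accumulate_signed(uint64_t total_us, int32_t delta_us)
{
    return total_us + (uint64_t)(int64_t)delta_us;
}
```

A small negative delta therefore adds almost 2^32 microseconds in the first variant, which is about 4295 seconds or roughly 1.2 hours, while the second variant applies the correction as intended.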
  26. 15 Nov, 2012 3 commits