1. 14 Feb 2015, 1 commit
    • PM / sleep: Re-implement suspend-to-idle handling · 38106313
      Committed by Rafael J. Wysocki
      In preparation for adding support for quiescing timers in the final
      stage of suspend-to-idle transitions, rework the freeze_enter()
      function making the system wait on a wakeup event, the freeze_wake()
      function terminating the suspend-to-idle loop, and the mechanism by
      which deep idle states are entered during suspend-to-idle.
      
      First of all, introduce a simple state machine for suspend-to-idle
      and make the code in question use it.
      
      Second, prevent freeze_enter() from losing wakeup events due to race
      conditions and ensure that the number of online CPUs won't change
      while it is being executed.  In addition to that, make it force
      all of the CPUs to re-enter the idle loop in case they are in idle
      states already (so they can enter deeper idle states if possible).
      
      Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
      checks in cpuidle_select() and cpuidle_reflect() with a single
      suspend-to-idle state check in cpuidle_idle_call().
      
      Finally, introduce cpuidle_enter_freeze() that will simply find the
      deepest idle state available to the given CPU and enter it using
      cpuidle_enter() (a simplified sketch of this selection follows this entry).
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      38106313
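      An illustrative, self-contained sketch of the selection step described
      above: scan from the deepest idle state downwards and enter the first
      usable one. The struct and the "usable" criterion are simplified
      stand-ins, not the kernel's cpuidle data structures.

      #include <stdbool.h>
      #include <stdio.h>

      struct idle_state {
              const char *name;
              unsigned int exit_latency_us;   /* deeper states have higher latency */
              bool disabled;
      };

      /* Scan from the deepest (last) state down to the shallowest and return
       * the first usable one, mirroring what a cpuidle_enter_freeze() style
       * helper would do during suspend-to-idle. */
      static int find_deepest_state(const struct idle_state *states, int count)
      {
              for (int i = count - 1; i >= 0; i--) {
                      if (!states[i].disabled)
                              return i;
              }
              return -1;      /* nothing usable; fall back to the default idle path */
      }

      int main(void)
      {
              struct idle_state states[] = {
                      { "C1", 2,   false },
                      { "C3", 50,  false },
                      { "C6", 120, true  },   /* e.g. disabled on this platform */
              };
              int idx = find_deepest_state(states, 3);

              if (idx >= 0)
                      printf("suspend-to-idle would enter %s\n", states[idx].name);
              return 0;
      }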
  2. 31 Jan 2015, 1 commit
  3. 24 Sep 2014, 1 commit
    • sched: Let the scheduler see CPU idle states · 442bf3aa
      Committed by Daniel Lezcano
      When the cpu enters idle, it stores the cpuidle state pointer in its
      struct rq instance which in turn could be used to make a better decision
      when balancing tasks.
      
      As soon as the cpu exits its idle state, the struct rq reference is
      cleared.
      
      There are a couple of situations where the idle state pointer could be changed
      while it is being consulted:
      
      1. For x86/acpi with dynamic c-states, when a laptop switches from battery
         to AC, that could result in the deeper idle state being removed. The acpi
         driver triggers:
      	'acpi_processor_cst_has_changed'
      		'cpuidle_pause_and_lock'
      			'cpuidle_uninstall_idle_handler'
      				'kick_all_cpus_sync'.
      
      All cpus will exit their idle state and the idle state pointer in the rq
      will be set to NULL.
      
      2. The cpuidle driver is unloaded. Logically that could happen, but not
      in practice, because the drivers are always compiled in and 95% of them
      are not coded to unregister themselves.  In any case, the unloading code
      must call 'cpuidle_unregister_device', which calls 'cpuidle_pause_and_lock'
      leading to 'kick_all_cpus_sync' as mentioned above.
      
      A race can happen if we use the pointer and then one of these two scenarios
      occurs at the same moment.
      
      In order to be safe, the idle state pointer stored in the rq must be
      used inside an rcu_read_lock section, where we are protected by the
      'rcu_barrier' in the 'cpuidle_uninstall_idle_handler' function. The
      idle_get_state() and idle_put_state() accessors should be used to that
      effect (a sketch of this accessor pattern follows this entry).
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linaro-kernel@lists.linaro.org
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      442bf3aa
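      A kernel-flavoured sketch of the accessor pattern described above. It
      assumes the usual RCU primitives from <linux/rcupdate.h> and a
      simplified struct rq; the helper names follow the commit text, and the
      in-tree code may differ in detail.

      struct rq {
              struct cpuidle_state __rcu *idle_state;
              /* ... other runqueue fields ... */
      };

      /* The idle task publishes the pointer on idle entry and clears it
       * (state == NULL) on idle exit. */
      static inline void idle_set_state(struct rq *rq, struct cpuidle_state *state)
      {
              rcu_assign_pointer(rq->idle_state, state);
      }

      /* Consumers such as the load balancer keep rcu_read_lock() held for as
       * long as they look at the state, so they stay protected against the
       * teardown path described above (cpuidle_uninstall_idle_handler() and
       * its 'rcu_barrier') while the cpuidle state objects could go away. */
      static inline struct cpuidle_state *idle_get_state(struct rq *rq)
      {
              rcu_read_lock();
              return rcu_dereference(rq->idle_state);
      }

      static inline void idle_put_state(struct rq *rq)
      {
              rcu_read_unlock();
      }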
  4. 09 Jul 2014, 1 commit
  5. 05 Jul 2014, 1 commit
  6. 05 Jun 2014, 2 commits
  7. 08 May 2014, 3 commits
  8. 01 May 2014, 1 commit
  9. 11 Mar 2014, 4 commits
  10. 27 Feb 2014, 1 commit
  11. 11 Feb 2014, 2 commits
  12. 14 Jan 2014, 1 commit
  13. 25 Sep 2013, 2 commits
    • sched: Add NEED_RESCHED to the preempt_count · f27dde8d
      Committed by Peter Zijlstra
      In order to combine the preemption and need_resched test we need to
      fold the need_resched information into the preempt_count value.
      
      Since the NEED_RESCHED flag is set across CPUs, this needs to be an
      atomic operation; however, we very much want to avoid making
      preempt_count atomic, so we keep the existing TIF_NEED_RESCHED
      infrastructure in place but test it at three sites and fold its value
      into preempt_count; namely:
      
       - resched_task() when setting TIF_NEED_RESCHED on the current task
       - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a
                         remote task, it follows up with a reschedule IPI
                         and we can modify the CPU-local preempt_count from
                         there.
       - cpu_idle_loop() for when resched_task() found tsk_is_polling().
      
      We use an inverted bitmask to indicate need_resched so that a 0 means
      both need_resched and !atomic (illustrated by the sketch after this entry).
      
      Also remove the barrier() in preempt_enable() between
      preempt_enable_no_resched() and preempt_check_resched() to avoid
      having to reload the preemption value and allow the compiler to use
      the flags of the previous decrement. I couldn't come up with any sane
      reason for this barrier() to be there as preempt_enable_no_resched()
      already has a barrier() before doing the decrement.
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f27dde8d
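      A self-contained sketch of the inverted-bit encoding described above.
      The bit position, variable and helper names here are illustrative, not
      the kernel's actual PREEMPT_NEED_RESCHED definitions.

      #include <stdbool.h>
      #include <stdio.h>

      /* Inverted: the bit being SET means "no reschedule needed". */
      #define PREEMPT_NEED_RESCHED 0x80000000u

      static unsigned int preempt_count_demo = PREEMPT_NEED_RESCHED;

      static void fold_need_resched(void)  { preempt_count_demo &= ~PREEMPT_NEED_RESCHED; }
      static void clear_need_resched(void) { preempt_count_demo |=  PREEMPT_NEED_RESCHED; }

      /* Because the flag is inverted, "may we preempt right now?" collapses
       * into a single compare against zero: zero means need_resched is set
       * AND no preempt-disable/atomic counts are held. */
      static bool should_resched(void)     { return preempt_count_demo == 0; }

      int main(void)
      {
              printf("idle, no request   : %d\n", should_resched());  /* 0 */
              fold_need_resched();                  /* resched_task() folds the flag */
              printf("resched requested  : %d\n", should_resched());  /* 1 */
              preempt_count_demo += 1;              /* like preempt_disable() */
              printf("inside preempt-off : %d\n", should_resched());  /* 0 */
              preempt_count_demo -= 1;              /* preempt_enable() would resched here */
              clear_need_resched();                 /* the reschedule has happened */
              printf("after scheduling   : %d\n", should_resched());  /* 0 */
              return 0;
      }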
    • sched, idle: Fix the idle polling state logic · ea811747
      Committed by Peter Zijlstra
      Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
      regressed several workloads and caused excessive reschedule
      interrupts.
      
      The patch in question failed to notice that the x86 code had an
      inverted sense of the polling state versus the new generic code (x86:
      default polling, generic: default !polling).
      
      Fix the two prominent x86 mwait based idle drivers and introduce a few
      new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
      usage).
      
      Also switch the idle routines to tif_need_resched(), which is an
      immediate TIF_NEED_RESCHED test, as opposed to need_resched(), which
      will end up being slightly different (the polling protocol involved is
      sketched after this entry).
      Reported-by: Mike Galbraith <bitbucket@online.de>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: lenb@kernel.org
      Cc: tglx@linutronix.de
      Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea811747
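      A self-contained model of the polling protocol behind this fix: the
      idle side advertises TIF_POLLING_NRFLAG and then re-tests
      TIF_NEED_RESCHED, while the waker sets TIF_NEED_RESCHED and only sends
      a reschedule IPI when the target is not polling. The flag names mirror
      the kernel's, but this is a userspace sketch, not the actual helpers
      the commit introduces.

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stdio.h>

      #define TIF_NEED_RESCHED   (1u << 0)
      #define TIF_POLLING_NRFLAG (1u << 1)

      static _Atomic unsigned int thread_flags;

      /* Idle side: set polling, then re-test resched.  The ordering between
       * the two steps is the delicate part; a seq_cst RMW provides it here. */
      static bool set_polling_and_test(void)
      {
              atomic_fetch_or(&thread_flags, TIF_POLLING_NRFLAG);
              return atomic_load(&thread_flags) & TIF_NEED_RESCHED;
      }

      static bool clear_polling_and_test(void)
      {
              atomic_fetch_and(&thread_flags, ~TIF_POLLING_NRFLAG);
              return atomic_load(&thread_flags) & TIF_NEED_RESCHED;
      }

      /* Waker side: set resched first, then look at polling to decide
       * whether the target needs an IPI at all. */
      static void wake_idle_cpu(void)
      {
              unsigned int old = atomic_fetch_or(&thread_flags, TIF_NEED_RESCHED);

              if (old & TIF_POLLING_NRFLAG)
                      printf("target is polling: it will see the flag, no IPI\n");
              else
                      printf("target not polling: send a reschedule IPI\n");
      }

      int main(void)
      {
              if (!set_polling_and_test()) {
                      wake_idle_cpu();        /* simulate a remote wakeup while polling */
                      /* a real idle loop would spin on the need-resched test here */
              }
              if (clear_polling_and_test())
                      printf("leaving idle to reschedule\n");
              return 0;
      }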
  14. 15 Jun 2013, 1 commit
  15. 12 Jun 2013, 1 commit
    • idle: Add the stack canary init to cpu_startup_entry() · d7880812
      Committed by Thomas Gleixner
      Moving x86 to the generic idle implementation (commit 7d1a9417 "x86:
      Use generic idle loop") wrecked the stack protector.
      
      I stupidly missed that boot_init_stack_canary() must be inlined from a
      function which never returns, but I put that call into
      arch_cpu_idle_prepare() which of course returns.
      
      I pondered playing tricks with arch_cpu_idle_prepare() first, but then
      I noticed that the other archs which have implemented the
      stackprotector (ARM and SH) do not initialize the canary for the
      non-boot cpus.
      
      So I decided to move the boot_init_stack_canary() call into
      cpu_startup_entry(), guarded by #ifdef CONFIG_X86 for now (see the
      sketch after this entry). This #ifdef is just a temporary measure as I
      don't want to inflict the boot_init_stack_canary() call on ARM and SH
      that late in the cycle.
      
      I'll queue a patch for 3.11 which removes the #ifdef if the ARM/SH
      maintainers have no objection.
      Reported-by: Wouter van Kesteren <woutershep@gmail.com>
      Cc: x86@kernel.org
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      d7880812
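      Roughly the shape of the change described above, as a hedged sketch
      (simplified; the real cpu_startup_entry() does a little more than this):

      void cpu_startup_entry(enum cpuhp_state state)
      {
      #ifdef CONFIG_X86
              /* Must run in a function that never returns: it changes the
               * canary this frame was set up with, so returning from here
               * would trip the stack-protector check. */
              boot_init_stack_canary();
      #endif
              arch_cpu_idle_prepare();
              cpu_idle_loop();
      }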
  16. 14 May 2013, 1 commit
    • rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit · b47430d3
      Committed by Srivatsa S. Bhat
      Bjørn Mork reported the following warning when running powertop.
      
      [   49.289034] ------------[ cut here ]------------
      [   49.289055] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common.isra.48+0x3d/0x125()
      [   49.289244] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-bisect-rcu-warn+ #107
      [   49.289251]  ffffffff8157d8c8 ffffffff81801e28 ffffffff8137e4e3 ffffffff81801e68
      [   49.289260]  ffffffff8103094f ffffffff81801e68 0000000000000000 ffff88023afcd9b0
      [   49.289268]  0000000000000000 0140000000000000 ffff88023bee7700 ffffffff81801e78
      [   49.289276] Call Trace:
      [   49.289285]  [<ffffffff8137e4e3>] dump_stack+0x19/0x1b
      [   49.289293]  [<ffffffff8103094f>] warn_slowpath_common+0x62/0x7b
      [   49.289300]  [<ffffffff8103097d>] warn_slowpath_null+0x15/0x17
      [   49.289306]  [<ffffffff810a9006>] rcu_eqs_exit_common.isra.48+0x3d/0x125
      [   49.289314]  [<ffffffff81079b49>] ? trace_hardirqs_off_caller+0x37/0xa6
      [   49.289320]  [<ffffffff810a9692>] rcu_idle_exit+0x85/0xa8
      [   49.289327]  [<ffffffff8107076e>] trace_cpu_idle_rcuidle+0xae/0xff
      [   49.289334]  [<ffffffff810708b1>] cpu_startup_entry+0x72/0x115
      [   49.289341]  [<ffffffff813689e5>] rest_init+0x149/0x150
      [   49.289347]  [<ffffffff8136889c>] ? csum_partial_copy_generic+0x16c/0x16c
      [   49.289355]  [<ffffffff81a82d34>] start_kernel+0x3f0/0x3fd
      [   49.289362]  [<ffffffff81a8274c>] ? repair_env_string+0x5a/0x5a
      [   49.289368]  [<ffffffff81a82481>] x86_64_start_reservations+0x2a/0x2c
      [   49.289375]  [<ffffffff81a82550>] x86_64_start_kernel+0xcd/0xd1
      [   49.289379] ---[ end trace 07a1cc95e29e9036 ]---
      
      The warning is that 'rdtp->dynticks' has an unexpected value, which roughly
      translates to: the calls to rcu_idle_enter() and rcu_idle_exit() were not
      made in the correct order, or were otherwise messed up.
      
      And Bjørn's painstaking debugging indicated that this happens when the idle
      loop enters the poll mode. Looking at the poll function cpu_idle_poll(), and
      the implementation of trace_cpu_idle_rcuidle(), the problem becomes very clear:
      cpu_idle_poll() lacks calls to rcu_idle_enter/exit(), and trace_cpu_idle_rcuidle()
      calls them in the reverse order - first rcu_idle_exit(), and then rcu_idle_enter().
      Hence the even/odd alternating sequence of rdtp->dynticks goes for a toss.
      
      And powertop readily triggers this because powertop uses the idle-tracing
      infrastructure extensively.
      
      So, to fix this, wrap the code in cpu_idle_poll() within rcu_idle_enter/exit(),
      so that it blends properly with the calls inside trace_cpu_idle_rcuidle() and
      the function ordering comes out right (the resulting shape is sketched after
      this entry).
      Reported-and-tested-by: Bjørn Mork <bjorn@mork.no>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/519169BF.4080208@linux.vnet.ibm.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      b47430d3
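      A sketch of the resulting shape of cpu_idle_poll() as described above
      (simplified; interrupt handling and the exact trace arguments may
      differ in the real code): rcu_idle_enter()/rcu_idle_exit() bracket the
      loop so the reversed pair inside trace_cpu_idle_rcuidle() nests
      correctly and rdtp->dynticks keeps its even/odd alternation.

      static int cpu_idle_poll(void)
      {
              rcu_idle_enter();               /* CPU is going (quasi-)idle for RCU */
              trace_cpu_idle_rcuidle(0, smp_processor_id());  /* internally: exit, then enter */
              local_irq_enable();
              while (!need_resched())
                      cpu_relax();
              trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
              rcu_idle_exit();
              return 1;
      }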
  17. 17 Apr 2013, 1 commit
  18. 08 Apr 2013, 2 commits