1. 06 10月, 2011 2 次提交
    • S
      sched: Use resched IPI to kick off the nohz idle balance · ca38062e
      Suresh Siddha 提交于
      Current use of smp call function to kick the nohz idle balance can deadlock
      in this scenario.
      
      1. cpu-A did a generic_exec_single() to cpu-B and after queuing its call single
      data (csd) to the call single queue, cpu-A took a timer interrupt.  Actual IPI
      to cpu-B to process the call single queue is not yet sent.
      
      2. As part of the timer interrupt handler, cpu-A decided to kick cpu-B
      for the idle load balancing (sets cpu-B's rq->nohz_balance_kick to 1)
      and __smp_call_function_single() with nowait will queue the csd to the
      cpu-B's queue. But the generic_exec_single() won't send an IPI to cpu-B
      as the call single queue was not empty.
      
      3. cpu-A is busy with lot of interrupts
      
      4. Meanwhile cpu-B is entering and exiting idle and noticed that it has
      it's rq->nohz_balance_kick set to '1'. So it will go ahead and do the
      idle load balancer and clear its rq->nohz_balance_kick.
      
      5. At this point, csd queued as part of the step-2 above is still locked
      and waiting to be serviced on cpu-B.
      
      6. cpu-A is still busy with interrupt load and now it got another timer
      interrupt and as part of it decided to kick cpu-B for another idle load
      balancing (as it finds cpu-B's rq->nohz_balance_kick cleared in step-4
      above) and does __smp_call_function_single() with the same csd that is
      still locked.
      
      7. And we get a deadlock waiting for the csd_lock() in the
      __smp_call_function_single().
      
      Main issue here is that cpu-B can service the idle load balancer kick
      request from cpu-A even with out receiving the IPI and this lead to
      doing multiple __smp_call_function_single() on the same csd leading to
      deadlock.
      
      To kick a cpu, scheduler already has the reschedule vector reserved. Use
      that mechanism (kick_process()) instead of using the generic smp call function
      mechanism to kick off the nohz idle load balancing and avoid the deadlock.
      
         [ This issue is present from 2.6.35+ kernels, but marking it -stable
           only from v3.0+ as the proposed fix depends on the scheduler_ipi()
           that is introduced recently. ]
      Reported-by: NPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: stable@kernel.org # v3.0+
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111003220934.834943260@sbsiddha-desk.sc.intel.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      ca38062e
    • I
      Merge commit 'v3.1-rc9' into sched/core · 9243a169
      Ingo Molnar 提交于
      Merge reason: pick up latest fixes.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9243a169
  2. 05 10月, 2011 12 次提交
  3. 04 10月, 2011 26 次提交