1. 07 Dec, 2011 2 commits
    • sched, nohz: Fix the idle cpu check in nohz_idle_balance · 8a6d42d1
      Suresh Siddha committed
      The cpu bit in nohz.idle_cpu_mask is reset in the first busy tick after
      exiting idle. So during nohz_idle_balance(), the intention is to double
      check that a cpu that is part of the idle_cpu_mask is indeed idle before
      going ahead and performing idle balance for that cpu.
      
      Fix the cpu typo in the idle_cpu() check during nohz_idle_balance().
      Reported-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1323199177.1984.12.camel@sbsiddha-desk.sc.intel.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
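The check described in this commit can be sketched in user space. This is only a minimal model, not the kernel code: the arrays `idle_cpu_mask` and `cpu_is_idle`, the function `nohz_idle_balance_model()`, and the value of `NR_CPUS` are assumptions made for illustration. It shows why the re-check has to look at the cpu being balanced rather than the cpu doing the balancing.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical user-space stand-ins for kernel state. */
static bool idle_cpu_mask[NR_CPUS]; /* models nohz.idle_cpu_mask */
static bool cpu_is_idle[NR_CPUS];   /* models what idle_cpu(cpu) would report */

static bool idle_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu_is_idle[cpu];
}

/*
 * Count the cpus that this_cpu would balance on behalf of.
 * A cpu can still be set in the mask although it has turned busy
 * (its bit is only cleared in its first busy tick), so every
 * candidate is re-checked before it is balanced.
 */
static int nohz_idle_balance_model(int this_cpu)
{
    int balanced = 0;

    for (int balance_cpu = 0; balance_cpu < NR_CPUS; balance_cpu++) {
        if (!idle_cpu_mask[balance_cpu])
            continue;
        /* The re-check must use balance_cpu, not this_cpu. */
        if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
            continue;
        balanced++;
    }
    return balanced;
}
```

With all four mask bits set and cpu 2 already busy, calling `nohz_idle_balance_model(0)` skips cpu 0 (the balancer itself) and cpu 2 (no longer idle) and balances the remaining two.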
    • sched: Save some hrtick_start_fair cycles · b39e66ea
      Mike Galbraith committed
      hrtick_start_fair() shows up in profiles even when disabled.
      
      v3.0.6
      
      taskset -c 3 pipe-test
      
         PerfTop:     997 irqs/sec  kernel:89.5%  exact:  0.0% [1000Hz cycles],  (all, CPU: 3)
      ------------------------------------------------------------------------------------------------
      
                   Virgin                                    Patched
                   samples  pcnt function                    samples  pcnt function
                   _______ _____ ___________________________ _______ _____ ___________________________
      
                   2880.00 10.2% __schedule                  3136.00 11.3% __schedule
                   1634.00  5.8% pipe_read                   1615.00  5.8% pipe_read
                   1458.00  5.2% system_call                 1534.00  5.5% system_call
                   1382.00  4.9% _raw_spin_lock_irqsave      1412.00  5.1% _raw_spin_lock_irqsave
                   1202.00  4.3% pipe_write                  1255.00  4.5% copy_user_generic_string
                   1164.00  4.1% copy_user_generic_string    1241.00  4.5% __switch_to
                   1097.00  3.9% __switch_to                  929.00  3.3% mutex_lock
                    872.00  3.1% mutex_lock                   846.00  3.0% mutex_unlock
                    687.00  2.4% mutex_unlock                 804.00  2.9% pipe_write
                    682.00  2.4% native_sched_clock           713.00  2.6% native_sched_clock
                    643.00  2.3% system_call_after_swapgs     653.00  2.3% _raw_spin_unlock_irqrestore
                    617.00  2.2% sched_clock_local            633.00  2.3% fsnotify
                    612.00  2.2% fsnotify                     605.00  2.2% sched_clock_local
                    596.00  2.1% _raw_spin_unlock_irqrestore  593.00  2.1% system_call_after_swapgs
                    542.00  1.9% sysret_check                 559.00  2.0% sysret_check
                    467.00  1.7% fget_light                   472.00  1.7% fget_light
                    462.00  1.6% finish_task_switch           461.00  1.7% finish_task_switch
                    437.00  1.5% vfs_write                    442.00  1.6% vfs_write
                    431.00  1.5% do_sync_write                428.00  1.5% do_sync_write
                    413.00  1.5% select_task_rq_fair          404.00  1.5% _raw_spin_lock_irq
                    386.00  1.4% update_curr                  402.00  1.4% update_curr
                    385.00  1.4% rw_verify_area               389.00  1.4% do_sync_read
                    377.00  1.3% _raw_spin_lock_irq           378.00  1.4% vfs_read
                    369.00  1.3% do_sync_read                 340.00  1.2% pipe_iov_copy_from_user
                    360.00  1.3% vfs_read                     316.00  1.1% __wake_up_sync_key
      *             342.00  1.2% hrtick_start_fair            313.00  1.1% __wake_up_common
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      [ fixed !CONFIG_SCHED_HRTICK borkage ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1321971607.6855.17.camel@marge.simson.net
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 06 Dec, 2011 8 commits
  3. 17 Nov, 2011 2 commits
  4. 16 Nov, 2011 3 commits
  5. 14 Nov, 2011 2 commits
  6. 06 Oct, 2011 3 commits
    • sched: Wrap scheduler p->cpus_allowed access · fa17b507
      Peter Zijlstra committed
      This task is preparatory for the migrate_disable() implementation, but
      stands on its own and provides a cleanup.
      
      It currently only converts those sites required for task-placement.
      Kosaki-san once mentioned replacing cpus_allowed with a proper
      cpumask_t instead of the NR_CPUS sized array it currently is; that
      would also require something like this.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Link: http://lkml.kernel.org/n/tip-e42skvaddos99psip0vce41o@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
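The idea of wrapping the access can be illustrated with a small user-space sketch. Everything here is an assumption made for illustration: the commit text does not name the wrapper, so `task_allowed_mask()` is a hypothetical name, and `struct cpumask`, `struct task`, and `cpumask_test_cpu()` are simplified stand-ins rather than the kernel definitions. The point is that once placement code reads the mask only through one accessor, the storage behind it can change without touching the callers.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Simplified stand-in for the kernel's cpumask type (one word of bits). */
struct cpumask {
    unsigned long bits;
};

/* Simplified stand-in for a task; only the field of interest is modeled. */
struct task {
    struct cpumask cpus_allowed;
};

/*
 * Hypothetical accessor: task-placement code goes through this function
 * instead of dereferencing p->cpus_allowed directly.
 */
static const struct cpumask *task_allowed_mask(const struct task *p)
{
    return &p->cpus_allowed;
}

/* Simplified stand-in: test whether cpu is set in the mask. */
static bool cpumask_test_cpu(int cpu, const struct cpumask *mask)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (mask->bits >> cpu) & 1UL;
}

/* Return the lowest-numbered cpu the task may run on, or -1 if none. */
static int select_first_allowed(const struct task *p)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpumask_test_cpu(cpu, task_allowed_mask(p)))
            return cpu;
    }
    return -1;
}
```

If `cpus_allowed` later became a pointer or a differently sized object, only `task_allowed_mask()` would need to change in this model; `select_first_allowed()` would stay as it is.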
    • sched: Request for idle balance during nohz idle load balance · 6eb57e0d
      Suresh Siddha committed
      rq's idle_at_tick is set to idle/busy during the timer tick
      depending on whether the cpu was idle or not. This is used later in the
      load balance that is done in the softirq context (which is a process
      context in -RT kernels).
      
      For nohz kernels, for the cpu doing nohz idle load balance on behalf of
      all the idle cpu's, its rq->idle_at_tick might have a stale value (which is
      recorded when it got the timer tick presumably when it is busy).
      
      As the nohz idle load balancing is also done at the same place
      as the regular load balancing, nohz idle load balancing was bailing out
      when it saw that rq's idle_at_tick was not set, leading to poor system
      utilization.
      
      Rename rq's idle_at_tick to idle_balance and set it when someone requests
      nohz idle balance on an idle cpu.
      Reported-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111003220934.892350549@sbsiddha-desk.sc.intel.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
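The stale-flag problem in this commit can be modeled in a few lines of user-space C. This is a sketch under stated assumptions, not the kernel implementation: `struct rq_model` and the functions `tick()`, `handle_kick()`, and `run_balance()` are hypothetical names, and only the two flags mentioned in the commit message are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two per-runqueue flags discussed above. */
struct rq_model {
    bool idle_balance;      /* the flag formerly called idle_at_tick */
    bool nohz_balance_kick; /* set by the cpu that requests the balance */
};

/* Timer tick: record whether the cpu was idle at this tick. */
static void tick(struct rq_model *rq, bool cpu_idle_now)
{
    assert(rq);
    rq->idle_balance = cpu_idle_now;
}

/*
 * Kick handling on the target cpu: if a nohz idle balance was requested
 * and the cpu is idle now, mark it explicitly instead of relying on the
 * value recorded at the last tick.
 */
static void handle_kick(struct rq_model *rq, bool cpu_idle_now)
{
    assert(rq);
    if (rq->nohz_balance_kick && cpu_idle_now)
        rq->idle_balance = true;
}

/* Returns true if the nohz idle balance runs, false if it bails out. */
static bool run_balance(struct rq_model *rq)
{
    assert(rq);
    if (!rq->idle_balance)
        return false; /* flag says "busy": bail out */
    rq->nohz_balance_kick = false;
    return true;
}
```

In this model, a cpu whose last tick happened while it was busy refuses to balance even after it has gone idle and been kicked; calling `handle_kick()` first refreshes the flag so that `run_balance()` proceeds.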
    • sched: Use resched IPI to kick off the nohz idle balance · ca38062e
      Suresh Siddha committed
      The current use of the smp call function to kick the nohz idle balance
      can deadlock in the following scenario.
      
      1. cpu-A did a generic_exec_single() to cpu-B and after queuing its call single
      data (csd) to the call single queue, cpu-A took a timer interrupt.  Actual IPI
      to cpu-B to process the call single queue is not yet sent.
      
      2. As part of the timer interrupt handler, cpu-A decided to kick cpu-B
      for the idle load balancing (sets cpu-B's rq->nohz_balance_kick to 1)
      and __smp_call_function_single() with nowait will queue the csd to the
      cpu-B's queue. But the generic_exec_single() won't send an IPI to cpu-B
      as the call single queue was not empty.
      
      3. cpu-A is busy with a lot of interrupts.
      
      4. Meanwhile cpu-B is entering and exiting idle and notices that it has
      its rq->nohz_balance_kick set to '1'. So it goes ahead, does the
      idle load balancing, and clears its rq->nohz_balance_kick.
      
      5. At this point, csd queued as part of the step-2 above is still locked
      and waiting to be serviced on cpu-B.
      
      6. cpu-A is still busy with interrupt load and now it got another timer
      interrupt and as part of it decided to kick cpu-B for another idle load
      balancing (as it finds cpu-B's rq->nohz_balance_kick cleared in step-4
      above) and does __smp_call_function_single() with the same csd that is
      still locked.
      
      7. And we get a deadlock waiting for the csd_lock() in the
      __smp_call_function_single().
      
      The main issue here is that cpu-B can service the idle load balancer kick
      request from cpu-A even without receiving the IPI, and this leads to
      doing multiple __smp_call_function_single() calls on the same csd, leading
      to deadlock.
      
      To kick a cpu, scheduler already has the reschedule vector reserved. Use
      that mechanism (kick_process()) instead of using the generic smp call function
      mechanism to kick off the nohz idle load balancing and avoid the deadlock.
      
         [ This issue is present from 2.6.35+ kernels, but marking it -stable
           only from v3.0+ as the proposed fix depends on the scheduler_ipi()
           that is introduced recently. ]
      Reported-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: stable@kernel.org # v3.0+
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111003220934.834943260@sbsiddha-desk.sc.intel.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
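The numbered scenario in this commit can be replayed with a small user-space model. It is a sketch only: `struct csd_model`, `struct rq_model`, and the three functions are hypothetical names, the IPI itself is not modeled, and the point where the kernel would spin in csd_lock() is represented by returning `false` so that the example terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the call single data shared by all kicks. */
struct csd_model {
    bool locked; /* stays set until cpu-B processes its call single queue */
};

/* Hypothetical model of cpu-B's runqueue flag. */
struct rq_model {
    bool nohz_balance_kick;
};

/*
 * Old scheme (steps 2 and 6): set the kick flag, then queue the csd.
 * Returns false at the point where the kernel would wait on the
 * still-locked csd, i.e. where the deadlock occurs.
 */
static bool kick_with_csd(struct csd_model *csd, struct rq_model *rq)
{
    assert(csd && rq);
    rq->nohz_balance_kick = true;
    if (csd->locked)
        return false;   /* kernel: spins in csd_lock() */
    csd->locked = true; /* released only when cpu-B handles the IPI */
    return true;
}

/*
 * Step 4: cpu-B notices the flag while passing through idle and does
 * the balance without ever receiving the IPI, so the csd stays locked.
 */
static bool service_kick_from_idle(struct rq_model *rq)
{
    assert(rq);
    if (!rq->nohz_balance_kick)
        return false;
    rq->nohz_balance_kick = false;
    return true;
}

/*
 * New scheme: only the flag is set and a reschedule IPI (not modeled)
 * is sent; no shared, lockable object is involved.
 */
static bool kick_with_resched(struct rq_model *rq)
{
    assert(rq);
    rq->nohz_balance_kick = true;
    return true;
}
```

Replaying the steps: the first `kick_with_csd()` succeeds, `service_kick_from_idle()` clears the flag while the csd remains locked, and the second `kick_with_csd()` hits the locked csd. With `kick_with_resched()` the same sequence can repeat indefinitely because nothing is left locked between kicks.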
  7. 26 Sep, 2011 1 commit
  8. 14 Aug, 2011 14 commits
  9. 22 Jul, 2011 5 commits