    sched: Use group weight, idle cpu metrics to fix imbalances during idle · aae6d3dd
    Committed by Suresh Siddha
    Currently we consider a sched domain to be well balanced when the imbalance
    is less than the domain's imbalance_pct. As the number of cores and threads
    increases, current values of imbalance_pct (for example 25% for a
    NUMA domain) are not enough to detect imbalances like:
    
    a) On a WSM-EP system (two sockets, each having 6 cores and 12 logical threads),
    24 cpu-hogging tasks get scheduled as 13 on one socket and 11 on another
    socket, leading to an idle HT cpu.
    
    b) On a hypothetical 2 socket NHM-EX system (each socket having 8 cores and
    16 logical threads), 16 cpu-hogging tasks can get scheduled as 9 on one
    socket and 7 on another socket, leaving one core in a socket idle
    whereas in another socket we have a core having both its HT siblings busy.
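
To see why case (a) goes undetected, the threshold test can be sketched as
below. This is a simplified illustration, not the scheduler's code: it assumes
the 25% margin is encoded as imbalance_pct = 125, uses raw task counts as a
stand-in for group load, and the helper name within_imbalance_pct is
hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when the busiest group's load is
 * within the allowed margin of this group's load, i.e. the domain is
 * treated as balanced. Assumes imbalance_pct = 125 means "25% margin".
 */
static bool within_imbalance_pct(unsigned long busiest_load,
                                 unsigned long this_load,
                                 unsigned int imbalance_pct)
{
    return 100 * busiest_load <= (unsigned long)imbalance_pct * this_load;
}
```

With 13 tasks on one socket and 11 on the other, 100 * 13 = 1300 is not
greater than 125 * 11 = 1375, so the domain counts as balanced even though
one HT cpu sits idle.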
    
    While this issue can be fixed by decreasing the domain's imbalance_pct
    (by making it a function of number of logical cpus in the domain), it
    can potentially cause more task migrations across sched groups in an
    overloaded case.
    
    Fix this by using imbalance_pct only during newly_idle and busy
    load balancing. During idle load balancing, instead check whether there
    is an imbalance in the number of idle cpus between the busiest group and
    this sched_group, or whether the busiest group has more tasks than its
    weight, which the idle cpu in this_group can pull.
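
The idle-balancing check described above can be sketched roughly as follows.
This is a minimal standalone illustration, not the patch itself: the struct,
its field names, and the function name are hypothetical, and the tolerance of
one idle cpu between the two groups is an assumption chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical per-group statistics used only for this sketch. */
struct group_stats {
    unsigned int weight;     /* number of logical cpus in the group */
    unsigned int nr_running; /* runnable tasks in the group */
    unsigned int idle_cpus;  /* idle logical cpus in the group */
};

/*
 * Returns true when an idle cpu in this group should pull work:
 * either the busiest group runs more tasks than it has cpus, or this
 * group has noticeably more idle cpus than the busiest group.
 * The "+ 1" slack is an assumption of this sketch.
 */
static bool idle_balance_needed(const struct group_stats *this_grp,
                                const struct group_stats *busiest)
{
    if (busiest->nr_running > busiest->weight)
        return true;
    if (this_grp->idle_cpus > busiest->idle_cpus + 1)
        return true;
    return false;
}
```

Under these assumptions, case (a) triggers on the first test (13 tasks on a
12-thread socket), and case (b) triggers on the second (9 idle cpus here
against 7 in the busiest socket), while an even 8/8 split is left alone.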
    Reported-by: Nikhil Rao <ncrao@google.com>
    Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <1284760952.2676.11.camel@sbsiddha-MOBL3.sc.intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
sched.c 226.1 KB