    sched: re-tune NUMA topologies · ea3f01f8
    Committed by Ingo Molnar
    improve the sysbench ramp-up phase and its peak throughput on
    a 16way NUMA box, by turning on WAKE_AFFINE:
    
                 tip/sched   tip/sched+wake-affine
    -------------------------------------------------
        1:             700              830    +15.65%
        2:            1465             1391    -5.28%
        4:            3017             3105    +2.81%
        8:            5100             6021    +15.30%
       16:           10725            10745    +0.19%
       32:           10135            10150    +0.16%
       64:            9338             9240    -1.06%
      128:            8599             8252    -4.21%
      256:            8475             8144    -4.07%
    -------------------------------------------------
      SUM:           57558            57882    +0.56%
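
The percentage column appears to be computed relative to the new (right-hand) value rather than the baseline. This is an inference from the numbers, not something the commit message states. A minimal sketch, using hypothetical helper names and three rows copied from the table above, compares the two conventions:

```python
# Sketch only: the formula is inferred from the table, not taken from the commit.

def delta_vs_new(old: float, new: float) -> float:
    """Percentage change expressed relative to the NEW value."""
    return (new - old) / new * 100.0


def delta_vs_old(old: float, new: float) -> float:
    """Conventional percentage change, relative to the baseline value."""
    return (new - old) / old * 100.0


# (clients, tip/sched, tip/sched+wake-affine), copied from the table above
rows = [(1, 700, 830), (8, 5100, 6021), (64, 9338, 9240)]

for clients, old, new in rows:
    print(f"{clients:>3}: vs new {delta_vs_new(old, new):+.2f}%  "
          f"vs old {delta_vs_old(old, new):+.2f}%")
```

For the 8-client row this gives +15.30% relative to the new value, which matches the table, versus +18.06% relative to the baseline. Small mismatches on other rows are consistent with the throughput figures having been rounded before printing.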
    
    this change also improves lat_ctx from 6.69 usecs to 1.11 usecs
    (first run below: with the change; second run: without it):
    
      $ ./lat_ctx -s 0 2
      "size=0k ovr=1.19
      2 1.11
    
      $ ./lat_ctx -s 0 2
      "size=0k ovr=1.22
      2 6.69
    
    in sysbench it's an overall win, with some weakness at the
    many-clients end. That happens because we now under-balance this
    workload a bit. To counter that effect, turn on NEWIDLE:
    
                wake-affine        wake-affine+newidle
     -------------------------------------------------
         1:             830              834    +0.43%
         2:            1391             1401    +0.65%
         4:            3105             3091    -0.43%
         8:            6021             6046    +0.42%
        16:           10745            10736    -0.08%
        32:           10150            10206    +0.55%
        64:            9240             9533    +3.08%
       128:            8252             8355    +1.24%
       256:            8144             8384    +2.87%
     -------------------------------------------------
       SUM:           57882            58591    +1.21%
    
    as a bonus this not only improves the many-clients case but
    also improves the (more important) ramp-up phase.
    
    sysbench is a workload that quickly breaks down if the
    scheduler over-balances, so since it showed an improvement
    under NEWIDLE this change is definitely good.
topology.h 5.0 KB