    net: sched: use pinned timers · 4a8e320c
    Committed by Eric Dumazet
    While using a MQ + NETEM setup, I had confirmation that the default
    timer migration ( /proc/sys/kernel/timer_migration ) is killing us.
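
To see which mode a given machine is in, the sysctl file named above can be read directly. This is a minimal sketch: the helper name `show_timer_migration` and its optional path argument are illustrative only, and the file may be absent on some kernel configurations.

```shell
#!/bin/sh
# Print the timer migration setting (non-zero: timers may be moved off idle CPUs).
# The optional path argument is a hypothetical convenience for testing,
# not a kernel interface.
show_timer_migration() {
    f="${1:-/proc/sys/kernel/timer_migration}"
    if [ -r "$f" ]; then
        echo "timer_migration=$(cat "$f")"
    else
        echo "timer_migration=unavailable"
    fi
}

show_timer_migration
```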
    
    I installed the following on the receiver side of a TCP_STREAM test
    (the NIC has 8 TX queues):
    
    EST="est 1sec 4sec"
    for ETH in eth1
    do
     tc qd del dev $ETH root 2>/dev/null
     tc qd add dev $ETH root handle 1: mq
     tc qd add dev $ETH parent 1:1 $EST netem limit 70000 delay 6ms
     tc qd add dev $ETH parent 1:2 $EST netem limit 70000 delay 8ms
     tc qd add dev $ETH parent 1:3 $EST netem limit 70000 delay 10ms
     tc qd add dev $ETH parent 1:4 $EST netem limit 70000 delay 12ms
     tc qd add dev $ETH parent 1:5 $EST netem limit 70000 delay 14ms
     tc qd add dev $ETH parent 1:6 $EST netem limit 70000 delay 16ms
     tc qd add dev $ETH parent 1:7 $EST netem limit 80000 delay 18ms
     tc qd add dev $ETH parent 1:8 $EST netem limit 90000 delay 20ms
    done
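
The per-queue commands above follow a regular pattern (delay grows by 2ms per queue, with larger limits on the last two queues), so they could also be generated in a loop. The sketch below only prints the commands instead of running them; the function name `gen_netem_cmds` is hypothetical and the values are copied from the script above.

```shell
#!/bin/sh
# Print (do not execute) the tc commands from the setup above for one device.
gen_netem_cmds() {
    eth="$1"
    est="est 1sec 4sec"
    echo "tc qd del dev $eth root 2>/dev/null"
    echo "tc qd add dev $eth root handle 1: mq"
    i=1
    while [ "$i" -le 8 ]; do
        delay=$((4 + 2 * i))      # 6ms for queue 1 up to 20ms for queue 8
        case "$i" in
            7) limit=80000 ;;
            8) limit=90000 ;;
            *) limit=70000 ;;
        esac
        echo "tc qd add dev $eth parent 1:$i $est netem limit $limit delay ${delay}ms"
        i=$((i + 1))
    done
}

gen_netem_cmds eth1
```

Piping the output to `sh` as root would apply the configuration; printing first makes it easy to review before touching a live interface.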
    
    We can see that timers get migrated into a single cpu, presumably idle
    at the time timers are set up.
    Then all qdisc dequeues run from this cpu and huge lock contention
    happens. This single cpu is stuck in softirq mode and cannot dequeue
    fast enough.
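
One way to observe this kind of concentration is to compare per-CPU softirq counters in `/proc/softirqs` before and after a test run. The sketch below is an assumption about how one might look, not the method used for this commit; the helper name `show_net_softirqs` and its optional path argument are illustrative.

```shell
#!/bin/sh
# Show the CPU header plus the networking and hrtimer softirq rows.
# A single column growing much faster than the others suggests that
# one CPU is doing most of the work.
show_net_softirqs() {
    f="${1:-/proc/softirqs}"
    if [ ! -r "$f" ]; then
        echo "softirqs=unavailable"
        return 0
    fi
    grep -E 'CPU0|NET_TX:|NET_RX:|HRTIMER:' "$f"
}

show_net_softirqs
```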
    
        39.24%  [kernel]          [k] _raw_spin_lock
         2.65%  [kernel]          [k] netem_enqueue
         1.80%  [kernel]          [k] netem_dequeue
         1.63%  [kernel]          [k] copy_user_enhanced_fast_string
         1.45%  [kernel]          [k] _raw_spin_lock_bh
    
    By pinning qdisc timers on the cpu running the qdisc, we respect proper
    XPS setting and remove this lock contention.
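
For reference, the XPS setting mentioned here is exposed per TX queue as a CPU mask under sysfs. The sketch below lists those masks for one device; the helper name `show_xps` and its optional base-directory argument are hypothetical, and some devices may not expose `xps_cpus` at all.

```shell
#!/bin/sh
# List the XPS CPU mask of every TX queue of a network device.
# The second argument exists only to allow testing against a fake tree.
show_xps() {
    dev="$1"
    base="${2:-/sys/class/net}"
    for q in "$base/$dev"/queues/tx-*; do
        # Skip queues without a readable xps_cpus file (also covers an unmatched glob).
        [ -r "$q/xps_cpus" ] || continue
        echo "$(basename "$q") xps_cpus=$(cat "$q/xps_cpus" 2>/dev/null)"
    done
}

show_xps eth1
```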
    
         5.84%  [kernel]          [k] netem_enqueue
         4.83%  [kernel]          [k] _raw_spin_lock
         2.92%  [kernel]          [k] copy_user_enhanced_fast_string
    
    Current Qdiscs that benefit from this change are :
    
    	netem, cbq, fq, hfsc, tbf, htb.
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
sch_api.c 44.5 KB