• J
    net: bonding: Use per-cpu rr_tx_counter · 848ca918
    Jussi Maki 提交于
    The round-robin rr_tx_counter was shared across CPUs leading to
    significant cache thrashing at high packet rates. This patch switches
    the round-robin packet counter to use a per-cpu variable to decide
    the destination slave.
    
    On a test with 2x100Gbit ICE nic with pktgen_sample_04_many_flows.sh
    (-s 64 -t 32) the tx rate was 19.6Mpps before and 22.3Mpps after
    this patch.
    
    "perf top -e cache_misses" before:
        12.31%  [bonding]       [k] bond_xmit_roundrobin_slave_get
        10.59%  [sch_fq_codel]  [k] fq_codel_dequeue
         9.34%  [kernel]        [k] skb_release_data
    after:
        15.42%  [sch_fq_codel]  [k] fq_codel_dequeue
        10.06%  [kernel]        [k] __memset
         9.12%  [kernel]        [k] skb_release_data
    Signed-off-by: NJussi Maki <joamaki@gmail.com>
    Signed-off-by: NDavid S. Miller <davem@davemloft.net>
    848ca918
bonding.h 20.4 KB