• W
    net: compute skb->rxhash if nic hash may be 3-tuple · ecd5cf5d
    Willem de Bruijn 提交于
    Network device drivers can communicate a Toeplitz hash in skb->rxhash,
    but devices differ in their hashing capabilities. All compute a 5-tuple
    hash for TCP over IPv4, but for other connection-oriented protocols,
    they may compute only a 3-tuple. This breaks RPS load balancing, e.g.,
    for TCP over IPv6 flows. Additionally, for GRE and other tunnels,
    the kernel computes a 5-tuple hash over the inner packet if possible,
    but devices do not.
    
    This patch recomputes the rxhash in software in all cases where it
    cannot be certain that a 5-tuple was computed. Device drivers can avoid
    recomputation by setting the skb->l4_rxhash flag.
    
    Recomputing adds cycles to each packet when RPS is enabled or the
    packet arrives over a tunnel. A comparison of 200x TCP_STREAM between
    two servers running unmodified netnext with rxhash computation
    in hardware vs software (using ethtool -K eth0 rxhash [on|off]) shows
    how much time is spent in __skb_get_rxhash in this worst case:
    
         0.03%          swapper  [kernel.kallsyms]     [k] __skb_get_rxhash
         0.03%          swapper  [kernel.kallsyms]     [k] __skb_get_rxhash
         0.05%          swapper  [kernel.kallsyms]     [k] __skb_get_rxhash
    
    With 200x TCP_RR it increases to
    
         0.10%          netperf  [kernel.kallsyms]     [k] __skb_get_rxhash
         0.10%          netperf  [kernel.kallsyms]     [k] __skb_get_rxhash
         0.10%          netperf  [kernel.kallsyms]     [k] __skb_get_rxhash
    
    I considered having the patch explicitly skips recomputation when it knows
    that it will not improve the hash (TCP over IPv4), but that conditional
    complicates code without saving many cycles in practice, because it has
    to take place after flow dissector.
    Signed-off-by: NWillem de Bruijn <willemb@google.com>
    Acked-by: NEric Dumazet <edumazet@google.com>
    Signed-off-by: NDavid S. Miller <davem@davemloft.net>
    ecd5cf5d
skbuff.h 73.3 KB