    virtio-net: support RSC v4/v6 tcp traffic for Windows HCK · 2974e916
    Committed by Yuri Benditovich
    This commit adds an implementation of RX packet
    coalescing that is compatible with the requirements of
    the Windows Hardware Compatibility Kit (HCK).
    
    The device enables feature VIRTIO_NET_F_RSC_EXT in
    host features if it supports extended RSC functionality
    as defined in the specification.
    This feature requires at least one of VIRTIO_NET_F_GUEST_TSO4
    and VIRTIO_NET_F_GUEST_TSO6. The Windows guest driver acks
    this feature only if VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
    is also present.
    
    If the guest driver acks the VIRTIO_NET_F_RSC_EXT feature,
    the device coalesces TCPv4 and TCPv6 packets (if the
    respective VIRTIO_NET_F_GUEST_TSO feature is on),
    populates the extended RSC information in the virtio header
    and sets the VIRTIO_NET_HDR_F_RSC_INFO bit in the header flags.
    The device does not recalculate checksums in the coalesced
    packet, so they are not valid.
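
The feature dependency described above can be sketched as a pair of predicates over the acked feature word. This is a minimal illustration, not the code in virtio-net.c: the helper names (`has_feature`, `rsc4_active`, `rsc6_active`, `rsc_hdr_flags`) are hypothetical, and the bit values are assumed from the virtio-net specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed from the virtio-net specification. */
#define VIRTIO_NET_F_GUEST_TSO4   7
#define VIRTIO_NET_F_GUEST_TSO6   8
#define VIRTIO_NET_F_RSC_EXT      61
#define VIRTIO_NET_HDR_F_RSC_INFO 4   /* virtio header flag, not a feature bit */

/* Hypothetical helper: test one bit of a 64-bit feature word. */
static bool has_feature(uint64_t features, unsigned bit)
{
    return (features >> bit) & 1;
}

/* TCPv4 is coalesced only if the guest acked RSC_EXT and GUEST_TSO4. */
static bool rsc4_active(uint64_t acked)
{
    return has_feature(acked, VIRTIO_NET_F_RSC_EXT) &&
           has_feature(acked, VIRTIO_NET_F_GUEST_TSO4);
}

/* TCPv6 is coalesced only if the guest acked RSC_EXT and GUEST_TSO6. */
static bool rsc6_active(uint64_t acked)
{
    return has_feature(acked, VIRTIO_NET_F_RSC_EXT) &&
           has_feature(acked, VIRTIO_NET_F_GUEST_TSO6);
}

/* Mark a coalesced packet in the virtio header flags; others pass unchanged. */
static uint8_t rsc_hdr_flags(uint8_t flags, bool coalesced)
{
    return coalesced ? (uint8_t)(flags | VIRTIO_NET_HDR_F_RSC_INFO) : flags;
}
```

With only RSC_EXT and GUEST_TSO4 acked, this sketch coalesces IPv4 traffic and leaves IPv6 traffic alone.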
    
    In this case:
    All the data packets of a TCP connection are cached in a
    single buffer during every receive interval and are sent
    out by a timer. 'virtio_net_rsc_timeout' controls the
    interval; its value may affect the performance and response
    time of TCP connections. 50000 (50us) is an empirical value
    that yields a performance improvement, but since the WHQL
    test sends packets every 100us, '300000' (300us) is what
    passes the test case, and it is also the default value.
    Tune it via the 'rsc_interval' command line parameter of the
    'virtio-net-pci' device. For example, to launch a guest with
    the interval set to '500000':
    
    'virtio-net-pci,netdev=hostnet1,bus=pci.0,id=net1,mac=00,
    guest_rsc_ext=on,rsc_interval=500000'
    
    The timer is triggered only if the packet pool is not empty,
    and it drains all the cached packets.
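
The timer behaviour can be pictured with a small model. This is only a sketch under assumptions: `struct rsc_pool` and both functions are hypothetical stand-ins, and arming the timer on the first cached segment is one plausible way to keep it idle while the pool is empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device pool of cached segments. */
struct rsc_pool {
    size_t cached;      /* segments waiting for delivery */
    bool   timer_armed; /* true while a drain is scheduled */
    size_t delivered;   /* running total handed to the guest */
};

/* Cache one segment; arm the timer only when the pool becomes non-empty. */
static void pool_add(struct rsc_pool *p)
{
    if (p->cached++ == 0) {
        p->timer_armed = true;
    }
}

/* Timer callback: drain everything and leave the timer disarmed. */
static void pool_timer_fire(struct rsc_pool *p)
{
    p->delivered += p->cached;
    p->cached = 0;
    p->timer_armed = false;
}
```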
    
    'NetRscChain' stores the IPv4/IPv6 segments in a
    VirtIONet device.
    
    A new segment becomes a 'Candidate' as soon as it passes the sanity
    check. The main TCP handler covers TCP window updates, duplicate
    ACK checks and the actual data coalescing.
    
    A 'Candidate' segment means:
    1. Segment is within current window and the sequence is the expected one.
    2. 'ACK' of the segment is in the valid window.
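
The two conditions can be sketched as a single predicate. This is a simplified illustration: `is_candidate` and its parameters are hypothetical names, and the window is passed in as a plain byte count. Unsigned 32-bit subtraction handles sequence number wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check for a 'Candidate' segment.
 *   expected_seq: sequence number that directly follows the cached data
 *   last_ack:     ACK number of the cached segment
 *   window:       largest ACK advance still treated as valid
 */
static bool is_candidate(uint32_t expected_seq, uint32_t seq,
                         uint32_t last_ack, uint32_t ack, uint32_t window)
{
    /* 1. The sequence must be the expected one (otherwise out of order). */
    if (seq != expected_seq) {
        return false;
    }
    /*
     * 2. The ACK must not move backwards or jump past the window.
     * A backward ACK wraps to a huge unsigned value and is rejected.
     */
    if ((uint32_t)(ack - last_ack) > window) {
        return false;
    }
    return true;
}
```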
    
    The sanity check rejects:
    1. An incorrect version in the IP header
    2. IP options or an IP fragment
    3. A non-TCP packet
    4. A size that fails the bounds check (to prevent buffer overflow attacks)
    5. An ECN packet
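
For IPv4, the checks listed above could look roughly like the following. It is a sketch over a raw header buffer, not the code from this commit: `ipv4_sane` and the constants are hypothetical, and the fragment test used here (MF bit set or non-zero offset) is an assumption about what counts as a fragment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IP4_HDR_LEN 20  /* IPv4 header without options */
#define TCP_HDR_MIN 20  /* TCP header without options */
#define PROTO_TCP   6

/* Return true if the buffer holds an IPv4/TCP packet that may be coalesced. */
static bool ipv4_sane(const uint8_t *ip, size_t buf_len)
{
    /* 4. The buffer must hold at least both minimal headers. */
    if (buf_len < IP4_HDR_LEN + TCP_HDR_MIN) {
        return false;
    }

    uint16_t total_len = (uint16_t)((ip[2] << 8) | ip[3]);
    uint16_t frag      = (uint16_t)((ip[6] << 8) | ip[7]);

    if ((ip[0] >> 4) != 4)   return false;  /* 1. wrong IP version */
    if ((ip[0] & 0x0F) != 5) return false;  /* 2. IP options present */
    if (frag & 0x3FFF)       return false;  /* 2. MF set or offset != 0 */
    if (ip[9] != PROTO_TCP)  return false;  /* 3. not a TCP packet */

    /* 4. The claimed length must fit both headers and the buffer. */
    if (total_len < IP4_HDR_LEN + TCP_HDR_MIN || total_len > buf_len) {
        return false;
    }

    if (ip[1] & 0x03)        return false;  /* 5. ECN bits set */
    return true;
}
```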
    
    More cases could be considered, such as the IP identification
    field and other flags, but checking them breaks the test because
    Windows sets the identification to the same value even when the
    packet is not a fragment.
    
    There are 2 typical ways to handle a TCP control flag: 'bypass'
    and 'finalize'. 'bypass' means the packet should be sent out
    directly. 'finalize' means the packet should also be bypassed, but
    only after the pool has been searched for packets of the same
    connection and all of them have been drained; this avoids
    out-of-order delivery.
    
    All 'SYN' packets are bypassed since a 'SYN' always begins a new
    connection. Other flags such as 'URG/FIN/RST/CWR/ECE' trigger a
    finalization, because they normally appear when a connection is
    about to be closed; an 'URG' packet also finalizes the current
    coalescing unit.
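
The flag handling described above maps naturally onto a small classifier. The enum and function names below are hypothetical; the flag bit values are the standard TCP header ones.

```c
#include <stdint.h>

/* Standard TCP header flag bits. */
#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_URG 0x20
#define TCP_FLAG_ECE 0x40
#define TCP_FLAG_CWR 0x80

/* Hypothetical result of looking at the control flags of one segment. */
enum rsc_action {
    RSC_COALESCE, /* plain data/ACK segment, may be merged */
    RSC_BYPASS,   /* send out directly */
    RSC_FINALIZE  /* drain the connection's cached packets, then send */
};

static enum rsc_action classify_tcp_flags(uint8_t flags)
{
    /* A SYN always begins a new connection, so nothing is cached for it. */
    if (flags & TCP_FLAG_SYN) {
        return RSC_BYPASS;
    }
    /* These flags end the current coalescing unit. */
    if (flags & (TCP_FLAG_URG | TCP_FLAG_FIN | TCP_FLAG_RST |
                 TCP_FLAG_CWR | TCP_FLAG_ECE)) {
        return RSC_FINALIZE;
    }
    return RSC_COALESCE;
}
```

Checking SYN first means a SYN|ACK is bypassed and never triggers a pool search.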
    
    Statistics can be used to monitor the basic coalescing status. The
    'out of order' and 'out of window' counters show how many packets
    were retransmitted, and thus give an intuitive view of performance.
    
    Difference between IPv4 and IPv6 processing:
     The length field in the IPv4 header includes the header itself,
     while in IPv6 it does not, which means IPv6 can carry a full
     65535-byte payload.
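
The difference shows up when the payload size is derived from the header length field. The two helpers below are hypothetical and ignore IPv6 extension headers; the IPv4 one assumes the caller has already validated that the total length covers the header.

```c
#include <stdint.h>

/* IPv4: 'total length' counts the IP header too, so subtract it. */
static uint32_t ipv4_payload_len(uint16_t total_len, uint8_t ihl_words)
{
    return (uint32_t)total_len - (uint32_t)ihl_words * 4;
}

/* IPv6: 'payload length' already excludes the fixed 40-byte header. */
static uint32_t ipv6_payload_len(uint16_t payload_len)
{
    return payload_len;
}
```

With the largest 16-bit value in the length field, IPv4 with a 20-byte header leaves 65515 bytes of payload, while IPv6 carries the full 65535.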
    
    Note that the main goal of implementing this feature in software
    is to create a reference setup for certification tests. In such
    setups guest migration is not required, so coalesced packets
    not yet delivered to the guest will be lost in case of migration.
    Signed-off-by: Wei Xu <wexu@redhat.com>
    Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>