1. 08 7月, 2014 4 次提交
    • T
      net: Save TX flow hash in sock and set in skbuf on xmit · b73c3d0e
      Tom Herbert 提交于
      For a connected socket we can precompute the flow hash for setting
      in skb->hash on output. This is a performance advantage over
      calculating the skb->hash for every packet on the connection. The
      computation is done using the common hash algorithm to be consistent
      with computations done for packets of the connection in other states
      where thers is no socket (e.g. time-wait, syn-recv, syn-cookies).
      
      This patch adds sk_txhash to the sock structure. inet_set_txhash and
      ip6_set_txhash functions are added which are called from points in
      TCP and UDP where socket moves to established state.
      
      skb_set_hash_from_sk is a function which sets skb->hash from the
      sock txhash value. This is called in UDP and TCP transmit path when
      transmitting within the context of a socket.
      
      Tested: ran super_netperf with 200 TCP_RR streams over a vxlan
      interface (in this case skb_get_hash called on every TX packet to
      create a UDP source port).
      
      Before fix:
      
        95.02% CPU utilization
        154/256/505 90/95/99% latencies
        1.13042e+06 tps
      
        Time in functions:
          0.28% skb_flow_dissect
          0.21% __skb_get_hash
      
      After fix:
      
        94.95% CPU utilization
        156/254/485 90/95/99% latencies
        1.15447e+06
      
        Neither __skb_get_hash nor skb_flow_dissect appear in perf
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b73c3d0e
    • T
      flow_dissector: Abstract out hash computation · 5ed20a68
      Tom Herbert 提交于
      Move the hash computation located in __skb_get_hash to be a separate
      function which takes flow_keys as input. This will allow flow hash
      computation in other contexts where we only have addresses and ports.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ed20a68
    • N
      tcp: switch snt_synack back to measuring transmit time of first SYNACK · 86c6a2c7
      Neal Cardwell 提交于
      Always store in snt_synack the time at which the server received the
      first client SYN and attempted to send the first SYNACK.
      
      Recent commit aa27fc50 ("tcp: tcp_v[46]_conn_request: fix snt_synack
      initialization") resolved an inconsistency between IPv4 and IPv6 in
      the initialization of snt_synack. This commit brings back the idea
      from 843f4a55 (tcp: use tcp_v4_send_synack on first SYN-ACK), which
      was going for the original behavior of snt_synack from the commit
      where it was added in 9ad7c049 ("tcp: RFC2988bis + taking RTT
      sample from 3WHS for the passive open side") in v3.1.
      
      In addition to being simpler (and probably a tiny bit faster),
      unconditionally storing the time of the first SYNACK attempt has been
      useful because it allows calculating a performance metric quantifying
      how long it took to establish a passive TCP connection.
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Cc: Octavian Purdila <octavian.purdila@intel.com>
      Cc: Jerry Chu <hkchu@google.com>
      Acked-by: NOctavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86c6a2c7
    • S
      ptp: Classify ptp over ip over vlan packets · ae5c6c6d
      Stefan Sørensen 提交于
      This extends the ptp bpf to also match ptp over ip over vlan packets. The ptp
      classes are changed to orthogonal bitfields representing version, transport
      and vlan values to simplify matching.
      Signed-off-by: NStefan Sørensen <stefan.sorensen@spectralink.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae5c6c6d
  2. 03 7月, 2014 1 次提交
    • D
      net: sctp: improve timer slack calculation for transport HBs · 8f61059a
      Daniel Borkmann 提交于
      RFC4960, section 8.3 says:
      
        On an idle destination address that is allowed to heartbeat,
        it is recommended that a HEARTBEAT chunk is sent once per RTO
        of that destination address plus the protocol parameter
        'HB.interval', with jittering of +/- 50% of the RTO value,
        and exponential backoff of the RTO if the previous HEARTBEAT
        is unanswered.
      
      Currently, we calculate jitter via sctp_jitter() function first,
      and then add its result to the current RTO for the new timeout:
      
        TMO = RTO + (RAND() % RTO) - (RTO / 2)
                    `------------------------^-=> sctp_jitter()
      
      Instead, we can just simplify all this by directly calculating:
      
        TMO = (RTO / 2) + (RAND() % RTO)
      
      With the help of prandom_u32_max(), we don't need to open code
      our own global PRNG, but can instead just make use of the per
      CPU implementation of prandom with better quality numbers. Also,
      we can now spare us the conditional for divide by zero check
      since no div or mod operation needs to be used. Note that
      prandom_u32_max() won't emit the same result as a mod operation,
      but we really don't care here as we only want to have a random
      number scaled into RTO interval.
      
      Note, exponential RTO backoff is handeled elsewhere, namely in
      sctp_do_8_2_transport_strike().
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f61059a
  3. 02 7月, 2014 2 次提交
    • E
      inet: move ipv6only in sock_common · 9fe516ba
      Eric Dumazet 提交于
      When an UDP application switches from AF_INET to AF_INET6 sockets, we
      have a small performance degradation for IPv4 communications because of
      extra cache line misses to access ipv6only information.
      
      This can also be noticed for TCP listeners, as ipv6_only_sock() is also
      used from __inet_lookup_listener()->compute_score()
      
      This is magnified when SO_REUSEPORT is used.
      
      Move ipv6only into struct sock_common so that it is available at
      no extra cost in lookups.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fe516ba
    • B
      ipv6: Allow accepting RA from local IP addresses. · d9333196
      Ben Greear 提交于
      This can be used in virtual networking applications, and
      may have other uses as well.  The option is disabled by
      default.
      
      A specific use case is setting up virtual routers, bridges, and
      hosts on a single OS without the use of network namespaces or
      virtual machines.  With proper use of ip rules, routing tables,
      veth interface pairs and/or other virtual interfaces,
      and applications that can bind to interfaces and/or IP addresses,
      it is possibly to create one or more virtual routers with multiple
      hosts attached.  The host interfaces can act as IPv6 systems,
      with radvd running on the ports in the virtual routers.  With the
      option provided in this patch enabled, those hosts can now properly
      obtain IPv6 addresses from the radvd.
      Signed-off-by: NBen Greear <greearb@candelatech.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9333196
  4. 28 6月, 2014 11 次提交
  5. 26 6月, 2014 5 次提交
  6. 24 6月, 2014 4 次提交
  7. 23 6月, 2014 1 次提交
  8. 22 6月, 2014 1 次提交
  9. 21 6月, 2014 2 次提交
  10. 20 6月, 2014 1 次提交
  11. 18 6月, 2014 4 次提交
  12. 17 6月, 2014 4 次提交