1. 21 1月, 2016 3 次提交
  2. 16 1月, 2016 1 次提交
  3. 15 1月, 2016 8 次提交
  4. 12 1月, 2016 2 次提交
  5. 11 1月, 2016 4 次提交
  6. 09 1月, 2016 2 次提交
  7. 07 1月, 2016 1 次提交
    • Y
      tcp: fix zero cwnd in tcp_cwnd_reduction · 8b8a321f
      Yuchung Cheng 提交于
      Patch 3759824d ("tcp: PRR uses CRB mode by default and SS mode
      conditionally") introduced a bug that cwnd may become 0 when both
      inflight and sndcnt are 0 (cwnd = inflight + sndcnt). This may lead
      to a div-by-zero if the connection starts another cwnd reduction
      phase by setting tp->prior_cwnd to the current cwnd (0) in
      tcp_init_cwnd_reduction().
      
      To prevent this we skip PRR operation when nothing is acked or
      sacked. Then cwnd must be positive in all cases as long as ssthresh
      is positive:
      
      1) The proportional reduction mode
         inflight > ssthresh > 0
      
      2) The reduction bound mode
        a) inflight == ssthresh > 0
      
        b) inflight < ssthresh
           sndcnt > 0 since newly_acked_sacked > 0 and inflight < ssthresh
      
      Therefore in all cases inflight and sndcnt can not both be 0.
      We check invalid tp->prior_cwnd to avoid potential div0 bugs.
      
      In reality this bug is triggered only with a sequence of less common
      events.  For example, the connection is terminating an ECN-triggered
      cwnd reduction with an inflight 0, then it receives reordered/old
      ACKs or DSACKs from prior transmission (which acks nothing). Or the
      connection is in fast recovery stage that marks everything lost,
      but fails to retransmit due to local issues, then receives data
      packets from other end which acks nothing.
      
      Fixes: 3759824d ("tcp: PRR uses CRB mode by default and SS mode conditionally")
      Reported-by: NOleksandr Natalenko <oleksandr@natalenko.name>
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b8a321f
  8. 06 1月, 2016 2 次提交
  9. 05 1月, 2016 4 次提交
    • D
      net: Propagate lookup failure in l3mdev_get_saddr to caller · b5bdacf3
      David Ahern 提交于
      Commands run in a vrf context are not failing as expected on a route lookup:
          root@kenny:~# ip ro ls table vrf-red
          unreachable default
      
          root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
          ping: Warning: source address might be selected on device other than vrf-red.
          PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.
      
          --- 10.100.1.254 ping statistics ---
          2 packets transmitted, 0 received, 100% packet loss, time 999ms
      
      Since the vrf table does not have a route for 10.100.1.254 the ping
      should have failed. The saddr lookup causes a full VRF table lookup.
      Propogating a lookup failure to the user allows the command to fail as
      expected:
      
          root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
          connect: No route to host
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b5bdacf3
    • C
      soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF · 538950a1
      Craig Gallek 提交于
      Expose socket options for setting a classic or extended BPF program
      for use when selecting sockets in an SO_REUSEPORT group.  These options
      can be used on the first socket to belong to a group before bind or
      on any socket in the group after bind.
      
      This change includes refactoring of the existing sk_filter code to
      allow reuse of the existing BPF filter validation checks.
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      538950a1
    • C
      soreuseport: fast reuseport UDP socket selection · e32ea7e7
      Craig Gallek 提交于
      Include a struct sock_reuseport instance when a UDP socket binds to
      a specific address for the first time with the reuseport flag set.
      When selecting a socket for an incoming UDP packet, use the information
      available in sock_reuseport if present.
      
      This required adding an additional field to the UDP source address
      equality function to differentiate between exact and wildcard matches.
      The original use case allowed wildcard matches when checking for
      existing port uses during bind.  The new use case of adding a socket
      to a reuseport group requires exact address matching.
      
      Performance test (using a machine with 2 CPU sockets and a total of
      48 cores):  Create reuseport groups of varying size.  Use one socket
      from this group per user thread (pinning each thread to a different
      core) calling recvmmsg in a tight loop.  Record number of messages
      received per second while saturating a 10G link.
        10 sockets: 18% increase (~2.8M -> 3.3M pkts/s)
        20 sockets: 14% increase (~2.9M -> 3.3M pkts/s)
        40 sockets: 13% increase (~3.0M -> 3.4M pkts/s)
      
      This work is based off a similar implementation written by
      Ying Cai <ycai@google.com> for implementing policy-based reuseport
      selection.
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e32ea7e7
    • E
      udp: properly support MSG_PEEK with truncated buffers · 197c949e
      Eric Dumazet 提交于
      Backport of this upstream commit into stable kernels :
      89c22d8c ("net: Fix skb csum races when peeking")
      exposed a bug in udp stack vs MSG_PEEK support, when user provides
      a buffer smaller than skb payload.
      
      In this case,
      skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                       msg->msg_iov);
      returns -EFAULT.
      
      This bug does not happen in upstream kernels since Al Viro did a great
      job to replace this into :
      skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
      This variant is safe vs short buffers.
      
      For the time being, instead reverting Herbert Xu patch and add back
      skb->ip_summed invalid changes, simply store the result of
      udp_lib_checksum_complete() so that we avoid computing the checksum a
      second time, and avoid the problematic
      skb_copy_and_csum_datagram_iovec() call.
      
      This patch can be applied on recent kernels as it avoids a double
      checksumming, then backported to stable kernels as a bug fix.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      197c949e
  10. 29 12月, 2015 1 次提交
  11. 26 12月, 2015 1 次提交
  12. 23 12月, 2015 3 次提交
  13. 19 12月, 2015 3 次提交
  14. 18 12月, 2015 1 次提交
  15. 17 12月, 2015 1 次提交
  16. 16 12月, 2015 3 次提交