1. 01 Nov, 2012 2 commits
  2. 29 Oct, 2012 2 commits
  3. 24 Oct, 2012 2 commits
  4. 23 Oct, 2012 3 commits
  5. 19 Oct, 2012 2 commits
  6. 13 Oct, 2012 2 commits
    • vti: fix sparse bit endian warnings · 8437e761
      Stephen Hemminger committed
      Use be32_to_cpu instead of htonl to keep sparse happy.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
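As a rough illustration of why a change like this affects only type checking and not runtime behavior: on any given host, converting a 32-bit value from host order to big-endian and from big-endian to host order is the same byte operation. The sketch below uses Python's `socket.htonl` and `socket.ntohl` as user-space stand-ins for the kernel helpers. That mapping is an assumption made for illustration, since `be32_to_cpu` itself is not callable from Python, and the wrapper names are hypothetical.

```python
import socket


def to_network_order(value: int) -> int:
    """Stand-in for htonl(): host order -> big-endian (network) order."""
    return socket.htonl(value)


def from_network_order(value: int) -> int:
    """Stand-in for be32_to_cpu()/ntohl(): big-endian -> host order."""
    return socket.ntohl(value)


sample = 0x12345678
# Both directions perform the same byte operation on a given host
# (a swap on little-endian machines, a no-op on big-endian ones).
assert to_network_order(sample) == from_network_order(sample)
# Applying one after the other restores the original value.
assert from_network_order(to_network_order(sample)) == sample
```

The two helpers differ only in which side of the conversion is annotated as big-endian, which is the kind of distinction a static checker such as sparse can track.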
    • tcp: resets are misrouted · 4c675258
      Alexey Kuznetsov committed
      After commit e2446eaa ("tcp_v4_send_reset: binding oif to iif in no
      sock case"), TCP resets are always lost when routing is asymmetric.
      Yes, backing out that patch will result in misrouting of resets for
      dead connections that used interface binding while they were alive, but
      we actually cannot do anything here. What's dead is dead, and correctly
      handling normal unbound connections is obviously a priority.
      
      Comment to comment:
      > This has few benefits:
      >   1. tcp_v6_send_reset already did that.
      
      It was done to route resets for IPv6 link local addresses. It was a
      mistake to do so for global addresses. The patch fixes this as well.
      
      Actually, the problem appears to be even more serious than guaranteed
      loss of resets. As reported by Sergey Soloviev <sol@eqv.ru>, those
      misrouted resets create a lot of ARP traffic and a huge number of
      unresolved ARP entries, bringing NAT firewalls that use asymmetric
      routing to their knees.
      Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
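The failure mode described in this commit can be sketched with a toy model. Everything here is hypothetical (the addresses, interface names, `ROUTES` table, and `reset_egress` function); it only illustrates how forcing the output interface to equal the input interface sends a reset the wrong way when the return route uses a different interface.

```python
# Toy routing table: peer address -> interface the route back to it uses.
# Addresses and interface names are made up for illustration.
ROUTES = {"10.0.0.2": "eth0"}


def reset_egress(peer: str, iif: str, bind_oif_to_iif: bool) -> str:
    """Return the interface a reset for a socket-less segment leaves on."""
    if bind_oif_to_iif:
        # Behavior the commit message attributes to e2446eaa: reuse the
        # interface the offending segment arrived on.
        return iif
    # Otherwise do an ordinary route lookup toward the peer.
    return ROUTES[peer]


# Asymmetric routing: the segment arrives on eth1, but the route back to
# the peer goes out through eth0.
misrouted = reset_egress("10.0.0.2", iif="eth1", bind_oif_to_iif=True)
routed = reset_egress("10.0.0.2", iif="eth1", bind_oif_to_iif=False)
assert misrouted == "eth1"  # leaves on the wrong interface
assert routed == "eth0"     # follows the routing table
```

In this model the misrouted reset goes out on an interface where the peer is not reachable, which is consistent with the unresolved ARP entries the message reports.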
  7. 12 Oct, 2012 1 commit
  8. 11 Oct, 2012 1 commit
  9. 09 Oct, 2012 8 commits
  10. 06 Oct, 2012 1 commit
  11. 05 Oct, 2012 1 commit
    • ipv4: add a fib_type to fib_info · f4ef85bb
      Eric Dumazet committed
      Commit d2d68ba9 ("ipv4: Cache input routes in fib_info nexthops.")
      introduced a regression for forwarding.
      
      This was hard to reproduce, but the symptom was that packets were
      delivered to the local host instead of being forwarded.
      
      David suggested adding fib_type to fib_info so that we don't
      inadvertently share the same fib_info for different purposes.
      
      With help from Julian Anastasov, who provided very helpful
      hints, reproduced here:
      
      <quote>
              Can it be a problem related to fib_info reuse
      from different routes. For example, when local IP address
      is created for subnet we have:
      
      broadcast 192.168.0.255 dev DEV  proto kernel  scope link  src 192.168.0.1
      192.168.0.0/24 dev DEV  proto kernel  scope link  src 192.168.0.1
      local 192.168.0.1 dev DEV  proto kernel  scope host  src 192.168.0.1
      
              The "dev DEV  proto kernel  scope link  src 192.168.0.1" is
      a reused fib_info structure where we put cached routes.
      The result can be same fib_info for 192.168.0.255 and
      192.168.0.0/24. RTN_BROADCAST is cached only for input
      routes. Incoming broadcast to 192.168.0.255 can be cached
      and can cause problems for traffic forwarded to 192.168.0.0/24.
      So, this patch should solve the problem because it
      separates the broadcast from unicast traffic.
      
              And the ip_route_input_slow caching will work for
      local and broadcast input routes (above routes 1 and 3) just
      because they differ in scope and use different fib_info.
      
      </quote>
      
      Many thanks to Chris Clayton for his patience and help.
      Reported-by: Chris Clayton <chris2553@googlemail.com>
      Bisected-by: Chris Clayton <chris2553@googlemail.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Tested-by: Chris Clayton <chris2553@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
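The sharing problem in the quoted explanation can be modeled as a lookup table whose key omits the route type. The sketch below is a simplified stand-in, not the kernel's data structures: `FibInfo`, `lookup_or_create`, and `shares_fib_info` are hypothetical names, and the key fields are taken from the route listing in the quote.

```python
from dataclasses import dataclass, field


@dataclass
class FibInfo:
    """Simplified stand-in for a shared route-info object with a cache."""
    cached_input_routes: list = field(default_factory=list)


def lookup_or_create(table, dev, proto, scope, src, fib_type=None):
    """Reuse an existing FibInfo when every key field matches."""
    key = (dev, proto, scope, src)
    if fib_type is not None:
        key += (fib_type,)  # model of adding the type to the comparison
    return table.setdefault(key, FibInfo())


def shares_fib_info(include_type: bool) -> bool:
    """Do the broadcast and subnet routes end up on the same FibInfo?"""
    table = {}
    common = ("DEV", "kernel", "link", "192.168.0.1")
    bcast = lookup_or_create(
        table, *common, fib_type="broadcast" if include_type else None)
    ucast = lookup_or_create(
        table, *common, fib_type="unicast" if include_type else None)
    return bcast is ucast


assert shares_fib_info(include_type=False)      # same object: caches mix
assert not shares_fib_info(include_type=True)   # separate objects
```

Without the type in the key, anything cached for the broadcast route is visible to the subnet route as well, which matches the symptom of forwarded packets being handled as local input.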
  12. 02 Oct, 2012 4 commits
  13. 28 Sep, 2012 6 commits
  14. 26 Sep, 2012 1 commit
  15. 25 Sep, 2012 2 commits
    • net: raw: revert unrelated change · 8489c1d9
      Eric Dumazet committed
      Commit 5640f768 ("net: use a per task frag allocator")
      accidentally contained an unrelated change to net/ipv4/raw.c,
      later committed (without the pr_err() debugging bits) in the
      net tree as commit ab43ed8b ("ipv4: raw: fix icmp_filter()").
      
      This patch reverts this glitch, noticed by Stephen Rothwell.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use a per task frag allocator · 5640f768
      Eric Dumazet committed
      We currently use a per-socket order-0 page cache for tcp_sendmsg()
      operations.
      
      This page is used to build fragments for skbs.
      
      It's done to increase the probability of coalescing small write() calls
      into single segments in skbs still in the write queue (not yet sent).
      
      But it wastes a lot of memory for applications handling many mostly
      idle sockets, since each socket holds one page in sk->sk_sndmsg_page.
      
      It's also quite inefficient for building TSO 64KB packets, because we
      need about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit
      the page allocator more than wanted.
      
      This patch adds a per-task frag allocator and uses bigger pages,
      if available. An automatic fallback is done in case of memory pressure.
      
      (up to 32768 bytes per frag, that's order-3 pages on x86)
      
      This increases TCP stream performance by 20% on the loopback device,
      and also benefits other network devices, since 8x fewer frags are
      mapped on transmit and unmapped on tx completion. Alexander Duyck
      mentioned a probable performance win on systems with IOMMU enabled.
      
      It's possible some SG-enabled hardware can't cope with bigger fragments,
      but their ndo_start_xmit() should already handle this by splitting a
      fragment into sub-fragments, since some arches have PAGE_SIZE=65536.
      
      Successfully tested on various Ethernet devices
      (ixgbe, igb, bnx2x, tg3, mellanox mlx4).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
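The numbers in this message (16 pages per 64KB packet, 8x fewer frags, order-3 pages) follow from simple arithmetic, checked below. The constants come from the commit text; the function names are hypothetical, and `pick_order` is only an assumed shape for the "automatic fallback" (try the largest order first, then smaller ones), not a description of the actual kernel code.

```python
TSO_PACKET_BYTES = 64 * 1024  # "TSO 64KB packets"
SMALL_PAGE_BYTES = 4096       # PAGE_SIZE on the arches mentioned
MAX_FRAG_BYTES = 32768        # "up to 32768 bytes per frag"


def frags_needed(packet_bytes: int, frag_bytes: int) -> int:
    """Number of fragments needed to hold a packet (ceiling division)."""
    return -(-packet_bytes // frag_bytes)


def page_order(frag_bytes: int, page_bytes: int = SMALL_PAGE_BYTES) -> int:
    """Allocation order n such that page_bytes * 2**n == frag_bytes."""
    return (frag_bytes // page_bytes).bit_length() - 1


def pick_order(can_allocate, max_order: int = 3):
    """Assumed fallback: try the largest order first, then smaller ones."""
    for order in range(max_order, -1, -1):
        if can_allocate(order):
            return order
    return None  # nothing could be allocated


small = frags_needed(TSO_PACKET_BYTES, SMALL_PAGE_BYTES)
large = frags_needed(TSO_PACKET_BYTES, MAX_FRAG_BYTES)
assert small == 16                       # "about 16 pages per skb"
assert large == 2
assert small // large == 8               # "8x" fewer frags
assert page_order(MAX_FRAG_BYTES) == 3   # "order-3 pages on x86"
```

For example, `pick_order(lambda order: order <= 1)` models memory pressure where only order-0 and order-1 allocations succeed, and it returns 1.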
  16. 23 Sep, 2012 2 commits
    • tcp: TCP Fast Open Server - record retransmits after 3WHS · 30099b2e
      Neal Cardwell committed
      When recording the number of SYNACK retransmits for servers using TCP
      Fast Open, fix the code to ensure that we copy over the retransmit
      count from the request_sock after we receive the ACK that completes
      the 3-way handshake.
      
      The story here is similar to that of SYNACK RTT
      measurements. Previously we were always doing this in
      tcp_v4_syn_recv_sock(). However, for TCP Fast Open connections
      tcp_v4_conn_req_fastopen() calls tcp_v4_syn_recv_sock() at the time we
      receive the SYN. So for TFO we must copy the final SYNACK retransmit
      count in tcp_rcv_state_process().
      
      Note that copying over the SYNACK retransmit count will give us the
      correct count since, as is mentioned in a comment in
      tcp_retransmit_timer(), before we receive an ACK for our SYN-ACK a TFO
      passive connection does not retransmit anything else (e.g., data or
      FIN segments).
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
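The ordering issue described here can be sketched as a small timeline model. The classes and field names below (`RequestSock.synack_retrans`, `ChildSock.total_retrans`, `handshake`) are hypothetical simplifications rather than the kernel's structures; the point is only that a child socket created when the SYN arrives sees a retransmit count of zero unless the count is copied again when the final ACK arrives.

```python
from dataclasses import dataclass


@dataclass
class RequestSock:
    synack_retrans: int = 0  # SYN-ACK retransmits recorded so far


@dataclass
class ChildSock:
    total_retrans: int = 0


def handshake(tfo: bool, synack_retransmits: int, copy_on_ack: bool) -> int:
    """Return the retransmit count the child socket ends up with."""
    req = RequestSock()
    child = None
    if tfo:
        # With TFO the child is created when the SYN arrives, before any
        # SYN-ACK retransmission can have happened.
        child = ChildSock(total_retrans=req.synack_retrans)
    # SYN-ACK retransmissions happen while waiting for the final ACK.
    req.synack_retrans += synack_retransmits
    # The ACK completing the three-way handshake arrives.
    if not tfo:
        # Without TFO the child is created now, so it sees the final count.
        child = ChildSock(total_retrans=req.synack_retrans)
    elif copy_on_ack:
        # Model of the fix: copy the final count at ACK time.
        child.total_retrans = req.synack_retrans
    return child.total_retrans


assert handshake(tfo=False, synack_retransmits=2, copy_on_ack=False) == 2
assert handshake(tfo=True, synack_retransmits=2, copy_on_ack=False) == 0
assert handshake(tfo=True, synack_retransmits=2, copy_on_ack=True) == 2
```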
    • tcp: TCP Fast Open Server - call tcp_validate_incoming() for all packets · e69bebde
      Neal Cardwell committed
      A TCP Fast Open (TFO) passive connection must call both
      tcp_check_req() and tcp_validate_incoming() for all incoming ACKs that
      are attempting to complete the 3WHS.
      
      This is needed to parallel all the action that happens for a non-TFO
      connection, where for an ACK that is attempting to complete the 3WHS
      we call both tcp_check_req() and tcp_validate_incoming().
      
      For example, upon receiving the ACK that completes the 3WHS, we need
      to call tcp_fast_parse_options() and update ts_recent based on the
      incoming timestamp value in the ACK.
      
      One symptom of the problem with the previous code was that for passive
      TFO connections using TCP timestamps, the outgoing TS ecr values
      ignored the incoming TS val value on the ACK that completed the 3WHS.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
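The timestamp symptom mentioned at the end of this message can be illustrated with a heavily simplified model. `PassiveConn` and its methods are hypothetical, and the model omits the real acceptance checks applied before `ts_recent` is updated; it only shows that if the handshake-completing ACK skips validation, later segments keep echoing the older timestamp value.

```python
class PassiveConn:
    """Toy model of a passive connection that echoes TCP timestamps."""

    def __init__(self, syn_tsval: int):
        # ts_recent starts from the timestamp value carried by the SYN.
        self.ts_recent = syn_tsval

    def on_handshake_ack(self, tsval: int, validate: bool) -> None:
        """Process the ACK that completes the three-way handshake."""
        if validate:
            # Simplification: a validated ACK refreshes ts_recent.
            self.ts_recent = tsval

    def next_tsecr(self) -> int:
        """Echo value placed in the next outgoing segment."""
        return self.ts_recent


stale = PassiveConn(syn_tsval=100)
stale.on_handshake_ack(tsval=150, validate=False)
assert stale.next_tsecr() == 100  # echoes the SYN's value, ignoring the ACK

fresh = PassiveConn(syn_tsval=100)
fresh.on_handshake_ack(tsval=150, validate=True)
assert fresh.next_tsecr() == 150  # echoes the ACK's value
```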