1. 23 Sep, 2009 1 commit
  2. 18 Sep, 2009 1 commit
  3. 17 Sep, 2009 1 commit
  4. 15 Sep, 2009 3 commits
  5. 12 Sep, 2009 1 commit
  6. 09 Sep, 2009 1 commit
  7. 04 Sep, 2009 1 commit
    • C
      ipv6: Fix tcp_v6_send_response(): it didn't set skb transport header · a8fdf2b3
      Cosmin Ratiu committed
      Here is a patch which fixes an issue observed when using TCP over IPv6
      and AH from IPsec.
      
      When a connection gets closed using the 4-way method and the last ACK from
      the server gets dropped, the subsequent FINs from the client do not
      get ACKed because tcp_v6_send_response does not set the transport
      header pointer. This causes ah6_output to try to allocate a lot of
      memory, which typically fails, so the ACKs never make it out of the
      stack.
      
      I have reproduced the problem on kernel 2.6.7, but after looking at
      the latest kernel it seems the problem is still there.
      Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8fdf2b3
  8. 03 Sep, 2009 2 commits
    • W
      tcp: replace hard coded GFP_KERNEL with sk_allocation · aa133076
      Wu Fengguang committed
      This fixed a lockdep warning which appeared when doing stress
      memory tests over NFS:
      
      	inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      
      	page reclaim => nfs_writepage => tcp_sendmsg => lock sk_lock
      
      	mount_root => nfs_root_data => tcp_close => lock sk_lock =>
      			tcp_send_fin => alloc_skb_fclone => page reclaim
      
      David raised a concern that if the allocation fails in tcp_send_fin(), and it's
      GFP_ATOMIC, we are going to yield() (which sleeps) and loop endlessly waiting
      for the allocation to succeed.
      
      But the fact is, the original GFP_KERNEL also sleeps. GFP_ATOMIC+yield() looks
      weird, but it is no worse than the implicit sleep inside GFP_KERNEL. Both could
      loop endlessly under memory pressure.
      
      CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      CC: David S. Miller <davem@davemloft.net>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aa133076
    • E
      ip: Report qdisc packet drops · 6ce9e7b5
      Eric Dumazet committed
      Christoph Lameter pointed out that packet drops at the qdisc level were not
      accounted for in SNMP counters. Only if the application sets IP_RECVERR are
      drops reported to the user (-ENOBUFS errors) and SNMP counters updated.
      
      IP_RECVERR is used to enable extended reliable error message passing,
      but these are not needed to update system wide SNMP stats.
      
      This patch changes things a bit to allow SNMP counters to be updated,
      regardless of IP_RECVERR being set or not on the socket.
      
      Example after a UDP tx flood:
      # netstat -s
      ...
      IP:
          1487048 outgoing packets dropped
      ...
      Udp:
      ...
          SndbufErrors: 1487048
      
      
      send() syscalls do, however, still return an OK status, so as not to
      break applications.
      
      Note: the send() manual page explicitly says of the ENOBUFS error:
      
       "The output queue for a network interface was full.
        This generally indicates that the interface has stopped sending,
        but may be caused by transient congestion.
        (Normally, this does not occur in Linux. Packets are just silently
        dropped when a device queue overflows.) "
      
      This is not true for IP_RECVERR-enabled sockets: a send() syscall
      that hits a qdisc drop returns an ENOBUFS error.
      
      Many thanks to Christoph, David, and last but not least, Alexey!
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ce9e7b5
  9. 02 Sep, 2009 7 commits
  10. 01 Sep, 2009 1 commit
  11. 31 Aug, 2009 1 commit
  12. 29 Aug, 2009 2 commits
    • D
      ipv6: Update Neighbor Cache when IPv6 RA is received on a router · 31ce8c71
      David Ward committed
      When processing a received IPv6 Router Advertisement, the kernel
      creates or updates an IPv6 Neighbor Cache entry for the sender --
      but presently this does not occur if IPv6 forwarding is enabled
      (net.ipv6.conf.*.forwarding = 1), or if IPv6 Router Advertisements
      are not accepted (net.ipv6.conf.*.accept_ra = 0), because in these
      cases processing of the Router Advertisement has already halted.
      
      This patch allows the Neighbor Cache to be updated in these cases,
      while still avoiding any modification to routes or link parameters.
      
      This continues to satisfy RFC 4861, since any entry created in the
      Neighbor Cache as the result of a received Router Advertisement is
      still placed in the STALE state.
      Signed-off-by: David Ward <david.ward@ll.mit.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31ce8c71
    • S
      sit: allow ip fragmentation when using nopmtudisc to fix package loss · 8945a808
      Sascha Hlusiak committed
      If the tunnel parameters have frag_off set to IP_DF, pmtudisc on the ipv4 link
      will be performed by deriving the mtu from the ipv4 link and setting the
      DF flag of the encapsulating IPv4 header. If fragmentation is needed on the
      way, the IPv4 pmtu gets adjusted, the ipv6 packet will eventually be resent
      using the new and lower mtu, and everyone is happy.
      
      If the frag_off parameter is unset, the mtu for the tunnel will be derived
      from the tunnel device or the ipv6 pmtu, which might be higher than the ipv4
      pmtu. In that case we must allow the fragmentation of the IPv4 packet because
      the IPv6 mtu wouldn't 'learn' from the adjusted IPv4 pmtu, resulting in
      frequent icmp_frag_needed and packet loss on the IPv6 layer.
      
      This patch allows fragmentation when the tunnel was created with the
      parameter nopmtudisc, as in ipip/gre tunnels.
      Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8945a808
  13. 25 Aug, 2009 1 commit
  14. 24 Aug, 2009 2 commits
    • J
      netfilter: xtables: mark initial tables constant · 35aad0ff
      Jan Engelhardt committed
      The input table is never modified, so it should be considered const.
      Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      35aad0ff
    • B
      ipv6: Fix commit 63d9950b (ipv6: Make... · ca6982b8
      Bruno Prémont committed
      ipv6: Fix commit 63d9950b (ipv6: Make v4-mapped bindings consistent with IPv4)
      
      Commit 63d9950b
        (ipv6: Make v4-mapped bindings consistent with IPv4)
      changes behavior of inet6_bind() for v4-mapped addresses so it should
      behave the same way as inet_bind().
      
      During this change, the setting of err to -EADDRNOTAVAIL got lost:
      
      af_inet.c:469 inet_bind()
      	err = -EADDRNOTAVAIL;
      	if (!sysctl_ip_nonlocal_bind &&
      	    !(inet->freebind || inet->transparent) &&
      	    addr->sin_addr.s_addr != htonl(INADDR_ANY) &&
      	    chk_addr_ret != RTN_LOCAL &&
      	    chk_addr_ret != RTN_MULTICAST &&
      	    chk_addr_ret != RTN_BROADCAST)
      		goto out;
      
      
      af_inet6.c:463 inet6_bind()
      	if (addr_type == IPV6_ADDR_MAPPED) {
      		int chk_addr_ret;
      
      		/* Binding to v4-mapped address on a v6-only socket
      		 * makes no sense
      		 */
      		if (np->ipv6only) {
      			err = -EINVAL;
      			goto out;
      		}
      
      		/* Reproduce AF_INET checks to make the bindings consitant */
      		v4addr = addr->sin6_addr.s6_addr32[3];
      		chk_addr_ret = inet_addr_type(net, v4addr);
      		if (!sysctl_ip_nonlocal_bind &&
      		    !(inet->freebind || inet->transparent) &&
      		    v4addr != htonl(INADDR_ANY) &&
      		    chk_addr_ret != RTN_LOCAL &&
      		    chk_addr_ret != RTN_MULTICAST &&
      		    chk_addr_ret != RTN_BROADCAST)
      			goto out;
      	} else {
      
      
      Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca6982b8
  15. 14 Aug, 2009 3 commits
  16. 10 Aug, 2009 7 commits
  17. 06 Aug, 2009 1 commit
  18. 05 Aug, 2009 1 commit
  19. 03 Aug, 2009 1 commit
  20. 31 Jul, 2009 1 commit
    • N
      xfrm: select sane defaults for xfrm[4|6] gc_thresh · a33bc5c1
      Neil Horman committed
      Choose saner defaults for xfrm[4|6] gc_thresh values on init
      
      Currently, the xfrm[4|6] code has hard-coded initial gc_thresh values
      (set to 1024).  Given that the ipv4 and ipv6 routing caches are sized
      dynamically at boot time, the static selections can be nonsensical.
      This patch dynamically selects an appropriate gc threshold based on
      the corresponding main routing table size, using the assumption that
      we should in the worst case be able to handle as many connections as
      the routing table can.
      
      For ipv4, the maximum route cache size is 16 * the number of hash
      buckets in the route cache.  Given that xfrm4 starts garbage
      collection at the gc_thresh and prevents new allocations at 2 *
      gc_thresh, we set gc_thresh to half the maximum route cache size.
      
      For ipv6, it's a bit trickier.  There is no maximum route cache size,
      but the ipv6 dst_ops gc_thresh is statically set to 1024.  It seems
      sane to select a similar gc_thresh for the xfrm6 code that is half
      the number of hash buckets in the v6 route cache times 16 (like the v4
      code does).
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a33bc5c1
  21. 28 Jul, 2009 1 commit
    • N
      xfrm: export xfrm garbage collector thresholds via sysctl · a44a4a00
      Neil Horman committed
      Export garbage collector thresholds for xfrm[4|6]_dst_ops
      
      I recently had a problem reported to me in which a system with a high
      volume of ipsec connections eventually began reporting ENOBUFS for new
      connections.
      
      It seemed that after about 2000 connections we started being unable to
      create more.  A quick look revealed that the xfrm code used a dst_ops
      structure that limited the gc_thresh value to 1024, and always
      dropped route cache entries after 2x the gc_thresh.
      
      It seems the most direct solution is to export the gc_thresh values in
      the xfrm[4|6] dst_ops as sysctls, like the main routing table does, so
      that higher volumes of connections can be supported.  This patch has
      been tested and allows the reporter to increase their ipsec connection
      volume successfully.
      Reported-by: Joe Nall <joe@nall.com>
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      
      ipv4/xfrm4_policy.c |   18 ++++++++++++++++++
      ipv6/xfrm6_policy.c |   18 ++++++++++++++++++
      2 files changed, 36 insertions(+)
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a44a4a00