1. 28 7月, 2009 1 次提交
    • N
      xfrm: export xfrm garbage collector thresholds via sysctl · a44a4a00
      Neil Horman 提交于
      Export garbage collector thresholds for xfrm[4|6]_dst_ops
      
      Had a problem reported to me recently in which a high volume of ipsec
      connections on a system began reporting ENOBUFS for new connections
      eventually.
      
      It seemed that after about 2000 connections we started being unable to
      create more.  A quick look revealed that the xfrm code used a dst_ops
      structure that limited the gc_thresh value to 1024, and always
      dropped route cache entries after 2x the gc_thresh.
      
      It seems the most direct solution is to export the gc_thresh values in
      the xfrm[4|6] dst_ops as sysctls, like the main routing table does, so
      that higher volumes of connections can be supported.  This patch has
      been tested and allows the reporter to increase their ipsec connection
      volume successfully.
      Reported-by: NJoe Nall <joe@nall.com>
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      
      ipv4/xfrm4_policy.c |   18 ++++++++++++++++++
      ipv6/xfrm6_policy.c |   18 ++++++++++++++++++
      2 files changed, 36 insertions(+)
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a44a4a00
  2. 22 7月, 2009 1 次提交
  3. 20 7月, 2009 2 次提交
  4. 13 7月, 2009 4 次提交
  5. 12 7月, 2009 2 次提交
  6. 07 7月, 2009 1 次提交
  7. 06 7月, 2009 1 次提交
  8. 04 7月, 2009 2 次提交
    • B
      IPv6: preferred lifetime of address not getting updated · a1ed0526
      Brian Haley 提交于
      There's a bug in addrconf_prefix_rcv() where it won't update the
      preferred lifetime of an IPv6 address if the current valid lifetime
      of the address is less than 2 hours (the minimum value in the RA).
      
      For example, If I send a router advertisement with a prefix that
      has valid lifetime = preferred lifetime = 2 hours we'll build
      this address:
      
      3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
          inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
             valid_lft 7175sec preferred_lft 7175sec
      
      If I then send the same prefix with valid lifetime = preferred
      lifetime = 0 it will be ignored since the minimum valid lifetime
      is 2 hours:
      
      3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
          inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
             valid_lft 7161sec preferred_lft 7161sec
      
      But according to RFC 4862 we should always reset the preferred lifetime
      even if the valid lifetime is invalid, which would cause the address
      to immediately get deprecated.  So with this patch we'd see this:
      
      5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
          inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
             valid_lft 7163sec preferred_lft 0sec
      
      The comment winds-up being 5x the size of the code to fix the problem.
      
      Update the preferred lifetime of IPv6 addresses derived from a prefix
      info option in a router advertisement even if the valid lifetime in
      the option is invalid, as specified in RFC 4862 Section 5.5.3e.  Fixes
      an issue where an address will not immediately become deprecated.
      Reported by Jens Rosenboom.
      Signed-off-by: NBrian Haley <brian.haley@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a1ed0526
    • W
      xfrm6: fix the proto and ports decode of sctp protocol · 59cae009
      Wei Yongjun 提交于
      The SCTP pushed the skb above the sctp chunk header, so the
      check of pskb_may_pull(skb, nh + offset + 1 - skb->data) in
      _decode_session6() will never return 0 and the ports decode
      of sctp will always fail. (nh + offset + 1 - skb->data < 0)
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59cae009
  9. 27 6月, 2009 2 次提交
  10. 26 6月, 2009 1 次提交
  11. 23 6月, 2009 1 次提交
  12. 18 6月, 2009 1 次提交
  13. 14 6月, 2009 1 次提交
    • T
      PIM-SM: namespace changes · 403dbb97
      Tom Goff 提交于
      IPv4:
        - make PIM register vifs netns local
        - set the netns when a PIM register vif is created
        - make PIM available in all network namespaces (if CONFIG_IP_PIMSM_V2)
          by adding the protocol handler when multicast routing is initialized
      
      IPv6:
        - make PIM register vifs netns local
        - make PIM available in all network namespaces (if CONFIG_IPV6_PIMSM_V2)
          by adding the protocol handler when multicast routing is initialized
      Signed-off-by: NTom Goff <thomas.goff@boeing.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      403dbb97
  14. 13 6月, 2009 2 次提交
  15. 11 6月, 2009 1 次提交
    • E
      net: No more expensive sock_hold()/sock_put() on each tx · 2b85a34e
      Eric Dumazet 提交于
      One of the problem with sock memory accounting is it uses
      a pair of sock_hold()/sock_put() for each transmitted packet.
      
      This slows down bidirectional flows because the receive path
      also needs to take a refcount on socket and might use a different
      cpu than transmit path or transmit completion path. So these
      two atomic operations also trigger cache line bounces.
      
      We can see this in tx or tx/rx workloads (media gateways for example),
      where sock_wfree() can be in top five functions in profiles.
      
      We use this sock_hold()/sock_put() so that sock freeing
      is delayed until all tx packets are completed.
      
      As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
      by one unit at init time, until sk_free() is called.
      Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
      to decrement initial offset and atomicaly check if any packets
      are in flight.
      
      skb_set_owner_w() doesnt call sock_hold() anymore
      
      sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
      reached 0 to perform the final freeing.
      
      Drawback is that a skb->truesize error could lead to unfreeable sockets, or
      even worse, prematurely calling __sk_free() on a live socket.
      
      Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
      on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
      contention point. 5 % speedup on a UDP transmit workload (depends
      on number of flows), lowering TX completion cpu usage.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2b85a34e
  16. 09 6月, 2009 2 次提交
  17. 08 6月, 2009 1 次提交
    • J
      netfilter: nf_ct_icmp: keep the ICMP ct entries longer · f87fb666
      Jan Kasprzak 提交于
      Current conntrack code kills the ICMP conntrack entry as soon as
      the first reply is received. This is incorrect, as we then see only
      the first ICMP echo reply out of several possible duplicates as
      ESTABLISHED, while the rest will be INVALID. Also this unnecessarily
      increases the conntrackd traffic on H-A firewalls.
      
      Make all the ICMP conntrack entries (including the replied ones)
      last for the default of nf_conntrack_icmp{,v6}_timeout seconds.
      Signed-off-by: NJan "Yenya" Kasprzak <kas@fi.muni.cz>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      f87fb666
  18. 04 6月, 2009 1 次提交
  19. 03 6月, 2009 2 次提交
    • E
      net: skb->dst accessors · adf30907
      Eric Dumazet 提交于
      Define three accessors to get/set dst attached to a skb
      
      struct dst_entry *skb_dst(const struct sk_buff *skb)
      
      void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
      
      void skb_dst_drop(struct sk_buff *skb)
      This one should replace occurrences of :
      dst_release(skb->dst)
      skb->dst = NULL;
      
      Delete skb->dst field
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      adf30907
    • P
      netfilter: conntrack: simplify event caching system · 17e6e4ea
      Pablo Neira Ayuso 提交于
      This patch simplifies the conntrack event caching system by removing
      several events:
      
       * IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted
         since the have no clients.
       * IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter
         days.
       * IPCT_REFRESH which is not of any use since we always include the
         timeout in the messages.
      
      After this patch, the existing events are:
      
       * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
       addition and deletion of entries.
       * IPCT_STATUS, that notes that the status bits have changes,
       eg. IPS_SEEN_REPLY and IPS_ASSURED.
       * IPCT_PROTOINFO, that reports that internal protocol information has
       changed, eg. the TCP, DCCP and SCTP protocol state.
       * IPCT_HELPER, that a helper has been assigned or unassigned to this
       entry.
       * IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this
       covers the case when a mark is set to zero.
       * IPCT_NATSEQADJ, to report that there's updates in the NAT sequence
       adjustment.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      17e6e4ea
  20. 02 6月, 2009 1 次提交
  21. 01 6月, 2009 1 次提交
  22. 27 5月, 2009 1 次提交
  23. 22 5月, 2009 1 次提交
  24. 21 5月, 2009 2 次提交
    • J
      IPv6: set RTPROT_KERNEL to initial route · 4f724279
      Jean-Mickael Guerin 提交于
      The use of unspecified protocol in IPv6 initial route prevents quagga to
      install IPv6 default route:
      # show ipv6 route
      S   ::/0 [1/0] via fe80::1, eth1_0
      K>* ::/0 is directly connected, lo, rej
      C>* ::1/128 is directly connected, lo
      C>* fe80::/64 is directly connected, eth1_0
      
      # ip -6 route
      fe80::/64 dev eth1_0  proto kernel  metric 256  mtu 1500 advmss 1440
      hoplimit -1
      ff00::/8 dev eth1_0  metric 256  mtu 1500 advmss 1440 hoplimit -1
      unreachable default dev lo  proto none  metric -1  error -101 hoplimit 255
      
      The attached patch ensures RTPROT_KERNEL to the default initial route
      and fixes the problem for quagga.
      This is similar to "ipv6: protocol for address routes"
      f410a1fb.
      
      # show ipv6 route
      S>* ::/0 [1/0] via fe80::1, eth1_0
      C>* ::1/128 is directly connected, lo
      C>* fe80::/64 is directly connected, eth1_0
      
      # ip -6 route
      fe80::/64 dev eth1_0  proto kernel  metric 256  mtu 1500 advmss 1440
      hoplimit -1
      fe80::/64 dev eth1_0  proto kernel  metric 256  mtu 1500 advmss 1440
      hoplimit -1
      ff00::/8 dev eth1_0  metric 256  mtu 1500 advmss 1440 hoplimit -1
      default via fe80::1 dev eth1_0  proto zebra  metric 1024  mtu 1500
      advmss 1440 hoplimit -1
      unreachable default dev lo  proto kernel  metric -1  error -101 hoplimit 255
      Signed-off-by: NJean-Mickael Guerin <jean-mickael.guerin@6wind.com>
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f724279
    • R
      net: Remove unused parameter from fill method in fib_rules_ops. · 04af8cf6
      Rami Rosen 提交于
      The netlink message header (struct nlmsghdr) is an unused parameter in
      fill method of fib_rules_ops struct.  This patch removes this
      parameter from this method and fixes the places where this method is
      called.
      
      (include/net/fib_rules.h)
      Signed-off-by: NRami Rosen <ramirose@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04af8cf6
  25. 20 5月, 2009 5 次提交