1. 29 Sep, 2010 1 commit
    • P
      netfilter: ctnetlink: add support for user-space expectation helpers · bc01befd
      Pablo Neira Ayuso committed
      This patch adds the basic infrastructure to support user-space
      expectation helpers via ctnetlink and the netfilter queuing
      infrastructure NFQUEUE. Basically, this patch:
      
      * adds NF_CT_EXPECT_USERSPACE flag to identify user-space
        created expectations. I have also added a sanity check in
        __nf_ct_expect_check() to prevent kernel-space helpers
        from creating an expectation if the master conntrack has no
        helper assigned.
      * adds some branches to check if the master conntrack helper
        exists, otherwise we skip the code that refers to kernel-space
        helper such as the local expectation list and the expectation
        policy.
      * allows setting the timeout for user-space expectations with
        no helper assigned.
      * adds a list of expectations created from user-space that depends
        on ctnetlink (if this module is removed, they are deleted).
      * includes USERSPACE in the /proc output for expectations
        that have been created by a user-space helper.
      
      This patch also modifies ctnetlink to skip including the helper
      name in the Netlink messages if no kernel-space helper is set
      (since user-space expectations have no kernel-space helper
      assigned).
      
      You can access an example user-space FTP conntrack helper at:
      http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      bc01befd
  2. 22 Sep, 2010 1 commit
  3. 21 Sep, 2010 2 commits
    • J
      ipvs: make rerouting optional with snat_reroute · 8a803040
      Julian Anastasov committed
      Add new sysctl flag "snat_reroute". Recent kernels use
      ip_route_me_harder() to route LVS-NAT responses properly by
      VIP when there are multiple paths to client. But setups
      that do not have alternative default routes can skip this
      routing lookup by using snat_reroute=0.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      8a803040
    • J
      ipvs: netfilter connection tracking changes · f4bc17cd
      Julian Anastasov committed
      Add more code to IPVS to work with Netfilter connection
      tracking and fix some problems.
      
      - Allow IPVS to be compiled without connection tracking as in
      2.6.35 and before. This can avoid keeping conntracks for all
      IPVS connections because this costs memory. ip_vs_ftp still
      depends on connection tracking and NAT as implemented for 2.6.36.
      
      - Add sysctl var "conntrack" to enable connection tracking for
      all IPVS connections. On loaded IPVS directors this requires
      tuning the nf_conntrack_max limit.
      
      - Add IP_VS_CONN_F_NFCT connection flag to request the connection
      to use connection tracking. This allows user space to provide this
      flag, for example, in dest->conn_flags. This can be useful to
      request connection tracking per real server instead of forcing it
      for all connections with the "conntrack" sysctl. This flag is
      set currently only by ip_vs_ftp and of course by "conntrack" sysctl.
      
      - Add ip_vs_nfct.c file to hold all connection tracking code,
      so that the main code does not depend on netfilter conntrack
      support.
      
      - Bring back the ip_vs_post_routing handler as in 2.6.35 and use
      skb->ipvs_property=1 to allow IPVS to work without connection
      tracking.
      
      Connection tracking:
      
      - most of the code is already in 2.6.36-rc
      
      - alter conntrack reply tuple for LVS-NAT connections when first packet
      from client is forwarded and conntrack state is NEW or RELATED.
      Additionally, alter reply for RELATED connections from real server,
      again for packet in original direction.
      
      - add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering
      reply) for LVS-TUN early because we want to call nf_reset. It is
      needed because we add IPIP header and the original conntrack
      should be preserved, not destroyed. The transmitted IPIP packets
      can reuse same conntrack, so we do not set skb->ipvs_property.
      
      - try to destroy conntrack when the IPVS connection is destroyed.
      It is not fatal if conntrack disappears before that, it depends
      on the used timers.
      
      Fix long-standing problems:
      
      - add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      f4bc17cd
  4. 17 Sep, 2010 1 commit
  5. 09 Sep, 2010 3 commits
    • E
      udp: add rehash on connect() · 719f8358
      Eric Dumazet committed
      commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
      added a secondary hash on UDP, hashed on (local addr, local port).
      
      The problem is that the following sequence:
      
      fd = socket(...)
      connect(fd, &remote, ...)
      
      not only selects the remote endpoint (address and port) but also sets
      the local address, whereas the UDP stack inserted the socket into the
      secondary hash table while its local address was still INADDR_ANY
      (or the IPv6 equivalent).
      
      The sequence is:
       - autobind() : choose a random local port, insert socket in hash tables
                    [while local address is INADDR_ANY]
       - connect() : set remote address and port, change local address to IP
                    given by a route lookup.
      
      When an incoming UDP frame arrives, if more than 10 sockets are found in
      the primary hash table, we switch to the secondary table and fail to find
      the socket because its local address changed.
      
      One solution to this problem is to rehash datagram socket if needed.
      
      We add a new rehash(struct socket *) method in "struct proto", and
      implement this method for UDP v4 & v6, using a common helper.
      
      This rehashing only takes care of secondary hash table, since primary
      hash (based on local port only) is not changed.
      Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      719f8358
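
The address change described in this commit can be observed from user space. The following is a minimal sketch, not code from the patch: the helper name udp_connect_local() is hypothetical, and it assumes a loopback interface is available. It connects an unbound UDP socket to 127.0.0.1 and reads back the local address that connect() selected, which is no longer INADDR_ANY.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect an unbound UDP socket to 127.0.0.1:port and
 * report the local address/port the kernel assigned during connect().
 * Returns 0 on success, -1 on failure. */
static int udp_connect_local(unsigned short port, struct sockaddr_in *local)
{
    struct sockaddr_in remote;
    socklen_t len = sizeof(*local);
    int rc = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(port);
    remote.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* connect() autobinds a local port and sets the local address from a
     * route lookup; getsockname() then shows the result. */
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) == 0 &&
        getsockname(fd, (struct sockaddr *)local, &len) == 0)
        rc = 0;

    close(fd);
    return rc;
}
```

Because the local address is part of the secondary hash key, a socket hashed while the address was INADDR_ANY sits in the wrong chain after this call unless it is rehashed.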
    • J
      e3634169
    • J
      ipvs: fix active FTP · 6523ce15
      Julian Anastasov committed
      - Do not create expectation when forwarding the PORT
        command to avoid blocking the connection. The problem is that
        nf_conntrack_ftp.c:help() tries to create the same expectation later in
        POST_ROUTING and drops the packet with "dropping packet" message after
        failure in nf_ct_expect_related.
      
      - Change ip_vs_update_conntrack to alter the conntrack
        for related connections from real server. If we do not alter the reply in
        this direction the next packet from client sent to vport 20 comes as NEW
        connection. We alter it but maybe some collision happens for both
        conntracks and the second conntrack gets destroyed immediately. The
        connection gets stuck too.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6523ce15
  6. 04 Sep, 2010 1 commit
  7. 31 Aug, 2010 3 commits
    • G
      TCP: update initial windows according to RFC 5681 · 3d5b99ae
      Gerrit Renker committed
      This updates the use of larger initial windows, as originally specified in
      RFC 3390, to use the newer IW values specified in RFC 5681, section 3.1.
      
      The changes made in RFC 5681 are:
       a) the setting now is more clearly specified in units of segments (as the
          comments  by John Heffner emphasized, this was not very clear in RFC 3390);
       b) for connections with 1095 < SMSS <= 2190 there is now a change:
          - RFC 3390 says that IW <= 4380,
          - RFC 5681 says that IW = 3 * SMSS <= 6570.
      
      Since RFC 3390 is older and "only" proposed standard, whereas the newer RFC 5681
      is already draft standard, it seems preferable to use the newer IW variant.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3d5b99ae
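
The difference between the two RFCs can be worked through numerically. The sketch below is illustrative only and is not the kernel implementation; the function names are hypothetical. It computes the RFC 3390 upper bound, min(4*MSS, max(2*MSS, 4380 bytes)), and the RFC 5681 section 3.1 value in segments and bytes.

```c
/* Initial window in bytes per RFC 3390: min(4*MSS, max(2*MSS, 4380)). */
static unsigned int rfc3390_iw_bytes(unsigned int smss)
{
    unsigned int lower = 2 * smss > 4380 ? 2 * smss : 4380;
    unsigned int upper = 4 * smss;

    return upper < lower ? upper : lower;
}

/* Initial window in segments per RFC 5681, section 3.1. */
static unsigned int rfc5681_iw_segments(unsigned int smss)
{
    if (smss > 2190)
        return 2;
    if (smss > 1095)
        return 3;
    return 4;
}

/* Same value expressed in bytes, for comparison with RFC 3390. */
static unsigned int rfc5681_iw_bytes(unsigned int smss)
{
    return rfc5681_iw_segments(smss) * smss;
}
```

For SMSS = 1460 both formulas give 4380 bytes. For SMSS = 2000, which falls in the 1095 < SMSS <= 2190 range named in the commit message, RFC 3390 gives 4380 bytes while RFC 5681 gives 3 * 2000 = 6000 bytes.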
    • G
      tcp/dccp: Consolidate common code for RFC 3390 conversion · 22b71c8f
      Gerrit Renker committed
      This patch consolidates initial-window code common to TCP and CCID-2:
       * TCP uses RFC 3390 in a packet-oriented manner (tcp_input.c) and
       * CCID-2 uses RFC 3390 in packet-oriented manner (RFC 4341).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22b71c8f
    • J
      tcp: Add TCP_USER_TIMEOUT socket option. · dca43c75
      Jerry Chu committed
      This patch provides "user timeout" support as described in RFC793. The
      socket option is also needed for the local half of RFC5482 "TCP User
      Timeout Option".
      
      TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int,
      when > 0, to specify the maximum amount of time in ms that transmitted
      data may remain unacknowledged before TCP will forcefully close the
      corresponding connection and return ETIMEDOUT to the application. If
      0 is given, TCP will continue to use the system default.
      
      Increasing the user timeouts allows a TCP connection to survive extended
      periods without end-to-end connectivity. Decreasing the user timeouts
      allows applications to "fail fast" if so desired. Otherwise it may take
      up to 20 minutes with the current system defaults in a normal WAN
      environment.
      
      The socket option can be set during any state of a TCP connection, but
      is only effective during the synchronized states of a connection
      (ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK).
      Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option,
      TCP_USER_TIMEOUT will override keepalive to determine when to close a
      connection due to keepalive failure.
      
      The option does not change in any way when TCP retransmits a packet, nor
      when a keepalive probe will be sent.
      
      This option, like many others, will be inherited by an acceptor from its
      listener.
      Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dca43c75
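
From an application's point of view, the option described above is set like any other TCP-level socket option. The following is a minimal sketch assuming a Linux system; the wrapper names set_tcp_user_timeout() and get_tcp_user_timeout() are hypothetical, and the fallback define covers older libc headers that lack the constant (18 is its value in the Linux UAPI headers).

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef TCP_USER_TIMEOUT
#define TCP_USER_TIMEOUT 18 /* value from the Linux UAPI headers */
#endif

/* Set the maximum time in milliseconds that transmitted data may remain
 * unacknowledged; 0 restores the system default. Returns 0 on success. */
static int set_tcp_user_timeout(int fd, unsigned int timeout_ms)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}

/* Read the current value back. Returns 0 on success. */
static int get_tcp_user_timeout(int fd, unsigned int *timeout_ms)
{
    socklen_t len = sizeof(*timeout_ms);

    return getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms, &len);
}
```

A caller that wants to fail fast might use set_tcp_user_timeout(fd, 30000) on a connected socket, after which a write whose data stays unacknowledged for about 30 seconds leads to ETIMEDOUT.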
  8. 28 Aug, 2010 3 commits
  9. 27 Aug, 2010 1 commit
  10. 26 Aug, 2010 1 commit
  11. 25 Aug, 2010 5 commits
  12. 22 Aug, 2010 2 commits
  13. 20 Aug, 2010 1 commit
    • G
      net/sched: add ACT_CSUM action to update packets checksums · eb4d4065
      Grégoire Baron committed
      ACT_CSUM can be called just after ACT_PEDIT in order to re-compute some
      altered checksums in IPv4 and IPv6 packets. The following checksums are
      supported by this patch:
       - IPv4: IPv4 header, ICMP, IGMP, TCP, UDP & UDPLite
       - IPv6: ICMPv6, TCP, UDP & UDPLite
      A single action can be asked to update several kinds of checksums,
      which is useful when the packet flow mixes TCP, UDP and UDPLite.
      
      An example of usage is done in the associated iproute2 patch.
      
      Version 3 changes:
       - remove useless goto instructions
       - improve IPv6 hop options decoding
      
      Version 2 changes:
       - coding style correction
       - remove useless arguments of some functions
       - use stack in tcf_csum_dump()
       - add tcf_csum_skb_nextlayer() to factor code
      Signed-off-by: Gregoire Baron <baronchon@n7mm.org>
      Acked-by: jamal <hadi@cyberus.ca>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb4d4065
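
To illustrate why a checksum action is needed after an edit, the sketch below recomputes an IPv4 header checksum in user space with the standard Internet checksum (RFC 1071). It is not the kernel's ACT_CSUM code; the function names are hypothetical and the header is treated as a raw byte array.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's-complement sum of 16-bit big-endian
 * words, folded to 16 bits and inverted. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8; /* pad odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Recompute the checksum field (bytes 10 and 11) of an IPv4 header after
 * some other field has been modified. */
static void ipv4_fix_checksum(uint8_t *hdr)
{
    size_t hdr_len = (size_t)(hdr[0] & 0x0f) * 4; /* IHL in 32-bit words */
    uint16_t csum;

    hdr[10] = 0;
    hdr[11] = 0;
    csum = inet_checksum(hdr, hdr_len);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
}
```

After any byte of the header is changed, for example the TTL at offset 8, the stored checksum is stale until ipv4_fix_checksum() runs; a receiver can verify the result by checking that inet_checksum() over the full header returns 0.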
  14. 19 Aug, 2010 1 commit
  15. 17 Aug, 2010 4 commits
  16. 10 Aug, 2010 3 commits
    • M
      Bluetooth: Use 3-DH5 payload size for default ERTM max PDU size · db12d647
      Mat Martineau committed
      The previous value of 672 for L2CAP_DEFAULT_MAX_PDU_SIZE is based on
      the default L2CAP MTU.  That default MTU is calculated from the size
      of two DH5 packets, minus ACL and L2CAP b-frame header overhead.
      
      ERTM is used with newer basebands that typically support larger 3-DH5
      packets, and i-frames and s-frames have more header overhead.  With
      clean RF conditions, basebands will typically attempt to use 1021-byte
      3-DH5 packets for maximum throughput.  Adjusting for 2 bytes of ACL
      headers plus 10 bytes of worst-case L2CAP headers yields 1009 bytes
      of payload.
      
      This PDU size imposes less overhead for header bytes and gives the
      baseband the option to choose 3-DH5 packets, but is small enough for
      ERTM traffic to interleave well with other L2CAP or SCO data.
      672-byte payloads do not allow the most efficient over-the-air
      packet choice, and cannot achieve maximum throughput over BR/EDR.
      Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      db12d647
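
The arithmetic behind the new default can be written out directly. This is a small sketch using only the figures quoted in the commit message; the constant and function names are hypothetical and are not the kernel's identifiers.

```c
enum {
    BT_3DH5_PACKET_SIZE = 1021,  /* 3-DH5 packet size from the commit message */
    BT_ACL_HEADER_SIZE = 2,      /* ACL header bytes */
    BT_L2CAP_WORST_HEADER = 10   /* worst-case L2CAP header bytes */
};

/* Payload left for an ERTM PDU once ACL and L2CAP headers are subtracted
 * from the baseband packet size. */
static unsigned int ertm_max_pdu_size(unsigned int baseband_packet_size)
{
    return baseband_packet_size - BT_ACL_HEADER_SIZE - BT_L2CAP_WORST_HEADER;
}
```

With the 3-DH5 size this gives 1021 - 2 - 10 = 1009 bytes, the payload figure stated in the commit message.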
    • M
      Bluetooth: Change default L2CAP ERTM retransmit timeout · fa235562
      Mat Martineau committed
      The L2CAP specification requires that the ERTM retransmit timeout be at
      least 2 seconds for BR/EDR connections.
      Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      fa235562
    • R
      net/sock.h: add missing kernel-doc notation · 53c3fa20
      Randy Dunlap committed
      Add missing kernel-doc notation to struct sock:
      
      Warning(include/net/sock.h:324): No description found for parameter 'sk_peer_pid'
      Warning(include/net/sock.h:324): No description found for parameter 'sk_peer_cred'
      Warning(include/net/sock.h:324): No description found for parameter 'sk_classid'
      Warning(include/net/sock.h:324): Excess struct/union/enum/typedef member 'sk_peercred' description in 'sock'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      53c3fa20
  17. 03 Aug, 2010 7 commits