1. 11 11月, 2005 6 次提交
  2. 10 11月, 2005 4 次提交
    • T
      [NETLINK]: Generic netlink family · 482a8524
      Thomas Graf 提交于
      The generic netlink family builds on top of netlink and provides
      simplifies access for the less demanding netlink users. It solves
      the problem of protocol numbers running out by introducing a so
      called controller taking care of id management and name resolving.
      
      Generic netlink modules register themself after filling out their
      id card (struct genl_family), after successful registration the
      modules are able to register callbacks to command numbers by
      filling out a struct genl_ops and calling genl_register_op(). The
      registered callbacks are invoked with attributes parsed making
      life of simple modules a lot easier.
      
      Although generic netlink modules can request static identifiers,
      it is recommended to use GENL_ID_GENERATE and to let the controller
      assign a unique identifier to the module. Userspace applications
      will then ask the controller and lookup the idenfier by the module
      name.
      
      Due to the current multicast implementation of netlink, the number
      of generic netlink modules is restricted to 1024 to avoid wasting
      memory for the per socket multiacst subscription bitmask.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      482a8524
    • T
      [NETLINK]: Generic netlink receive queue processor · 82ace47a
      Thomas Graf 提交于
      Introduces netlink_run_queue() to handle the receive queue of
      a netlink socket in a generic way. Processes as much as there
      was in the queue upon entry and invokes a callback function
      for each netlink message found. The callback function may
      refuse a message by returning a negative error code but setting
      the error pointer to 0 in which case netlink_run_queue() will
      return with a qlen != 0.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82ace47a
    • T
      [NETLINK]: Type-safe netlink messages/attributes interface · bfa83a9e
      Thomas Graf 提交于
      Introduces a new type-safe interface for netlink message and
      attributes handling. The interface is fully binary compatible
      with the old interface towards userspace. Besides type safety,
      this interface features attribute validation capabilities,
      simplified message contstruction, and documentation.
      
      The resulting netlink code should be smaller, less error prone
      and easier to understand.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bfa83a9e
    • Y
      [NETFILTER]: Add nf_conntrack subsystem. · 9fb9cbb1
      Yasuyuki Kozakai 提交于
      The existing connection tracking subsystem in netfilter can only
      handle ipv4.  There were basically two choices present to add
      connection tracking support for ipv6.  We could either duplicate all
      of the ipv4 connection tracking code into an ipv6 counterpart, or (the
      choice taken by these patches) we could design a generic layer that
      could handle both ipv4 and ipv6 and thus requiring only one sub-protocol
      (TCP, UDP, etc.) connection tracking helper module to be written.
      
      In fact nf_conntrack is capable of working with any layer 3
      protocol.
      
      The existing ipv4 specific conntrack code could also not deal
      with the pecularities of doing connection tracking on ipv6,
      which is also cured here.  For example, these issues include:
      
      1) ICMPv6 handling, which is used for neighbour discovery in
         ipv6 thus some messages such as these should not participate
         in connection tracking since effectively they are like ARP
         messages
      
      2) fragmentation must be handled differently in ipv6, because
         the simplistic "defrag, connection track and NAT, refrag"
         (which the existing ipv4 connection tracking does) approach simply
         isn't feasible in ipv6
      
      3) ipv6 extension header parsing must occur at the correct spots
         before and after connection tracking decisions, and there were
         no provisions for this in the existing connection tracking
         design
      
      4) ipv6 has no need for stateful NAT
      
      The ipv4 specific conntrack layer is kept around, until all of
      the ipv4 specific conntrack helpers are ported over to nf_conntrack
      and it is feature complete.  Once that occurs, the old conntrack
      stuff will get placed into the feature-removal-schedule and we will
      fully kill it off 6 months later.
      Signed-off-by: NYasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
      Signed-off-by: NHarald Welte <laforge@netfilter.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      9fb9cbb1
  3. 09 11月, 2005 6 次提交
  4. 08 11月, 2005 1 次提交
  5. 06 11月, 2005 3 次提交
  6. 29 10月, 2005 4 次提交
  7. 28 10月, 2005 1 次提交
  8. 26 10月, 2005 2 次提交
  9. 23 10月, 2005 1 次提交
  10. 22 10月, 2005 1 次提交
  11. 11 10月, 2005 1 次提交
  12. 09 10月, 2005 1 次提交
  13. 07 10月, 2005 2 次提交
  14. 06 10月, 2005 1 次提交
  15. 05 10月, 2005 4 次提交
  16. 04 10月, 2005 1 次提交
    • E
      [INET]: speedup inet (tcp/dccp) lookups · 81c3d547
      Eric Dumazet 提交于
      Arnaldo and I agreed it could be applied now, because I have other
      pending patches depending on this one (Thank you Arnaldo)
      
      (The other important patch moves skc_refcnt in a separate cache line,
      so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)
      
      1) First some performance data :
      --------------------------------
      
      tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()
      
      The most time critical code is :
      
      sk_for_each(sk, node, &head->chain) {
           if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
               goto hit; /* You sunk my battleship! */
      }
      
      The sk_for_each() does use prefetch() hints but only the begining of
      "struct sock" is prefetched.
      
      As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far
      away from the begining of "struct sock", it has to bring into CPU
      cache cold cache line. Each iteration has to use at least 2 cache
      lines.
      
      This can be problematic if some chains are very long.
      
      2) The goal
      -----------
      
      The idea I had is to change things so that INET_MATCH() may return
      FALSE in 99% of cases only using the data already in the CPU cache,
      using one cache line per iteration.
      
      3) Description of the patch
      ---------------------------
      
      Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
      filling a 32 bits hole on 64 bits platform.
      
      struct sock_common {
      	unsigned short		skc_family;
      	volatile unsigned char	skc_state;
      	unsigned char		skc_reuse;
      	int			skc_bound_dev_if;
      	struct hlist_node	skc_node;
      	struct hlist_node	skc_bind_node;
      	atomic_t		skc_refcnt;
      +	unsigned int		skc_hash;
      	struct proto		*skc_prot;
      };
      
      Store in this 32 bits field the full hash, not masked by (ehash_size -
      1) Using this full hash as the first comparison done in INET_MATCH
      permits us immediatly skip the element without touching a second cache
      line in case of a miss.
      
      Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
      sk_hash and tw_hash) already contains the slot number if we mask with
      (ehash_size - 1)
      
      File include/net/inet_hashtables.h
      
      64 bits platforms :
      #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
           (((__sk)->sk_hash == (__hash))
           ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie))   &&  \
           ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports))   &&  \
           (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
      
      32bits platforms:
      #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
           (((__sk)->sk_hash == (__hash))                 &&  \
           (inet_sk(__sk)->daddr          == (__saddr))   &&  \
           (inet_sk(__sk)->rcv_saddr      == (__daddr))   &&  \
           (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
      
      
      - Adds a prefetch(head->chain.first) in 
      __inet_lookup_established()/__tcp_v4_check_established() and 
      __inet6_lookup_established()/__tcp_v6_check_established() and 
      __dccp_v4_check_established() to bring into cache the first element of the 
      list, before the {read|write}_lock(&head->lock);
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Acked-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      81c3d547
  17. 03 10月, 2005 1 次提交