1. 11 Feb, 2011 2 commits
  2. 09 Feb, 2011 4 commits
  3. 06 Feb, 2011 1 commit
  4. 05 Feb, 2011 1 commit
  5. 04 Feb, 2011 4 commits
  6. 03 Feb, 2011 4 commits
  7. 02 Feb, 2011 3 commits
  8. 01 Feb, 2011 9 commits
    • netfilter: ecache: always set events bits, filter them later · 3db7e93d
      Committed by Pablo Neira Ayuso
      For the following rule:
      
      iptables -I PREROUTING -t raw -j CT --ctevents assured
      
      The event delivered looks like the following:
      
       [UPDATE] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
      
      Note that the TCP protocol state is not included. For that reason
      the CT event filtering is not very useful for conntrackd.
      
      To resolve this issue, instead of conditionally setting the CT events
      bits based on the ctmask, we always set them and perform the filtering
      in the late stage, just before the delivery.
      
      Thus, the event delivered looks like the following:
      
       [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
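The "always set, filter late" idea described in the commit above can be sketched as follows. This is a minimal illustration, not the kernel code: the `sketch_` names, the two event bits, and the struct layout are all assumptions made for the example. Recording an event never looks at the mask; the mask is applied once, at delivery, and if any requested bit is present the full set of pending events is reported (which is why the protocol state can appear alongside `[ASSURED]`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; the kernel's real event enum differs. */
enum sketch_event {
    SKETCH_EV_PROTOINFO = 1u << 0,  /* e.g. the TCP state changed */
    SKETCH_EV_ASSURED   = 1u << 1,  /* the connection became assured */
};

struct sketch_ecache {
    uint32_t pending;  /* every event seen since the last delivery */
    uint32_t ctmask;   /* events the user asked for (--ctevents) */
};

/* Recording never consults ctmask: the bit is always set. */
static void sketch_cache_event(struct sketch_ecache *e, uint32_t event)
{
    assert(e != NULL);
    e->pending |= event;
}

/*
 * Filtering happens once, just before delivery. Returns all pending
 * bits when at least one of them is wanted, or 0 when the report should
 * be dropped. The pending set is cleared either way.
 */
static uint32_t sketch_deliver(struct sketch_ecache *e)
{
    uint32_t events;

    assert(e != NULL);
    events = e->pending;
    e->pending = 0;
    return (events & e->ctmask) ? events : 0;
}
```

With `ctmask` set to the assured bit only, a lone protocol-state change yields no report, while a protocol-state change followed by the assured transition yields one report carrying both bits.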
    • netfilter: xtables: "set" match and "SET" target support · d956798d
      Committed by Jozsef Kadlecsik
      The patch adds the combined module of the "SET" target and "set" match
      to netfilter. Both the previous and the current revisions are supported.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • netfilter: ipset: list:set set type support · f830837f
      Committed by Jozsef Kadlecsik
      The module implements the list:set type support in two flavours:
      without and with timeout. The sets have two sides: for userspace,
      they store the names of other (non-list:set) sets: one can add,
      delete and test set names. For the kernel, they form an ordered union of
      the member sets: the member sets are tried in order when elements are
      added, deleted and tested, and the process stops at the first success.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
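The "ordered union" behaviour described above (try each member set in order, stop at the first success) might look roughly like this. It is a sketch under stated assumptions: the member sets here are tiny fixed-size arrays of IPv4 addresses, a member add "fails" when the member is full or already holds the element, and every `sketch_` name is hypothetical rather than part of ipset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP 4    /* hypothetical capacity of each member set */

/* Stand-in for a member set: a small array of IPv4 addresses. */
struct sketch_set {
    uint32_t elems[SKETCH_CAP];
    size_t   count;
};

static bool sketch_set_test(const struct sketch_set *s, uint32_t ip)
{
    for (size_t i = 0; i < s->count; i++)
        if (s->elems[i] == ip)
            return true;
    return false;
}

/* Fails when the member is full or already contains the element. */
static bool sketch_set_add(struct sketch_set *s, uint32_t ip)
{
    if (s->count == SKETCH_CAP || sketch_set_test(s, ip))
        return false;
    s->elems[s->count++] = ip;
    return true;
}

/* Stand-in for a list:set: an ordered array of member sets. */
struct sketch_list {
    struct sketch_set **members;
    size_t              n;
};

/* Returns the index of the first member that matches, or -1. */
static int sketch_list_test(const struct sketch_list *l, uint32_t ip)
{
    assert(l != NULL);
    for (size_t i = 0; i < l->n; i++)
        if (sketch_set_test(l->members[i], ip))
            return (int)i;
    return -1;
}

/* Returns the index of the first member that accepts the add, or -1. */
static int sketch_list_add(struct sketch_list *l, uint32_t ip)
{
    assert(l != NULL);
    for (size_t i = 0; i < l->n; i++)
        if (sketch_set_add(l->members[i], ip))
            return (int)i;
    return -1;
}
```

If the first member is full, an add falls through to the second member, and a later test for that element reports the second member's index.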
    • netfilter: ipset: hash:ip set type support · 6c027889
      Committed by Jozsef Kadlecsik
      The module implements the hash:ip type support in four flavours:
      for IPv4 or IPv6, both without and with timeout support.
      
      All the hash types are based on the "array hash" (ahash) structure
      and functions, chosen as a good compromise between minimal memory
      footprint and speed. The hashing uses arrays to resolve clashes. The hash table
      is resized (doubled) when searching becomes too long. Resizing can be
      triggered by userspace add commands only and those are serialized by
      the nfnl mutex. During resizing the set is read-locked, so the only
      possible concurrent operations are the kernel side readers. Those are
      protected by RCU locking.
      
      Because of the four flavours and the other hash types, the functions
      are implemented in general forms in the ip_set_ahash.h header file
      and the real functions are generated before compiling by macro expansion.
      Thus the dereferencing of low-level functions and void pointer arguments
      could be avoided: the low-level functions are inlined, the function
      arguments are pointers of type-specific structures.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
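The last paragraph of the commit above describes writing functions once in a general form and generating the type-specific versions by macro expansion, so that neither function pointers nor `void *` arguments are needed. A minimal sketch of that technique follows; the element structs, the `SKETCH_DEFINE_FIND` macro, and the lookup it generates are hypothetical and are not taken from `ip_set_ahash.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type-specific element structures. */
struct sketch_elem4 { uint32_t ip; };
struct sketch_elem6 { uint8_t ip[16]; };

/* Type-specific low-level comparisons, inlined into the generated code. */
static inline bool sketch_equal4(const struct sketch_elem4 *a,
                                 const struct sketch_elem4 *b)
{
    return a->ip == b->ip;
}

static inline bool sketch_equal6(const struct sketch_elem6 *a,
                                 const struct sketch_elem6 *b)
{
    return memcmp(a->ip, b->ip, sizeof(a->ip)) == 0;
}

/*
 * The generic body is written once. SUFFIX selects the element type and
 * the comparison, so each expansion takes typed pointers and calls the
 * comparison directly.
 */
#define SKETCH_DEFINE_FIND(SUFFIX)                                      \
static int sketch_find##SUFFIX(const struct sketch_elem##SUFFIX *arr,   \
                               size_t n,                                \
                               const struct sketch_elem##SUFFIX *key)   \
{                                                                       \
    assert(arr != NULL && key != NULL);                                 \
    for (size_t i = 0; i < n; i++)                                      \
        if (sketch_equal##SUFFIX(&arr[i], key))                         \
            return (int)i;                                              \
    return -1;                                                          \
}

SKETCH_DEFINE_FIND(4)   /* generates sketch_find4() */
SKETCH_DEFINE_FIND(6)   /* generates sketch_find6() */
```

Passing a `struct sketch_elem6 *` to `sketch_find4()` is a compile-time type error, which is one benefit of this approach over a single `void *` based routine.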
    • netfilter: ipset: bitmap:ip set type support · 72205fc6
      Committed by Jozsef Kadlecsik
      The module implements the bitmap:ip set type in two flavours, without
      and with timeout support. In this kind of set one can store IPv4
      addresses (or network addresses) from a given range.
      
      In order not to waste memory, the timeout version does not rely on
      the kernel timer for every element to be timed out but on garbage
      collection. All set types use this mechanism.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
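The timeout design described above (no timer per element, expired entries reclaimed by garbage collection) can be illustrated with a small sketch. The layout is an assumption for the example only: one expiry stamp per address in the range, with 0 meaning "not in the set", and a caller-supplied clock. The `sketch_` names and `SKETCH_RANGE` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RANGE 8  /* hypothetical number of addresses in the range */

/* One expiry stamp per address; 0 means "not in the set". */
struct sketch_bitmap {
    uint32_t first_ip;
    uint64_t expires[SKETCH_RANGE];
};

static bool sketch_in_range(const struct sketch_bitmap *m, uint32_t ip)
{
    return ip >= m->first_ip && ip - m->first_ip < SKETCH_RANGE;
}

/* Adding only stores the expiry time; no timer is armed. */
static bool sketch_add(struct sketch_bitmap *m, uint32_t ip,
                       uint64_t now, uint64_t timeout)
{
    assert(timeout > 0);
    if (!sketch_in_range(m, ip))
        return false;
    m->expires[ip - m->first_ip] = now + timeout;
    return true;
}

/* An expired entry counts as absent even before it is collected. */
static bool sketch_test(const struct sketch_bitmap *m, uint32_t ip,
                        uint64_t now)
{
    uint64_t e;

    if (!sketch_in_range(m, ip))
        return false;
    e = m->expires[ip - m->first_ip];
    return e != 0 && e > now;
}

/* Periodic sweep: clears expired entries, returns how many were freed. */
static size_t sketch_gc(struct sketch_bitmap *m, uint64_t now)
{
    size_t reclaimed = 0;

    for (size_t i = 0; i < SKETCH_RANGE; i++) {
        if (m->expires[i] != 0 && m->expires[i] <= now) {
            m->expires[i] = 0;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

The trade-off in this sketch is that an expired slot stays occupied until the next sweep, in exchange for not keeping a timer object per element.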
    • netfilter: ipset: IP set core support · a7b4f989
      Committed by Jozsef Kadlecsik
      The patch adds the IP set core support to the kernel.
      
      The IP set core implements a netlink (nfnetlink) based protocol by which
      one can create, destroy, flush, rename, swap, list, save, restore sets,
      and add, delete, test elements from userspace. For simplicity and backward
      compatibility, and to avoid forcing ip(6)tables to link against a netlink
      library, a small getsockopt-based protocol is also kept in order
      to communicate with the ip(6)tables match and target.
      
      The netlink protocol passes all multi-byte values (u16, etc.) in network
      byte order with the NLA_F_NET_BYTEORDER flag set. The protocol enforces
      the proper use of the NLA_F_NESTED and NLA_F_NET_BYTEORDER flags.
      
      For other kernel subsystems (netfilter match and target) the API contains
      the functions to add, delete and test elements in sets and the required calls
      to get/put references to the sets before those operations can be performed.
      
      The set types (which are implemented in independent modules) are stored
      in a simple RCU protected list. A set type may have variants: for example
      without timeout or with timeout support, for IPv4 or for IPv6. The sets
      (i.e. the pointers to the sets) are stored in an array. The sets are
      identified by their index in the array, which makes swapping sets easy
      and fast. The array is protected indirectly by the nfnl
      mutex from nfnetlink. The contents of each set are protected by the
      rwlock of that set.
      
      There are functional differences between the add/del/test functions
      for the kernel and userspace:
      
      - kernel add/del/test: works on the current packet (i.e. one element)
      - kernel test: may trigger an "add" operation in order to fill
        out unspecified parts of the element from the packet (like MAC address)
      - userspace add/del: works on the netlink message and thus possibly
        on multiple elements from the IPSET_ATTR_ADT container attribute.
      - userspace add: may trigger resizing of a set
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
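The commit above notes that sets are identified by their index in an array of pointers, which is what makes swapping cheap. A minimal sketch of that idea is below. It only exchanges the two array slots; locking is omitted (the commit says the array is protected by the nfnl mutex), and the struct, array size, and `sketch_` names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SETS 4   /* hypothetical size of the set array */

/* Stand-in for a set; real sets carry a type, lock, data, and so on. */
struct sketch_set {
    const char *label;
};

/* Sets are referenced by their index in this array of pointers. */
static struct sketch_set *sketch_sets[SKETCH_MAX_SETS];

/*
 * Swapping exchanges two pointers. Anything that holds an index (for
 * example a rule that references slot a) sees the other set's contents
 * afterwards, and no element is copied.
 */
static void sketch_swap(size_t a, size_t b)
{
    struct sketch_set *tmp;

    assert(a < SKETCH_MAX_SETS && b < SKETCH_MAX_SETS);
    tmp = sketch_sets[a];
    sketch_sets[a] = sketch_sets[b];
    sketch_sets[b] = tmp;
}
```

The cost of the swap in this sketch is independent of how many elements either set holds.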
    • netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros · f703651e
      Committed by Jozsef Kadlecsik
      The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the
      vanilla kernel.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • ipv4: Consolidate all default route selection implementations. · 0c838ff1
      Committed by David S. Miller
      Both fib_trie and fib_hash have a local implementation of
      fib_table_select_default().  This is completely unnecessary
      code duplication.
      
      Since we now remember the fib_table and the head of the fib
      alias list of the default route, we can implement one single
      generic version of this routine.
      
      Looking at the fib_hash implementation you may get the impression
      that it's possible for there to be multiple top-level routes in
      the table for the default route.  The truth is, it isn't: the
      insert code will only allow one entry to exist in the zero
      prefix hash table, because all keys evaluate to zero and all
      keys in a hash table must be unique.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Remember FIB alias list head and table in lookup results. · 5b470441
      Committed by David S. Miller
      This will be used later to implement fib_select_default() in a
      completely generic manner, instead of the current situation where the
      default route is re-looked up in the TRIE/HASH table and then the
      available aliases are analyzed.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 30 Jan, 2011 2 commits
  10. 29 Jan, 2011 3 commits
  11. 28 Jan, 2011 4 commits
  12. 27 Jan, 2011 1 commit
    • net: Implement read-only protection and COW'ing of metrics. · 62fa8a84
      Committed by David S. Miller
      Routing metrics are now copy-on-write.
      
      Initially a route entry points its metrics at a read-only location.
      If a routing table entry exists, it will point there.  Else it will
      point at the all zero metric place-holder called 'dst_default_metrics'.
      
      The writeability state of the metrics is stored in the low bits of the
      metrics pointer; we have two bits left to spare if we want to store
      more states.
      
      For the initial implementation, COW is implemented simply via kmalloc.
      However future enhancements will change this to place the writable
      metrics somewhere else, in order to increase sharing.  Very likely
      this "somewhere else" will be the inetpeer cache.
      
      Note also that this means that metrics updates may transiently fail
      if we cannot COW the metrics successfully.
      
      But even by itself, this patch should decrease memory usage and
      increase cache locality especially for routing workloads.  In those
      cases the read-only metric copies stay in place and never get written
      to.
      
      TCP workloads where metrics get updated, and those rare cases where
      PMTU triggers occur, will take a very slight performance hit.  But
      that hit will be alleviated when the long-term writable metrics
      move to a more sharable location.
      
      Since the metrics storage went from a u32 array of RTAX_MAX entries to
      what is essentially a pointer, some retooling of the dst_entry layout
      was necessary.
      
      Most importantly, we need to preserve the alignment of the reference
      count so that it doesn't share cache lines with the read-mostly state,
      as per Eric Dumazet's alignment assertion checks.
      
      The only non-trivial bit here is the move of the 'flags' member into
      the writeable cacheline.  This is OK since we are always accessing the
      flags around the same moment when we made a modification to the
      reference count.
      Signed-off-by: David S. Miller <davem@davemloft.net>
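The copy-on-write scheme described above (a pointer whose low bits record whether the metrics are read-only, with a private copy allocated on first write) can be sketched in user-space C. This is an illustration under assumptions, not the kernel implementation: it uses `malloc` where the commit uses kmalloc, it assumes metrics arrays are at least 4-byte aligned so the two low pointer bits are free, and all `sketch_`/`SKETCH_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NMETRICS  4                /* stand-in for RTAX_MAX */
#define SKETCH_READ_ONLY ((uintptr_t)1)   /* state kept in pointer bit 0 */
#define SKETCH_FLAG_MASK ((uintptr_t)3)   /* low bits assumed free (4-byte alignment) */

/* All-zero placeholder, shared by every entry without its own metrics. */
static const uint32_t sketch_default_metrics[SKETCH_NMETRICS] = { 0 };

struct sketch_dst {
    uintptr_t metrics;  /* pointer to the metrics array plus flag bits */
};

/* Point the entry at a shared, read-only metrics array. */
static void sketch_init(struct sketch_dst *d, const uint32_t *shared)
{
    d->metrics = (uintptr_t)shared | SKETCH_READ_ONLY;
}

static uint32_t *sketch_ptr(const struct sketch_dst *d)
{
    return (uint32_t *)(d->metrics & ~SKETCH_FLAG_MASK);
}

static bool sketch_read_only(const struct sketch_dst *d)
{
    return (d->metrics & SKETCH_READ_ONLY) != 0;
}

static uint32_t sketch_get(const struct sketch_dst *d, unsigned int idx)
{
    assert(idx < SKETCH_NMETRICS);
    return sketch_ptr(d)[idx];
}

/* Returns false when the copy cannot be allocated: the update fails. */
static bool sketch_set(struct sketch_dst *d, unsigned int idx, uint32_t val)
{
    assert(idx < SKETCH_NMETRICS);
    if (sketch_read_only(d)) {
        uint32_t *copy = malloc(sizeof(sketch_default_metrics));

        if (copy == NULL)
            return false;
        memcpy(copy, sketch_ptr(d), sizeof(sketch_default_metrics));
        d->metrics = (uintptr_t)copy;   /* flag bits clear: writable */
    }
    sketch_ptr(d)[idx] = val;
    return true;
}

/* Frees the private copy, if one was ever made. */
static void sketch_release(struct sketch_dst *d)
{
    if (!sketch_read_only(d))
        free(sketch_ptr(d));
}
```

Entries that are only read keep sharing one array and never allocate. The first write pays for one allocation and copy, and can fail, which mirrors the commit's note that metrics updates may transiently fail.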
  13. 25 Jan, 2011 2 commits