1. 20 Jun, 2012 1 commit
  2. 19 Jun, 2012 2 commits
  3. 17 Jun, 2012 1 commit
  4. 16 Jun, 2012 9 commits
    • netfilter: add user-space connection tracking helper infrastructure · 12f7a505
      Pablo Neira Ayuso committed
      There are good reasons to support helpers in user-space instead:
      
      * Rapid connection tracking helper development, as developing code
        in user-space is usually faster.
      
      * Reliability: A buggy helper does not crash the kernel. Moreover,
        we can monitor the helper process and restart it in case of problems.
      
      * Security: Avoid complex string matching and mangling in kernel-space
        running in privileged mode. Going further, we can even think about
        running user-space helpers as a non-root process.
      
      * Extensibility: It allows the development of very specific helpers (most
        likely non-standard proprietary protocols) that are very likely not to be
        accepted for mainline inclusion in the form of kernel-space connection
        tracking helpers.
      
      This patch adds the infrastructure to allow the implementation of
      user-space conntrack helpers by means of the new nfnetlink subsystem
      `nfnetlink_cthelper' and the existing queueing infrastructure
      (nfnetlink_queue).
      
      I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
      ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
      two pieces. This change is required not to break NAT sequence
      adjustment and conntrack confirmation for traffic that is enqueued
      to our user-space conntrack helpers.
      
      Basic operation, in a few steps:
      
      1) Register user-space helper by means of `nfct':
      
       nfct helper add ftp inet tcp
      
       [ It must be a valid existing helper supported by conntrack-tools ]
      
      2) Add rules to enable the FTP user-space helper which is
         used to track traffic going to TCP port 21.
      
      For locally generated packets:
      
       iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
      
      For non-locally generated packets:
      
       iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
      
      3) Run the test conntrackd in helper mode (see example files under
         doc/helper/conntrackd.conf):
      
       conntrackd
      
      4) Generate FTP traffic. If everything is OK, conntrackd should
         create expectations (you can check that with `conntrack'):
      
       conntrack -E expect
      
          [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      
      This confirms that our test helper is receiving packets including the
      conntrack information, and adding expectations in kernel-space.
      
      The user-space helper can also store its private tracking information
      in the conntrack structure in the kernel via the CTA_HELP_INFO. The
      kernel will consider this a binary blob whose layout is unknown. This
      information will be included in the information that is transferred
      to user-space via glue code that integrates nfnetlink_queue and
      ctnetlink.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: ctnetlink: add CTA_HELP_INFO attribute · ae243bee
      Pablo Neira Ayuso committed
      This attribute can be used to modify and to dump the internal
      protocol information.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled · 8c88f87c
      Pablo Neira Ayuso committed
      User-space programs that receive traffic via NFQUEUE may mangle packets.
      If NAT is enabled, this usually puzzles sequence tracking, leading to
      traffic disruptions.
      
      With this patch, nfnl_queue will make the corresponding NAT TCP sequence
      adjustment if:
      
      1) The packet has been mangled,
      2) the NFQA_CFG_F_CONNTRACK flag has been set, and
      3) NAT is detected.
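The three conditions above can be sketched as a single predicate. This is only an illustration, not the kernel code: `struct sketch_verdict`, `SKETCH_CFG_F_CONNTRACK` and `sketch_seq_adjust()` are hypothetical names, and the sketch assumes the adjustment equals the change in packet length introduced by the user-space program.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the NFQA_CFG_F_CONNTRACK queue flag. */
#define SKETCH_CFG_F_CONNTRACK (1u << 1)

/* Hypothetical summary of a verdict coming back from user space. */
struct sketch_verdict {
    bool     mangled;      /* user space returned a modified payload */
    uint32_t queue_flags;  /* flags configured on the queue */
    bool     nat_active;   /* NAT was detected for this connection */
    int      old_len;      /* packet length before mangling */
    int      new_len;      /* packet length after mangling */
};

/*
 * Returns the TCP sequence offset to apply, or 0 when any of the
 * three conditions is not met. Assumption: the offset is simply the
 * difference between the new and the old packet length.
 */
int sketch_seq_adjust(const struct sketch_verdict *v)
{
    if (!v->mangled)
        return 0;
    if (!(v->queue_flags & SKETCH_CFG_F_CONNTRACK))
        return 0;
    if (!v->nat_active)
        return 0;
    return v->new_len - v->old_len;
}
```

A payload that grows from 100 to 104 bytes would yield an offset of 4 under these assumptions; dropping any one condition yields 0.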
      
      There are some records on the Internet complaining about this issue:
      http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables
      
      For now, we only support TCP since we have no helpers for DCCP or SCTP.
      It is better to add this if we ever have a helper for those layer 4 protocols.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_ct_helper: implement variable length helper private data · 1afc5679
      Pablo Neira Ayuso committed
      This patch uses the new variable length conntrack extensions.
      
      Instead of using union nf_conntrack_help, which contains all the
      helper private data, we allocate a variable-length area to store
      the private helper data.
      
      This patch includes the modification of all existing helpers.
      It also adds a couple of header includes to avoid compilation
      warnings.
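The idea of replacing a fixed union with a variable-length private area can be sketched in plain C with a flexible array member. This is a user-space illustration under stated assumptions, not the kernel implementation: `struct sketch_help`, `struct sketch_ftp_state` and both functions are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical extension header followed by helper-private bytes. */
struct sketch_help {
    size_t        data_len;  /* size of the private area in bytes */
    unsigned char data[];    /* flexible array member, sized per helper */
};

/* Hypothetical private state that one particular helper might keep. */
struct sketch_ftp_state {
    unsigned int last_seq;
    int          line_pending;
};

/* Allocate the header plus exactly as much room as the helper asks for. */
struct sketch_help *sketch_help_alloc(size_t data_len)
{
    struct sketch_help *h = calloc(1, sizeof(*h) + data_len);

    if (h != NULL)
        h->data_len = data_len;
    return h;
}

/* Return the zero-initialized private area for the helper to use. */
void *sketch_help_data(struct sketch_help *h)
{
    return h->data;
}
```

Each helper then requests only `sizeof` its own state, instead of every conntrack entry paying for the largest member of a union.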
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_ct_ext: support variable length extensions · 3cf4c7e3
      Pablo Neira Ayuso committed
      We can now define conntrack extensions of variable size. This
      patch is useful to get rid of these unions:
      
      union nf_conntrack_help
      union nf_conntrack_proto
      union nf_conntrack_nat_help
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names · 3a8fc53a
      Pablo Neira Ayuso committed
      This patch modifies the struct nf_conntrack_helper to allocate
      the room for the helper name. The maximum length is 16 bytes
      (this was already introduced in 2.6.24).
      
      For the maximum length for expectation policy names, I have
      also selected 16 bytes.
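A fixed 16-byte name field could look roughly like the sketch below. The names `SKETCH_HELPER_NAME_LEN`, `struct sketch_helper` and `sketch_helper_set_name()` are hypothetical, and the sketch assumes the 16 bytes include the terminating NUL, so at most 15 visible characters fit.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical constant mirroring the 16-byte limit described above. */
#define SKETCH_HELPER_NAME_LEN 16

/* Hypothetical helper descriptor with the name stored inline. */
struct sketch_helper {
    char name[SKETCH_HELPER_NAME_LEN];
};

/*
 * Copy a name into the fixed-size field. Names that do not fit,
 * including their terminating NUL, are rejected instead of truncated.
 */
bool sketch_helper_set_name(struct sketch_helper *h, const char *name)
{
    size_t len = strlen(name);

    if (len >= SKETCH_HELPER_NAME_LEN)
        return false;
    memcpy(h->name, name, len + 1);
    return true;
}
```

Storing the name inline means a helper registered from user space does not depend on a string that lives elsewhere.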
      
      This patch is required by the follow-up patch to support
      user-space connection tracking helpers.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c
      David S. Miller committed
      This reverts commit 2a0c451a.
      
      It causes crashes, because now ip6_null_entry is used before
      it is initialized.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route · 2a0c451a
      Thomas Graf committed
      /proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
      handler is installed in ip6_route_net_init() whereas fib_table_hash is
      allocated in fib6_net_init() _after_ the proc handler has been installed.
      
      This opens up a short time frame to access fib_table_hash with its pants
      down.
      
      fib6_init() as a whole can't be moved to an earlier position as it also
      registers the rtnetlink message handlers which should be registered at
      the end. Therefore split it into fib6_init() which is run early and
      fib6_init_late() to register the rtnetlink message handlers.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Handle PMTU in ICMP error handlers. · 81aded24
      David S. Miller committed
      One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
      to handle the error pass the 32-bit info cookie in network byte order
      whereas ipv4 passes it around in host byte order.
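The byte-order difference can be illustrated with a small sketch. The two function names are hypothetical; the only assumption taken from the text is that the IPv6 callouts receive the 32-bit info value in network byte order while the IPv4 side receives it in host byte order.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* IPv6 side: the cookie arrives in network byte order, so convert it. */
uint32_t sketch_mtu_from_icmpv6_info(uint32_t info_be)
{
    return ntohl(info_be);
}

/* IPv4 side: the cookie is already in host byte order, use it as-is. */
uint32_t sketch_mtu_from_icmpv4_info(uint32_t info)
{
    return info;
}
```

Forgetting the conversion on the IPv6 side would go unnoticed on big-endian machines and produce a nonsensical MTU on little-endian ones.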
      
      Like the ipv4 side, we have two helper functions.  One for when we
      have a socket context and one for when we do not.
      
      ip6ip6 tunnels are not handled here, because they handle PMTU events
      by essentially relaying another ICMP packet-too-big message back to
      the original sender.
      
      This patch allows us to get rid of rt6_do_pmtu_disc().  It handles all
      kinds of situations that simply cannot happen when we do the PMTU
      update directly using a fully resolved route.
      
      In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
      likely be removed or changed into a BUG_ON() check.  We should never
      have a prefixed ipv6 route when we get there.
      
      Another piece of strange history here is that TCP and DCCP, unlike in
      ipv4, never invoke the update_pmtu() method from their ICMP error
      handlers.  This is incredibly astonishing since this is the context
      where we have the most accurate context in which to make a PMTU
      update, namely we have a fully connected socket and associated cached
      socket route.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 15 Jun, 2012 1 commit
    • ipv4: Handle PMTU in all ICMP error handlers. · 36393395
      David S. Miller committed
      With ip_rt_frag_needed() removed, we have to explicitly update PMTU
      information in every ICMP error handler.
      
      Create two helper functions to facilitate this.
      
      1) ipv4_sk_update_pmtu()
      
         This updates the PMTU when we have a socket context to
         work with.
      
      2) ipv4_update_pmtu()
      
         Raw version, used when no socket context is available.  For this
         interface, we essentially just pass in explicit arguments for
         the flow identity information we would have extracted from the
         socket.
      
         And you'll notice that ipv4_sk_update_pmtu() is simply implemented
         in terms of ipv4_update_pmtu()
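The relationship between the two helpers can be sketched as a thin wrapper around a raw function. Everything here is hypothetical and simplified: the `sketch_` types and signatures are not the kernel's, and the route lookup that the real code performs is replaced by writing into a caller-provided structure.

```c
#include <stdint.h>

/* Hypothetical flow identity: the values normally taken from a socket. */
struct sketch_flow {
    int      oif;
    uint32_t mark;
    uint8_t  protocol;
};

/* Hypothetical socket carrying the same flow identity. */
struct sketch_sock {
    int      bound_dev_if;
    uint32_t mark;
    uint8_t  protocol;
};

/* Hypothetical stand-in for the route whose PMTU gets updated. */
struct sketch_route {
    struct sketch_flow flow;
    uint32_t           pmtu;
};

/* Raw version: the caller passes the flow identity explicitly. */
void sketch_update_pmtu(struct sketch_route *rt, uint32_t mtu,
                        int oif, uint32_t mark, uint8_t protocol)
{
    rt->flow.oif = oif;
    rt->flow.mark = mark;
    rt->flow.protocol = protocol;
    rt->pmtu = mtu;
}

/* Socket version: extract the identity from the socket, then delegate. */
void sketch_sk_update_pmtu(struct sketch_route *rt,
                           const struct sketch_sock *sk, uint32_t mtu)
{
    sketch_update_pmtu(rt, mtu, sk->bound_dev_if, sk->mark, sk->protocol);
}
```

Keeping one raw entry point means handlers without a socket context and handlers with one share the same update path.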
      
      Note that __ip_route_output_key() is used, rather than something like
      ip_route_output_flow() or ip_route_output_key().  This is because we
      absolutely do not want to end up with a route that does IPSEC
      encapsulation and the like.  Instead, we only want the route that
      would get us to the node described by the outermost IP header.
      Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 13 Jun, 2012 1 commit
    • bonding: Fix corrupted queue_mapping · 5ee31c68
      Eric Dumazet committed
      In the transmit path of the bonding driver, skb->cb is used to
      stash the skb->queue_mapping so that the bonding device can set its
      own queue mapping.  This value becomes corrupted since the skb->cb is
      also used in __dev_xmit_skb.
      
      When transmitting through bonding driver, bond_select_queue is
      called from dev_queue_xmit.  In bond_select_queue the original
      skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
      and skb->queue_mapping is overwritten with the bond driver queue.
      
      Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
      the packet length into skb->cb, thereby overwriting the stashed
      queue mapping.  In bond_dev_queue_xmit (called from hard_start_xmit),
      the queue mapping for the skb is set to the stashed value which is now
      the skb length and hence is an invalid queue for the slave device.
      
      If we want to save skb->queue_mapping into skb->cb[], the best place is
      a new field in struct qdisc_skb_cb, to make sure it won't conflict with
      other layers (e.g. Qdisc, Infiniband...)

      This patch also makes sure (struct qdisc_skb_cb)->data is aligned on 8
      bytes: the netem qdisc, for example, assumes it can store a u64 in it
      without a misalignment penalty.

      Note: we only have 20 bytes left in (struct qdisc_skb_cb)->data[].
      The largest user is CHOKe and it fills it.
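The corruption and the fix can be reproduced in a small user-space sketch. The layout below is an assumption modeled on the description (a length field, a dedicated queue-mapping field, padding, then 20 bytes of data on an 8-byte boundary); `struct sketch_qdisc_cb`, `SKETCH_CB_SIZE` and both functions are hypothetical names, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size of the per-packet scratch area. */
#define SKETCH_CB_SIZE 48

/* Hypothetical layout with a dedicated slot for the stashed mapping. */
struct sketch_qdisc_cb {
    unsigned int  pkt_len;
    uint16_t      bond_queue_mapping;
    uint16_t      pad;
    unsigned char data[20];
};

_Static_assert(offsetof(struct sketch_qdisc_cb, data) % 8 == 0,
               "data must start on an 8-byte boundary");
_Static_assert(sizeof(struct sketch_qdisc_cb) <= SKETCH_CB_SIZE,
               "layout must fit in the scratch area");

/* Buggy flow: both layers write at the start of the same buffer. */
uint16_t sketch_buggy_roundtrip(uint16_t queue, unsigned int pkt_len)
{
    unsigned char cb[SKETCH_CB_SIZE] = {0};
    uint16_t restored;

    memcpy(cb, &queue, sizeof(queue));      /* bonding stashes the queue */
    memcpy(cb, &pkt_len, sizeof(pkt_len));  /* qdisc layer stores the length */
    memcpy(&restored, cb, sizeof(restored));
    return restored;                        /* now part of the length */
}

/* Fixed flow: each value has its own field, so nothing is overwritten. */
uint16_t sketch_fixed_roundtrip(uint16_t queue, unsigned int pkt_len)
{
    struct sketch_qdisc_cb cb = {0};

    cb.bond_queue_mapping = queue;
    cb.pkt_len = pkt_len;
    return cb.bond_queue_mapping;
}
```

In the buggy flow the restored queue is a fragment of the packet length, which is what makes it an invalid queue for the slave device; in the fixed flow the stashed value survives.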
      
      Based on a previous patch from Tom Herbert.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Tom Herbert <therbert@google.com>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Roland Dreier <roland@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 12 Jun, 2012 1 commit
  8. 11 Jun, 2012 6 commits
  9. 10 Jun, 2012 4 commits
  10. 09 Jun, 2012 7 commits
  11. 07 Jun, 2012 7 commits