1. 19 6月, 2012 12 次提交
  2. 18 6月, 2012 3 次提交
  3. 17 6月, 2012 6 次提交
  4. 16 6月, 2012 18 次提交
    • P
      netfilter: add user-space connection tracking helper infrastructure · 12f7a505
      Pablo Neira Ayuso 提交于
      There are good reasons to supports helpers in user-space instead:
      
      * Rapid connection tracking helper development, as developing code
        in user-space is usually faster.
      
      * Reliability: A buggy helper does not crash the kernel. Moreover,
        we can monitor the helper process and restart it in case of problems.
      
      * Security: Avoid complex string matching and mangling in kernel-space
        running in privileged mode. Going further, we can even think about
        running user-space helpers as a non-root process.
      
      * Extensibility: It allows the development of very specific helpers (most
        likely non-standard proprietary protocols) that are very likely not to be
        accepted for mainline inclusion in the form of kernel-space connection
        tracking helpers.
      
      This patch adds the infrastructure to allow the implementation of
      user-space conntrack helpers by means of the new nfnetlink subsystem
      `nfnetlink_cthelper' and the existing queueing infrastructure
      (nfnetlink_queue).
      
      I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
      ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
      two pieces. This change is required not to break NAT sequence
      adjustment and conntrack confirmation for traffic that is enqueued
      to our user-space conntrack helpers.
      
      Basic operation, in a few steps:
      
      1) Register user-space helper by means of `nfct':
      
       nfct helper add ftp inet tcp
      
       [ It must be a valid existing helper supported by conntrack-tools ]
      
      2) Add rules to enable the FTP user-space helper which is
         used to track traffic going to TCP port 21.
      
      For locally generated packets:
      
       iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
      
      For non-locally generated packets:
      
       iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
      
      3) Run the test conntrackd in helper mode (see example files under
         doc/helper/conntrackd.conf
      
       conntrackd
      
      4) Generate FTP traffic going, if everything is OK, then conntrackd
         should create expectations (you can check that with `conntrack':
      
       conntrack -E expect
      
          [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      
      This confirms that our test helper is receiving packets including the
      conntrack information, and adding expectations in kernel-space.
      
      The user-space helper can also store its private tracking information
      in the conntrack structure in the kernel via the CTA_HELP_INFO. The
      kernel will consider this a binary blob whose layout is unknown. This
      information will be included in the information that is transfered
      to user-space via glue code that integrates nfnetlink_queue and
      ctnetlink.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      12f7a505
    • P
      netfilter: ctnetlink: add CTA_HELP_INFO attribute · ae243bee
      Pablo Neira Ayuso 提交于
      This attribute can be used to modify and to dump the internal
      protocol information.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ae243bee
    • P
      netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled · 8c88f87c
      Pablo Neira Ayuso 提交于
      User-space programs that receive traffic via NFQUEUE may mangle packets.
      If NAT is enabled, this usually puzzles sequence tracking, leading to
      traffic disruptions.
      
      With this patch, nfnl_queue will make the corresponding NAT TCP sequence
      adjustment if:
      
      1) The packet has been mangled,
      2) the NFQA_CFG_F_CONNTRACK flag has been set, and
      3) NAT is detected.
      
      There are some records on the Internet complaning about this issue:
      http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables
      
      By now, we only support TCP since we have no helpers for DCCP or SCTP.
      Better to add this if we ever have some helper over those layer 4 protocols.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      8c88f87c
    • P
      netfilter: add glue code to integrate nfnetlink_queue and ctnetlink · 9cb01766
      Pablo Neira Ayuso 提交于
      This patch allows you to include the conntrack information together
      with the packet that is sent to user-space via NFQUEUE.
      
      Previously, there was no integration between ctnetlink and
      nfnetlink_queue. If you wanted to access conntrack information
      from your libnetfilter_queue program, you required to query
      ctnetlink from user-space to obtain it. Thus, delaying the packet
      processing even more.
      
      Including the conntrack information is optional, you can set it
      via NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      9cb01766
    • P
      netfilter: nf_ct_helper: implement variable length helper private data · 1afc5679
      Pablo Neira Ayuso 提交于
      This patch uses the new variable length conntrack extensions.
      
      Instead of using union nf_conntrack_help that contain all the
      helper private data information, we allocate variable length
      area to store the private helper data.
      
      This patch includes the modification of all existing helpers.
      It also includes a couple of include header to avoid compilation
      warnings.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1afc5679
    • P
      netfilter: nf_ct_ext: support variable length extensions · 3cf4c7e3
      Pablo Neira Ayuso 提交于
      We can now define conntrack extensions of variable size. This
      patch is useful to get rid of these unions:
      
      union nf_conntrack_help
      union nf_conntrack_proto
      union nf_conntrack_nat_help
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      3cf4c7e3
    • P
      netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names · 3a8fc53a
      Pablo Neira Ayuso 提交于
      This patch modifies the struct nf_conntrack_helper to allocate
      the room for the helper name. The maximum length is 16 bytes
      (this was already introduced in 2.6.24).
      
      For the maximum length for expectation policy names, I have
      also selected 16 bytes.
      
      This patch is required by the follow-up patch to support
      user-space connection tracking helpers.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      3a8fc53a
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · aee289ba
      David S. Miller 提交于
      Conflicts:
      	net/ipv6/route.c
      
      Pull in 'net' again to get the revert of Thomas's change
      which introduced regressions.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aee289ba
    • D
      Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c
      David S. Miller 提交于
      This reverts commit 2a0c451a.
      
      It causes crashes, because now ip6_null_entry is used before
      it is initialized.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e8803b6c
    • D
      ipv6: Fix types of ip6_update_pmtu(). · 42ae66c8
      David S. Miller 提交于
      The mtu should be a __be32, not the mark.
      Reported-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      42ae66c8
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7e52b33b
      David S. Miller 提交于
      Conflicts:
      	net/ipv6/route.c
      
      This deals with a merge conflict between the net-next addition of the
      inetpeer network namespace ops, and Thomas Graf's bug fix in
      2a0c451a which makes sure we don't
      register /proc/net/ipv6_route before it is actually safe to do so.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7e52b33b
    • D
    • T
      ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route · 2a0c451a
      Thomas Graf 提交于
      /proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
      handler is installed in ip6_route_net_init() whereas fib_table_hash is
      allocated in fib6_net_init() _after_ the proc handler has been installed.
      
      This opens up a short time frame to access fib_table_hash with its pants
      down.
      
      fib6_init() as a whole can't be moved to an earlier position as it also
      registers the rtnetlink message handlers which should be registered at
      the end. Therefore split it into fib6_init() which is run early and
      fib6_init_late() to register the rtnetlink message handlers.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Reviewed-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2a0c451a
    • D
      qlcnic: off by one in qlcnic_init_pci_info() · 0f6efff9
      Dan Carpenter 提交于
      The adapter->npars[] array has QLCNIC_MAX_PCI_FUNC elements.  We
      allocate it that way a few lines earlier in the function.  So this test
      is off by one.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NAnirban Chakraborty <anirban.chakraborty@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0f6efff9
    • E
      net: remove skb_orphan_try() · 62b1a8ab
      Eric Dumazet 提交于
      Orphaning skb in dev_hard_start_xmit() makes bonding behavior
      unfriendly for applications sending big UDP bursts : Once packets
      pass the bonding device and come to real device, they might hit a full
      qdisc and be dropped. Without orphaning, the sender is automatically
      throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
      sk_sndbuf is not too big)
      
      We could try to defer the orphaning adding another test in
      dev_hard_start_xmit(), but all this seems of little gain,
      now that BQL tends to make packets more likely to be parked
      in Qdisc queues instead of NIC TX ring, in cases where performance
      matters.
      
      Reverts commits :
      fc6055a5 net: Introduce skb_orphan_try()
      87fd308c net: skb_tx_hash() fix relative to skb_orphan_try()
      and removes SKBTX_DRV_NEEDS_SK_REF flag
      Reported-and-bisected-by: NJean-Michel Hautbois <jhautbois@gmail.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Tested-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62b1a8ab
    • E
      bnx2x: fix panic when TX ring is full · bc14786a
      Eric Dumazet 提交于
      There is a off by one error in the minimal number of BD in
      bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming tx queue.
      
      A full size GSO packet, with data included in skb->head really needs
      (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split()
      
      This error triggers if BQL is disabled and heavy TCP transmit traffic
      occurs.
      
      bnx2x_tx_split() definitely can be called, remove a wrong comment.
      Reported-by: NTomas Hruby <thruby@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Yaniv Rosner <yanivr@broadcom.com>
      Cc: Merav Sicron <meravs@broadcom.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Robert Evans <evansr@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc14786a
    • D
      can: c_can: precedence error in c_can_chip_config() · d9cb9bd6
      Dan Carpenter 提交于
      (CAN_CTRLMODE_LISTENONLY & CAN_CTRLMODE_LOOPBACK) is (0x02 & 0x01) which
      is zero so the condition is never true.  The intent here was to test
      that both flags were set.
      
      Cc: <stable@kernel.org> # 2.6.39+
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9cb9bd6
    • D
      ipv6: Handle PMTU in ICMP error handlers. · 81aded24
      David S. Miller 提交于
      One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
      to handle the error pass the 32-bit info cookie in network byte order
      whereas ipv4 passes it around in host byte order.
      
      Like the ipv4 side, we have two helper functions.  One for when we
      have a socket context and one for when we do not.
      
      ip6ip6 tunnels are not handled here, because they handle PMTU events
      by essentially relaying another ICMP packet-too-big message back to
      the original sender.
      
      This patch allows us to get rid of rt6_do_pmtu_disc().  It handles all
      kinds of situations that simply cannot happen when we do the PMTU
      update directly using a fully resolved route.
      
      In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
      likely be removed or changed into a BUG_ON() check.  We should never
      have a prefixed ipv6 route when we get there.
      
      Another piece of strange history here is that TCP and DCCP, unlike in
      ipv4, never invoke the update_pmtu() method from their ICMP error
      handlers.  This is incredibly astonishing since this is the context
      where we have the most accurate context in which to make a PMTU
      update, namely we have a fully connected socket and associated cached
      socket route.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      81aded24
  5. 15 6月, 2012 1 次提交
    • D
      ipv4: Handle PMTU in all ICMP error handlers. · 36393395
      David S. Miller 提交于
      With ip_rt_frag_needed() removed, we have to explicitly update PMTU
      information in every ICMP error handler.
      
      Create two helper functions to facilitate this.
      
      1) ipv4_sk_update_pmtu()
      
         This updates the PMTU when we have a socket context to
         work with.
      
      2) ipv4_update_pmtu()
      
         Raw version, used when no socket context is available.  For this
         interface, we essentially just pass in explicit arguments for
         the flow identity information we would have extracted from the
         socket.
      
         And you'll notice that ipv4_sk_update_pmtu() is simply implemented
         in terms of ipv4_update_pmtu()
      
      Note that __ip_route_output_key() is used, rather than something like
      ip_route_output_flow() or ip_route_output_key().  This is because we
      absolutely do not want to end up with a route that does IPSEC
      encapsulation and the like.  Instead, we only want the route that
      would get us to the node described by the outermost IP header.
      Reported-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36393395