1. 28 6月, 2012 1 次提交
  2. 26 6月, 2012 2 次提交
  3. 20 6月, 2012 1 次提交
  4. 19 6月, 2012 1 次提交
  5. 16 6月, 2012 5 次提交
    • P
      netfilter: add user-space connection tracking helper infrastructure · 12f7a505
      Pablo Neira Ayuso 提交于
      There are good reasons to supports helpers in user-space instead:
      
      * Rapid connection tracking helper development, as developing code
        in user-space is usually faster.
      
      * Reliability: A buggy helper does not crash the kernel. Moreover,
        we can monitor the helper process and restart it in case of problems.
      
      * Security: Avoid complex string matching and mangling in kernel-space
        running in privileged mode. Going further, we can even think about
        running user-space helpers as a non-root process.
      
      * Extensibility: It allows the development of very specific helpers (most
        likely non-standard proprietary protocols) that are very likely not to be
        accepted for mainline inclusion in the form of kernel-space connection
        tracking helpers.
      
      This patch adds the infrastructure to allow the implementation of
      user-space conntrack helpers by means of the new nfnetlink subsystem
      `nfnetlink_cthelper' and the existing queueing infrastructure
      (nfnetlink_queue).
      
      I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
      ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
      two pieces. This change is required not to break NAT sequence
      adjustment and conntrack confirmation for traffic that is enqueued
      to our user-space conntrack helpers.
      
      Basic operation, in a few steps:
      
      1) Register user-space helper by means of `nfct':
      
       nfct helper add ftp inet tcp
      
       [ It must be a valid existing helper supported by conntrack-tools ]
      
      2) Add rules to enable the FTP user-space helper which is
         used to track traffic going to TCP port 21.
      
      For locally generated packets:
      
       iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
      
      For non-locally generated packets:
      
       iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
      
      3) Run the test conntrackd in helper mode (see example files under
         doc/helper/conntrackd.conf
      
       conntrackd
      
      4) Generate FTP traffic going, if everything is OK, then conntrackd
         should create expectations (you can check that with `conntrack':
      
       conntrack -E expect
      
          [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
      
      This confirms that our test helper is receiving packets including the
      conntrack information, and adding expectations in kernel-space.
      
      The user-space helper can also store its private tracking information
      in the conntrack structure in the kernel via the CTA_HELP_INFO. The
      kernel will consider this a binary blob whose layout is unknown. This
      information will be included in the information that is transfered
      to user-space via glue code that integrates nfnetlink_queue and
      ctnetlink.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      12f7a505
    • D
      Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c
      David S. Miller 提交于
      This reverts commit 2a0c451a.
      
      It causes crashes, because now ip6_null_entry is used before
      it is initialized.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e8803b6c
    • D
      ipv6: Fix types of ip6_update_pmtu(). · 42ae66c8
      David S. Miller 提交于
      The mtu should be a __be32, not the mark.
      Reported-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      42ae66c8
    • T
      ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route · 2a0c451a
      Thomas Graf 提交于
      /proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
      handler is installed in ip6_route_net_init() whereas fib_table_hash is
      allocated in fib6_net_init() _after_ the proc handler has been installed.
      
      This opens up a short time frame to access fib_table_hash with its pants
      down.
      
      fib6_init() as a whole can't be moved to an earlier position as it also
      registers the rtnetlink message handlers which should be registered at
      the end. Therefore split it into fib6_init() which is run early and
      fib6_init_late() to register the rtnetlink message handlers.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Reviewed-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2a0c451a
    • D
      ipv6: Handle PMTU in ICMP error handlers. · 81aded24
      David S. Miller 提交于
      One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
      to handle the error pass the 32-bit info cookie in network byte order
      whereas ipv4 passes it around in host byte order.
      
      Like the ipv4 side, we have two helper functions.  One for when we
      have a socket context and one for when we do not.
      
      ip6ip6 tunnels are not handled here, because they handle PMTU events
      by essentially relaying another ICMP packet-too-big message back to
      the original sender.
      
      This patch allows us to get rid of rt6_do_pmtu_disc().  It handles all
      kinds of situations that simply cannot happen when we do the PMTU
      update directly using a fully resolved route.
      
      In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
      likely be removed or changed into a BUG_ON() check.  We should never
      have a prefixed ipv6 route when we get there.
      
      Another piece of strange history here is that TCP and DCCP, unlike in
      ipv4, never invoke the update_pmtu() method from their ICMP error
      handlers.  This is incredibly astonishing since this is the context
      where we have the most accurate context in which to make a PMTU
      update, namely we have a fully connected socket and associated cached
      socket route.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      81aded24
  6. 15 6月, 2012 1 次提交
    • D
      ipv4: Handle PMTU in all ICMP error handlers. · 36393395
      David S. Miller 提交于
      With ip_rt_frag_needed() removed, we have to explicitly update PMTU
      information in every ICMP error handler.
      
      Create two helper functions to facilitate this.
      
      1) ipv4_sk_update_pmtu()
      
         This updates the PMTU when we have a socket context to
         work with.
      
      2) ipv4_update_pmtu()
      
         Raw version, used when no socket context is available.  For this
         interface, we essentially just pass in explicit arguments for
         the flow identity information we would have extracted from the
         socket.
      
         And you'll notice that ipv4_sk_update_pmtu() is simply implemented
         in terms of ipv4_update_pmtu()
      
      Note that __ip_route_output_key() is used, rather than something like
      ip_route_output_flow() or ip_route_output_key().  This is because we
      absolutely do not want to end up with a route that does IPSEC
      encapsulation and the like.  Instead, we only want the route that
      would get us to the node described by the outermost IP header.
      Reported-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36393395
  7. 13 6月, 2012 1 次提交
    • M
      net-next: add dev_loopback_xmit() to avoid duplicate code · 95603e22
      Michel Machado 提交于
      Add dev_loopback_xmit() in order to deduplicate functions
      ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
      ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
      
      I was about to reinvent the wheel when I noticed that
      ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
      need and are not IP-only functions, but they were not available to reuse
      elsewhere.
      
      ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
      understand that this is harmless, and should be in dev_loopback_xmit().
      Signed-off-by: NMichel Machado <michel@digirati.com.br>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Jiri Pirko <jpirko@redhat.com>
      CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      95603e22
  8. 11 6月, 2012 4 次提交
  9. 10 6月, 2012 5 次提交
  10. 09 6月, 2012 3 次提交
    • D
      tcp: Get rid of inetpeer special cases. · 4670fd81
      David S. Miller 提交于
      The get_peer method TCP uses is full of special cases that make no
      sense accommodating, and it also gets in the way of doing more
      reasonable things here.
      
      First of all, if the socket doesn't have a usable cached route, there
      is no sense in trying to optimize timewait recycling.
      
      Likewise for the case where we have IP options, such as SRR enabled,
      that make the IP header destination address (and thus the destination
      address of the route key) differ from that of the connection's
      destination address.
      
      Just return a NULL peer in these cases, and thus we're also able to
      get rid of the clumsy inetpeer release logic.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4670fd81
    • D
      inet: Create and use rt{,6}_get_peer_create(). · fbfe95a4
      David S. Miller 提交于
      There's a lot of places that open-code rt{,6}_get_peer() only because
      they want to set 'create' to one.  So add an rt{,6}_get_peer_create()
      for their sake.
      
      There were also a few spots open-coding plain rt{,6}_get_peer() and
      those are transformed here as well.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fbfe95a4
    • G
      inetpeer: add parameter net for inet_getpeer_v4,v6 · 54db0cc2
      Gao feng 提交于
      add struct net as a parameter of inet_getpeer_v[4,6],
      use net to replace &init_net.
      
      and modify some places to provide net for inet_getpeer_v[4,6]
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      54db0cc2
  11. 08 6月, 2012 2 次提交
    • V
      snmp: fix OutOctets counter to include forwarded datagrams · 2d8dbb04
      Vincent Bernat 提交于
      RFC 4293 defines ipIfStatsOutOctets (similar definition for
      ipSystemStatsOutOctets):
      
         The total number of octets in IP datagrams delivered to the lower
         layers for transmission.  Octets from datagrams counted in
         ipIfStatsOutTransmits MUST be counted here.
      
      And ipIfStatsOutTransmits:
      
         The total number of IP datagrams that this entity supplied to the
         lower layers for transmission.  This includes datagrams generated
         locally and those forwarded by this entity.
      
      Therefore, IPSTATS_MIB_OUTOCTETS must be incremented when incrementing
      IPSTATS_MIB_OUTFORWDATAGRAMS.
      
      IP_UPD_PO_STATS is not used since ipIfStatsOutRequests must not
      include forwarded datagrams:
      
         The total number of IP datagrams that local IP user-protocols
         (including ICMP) supplied to IP in requests for transmission.  Note
         that this counter does not include any datagrams counted in
         ipIfStatsOutForwDatagrams.
      Signed-off-by: NVincent Bernat <bernat@luffy.cx>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d8dbb04
    • T
      ipv6: fib: Restore NTF_ROUTER exception in fib6_age() · 8bd74516
      Thomas Graf 提交于
      Commit 5339ab8b (ipv6: fib: Convert fib6_age() to
      dst_neigh_lookup().) seems to have mistakenly inverted the
      exception for cached NTF_ROUTER routes.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8bd74516
  12. 07 6月, 2012 6 次提交
  13. 04 6月, 2012 2 次提交
  14. 02 6月, 2012 1 次提交
    • E
      tcp: reflect SYN queue_mapping into SYNACK packets · fff32699
      Eric Dumazet 提交于
      While testing how linux behaves on SYNFLOOD attack on multiqueue device
      (ixgbe), I found that SYNACK messages were dropped at Qdisc level
      because we send them all on a single queue.
      
      Obvious choice is to reflect incoming SYN packet @queue_mapping to
      SYNACK packet.
      
      Under stress, my machine could only send 25.000 SYNACK per second (for
      200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues.
      
      After patch, not a single SYNACK is dropped.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fff32699
  15. 27 5月, 2012 2 次提交
    • G
      ipv6: fix incorrect ipsec fragment · 0c183379
      Gao feng 提交于
      Since commit ad0081e4
      "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
      the fragment of packets is incorrect.
      because tunnel mode needs IPsec headers and trailer for all fragments,
      while on transport mode it is sufficient to add the headers to the
      first fragment and the trailer to the last.
      
      so modify mtu and maxfraglen base on ipsec mode and if fragment is first
      or last.
      
      with my test,it work well(every fragment's size is the mtu)
      and does not trigger slow fragment path.
      
      Changes from v1:
      	though optimization, mtu_prev and maxfraglen_prev can be delete.
      	replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL.
      	add fuction ip6_append_data_mtu to make codes clearer.
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c183379
    • B
      xfrm: take net hdr len into account for esp payload size calculation · 91657eaf
      Benjamin Poirier 提交于
      Corrects the function that determines the esp payload size. The calculations
      done in esp{4,6}_get_mtu() lead to overlength frames in transport mode for
      certain mtu values and suboptimal frames for others.
      
      According to what is done, mainly in esp{,6}_output() and tcp_mtu_to_mss(),
      net_header_len must be taken into account before doing the alignment
      calculation.
      Signed-off-by: NBenjamin Poirier <bpoirier@suse.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91657eaf
  16. 21 5月, 2012 1 次提交
    • E
      ipv6/exthdrs: strict Pad1 and PadN check · 9b905fe6
      Eldad Zack 提交于
      The following tightens the padding check from commit
      c1412fce :
      
      * Take into account combinations of consecutive Pad1 and PadN.
      
      * Catch the corner case of when only padding is present in the
        header, when the extention header length is 0 (i.e., 8 bytes).
        In this case, the header would have exactly 6 bytes of padding:
      
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      :  Next Header  : Hdr Ext Len=0 :                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
      :                        Padding (Pad1 or PadN)                 :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      Signed-off-by: NEldad Zack <eldad@fogrefinery.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b905fe6
  17. 20 5月, 2012 2 次提交