1. 05 4月, 2016 1 次提交
    • E
      tcp/dccp: do not touch listener sk_refcnt under synflood · 3b24d854
      Eric Dumazet 提交于
      When a SYNFLOOD targets a non SO_REUSEPORT listener, multiple
      cpus contend on sk->sk_refcnt and sk->sk_wmem_alloc changes.
      
      By letting listeners use SOCK_RCU_FREE infrastructure,
      we can relax TCP_LISTEN lookup rules and avoid touching sk_refcnt
      
      Note that we still use SLAB_DESTROY_BY_RCU rules for other sockets,
      only listeners are impacted by this change.
      
      Peak performance under SYNFLOOD is increased by ~33% :
      
      On my test machine, I could process 3.2 Mpps instead of 2.4 Mpps
      
      Most consuming functions are now skb_set_owner_w() and sock_wfree()
      contending on sk->sk_wmem_alloc when cooking SYNACK and freeing them.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b24d854
  2. 11 2月, 2016 1 次提交
  3. 19 9月, 2015 1 次提交
  4. 18 6月, 2015 1 次提交
    • H
      netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flag · 01555e74
      Harout Hedeshian 提交于
      xt_socket is useful for matching sockets with IP_TRANSPARENT and
      taking some action on the matching packets. However, it lacks the
      ability to match only a small subset of transparent sockets.
      
      Suppose there are 2 applications, each with its own set of transparent
      sockets. The first application wants all matching packets dropped,
      while the second application wants them forwarded somewhere else.
      
      Add the ability to retore the skb->mark from the sk_mark. The mark
      is only restored if a matching socket is found and the transparent /
      nowildcard conditions are satisfied.
      
      Now the 2 hypothetical applications can differentiate their sockets
      based on a mark value set with SO_MARK.
      
      iptables -t mangle -I PREROUTING -m socket --transparent \
                                                 --restore-skmark -j action
      iptables -t mangle -A action -m mark --mark 10 -j action2
      iptables -t mangle -A action -m mark --mark 11 -j action3
      Signed-off-by: NHarout Hedeshian <harouth@codeaurora.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      01555e74
  5. 08 4月, 2015 1 次提交
    • D
      netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match · d64d80a2
      Daniel Borkmann 提交于
      Currently in xt_socket, we take advantage of early demuxed sockets
      since commit 00028aa3 ("netfilter: xt_socket: use IP early demux")
      in order to avoid a second socket lookup in the fast path, but we
      only make partial use of this:
      
      We still unnecessarily parse headers, extract proto, {s,d}addr and
      {s,d}ports from the skb data, accessing possible conntrack information,
      etc even though we were not even calling into the socket lookup via
      xt_socket_get_sock_{v4,v6}() due to skb->sk hit, meaning those cycles
      can be spared.
      
      After this patch, we only proceed the slower, manual lookup path
      when we have a skb->sk miss, thus time to match verdict for early
      demuxed sockets will improve further, which might be i.e. interesting
      for use cases such as mentioned in 681f130f ("netfilter: xt_socket:
      add XT_SOCKET_NOWILDCARD flag").
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      d64d80a2
  6. 18 3月, 2015 1 次提交
  7. 17 2月, 2015 1 次提交
  8. 17 10月, 2013 1 次提交
  9. 09 10月, 2013 1 次提交
    • E
      ipv6: make lookups simpler and faster · efe4208f
      Eric Dumazet 提交于
      TCP listener refactoring, part 4 :
      
      To speed up inet lookups, we moved IPv4 addresses from inet to struct
      sock_common
      
      Now is time to do the same for IPv6, because it permits us to have fast
      lookups for all kind of sockets, including upcoming SYN_RECV.
      
      Getting IPv6 addresses in TCP lookups currently requires two extra cache
      lines, plus a dereference (and memory stall).
      
      inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6
      
      This patch is way bigger than its IPv4 counter part, because for IPv4,
      we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
      it's not doable easily.
      
      inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
      inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr
      
      And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
      at the same offset.
      
      We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
      macro.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efe4208f
  10. 01 8月, 2013 1 次提交
  11. 15 7月, 2013 1 次提交
  12. 21 6月, 2013 1 次提交
    • E
      netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag · 681f130f
      Eric Dumazet 提交于
      xt_socket module can be a nice replacement to conntrack module
      in some cases (SYN filtering for example)
      
      But it lacks the ability to match the 3rd packet of TCP
      handshake (ACK coming from the client).
      
      Add a XT_SOCKET_NOWILDCARD flag to disable the wildcard mechanism.
      
      The wildcard is the legacy socket match behavior, that ignores
      LISTEN sockets bound to INADDR_ANY (or ipv6 equivalent)
      
      iptables -I INPUT -p tcp --syn -j SYN_CHAIN
      iptables -I INPUT -m socket --nowildcard -j ACCEPT
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      681f130f
  13. 23 5月, 2013 1 次提交
  14. 03 9月, 2012 1 次提交
    • P
      netfilter: xt_socket: fix compilation warnings with gcc 4.7 · 6703aa74
      Pablo Neira Ayuso 提交于
      This patch fixes compilation warnings in xt_socket with gcc-4.7.
      
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:265:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:265:9: note: ‘dport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:264:27: note: ‘saddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:264:19: note: ‘daddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_match.isra.4’:
      include/net/netfilter/nf_tproxy_core.h:75:2: warning: ‘protocol’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:113:5: note: ‘protocol’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:45: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:112:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:106:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:112:9: note: ‘dport’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:15: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:111:16: note: ‘saddr’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:15: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:111:9: note: ‘daddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:268:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:268:9: note: ‘dport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:267:27: note: ‘saddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:267:19: note: ‘daddr’ was declared here
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      6703aa74
  15. 09 5月, 2012 1 次提交
  16. 17 12月, 2011 1 次提交
  17. 04 12月, 2011 1 次提交
  18. 06 6月, 2011 1 次提交
  19. 17 2月, 2011 1 次提交
    • F
      netfilter: tproxy: do not assign timewait sockets to skb->sk · d503b30b
      Florian Westphal 提交于
      Assigning a socket in timewait state to skb->sk can trigger
      kernel oops, e.g. in nfnetlink_log, which does:
      
      if (skb->sk) {
              read_lock_bh(&skb->sk->sk_callback_lock);
              if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...
      
      in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
      is invalid.
      
      Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
      or xt_TPROXY must not assign a timewait socket to skb->sk.
      
      This does the latter.
      
      If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
      thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.
      
      The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
      listener socket.
      
      Cc: Balazs Scheidler <bazsi@balabit.hu>
      Cc: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: NFlorian Westphal <fwestphal@astaro.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      d503b30b
  20. 29 10月, 2010 1 次提交
  21. 26 10月, 2010 1 次提交
  22. 21 10月, 2010 2 次提交
  23. 08 6月, 2010 1 次提交
    • E
      netfilter: nf_conntrack: IPS_UNTRACKED bit · 5bfddbd4
      Eric Dumazet 提交于
      NOTRACK makes all cpus share a cache line on nf_conntrack_untracked
      twice per packet. This is bad for performance.
      __read_mostly annotation is also a bad choice.
      
      This patch introduces IPS_UNTRACKED bit so that we can use later a
      per_cpu untrack structure more easily.
      
      A new helper, nf_ct_untracked_get() returns a pointer to
      nf_conntrack_untracked.
      
      Another one, nf_ct_untracked_status_or() is used by nf_nat_init() to add
      IPS_NAT_DONE_MASK bits to untracked status.
      
      nf_ct_is_untracked() prototype is changed to work on a nf_conn pointer.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      5bfddbd4
  24. 12 5月, 2010 2 次提交
  25. 25 3月, 2010 1 次提交
  26. 29 10月, 2009 1 次提交
  27. 19 10月, 2009 1 次提交
    • E
      inet: rename some inet_sock fields · c720c7e8
      Eric Dumazet 提交于
      In order to have better cache layouts of struct sock (separate zones
      for rx/tx paths), we need this preliminary patch.
      
      Goal is to transfert fields used at lookup time in the first
      read-mostly cache line (inside struct sock_common) and move sk_refcnt
      to a separate cache line (only written by rx path)
      
      This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
      sport and id fields. This allows a future patch to define these
      fields as macros, like sk_refcnt, without name clashes.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c720c7e8
  28. 09 6月, 2009 1 次提交
  29. 08 12月, 2008 1 次提交
  30. 08 10月, 2008 2 次提交