1. 06 Apr, 2013 1 commit
    • netfilter: nf_log: prepare net namespace support for loggers · 30e0c6a6
      Gao feng committed
      This patch adds netns support to nf_log and it prepares netns
      support for existing loggers. It is composed of four major
      changes.
      
      1) nf_log_register has been split into two functions: nf_log_register
         and nf_log_set. The new nf_log_register is used to globally
         register the nf_logger and nf_log_set is used for enabling
         pernet support from nf_loggers.
      
         Per-netns support is not yet complete after this patch; it comes
         in separate follow-up patches.
      
      2) Add net as a parameter of nf_log_bind_pf. Per-netns support is
         not yet complete after this patch: it only allows binding the
         nf_logger to the protocol family from init_net and it skips
         other cases.
      
      3) Adapt all nf_log_packet callers to pass netns as parameter.
         After this patch, this function only works for init_net.
      
      4) Make the sysctl net/netfilter/nf_log pernet.
      Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
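The register/bind split described in items 1) and 2) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct logger`, `struct netns`, `logger_register` and `logger_bind_pf` are hypothetical names, not the kernel's nf_log API, and returning 0 when a non-initial namespace is skipped is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define PF_MAX     4   /* hypothetical: number of protocol families modeled */
#define LOGGER_MAX 8   /* hypothetical: capacity of the global registry */

struct logger {
    const char *name;
};

/* Hypothetical stand-in for a network namespace. */
struct netns {
    int is_init;                         /* 1 for the initial namespace */
    const struct logger *bound[PF_MAX];  /* active logger per protocol family */
};

static const struct logger *registry[LOGGER_MAX];
static size_t registry_len;

/* Step 1: make a logger known globally. It is not active anywhere yet. */
int logger_register(const struct logger *lg)
{
    assert(lg != NULL);
    if (registry_len == LOGGER_MAX)
        return -1;
    registry[registry_len++] = lg;
    return 0;
}

static int logger_is_registered(const struct logger *lg)
{
    for (size_t i = 0; i < registry_len; i++)
        if (registry[i] == lg)
            return 1;
    return 0;
}

/*
 * Step 2: bind a registered logger to a protocol family in one namespace.
 * Mirrors the transitional behaviour described above: only the initial
 * namespace is honoured, other namespaces are skipped. Returning 0 for the
 * skipped case is an assumption of this sketch.
 */
int logger_bind_pf(struct netns *net, int pf, const struct logger *lg)
{
    assert(net != NULL);
    if (pf < 0 || pf >= PF_MAX || !logger_is_registered(lg))
        return -1;
    if (!net->is_init)
        return 0;
    net->bound[pf] = lg;
    return 0;
}
```

Keeping registration global and binding per namespace means a logger module is loaded once, while each namespace can later choose its own active logger per protocol family.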
  2. 03 Dec, 2012 1 commit
  3. 10 Sep, 2012 2 commits
  4. 30 Aug, 2012 1 commit
  5. 05 Jul, 2012 2 commits
  6. 28 Jun, 2012 2 commits
  7. 12 Jun, 2012 1 commit
  8. 07 Jun, 2012 3 commits
  9. 17 May, 2012 1 commit
  10. 09 Apr, 2012 1 commit
  11. 02 Apr, 2012 1 commit
  12. 08 Mar, 2012 3 commits
  13. 17 Dec, 2011 1 commit
  14. 30 Aug, 2011 2 commits
  15. 01 Mar, 2011 1 commit
    • netfilter: nf_ct_tcp: fix out of sync scenario while in SYN_RECV · 8a80c79a
      Pablo Neira Ayuso committed
      This patch fixes the out of sync scenarios while in SYN_RECV state.
      
      Quoting Jozsef, what happens if we are out of sync is the
      following:
      
      > > b. conntrack entry is outdated, new SYN received
      > >    - (b1) we ignore it but save the initialization data from it
      > >    - (b2) when the reply SYN/ACK receives and it matches the saved data,
      > >      we pick up the new connection
      
      This is what should happen if we are in SYN_RECV state. Initially,
      the SYN packet hits b1, thus we save data from it. But the SYN/ACK
      packet is considered a retransmission given that we're in SYN_RECV
      state. Therefore, we never hit b2 and we don't get in sync. To fix
      this, we ignore SYN/ACK if we are in SYN_RECV. If the previous packet
      was a SYN, then we enter the ignore case that gets us in sync.
      
      This patch helps conntrackd a lot in stress scenarios (assuming a
      client that generates lots of small TCP connections). During the failover,
      consider that the new primary has injected one outdated flow in SYN_RECV
      state (this is likely to happen if the conntrack event rate is high
      because the backup will be a bit delayed from the primary). With the
      current code, if the client starts a fresh connection that matches
      the tuple, the SYN packet will be ignored without updating the state
      tracking, and the SYN+ACK in reply will be blocked as it will not pass
      checks III or IV (since all state tracking in the original direction
      is not initialized because the SYN packet was ignored and the ignore
      case that gets us in sync is not applied).
      
      I posted a couple of patches before this one. Changli Gao spotted
      a simpler way to fix this problem. This patch implements his idea.
      
      Cc: Changli Gao <xiaosuo@gmail.com>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
  16. 13 Nov, 2010 1 commit
  17. 18 Oct, 2010 1 commit
  18. 15 Jul, 2010 1 commit
  19. 16 Jun, 2010 1 commit
    • tcp: unify tcp flag macros · a3433f35
      Changli Gao committed
      Unify tcp flag macros: TCPHDR_FIN, TCPHDR_SYN, TCPHDR_RST, TCPHDR_PSH,
      TCPHDR_ACK, TCPHDR_URG, TCPHDR_ECE and TCPHDR_CWR. TCPCB_FLAG_* are replaced
      with the corresponding TCPHDR_*.
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      ----
       include/net/tcp.h                      |   24 ++++++-------
       net/ipv4/tcp.c                         |    8 ++--
       net/ipv4/tcp_input.c                   |    2 -
       net/ipv4/tcp_output.c                  |   59 ++++++++++++++++-----------------
       net/netfilter/nf_conntrack_proto_tcp.c |   32 ++++++-----------
       net/netfilter/xt_TCPMSS.c              |    4 --
       6 files changed, 58 insertions(+), 71 deletions(-)
      Signed-off-by: David S. Miller <davem@davemloft.net>
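As an illustration of how the unified macros read in use, here is a minimal userspace sketch. The macro values follow the on-wire TCP flag bit positions in byte 13 of the TCP header; the helper functions `tcp_flags`, `is_syn_only` and `is_syn_ack` are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as they appear in byte 13 of the TCP header. */
#define TCPHDR_FIN 0x01
#define TCPHDR_SYN 0x02
#define TCPHDR_RST 0x04
#define TCPHDR_PSH 0x08
#define TCPHDR_ACK 0x10
#define TCPHDR_URG 0x20
#define TCPHDR_ECE 0x40
#define TCPHDR_CWR 0x80

#define TCP_FLAGS_OFFSET 13  /* hypothetical name for the flag byte offset */

/* Hypothetical helper: read the flag byte, or 0 if the buffer is too short. */
uint8_t tcp_flags(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    return len > TCP_FLAGS_OFFSET ? hdr[TCP_FLAGS_OFFSET] : 0;
}

/* SYN set, with none of ACK, RST or FIN. */
int is_syn_only(uint8_t flags)
{
    const uint8_t mask = TCPHDR_SYN | TCPHDR_ACK | TCPHDR_RST | TCPHDR_FIN;
    return (flags & mask) == TCPHDR_SYN;
}

/* SYN and ACK set, with neither RST nor FIN. */
int is_syn_ack(uint8_t flags)
{
    const uint8_t mask = TCPHDR_SYN | TCPHDR_ACK | TCPHDR_RST | TCPHDR_FIN;
    return (flags & mask) == (TCPHDR_SYN | TCPHDR_ACK);
}
```

Having one set of names for the header bits lets the TCP stack and netfilter code test the same byte with the same constants instead of keeping parallel definitions.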
  20. 16 Feb, 2010 1 commit
  21. 03 Feb, 2010 1 commit
    • netfilter: nf_conntrack: split up IPCT_STATUS event · 858b3133
      Patrick McHardy committed
      Split up the IPCT_STATUS event into an IPCT_REPLY event, which is generated
      when the IPS_SEEN_REPLY bit is set, and an IPCT_ASSURED event, which is
      generated when the IPS_ASSURED bit is set.
      
      In combination with a following patch to support selective event delivery,
      this can be used for "sparse" conntrack replication: start replicating the
      conntrack entry only after it has reached the ASSURED state, which makes
      replication SYN-flood resistant.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
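The idea of separate REPLY and ASSURED events combined with a per-listener filter can be sketched as follows. This is a toy model, not the kernel's event code: the bit numbering and the names `status_events`, `struct listener` and `deliverable` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Status bits; the numbering is arbitrary in this sketch. */
#define STATUS_SEEN_REPLY (1u << 0)
#define STATUS_ASSURED    (1u << 1)

/* Event bits; the numbering is arbitrary in this sketch. */
#define EVENT_NEW     (1u << 0)
#define EVENT_REPLY   (1u << 1)
#define EVENT_ASSURED (1u << 2)
#define EVENT_DESTROY (1u << 3)

/* Emit one event per status bit that was newly set by this update. */
unsigned status_events(unsigned old_status, unsigned new_status)
{
    unsigned newly_set = new_status & ~old_status;
    unsigned events = 0;

    if (newly_set & STATUS_SEEN_REPLY)
        events |= EVENT_REPLY;
    if (newly_set & STATUS_ASSURED)
        events |= EVENT_ASSURED;
    return events;
}

/* Hypothetical listener that subscribes to a subset of events. */
struct listener {
    unsigned mask;
};

/* Return only the pending events this listener asked for. */
unsigned deliverable(const struct listener *l, unsigned pending)
{
    assert(l != NULL);
    return pending & l->mask;
}
```

A "sparse" replication listener would subscribe to `EVENT_ASSURED | EVENT_DESTROY` only, so half-open entries created by a SYN flood never produce traffic towards the backup.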
  22. 23 Nov, 2009 1 commit
    • netfilter: nf_ct_tcp: improve out-of-sync situation in TCP tracking · c4832c7b
      Pablo Neira Ayuso committed
      Without this patch, if we receive a SYN packet from the client while
      the firewall is out-of-sync, we let it go through. Then, if we see
      the SYN/ACK reply coming from the server, we destroy the conntrack
      entry and drop the packet to trigger a new retransmission. Then,
      the retransmission from the client is used to start a new clean
      session.
      
      This patch improves the current handling. Basically, if we see an
      unexpected SYN packet, we annotate the TCP options. Then, if we
      see the reply SYN/ACK, this means that the firewall was indeed
      out-of-sync. Therefore, we set a clean new session from the existing
      entry based on the annotated values.
      
      This patch adds two new 8-bit fields that fit in a 16-bit gap of
      the ip_ct_tcp structure.
      
      This patch is particularly useful for conntrackd since the
      asynchronous nature of the state-synchronization allows for
      backup nodes that are not perfect copies of the master. This helps
      to improve the recovery under some worst-case scenarios.
      
      I have tested this by creating lots of conntrack entries in the wrong
      state:
      
      for ((i=1024;i<65535;i++)); do conntrack -I -p tcp -s 192.168.2.101 -d 192.168.2.2 --sport $i --dport 80 -t 800 --state ESTABLISHED -u ASSURED,SEEN_REPLY; done
      
      Then, I make some TCP connections:
      
      $ echo GET / | nc 192.168.2.2 80
      
      The events show the result:
      
       [UPDATE] tcp      6 60 SYN_RECV src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
       [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
       [UPDATE] tcp      6 120 FIN_WAIT src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
       [UPDATE] tcp      6 30 LAST_ACK src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
       [UPDATE] tcp      6 120 TIME_WAIT src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
      
      and tcpdump shows no retransmissions:
      
      20:47:57.271951 IP 192.168.2.101.33221 > 192.168.2.2.www: S 435402517:435402517(0) win 5840 <mss 1460,sackOK,timestamp 4294961827 0,nop,wscale 6>
      20:47:57.273538 IP 192.168.2.2.www > 192.168.2.101.33221: S 3509927945:3509927945(0) ack 435402518 win 5792 <mss 1460,sackOK,timestamp 235681024 4294961827,nop,wscale 4>
      20:47:57.273608 IP 192.168.2.101.33221 > 192.168.2.2.www: . ack 3509927946 win 92 <nop,nop,timestamp 4294961827 235681024>
      20:47:57.273693 IP 192.168.2.101.33221 > 192.168.2.2.www: P 435402518:435402524(6) ack 3509927946 win 92 <nop,nop,timestamp 4294961827 235681024>
      20:47:57.275492 IP 192.168.2.2.www > 192.168.2.101.33221: . ack 435402524 win 362 <nop,nop,timestamp 235681024 4294961827>
      20:47:57.276492 IP 192.168.2.2.www > 192.168.2.101.33221: P 3509927946:3509928082(136) ack 435402524 win 362 <nop,nop,timestamp 235681025 4294961827>
      20:47:57.276515 IP 192.168.2.101.33221 > 192.168.2.2.www: . ack 3509928082 win 108 <nop,nop,timestamp 4294961828 235681025>
      20:47:57.276521 IP 192.168.2.2.www > 192.168.2.101.33221: F 3509928082:3509928082(0) ack 435402524 win 362 <nop,nop,timestamp 235681025 4294961827>
      20:47:57.277369 IP 192.168.2.101.33221 > 192.168.2.2.www: F 435402524:435402524(0) ack 3509928083 win 108 <nop,nop,timestamp 4294961828 235681025>
      20:47:57.279491 IP 192.168.2.2.www > 192.168.2.101.33221: . ack 435402525 win 362 <nop,nop,timestamp 235681025 4294961828>
      
      I also added a rule to log invalid packets, with no occurrences  :-) .
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
  23. 12 Nov, 2009 1 commit
    • sysctl net: Remove unused binary sysctl code · f8572d8f
      Eric W. Biederman committed
      Now that sys_sysctl is a compatibility wrapper around /proc/sys,
      all sysctl strategy routines, and all ctl_name and strategy
      entries in the sysctl tables, are unused and can be
      removed.
      
      In addition, neigh_sysctl_register has been modified to no longer
      take a strategy argument and its callers have been modified not
      to pass one.
      
      Cc: "David Miller" <davem@davemloft.net>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  24. 06 Nov, 2009 1 commit
    • netfilter: nf_nat: fix NAT issue in 2.6.30.4+ · f9dd09c7
      Jozsef Kadlecsik committed
      Vitezslav Samel discovered that since 2.6.30.4+ active FTP cannot work
      over NAT. The "cause" of the problem was a fix of unacknowledged data
      detection with NAT (commit a3a9f79e).
      However, that fix actually uncovered a long-standing bug in TCP conntrack:
      when NAT was enabled, we simply updated the max of the right edge of
      the segments we have seen (td_end), by the offset NAT produced with
      changing IP/port in the data. However, we did not update the other parameter
      (td_maxend) which is affected by the NAT offset. Thus that could drift
      away from the correct value, which resulted in breaking active FTP.
      
      The patch below fixes the issue by *not* updating the conntrack parameters
      from NAT, but instead taking into account the NAT offsets in conntrack in a
      consistent way. (Updating from NAT would be harder and more expensive because
      it'd need to re-calculate parameters we already calculated in conntrack.)
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 29 Jun, 2009 1 commit
  26. 11 Jun, 2009 1 commit
  27. 10 Jun, 2009 1 commit
  28. 03 Jun, 2009 1 commit
    • netfilter: conntrack: simplify event caching system · 17e6e4ea
      Pablo Neira Ayuso committed
      This patch simplifies the conntrack event caching system by removing
      several events:
      
       * IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO have been deleted
         since they have no clients.
       * IPCT_COUNTER_FILLING, which is a leftover of the 32-bit counter
         days.
       * IPCT_REFRESH, which is not of any use since we always include the
         timeout in the messages.
      
      After this patch, the existing events are:
      
       * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, which are used to identify
         addition and deletion of entries.
       * IPCT_STATUS, which notes that the status bits have changed,
         e.g. IPS_SEEN_REPLY and IPS_ASSURED.
       * IPCT_PROTOINFO, which reports that internal protocol information has
         changed, e.g. the TCP, DCCP and SCTP protocol state.
       * IPCT_HELPER, which reports that a helper has been assigned to or
         unassigned from this entry.
       * IPCT_MARK and IPCT_SECMARK, which report that the mark has changed;
         this covers the case when a mark is set to zero.
       * IPCT_NATSEQADJ, which reports that there are updates in the NAT
         sequence adjustment.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  29. 02 Jun, 2009 1 commit
    • netfilter: nf_ct_tcp: TCP simultaneous open support · 874ab923
      Jozsef Kadlecsik committed
      The patch below adds support for TCP simultaneous open to conntrack. The
      unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the
      second SYN sent from the reply direction in the new case. The state table
      is updated and the function tcp_in_window is modified to handle
      simultaneous open.
      
      The functionality can fairly easily be tested with socat. A sample tcpdump
      recording:
      
      23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7>
      23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0
      23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1>
      23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7>
      23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423>
      
      and the corresponding netlink events:
      
          [NEW] tcp      6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
       [UPDATE] tcp      6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
       [UPDATE] tcp      6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
       [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED]
      
      The RST packet was dropped in the raw table, thus it did not reach
      conntrack.  nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2
      state as the old unused LISTEN.
      
      With TCP simultaneous open support we satisfy REQ-2 in RFC 5382  ;-) .
      
      An additional minor correction in this patch is that, in order to catch
      uninitialized reply directions, "td_maxwin == 0" is used instead of
      "td_end == 0" because the former can't be true except in the uninitialized
      state while td_end may accidentally be equal to zero in the middle of a
      connection.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
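The state sequence shown in the netlink events above can be reproduced with a very small transition function. This is a sketch only: the real conntrack state table has many more states, packet types and invalid/ignore verdicts, and the enum and function names here are hypothetical. Accepting the final ACK from either direction is a simplification of this sketch.

```c
#include <assert.h>

enum ct_state { CT_NONE, CT_SYN_SENT, CT_SYN_SENT2, CT_SYN_RECV, CT_ESTABLISHED };
enum ct_dir   { DIR_ORIGINAL, DIR_REPLY };
enum pkt_type { PKT_SYN, PKT_SYNACK, PKT_ACK };

/* Return the next state; unknown combinations leave the state unchanged. */
enum ct_state next_state(enum ct_state s, enum ct_dir d, enum pkt_type p)
{
    assert(d == DIR_ORIGINAL || d == DIR_REPLY);

    switch (s) {
    case CT_NONE:
        if (d == DIR_ORIGINAL && p == PKT_SYN)
            return CT_SYN_SENT;
        break;
    case CT_SYN_SENT:
        /* Simultaneous open: a second SYN arrives from the reply side. */
        if (d == DIR_REPLY && p == PKT_SYN)
            return CT_SYN_SENT2;
        /* Ordinary three-way handshake. */
        if (d == DIR_REPLY && p == PKT_SYNACK)
            return CT_SYN_RECV;
        break;
    case CT_SYN_SENT2:
        /* The original side answers the second SYN with SYN/ACK. */
        if (d == DIR_ORIGINAL && p == PKT_SYNACK)
            return CT_SYN_RECV;
        break;
    case CT_SYN_RECV:
        if (p == PKT_ACK)
            return CT_ESTABLISHED;
        break;
    default:
        break;
    }
    return s;
}
```

Feeding it the packets from the tcpdump recording (SYN from the original side, SYN from the reply side, SYN/ACK from the original side, ACK from the reply side) walks through SYN_SENT, SYN_SENT2, SYN_RECV and ESTABLISHED, where SYN_SENT2 is what the unpatched nfnetlink_conntrack prints as LISTEN.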
  30. 25 May, 2009 1 commit
    • netfilter: nf_ct_tcp: fix accepting invalid RST segments · bfcaa502
      Jozsef Kadlecsik committed
      Robert L Mathews discovered that some clients send evil TCP RST segments,
      which are accepted by netfilter conntrack but discarded by the
      destination. Thus the conntrack entry is destroyed but the destination
      retransmits data until timeout.
      
      The same technique, i.e. sending properly crafted RST segments, can easily
      be used to bypass connlimit/connbytes based restrictions (the sample
      script written by Robert can be found in the netfilter mailing list
      archives).
      
      The patch below adds a new flag and a new field to struct ip_ct_tcp_state so
      that checking RST segments can be made more strict and thus TCP conntrack
      can catch the invalid ones: the RST segment is accepted only if its
      sequence number is higher than or equal to the highest ack we have seen from
      the other direction. (The last_ack field cannot be reused because it is used
      to catch resent packets.)
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
  31. 26 Mar, 2009 1 commit
  32. 23 Mar, 2009 1 commit