1. 02 Mar, 2011 2 commits
  2. 17 Feb, 2011 1 commit
    • netfilter: tproxy: do not assign timewait sockets to skb->sk · d503b30b
      Florian Westphal committed
      Assigning a socket in timewait state to skb->sk can trigger a
      kernel oops, e.g. in nfnetlink_log, which does:
      
      if (skb->sk) {
              read_lock_bh(&skb->sk->sk_callback_lock);
              if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...
      
      In the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
      is invalid.
      
      Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
      or xt_TPROXY must not assign a timewait socket to skb->sk.
      
      This patch does the latter.
      
      If a TW socket is found, assign the tproxy nfmark but skip the skb->sk assignment,
      thus mimicking the behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.
      
      The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
      listener socket.
      
      Cc: Balazs Scheidler <bazsi@balabit.hu>
      Cc: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: Florian Westphal <fwestphal@astaro.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
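The behaviour described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `mock_sock`, `mock_skb`, the `ST_*` states and `tproxy_assign()` are hypothetical stand-ins, not the kernel's types or the actual xt_TPROXY code. The sketch applies the mark in every case and attaches the socket only when it is not in TIME_WAIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; only the fields used here. */
enum sock_state { ST_ESTABLISHED, ST_LISTEN, ST_TIME_WAIT };

struct mock_sock {
    enum sock_state state;
};

struct mock_skb {
    struct mock_sock *sk;   /* socket attached to the packet, or NULL */
    uint32_t mark;          /* packet mark used for re-routing */
};

/*
 * Always apply the tproxy mark, but attach the socket to the packet
 * only if it is a full socket (i.e. not in TIME_WAIT).
 */
static void tproxy_assign(struct mock_skb *skb, struct mock_sock *sk,
                          uint32_t mark_value, uint32_t mark_mask)
{
    assert(skb != NULL && sk != NULL);

    skb->mark = (skb->mark & ~mark_mask) ^ mark_value;

    if (sk->state != ST_TIME_WAIT)
        skb->sk = sk;
}
```

With this shape, code that later dereferences `skb->sk` never sees a timewait socket, while routing by mark still works for such packets.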
  3. 15 Feb, 2011 1 commit
  4. 09 Feb, 2011 1 commit
    • netfilter: nf_conntrack: set conntrack templates again if we return NF_REPEAT · c3174286
      Pablo Neira Ayuso committed
      The TCP tracking code has a special case that allows it to return
      NF_REPEAT if we receive a new SYN packet while in TIME_WAIT state.
      
      In this situation, the TCP tracking code destroys the existing
      conntrack to start a new clean session.
      
      [DESTROY] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925 [ASSURED]
          [NEW] tcp      6 120 SYN_SENT src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925
      
      However, this is a problem for the iptables CT target's event filtering,
      which will not work in this case because the conntrack template will not
      be there for the new session. To fix this, we reassign the conntrack
      template to the packet if we return NF_REPEAT.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
  5. 01 Feb, 2011 1 commit
    • netfilter: ecache: always set events bits, filter them later · 3db7e93d
      Pablo Neira Ayuso committed
      For the following rule:
      
      iptables -I PREROUTING -t raw -j CT --ctevents assured
      
      The event delivered looks like the following:
      
       [UPDATE] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
      
      Note that the TCP protocol state is not included. For that reason
      the CT event filtering is not very useful for conntrackd.
      
      To resolve this issue, instead of conditionally setting the CT event
      bits based on the ctmask, we always set them and perform the filtering
      at a late stage, just before delivery.
      
      Thus, the event delivered looks like the following:
      
       [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
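The "set always, filter late" idea can be shown with a small bitmask model. This is a sketch, not the conntrack ecache code: `mock_ecache`, the `EV_*` bits, `cache_event()` and `deliver_events()` are hypothetical names. It assumes the mask only decides whether a delivery happens, and that a delivery then carries every pending bit, which is how protocol state can accompany an `assured` event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits, loosely modelled on conntrack event types. */
enum {
    EV_NEW       = 1u << 0,
    EV_ASSURED   = 1u << 1,
    EV_PROTOINFO = 1u << 2
};

struct mock_ecache {
    uint32_t pending;  /* every event recorded since the last delivery */
    uint32_t ctmask;   /* events the user asked for, e.g. via --ctevents */
};

/* Record an event unconditionally; no filtering happens at this point. */
static void cache_event(struct mock_ecache *e, uint32_t event)
{
    assert(e != NULL);
    e->pending |= event;
}

/*
 * Late filtering: the mask decides only whether anything is delivered.
 * If it matches, the full set of pending bits is returned; otherwise 0.
 * The pending set is cleared in both cases.
 */
static uint32_t deliver_events(struct mock_ecache *e)
{
    uint32_t events;

    assert(e != NULL);
    events = e->pending;
    e->pending = 0;

    if (!(events & e->ctmask))
        return 0;
    return events;
}
```

Filtering at record time would instead have dropped `EV_PROTOINFO` before it ever reached the listener.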
  6. 25 Jan, 2011 2 commits
  7. 20 Jan, 2011 1 commit
  8. 14 Jan, 2011 1 commit
  9. 12 Jan, 2011 1 commit
  10. 11 Jan, 2011 1 commit
  11. 10 Jan, 2011 1 commit
  12. 07 Jan, 2011 2 commits
  13. 15 Dec, 2010 1 commit
    • workqueue: convert cancel_rearming_delayed_work[queue]() users to cancel_delayed_work_sync() · afe2c511
      Tejun Heo committed
      cancel_rearming_delayed_work[queue]() has been superseded by
      cancel_delayed_work_sync() quite some time ago.  Convert all the
      in-kernel users.  The conversions are completely equivalent and
      trivial.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: netdev@vger.kernel.org
      Cc: Anton Vorontsov <cbou@mail.ru>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: xfs-masters@oss.sgi.com
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: netfilter-devel@vger.kernel.org
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: linux-nfs@vger.kernel.org
  14. 19 Nov, 2010 1 commit
  15. 18 Nov, 2010 2 commits
  16. 12 Nov, 2010 1 commit
  17. 30 Oct, 2010 1 commit
  18. 29 Oct, 2010 1 commit
  19. 28 Oct, 2010 1 commit
  20. 26 Oct, 2010 1 commit
  21. 21 Oct, 2010 16 commits
    • tproxy: use the interface primary IP address as a default value for --on-ip · cc6eb433
      Balazs Scheidler committed
      The REDIRECT target and the older TProxy versions used the primary address
      of the incoming interface as the default value of the --on-ip parameter.
      This was unintentionally changed during the initial TProxy submission and
      caused confusion among users.
      
      Since IPv6 has no notion of primary address, we just select the first address
      on the list: this way the socket lookup finds wildcard bound sockets
      properly and we cannot really do better without the user telling us the
      IPv6 address of the proxy.
      
      This is implemented for both IPv4 and IPv6.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • tproxy: added IPv6 support to the socket match · b64c9256
      Balazs Scheidler committed
      The ICMP extraction bits were contributed by Harry Mason.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • tproxy: added IPv6 support to the TPROXY target · 6ad78893
      Balazs Scheidler committed
      This requires a new revision as the old target structure was
      IPv4 specific.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4() · 6006db84
      Balazs Scheidler committed
      Also, inline this function as the lookup_type is always a literal
      and inlining removes branches performed at runtime.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple · 106e4c26
      Balazs Scheidler committed
      Without tproxy redirections an incoming SYN kicks out conflicting
      TIME_WAIT sockets, in order to handle clients that reuse ports
      within the TIME_WAIT period.
      
      The same mechanism didn't work in case TProxy is involved in finding
      the proper socket, as the time_wait processing code looked up the
      listening socket assuming that the listener addr/port matches those
      of the established connection.
      
      This is not the case with TProxy as the listener addr/port is possibly
      changed with the tproxy rule.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • ipvs: provide address family for debugging · 0d79641a
      Julian Anastasov committed
      As skb->protocol is not valid in LOCAL_OUT, add a
      parameter for the address family to the packet debugging functions.
      Even though ports are not present in AH and ESP, change them to
      use ip_vs_tcpudp_debug_packet to show at least valid addresses
      as before. This patch removes the last user of skb->protocol
      in IPVS.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: inherit forwarding method in backup · 3233759b
      Julian Anastasov committed
      Connections in the backup server should inherit the
      forwarding method from the real server. This fixes a
      problem where the forwarding method in a backup connection
      is damaged by a bitwise OR operation with the real server's
      connection flags. The change is also needed for setups
      where the backup server uses a different forwarding method
      for the same real servers.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
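Why OR-ing flags can damage the forwarding method is easy to see once the method is treated as a small enumerated field rather than independent bits. The sketch below uses the `IP_VS_CONN_F_*` values as I recall them from the kernel's `ip_vs.h`; treat the exact values as an assumption. The helpers `merge_flags_or()` and `merge_flags_inherit()` are hypothetical names for illustration, not functions from IPVS.

```c
#include <assert.h>
#include <stdint.h>

/* Forwarding-method field; values assumed, recalled from ip_vs.h. */
#define IP_VS_CONN_F_FWD_MASK  0x0007u
#define IP_VS_CONN_F_MASQ      0x0000u
#define IP_VS_CONN_F_LOCALNODE 0x0001u
#define IP_VS_CONN_F_TUNNEL    0x0002u
#define IP_VS_CONN_F_DROUTE    0x0003u
#define IP_VS_CONN_F_BYPASS    0x0004u

/* Naive merge: OR the real server's flags into the connection's flags. */
static uint32_t merge_flags_or(uint32_t conn_flags, uint32_t dest_flags)
{
    return conn_flags | dest_flags;
}

/*
 * Inherit: clear the connection's forwarding method first, then take
 * the method from the real server. Other connection bits are kept.
 */
static uint32_t merge_flags_inherit(uint32_t conn_flags, uint32_t dest_flags)
{
    uint32_t method = dest_flags & IP_VS_CONN_F_FWD_MASK;

    assert(method <= IP_VS_CONN_F_BYPASS);
    return (conn_flags & ~IP_VS_CONN_F_FWD_MASK) | dest_flags;
}
```

For example, a synced connection carrying TUNNEL (2) merged by OR with a real server configured as LOCALNODE (1) ends up as 3, which reads as DROUTE: a method neither side asked for. Clearing the field before merging avoids that.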
    • ipvs: changes for local client · cb59155f
      Julian Anastasov committed
      This patch deals with local client processing.
      
      Prefer the LOCAL_OUT hook for scheduling connections from
      local clients. LOCAL_IN is still supported if the packets are
      not marked as processed in LOCAL_OUT. The idea behind processing
      requests in LOCAL_OUT is to alter the conntrack reply before
      it is confirmed at POST_ROUTING. If the local requests are
      processed in LOCAL_IN, the conntrack cannot be updated
      and matching by state is impossible.
      
      Add the following handlers:
      
      - ip_vs_reply[46] at LOCAL_IN:99 to process replies from
      remote real servers to local clients. Now that both
      replies from remote real servers (ip_vs_reply*) and
      local real servers (ip_vs_local_reply*) are handled,
      it is safe to remove the conn_out_get call from ip_vs_in
      because it does not support related ICMP packets.
      
      - ip_vs_local_request[46] at LOCAL_OUT:-98 to process
      requests from local client
      
      Handling in LOCAL_OUT causes some changes:
      
      - as skb->dev, skb->protocol and skb->pkt_type are not defined
      in LOCAL_OUT, make sure we set skb->dev before calling icmpv6_send,
      prefer skb_dst(skb) for struct net and remove the skb->protocol
      checks from TUN transmitters.
      
      [ horms@verge.net.au: removed trailing whitespace ]
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: changes for local real server · fc604767
      Julian Anastasov committed
      This patch deals with local real servers:
      
      - Add support for DNAT to local address (different real server port).
      It needs ip_vs_out hook in LOCAL_OUT for both families because
      skb->protocol is not set for locally generated packets and can not
      be used to set 'af'.
      
      - Skip packets in ip_vs_in marked with skb->ipvs_property because
      ip_vs_out processing can be executed in LOCAL_OUT but we still
      have the conn_out_get check in ip_vs_in.
      
      - Ignore packets with inet->nodefrag from local stack
      
      - Require skb_dst(skb) != NULL because we use it to get struct net
      
      - Add support for changing the route to local IPv4 stack after DNAT
      depending on the source address type. Local client sets output
      route and the remote client sets input route. It looks like
      IPv6 does not need such rerouting because the replies use
      addresses from initial incoming header, not from skb route.
      
      - All transmitters now have strict checks for the destination
      address type: redirect from non-local address to local real
      server requires NAT method, local address can not be used as
      source address when talking to remote real server.
      
      - Now LOCALNODE is not set explicitly as the forwarding
      method in the real server, to allow the connections to provide
      the correct forwarding method to the backup server. Not sure if
      this breaks tools that expect to see the 'Local' real server type.
      If needed, this can be supported with a new flag IP_VS_DEST_F_LOCAL.
      It should now be possible for connections in the backup that lost
      their fwmark information during sync to be forwarded properly
      to their daddr, even if it is a local address in the backup server.
      This way the backup could be used as a real server for DR or TUN;
      for NAT there are some restrictions because tuple collisions
      in conntracks can create problems for the traffic.
      
      - Call ip_vs_dst_reset when destination is updated in case
      some real server IP type is changed between local and remote.
      
      [ horms@verge.net.au: removed trailing whitespace ]
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: move ip_route_me_harder for ICMP · f5a41847
      Julian Anastasov committed
      Currently, ip_route_me_harder after ip_vs_out_icmp
      is called even if the packet is not related to an IPVS connection.
      Move it into handle_response_icmp. Also, force rerouting
      if sending to a local client because the IPv4 stack uses addresses
      from the route.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: create ip_vs_defrag_user · 1ca5bb54
      Julian Anastasov committed
      Create a new function ip_vs_defrag_user to return the correct
      IP_DEFRAG_xxx user depending on the hooknum. It will be needed
      when we add handlers in LOCAL_OUT.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
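A helper of this kind is essentially a small mapping from hook number to defragmentation user. The sketch below shows the shape only. The `HOOK_*` and `DEFRAG_VS_*` enums and `defrag_user_for_hook()` are hypothetical stand-ins, and the particular hook-to-user mapping is an assumption; the commit message only states that the result depends on the hooknum.

```c
#include <assert.h>

/* Hypothetical stand-ins for the netfilter hook numbers. */
enum hook {
    HOOK_PRE_ROUTING,
    HOOK_LOCAL_IN,
    HOOK_FORWARD,
    HOOK_LOCAL_OUT,
    HOOK_POST_ROUTING,
    HOOK_MAX
};

/* Hypothetical stand-ins for the IP_DEFRAG_xxx users. */
enum defrag_user {
    DEFRAG_VS_IN,
    DEFRAG_VS_OUT,
    DEFRAG_VS_FWD
};

/* Pick a defrag user per hook (mapping assumed for illustration). */
static enum defrag_user defrag_user_for_hook(enum hook hooknum)
{
    assert(hooknum < HOOK_MAX);

    switch (hooknum) {
    case HOOK_LOCAL_IN:
        return DEFRAG_VS_IN;
    case HOOK_FORWARD:
        return DEFRAG_VS_FWD;
    default:
        return DEFRAG_VS_OUT;
    }
}
```

Centralising the choice in one function means a handler added at a new hook, such as LOCAL_OUT, does not need its own copy of the selection logic.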
    • ipvs: fix CHECKSUM_PARTIAL for TUN method · 4256f1aa
      Julian Anastasov committed
      The recent change in IP_VS_XMIT_TUNNEL to set
      CHECKSUM_NONE is not correct. After adding the IPIP header,
      skb->csum becomes invalid, but the CHECKSUM_PARTIAL
      case must be supported. So, use skb_forward_csum(), which is
      most suitable for us, to allow local clients to send IPIP
      to a remote real server.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: stop ICMP from FORWARD to local · 489fdeda
      Julian Anastasov committed
      Delivering ICMP locally from the FORWARD hook is not supported.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: do not schedule conns from real servers · 190ecd27
      Julian Anastasov committed
      This patch is needed to avoid scheduling of
      packets from a local real server when we add ip_vs_in
      to the LOCAL_OUT hook to support local clients.
      
      Currently, when ip_vs_in cannot find an existing
      connection, it tries to create a new one by calling ip_vs_schedule.
      
      The default indication from ip_vs_schedule was whether the
      connection was scheduled to a real server. If the real server is
      not available, we try to use the bypass forwarding method
      or to send an ICMP error. But in some cases we do not want to use
      the bypass feature. So, add a flag 'ignored' to indicate whether
      the scheduler ignores this packet.
      
      Make sure we do not create new connections from replies.
      We can hit this problem for persistent services and a local real
      server when ip_vs_in is added to the LOCAL_OUT hook to handle
      local clients.
      
      Also, make sure ip_vs_schedule ignores SYN packets
      for Active FTP DATA from a local real server. The FTP DATA
      connection should be created on SYN+ACK from the client to assign
      the correct connection daddr.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: switch to notrack mode · cf356d69
      Julian Anastasov committed
      Change the skb->ipvs_property semantics. This is preparation
      to support ip_vs_out processing in LOCAL_OUT. ipvs_property=1
      will be used to avoid expensive lookups for traffic sent by
      transmitters. Now, when conntrack support is not used, we call the
      ip_vs_notrack method to avoid problems in the OUTPUT and
      POST_ROUTING hooks instead of exiting POST_ROUTING as before.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • ipvs: optimize checksums for apps · 8b27b10f
      Julian Anastasov committed
      Avoid full checksum calculation for apps that can provide
      info on whether the csum was broken after payload mangling. For now only
      ip_vs_ftp mangles the payload and it updates the csum, so the full
      recalculation is avoided for all packets.
      
      Add CHECKSUM_UNNECESSARY for snat_handler (TCP and UDP).
      It is needed to support SNAT from a local address for the case
      when the csum is fully recalculated.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>