1. 26 Mar 2009, 8 commits
  2. 24 Mar 2009, 1 commit
  3. 23 Mar 2009, 2 commits
  4. 19 Mar 2009, 3 commits
  5. 17 Mar 2009, 1 commit
    • netfilter: xtables: add cluster match · 0269ea49
      Pablo Neira Ayuso authored
      This patch adds the iptables cluster match. This match can be used
      to deploy gateway and back-end load-sharing clusters. A cluster
      can be composed of 32 nodes at most (although I have only tested
      this with two nodes, so I cannot tell what the real scalability
      limit of this solution is in terms of cluster nodes).
      
      Assuming that all the nodes see all packets (see below for an
      example on how to do that if your switch does not allow this), the
      cluster match decides if this node has to handle a packet given:
      
      	(jhash(source IP) % total_nodes) & node_mask
      
      For related connections, the master conntrack is used. The following
      is an example of its use to deploy a gateway cluster composed of two
      nodes (where this is the node 1):
      
      iptables -I PREROUTING -t mangle -i eth1 -m cluster \
      	--cluster-total-nodes 2 --cluster-local-node 1 \
      	--cluster-proc-name eth1 -j MARK --set-mark 0xffff
      iptables -A PREROUTING -t mangle -i eth1 \
      	-m mark ! --mark 0xffff -j DROP
      iptables -A PREROUTING -t mangle -i eth2 -m cluster \
      	--cluster-total-nodes 2 --cluster-local-node 1 \
      	--cluster-proc-name eth2 -j MARK --set-mark 0xffff
      iptables -A PREROUTING -t mangle -i eth2 \
      	-m mark ! --mark 0xffff -j DROP
      
      And the following commands to make all nodes see the same packets:
      
      ip maddr add 01:00:5e:00:01:01 dev eth1
      ip maddr add 01:00:5e:00:01:02 dev eth2
      arptables -I OUTPUT -o eth1 --h-length 6 \
      	-j mangle --mangle-mac-s 01:00:5e:00:01:01
      arptables -I INPUT -i eth1 --h-length 6 \
      	--destination-mac 01:00:5e:00:01:01 \
      	-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
      arptables -I OUTPUT -o eth2 --h-length 6 \
      	-j mangle --mangle-mac-s 01:00:5e:00:01:02
      arptables -I INPUT -i eth2 --h-length 6 \
      	--destination-mac 01:00:5e:00:01:02 \
      	-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
      
      In the case of TCP connections, the conntrack pickup facility has
      to be disabled to avoid marking TCP ACK packets coming in the
      reply direction as valid:
      
      echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
      
      BTW, some final notes:
      
       * This match mangles the skbuff pkt_type in case it detects
      PACKET_MULTICAST for a non-multicast address. This could instead be
      done by a dedicated PKTTYPE target for this sole purpose.
       * This match supersedes the CLUSTERIP target.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      0269ea49
  6. 16 Mar 2009, 8 commits
  7. 24 Feb 2009, 2 commits
  8. 20 Feb 2009, 4 commits
  9. 19 Feb 2009, 4 commits
  10. 18 Feb 2009, 2 commits
  11. 10 Feb 2009, 3 commits
  12. 05 Feb 2009, 1 commit
    • net: Partially allow skb destructors to be used on receive path · 9a279bcb
      Herbert Xu authored
      As it currently stands, skb destructors are forbidden on the
      receive path because the protocol end-points will overwrite
      any existing destructor with their own.
      
      This is the reason why we have to call skb_orphan in the loopback
      driver before we reinject the packet back into the stack, thus
      creating a period during which loopback traffic isn't charged
      to any socket.
      
      With virtualisation, we have a similar problem in that traffic
      is reinjected into the stack without being associated with any
      socket entity, thus providing no natural congestion push-back
      for those poor folks still stuck with UDP.
      
      Now had we been consistent in telling them that UDP simply has
      no congestion feedback, I could just fob them off.  Unfortunately,
      we appear to have gone to some length in catering for this on
      the standard UDP path, with skb/socket accounting, and that has
      created a very unhealthy dependency.
      
      Alas habits are difficult to break out of, so we may just have
      to allow skb destructors on the receive path.
      
      It turns out that making skb destructors useable on the receive path
      isn't as easy as it seems.  For instance, simply adding skb_orphan
      to skb_set_owner_r isn't enough.  This is because we assume all
      over the IP stack that skb->sk is an IP socket if present.
      
      The new transparent proxy code goes one step further and assumes
      that skb->sk is the receiving socket if present.
      
      Now all of this can be dealt with by adding simple checks such
      as only treating skb->sk as an IP socket if skb->sk->sk_family
      matches.  However, it turns out that for bridging at least we
      don't need to do all of this work.
      
      This is of interest because most virtualisation setups use bridging
      so we don't actually go through the IP stack on the host (with
      the exception of our old nemesis the bridge netfilter, but that's
      easily taken care of).
      
      So this patch simply adds skb_orphan to the point just before we
      enter the IP stack, but after we've gone through the bridge on the
      receive path.  It also adds an skb_orphan to the one place in
      netfilter that touches skb->sk/skb->destructor, that is, tproxy.
      
      One word of caution, because of the internal code structure, anyone
      wishing to deploy this must use skb_set_owner_w as opposed to
      skb_set_owner_r since many functions that create a new skb from
      an existing one will invoke skb_set_owner_w on the new skb.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a279bcb
  13. 01 Feb 2009, 1 commit