  1. 27 Oct, 2008 1 commit
  2. 24 Oct, 2008 1 commit
    • tcp: Restore ordering of TCP options for the sake of inter-operability · fd6149d3
      Committed by Ilpo Järvinen
      This is not our bug! Sadly some devices cannot cope with the change
      of TCP option ordering which was a result of the recent rewrite of
      the option code (not that there was some particular reason stemming
      from the rewrite for the reordering) though any ordering of TCP
      options is perfectly legal. Thus we restore the original ordering
      to allow interoperability with/through such broken devices and add
      some warning about this trap. Since the reordering just happened
      without any particular reason, this change shouldn't cost us
      anything.
      
      There are already a couple of known failure reports (within close
      proximity of the last release), so the problem might be more
      widespread than a single device. There are also other reports which
      may be due to the same problem, though the symptoms were less obvious.
      Analysis of one of the cases revealed (with very high probability)
      that sack capability cannot be negotiated as the first option
      (SYN never got a response).
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Reported-by: Aldo Maggi <sentiniate@tiscali.it>
      Tested-by: Aldo Maggi <sentiniate@tiscali.it>
      Signed-off-by: David S. Miller <davem@davemloft.net>
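
The ordering concern in the commit above can be illustrated with a minimal userspace sketch. It assumes the restored layout puts MSS first, then SACK-permitted, then window scale, and it leaves out timestamps and MD5 for brevity. The names `syn_opts` and `build_syn_options` are hypothetical and are not kernel identifiers; only the option kind and length values come from the TCP specifications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TCP option kinds as assigned by the TCP specifications. */
enum { OPT_NOP = 1, OPT_MSS = 2, OPT_WSCALE = 3, OPT_SACK_PERM = 4 };

/* Hypothetical description of what a SYN should advertise. */
struct syn_opts {
    uint16_t mss;
    int      use_sack;    /* advertise SACK-permitted */
    int      use_wscale;  /* advertise window scaling */
    uint8_t  wscale;
};

/*
 * Write options in the assumed legacy order: MSS first, then
 * SACK-permitted (padded with two NOPs), then window scale.
 * Returns the number of bytes written; buf must hold 12 bytes.
 */
static size_t build_syn_options(const struct syn_opts *o, uint8_t *buf, size_t cap)
{
    size_t n = 0;

    assert(cap >= 12);

    buf[n++] = OPT_MSS;
    buf[n++] = 4;
    buf[n++] = (uint8_t)(o->mss >> 8);
    buf[n++] = (uint8_t)(o->mss & 0xff);

    if (o->use_sack) {
        buf[n++] = OPT_NOP;
        buf[n++] = OPT_NOP;
        buf[n++] = OPT_SACK_PERM;
        buf[n++] = 2;
    }
    if (o->use_wscale) {
        buf[n++] = OPT_NOP;
        buf[n++] = OPT_WSCALE;
        buf[n++] = 3;
        buf[n++] = o->wscale;
    }
    return n;
}
```

The point of the sketch is only that SACK-permitted is never the first option on the wire, which is the case the failure analysis identified as problematic for some devices.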
  3. 22 Oct, 2008 1 commit
    • tcp: should use number of sack blocks instead of -1 · 75e3d8db
      Committed by Ilpo Järvinen
      While looking for the recent "sack issue" I also read all eff_sacks
      usage that was played around by some relevant commit. I found
      out that there's another thing that is asking for a fix (unrelated
      to the "sack issue" though).
      
      This feature probably has very little significance in practice.
      An opposite-direction timeout with bidirectional TCP strikes me as
      the most likely scenario, though there might be other cases as
      well related to non-data segments we send (e.g., a response to an
      opposite-direction segment). Also, some ACK losses or option space
      used for other purposes are necessary to prevent the earlier
      SACK feedback from getting to the sender.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 20 Oct, 2008 2 commits
  5. 17 Oct, 2008 3 commits
  6. 15 Oct, 2008 2 commits
    • netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat · e6a7d3c0
      Committed by Pablo Neira Ayuso
      This patch removes the module dependency between ctnetlink and
      nf_nat by means of an indirect call that is initialized when
      nf_nat is loaded. Now, nf_conntrack_netlink only requires
      nf_conntrack and nfnetlink.
      
      This patch puts nfnetlink_parse_nat_setup_hook into the
      nf_conntrack_core to avoid dependencies between ctnetlink,
      nf_conntrack_ipv4 and nf_conntrack_ipv6.
      
      This patch also introduces the function ctnetlink_change_nat
      that is only invoked from the creation path. Actually, the
      nat handling cannot be invoked from the update path since
      this is not allowed. By introducing this function, we remove
      the useless nat handling in the update path and we avoid
      deadlock-prone code.
      
      This patch also adds the required EAGAIN logic for nfnetlink.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
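
The indirect-call approach described in the commit above can be sketched in plain C. This is a simplified userspace illustration under stated assumptions, not the kernel code: `nat_setup_fn`, `nat_module_register`, `change_nat` and `demo_nat_setup` are hypothetical names, and the real hook takes different arguments and is protected by kernel locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook type; the real kernel signature differs. */
typedef int (*nat_setup_fn)(int conn_id);

/* Set by the NAT module when it loads, NULL otherwise. */
static nat_setup_fn nat_setup_hook = NULL;

static void nat_module_register(nat_setup_fn fn)
{
    assert(fn != NULL);
    nat_setup_hook = fn;
}

static void nat_module_unregister(void)
{
    nat_setup_hook = NULL;
}

/*
 * Creation-path helper: call through the hook if it is set.
 * Returns -EAGAIN when the NAT code is not loaded, so the caller
 * can arrange for it to be loaded and retry the request.
 */
static int change_nat(int conn_id)
{
    if (nat_setup_hook == NULL)
        return -EAGAIN;
    return nat_setup_hook(conn_id);
}

/* Stand-in NAT implementation used only for illustration. */
static int demo_nat_setup(int conn_id)
{
    return conn_id >= 0 ? 0 : -EINVAL;
}
```

Because the caller only holds a function pointer, it has no link-time dependency on the module that provides the implementation.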
    • netfilter: restore lost #ifdef guarding defrag exception · 38f7ac3e
      Committed by Patrick McHardy
      Nir Tzachar <nir.tzachar@gmail.com> reported a warning when sending
      fragments over loopback with NAT:
      
      [ 6658.338121] WARNING: at net/ipv4/netfilter/nf_nat_standalone.c:89 nf_nat_fn+0x33/0x155()
      
      The reason is that defragmentation is skipped for already tracked connections.
      This is wrong in combination with NAT and ip_conntrack actually had some ifdefs
      to avoid this behaviour when NAT is compiled in.
      
      The entire "optimization" may seem a bit silly; for now, simply restoring the
      lost #ifdef is the easiest solution until we can come up with something better.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 14 Oct, 2008 1 commit
  8. 12 Oct, 2008 1 commit
  9. 11 Oct, 2008 2 commits
  10. 10 Oct, 2008 11 commits
    • cipso: Add support for native local labeling and fixup mapping names · 15c45f7b
      Committed by Paul Moore
      This patch accomplishes three minor tasks: add a new tag type for local
      labeling, rename the CIPSO_V4_MAP_STD define to CIPSO_V4_MAP_TRANS and
      replace some of the CIPSO "magic numbers" with constants from the header
      file.  The first change allows CIPSO to support full LSM labels/contexts,
      not just MLS attributes.  The second change brings the mapping names in line
      with what userspace is using; compatibility is preserved since we don't
      actually change the value.  The last change is to aid readability and help
      prevent mistakes.
      Signed-off-by: Paul Moore <paul.moore@hp.com>
    • selinux: Set socket NetLabel based on connection endpoint · 014ab19a
      Committed by Paul Moore
      Previous work enabled the use of address based NetLabel selectors, which while
      highly useful, brought the potential for additional per-packet overhead when
      used.  This patch attempts to solve that by applying NetLabel socket labels
      when sockets are connect()'d.  This should alleviate the per-packet NetLabel
      labeling for all connected sockets (yes, it even works for connected DGRAM
      sockets).
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Reviewed-by: James Morris <jmorris@namei.org>
    • netlabel: Add functionality to set the security attributes of a packet · 948bf85c
      Committed by Paul Moore
      This patch builds upon the new NetLabel address selector functionality by
      providing the NetLabel KAPI and CIPSO engine support needed to enable the
      new packet-based labeling.  The only new addition to the NetLabel KAPI at
      this point is shown below:
      
       * int netlbl_skbuff_setattr(skb, family, secattr)
      
      ... and is designed to be called from a Netfilter hook after the packet's
      IP header has been populated such as in the FORWARD or LOCAL_OUT hooks.
      
      This patch also provides the necessary SELinux hooks to support this new
      functionality.  Smack support is not currently included due to uncertainty
      regarding the permissions needed to expand the Smack network access controls.
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Reviewed-by: James Morris <jmorris@namei.org>
    • netlabel: Replace protocol/NetLabel linking with reference counts · b1edeb10
      Committed by Paul Moore
      NetLabel has always had a list of backpointers in the CIPSO DOI definition
      structure which pointed to the NetLabel LSM domain mapping structures which
      referenced the CIPSO DOI struct.  The rationale for this was that when an
      administrator removed a CIPSO DOI from the system all of the associated
      NetLabel LSM domain mappings should be removed as well; a list of
      backpointers made this a simple operation.
      
      Unfortunately, while the backpointers did make the removal easier they were
      a bit of a mess from an implementation point of view which was making
      further development difficult.  Since the removal of a CIPSO DOI is a
      relatively rare event it seems to make sense to remove this backpointer
      list as the optimization was hurting us more than it was helping.  However,
      we still need to be able to track when a CIPSO DOI definition is being used
      so replace the backpointer list with a reference count.  In order to
      preserve the current functionality of removing the associated LSM domain
      mappings when a CIPSO DOI is removed we walk the LSM domain mapping table,
      removing the relevant entries.
      Signed-off-by: Paul Moore <paul.moore@hp.com>
      Reviewed-by: James Morris <jmorris@namei.org>
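
The trade-off described in the commit above (a reference count instead of a backpointer list, plus a full table walk on removal) can be sketched as follows. This is a single-threaded illustration under stated assumptions: `doi_def`, `dom_map`, `map_add` and `doi_remove` are hypothetical names, and the kernel version uses atomic counters and its own locking rather than plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the DOI definition and
 * the LSM domain mapping entries that refer to it. */
struct doi_def {
    unsigned int doi;
    int          refcount;  /* how many mappings use this DOI */
};

struct dom_map {
    const char     *domain;
    struct doi_def *def;
    int             valid;
};

static void doi_get(struct doi_def *d)
{
    d->refcount++;
}

static void doi_put(struct doi_def *d)
{
    assert(d->refcount > 0);
    d->refcount--;
}

/* Fill in a mapping slot and take a reference on its DOI. */
static void map_add(struct dom_map *m, const char *domain, struct doi_def *d)
{
    m->domain = domain;
    m->def = d;
    m->valid = 1;
    doi_get(d);
}

/*
 * Remove a DOI: with no backpointer list, walk the whole mapping
 * table and drop every entry that refers to it.
 * Returns the number of mappings removed.
 */
static int doi_remove(struct dom_map *table, size_t n, struct doi_def *d)
{
    int removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (table[i].valid && table[i].def == d) {
            table[i].valid = 0;
            table[i].def = NULL;
            doi_put(d);
            removed++;
        }
    }
    return removed;
}
```

Removal becomes linear in the size of the mapping table, which is acceptable here because, as the commit notes, removing a DOI is a rare event.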
    • udp: complete port availability checking · f24d43c0
      Committed by Eric Dumazet
      While looking at UDP port randomization, I noticed it
      was a little bit pessimistic, not looking at the type of sockets
      (IPV6/IPV4) and not looking at bound addresses, if any.
      
      We should perform the same tests as when binding to a
      specific port.
      
      This permits a cleanup of udp_lib_get_port()
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcpv[46]: fix md5 pseudoheader address field ordering · 78e645cb
      Committed by Ilpo Järvinen
      Maybe it's just me but I guess those md5 people made a mess
      out of it by having *_md5_hash_* to use daddr, saddr order
      instead of the one that is natural (and equal to what csum
      functions use). For the segment we're sending, the original
      addresses are reversed so buff's saddr == skb's daddr and
      vice-versa.
      
      Maybe I can finally proceed with unification of some code
      after fixing it first... :-)
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: Make tunnel RX/TX byte counters more consistent · 64194c31
      Committed by Herbert Xu
      This patch makes the RX/TX byte counters for IPIP, GRE and SIT more
      consistent.  Previously we included the external IP headers on the
      way out but not when the packet is inbound.
      
      The new scheme is to count payload only in both directions.  For
      IPIP and SIT this simply means the exclusion of the external IP
      header.  For GRE this means that we exclude the GRE header as
      well.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
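
The payload-only accounting described in the commit above amounts to a small subtraction. The sketch below is an illustration under stated assumptions: `tunnel_kind` and `tunnel_payload_bytes` are hypothetical names, and header lengths are passed in by the caller rather than read from a packet.

```c
#include <assert.h>
#include <stddef.h>

enum tunnel_kind { TUN_IPIP, TUN_SIT, TUN_GRE };

/*
 * Bytes to add to the RX or TX counter for one packet: the packet
 * length minus the external IP header, and for GRE also minus the
 * GRE header. The same rule applies in both directions.
 */
static size_t tunnel_payload_bytes(enum tunnel_kind kind, size_t pkt_len,
                                   size_t outer_ip_hlen, size_t gre_hlen)
{
    size_t overhead = outer_ip_hlen + (kind == TUN_GRE ? gre_hlen : 0);

    assert(pkt_len >= overhead);
    return pkt_len - overhead;
}
```

For example, with a 20-byte external IP header and a 4-byte GRE header, a 1500-byte packet counts as 1480 bytes on IPIP or SIT and 1476 bytes on GRE.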
    • gre: Add Transparent Ethernet Bridging · e1a80002
      Committed by Herbert Xu
      This patch adds support for Ethernet over GRE encapsulation.
      This is exposed to user-space with a new link type of "gretap"
      instead of "gre".  It will create an ARPHRD_ETHER device in
      lieu of the usual ARPHRD_IPGRE.
      
      Note that to preserve backwards compatibility all Transparent
      Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel
      if its key matches and there is no ARPHRD_ETHER device whose
      key matches more closely.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gre: Add netlink interface · c19e654d
      Committed by Herbert Xu
      This patch adds a netlink interface that will eventually displace
      the existing ioctl interface.  It utilises the elegant rtnl_link_ops
      mechanism.
      
      This also means that user-space no longer needs to rely on the
      tunnel interface being of type GRE to identify GRE tunnels.  The
      identification can now occur using rtnl_link_ops.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gre: Move MTU setting out of ipgre_tunnel_bind_dev · 42aa9162
      Committed by Herbert Xu
      This patch moves the dev->mtu setting out of ipgre_tunnel_bind_dev.
      This is in preparation for using rtnl_link where we'll need to make
      the MTU setting conditional on whether the user has supplied an
      MTU.  This also requires the move of the ipgre_tunnel_bind_dev
      call out of the dev->init function so that we can access the user
      parameters later.
      
      This patch also adds a check to prevent setting the MTU below
      the minimum of 68.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gre: Use needed_headroom · c95b819a
      Committed by Herbert Xu
      Now that we have dev->needed_headroom, we can use it instead of
      having a bogus dev->hard_header_len.  This also allows us to
      include dev->hard_header_len in the MTU computation so that when
      we do have a meaningful hard_header_len in future it is included
      automatically in figuring out the MTU.
      
      Incidentally, this fixes a bug where we ignored the needed_headroom
      field of the underlying device in calculating our own hard_header_len.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 09 Oct, 2008 4 commits
    • ipvs: Remove stray file left over from ipvs move · 071d7ab6
      Committed by Sven Wegener
      Commit cb7f6a7b ("IPVS: Move IPVS to
      net/netfilter/ipvs") has left a stray file in the old location of ipvs.
      Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: cleanup of local_port_range · 3c689b73
      Committed by Eric Dumazet
      I noticed sysctl_local_port_range[] and its associated seqlock
      sysctl_local_port_range_lock were on separate cache lines.
      Moreover, sysctl_local_port_range[] was close to unrelated
      variables, highly modified, leading to cache misses.
      
      Moving these two variables in a structure can help data
      locality and moving this structure to read_mostly section
      helps sharing of this data among cpus.
      
      Cleanup of extern declarations (moved in include file where
      they belong), and use of inet_get_local_port_range()
      accessor instead of direct access to ports values.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: Improve port randomization · 9088c560
      Committed by Eric Dumazet
      Current UDP port allocation is suboptimal.
      We select the shortest chain to choose a port (out of 512)
      that will hash in this shortest chain.
      
      First, it can lead to not-so-random ports and
      give attackers more opportunities to break the system.
      
      Second, it can consume a lot of CPU to scan the whole table
      in order to find the shortest chain.
      
      Third, in some pathological cases we can fail to find
      a free port even if there are plenty of them.
      
      This patch zaps the search for a short chain and only
      use one random seed. Problem of getting long chains
      should be addressed in another way, since we can
      obtain long chains with non random ports.
      
      Based on a report and patch from Vitaly Mayatskikh
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
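
The single-random-seed approach described in the commit above can be sketched as a random starting point followed by a linear scan with wraparound. This is a userspace illustration under stated assumptions, not the kernel implementation: `pick_port`, `port_in_use_fn` and the three demo predicates are hypothetical, and the random value is supplied by the caller.

```c
#include <assert.h>

/* Hypothetical callback: nonzero if the port is already in use. */
typedef int (*port_in_use_fn)(int port);

/*
 * Pick a free port in [low, high]. One random value chooses the
 * starting point; the range is then scanned linearly, wrapping
 * around, so a free port is always found if one exists.
 * Returns -1 when every port in the range is taken.
 */
static int pick_port(int low, int high, unsigned int rnd, port_in_use_fn in_use)
{
    assert(low <= high);

    int span = high - low + 1;
    int start = (int)(rnd % (unsigned int)span);

    for (int i = 0; i < span; i++) {
        int port = low + (start + i) % span;

        if (!in_use(port))
            return port;
    }
    return -1;
}

/* Demo predicates used only for illustration. */
static int all_free(int port) { (void)port; return 0; }
static int none_free(int port) { (void)port; return 1; }
static int only_40000_free(int port) { return port != 40000; }
```

No chain lengths are inspected, so the cost is bounded by the size of the port range and the result depends only on the random starting point and on which ports are taken.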
    • tcp: fix length used for checksum in a reset · 52cd5750
      Committed by Ilpo Järvinen
      While looking for some common code I came across difference
      in checksum calculation between tcp_v6_send_(reset|ack) I
      couldn't explain. I checked both v4 and v6 and found out that
      both seem to have the same "feature". I couldn't find anything
      in rfc nor anywhere else which would state that md5 option
      should be ignored like it was in case of reset so I came to
      a conclusion that this is probably a genuine bug. I suspect
      that addition of md5 just was fooled by the excessive
      copy-paste code in those functions and the reset part was
      never tested well enough to find out the problem.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 08 Oct, 2008 11 commits