1. 18 9月, 2015 4 次提交
  2. 25 5月, 2015 1 次提交
  3. 21 4月, 2015 1 次提交
    • S
      ip_forward: Drop frames with attached skb->sk · 2ab95749
      Sebastian Pöhn 提交于
      Initial discussion was:
      [FYI] xfrm: Don't lookup sk_policy for timewait sockets
      
      Forwarded frames should not have a socket attached. Especially
      tw sockets will lead to panics later-on in the stack.
      
      This was observed with TPROXY assigning a tw socket and broken
      policy routing (misconfigured). As a result frame enters
      forwarding path instead of input. We cannot solve this in
      TPROXY as it cannot know that policy routing is broken.
      
      v2:
      Remove useless comment
      Signed-off-by: NSebastian Poehn <sebastian.poehn@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ab95749
  4. 08 4月, 2015 1 次提交
    • D
      netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller 提交于
      On the output paths in particular, we have to sometimes deal with two
      socket contexts.  First, and usually skb->sk, is the local socket that
      generated the frame.
      
      And second, is potentially the socket used to control a tunneling
      socket, such as one the encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7026b1dd
  5. 12 3月, 2015 1 次提交
  6. 27 1月, 2015 1 次提交
  7. 13 5月, 2014 1 次提交
  8. 08 5月, 2014 2 次提交
  9. 14 2月, 2014 2 次提交
    • D
      ipv4: ip_forward: perform skb->pkt_type check at the beginning · d4f2fa6a
      Denis Kirjanov 提交于
      Packets which have L2 address different from ours should be
      already filtered before entering into ip_forward().
      
      Perform that check at the beginning to avoid processing such packets.
      Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4f2fa6a
    • F
      net: ip, ipv6: handle gso skbs in forwarding path · fe6cc55f
      Florian Westphal 提交于
      Marcelo Ricardo Leitner reported problems when the forwarding link path
      has a lower mtu than the incoming one if the inbound interface supports GRO.
      
      Given:
      Host <mtu1500> R1 <mtu1200> R2
      
      Host sends tcp stream which is routed via R1 and R2.  R1 performs GRO.
      
      In this case, the kernel will fail to send ICMP fragmentation needed
      messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu
      checks in forward path. Instead, Linux tries to send out packets exceeding
      the mtu.
      
      When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does
      not fragment the packets when forwarding, and again tries to send out
      packets exceeding R1-R2 link mtu.
      
      This alters the forwarding dstmtu checks to take the individual gso
      segment lengths into account.
      
      For ipv6, we send out pkt too big error for gso if the individual
      segments are too big.
      
      For ipv4, we either send icmp fragmentation needed, or, if the DF bit
      is not set, perform software segmentation and let the output path
      create fragments when the packet is leaving the machine.
      It is not 100% correct as the error message will contain the headers of
      the GRO skb instead of the original/segmented one, but it seems to
      work fine in my (limited) tests.
      
      Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid
      sofware segmentation.
      
      However it turns out that skb_segment() assumes skb nr_frags is related
      to mss size so we would BUG there.  I don't want to mess with it considering
      Herbert and Eric disagree on what the correct behavior should be.
      
      Hannes Frederic Sowa notes that when we would shrink gso_size
      skb_segment would then also need to deal with the case where
      SKB_MAX_FRAGS would be exceeded.
      
      This uses sofware segmentation in the forward path when we hit ipv4
      non-DF packets and the outgoing link mtu is too small.  Its not perfect,
      but given the lack of bug reports wrt. GRO fwd being broken this is a
      rare case anyway.  Also its not like this could not be improved later
      once the dust settles.
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Reported-by: NMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe6cc55f
  10. 14 1月, 2014 1 次提交
    • H
      ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing · f87c10a8
      Hannes Frederic Sowa 提交于
      While forwarding we should not use the protocol path mtu to calculate
      the mtu for a forwarded packet but instead use the interface mtu.
      
      We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was
      introduced for multicast forwarding. But as it does not conflict with
      our usage in unicast code path it is perfect for reuse.
      
      I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
      along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular
      dependencies because of IPSKB_FORWARDED.
      
      Because someone might have written a software which does probe
      destinations manually and expects the kernel to honour those path mtus
      I introduced a new per-namespace "ip_forward_use_pmtu" knob so someone
      can disable this new behaviour. We also still use mtus which are locked on a
      route for forwarding.
      
      The reason for this change is, that path mtus information can be injected
      into the kernel via e.g. icmp_err protocol handler without verification
      of local sockets. As such, this could cause the IPv4 forwarding path to
      wrongfully emit fragmentation needed notifications or start to fragment
      packets along a path.
      
      Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED
      won't be set and further fragmentation logic will use the path mtu to
      determine the fragmentation size. They also recheck packet size with
      help of path mtu discovery and report appropriate errors.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: John Heffner <johnwheffner@gmail.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f87c10a8
  11. 09 10月, 2012 2 次提交
  12. 08 6月, 2012 1 次提交
    • V
      snmp: fix OutOctets counter to include forwarded datagrams · 2d8dbb04
      Vincent Bernat 提交于
      RFC 4293 defines ipIfStatsOutOctets (similar definition for
      ipSystemStatsOutOctets):
      
         The total number of octets in IP datagrams delivered to the lower
         layers for transmission.  Octets from datagrams counted in
         ipIfStatsOutTransmits MUST be counted here.
      
      And ipIfStatsOutTransmits:
      
         The total number of IP datagrams that this entity supplied to the
         lower layers for transmission.  This includes datagrams generated
         locally and those forwarded by this entity.
      
      Therefore, IPSTATS_MIB_OUTOCTETS must be incremented when incrementing
      IPSTATS_MIB_OUTFORWDATAGRAMS.
      
      IP_UPD_PO_STATS is not used since ipIfStatsOutRequests must not
      include forwarded datagrams:
      
         The total number of IP datagrams that local IP user-protocols
         (including ICMP) supplied to IP in requests for transmission.  Note
         that this counter does not include any datagrams counted in
         ipIfStatsOutForwDatagrams.
      Signed-off-by: NVincent Bernat <bernat@luffy.cx>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d8dbb04
  13. 16 4月, 2012 1 次提交
  14. 24 11月, 2011 1 次提交
  15. 13 5月, 2011 2 次提交
  16. 11 6月, 2010 1 次提交
  17. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  18. 25 3月, 2010 1 次提交
  19. 03 6月, 2009 2 次提交
  20. 29 10月, 2008 1 次提交
  21. 17 7月, 2008 2 次提交
  22. 20 6月, 2008 1 次提交
  23. 12 6月, 2008 1 次提交
  24. 29 3月, 2008 1 次提交
  25. 06 3月, 2008 1 次提交
  26. 29 1月, 2008 1 次提交
  27. 16 10月, 2007 1 次提交
    • P
      [IPV4]: Uninline netfilter okfns · 861d0486
      Patrick McHardy 提交于
      Now that we don't pass double skb pointers to nf_hook_slow anymore, gcc
      can generate tail calls for some of the netfilter hook okfn invocations,
      so there is no need to inline the functions anymore. This caused huge
      code bloat since we ended up with one inlined version and one out-of-line
      version since we pass the address to nf_hook_slow.
      
      Before:
         text    data     bss     dec     hex filename
      8997385 1016524  524652 10538561         a0ce41 vmlinux
      
      After:
         text    data     bss     dec     hex filename
      8994009 1016524  524652 10535185         a0c111 vmlinux
      -------------------------------------------------------
        -3376
      
      All cases have been verified to generate tail-calls with and without
      netfilter. The okfns in ipmr and xfrm4_input still remain inline because
      gcc can't generate tail-calls for them.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      861d0486
  28. 11 10月, 2007 1 次提交
    • M
      [IPV4] IPSEC: Omit redirect for tunnelled packet. · 3b26a9a6
      Masahide NAKAMURA 提交于
      IPv4 IPsec tunnel gateway incorrectly sends redirect to
      sender if it is onlink host when network device the IPsec tunnelled
      packet is arrived is the same as the one the decapsulated packet
      is sent.
      
      With this patch, it omits to send the redirect when the forwarding
      skbuff carries secpath, since such skbuff should be assumed as
      a decapsulated packet from IPsec tunnel by own.
      
      Request for comments:
      Alternatively we'd have another way to change net/ipv4/route.c
      (__mkroute_input) to use RTCF_DOREDIRECT flag unless skbuff
      has no secpath. It is better than this patch at performance
      point of view because IPv4 redirect judgement is done at
      routing slow-path. However, it should be taken care of resource
      changes between SAD(XFRM states) and routing table. In other words,
      When IPv4 SAD is changed does the related routing entry go to its
      slow-path? If not, it is reasonable to apply this patch.
      Signed-off-by: NMasahide NAKAMURA <nakam@linux-ipv6.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b26a9a6
  29. 19 7月, 2007 1 次提交
  30. 26 4月, 2007 2 次提交
    • H
      [NET]: Allow forwarding of ip_summed except CHECKSUM_COMPLETE · 35fc92a9
      Herbert Xu 提交于
      Right now Xen has a horrible hack that lets it forward packets with
      partial checksums.  One of the reasons that CHECKSUM_PARTIAL and
      CHECKSUM_COMPLETE were added is so that we can get rid of this hack
      (where it creates two extra bits in the skbuff to essentially mirror
      ip_summed without being destroyed by the forwarding code).
      
      I had forgotten that I've already gone through all the deivce drivers
      last time around to make sure that they're looking at ip_summed ==
      CHECKSUM_PARTIAL rather than ip_summed != 0 on transmit.  In any case,
      I've now done that again so it should definitely be safe.
      
      Unfortunately nobody has yet added any code to update CHECKSUM_COMPLETE
      values on forward so we I'm setting that to CHECKSUM_NONE.  This should
      be safe to remove for bridging but I'd like to check that code path
      first.
      
      So here is the patch that lets us get rid of the hack by preserving
      ip_summed (mostly) on forwarded packets.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35fc92a9
    • J
      [NET] Move DF check to ip_forward · 9af3912e
      John Heffner 提交于
      Do fragmentation check in ip_forward, similar to ipv6 forwarding.
      Signed-off-by: NJohn Heffner <jheffner@psc.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9af3912e