1. 17 2月, 2014 1 次提交
  2. 14 1月, 2014 1 次提交
    • H
      ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing · f87c10a8
      Hannes Frederic Sowa 提交于
      While forwarding we should not use the protocol path mtu to calculate
      the mtu for a forwarded packet but instead use the interface mtu.
      
      We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was
      introduced for multicast forwarding. But as it does not conflict with
      our usage in unicast code path it is perfect for reuse.
      
      I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
      along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular
      dependencies because of IPSKB_FORWARDED.
      
      Because someone might have written a software which does probe
      destinations manually and expects the kernel to honour those path mtus
      I introduced a new per-namespace "ip_forward_use_pmtu" knob so someone
      can disable this new behaviour. We also still use mtus which are locked on a
      route for forwarding.
      
      The reason for this change is, that path mtus information can be injected
      into the kernel via e.g. icmp_err protocol handler without verification
      of local sockets. As such, this could cause the IPv4 forwarding path to
      wrongfully emit fragmentation needed notifications or start to fragment
      packets along a path.
      
      Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED
      won't be set and further fragmentation logic will use the path mtu to
      determine the fragmentation size. They also recheck packet size with
      help of path mtu discovery and report appropriate errors.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: John Heffner <johnwheffner@gmail.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f87c10a8
  3. 21 11月, 2013 1 次提交
    • A
      ipv4: fix race in concurrent ip_route_input_slow() · dcdfdf56
      Alexei Starovoitov 提交于
      CPUs can ask for local route via ip_route_input_noref() concurrently.
      if nh_rth_input is not cached yet, CPUs will proceed to allocate
      equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
      via rt_cache_route()
      Most of the time they succeed, but on occasion the following two lines:
      	orig = *p;
      	prev = cmpxchg(p, orig, rt);
      in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
      But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
      so dst is leaking. dst_destroy() is never called and 'lo' device
      refcnt doesn't go to zero, which can be seen in the logs as:
      	unregister_netdevice: waiting for lo to become free. Usage count = 1
      Adding mdelay() between above two lines makes it easily reproducible.
      Fix it similar to nh_pcpu_rth_output case.
      
      Fixes: d2d68ba9 ("ipv4: Cache input routes in fib_info nexthops.")
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dcdfdf56
  4. 06 11月, 2013 1 次提交
    • H
      ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE · 482fc609
      Hannes Frederic Sowa 提交于
      Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery,
      their sockets won't accept and install new path mtu information and they
      will always use the interface mtu for outgoing packets. It is guaranteed
      that the packet is not fragmented locally. But we won't set the DF-Flag
      on the outgoing frames.
      
      Florian Weimer had the idea to use this flag to ensure DNS servers are
      never generating outgoing fragments. They may well be fragmented on the
      path, but the server never stores or usees path mtu values, which could
      well be forged in an attack.
      
      (The root of the problem with path MTU discovery is that there is
      no reliable way to authenticate ICMP Fragmentation Needed But DF Set
      messages because they are sent from intermediate routers with their
      source addresses, and the IMCP payload will not always contain sufficient
      information to identify a flow.)
      
      Recent research in the DNS community showed that it is possible to
      implement an attack where DNS cache poisoning is feasible by spoofing
      fragments. This work was done by Amir Herzberg and Haya Shulman:
      <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>
      
      This issue was previously discussed among the DNS community, e.g.
      <http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>,
      without leading to fixes.
      
      This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode
      regarding local fragmentation with UFO/CORK" for the enforcement of the
      non-fragmentable checks. If other users than ip_append_page/data should
      use this semantic too, we have to add a new flag to IPCB(skb)->flags to
      suppress local fragmentation and check for this in ip_finish_output.
      
      Many thanks to Florian Weimer for the idea and feedback while implementing
      this patch.
      
      Cc: David S. Miller <davem@davemloft.net>
      Suggested-by: NFlorian Weimer <fweimer@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      482fc609
  5. 18 10月, 2013 1 次提交
  6. 08 10月, 2013 1 次提交
  7. 21 8月, 2013 1 次提交
    • E
      ipv4: raise IP_MAX_MTU to theoretical limit · 734d2725
      Eric Dumazet 提交于
      As discussed last year [1], there is no compelling reason
      to limit IPv4 MTU to 0xFFF0, while real limit is 0xFFFF
      
      [1] : http://marc.info/?l=linux-netdev&m=135607247609434&w=2
      
      Willem raised this issue again because some of our internal
      regression tests broke after lo mtu being set to 65536.
      
      IP_MTU reports 0xFFF0, and the test attempts to send a RAW datagram of
      mtu + 1 bytes, expecting the send() to fail, but it does not.
      
      Alexey raised interesting points about TCP MSS, that should be addressed
      in follow-up patches in TCP stack if needed, as someone could also set
      an odd mtu anyway.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      734d2725
  8. 01 8月, 2013 1 次提交
  9. 29 6月, 2013 1 次提交
    • T
      ipv4: use next hop exceptions also for input routes · 2ffae99d
      Timo Teräs 提交于
      Commit d2d68ba9 (ipv4: Cache input routes in fib_info nexthops)
      assmued that "locally destined, and routed packets, never trigger
      PMTU events or redirects that will be processed by us".
      
      However, it seems that tunnel devices do trigger PMTU events in certain
      cases. At least ip_gre, ip6_gre, sit, and ipip do use the inner flow's
      skb_dst(skb)->ops->update_pmtu to propage mtu information from the
      outer flows. These can cause the inner flow mtu to be decreased. If
      next hop exceptions are not consulted for pmtu, IP fragmentation will
      not be done properly for these routes.
      
      It also seems that we really need to have the PMTU information always
      for netfilter TCPMSS clamp-to-pmtu feature to work properly.
      
      So for the time being, cache separate copies of input routes for
      each next hop exception.
      Signed-off-by: NTimo Teräs <timo.teras@iki.fi>
      Reviewed-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ffae99d
  10. 13 6月, 2013 1 次提交
  11. 03 6月, 2013 3 次提交
  12. 28 5月, 2013 1 次提交
    • M
      ipv4: fix redirect handling for TCP packets · f96ef988
      Michal Kubecek 提交于
      Unlike ipv4_redirect() and ipv4_sk_redirect(), ip_do_redirect()
      doesn't call __build_flow_key() directly but via
      ip_rt_build_flow_key() wrapper. This leads to __build_flow_key()
      getting pointer to IPv4 header of the ICMP redirect packet
      rather than pointer to the embedded IPv4 header of the packet
      initiating the redirect.
      
      As a result, handling of ICMP redirects initiated by TCP packets
      is broken. Issue was introduced by
      
      	4895c771 ("ipv4: Add FIB nexthop exceptions.")
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f96ef988
  13. 22 3月, 2013 1 次提交
  14. 20 2月, 2013 1 次提交
  15. 19 2月, 2013 1 次提交
  16. 23 1月, 2013 1 次提交
  17. 22 1月, 2013 1 次提交
  18. 17 1月, 2013 2 次提交
  19. 08 12月, 2012 1 次提交
  20. 23 11月, 2012 1 次提交
    • J
      ipv4: do not cache looped multicasts · 63617421
      Julian Anastasov 提交于
      	Starting from 3.6 we cache output routes for
      multicasts only when using route to 224/4. For local receivers
      we can set RTCF_LOCAL flag depending on the membership but
      in such case we use maddr and saddr which are not caching
      keys as before. Additionally, we can not use same place to
      cache routes that differ in RTCF_LOCAL flag value.
      
      	Fix it by caching only RTCF_MULTICAST entries
      without RTCF_LOCAL (send-only, no loopback). As a side effect,
      we avoid unneeded lookup for fnhe when not caching because
      multicasts are not redirected and they do not learn PMTU.
      
      	Thanks to Maxime Bizon for showing the caching
      problems in __mkroute_output for 3.6 kernels: different
      RTCF_LOCAL flag in cache can lead to wrong ip_mc_output or
      ip_output call and the visible problem is that traffic can
      not reach local receivers via loopback.
      Reported-by: NMaxime Bizon <mbizon@freebox.fr>
      Tested-by: NMaxime Bizon <mbizon@freebox.fr>
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      63617421
  21. 19 11月, 2012 1 次提交
  22. 13 11月, 2012 1 次提交
  23. 19 10月, 2012 1 次提交
  24. 11 10月, 2012 1 次提交
  25. 09 10月, 2012 7 次提交
  26. 19 9月, 2012 3 次提交
  27. 11 9月, 2012 1 次提交
  28. 08 9月, 2012 2 次提交