1. 21 Aug, 2013 1 commit
    • ipv4: raise IP_MAX_MTU to theoretical limit · 734d2725
      Committed by Eric Dumazet
      As discussed last year [1], there is no compelling reason
      to limit IPv4 MTU to 0xFFF0, while the real limit is 0xFFFF.
      
      [1] : http://marc.info/?l=linux-netdev&m=135607247609434&w=2
      
      Willem raised this issue again because some of our internal
      regression tests broke after the lo MTU was set to 65536.
      
      IP_MTU reports 0xFFF0, and the test attempts to send a RAW datagram of
      mtu + 1 bytes, expecting the send() to fail, but it does not.
      
      Alexey raised interesting points about TCP MSS, that should be addressed
      in follow-up patches in TCP stack if needed, as someone could also set
      an odd mtu anyway.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
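The mismatch described in 734d2725 can be modelled in userspace. The following is a minimal sketch, not kernel code: the helper names `reported_mtu` and `send_would_fail` are hypothetical, and the send check is simplified to "longer than the device MTU or longer than the 16-bit IPv4 total length". It shows why a datagram of mtu + 1 bytes was accepted under the old 0xFFF0 clamp and is rejected once the clamp is 0xFFFF.

```c
#include <stdbool.h>
#include <stdint.h>

#define OLD_IP_MAX_MTU     0xFFF0u /* clamp before the commit */
#define NEW_IP_MAX_MTU     0xFFFFu /* clamp after the commit */
#define IPV4_MAX_TOTAL_LEN 0xFFFFu /* largest value of the 16-bit Total Length field */

/* Hypothetical helper: the MTU reported to userspace, i.e. the device
 * MTU clamped to the given IP_MAX_MTU value. */
static uint32_t reported_mtu(uint32_t dev_mtu, uint32_t ip_max_mtu)
{
    return dev_mtu < ip_max_mtu ? dev_mtu : ip_max_mtu;
}

/* Hypothetical, simplified send check: a datagram is rejected when it
 * cannot fit in an IPv4 packet or exceeds the device MTU. */
static bool send_would_fail(uint32_t len, uint32_t dev_mtu)
{
    return len > IPV4_MAX_TOTAL_LEN || len > dev_mtu;
}
```

With a loopback MTU of 65536, the old clamp reports 0xFFF0, so a send of 0xFFF1 bytes passes both checks. The new clamp reports 0xFFFF, so mtu + 1 is 0x10000, which no longer fits in an IPv4 packet.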
  2. 01 Aug, 2013 1 commit
  3. 29 Jun, 2013 1 commit
    • ipv4: use next hop exceptions also for input routes · 2ffae99d
      Committed by Timo Teräs
      Commit d2d68ba9 (ipv4: Cache input routes in fib_info nexthops)
      assumed that "locally destined, and routed packets, never trigger
      PMTU events or redirects that will be processed by us".
      
      However, it seems that tunnel devices do trigger PMTU events in certain
      cases. At least ip_gre, ip6_gre, sit, and ipip do use the inner flow's
      skb_dst(skb)->ops->update_pmtu to propagate mtu information from the
      outer flows. These can cause the inner flow mtu to be decreased. If
      next hop exceptions are not consulted for pmtu, IP fragmentation will
      not be done properly for these routes.
      
      It also seems that we really need to have the PMTU information always
      for netfilter TCPMSS clamp-to-pmtu feature to work properly.
      
      So for the time being, cache separate copies of input routes for
      each next hop exception.
      Signed-off-by: Timo Teräs <timo.teras@iki.fi>
      Reviewed-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Jun, 2013 1 commit
  5. 03 Jun, 2013 3 commits
  6. 28 May, 2013 1 commit
    • ipv4: fix redirect handling for TCP packets · f96ef988
      Committed by Michal Kubecek
      Unlike ipv4_redirect() and ipv4_sk_redirect(), ip_do_redirect()
      doesn't call __build_flow_key() directly but via
      ip_rt_build_flow_key() wrapper. This leads to __build_flow_key()
      getting pointer to IPv4 header of the ICMP redirect packet
      rather than pointer to the embedded IPv4 header of the packet
      initiating the redirect.
      
      As a result, handling of ICMP redirects initiated by TCP packets
      is broken. Issue was introduced by
      
      	4895c771 ("ipv4: Add FIB nexthop exceptions.")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
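The distinction f96ef988 draws between the two IPv4 headers can be illustrated on a raw byte buffer. This is a userspace sketch under stated assumptions, not the kernel's flow-key code: the helper names are hypothetical, and the buffer is assumed to hold an ICMP redirect laid out as outer IPv4 header, 8-byte ICMP header, then the embedded IPv4 header of the packet that triggered the redirect.

```c
#include <stddef.h>
#include <stdint.h>

enum { PROTO_ICMP = 1, PROTO_TCP = 6, ICMP_HDR_LEN = 8 };

/* IPv4 header length in bytes, taken from the IHL nibble of octet 0. */
static size_t ipv4_hdr_len(const uint8_t *iph)
{
    return (size_t)(iph[0] & 0x0F) * 4;
}

/* The Protocol field is octet 9 of the IPv4 header. */
static uint8_t ipv4_protocol(const uint8_t *iph)
{
    return iph[9];
}

/* Hypothetical helper: locate the embedded IPv4 header inside an ICMP
 * redirect that starts with its own (outer) IPv4 header. */
static const uint8_t *embedded_ipv4_hdr(const uint8_t *icmp_pkt)
{
    return icmp_pkt + ipv4_hdr_len(icmp_pkt) + ICMP_HDR_LEN;
}
```

A key derived from the outer header carries the ICMP protocol number and the router's addresses, so it cannot match the TCP flow that caused the redirect. A key derived from the embedded header can.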
  7. 22 Mar, 2013 1 commit
  8. 20 Feb, 2013 1 commit
  9. 19 Feb, 2013 1 commit
  10. 23 Jan, 2013 1 commit
  11. 22 Jan, 2013 1 commit
  12. 17 Jan, 2013 2 commits
  13. 08 Dec, 2012 1 commit
  14. 23 Nov, 2012 1 commit
    • ipv4: do not cache looped multicasts · 63617421
      Committed by Julian Anastasov
      	Starting from 3.6 we cache output routes for
      multicasts only when using route to 224/4. For local receivers
      we can set RTCF_LOCAL flag depending on the membership but
      in such case we use maddr and saddr which are not caching
      keys as before. Additionally, we can not use same place to
      cache routes that differ in RTCF_LOCAL flag value.
      
      	Fix it by caching only RTCF_MULTICAST entries
      without RTCF_LOCAL (send-only, no loopback). As a side effect,
      we avoid unneeded lookup for fnhe when not caching because
      multicasts are not redirected and they do not learn PMTU.
      
      	Thanks to Maxime Bizon for showing the caching
      problems in __mkroute_output for 3.6 kernels: different
      RTCF_LOCAL flag in cache can lead to wrong ip_mc_output or
      ip_output call and the visible problem is that traffic can
      not reach local receivers via loopback.
      Reported-by: Maxime Bizon <mbizon@freebox.fr>
      Tested-by: Maxime Bizon <mbizon@freebox.fr>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
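The caching rule stated in 63617421 (cache only RTCF_MULTICAST entries without RTCF_LOCAL) reduces to a flag test. The sketch below is a userspace illustration only: the function name `may_cache_multicast_route` is hypothetical, and the flag values are defined locally for the example rather than taken from kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values defined locally for illustration. */
#define RTCF_MULTICAST 0x20000000u
#define RTCF_LOCAL     0x80000000u

/* Hypothetical predicate: a multicast output route may be cached only
 * when it is send-only, i.e. it is not also looped back to local
 * receivers. */
static bool may_cache_multicast_route(uint32_t flags)
{
    return (flags & RTCF_MULTICAST) && !(flags & RTCF_LOCAL);
}
```

Because RTCF_LOCAL depends on group membership rather than on the cache key, two routes that differ only in that flag would otherwise compete for the same cache slot.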
  15. 19 Nov, 2012 1 commit
  16. 13 Nov, 2012 1 commit
  17. 19 Oct, 2012 1 commit
  18. 11 Oct, 2012 1 commit
  19. 09 Oct, 2012 7 commits
  20. 19 Sep, 2012 3 commits
  21. 11 Sep, 2012 1 commit
  22. 08 Sep, 2012 2 commits
  23. 01 Sep, 2012 1 commit
    • ipv4: Minor logic clean-up in ipv4_mtu · 98d75c37
      Committed by Alexander Duyck
      In ipv4_mtu there is some logic where we test for a non-zero value
      and a timer expiration, set the value to zero, and then, if the value
      is zero, set it to a value based on the dst.  Instead of bothering
      with the extra steps it is easier to just clean up the logic so
      that we set it to the dst based value if it is zero or if the timer has
      expired.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
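The clean-up described in 98d75c37 can be checked with a small userspace model. This is a sketch, not the kernel function: `struct route_model`, its fields, and both function names are hypothetical stand-ins, and `at_or_after` imitates a wrap-safe tick comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the values the MTU lookup consults. */
struct route_model {
    unsigned int pmtu;     /* learned path MTU, 0 if none */
    unsigned long expires; /* tick at which pmtu stops being valid */
    unsigned int dst_mtu;  /* fallback MTU derived from the dst */
};

/* Wrap-safe "a is at or after b" on a free-running tick counter. */
static bool at_or_after(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Three steps: test, zero the value, then test for zero again. */
static unsigned int mtu_before_cleanup(const struct route_model *rt,
                                       unsigned long now)
{
    unsigned int mtu = rt->pmtu;

    if (mtu && at_or_after(now, rt->expires))
        mtu = 0;
    if (!mtu)
        mtu = rt->dst_mtu;
    return mtu;
}

/* One step: fall back when the value is zero or the timer has expired. */
static unsigned int mtu_after_cleanup(const struct route_model *rt,
                                      unsigned long now)
{
    unsigned int mtu = rt->pmtu;

    if (!mtu || at_or_after(now, rt->expires))
        mtu = rt->dst_mtu;
    return mtu;
}
```

Both versions return the same result for every input. The second simply folds the intermediate "set to zero" step into the condition.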
  24. 31 Aug, 2012 1 commit
    • ipv4: must use rcu protection while calling fib_lookup · c5ae7d41
      Committed by Eric Dumazet
      The following lockdep splat was reported by Pavel Roskin:
      
      [ 1570.586223] ===============================
      [ 1570.586225] [ INFO: suspicious RCU usage. ]
      [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
      [ 1570.586229] -------------------------------
      [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
      [ 1570.586233]
      [ 1570.586233] other info that might help us debug this:
      [ 1570.586233]
      [ 1570.586236]
      [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
      [ 1570.586238] 2 locks held by Chrome_IOThread/4467:
      [ 1570.586240]  #0:  (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
      [ 1570.586253]  #1:  (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
      [ 1570.586260]
      [ 1570.586260] stack backtrace:
      [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
      [ 1570.586265] Call Trace:
      [ 1570.586271]  [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
      [ 1570.586275]  [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
      [ 1570.586278]  [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
      [ 1570.586282]  [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
      [ 1570.586285]  [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
      [ 1570.586290]  [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
      [ 1570.586293]  [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
      [ 1570.586296]  [<ffffffff814f2c35>] release_sock+0x55/0xa0
      [ 1570.586300]  [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
      [ 1570.586305]  [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
      [ 1570.586308]  [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
      [ 1570.586312]  [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
      [ 1570.586315]  [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
      [ 1570.586320]  [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
      [ 1570.586323]  [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
      [ 1570.586328]  [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
      [ 1570.586332]  [<ffffffff8114c5a5>] vfs_write+0x165/0x180
      [ 1570.586335]  [<ffffffff8114c805>] sys_write+0x45/0x90
      [ 1570.586340]  [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Pavel Roskin <proski@gnu.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 24 Aug, 2012 1 commit
  26. 23 Aug, 2012 1 commit
    • ipv4: properly update pmtu · 9b04f350
      Committed by Eric Dumazet
      Sylvain Munaut reported the following:
      
       - TCP connection get "stuck" with data in send queue when doing
         "large" transfers ( like typing 'ps ax' on a ssh connection )
       - Only happens on path where the PMTU is lower than the MTU of
         the interface
       - Is not present right after boot, it only appears 10-20min after
         boot or so. (and that's inside the _same_ TCP connection, it works
         fine at first and then in the same ssh session, it'll get stuck)
       - Definitely seems related to fragments somehow since I see a router
         sending ICMP message saying fragmentation is needed.
       - Exact same setup works fine with kernel 3.5.1
      
      Problem happens when the 10 minutes (ip_rt_mtu_expires) expiration
      period is over.
      
      ip_rt_update_pmtu() calls dst_set_expires() to rearm a new expiration,
      but dst_set_expires() does nothing because dst.expires is already set.
      
      It seems we want to set the expires field to a new value, regardless
      of prior one.
      
      With help from Julian Anastasov.
      Reported-by: Sylvain Munaut <s.munaut@whatever-company.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      CC: Julian Anastasov <ja@ssi.bg>
      Tested-by: Sylvain Munaut <s.munaut@whatever-company.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
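The timer behaviour described in 9b04f350 can be reproduced with a userspace model. This is a sketch under assumptions, not kernel code: both function names are hypothetical, and the first one models a setter that only ever moves a deadline earlier, which is the behaviour the message attributes to dst_set_expires() when dst.expires is already set.

```c
/* Hypothetical model of a setter that writes the deadline only when
 * none is set yet or when the new deadline is earlier than the old. */
static void set_expires_if_earlier(unsigned long *expires,
                                   unsigned long now,
                                   unsigned long timeout)
{
    unsigned long deadline = now + timeout;

    if (deadline == 0)
        deadline = 1; /* 0 is reserved for "no expiry" */
    if (*expires == 0 || (long)(deadline - *expires) < 0)
        *expires = deadline;
}

/* Hypothetical model of the fix: overwrite the deadline regardless of
 * the prior value. */
static void set_expires_always(unsigned long *expires,
                               unsigned long now,
                               unsigned long timeout)
{
    unsigned long deadline = now + timeout;

    *expires = deadline ? deadline : 1;
}
```

Once the first deadline has passed, any re-arm computes a later deadline, so the "only if earlier" setter leaves the stale value in place and the learned PMTU stays expired. Overwriting unconditionally rearms the timer as intended.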
  27. 15 Aug, 2012 1 commit
  28. 10 Aug, 2012 1 commit