1. 06 12月, 2015 1 次提交
  2. 04 12月, 2015 2 次提交
    • P
      net: ipv6: restrict hop_limit sysctl setting to range [1; 255] · d6df198d
      Phil Sutter 提交于
      Setting a value bigger than 255 resulted in using only the lower eight
      bits of that value as it is assigned to the u8 header field. To avoid
      this unexpected result, reject such values.
      
      Setting a value of zero is technically possible, but hosts receiving
      such a packet have to treat it like hop_limit was set to one, according
      to RFC2460. Therefore I don't see a use-case for that.
      
      Setting a route's hop_limit to zero in iproute2 means to use the sysctl
      default, which is not the case here: Setting e.g.
      net.conf.eth0.hop_limit=0 will not make the kernel use
      net.conf.all.hop_limit for outgoing packets on eth0. To avoid these
      kinds of confusion, reject zero.
      Signed-off-by: NPhil Sutter <phil@nwl.cc>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6df198d
    • E
      ipv6: kill sk_dst_lock · 6bd4f355
      Eric Dumazet 提交于
      While testing the np->opt RCU conversion, I found that UDP/IPv6 was
      using a mixture of xchg() and sk_dst_lock to protect concurrent changes
      to sk->sk_dst_cache, leading to possible corruptions and crashes.
      
      ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
      way to fix the mess is to remove sk_dst_lock completely, as we did for
      IPv4.
      
      __ip6_dst_store() and ip6_dst_store() share same implementation.
      
      sk_setup_caps() being called with socket lock being held or not,
      we have to use sk_dst_set() instead of __sk_dst_set()
      
      Note that I had to move the "np->dst_cookie = rt6_get_cookie(rt);"
      in ip6_dst_store() before the sk_setup_caps(sk, dst) call.
      
      This is because ip6_dst_store() can be called from process context,
      without any lock held.
      
      As soon as the dst is installed in sk->sk_dst_cache, dst can be freed
      from another cpu doing a concurrent ip6_dst_store()
      
      Doing the dst dereference before doing the install is needed to make
      sure no use after free would trigger.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6bd4f355
  3. 03 12月, 2015 2 次提交
  4. 02 12月, 2015 1 次提交
  5. 01 12月, 2015 1 次提交
  6. 25 11月, 2015 2 次提交
  7. 23 11月, 2015 2 次提交
  8. 19 11月, 2015 1 次提交
  9. 17 11月, 2015 1 次提交
  10. 16 11月, 2015 4 次提交
    • E
      tcp: ensure proper barriers in lockless contexts · 00fd38d9
      Eric Dumazet 提交于
      Some functions access TCP sockets without holding a lock and
      might output non consistent data, depending on compiler and or
      architecture.
      
      tcp_diag_get_info(), tcp_get_info(), tcp_poll(), get_tcp4_sock() ...
      
      Introduce sk_state_load() and sk_state_store() to fix the issues,
      and more clearly document where this lack of locking is happening.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      00fd38d9
    • M
      ipv6: Check rt->dst.from for the DST_NOCACHE route · 02bcf4e0
      Martin KaFai Lau 提交于
      All DST_NOCACHE rt6_info used to have rt->dst.from set to
      its parent.
      
      After commit 8e3d5be7 ("ipv6: Avoid double dst_free"),
      DST_NOCACHE is also set to rt6_info which does not have
      a parent (i.e. rt->dst.from is NULL).
      
      This patch catches the rt->dst.from == NULL case.
      
      Fixes: 8e3d5be7 ("ipv6: Avoid double dst_free")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02bcf4e0
    • M
      ipv6: Check expire on DST_NOCACHE route · 5973fb1e
      Martin KaFai Lau 提交于
      Since the expires of the DST_NOCACHE rt can be set during
      the ip6_rt_update_pmtu(), we also need to consider the expires
      value when doing ip6_dst_check().
      
      This patches creates __rt6_check_expired() to only
      check the expire value (if one exists) of the current rt.
      
      In rt6_dst_from_check(), it adds __rt6_check_expired() as
      one of the condition check.
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5973fb1e
    • M
      ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 tree · 0d3f6d29
      Martin KaFai Lau 提交于
      The original bug report:
      https://bugzilla.redhat.com/show_bug.cgi?id=1272571
      
      The setup has a IPv4 GRE tunnel running in a IPSec.  The bug
      happens when ndisc starts sending router solicitation at the gre
      interface.  The simplified oops stack is like:
      
      __lock_acquire+0x1b2/0x1c30
      lock_acquire+0xb9/0x140
      _raw_write_lock_bh+0x3f/0x50
      __ip6_ins_rt+0x2e/0x60
      ip6_ins_rt+0x49/0x50
      ~~~~~~~~
      __ip6_rt_update_pmtu.part.54+0x145/0x250
      ip6_rt_update_pmtu+0x2e/0x40
      ~~~~~~~~
      ip_tunnel_xmit+0x1f1/0xf40
      __gre_xmit+0x7a/0x90
      ipgre_xmit+0x15a/0x220
      dev_hard_start_xmit+0x2bd/0x480
      __dev_queue_xmit+0x696/0x730
      dev_queue_xmit+0x10/0x20
      neigh_direct_output+0x11/0x20
      ip6_finish_output2+0x21f/0x770
      ip6_finish_output+0xa7/0x1d0
      ip6_output+0x56/0x190
      ~~~~~~~~
      ndisc_send_skb+0x1d9/0x400
      ndisc_send_rs+0x88/0xc0
      ~~~~~~~~
      
      The rt passed to ip6_rt_update_pmtu() is created by
      icmp6_dst_alloc() and it is not managed by the fib6 tree,
      so its rt6i_table == NULL.  When __ip6_rt_update_pmtu() creates
      a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
      and it causes the ip6_ins_rt() oops.
      
      During pmtu update, we only want to create a RTF_CACHE clone
      from a rt which is currently managed (or owned) by the
      fib6 tree.  It means either rt->rt6i_node != NULL or
      rt is a RTF_PCPU clone.
      
      It is worth to note that rt6i_table may not be NULL even it is
      not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
      Hence, rt6i_node is a better check instead of rt6i_table.
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Reported-by: NChris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
      Cc: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d3f6d29
  11. 06 11月, 2015 2 次提交
  12. 05 11月, 2015 1 次提交
  13. 03 11月, 2015 5 次提交
  14. 02 11月, 2015 2 次提交
  15. 30 10月, 2015 1 次提交
  16. 29 10月, 2015 2 次提交
  17. 28 10月, 2015 1 次提交
  18. 27 10月, 2015 1 次提交
  19. 23 10月, 2015 3 次提交
  20. 22 10月, 2015 4 次提交
    • D
      net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set · d46a9d67
      David Ahern 提交于
      741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
      adds the RT6_LOOKUP_F_IFACE flag to make device index mismatch fatal if
      oif is given. Hajime reported that this change breaks the Mobile IPv6
      use case that wants to force the message through one interface yet use
      the source address from another interface. Handle this case by only
      adding the flag if oif is set and saddr is not set.
      
      Fixes: 741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
      Cc: Hajime Tazaki <thehajime@gmail.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d46a9d67
    • E
      ipv6: gro: support sit protocol · feec0cb3
      Eric Dumazet 提交于
      Tom Herbert added SIT support to GRO with commit
      19424e05 ("sit: Add gro callbacks to sit_offload"),
      later reverted by Herbert Xu.
      
      The problem came because Tom patch was building GRO
      packets without proper meta data : If packets were locally
      delivered, we would not care.
      
      But if packets needed to be forwarded, GSO engine was not
      able to segment individual segments.
      
      With the following patch, we correctly set skb->encapsulation
      and inner network header. We also update gso_type.
      
      Tested:
      
      Server :
      netserver
      modprobe dummy
      ifconfig dummy0 8.0.0.1 netmask 255.255.255.0 up
      arp -s 8.0.0.100 4e:32:51:04:47:e5
      iptables -I INPUT -s 10.246.7.151 -j TEE --gateway 8.0.0.100
      ifconfig sixtofour0
      sixtofour0 Link encap:IPv6-in-IPv4
                inet6 addr: 2002:af6:798::1/128 Scope:Global
                inet6 addr: 2002:af6:798::/128 Scope:Global
                UP RUNNING NOARP  MTU:1480  Metric:1
                RX packets:411169 errors:0 dropped:0 overruns:0 frame:0
                TX packets:409414 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:0
                RX bytes:20319631739 (20.3 GB)  TX bytes:29529556 (29.5 MB)
      
      Client :
      netperf -H 2002:af6:798::1 -l 1000 &
      
      Checked on server traffic copied on dummy0 and verify segments were
      properly rebuilt, with proper IP headers, TCP checksums...
      
      tcpdump on eth0 shows proper GRO aggregation takes place.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      feec0cb3
    • A
      netlink: Rightsize IFLA_AF_SPEC size calculation · b1974ed0
      Arad, Ronen 提交于
      if_nlmsg_size() overestimates the minimum allocation size of netlink
      dump request (when called from rtnl_calcit()) or the size of the
      message (when called from rtnl_getlink()). This is because
      ext_filter_mask is not supported by rtnl_link_get_af_size() and
      rtnl_link_get_size().
      
      The over-estimation is significant when at least one netdev has many
      VLANs configured (8 bytes for each configured VLAN).
      
      This patch-set "rightsizes" the protocol specific attribute size
      calculation by propagating ext_filter_mask to rtnl_link_get_af_size()
      and adding this a argument to get_link_af_size op in rtnl_af_ops.
      
      Bridge module already used filtering aware sizing for notifications.
      br_get_link_af_size_filtered() is consistent with the modified
      get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops.
      br_get_link_af_size() becomes unused and thus removed.
      Signed-off-by: NRonen Arad <ronen.arad@intel.com>
      Acked-by: NSridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1974ed0
    • D
      net: Really fix vti6 with oif in dst lookups · f1900fb5
      David Ahern 提交于
      6e28b000 ("net: Fix vti use case with oif in dst lookups for IPv6")
      is missing the checks on FLOWI_FLAG_SKIP_NH_OIF. Add them.
      
      Fixes: 42a7b32b ("xfrm: Add oif to dst lookups")
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f1900fb5
  21. 19 10月, 2015 1 次提交
    • S
      xfrm: Fix pmtu discovery for local generated packets. · ca064bd8
      Steffen Klassert 提交于
      Commit 044a832a ("xfrm: Fix local error reporting crash
      with interfamily tunnels") moved the setting of skb->protocol
      behind the last access of the inner mode family to fix an
      interfamily crash. Unfortunately now skb->protocol might not
      be set at all, so we fail dispatch to the inner address family.
      As a reault, the local error handler is not called and the
      mtu value is not reported back to userspace.
      
      We fix this by setting skb->protocol on message size errors
      before we call xfrm_local_error.
      
      Fixes: 044a832a ("xfrm: Fix local error reporting crash with interfamily tunnels")
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      ca064bd8