1. 23 May, 2020 1 commit
  2. 19 May, 2020 1 commit
  3. 14 May, 2020 3 commits
  4. 10 May, 2020 1 commit
  5. 09 May, 2020 2 commits
  6. 08 May, 2020 1 commit
    • Revert "ipv6: add mtu lock check in __ip6_rt_update_pmtu" · 09454fd0
      Maciej Żenczykowski committed
      This reverts commit 19bda36c:
      
      | ipv6: add mtu lock check in __ip6_rt_update_pmtu
      |
      | Prior to this patch, ipv6 didn't do mtu lock check in ip6_update_pmtu.
      | It leaded to that mtu lock doesn't really work when receiving the pkt
      | of ICMPV6_PKT_TOOBIG.
      |
      | This patch is to add mtu lock check in __ip6_rt_update_pmtu just as ipv4
      | did in __ip_rt_update_pmtu.
      
      The above reasoning is incorrect.  IPv6 *requires* icmp based pmtu to work.
      There's already a comment to this effect elsewhere in the kernel:
      
        $ git grep -p -B1 -A3 'RTAX_MTU lock'
        net/ipv6/route.c=4813=
      
        static int rt6_mtu_change_route(struct fib6_info *f6i, void *p_arg)
        ...
          /* In IPv6 pmtu discovery is not optional,
             so that RTAX_MTU lock cannot disable it.
             We still use this lock to block changes
             caused by addrconf/ndisc.
          */
      
      This reverts to the pre-4.9 behaviour.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Fixes: 19bda36c ("ipv6: add mtu lock check in __ip6_rt_update_pmtu")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 02 May, 2020 1 commit
    • ipv6: Use global sernum for dst validation with nexthop objects · 8f34e53b
      David Ahern committed
      Nik reported a bug with pcpu dst cache when nexthop objects are
      used illustrated by the following:
          $ ip netns add foo
          $ ip -netns foo li set lo up
          $ ip -netns foo addr add 2001:db8:11::1/128 dev lo
          $ ip netns exec foo sysctl net.ipv6.conf.all.forwarding=1
          $ ip li add veth1 type veth peer name veth2
          $ ip li set veth1 up
          $ ip addr add 2001:db8:10::1/64 dev veth1
          $ ip li set dev veth2 netns foo
          $ ip -netns foo li set veth2 up
          $ ip -netns foo addr add 2001:db8:10::2/64 dev veth2
          $ ip -6 nexthop add id 100 via 2001:db8:10::2 dev veth1
          $ ip -6 route add 2001:db8:11::1/128 nhid 100
      
          Create a pcpu entry on cpu 0:
          $ taskset -a -c 0 ip -6 route get 2001:db8:11::1
      
          Re-add the route entry:
          $ ip -6 ro del 2001:db8:11::1
          $ ip -6 route add 2001:db8:11::1/128 nhid 100
      
          Route get on cpu 0 returns the stale pcpu:
          $ taskset -a -c 0 ip -6 route get 2001:db8:11::1
          RTNETLINK answers: Network is unreachable
      
          While cpu 1 works:
          $ taskset -a -c 1 ip -6 route get 2001:db8:11::1
          2001:db8:11::1 from :: via 2001:db8:10::2 dev veth1 src 2001:db8:10::1 metric 1024 pref medium
      
      Conversion of FIB entries to work with external nexthop objects
      missed an important difference between IPv4 and IPv6 - how dst
      entries are invalidated when the FIB changes. IPv4 has a per-network
      namespace generation id (rt_genid) that is bumped on changes to the FIB.
      Checking if a dst_entry is still valid means comparing rt_genid in the
      rtable to the current value of rt_genid for the namespace.
      
      IPv6 also has a per network namespace counter, fib6_sernum, but the
      count is saved per fib6_node. With the per-node counter only dst_entries
      based on fib entries under the node are invalidated when changes are
      made to the routes - limiting the scope of invalidations. IPv6 uses a
      reference in the rt6_info, 'from', to track the corresponding fib entry
      used to create the dst_entry. When validating a dst_entry, the 'from'
      is used to backtrack to the fib6_node and compare its sernum to the
      cookie passed to the dst_check operation.
      
      With the inline format (nexthop definition inline with the fib6_info),
      dst_entries cached in the fib6_nh have a 1:1 correlation between fib
      entries, nexthop data and dst_entries. With external nexthops, IPv6
      looks more like IPv4 which means multiple fib entries across disparate
      fib6_nodes can all reference the same fib6_nh. That means validation
      of dst_entries based on external nexthops needs to use the IPv4 format
      - the per-network namespace counter.
      
      Add sernum to rt6_info and set it when creating a pcpu dst entry. Update
      rt6_get_cookie to return sernum if it is set and update dst_check for
      IPv6 to look for sernum set and base the check on it if so. Finally,
      rt6_get_pcpu_route needs to validate the cached entry before returning
      a pcpu entry (similar to the rt_cache_valid calls in __mkroute_input and
      __mkroute_output for IPv4).
      
      This problem only affects routes using the new, external nexthops.
      
      Thanks to the kbuild test robot for catching the IS_ENABLED needed
      around rt_genid_ipv6 before I sent this out.
      
      Fixes: 5b98324e ("ipv6: Allow routes to use nexthop objects")
      Reported-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
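The validation rule described in this commit can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel code: the names `model_node`, `model_dst`, `model_get_cookie` and `model_dst_check` are hypothetical, and only the choice between the per-node sernum and a namespace-wide counter is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a fib6_node: carries the per-node sernum. */
struct model_node {
    uint32_t sernum;
};

/* Hypothetical stand-in for rt6_info. */
struct model_dst {
    const struct model_node *from_node; /* node reached through 'from' */
    uint32_t sernum;                    /* non-zero: namespace counter snapshot */
};

/* Cookie handed to the caller: the snapshot if set, else the node's sernum. */
static uint32_t model_get_cookie(const struct model_dst *dst)
{
    if (dst->sernum)
        return dst->sernum;
    return dst->from_node ? dst->from_node->sernum : 0;
}

/*
 * Inline nexthop (sernum == 0): valid while the node's sernum still matches
 * the cookie. External nexthop (sernum != 0): valid while the snapshot still
 * matches the namespace-wide counter, as IPv4 does with rt_genid.
 */
static bool model_dst_check(const struct model_dst *dst, uint32_t cookie,
                            uint32_t ns_genid)
{
    if (dst->sernum)
        return dst->sernum == ns_genid;
    return dst->from_node && dst->from_node->sernum == cookie;
}
```

In this model, a change anywhere in the namespace bumps `ns_genid` and invalidates every entry that carries a snapshot, while entries without one are only invalidated by changes under their own node.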
  8. 29 Apr, 2020 2 commits
  9. 27 Apr, 2020 1 commit
  10. 30 Mar, 2020 1 commit
  11. 24 Mar, 2020 1 commit
  12. 13 Mar, 2020 1 commit
  13. 17 Feb, 2020 1 commit
    • ipv6: Fix nlmsg_flags when splitting a multipath route · afecdb37
      Benjamin Poirier committed
      When splitting an RTA_MULTIPATH request into multiple routes and adding the
      second and later components, we must not simply remove NLM_F_REPLACE but
      instead replace it by NLM_F_CREATE. Otherwise, it may look like the netlink
      message was malformed.
      
      For example,
      	ip route add 2001:db8::1/128 dev dummy0
      	ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0 \
      		nexthop via fe80::30:2 dev dummy0
      results in the following warnings:
      [ 1035.057019] IPv6: RTM_NEWROUTE with no NLM_F_CREATE or NLM_F_REPLACE
      [ 1035.057517] IPv6: NLM_F_CREATE should be set when creating new route
      
      This patch makes the nlmsg sequence look equivalent for __ip6_ins_rt() to
      what it would get if the multipath route had been added in multiple netlink
      operations:
      	ip route add 2001:db8::1/128 dev dummy0
      	ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0
      	ip route append 2001:db8::1/128 nexthop via fe80::30:2 dev dummy0
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: Benjamin Poirier <bpoirier@cumulusnetworks.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
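The flag adjustment described in this commit amounts to a two-step bit operation. A minimal sketch, assuming the flag values from `<linux/netlink.h>` (repeated here so the snippet builds without kernel headers); the helper name `later_component_flags` is hypothetical and the kernel performs this inline on the netlink header:

```c
#include <stdint.h>

/* Values as defined in <linux/netlink.h>. */
#ifndef NLM_F_REPLACE
#define NLM_F_REPLACE 0x100
#endif
#ifndef NLM_F_CREATE
#define NLM_F_CREATE  0x400
#endif

/*
 * Flags to use for the second and later components of a split
 * RTA_MULTIPATH request: NLM_F_REPLACE is swapped for NLM_F_CREATE
 * rather than simply dropped, so the message never ends up with
 * neither flag set.
 */
static uint16_t later_component_flags(uint16_t nlmsg_flags)
{
    nlmsg_flags &= (uint16_t)~NLM_F_REPLACE;
    nlmsg_flags |= NLM_F_CREATE;
    return nlmsg_flags;
}
```

With only `NLM_F_REPLACE` set on input, simply clearing it would leave the flags at zero, which is the condition behind the first warning quoted above.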
  14. 15 Jan, 2020 1 commit
  15. 25 Dec, 2019 4 commits
  16. 22 Nov, 2019 1 commit
  17. 21 Nov, 2019 1 commit
  18. 08 Nov, 2019 1 commit
    • ipv6: fixes rt6_probe() and fib6_nh->last_probe init · 1bef4c22
      Eric Dumazet committed
      While looking at a syzbot KCSAN report [1], I found multiple
      issues in this code :
      
      1) fib6_nh->last_probe has an initial value of 0.
      
         While probably okay on 64bit kernels, this causes an issue
         on 32bit kernels since the time_after(jiffies, 0 + interval)
         might be false ~24 days after boot (for HZ=1000)
      
      2) The data-race found by KCSAN
         I could use READ_ONCE() and WRITE_ONCE(), but we also can
         take the opportunity of not piling-up too many rt6_probe_deferred()
         works by using instead cmpxchg() so that only one cpu wins the race.
      
      [1]
      BUG: KCSAN: data-race in find_match / find_match
      
      write to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 1:
       rt6_probe net/ipv6/route.c:663 [inline]
       find_match net/ipv6/route.c:757 [inline]
       find_match+0x5bd/0x790 net/ipv6/route.c:733
       __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
       find_rr_leaf net/ipv6/route.c:852 [inline]
       rt6_select net/ipv6/route.c:896 [inline]
       fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
       ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
       ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
       fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
       ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
       ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
       ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
       ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
       inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
       inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
       tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
       tcp_xmit_probe_skb+0x19b/0x1d0 net/ipv4/tcp_output.c:3735
      
      read to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 0:
       rt6_probe net/ipv6/route.c:657 [inline]
       find_match net/ipv6/route.c:757 [inline]
       find_match+0x521/0x790 net/ipv6/route.c:733
       __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
       find_rr_leaf net/ipv6/route.c:852 [inline]
       rt6_select net/ipv6/route.c:896 [inline]
       fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
       ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
       ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
       fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
       ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
       ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
       ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
       ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
       inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
       inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 18894 Comm: udevd Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: cc3a86c8 ("ipv6: Change rt6_probe to take a fib6_nh")
      Fixes: f547fac6 ("ipv6: rate-limit probes for neighbourless routes")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
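Both issues listed in this commit can be illustrated in user space. The sketch below is a model under stated assumptions, not the kernel code: `time_after32` mirrors the kernel's `time_after()` comparison on a 32-bit counter, and `claim_probe` (a hypothetical name) uses a C11 compare-and-swap in place of the kernel's `cmpxchg()`.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit model of time_after(a, b): true if a is later than b. */
static bool time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Returns true only for the caller that wins the race to update
 * last_probe, so at most one deferred probe is queued per interval.
 */
static bool claim_probe(_Atomic uint32_t *last_probe, uint32_t now,
                        uint32_t interval)
{
    uint32_t last = atomic_load(last_probe);

    if (!time_after32(now, last + interval))
        return false;
    /* Fails if another CPU already replaced the value we read. */
    return atomic_compare_exchange_strong(last_probe, &last, now);
}
```

With `last_probe` left at 0 and a 32-bit counter that has advanced past 2^31 ticks (about 24.8 days at HZ=1000), `time_after32(now, 0 + interval)` evaluates to false even though the interval elapsed long ago, which is the first issue described above.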
  19. 06 Nov, 2019 1 commit
  20. 05 Nov, 2019 1 commit
  21. 17 Oct, 2019 1 commit
  22. 16 Oct, 2019 1 commit
    • blackhole_netdev: fix syzkaller reported issue · b0818f80
      Mahesh Bandewar committed
      While invalidating the dst, we assign blackhole_netdev instead of
      loopback device. However, this device does not have idev pointer
      and hence no ip6_ptr even if IPv6 is enabled. Possibly this has
      triggered the syzbot reported crash.
      
      The syzbot report does not have reproducer, however, this is the
      only device that doesn't have matching idev created.
      
      Crash instruction is :
      
      static inline bool ip6_ignore_linkdown(const struct net_device *dev)
      {
              const struct inet6_dev *idev = __in6_dev_get(dev);
      
              return !!idev->cnf.ignore_routes_with_linkdown; <= crash
      }
      
      Also, ipv6 always assumes the presence of an idev and never checks it
      for NULL (as in the code referenced above). So add an idev
      for the blackhole_netdev to avoid this class of crashes in the future.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 12 Sep, 2019 1 commit
    • ipv6: Don't use dst gateway directly in ip6_confirm_neigh() · cbfd6891
      Stefano Brivio committed
      This is the equivalent of commit 2c6b55f4 ("ipv6: fix neighbour
      resolution with raw socket") for ip6_confirm_neigh(): we can send a
      packet with MSG_CONFIRM on a raw socket for a connected route, so the
      gateway would be :: here, and we should pick the next hop using
      rt6_nexthop() instead.
      
      This was found by code review and, to the best of my knowledge, doesn't
      actually fix a practical issue: the destination address from the packet
      is not considered while confirming a neighbour, as ip6_confirm_neigh()
      calls choose_neigh_daddr() without passing the packet, so there are no
      similar issues as the one fixed by said commit.
      
      A possible source of issues with the existing implementation might come
      from the fact that, if we have a cached dst, we won't consider it,
      while rt6_nexthop() takes care of that. I might just not be creative
      enough to find a practical problem here: the only way to affect this
      with cached routes is to have one coming from an ICMPv6 redirect, but
      if the next hop is a directly connected host, there should be no
      topology for which a redirect applies here, and tests with redirected
      routes show no differences for MSG_CONFIRM (and MSG_PROBE) packets on
      raw sockets destined to a directly connected host.
      
      However, directly using the dst gateway here is not consistent anymore
      with neighbour resolution, and, in general, as we want the next hop,
      using rt6_nexthop() looks like the only sane way to fetch it.
      Reported-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 07 Sep, 2019 1 commit
  25. 05 Sep, 2019 3 commits
    • net: Properly update v4 routes with v6 nexthop · 7bdf4de1
      Donald Sharp committed
      When creating a v4 route that uses a v6 nexthop from a nexthop group,
      allow the kernel to properly send the nexthop as v6 via the RTA_VIA
      attribute.
      
      Broken behavior:
      
      $ ip nexthop add via fe80::9 dev eth0
      $ ip nexthop show
      id 1 via fe80::9 dev eth0 scope link
      $ ip route add 4.5.6.7/32 nhid 1
      $ ip route show
      default via 10.0.2.2 dev eth0
      4.5.6.7 nhid 1 via 254.128.0.0 dev eth0
      10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
      $
      
      Fixed behavior:
      
      $ ip nexthop add via fe80::9 dev eth0
      $ ip nexthop show
      id 1 via fe80::9 dev eth0 scope link
      $ ip route add 4.5.6.7/32 nhid 1
      $ ip route show
      default via 10.0.2.2 dev eth0
      4.5.6.7 nhid 1 via inet6 fe80::9 dev eth0
      10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
      $
      
      v2, v3: Addresses code review comments from David Ahern
      
      Fixes: dcb1ecb5 ("ipv4: Prepare for fib6_nh from a nexthop object")
      Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
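The bogus gateway `254.128.0.0` in the broken output is consistent with the first four bytes of `fe80::9` (0xfe, 0x80, 0x00, 0x00) being printed as an IPv4 address. A minimal sketch of that misreading, assuming a POSIX system that provides `inet_pton` and `inet_ntop`; the helper name `v6_prefix_as_v4` is hypothetical and does not appear in the kernel or iproute2:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Formats the first four bytes of an IPv6 address as if they were an
 * IPv4 address. 'out' should hold at least INET_ADDRSTRLEN bytes.
 * Returns 0 on success, -1 if parsing or formatting fails.
 */
static int v6_prefix_as_v4(const char *v6_text, char *out, socklen_t out_len)
{
    struct in6_addr v6;
    struct in_addr v4;

    if (inet_pton(AF_INET6, v6_text, &v6) != 1)
        return -1;
    /* Copy only the leading 4 of the 16 address bytes. */
    memcpy(&v4, v6.s6_addr, sizeof(v4));
    return inet_ntop(AF_INET, &v4, out, out_len) ? 0 : -1;
}
```

Calling the helper with `"fe80::9"` yields the same dotted quad that the broken `ip route show` output reports, which is why the fix sends the gateway through RTA_VIA with its address family attached.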
    • ipv6: Fix RTA_MULTIPATH with nexthop objects · 4255ff05
      David Ahern committed
      A change to the core nla helpers was missed during the push of
      the nexthop changes. rt6_fill_node_nexthop should be calling
      nla_nest_start_noflag not nla_nest_start. Currently, iproute2
      does not print multipath data because of parsing issues with
      the attribute.
      
      Fixes: f88d8ea6 ("ipv6: Plumb support for nexthop object in a fib6_info")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others) · d55a2e37
      Maciej Żenczykowski committed
      There is a subtle change in behaviour introduced by:
        commit c7a1ce39
        'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'
      
      Before that patch /proc/net/ipv6_route includes:
      00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo
      
      Afterwards /proc/net/ipv6_route includes:
      00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo
      
      i.e., the above commit causes the ::1/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).
      
      AFAICT, this is incorrect since these routes are *not* coming from RAs.
      
      As such, this patch restores the old behaviour.
      
      Fixes: c7a1ce39 ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create")
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
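The difference between the two flag words quoted from /proc/net/ipv6_route can be checked with a single XOR. A small sketch, assuming the route flag values from the kernel's uapi headers (repeated here so the snippet builds without them); the helper name `flags_diff` is hypothetical:

```c
#include <stdint.h>

/* Route flag values as defined in the kernel uapi headers. */
#ifndef RTF_UP
#define RTF_UP        0x00000001u
#endif
#ifndef RTF_ADDRCONF
#define RTF_ADDRCONF  0x00040000u
#endif
#ifndef RTF_NONEXTHOP
#define RTF_NONEXTHOP 0x00200000u
#endif
#ifndef RTF_LOCAL
#define RTF_LOCAL     0x80000000u
#endif

/* Bits that differ between two flag words. */
static uint32_t flags_diff(uint32_t before, uint32_t after)
{
    return before ^ after;
}
```

Here `flags_diff(0x80200001, 0x80240001)` is exactly `RTF_ADDRCONF`, and the earlier value `0x80200001` decomposes into `RTF_LOCAL | RTF_NONEXTHOP | RTF_UP`.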
  26. 06 Aug, 2019 2 commits
  27. 19 Jul, 2019 1 commit
  28. 18 Jul, 2019 1 commit
  29. 09 Jul, 2019 1 commit
  30. 02 Jul, 2019 1 commit