1. 05 Nov, 2016 1 commit
    • net: inet: Support UID-based routing in IP protocols. · e2d118a1
      Committed by Lorenzo Colitti
      - Use the UID in routing lookups made by protocol connect() and
        sendmsg() functions.
      - Make sure that routing lookups triggered by incoming packets
        (e.g., Path MTU discovery) take the UID of the socket into
        account.
      - For packets not associated with a userspace socket (e.g., ping
        replies), use UID 0 inside the user namespace corresponding to
        the network namespace the socket belongs to. This allows
        all namespaces to apply routing and iptables rules to
        kernel-originated traffic in that namespace by matching UID 0.
        This is better than using the UID of the kernel socket that is
        sending the traffic, because the UID of kernel sockets created
        at namespace creation time (e.g., the per-processor ICMP and
        TCP sockets) is the UID of the user that created the socket,
        which might not be mapped in the namespace.
      
      Tested: compiles allnoconfig, allyesconfig, allmodconfig
      Tested: https://android-review.googlesource.com/253302
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
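The UID selection rule described in this commit message can be modelled in a few lines. This is a userspace sketch, not kernel code: `struct model_net`, `struct model_sock` and `routing_uid()` are hypothetical names invented for illustration, and the model assumes the namespace's UID 0 has already been resolved to a plain `uid_t`.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins for kernel objects; not the kernel's real types. */
struct model_net {
    uid_t ns_root_uid;   /* what UID 0 of the owning user namespace maps to */
};

struct model_sock {
    uid_t uid;           /* UID of the user that owns the socket */
};

/* Pick the UID for a routing lookup: the socket owner's UID when a
 * userspace socket is involved, otherwise UID 0 of the namespace. */
static uid_t routing_uid(const struct model_net *net,
                         const struct model_sock *sk)
{
    return sk ? sk->uid : net->ns_root_uid;
}
```

Passing `NULL` for the socket stands in for kernel-originated traffic such as ping replies.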
  2. 11 Sep, 2016 1 commit
  3. 12 Apr, 2016 1 commit
    • net: vrf: Fix dst reference counting · 9ab179d8
      Committed by David Ahern
      Vivek reported a kernel exception deleting a VRF with an active
      connection through it. The root cause is that the socket has a cached
      reference to a dst that is destroyed. Converting the dst_destroy to
      dst_release and letting proper reference counting kick in does not
      work as the dst has a reference to the device which needs to be released
      as well.
      
      I talked to Hannes about this at netdev and he pointed out the ipv4 and
      ipv6 dst handling has dst_ifdown for just this scenario. Rather than
      continuing with the reinvented dst wheel in VRF just remove it and
      leverage the ipv4 and ipv6 versions.
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 08 Apr, 2016 1 commit
  5. 17 Feb, 2016 1 commit
  6. 05 Jan, 2016 1 commit
    • net: Propagate lookup failure in l3mdev_get_saddr to caller · b5bdacf3
      Committed by David Ahern
      Commands run in a vrf context are not failing as expected on a route lookup:
          root@kenny:~# ip ro ls table vrf-red
          unreachable default
      
          root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
          ping: Warning: source address might be selected on device other than vrf-red.
          PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.
      
          --- 10.100.1.254 ping statistics ---
          2 packets transmitted, 0 received, 100% packet loss, time 999ms
      
      Since the vrf table does not have a route for 10.100.1.254 the ping
      should have failed. The saddr lookup causes a full VRF table lookup.
      Propagating a lookup failure to the user allows the command to fail as
      expected:
      
          root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
          connect: No route to host
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
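The pattern this commit describes, returning the lookup error from the source-address helper so the caller can fail, might look roughly like the following userspace model. All function names here are hypothetical; the table lookup is hard-wired to fail with `-EHOSTUNREACH`, mirroring the "unreachable default" route and the "No route to host" message in the example above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a VRF table that only holds "unreachable default". */
static int model_table_lookup(uint32_t daddr, uint32_t *saddr)
{
    (void)daddr;
    (void)saddr;
    return -EHOSTUNREACH;   /* what an unreachable route reports */
}

/* Returns 0 and fills *saddr on success, or a negative errno on failure.
 * Returning the error instead of dropping it is the point of the sketch. */
static int model_get_saddr(uint32_t daddr, uint32_t *saddr)
{
    int err = model_table_lookup(daddr, saddr);

    if (err < 0)
        return err;
    return 0;
}

/* The caller stops at the first failure and hands the error upward. */
static int model_connect(uint32_t daddr)
{
    uint32_t saddr = 0;
    int err = model_get_saddr(daddr, &saddr);

    if (err < 0)
        return err;
    /* ... a real caller would continue the connect using saddr ... */
    return 0;
}
```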
  7. 07 Oct, 2015 2 commits
  8. 05 Oct, 2015 1 commit
  9. 30 Sep, 2015 2 commits
  10. 26 Sep, 2015 1 commit
  11. 18 Sep, 2015 1 commit
  12. 16 Sep, 2015 1 commit
  13. 02 Sep, 2015 1 commit
  14. 21 Aug, 2015 1 commit
  15. 14 Aug, 2015 3 commits
  16. 22 Jul, 2015 1 commit
  17. 16 Jan, 2015 1 commit
    • ipv4: per cpu uncached list · 5055c371
      Committed by Eric Dumazet
      RAW sockets with hdrinc suffer from contention on rt_uncached_lock
      spinlock.
      
      One solution is to use percpu lists, since most routes are destroyed
      by the cpu that created them.
      
      It is unclear why we even have to put these routes in uncached_list,
      as all outgoing packets should be freed when a device is dismantled.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Fixes: caacf05e ("ipv4: Properly purge netdev references on uncached routes.")
      Signed-off-by: David S. Miller <davem@davemloft.net>
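The idea of replacing one globally locked list with one list per CPU can be illustrated with a small userspace model. This is only a sketch under assumptions: the names are hypothetical, `pthread_mutex_t` stands in for the kernel spinlock, and the CPU number is passed in by the caller rather than taken from the running CPU.

```c
#include <pthread.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4   /* assumed CPU count for the model */

struct model_route;

/* One list and one lock per CPU instead of a single global pair. */
struct model_uncached_list {
    pthread_mutex_t lock;
    struct model_route *head;
    size_t len;
};

struct model_route {
    struct model_route *next, **pprev;
    struct model_uncached_list *owner;   /* list this route was added to */
};

static struct model_uncached_list model_lists[MODEL_NR_CPUS] = {
    { PTHREAD_MUTEX_INITIALIZER, NULL, 0 },
    { PTHREAD_MUTEX_INITIALIZER, NULL, 0 },
    { PTHREAD_MUTEX_INITIALIZER, NULL, 0 },
    { PTHREAD_MUTEX_INITIALIZER, NULL, 0 },
};

/* Add the route to the list of the CPU that created it. */
static void model_add_uncached(struct model_route *rt, unsigned int cpu)
{
    struct model_uncached_list *ul = &model_lists[cpu % MODEL_NR_CPUS];

    rt->owner = ul;
    pthread_mutex_lock(&ul->lock);
    rt->next = ul->head;
    rt->pprev = &ul->head;
    if (ul->head)
        ul->head->pprev = &rt->next;
    ul->head = rt;
    ul->len++;
    pthread_mutex_unlock(&ul->lock);
}

/* Remove the route, locking only the list it lives on. */
static void model_del_uncached(struct model_route *rt)
{
    struct model_uncached_list *ul = rt->owner;

    if (!ul)
        return;
    pthread_mutex_lock(&ul->lock);
    *rt->pprev = rt->next;
    if (rt->next)
        rt->next->pprev = rt->pprev;
    ul->len--;
    pthread_mutex_unlock(&ul->lock);
    rt->owner = NULL;
}
```

Because each route remembers its owning list, a route destroyed on a different CPU still takes the correct lock, while the common case (created and destroyed on the same CPU) never touches another CPU's lock.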
  18. 25 Mar, 2014 1 commit
  19. 14 Jan, 2014 1 commit
    • ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing · f87c10a8
      Committed by Hannes Frederic Sowa
      While forwarding we should not use the protocol path mtu to calculate
      the mtu for a forwarded packet but instead use the interface mtu.
      
      We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was
      introduced for multicast forwarding. But as it does not conflict with
      our usage in unicast code path it is perfect for reuse.
      
      I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
      along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular
      dependencies because of IPSKB_FORWARDED.
      
      Because someone might have written software that probes
      destinations manually and expects the kernel to honour those path mtus,
      I introduced a new per-namespace "ip_forward_use_pmtu" knob so someone
      can disable this new behaviour. We also still use mtus which are locked on a
      route for forwarding.
      
      The reason for this change is that path mtu information can be injected
      into the kernel via e.g. icmp_err protocol handler without verification
      of local sockets. As such, this could cause the IPv4 forwarding path to
      wrongfully emit fragmentation needed notifications or start to fragment
      packets along a path.
      
      Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED
      won't be set and further fragmentation logic will use the path mtu to
      determine the fragmentation size. They also recheck packet size with
      help of path mtu discovery and report appropriate errors.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: John Heffner <johnwheffner@gmail.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
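The MTU choice described in this commit message can be summarised as a small decision function. The following is a userspace model rather than the kernel helper itself: `struct model_dst` and `model_mtu_maybe_forward()` are hypothetical, the knob is passed in as a plain boolean, and any upper bound the kernel may apply to the interface MTU is left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a route's MTU-related state. */
struct model_dst {
    unsigned int path_mtu;   /* MTU learned via path MTU discovery */
    unsigned int dev_mtu;    /* MTU of the outgoing interface */
    bool mtu_locked;         /* MTU was locked on the route */
};

/* Forwarded packets use the interface MTU unless the
 * "ip_forward_use_pmtu" knob is on or the route MTU is locked;
 * locally generated packets keep using the path MTU. */
static unsigned int model_mtu_maybe_forward(const struct model_dst *dst,
                                            bool forwarding,
                                            bool fwd_use_pmtu)
{
    if (fwd_use_pmtu || dst->mtu_locked || !forwarding)
        return dst->path_mtu;
    return dst->dev_mtu;
}
```

With a spoofed path MTU of 576 on a 1500-byte interface, a forwarded packet is sized against 1500 by default, so the injected value no longer triggers fragmentation on the forwarding path.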
  20. 06 Dec, 2013 1 commit
  21. 06 Nov, 2013 1 commit
    • ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE · 482fc609
      Committed by Hannes Frederic Sowa
      Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery,
      their sockets won't accept and install new path mtu information and they
      will always use the interface mtu for outgoing packets. It is guaranteed
      that the packet is not fragmented locally. But we won't set the DF-Flag
      on the outgoing frames.
      
      Florian Weimer had the idea to use this flag to ensure DNS servers are
      never generating outgoing fragments. They may well be fragmented on the
      path, but the server never stores or uses path mtu values, which could
      well be forged in an attack.
      
      (The root of the problem with path MTU discovery is that there is
      no reliable way to authenticate ICMP Fragmentation Needed But DF Set
      messages because they are sent from intermediate routers with their
      source addresses, and the ICMP payload will not always contain sufficient
      information to identify a flow.)
      
      Recent research in the DNS community showed that it is possible to
      implement an attack where DNS cache poisoning is feasible by spoofing
      fragments. This work was done by Amir Herzberg and Haya Shulman:
      <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>
      
      This issue was previously discussed among the DNS community, e.g.
      <http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>,
      without leading to fixes.
      
      This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode
      regarding local fragmentation with UFO/CORK" for the enforcement of the
      non-fragmentable checks. If other users than ip_append_page/data should
      use this semantic too, we have to add a new flag to IPCB(skb)->flags to
      suppress local fragmentation and check for this in ip_finish_output.
      
      Many thanks to Florian Weimer for the idea and feedback while implementing
      this patch.
      
      Cc: David S. Miller <davem@davemloft.net>
      Suggested-by: Florian Weimer <fweimer@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
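From userspace, the mode introduced here is selected through the existing `IP_MTU_DISCOVER` socket option. A minimal sketch follows, assuming a Linux system; the helper name `use_interface_mtu()` is made up for illustration, and the fallback `#define` is only for C library headers that predate the constant (4 is the value in the kernel's `linux/in.h`).

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_PMTUDISC_INTERFACE
#define IP_PMTUDISC_INTERFACE 4   /* fallback for older libc headers */
#endif

/* Ask the kernel to always use the interface MTU on this socket and to
 * ignore path MTU information. Returns 0 on success, or -1 with errno
 * set, exactly as setsockopt() does. */
static int use_interface_mtu(int fd)
{
    int mode = IP_PMTUDISC_INTERFACE;

    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}
```

A DNS server would typically call this once on its UDP socket right after `socket()`. On kernels without this commit the call is expected to fail, so the return value should be checked.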
  22. 18 Oct, 2013 1 commit
  23. 29 Sep, 2013 1 commit
    • ipv4: processing ancillary IP_TOS or IP_TTL · aa661581
      Committed by Francesco Fusco
      If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
      packets with the specified TTL or TOS overriding the socket values specified
      with the traditional setsockopt().
      
      The struct inet_cork stores the values of TOS, TTL and priority that are
      passed through the struct ipcm_cookie. If there are user-specified TOS
      (tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
      used to override the per-socket values. In case of TOS also the priority
      is changed accordingly.
      
      Two helper functions get_rttos and get_rtconn_flags are defined to take
      into account the presence of a user specified TOS value when computing
      RT_TOS and RT_CONN_FLAGS.
      Signed-off-by: Francesco Fusco <ffusco@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
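On the sending side, per-packet TTL and TOS values are passed as control messages in the `msghdr` given to `sendmsg()`. The sketch below only builds the ancillary data; `attach_ttl_tos()` is a hypothetical helper, and it assumes both values are passed as `int`.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill msg->msg_control with two control messages, IP_TTL and IP_TOS.
 * buf must hold at least 2 * CMSG_SPACE(sizeof(int)) bytes and be
 * suitably aligned for struct cmsghdr. Returns the control length
 * used, or 0 if the buffer is too small. */
static size_t attach_ttl_tos(struct msghdr *msg, void *buf, size_t buflen,
                             int ttl, int tos)
{
    size_t need = 2 * CMSG_SPACE(sizeof(int));
    struct cmsghdr *cm;

    if (buflen < need)
        return 0;
    memset(buf, 0, need);            /* CMSG_NXTHDR expects zeroed space */
    msg->msg_control = buf;
    msg->msg_controllen = need;

    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = IP_TTL;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &ttl, sizeof(ttl));

    cm = CMSG_NXTHDR(msg, cm);
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = IP_TOS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &tos, sizeof(tos));

    return need;
}
```

After filling in `msg_name` and `msg_iov` as usual, the caller passes the same `msghdr` to `sendmsg()`; the values then apply to that one send and leave the socket's `setsockopt()` settings untouched.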
  24. 23 Sep, 2013 1 commit
  25. 14 Aug, 2013 1 commit
  26. 04 Nov, 2012 1 commit
  27. 09 Oct, 2012 1 commit
    • ipv4: introduce rt_uses_gateway · 155e8336
      Committed by Julian Anastasov
      Add a new flag to remember when a route is via a gateway.
      We will use it to allow rt_gateway to contain the address of a
      directly connected host for the cases when DST_NOCACHE is
      used or when the NH exception caches a per-destination route
      without the DST_NOCACHE flag, i.e. when routes are not used for
      other destinations. This way we force neighbour
      resolution to work with the routed destination while we
      can use a different address in the packet, a feature needed
      for IPVS-DR, where the original packet for the virtual IP is routed
      via a route to the real IP.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  28. 19 Sep, 2012 1 commit
  29. 08 Sep, 2012 1 commit
  30. 01 Aug, 2012 1 commit
  31. 27 Jul, 2012 1 commit
  32. 24 Jul, 2012 1 commit
  33. 21 Jul, 2012 4 commits
    • ipv4: Kill rt->fi · 2860583f
      Committed by David S. Miller
      It's not really needed.
      
      We only grabbed a reference to the fib_info for the sake of fib_info
      local metrics.
      
      However, fib_info objects are freed using RCU, as are therefore their
      private metrics (if any).
      
      We would have triggered a route cache flush if we eliminated a
      reference to a fib_info object in the routing tables.
      
      Therefore, any existing cached routes will first check and see that
      they have been invalidated before an errant reference to these
      metric values would occur.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Turn rt->rt_route_iif into rt->rt_is_input. · 9917e1e8
      Committed by David S. Miller
      That is this value's only use, as a boolean to indicate whether
      a route is an input route or not.
      
      So implement it that way, using a u16 gap present in the struct
      already.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Kill rt->rt_oif · 4fd551d7
      Committed by David S. Miller
      Never actually used.
      
      It was being set on output routes to the original OIF specified in the
      flow key used for the lookup.
      
      Adjust the only user, ipmr_rt_fib_lookup(), for greater correctness of
      the flowi4_oif and flowi4_iif values, thanks to feedback from Julian
      Anastasov.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Adjust semantics of rt->rt_gateway. · f8126f1d
      Committed by David S. Miller
      In order to allow prefixed routes, we have to adjust how rt_gateway
      is set and interpreted.
      
      The new interpretation is:
      
      1) rt_gateway == 0, destination is on-link, nexthop is iph->daddr
      
      2) rt_gateway != 0, destination requires a nexthop gateway
      
      Abstract the fetching of the proper nexthop value using a new
      inline helper, rt_nexthop(), as suggested by Joe Perches.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
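The two-case interpretation of rt_gateway maps directly onto a small helper. The following is a userspace model of that rule, not the kernel's definition: `struct model_rtable` and `model_rt_nexthop()` are hypothetical, and addresses are plain `uint32_t` values.

```c
#include <stdint.h>

/* Hypothetical, reduced route entry holding only the gateway field. */
struct model_rtable {
    uint32_t rt_gateway;   /* 0 means the destination is on-link */
};

/* Case 1: rt_gateway == 0, the nexthop is the packet's destination.
 * Case 2: rt_gateway != 0, the nexthop is the gateway. */
static uint32_t model_rt_nexthop(const struct model_rtable *rt,
                                 uint32_t daddr)
{
    if (rt->rt_gateway)
        return rt->rt_gateway;
    return daddr;
}
```

Keeping the choice behind one helper means callers never need to test rt_gateway themselves.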