1. 13 Mar, 2020 1 commit
  2. 04 Nov, 2019 1 commit
    • net: icmp6: provide input address for traceroute6 · fac6fce9
      Francesco Ruggeri committed
      traceroute6 output can be confusing, in that it shows the address
      that a router would use to reach the sender, rather than the address
      the packet used to reach the router.
      Consider this case:
      
              ------------------------ N2
               |                    |
             ------              ------  N3  ----
             | R1 |              | R2 |------|H2|
             ------              ------      ----
               |                    |
              ------------------------ N1
                        |
                       ----
                       |H1|
                       ----
      
      where H1's default route is through R1, and R1's default route is
      through R2 over N2.
      traceroute6 from H1 to H2 shows R2's address on N1 rather than on N2.
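
The address choice described above can be modeled in a few lines of Python. This is only a sketch of the selection rule, not the kernel code: the names `icmp_source`, `egress_net` and the `use_input_address` flag are hypothetical, and the addresses are taken from the reproduction script in this commit message.

```python
import ipaddress

# Networks and R2's interface addresses, as listed in the reproduction script.
NETS = {
    "N1": ipaddress.ip_network("2000:101::/64"),
    "N2": ipaddress.ip_network("2000:102::/64"),
    "N3": ipaddress.ip_network("2000:103::/64"),
}
R2_IFACES = {
    "N1": ipaddress.ip_address("2000:101::2"),
    "N2": ipaddress.ip_address("2000:102::2"),
    "N3": ipaddress.ip_address("2000:103::2"),
}
H1 = ipaddress.ip_address("2000:101::3")


def egress_net(dst):
    """Hypothetical helper: the directly connected network containing dst."""
    for name, net in NETS.items():
        if dst in net:
            return name
    raise LookupError(f"no connected network for {dst}")


def icmp_source(ifaces, ingress, sender, use_input_address):
    """Pick the source address of the ICMPv6 error sent back to sender.

    use_input_address=False models the old behavior (address on the
    interface used to reach the sender); True models the patched behavior
    (address on the interface the packet arrived on).
    """
    if use_input_address:
        return ifaces[ingress]
    return ifaces[egress_net(sender)]


# H1's probe reaches R2 over N2 (via R1), but R2 reaches H1 directly over N1.
print(icmp_source(R2_IFACES, "N2", H1, use_input_address=False))
print(icmp_source(R2_IFACES, "N2", H1, use_input_address=True))
```

The two printed addresses correspond to hop 2 in the traceroute6 outputs quoted in this message, without and with the patch.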
      
      The script below can be used to reproduce this scenario.
      
      traceroute6 output without this patch:
      
      traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets
       1  2000:101::1 (2000:101::1)  0.036 ms  0.008 ms  0.006 ms
       2  2000:101::2 (2000:101::2)  0.011 ms  0.008 ms  0.007 ms
       3  2000:103::4 (2000:103::4)  0.013 ms  0.010 ms  0.009 ms
      
      traceroute6 output with this patch:
      
      traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets
       1  2000:101::1 (2000:101::1)  0.056 ms  0.019 ms  0.006 ms
       2  2000:102::2 (2000:102::2)  0.013 ms  0.008 ms  0.008 ms
       3  2000:103::4 (2000:103::4)  0.013 ms  0.009 ms  0.009 ms
      
      #!/bin/bash
      #
      #        ------------------------ N2
      #         |                    |
      #       ------              ------  N3  ----
      #       | R1 |              | R2 |------|H2|
      #       ------              ------      ----
      #         |                    |
      #        ------------------------ N1
      #                  |
      #                 ----
      #                 |H1|
      #                 ----
      #
      # N1: 2000:101::/64
      # N2: 2000:102::/64
      # N3: 2000:103::/64
      #
      # R1's host part of address: 1
      # R2's host part of address: 2
      # H1's host part of address: 3
      # H2's host part of address: 4
      #
      # For example:
      # the IPv6 address of R1's interface on N2 is 2000:102::1/64
      #
      # Nets are implemented by macvlan interfaces (bridge mode) over
      # dummy interfaces.
      #
      
      # Create net namespaces
      ip netns add host1
      ip netns add host2
      ip netns add rtr1
      ip netns add rtr2
      
      # Create nets
      ip link add net1 type dummy; ip link set net1 up
      ip link add net2 type dummy; ip link set net2 up
      ip link add net3 type dummy; ip link set net3 up
      
      # Add interfaces to net1, move them to their namespaces
      ip link add link net1 dev host1net1 type macvlan mode bridge
      ip link set host1net1 netns host1
      ip link add link net1 dev rtr1net1 type macvlan mode bridge
      ip link set rtr1net1 netns rtr1
      ip link add link net1 dev rtr2net1 type macvlan mode bridge
      ip link set rtr2net1 netns rtr2
      
      # Add interfaces to net2, move them to their namespaces
      ip link add link net2 dev rtr1net2 type macvlan mode bridge
      ip link set rtr1net2 netns rtr1
      ip link add link net2 dev rtr2net2 type macvlan mode bridge
      ip link set rtr2net2 netns rtr2
      
      # Add interfaces to net3, move them to their namespaces
      ip link add link net3 dev rtr2net3 type macvlan mode bridge
      ip link set rtr2net3 netns rtr2
      ip link add link net3 dev host2net3 type macvlan mode bridge
      ip link set host2net3 netns host2
      
      # Configure interfaces and routes in host1
      ip netns exec host1 ip link set lo up
      ip netns exec host1 ip link set host1net1 up
      ip netns exec host1 ip -6 addr add 2000:101::3/64 dev host1net1
      ip netns exec host1 ip -6 route add default via 2000:101::1
      
      # Configure interfaces and routes in rtr1
      ip netns exec rtr1 ip link set lo up
      ip netns exec rtr1 ip link set rtr1net1 up
      ip netns exec rtr1 ip -6 addr add 2000:101::1/64 dev rtr1net1
      ip netns exec rtr1 ip link set rtr1net2 up
      ip netns exec rtr1 ip -6 addr add 2000:102::1/64 dev rtr1net2
      ip netns exec rtr1 ip -6 route add default via 2000:102::2
      ip netns exec rtr1 sysctl net.ipv6.conf.all.forwarding=1
      
      # Configure interfaces and routes in rtr2
      ip netns exec rtr2 ip link set lo up
      ip netns exec rtr2 ip link set rtr2net1 up
      ip netns exec rtr2 ip -6 addr add 2000:101::2/64 dev rtr2net1
      ip netns exec rtr2 ip link set rtr2net2 up
      ip netns exec rtr2 ip -6 addr add 2000:102::2/64 dev rtr2net2
      ip netns exec rtr2 ip link set rtr2net3 up
      ip netns exec rtr2 ip -6 addr add 2000:103::2/64 dev rtr2net3
      ip netns exec rtr2 sysctl net.ipv6.conf.all.forwarding=1
      
      # Configure interfaces and routes in host2
      ip netns exec host2 ip link set lo up
      ip netns exec host2 ip link set host2net3 up
      ip netns exec host2 ip -6 addr add 2000:103::4/64 dev host2net3
      ip netns exec host2 ip -6 route add default via 2000:103::2
      
      # Ping host2 from host1
      ip netns exec host1 ping6 -c5 2000:103::4
      
      # Traceroute host2 from host1
      ip netns exec host1 traceroute6 2000:103::4
      
      # Delete nets
      ip link del net3
      ip link del net2
      ip link del net1
      
      # Delete namespaces
      ip netns del rtr2
      ip netns del rtr1
      ip netns del host2
      ip netns del host1
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Original-patch-by: Honggang Xu <hxu@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 02 Jul, 2019 1 commit
  4. 13 Jun, 2019 1 commit
  5. 04 Jun, 2019 1 commit
  6. 31 May, 2019 1 commit
  7. 19 Apr, 2019 1 commit
    • ipv6: Add rate limit mask for ICMPv6 messages · 0bc19985
      Stephen Suryaputra committed
      To make ICMPv6 closer to ICMPv4, add a ratemask parameter. Since the
      ICMPv6 message types use larger numeric values, a simple bitmask doesn't
      fit, so I use a large bitmap. The input and output are in the form of a
      list of ranges. Set the default to rate limit all error messages but
      Packet Too Big. For Packet Too Big, use ratemask instead of the
      hard-coded behavior.
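
A rough Python model of a bitmap that is read and written as a list of ranges is shown below. It is a sketch only: `parse_ranges`, `format_ranges` and `is_limited` are hypothetical names, not kernel functions. The example string `0-1,3-127` is derived from the description above under the assumption that ICMPv6 error messages are types 0 to 127 and Packet Too Big is type 2 (RFC 4443).

```python
def parse_ranges(text, nbits=256):
    """Turn a range list such as "0-1,3-127" into an integer bitmap."""
    bitmap = 0
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        lo_text, _, hi_text = part.partition("-")
        lo = int(lo_text)
        hi = int(hi_text) if hi_text else lo
        if not 0 <= lo <= hi < nbits:
            raise ValueError(f"bad range: {part!r}")
        for bit in range(lo, hi + 1):
            bitmap |= 1 << bit
    return bitmap


def format_ranges(bitmap, nbits=256):
    """Render the bitmap back into the compact range-list form."""
    parts = []
    bit = 0
    while bit < nbits:
        if (bitmap >> bit) & 1:
            start = bit
            # Extend the run while the next bit is also set.
            while bit + 1 < nbits and (bitmap >> (bit + 1)) & 1:
                bit += 1
            parts.append(str(start) if start == bit else f"{start}-{bit}")
        bit += 1
    return ",".join(parts)


def is_limited(bitmap, icmp_type):
    """True if messages of this ICMPv6 type are subject to rate limiting."""
    return bool((bitmap >> icmp_type) & 1)


# Assumed default: every error type (0-127) except Packet Too Big (type 2).
mask = parse_ranges("0-1,3-127")
print(is_limited(mask, 1), is_limited(mask, 2), is_limited(mask, 128))
print(format_ranges(mask))
```

With a 256-bit map every possible ICMPv6 type (an 8-bit field) has its own bit, which is why a single machine word is not enough.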
      
      There are functions where icmpv6_xrlim_allow() and icmpv6_global_allow()
      aren't called. This patch only adds them to icmpv6_echo_reply().
      
      Rate limiting error messages is mandated by RFC 4443 but RFC 4890 says
      that it is also acceptable to rate limit informational messages. Thus,
      I removed the current hard-coded behavior of icmpv6_mask_allow() that
      doesn't rate limit informational messages.
      
      v2: Add dummy function proc_do_large_bitmap() if CONFIG_PROC_SYSCTL
          isn't defined, expand the description in ip-sysctl.txt and remove
          unnecessary conditional before kfree().
      v3: Inline the bitmap instead of dynamically allocating it. A pointer
          to it is still needed because of the way proc_do_large_bitmap() works.
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 21 Mar, 2019 1 commit
  9. 20 Mar, 2019 1 commit
  10. 25 Feb, 2019 2 commits
  11. 05 Jan, 2019 1 commit
    • ipv6: make icmp6_send() robust against null skb->dev · 8d933670
      Eric Dumazet committed
      syzbot was able to crash one host with the following stack trace :
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8
      RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline]
      RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426
       icmpv6_send
       smack_socket_sock_rcv_skb
       security_sock_rcv_skb
       sk_filter_trim_cap
       __sk_receive_skb
       dccp_v6_do_rcv
       release_sock
      
      This is because an RX packet found the socket owned by the user and
      was stored in the socket backlog. Before leaving the RCU-protected
      section, skb->dev was cleared in __sk_receive_skb(). When the socket
      backlog was finally handled at release_sock() time, the skb was fed to
      smack_socket_sock_rcv_skb() and then icmp6_send().
      
      We could fix the bug in smack_socket_sock_rcv_skb(), or simply
      make icmp6_send() more robust against such a possibility.
      
      In the future we might provide the net pointer to icmp6_send()
      instead of inferring it.
      
      Fixes: d66a8acb ("Smack: Inform peer that IPv6 traffic has been blocked")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Piotr Sawicki <p.sawicki2@partner.samsung.com>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 09 Nov, 2018 1 commit
    • net: Convert protocol error handlers from void to int · 32bbd879
      Stefano Brivio committed
      We'll need this to handle ICMP errors for tunnels without a sending socket
      (i.e. FoU and GUE). There, we might have to look up different types of IP
      tunnels, registered as network protocols, before we get a match, so we
      want this for the error handlers of IPPROTO_IPIP and IPPROTO_IPV6 in both
      inet_protos and inet6_protos. These error codes will be used in the next
      patch.
      
      For consistency, return sensible error codes in protocol error handlers
      whenever handlers can't handle errors because, even if valid, they don't
      match a protocol or any of its states.
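
Why a return value helps can be illustrated with a small Python sketch. It is not the kernel's dispatch code: the handler names, the dictionary-based packet and the use of `-ENOENT` as the "no match" code are assumptions made for illustration.

```python
import errno


def ipip_err(pkt):
    """Hypothetical handler: accepts only errors for an IPIP tunnel it knows."""
    return 0 if pkt.get("tunnel") == "ipip" else -errno.ENOENT


def fou_err(pkt):
    """Hypothetical handler: accepts only errors for a FoU tunnel it knows."""
    return 0 if pkt.get("tunnel") == "fou" else -errno.ENOENT


def deliver_error(pkt, handlers):
    """Try each handler in turn and stop at the first one that returns 0.

    With void handlers the caller could not tell a match from a miss,
    so it could not move on to the next candidate.
    """
    for handler in handlers:
        if handler(pkt) == 0:
            return handler.__name__
    return None


print(deliver_error({"tunnel": "fou"}, [ipip_err, fou_err]))
print(deliver_error({"tunnel": "gre"}, [ipip_err, fou_err]))
```

The first call falls through `ipip_err` and is claimed by `fou_err`; the second is claimed by nobody, which the caller can now detect.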
      
      This has no effect on existing error handling paths.
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 13 Aug, 2018 1 commit
    • ipv6: Add icmp_echo_ignore_all support for ICMPv6 · e6f86b0f
      Virgile Jarry committed
      Preventing the kernel from responding to ICMP Echo Request messages
      can be useful in several ways. The sysctl parameter
      'icmp_echo_ignore_all' can be used to prevent the kernel from
      responding to IPv4 ICMP echo requests. For IPv6 pings, such
      a sysctl kernel parameter did not exist.
      
      Add the ability to prevent the kernel from responding to IPv6
      ICMP echo requests through the use of the following sysctl
      parameter: /proc/sys/net/ipv6/icmp/echo_ignore_all.
      Update the documentation to reflect this change.
      Signed-off-by: Virgile Jarry <virgile@acceis.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 06 Aug, 2018 1 commit
    • ipv6: icmp: Updating pmtu for link local route · 5f379ef5
      Georg Kohmann committed
      When an ICMPV6_PKT_TOOBIG is received from a link-local address, the pmtu
      is updated on a route with an arbitrary interface index. Subsequent packets
      sent back to the same link-local address may therefore end up not
      considering the updated pmtu.
      
      Current behavior breaks TAHI v6LC4.1.4 Reduce PMTU On-link. Referring to RFC
      1981: Section 3: "Note that Path MTU Discovery must be performed even in
      cases where a node "thinks" a destination is attached to the same link as
      itself. In a situation such as when a neighboring router acts as proxy [ND]
      for some destination, the destination can to appear to be directly
      connected but is in fact more than one hop away."
      
      Use the interface index from the incoming ICMPV6_PKT_TOOBIG when updating
      the pmtu.
      Signed-off-by: Georg Kohmann <geokohma@cisco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 22 Jul, 2018 1 commit
    • net/ipv6: Fix linklocal to global address with VRF · 24b711ed
      David Ahern committed
      Example setup:
          host: ip -6 addr add dev eth1 2001:db8:104::4
                 where eth1 is enslaved to a VRF
      
          switch: ip -6 ro add 2001:db8:104::4/128 dev br1
                  where br1 only has an LLA
      
                 ping6 2001:db8:104::4
                 ssh   2001:db8:104::4
      
      (NOTE: UDP works fine if the PKTINFO has the address set to the global
      address and ifindex is set to the index of eth1, with an LLA as the
      destination).
      
      For ICMP, icmp6_iif needs to be updated to check if skb->dev is an
      L3 master. If it is then return the ifindex from rt6i_idev similar
      to what is done for loopback.
      
      For TCP, restore the original tcp_v6_iif definition which is needed in
      most places and add a new tcp_v6_iif_l3_slave that considers the
      l3_slave variability. This latter check is only needed for socket
      lookups.
      
      Fixes: 9ff74384 ("net: vrf: Handle ipv6 multicast and link-local addresses")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 07 Jul, 2018 2 commits
  17. 13 Jun, 2018 1 commit
    • treewide: kzalloc() -> kcalloc() · 6396bb22
      Kees Cook committed
      The kzalloc() function has a 2-factor argument form, kcalloc(). This
      patch replaces cases of:
      
              kzalloc(a * b, gfp)
      
      with:
              kcalloc(a, b, gfp)
      
      as well as handling cases of:
      
              kzalloc(a * b * c, gfp)
      
      with:
      
              kzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kcalloc(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
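
The motivation for the conversion is that an open-coded product can wrap around silently, while the two-factor and `array3_size()` forms detect the overflow. The Python sketch below models that difference under stated assumptions: a 64-bit `size_t`, `None` standing in for a NULL return, and saturation to `SIZE_MAX` for `array3_size()`. The function names are illustrative stand-ins, not the kernel implementations.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t


def open_coded_size(*factors):
    """Model of `a * b` (or `a * b * c`) evaluated in size_t: wraps silently."""
    total = 1
    for factor in factors:
        total *= factor
    return total & SIZE_MAX


def kcalloc_bytes(n, size):
    """Model of the two-factor form: refuse the allocation on overflow.

    Returns the byte count, or None to stand in for a NULL return.
    """
    total = n * size
    return None if total > SIZE_MAX else total


def array3_size(a, b, c):
    """Model of the saturating helper: SIZE_MAX on overflow, so that the
    following allocation fails instead of being undersized."""
    total = a * b * c
    return SIZE_MAX if total > SIZE_MAX else total


# 2^32 elements of 2^32 bytes each wraps to 0 when multiplied in size_t.
print(open_coded_size(2**32, 2**32))
print(kcalloc_bytes(2**32, 2**32))
print(array3_size(2**32, 2**16, 2**16) == SIZE_MAX)
```

A wrapped size of 0 (or any small value) would yield a buffer far smaller than the caller expects, which is the class of bug the treewide change guards against.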
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kzalloc
      + kcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(sizeof(THING) * C2, ...)
      |
        kzalloc(sizeof(TYPE) * C2, ...)
      |
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(C1 * C2, ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: Kees Cook <keescook@chromium.org>
  18. 28 Mar, 2018 1 commit
  19. 05 Mar, 2018 2 commits
  20. 01 Mar, 2018 1 commit
  21. 20 Feb, 2018 1 commit
    • net: Convert icmpv6_sk_ops, ndisc_net_ops and igmp6_net_ops · 1a2e9332
      Kirill Tkhai committed
      These pernet_operations create and destroy the net::ipv6.icmp_sk
      socket, which is used to send ICMP or error replies.

      Nobody can dereference the socket to handle a packet before
      net is initialized, as there is no routing; nobody can do
      that in parallel with exit, as all of the devices are moved
      to init_net or destroyed and there are no packets in flight.
      So, it's possible to mark these pernet_operations as async.
      
      The same for ndisc_net_ops and for igmp6_net_ops. The last
      one also creates and destroys /proc entries.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 18 Oct, 2017 1 commit
  23. 07 Oct, 2017 1 commit
  24. 06 Oct, 2017 1 commit
  25. 30 Aug, 2017 1 commit
    • ipv6: Use rt6i_idev index for echo replies to a local address · 1b70d792
      David Ahern committed
      Tariq reported that local pings to a link-local address are failing:
      $ ifconfig ens8
      ens8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet 11.141.16.6  netmask 255.255.0.0  broadcast 11.141.255.255
              inet6 fe80::7efe:90ff:fecb:7502  prefixlen 64  scopeid 0x20<link>
              ether 7c:fe:90:cb:75:02  txqueuelen 1000  (Ethernet)
              RX packets 12  bytes 1164 (1.1 KiB)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 30  bytes 2484 (2.4 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      
      $  /bin/ping6 -c 3 fe80::7efe:90ff:fecb:7502%ens8
      PING fe80::7efe:90ff:fecb:7502%ens8(fe80::7efe:90ff:fecb:7502) 56 data bytes
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 25 Aug, 2017 1 commit
    • ipv6: Compute multipath hash for ICMP errors from offending packet · 23aebdac
      Jakub Sitnicki committed
      When forwarding or sending out an ICMPv6 error, look at the embedded
      packet that triggered the error and compute a flow hash over its
      headers.
      
      This lets us route the ICMP error together with the flow it belongs to
      when multipath (ECMP) routing is in use, which in turn makes Path MTU
      Discovery work in ECMP load-balanced or anycast setups (RFC 7690).
      
      Granted, end-hosts behind the ECMP router (aka servers) need to reflect
      the IPv6 Flow Label for PMTUD to work.
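
The idea can be sketched in Python as follows. This is a simplified model, not the kernel's `ip6_multipath_l3_keys()`: the `Ip6Header` type, the CRC32-based hash and the sorting of the address pair (used here so that both directions of a flow map to the same key) are assumptions for illustration, and the addresses are documentation-prefix examples.

```python
import zlib
from dataclasses import dataclass
from typing import Optional

ICMPV6 = 58  # IPv6 next-header value for ICMPv6


@dataclass(frozen=True)
class Ip6Header:
    src: str
    dst: str
    flow_label: int
    next_header: int
    # Offending packet carried inside an ICMPv6 error, if any.
    embedded: Optional["Ip6Header"] = None


def l3_keys(pkt):
    """Hash keys: for an ICMPv6 error, use the embedded offending packet."""
    inner = pkt.embedded if pkt.next_header == ICMPV6 and pkt.embedded else pkt
    # Assumption: order the addresses so the key is direction-independent.
    low, high = sorted((inner.src, inner.dst))
    return (low, high, inner.flow_label, inner.next_header)


def pick_nexthop(pkt, nexthops):
    """Select one of the equal-cost next hops from the flow keys."""
    key = "|".join(str(field) for field in l3_keys(pkt)).encode()
    return nexthops[zlib.crc32(key) % len(nexthops)]


hops = ["server-a", "server-b", "server-c"]
# Client-to-server flow, the server's oversized reply (Flow Label reflected),
# and the Packet Too Big error a router sends back toward the server.
forward = Ip6Header("2001:db8::c", "2001:db8::5", 0x12345, 6)
reply = Ip6Header("2001:db8::5", "2001:db8::c", 0x12345, 6)
error = Ip6Header("2001:db8::ff", "2001:db8::5", 0, ICMPV6, embedded=reply)
print(pick_nexthop(forward, hops) == pick_nexthop(error, hops))
```

In this model the error reaches the same server as the flow only because the reply carries the same Flow Label as the forward direction, which mirrors the caveat stated above.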
      
      The code is organized to be in parallel with ipv4 stack:
      
        ip_multipath_l3_keys -> ip6_multipath_l3_keys
        fib_multipath_hash   -> rt6_multipath_hash
      Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  27. 22 Aug, 2017 1 commit
    • net: ipv6: put host and anycast routes on device with address · 4832c30d
      David Ahern committed
      One nagging difference between ipv4 and ipv6 is host routes for ipv6
      addresses are installed using the loopback device or VRF / L3 Master
      device. e.g.,
      
          2001:db8:1::/120 dev veth0 proto kernel metric 256 pref medium
          local 2001:db8:1::1 dev lo table local proto kernel metric 0 pref medium
      
      Using the loopback device is convenient -- necessary for local tx, but
      has some nasty side effects. Most notably, setting the 'lo' device down
      causes all host routes for all local IPv6 addresses to be removed from
      the FIB and completely breaks IPv6 networking across all interfaces.
      
      This patch puts FIB entries for IPv6 routes against the device. This
      simplifies the routes in the FIB, for example by making dst->dev and
      rt6i_idev->dev the same (a future patch can look at removing the device
      reference taken for rt6i_idev for FIB entries).
      
      When copies are made on FIB lookups, the cloned route has dst->dev
      set to loopback (or the L3 master device). This is needed for the
      local Tx of packets to local addresses.
      
      With fib entries allocated against the real network device, the addrconf
      code that reinserts host routes on admin up of 'lo' is no longer needed.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  28. 15 Jun, 2017 1 commit
    • net: don't global ICMP rate limit packets originating from loopback · 849a44de
      Jesper Dangaard Brouer committed
      Florian Weimer seems to have a glibc test-case which requires that
      loopback interfaces do not get ICMP ratelimited.  This was broken by
      commit c0303efe ("net: reduce cycles spend on ICMP replies that
      gets rate limited").
      
      An ICMP response will usually be routed back out the same incoming
      interface.  Thus, take advantage of this and skip the global ICMP
      ratelimit when the incoming device is loopback.  In the unlikely event
      that the outgoing device is not loopback, due to strange routing policy
      rules, ICMP rate limiting still works via peer ratelimiting via
      icmpv4_xrlim_allow().  Thus, we should still comply with RFC1812
      (section 4.3.2.8 "Rate Limiting").
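
The check described above amounts to a short-circuit in front of the global limiter. The Python sketch below is a toy model under stated assumptions: `GlobalLimiter` is a bare credit counter with no refill, and `icmp_global_allow` is a hypothetical name rather than the kernel function.

```python
class GlobalLimiter:
    """Toy global limiter: a fixed number of credits, no refill."""

    def __init__(self, credits):
        self.credits = credits

    def allow(self):
        if self.credits > 0:
            self.credits -= 1
            return True
        return False


def icmp_global_allow(limiter, incoming_is_loopback):
    """Skip the global limit when the packet arrived on loopback.

    The per-peer limit (not modeled here) would still apply later, once
    the outgoing route is known.
    """
    if incoming_is_loopback:
        return True
    return limiter.allow()


limiter = GlobalLimiter(credits=1)
print(icmp_global_allow(limiter, incoming_is_loopback=False))  # uses the credit
print(icmp_global_allow(limiter, incoming_is_loopback=False))  # now limited
print(icmp_global_allow(limiter, incoming_is_loopback=True))   # bypasses it
```

Deciding on the incoming device keeps the check cheap, because no outgoing route lookup is needed before the packet is either dropped or allowed through.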
      
      This seems to fix the reproducer given by Florian, while still
      avoiding an expensive and unneeded outgoing route lookup for
      rate limited packets (in the non-loopback case).
      
      Fixes: c0303efe ("net: reduce cycles spend on ICMP replies that gets rate limited")
      Reported-by: Florian Weimer <fweimer@redhat.com>
      Reported-by: "H.J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  29. 10 Jan, 2017 2 commits
  30. 25 Dec, 2016 1 commit
  31. 29 Nov, 2016 1 commit
    • net: handle no dst on skb in icmp6_send · 79dc7e3f
      David Ahern committed
      Andrey reported the following while fuzzing the kernel with syzkaller:
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 0 PID: 3859 Comm: a.out Not tainted 4.9.0-rc6+ #429
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      task: ffff8800666d4200 task.stack: ffff880067348000
      RIP: 0010:[<ffffffff833617ec>]  [<ffffffff833617ec>]
      icmp6_send+0x5fc/0x1e30 net/ipv6/icmp.c:451
      RSP: 0018:ffff88006734f2c0  EFLAGS: 00010206
      RAX: ffff8800666d4200 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
      RBP: ffff88006734f630 R08: ffff880064138418 R09: 0000000000000003
      R10: dffffc0000000000 R11: 0000000000000005 R12: 0000000000000000
      R13: ffffffff84e7e200 R14: ffff880064138484 R15: ffff8800641383c0
      FS:  00007fb3887a07c0(0000) GS:ffff88006cc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000000 CR3: 000000006b040000 CR4: 00000000000006f0
      Stack:
       ffff8800666d4200 ffff8800666d49f8 ffff8800666d4200 ffffffff84c02460
       ffff8800666d4a1a 1ffff1000ccdaa2f ffff88006734f498 0000000000000046
       ffff88006734f440 ffffffff832f4269 ffff880064ba7456 0000000000000000
      Call Trace:
       [<ffffffff83364ddc>] icmpv6_param_prob+0x2c/0x40 net/ipv6/icmp.c:557
       [<     inline     >] ip6_tlvopt_unknown net/ipv6/exthdrs.c:88
       [<ffffffff83394405>] ip6_parse_tlv+0x555/0x670 net/ipv6/exthdrs.c:157
       [<ffffffff8339a759>] ipv6_parse_hopopts+0x199/0x460 net/ipv6/exthdrs.c:663
       [<ffffffff832ee773>] ipv6_rcv+0xfa3/0x1dc0 net/ipv6/ip6_input.c:191
       ...
      
      icmp6_send / icmpv6_send is invoked for both rx and tx paths. In both
      cases the dst->dev should be preferred for determining the L3 domain
      if the dst has been set on the skb. Fall back to the skb->dev if it has
      not. This covers the case reported here where icmp6_send is invoked on
      Rx before the route lookup.
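
The preference order can be written down as a small Python model. The `Skb` and `Dst` classes and the `l3_domain_dev` helper are hypothetical stand-ins for the kernel structures, intended only to show the fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dst:
    dev: Optional[str] = None


@dataclass
class Skb:
    dev: Optional[str] = None
    dst: Optional[Dst] = None


def l3_domain_dev(skb):
    """Device used to determine the L3 domain for an ICMPv6 error.

    Prefer the dst's device when a dst is attached to the skb; otherwise
    fall back to the device stored on the skb itself. Returns None when
    neither is available, so that a caller can give up instead of
    dereferencing a missing device.
    """
    if skb.dst is not None and skb.dst.dev is not None:
        return skb.dst.dev
    return skb.dev


print(l3_domain_dev(Skb(dev="eth0", dst=Dst(dev="vrf-red"))))  # dst wins
print(l3_domain_dev(Skb(dev="eth0")))  # Rx before route lookup: no dst yet
print(l3_domain_dev(Skb()))            # nothing to go on
```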
      
      Fixes: 5d41ce29 ("net: icmp6_send should use dst dev to determine L3 domain")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  32. 08 Nov, 2016 1 commit
  33. 05 Nov, 2016 1 commit
    • net: inet: Support UID-based routing in IP protocols. · e2d118a1
      Lorenzo Colitti committed
      - Use the UID in routing lookups made by protocol connect() and
        sendmsg() functions.
      - Make sure that routing lookups triggered by incoming packets
        (e.g., Path MTU discovery) take the UID of the socket into
        account.
      - For packets not associated with a userspace socket (e.g., ping
        replies), use UID 0 inside the user namespace corresponding to
        the network namespace the socket belongs to. This allows
        all namespaces to apply routing and iptables rules to
        kernel-originated traffic in that namespace by matching UID 0.
        This is better than using the UID of the kernel socket that is
        sending the traffic, because the UID of kernel sockets created
        at namespace creation time (e.g., the per-processor ICMP and
        TCP sockets) is the UID of the user that created the socket,
        which might not be mapped in the namespace.
      
      Tested: compiles allnoconfig, allyesconfig, allmodconfig
      Tested: https://android-review.googlesource.com/253302
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  34. 19 Jun, 2016 3 commits