1. 23 5月, 2020 1 次提交
  2. 27 4月, 2020 1 次提交
  3. 05 12月, 2019 1 次提交
    • S
      net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup · 6c8991f4
      Sabrina Dubroca 提交于
      ipv6_stub uses the ip6_dst_lookup function to allow other modules to
      perform IPv6 lookups. However, this function skips the XFRM layer
      entirely.
      
      All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the
      ip_route_output_key and ip_route_output helpers) for their IPv4 lookups,
      which calls xfrm_lookup_route(). This patch fixes this inconsistent
      behavior by switching the stub to ip6_dst_lookup_flow, which also calls
      xfrm_lookup_route().
      
      This requires some changes in all the callers, as these two functions
      take different arguments and have different return types.
      
      Fixes: 5f81bd2e ("ipv6: export a stub for IPv6 symbols used by vxlan")
      Reported-by: NXiumei Mu <xmu@redhat.com>
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6c8991f4
  4. 19 7月, 2019 1 次提交
  5. 21 5月, 2019 1 次提交
  6. 28 4月, 2019 2 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de
  7. 23 4月, 2019 1 次提交
  8. 30 3月, 2019 1 次提交
  9. 27 2月, 2019 1 次提交
  10. 20 1月, 2019 2 次提交
  11. 16 10月, 2018 4 次提交
  12. 11 10月, 2018 1 次提交
  13. 09 10月, 2018 3 次提交
  14. 25 9月, 2018 1 次提交
    • S
      mpls: allow routes on ip6gre devices · d8e2262a
      Saif Hasan 提交于
      Summary:
      
      This appears to be necessary and sufficient change to enable `MPLS` on
      `ip6gre` tunnels (RFC4023).
      
      This diff allows IP6GRE devices to be recognized by MPLS kernel module
      and hence user can configure interface to accept packets with mpls
      headers as well setup mpls routes on them.
      
      Test Plan:
      
      Test plan consists of multiple containers connected via GRE-V6 tunnel.
      Then carrying out testing steps as below.
      
      - Carry out necessary sysctl settings on all containers
      
      ```
      sysctl -w net.mpls.platform_labels=65536
      sysctl -w net.mpls.ip_ttl_propagate=1
      sysctl -w net.mpls.conf.lo.input=1
      ```
      
      - Establish IP6GRE tunnels
      
      ```
      ip -6 tunnel add name if_1_2_1 mode ip6gre \
        local 2401:db00:21:6048:feed:0::1 \
        remote 2401:db00:21:6048:feed:0::2 key 1
      ip link set dev if_1_2_1 up
      sysctl -w net.mpls.conf.if_1_2_1.input=1
      ip -4 addr add 169.254.0.2/31 dev if_1_2_1 scope link
      
      ip -6 tunnel add name if_1_3_1 mode ip6gre \
        local 2401:db00:21:6048:feed:0::1 \
        remote 2401:db00:21:6048:feed:0::3 key 1
      ip link set dev if_1_3_1 up
      sysctl -w net.mpls.conf.if_1_3_1.input=1
      ip -4 addr add 169.254.0.4/31 dev if_1_3_1 scope link
      ```
      
      - Install MPLS encap rules on node-1 towards node-2
      
      ```
      ip route add 192.168.0.11/32 nexthop encap mpls 32/64 \
        via inet 169.254.0.3 dev if_1_2_1
      ```
      
      - Install MPLS forwarding rules on node-2 and node-3
      ```
      // node2
      ip -f mpls route add 32 via inet 169.254.0.7 dev if_2_4_1
      
      // node3
      ip -f mpls route add 64 via inet 169.254.0.12 dev if_4_3_1
      ```
      
      - Ping 192.168.0.11 (node4) from 192.168.0.1 (node1) (where routing
        towards 192.168.0.1 is via IP route directly towards node1 from node4)
      ```
      ping 192.168.0.11
      ```
      
      - tcpdump on interface to capture ping packets wrapped within MPLS
        header which inturn wrapped within IP6GRE header
      
      ```
      16:43:41.121073 IP6
        2401:db00:21:6048:feed::1 > 2401:db00:21:6048:feed::2:
        DSTOPT GREv0, key=0x1, length 100:
        MPLS (label 32, exp 0, ttl 255) (label 64, exp 0, [S], ttl 255)
        IP 192.168.0.1 > 192.168.0.11:
        ICMP echo request, id 1208, seq 45, length 64
      
      0x0000:  6000 2cdb 006c 3c3f 2401 db00 0021 6048  `.,..l<?$....!`H
      0x0010:  feed 0000 0000 0001 2401 db00 0021 6048  ........$....!`H
      0x0020:  feed 0000 0000 0002 2f00 0401 0401 0100  ......../.......
      0x0030:  2000 8847 0000 0001 0002 00ff 0004 01ff  ...G............
      0x0040:  4500 0054 3280 4000 ff01 c7cb c0a8 0001  E..T2.@.........
      0x0050:  c0a8 000b 0800 a8d7 04b8 002d 2d3c a05b  ...........--<.[
      0x0060:  0000 0000 bcd8 0100 0000 0000 1011 1213  ................
      0x0070:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
      0x0080:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
      0x0090:  3435 3637                                4567
      ```
      Signed-off-by: NSaif Hasan <has@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8e2262a
  15. 28 3月, 2018 1 次提交
  16. 18 3月, 2018 1 次提交
  17. 05 3月, 2018 1 次提交
  18. 09 2月, 2018 1 次提交
    • D
      mpls, nospec: Sanitize array index in mpls_label_ok() · 3968523f
      Dan Williams 提交于
      mpls_label_ok() validates that the 'platform_label' array index from a
      userspace netlink message payload is valid. Under speculation the
      mpls_label_ok() result may not resolve in the CPU pipeline until after
      the index is used to access an array element. Sanitize the index to zero
      to prevent userspace-controlled arbitrary out-of-bounds speculation, a
      precursor for a speculative execution side channel vulnerability.
      
      Cc: <stable@vger.kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3968523f
  19. 05 12月, 2017 1 次提交
  20. 12 10月, 2017 1 次提交
  21. 08 10月, 2017 1 次提交
  22. 10 8月, 2017 1 次提交
  23. 08 7月, 2017 1 次提交
  24. 05 7月, 2017 1 次提交
  25. 04 7月, 2017 1 次提交
    • R
      mpls: route get support · 397fc9e5
      Roopa Prabhu 提交于
      This patch adds RTM_GETROUTE doit handler for mpls routes.
      
      Input:
      RTA_DST - input label
      RTA_NEWDST - labels in packet for multipath selection
      
      By default the getroute handler returns matched
      nexthop label, via and oif
      
      With RTM_F_FIB_MATCH flag, full matched route is
      returned.
      
      example (with patched iproute2):
      $ip -f mpls route show
      101
              nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
              nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
      201
              nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
              nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12
      
      $ip -f mpls route get 103
      RTNETLINK answers: Network is unreachable
      
      $ip -f mpls route get 101
      101 as to 102/103 via inet 172.16.2.2 dev virt1-2
      
      $ip -f mpls route get as to 302/303 101
      101 as to 302/303 via inet 172.16.12.2 dev virt1-12
      
      $ip -f mpls route get fibmatch 103
      RTNETLINK answers: Network is unreachable
      
      $ip -f mpls route get fibmatch 101
      101
              nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
              nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
      Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      397fc9e5
  26. 01 6月, 2017 1 次提交
  27. 30 5月, 2017 5 次提交
  28. 09 5月, 2017 1 次提交
    • M
      treewide: use kv[mz]alloc* rather than opencoded variants · 752ade68
      Michal Hocko 提交于
      There are many code paths opencoding kvmalloc.  Let's use the helper
      instead.  The main difference to kvmalloc is that those users are
      usually not considering all the aspects of the memory allocator.  E.g.
      allocation requests <= 32kB (with 4kB pages) are basically never failing
      and invoke OOM killer to satisfy the allocation.  This sounds too
      disruptive for something that has a reasonable fallback - the vmalloc.
      On the other hand those requests might fallback to vmalloc even when the
      memory allocator would succeed after several more reclaim/compaction
      attempts previously.  There is no guarantee something like that happens
      though.
      
      This patch converts many of those places to kv[mz]alloc* helpers because
      they are more conservative.
      
      Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
      Acked-by: NKees Cook <keescook@chromium.org>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
      Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
      Acked-by: David Sterba <dsterba@suse.com> # btrfs
      Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
      Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Colin Cross <ccross@android.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Santosh Raspatur <santosh@chelsio.com>
      Cc: Hariprasad S <hariprasad@chelsio.com>
      Cc: Yishai Hadas <yishaih@mellanox.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: "Yan, Zheng" <zyan@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      752ade68
  29. 18 4月, 2017 1 次提交