1. 01 Sep, 2018 1 commit
  2. 06 Jun, 2018 1 commit
    • net: metrics: add proper netlink validation · 5b5e7a0d
      Committed by Eric Dumazet
      Before using nla_get_u32(), better make sure the attribute
      is of the proper size.
      
      The code was changed recently, but the bug has been there since the
      beginning of git history.
      
      BUG: KMSAN: uninit-value in rtnetlink_put_metrics+0x553/0x960 net/core/rtnetlink.c:746
      CPU: 1 PID: 14139 Comm: syz-executor6 Not tainted 4.17.0-rc5+ #103
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
       __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:686
       rtnetlink_put_metrics+0x553/0x960 net/core/rtnetlink.c:746
       fib_dump_info+0xc42/0x2190 net/ipv4/fib_semantics.c:1361
       rtmsg_fib+0x65f/0x8c0 net/ipv4/fib_semantics.c:419
       fib_table_insert+0x2314/0x2b50 net/ipv4/fib_trie.c:1287
       inet_rtm_newroute+0x210/0x340 net/ipv4/fib_frontend.c:779
       rtnetlink_rcv_msg+0xa32/0x1560 net/core/rtnetlink.c:4646
       netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2448
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4664
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1678/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x455a09
      RSP: 002b:00007faae5fd8c68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007faae5fd96d4 RCX: 0000000000455a09
      RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000013
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000005d0 R14: 00000000006fdc20 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:294 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:685
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:529
       fib_convert_metrics net/ipv4/fib_semantics.c:1056 [inline]
       fib_create_info+0x2d46/0x9dc0 net/ipv4/fib_semantics.c:1150
       fib_table_insert+0x3e4/0x2b50 net/ipv4/fib_trie.c:1146
       inet_rtm_newroute+0x210/0x340 net/ipv4/fib_frontend.c:779
       rtnetlink_rcv_msg+0xa32/0x1560 net/core/rtnetlink.c:4646
       netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2448
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4664
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1678/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
       kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan.c:322
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2753 [inline]
       __kmalloc_node_track_caller+0xb32/0x11b0 mm/slub.c:4395
       __kmalloc_reserve net/core/skbuff.c:138 [inline]
       __alloc_skb+0x2cb/0x9e0 net/core/skbuff.c:206
       alloc_skb include/linux/skbuff.h:988 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
       netlink_sendmsg+0x76e/0x1350 net/netlink/af_netlink.c:1876
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a919525a ("net: Move fib_convert_metrics to metrics file")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
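
The size check described in the commit above can be illustrated in user space. This is a minimal sketch, not the kernel code: `struct attr`, `attr_payload_len()` and `attr_get_u32_checked()` are hypothetical stand-ins for the kernel's netlink attribute handling, and the only point is that a u32 is read from the payload after its length has been verified.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified user-space stand-in for a netlink attribute
 * (an assumption for illustration, not the kernel's struct nlattr). */
struct attr {
    uint16_t len;             /* header (4 bytes) plus payload length, in bytes */
    uint16_t type;
    unsigned char payload[8];
};

#define ATTR_HDRLEN 4

/* Payload length in bytes. */
static int attr_payload_len(const struct attr *a)
{
    return (int)a->len - ATTR_HDRLEN;
}

/* Hypothetical helper: read a u32 only if the payload is exactly 4 bytes.
 * Returns 0 on success and -1 if the attribute has the wrong size. */
static int attr_get_u32_checked(const struct attr *a, uint32_t *out)
{
    if (attr_payload_len(a) != (int)sizeof(uint32_t))
        return -1;            /* reject short or oversized payloads */
    memcpy(out, a->payload, sizeof(*out));
    return 0;
}
```

A caller would treat a -1 return as a malformed request and stop processing instead of copying uninitialized bytes into the metrics array.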
  3. 18 Apr, 2018 1 commit
    • net: Move fib_convert_metrics to metrics file · a919525a
      Committed by David Ahern
      Move logic of fib_convert_metrics into ip_metrics_convert. This allows
      the code that converts netlink attributes into metrics struct to be
      re-used in a later patch by IPv6.
      
      This is mostly a code move with the following changes to variable names:
        - fi->fib_net becomes net
        - fc_mx and fc_mx_len are passed as inputs pulled from fib_config
        - metrics array is passed as an input from fi->fib_metrics->metrics
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 02 Apr, 2018 1 commit
    • route: check sysctl_fib_multipath_use_neigh earlier than hash · 6174a30d
      Committed by Xin Long
      Prior to this patch, when a packet is hashed into path [1]
      (hash <= nh_upper_bound) and its neigh is dead, it will try
      path [2]. However, if path [2]'s neigh is alive but its
      hash > nh_upper_bound, it will not return this alive path.
      This packet will never be sent even if path [2] is alive.
      
       3.3.3.1/24:
        nexthop via 1.1.1.254 dev eth1 weight 1 <--[1] (dead neigh)
        nexthop via 2.2.2.254 dev eth2 weight 1 <--[2]
      
      With sysctl_fib_multipath_use_neigh set, the lookup is supposed to
      find an available path that respects the l3/l4 hash. But if there
      is no available route with this hash, it should at least return
      an alive route, even one with another hash.
      
      This patch fixes it by processing fib_multipath_use_neigh
      earlier than the hash check, so that it will at least return
      an alive route, if there is one, when fib_multipath_use_neigh is
      enabled. The behavior is unchanged when there are alive
      routes matching the l3/l4 hash.
      
      Fixes: a6db4494 ("net: ipv4: Consider failed nexthops in multipath routes")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
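
The ordering described in the commit above (neighbour check first, hash check second, with the first alive path kept as a fallback) can be sketched as follows. This is a user-space illustration under assumptions: `struct nexthop`, `select_path()` and the -1 "nothing selected" return value are hypothetical and do not mirror the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified nexthop: only the fields the selection needs. */
struct nexthop {
    int upper_bound;    /* highest hash value mapped to this nexthop */
    bool neigh_alive;   /* whether the neighbour entry is usable */
};

/* Returns the index of the selected nexthop, or -1 if none qualifies. */
static int select_path(const struct nexthop *nh, int n, int hash, bool use_neigh)
{
    int fallback = -1;

    for (int i = 0; i < n; i++) {
        if (use_neigh) {
            if (!nh[i].neigh_alive)
                continue;          /* dead neighbours are never selected */
            if (fallback < 0)
                fallback = i;      /* remember the first alive path */
        }
        if (hash > nh[i].upper_bound)
            continue;              /* hash does not map to this path */
        return i;                  /* alive (if required) and hash matches */
    }
    return fallback;               /* no hash match: any alive path will do */
}
```

With an alive path bounded at 100 and a dead path bounded at 200, a hash of 150 still selects the alive path when `use_neigh` is true, which is the behaviour the commit message asks for.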
  5. 05 Mar, 2018 1 commit
  6. 01 Mar, 2018 2 commits
  7. 17 Feb, 2018 1 commit
    • fib_semantics: Don't match route with mismatching tclassid · a8c6db1d
      Committed by Stefano Brivio
      In fib_nh_match(), if output interface or gateway are passed in
      the FIB configuration, we don't have to check next hops of
      multipath routes to conclude whether we have a match or not.
      
      However, we might still have routes with different realms
      matching the same output interface and gateway configuration,
      and this needs to cause the match to fail. Otherwise the first
      route inserted in the FIB will match, regardless of the realms:
      
       # ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
       # ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
       # ip route list table 1234
       1.1.1.1 dev eth0 scope link realms 1/2
       1.1.1.1 dev eth0 scope link realms 3/4
       # ip route del 1.1.1.1 dev eth0 table 1234 realms 3/4
       # ip route list table 1234
       1.1.1.1 dev eth0 scope link realms 3/4
      
      whereas route with realms 3/4 should have been deleted instead.
      
      Explicitly check for fc_flow passed in the FIB configuration
      (this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
      fail matching if it differs from nh_tclassid.
      
      The handling of RTA_FLOW for multipath routes later in
      fib_nh_match() is still needed, as we can have multiple RTA_FLOW
      attributes that need to be matched against the tclassid of each
      next hop.
      
      v2: Check that fc_flow is set before discarding the match, so
          that the user can still select the first matching rule by
          not specifying any realm, as suggested by David Ahern.
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
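
The matching rule from the commit above can be sketched in user space: a realm given in the request must equal the nexthop's tclassid, while an unset realm keeps the old "first match wins" behaviour. The struct names, `nh_matches()` and the `REALMS()` packing are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative packing of a "from/to" realm pair into one value (assumption). */
#define REALMS(from, to) (((uint32_t)(from) << 16) | (uint32_t)(to))

/* Hypothetical, simplified views of a stored nexthop and a delete request. */
struct nh_info   { int oif;    uint32_t gw;    uint32_t tclassid; };
struct match_cfg { int fc_oif; uint32_t fc_gw; uint32_t fc_flow;  };

/* True if the stored nexthop satisfies the request. */
static bool nh_matches(const struct match_cfg *cfg, const struct nh_info *nh)
{
    /* Only compare realms when the user actually specified one. */
    if (cfg->fc_flow && cfg->fc_flow != nh->tclassid)
        return false;
    if (cfg->fc_oif && cfg->fc_oif != nh->oif)
        return false;
    if (cfg->fc_gw && cfg->fc_gw != nh->gw)
        return false;
    return true;
}
```

A request carrying realms 3/4 then skips the route stored with realms 1/2, and a request without realms still matches the first route on the same interface.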
  8. 14 Feb, 2018 2 commits
  9. 20 Dec, 2017 1 commit
    • ipv4: fib: Fix metrics match when deleting a route · d03a4557
      Committed by Phil Sutter
      The recently added fib_metrics_match() causes a regression for routes
      with both RTAX_FEATURES and RTAX_CC_ALGO if the latter has
      TCP_CONG_NEEDS_ECN flag set:
      
      | # ip link add d0 type dummy
      | # ip link set d0 up
      | # ip route add 172.29.29.0/24 dev d0 features ecn congctl dctcp
      | # ip route del 172.29.29.0/24 dev d0 features ecn congctl dctcp
      | RTNETLINK answers: No such process
      
      During route insertion, fib_convert_metrics() detects that the given CC
      algo requires ECN and hence sets DST_FEATURE_ECN_CA bit in
      RTAX_FEATURES.
      
      During route deletion though, fib_metrics_match() compares stored
      RTAX_FEATURES value with that from userspace (which obviously has no
      knowledge about DST_FEATURE_ECN_CA) and fails.
      
      Fixes: 5f9ae3d9 ("ipv4: do metrics match when looking up and deleting a route")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
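
The mismatch described in the commit above can be reproduced with two bit flags. The commit message does not spell out the fix, so the masked comparison below is only one plausible approach, shown as an assumption; the flag values and function names are illustrative and not taken from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEATURE_ECN     (1u << 0)   /* user-visible flag (illustrative value) */
#define FEATURE_ECN_CA  (1u << 31)  /* kernel-internal flag (illustrative value) */

/* On insertion, the internal bit is added when the CC algorithm needs ECN. */
static uint32_t store_features(uint32_t user_features, bool cc_needs_ecn)
{
    return cc_needs_ecn ? (user_features | FEATURE_ECN_CA) : user_features;
}

/* Naive comparison: fails because user space never sets the internal bit. */
static bool features_match_naive(uint32_t stored, uint32_t user)
{
    return stored == user;
}

/* Assumed remedy: ignore the internal bit before comparing. */
static bool features_match_masked(uint32_t stored, uint32_t user)
{
    return (stored & ~FEATURE_ECN_CA) == user;
}
```

For a route added with `features ecn congctl dctcp`, the stored value carries both bits, so only the masked comparison accepts the identical delete request.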
  10. 15 Nov, 2017 1 commit
  11. 03 Nov, 2017 1 commit
    • fib: fib_dump_info can no longer use __in_dev_get_rtnl · 25dd169a
      Committed by Florian Westphal
      syzbot reported yet another regression added with DOIT_UNLOCKED.
      When nexthop is marked as dead, fib_dump_info uses __in_dev_get_rtnl():
      
      ./include/linux/inetdevice.h:230 suspicious rcu_dereference_protected() usage!
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by syz-executor2/23859:
       #0:  (rcu_read_lock){....}, at: [<ffffffff840283f0>]
      inet_rtm_getroute+0xaa0/0x2d70 net/ipv4/route.c:2738
      [..]
        lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4665
        __in_dev_get_rtnl include/linux/inetdevice.h:230 [inline]
        fib_dump_info+0x1136/0x13d0 net/ipv4/fib_semantics.c:1377
        inet_rtm_getroute+0xf97/0x2d70 net/ipv4/route.c:2785
      ..
      
      This isn't safe anymore: callers hold either the RTNL mutex or the rcu
      read lock, so these spots must use rcu_dereference_rtnl() or plain
      rcu_dereference() (plus unconditional rcu read lock).
      
      This does the latter.
      
      Fixes: 394f51ab ("ipv4: route: set ipv4 RTM_GETROUTE to not use rtnl")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 29 Sep, 2017 2 commits
  13. 24 Aug, 2017 1 commit
    • ipv4: do metrics match when looking up and deleting a route · 5f9ae3d9
      Committed by Xin Long
      Now when ipv4 route inserts a fib_info, it memcmps fib_metrics.
      This means ipv4 identifies a route by its metrics as well.
      
      But when removing a route, it tries to find the route without
      considering the metrics. As a result, the route with the
      right metrics can't be removed.
      
      Thomas noticed this issue when doing the testing:
      
      1. add:
         # ip route append 192.168.7.0/24 dev v window 1000
         # ip route append 192.168.7.0/24 dev v window 1001
         # ip route append 192.168.7.0/24 dev v window 1002
         # ip route append 192.168.7.0/24 dev v window 1003
      2. delete:
         # ip route delete 192.168.7.0/24 dev v window 1002
      3. show:
           192.168.7.0/24 proto boot scope link window 1001
           192.168.7.0/24 proto boot scope link window 1002
           192.168.7.0/24 proto boot scope link window 1003
      
      The one with window 1002 wasn't deleted but the first one was.
      
      This patch is to do metrics match when looking up and deleting
      one route.
      Reported-by: Thomas Haller <thaller@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
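
The delete behaviour from the commit above can be modelled with a small table. This is a sketch under assumptions: `struct route`, `route_del()` and the convention that a window of 0 means "not specified" are hypothetical and only mirror the `window 1000..1003` example in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical route entry: a prefix plus a single metric (window). */
struct route {
    uint32_t prefix;
    uint32_t window;
    bool present;
};

/* Delete the first route matching prefix and, if given, the window metric.
 * A window of 0 means "not specified" in this sketch.
 * Returns the index of the deleted entry, or -1 if nothing matched. */
static int route_del(struct route *tbl, int n, uint32_t prefix, uint32_t window)
{
    for (int i = 0; i < n; i++) {
        if (!tbl[i].present || tbl[i].prefix != prefix)
            continue;
        if (window && tbl[i].window != window)
            continue;              /* metrics differ: keep looking */
        tbl[i].present = false;
        return i;
    }
    return -1;
}
```

Deleting with window 1002 then removes the third entry instead of the first one, and a delete without a window still removes the first entry for the prefix.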
  14. 19 Aug, 2017 1 commit
  15. 16 Aug, 2017 1 commit
  16. 04 Aug, 2017 1 commit
    • net: core: Make the FIB notification chain generic · 04b1d4e5
      Committed by Ido Schimmel
      The FIB notification chain is currently solely used by IPv4 code.
      However, we're going to introduce IPv6 FIB offload support, which
      requires these notifications as well.
      
      As explained in commit c3852ef7 ("ipv4: fib: Replay events when
      registering FIB notifier"), upon registration to the chain, the callee
      receives a full dump of the FIB tables and rules by traversing all the
      net namespaces. The integrity of the dump is ensured by a per-namespace
      sequence counter that is incremented whenever a change to the tables or
      rules occurs.
      
      In order to allow more address families to use the chain, each family is
      expected to register its fib_notifier_ops in its pernet init. These
      operations allow the common code to read the family's sequence counter
      as well as dump its tables and rules in the given net namespace.
      
      Additionally, a 'family' parameter is added to sent notifications, so
      that listeners could distinguish between the different families.
      
      Implement the common code that allows listeners to register to the chain
      and for address families to register their fib_notifier_ops. Subsequent
      patches will implement these operations in IPv6.
      
      In the future, ipmr and ip6mr will be extended to provide these
      notifications as well.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
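
The per-family registration described in the commit above can be sketched as a small registry of callbacks. Everything here is an assumption for illustration: `struct notifier_ops`, `ops_register()`, `dump_all()` and the demo family are hypothetical, and the consistency check (compare the summed sequence counters before and after the dump) is a simplification of the idea that a changed counter invalidates the dump.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FAMILIES 4
#define FAMILY_DEMO  2   /* hypothetical family identifier */

/* Hypothetical per-family operations: read a change counter, dump entries. */
struct notifier_ops {
    int family;
    unsigned int (*seq_read)(void);
    void (*dump)(void (*cb)(int family, int entry));
};

static struct notifier_ops *registry[MAX_FAMILIES];
static size_t registry_len;

/* Each address family registers its ops once. Returns 0 or -1 if full. */
static int ops_register(struct notifier_ops *ops)
{
    if (registry_len == MAX_FAMILIES)
        return -1;
    registry[registry_len++] = ops;
    return 0;
}

static unsigned int total_seq(void)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < registry_len; i++)
        sum += registry[i]->seq_read();
    return sum;
}

/* Replay all entries to a new listener. Returns 1 if no family changed
 * during the dump, 0 if the caller should discard the result and retry. */
static int dump_all(void (*cb)(int family, int entry))
{
    unsigned int before = total_seq();
    for (size_t i = 0; i < registry_len; i++)
        registry[i]->dump(cb);
    return total_seq() == before;
}

/* Demo family with three entries and a counter that never changes here. */
static unsigned int demo_seq;
static const int demo_entries[3] = { 10, 20, 30 };
static unsigned int demo_seq_read(void) { return demo_seq; }
static void demo_dump(void (*cb)(int family, int entry))
{
    for (int i = 0; i < 3; i++)
        cb(FAMILY_DEMO, demo_entries[i]);
}
static struct notifier_ops demo_ops = { FAMILY_DEMO, demo_seq_read, demo_dump };

/* Demo listener: counts the entries it receives. */
static int seen;
static void count_cb(int family, int entry) { (void)family; (void)entry; seen++; }
```

The `family` argument passed to the callback is what lets one listener tell the families apart once more than one is registered.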
  17. 03 Aug, 2017 1 commit
  18. 01 Aug, 2017 1 commit
  19. 04 Jul, 2017 1 commit
  20. 18 Jun, 2017 3 commits
    • ipv4: mark DST_NOGC and remove the operation of dst_free() · b838d5e1
      Committed by Wei Wang
      With the previous preparation patches, we are ready to get rid of the
      dst gc operation in ipv4 code and release dst based on refcnt only.
      So this patch adds DST_NOGC flag for all IPv4 dst and remove the calls
      to dst_free().
      At this point, all dst created in ipv4 code do not use the dst gc
      anymore and will be destroyed at the point when refcnt drops to 0.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: call dst_dev_put() properly · 95c47f9c
      Committed by Wei Wang
      As the intent of this patch series is to completely remove dst gc,
      we need to call dst_dev_put() to release the reference to dst->dev
      when removing routes from fib because we won't keep the gc list anymore
      and will lose the dst pointer right after removing the routes.
      Without the gc list, there is no way to find all the dst's that have
      dst->dev pointing to the going-down dev.
      Hence, we are doing dst_dev_put() immediately before we lose the last
      reference of the dst from the routing code. The next dst_check() will
      trigger a route re-lookup to find another route (if there is any).
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: take dst->__refcnt when caching dst in fib · 0830106c
      Committed by Wei Wang
      In IPv4 routing code, fib_nh and fib_nh_exception can hold pointers
      to struct rtable but they never increment dst->__refcnt.
      This leads to the need of the dst garbage collector because when user
      is done with this dst and calls dst_release(), it can only decrement
      dst->__refcnt and can not free the dst even if it sees dst->__refcnt
      drops from 1 to 0 (unless DST_NOCACHE flag is set) because the routing
      code might still hold reference to it.
      And when the routing code tries to delete a route, it has to put the
      dst to the gc_list if dst->__refcnt is not yet 0 and have a gc thread
      running periodically to check on dst->__refcnt and finally to free dst
      when refcnt becomes 0.
      
      This patch increments dst->__refcnt when
      fib_nh/fib_nh_exception holds reference to this dst and properly release
      the dst when fib_nh/fib_nh_exception has been updated with a new dst.
      
      This patch is a preparation in order to fully get rid of dst gc later.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
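
The ownership rule from the commit above (a cache slot takes its own reference and drops it when the slot is overwritten) can be sketched in user space. The types and helpers below are hypothetical, use a plain `int` counter instead of an atomic one, and model "freeing" with a flag so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dst entry with a plain (non-atomic) refcount. */
struct dst {
    int refcnt;
    int freed;    /* set instead of actually freeing, for illustration */
};

static void dst_get(struct dst *d) { d->refcnt++; }

static void dst_put(struct dst *d)
{
    if (--d->refcnt == 0)
        d->freed = 1;     /* last reference gone: safe to free right away */
}

/* Hypothetical cache slot, e.g. one per nexthop. */
struct nh_cache { struct dst *cached; };

/* Install a new dst in the slot: the slot takes its own reference and
 * releases the reference it held on the previous entry. */
static void cache_set(struct nh_cache *c, struct dst *new_dst)
{
    struct dst *old = c->cached;

    if (new_dst)
        dst_get(new_dst);
    c->cached = new_dst;
    if (old)
        dst_put(old);
}
```

Because the cache counts as a holder, a user dropping its reference can never free an entry the cache still points to, which is what makes a separate garbage collector unnecessary in this model.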
  21. 30 May, 2017 1 commit
  22. 27 May, 2017 1 commit
    • ipv4: add reference counting to metrics · 3fb07daf
      Committed by Eric Dumazet
      Andrey Konovalov reported crashes in ipv4_mtu()
      
      I could reproduce the issue with KASAN kernels, between
      10.246.7.151 and 10.246.7.152 :
      
      1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &
      
      2) At the same time run following loop :
      while :
      do
       ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
       ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
      done
      
      Cong Wang attempted to add back rt->fi in commit
      82486aa6 ("ipv4: restore rt->fi for reference counting")
      but this proved to add some issues that were complex to solve.
      
      Instead, I suggested to add a refcount to the metrics themselves,
      being a standalone object (in particular, no reference to other objects)
      
      I tried to make this patch as small as possible to ease its backport,
      instead of being super clean. Note that we believe that only ipv4 dst
      need to take care of the metric refcount. But if this is wrong,
      this patch adds the basic infrastructure to extend this to other
      families.
      
      Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
      for his efforts on this problem.
      
      Fixes: 2860583f ("ipv4: Kill rt->fi")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 23 May, 2017 2 commits
  24. 22 Mar, 2017 1 commit
    • net: ipv4: add support for ECMP hash policy choice · bf4e0a3d
      Committed by Nikolay Aleksandrov
      This patch adds support for ECMP hash policy choice via a new sysctl
      called fib_multipath_hash_policy and also adds support for L4 hashes.
      The current values for fib_multipath_hash_policy are:
       0 - layer 3 (default)
       1 - layer 4
      If there's an skb hash already set and it matches the chosen policy then it
      will be used instead of being calculated (currently only for L4).
      In L3 mode we always calculate the hash due to the ICMP error special
      case. The flow dissector's field consistentification should handle the
      address order, thus we can remove the address reversals.
      If the skb is provided we always use it for the hash calculation,
      otherwise we fall back to fl4, that is, if skb is NULL fl4 has to be set.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
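
The difference between the two policy values in the commit above can be shown with a toy hash. This sketch is an assumption throughout: `struct flow` and `multipath_hash()` are hypothetical, and FNV-1a is swapped in as a simple stand-in for whatever hashing the kernel's flow dissector performs. Only the choice of input fields per policy reflects the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key. */
struct flow {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* Mix one 32-bit value into an FNV-1a hash, least significant byte first. */
static uint32_t fnv1a_u32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;            /* 32-bit FNV prime */
    }
    return h;
}

/* policy 0: layer 3 (addresses only); policy 1: layer 4 (adds proto and ports). */
static uint32_t multipath_hash(const struct flow *f, int policy)
{
    uint32_t h = 2166136261u;      /* 32-bit FNV offset basis */

    h = fnv1a_u32(h, f->saddr);
    h = fnv1a_u32(h, f->daddr);
    if (policy == 1) {
        h = fnv1a_u32(h, f->proto);
        h = fnv1a_u32(h, ((uint32_t)f->sport << 16) | f->dport);
    }
    return h;
}
```

Two connections between the same hosts that differ only in destination port hash identically under policy 0 and differently under policy 1, so only the layer 4 policy can spread them across nexthops.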
  25. 09 Feb, 2017 1 commit
  26. 31 Jan, 2017 1 commit
  27. 13 Jan, 2017 1 commit
  28. 11 Jan, 2017 1 commit
    • net: ipv4: Fix multipath selection with vrf · 7a18c5b9
      Committed by David Ahern
      fib_select_path does not call fib_select_multipath if oif is set in the
      flow struct. For VRF use cases oif is always set, so multipath route
      selection is bypassed. Use the FLOWI_FLAG_SKIP_NH_OIF to skip the oif
      check similar to what is done in fib_table_lookup.
      
      Add saddr and proto to the flow struct for the fib lookup done by the
      VRF driver to better match hash computation for a flow.
      
      Fixes: 613d09b3 ("net: Use VRF device index for lookups on TX")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  29. 07 Jan, 2017 1 commit
  30. 25 Dec, 2016 1 commit
  31. 04 Dec, 2016 1 commit
    • ipv4: fib: Export free_fib_info() · b423cb10
      Committed by Ido Schimmel
      The FIB notification chain is going to be converted to an atomic chain,
      which means switchdev drivers will have to offload FIB entries in
      deferred work, as hardware operations entail sleeping.
      
      However, while the work is queued fib info might be freed, so a
      reference must be taken. To release the reference (and potentially free
      the fib info) fib_info_put() will be called, which in turn calls
      free_fib_info().
      
      Export free_fib_info() so that modules will be able to invoke
      fib_info_put().
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  32. 07 Sep, 2016 1 commit
  33. 20 Aug, 2016 1 commit
  34. 12 Jul, 2016 1 commit
    • ipv4: reject RTNH_F_DEAD and RTNH_F_LINKDOWN from user space · 80610229
      Committed by Julian Anastasov
      Vegard Nossum reported a crash in fib_dump_info
      when nh_dev = NULL and fib_nhs == 1:
      
      Pid: 50, comm: netlink.exe Not tainted 4.7.0-rc5+
      RIP: 0033:[<00000000602b3d18>]
      RSP: 0000000062623890  EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 000000006261b800 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000024 RDI: 000000006245ba00
      RBP: 00000000626238f0 R08: 000000000000029c R09: 0000000000000000
      R10: 0000000062468038 R11: 000000006245ba00 R12: 000000006245ba00
      R13: 00000000625f96c0 R14: 00000000601e16f0 R15: 0000000000000000
      Kernel panic - not syncing: Kernel mode fault at addr 0x2e0, ip 0x602b3d18
      CPU: 0 PID: 50 Comm: netlink.exe Not tainted 4.7.0-rc5+ #581
      Stack:
       626238f0 960226a02 00000400 000000fe
       62623910 600afca7 62623970 62623a48
       62468038 00000018 00000000 00000000
      Call Trace:
       [<602b3e93>] rtmsg_fib+0xd3/0x190
       [<602b6680>] fib_table_insert+0x260/0x500
       [<602b0e5d>] inet_rtm_newroute+0x4d/0x60
       [<60250def>] rtnetlink_rcv_msg+0x8f/0x270
       [<60267079>] netlink_rcv_skb+0xc9/0xe0
       [<60250d4b>] rtnetlink_rcv+0x3b/0x50
       [<60265400>] netlink_unicast+0x1a0/0x2c0
       [<60265e47>] netlink_sendmsg+0x3f7/0x470
       [<6021dc9a>] sock_sendmsg+0x3a/0x90
       [<6021e0d0>] ___sys_sendmsg+0x300/0x360
       [<6021fa64>] __sys_sendmsg+0x54/0xa0
       [<6021fac0>] SyS_sendmsg+0x10/0x20
       [<6001ea68>] handle_syscall+0x88/0x90
       [<600295fd>] userspace+0x3fd/0x500
       [<6001ac55>] fork_handler+0x85/0x90
      
      $ addr2line -e vmlinux -i 0x602b3d18
      include/linux/inetdevice.h:222
      net/ipv4/fib_semantics.c:1264
      
      The problem happens when RTNH_F_LINKDOWN is provided from user space
      when creating routes that do not use the flag. It was caught with a
      netlink fuzzer.
      
      Currently, the kernel allows user space to set both flags
      to nh_flags and fib_flags but this is not intentional, the
      assumption was that they are not set. Fix this by rejecting
      both flags with EINVAL.
      Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
      Fixes: 0eeb075f ("net: ipv4 sysctl option to ignore routes when nexthop link is down")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
      Cc: Dinesh Dutt <ddutt@cumulusnetworks.com>
      Cc: Scott Feldman <sfeldma@gmail.com>
      Reviewed-by: Andy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
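
The rejection rule from the commit above amounts to a flag mask check on user input. In this sketch the flag constants are defined locally with the values used by the Linux UAPI header `linux/rtnetlink.h`, and `check_user_nh_flags()` is a hypothetical helper name, not the kernel function that performs the check.

```c
#include <assert.h>
#include <errno.h>

/* Local copies of the nexthop flag values (as in linux/rtnetlink.h). */
#define RTNH_F_DEAD      1
#define RTNH_F_ONLINK    4
#define RTNH_F_LINKDOWN  16

/* Hypothetical validation helper: state flags owned by the kernel must
 * not be accepted from user space. Returns 0 or -EINVAL. */
static int check_user_nh_flags(unsigned int flags)
{
    if (flags & (RTNH_F_DEAD | RTNH_F_LINKDOWN))
        return -EINVAL;
    return 0;
}
```

Flags that user space is allowed to set, such as `RTNH_F_ONLINK`, pass through unchanged, while any request carrying either state flag is refused before a route is created.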