1. 29 Jan, 2021 10 commits
  2. 21 Jan, 2021 3 commits
  3. 08 Jan, 2021 3 commits
  4. 07 Nov, 2020 11 commits
  5. 20 Oct, 2020 1 commit
    • nexthop: Fix performance regression in nexthop deletion · df6afe2f
      Committed by Ido Schimmel
      While insertion of 16k nexthops all using the same netdev ('dummy10')
      takes less than a second, deletion takes about 130 seconds:
      
      # time -p ip -b nexthop.batch
      real 0.29
      user 0.01
      sys 0.15
      
      # time -p ip link set dev dummy10 down
      real 131.03
      user 0.06
      sys 0.52
      
      This is because of repeated calls to synchronize_rcu() whenever a
      nexthop is removed from a nexthop group:
      
      # /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K
      ...
          b'finish_task_switch'
          b'schedule'
          b'schedule_timeout'
          b'wait_for_completion'
          b'__wait_rcu_gp'
          b'synchronize_rcu.part.0'
          b'synchronize_rcu'
          b'__remove_nexthop'
          b'remove_nexthop'
          b'nexthop_flush_dev'
          b'nh_netdev_event'
          b'raw_notifier_call_chain'
          b'call_netdevice_notifiers_info'
          b'__dev_notify_flags'
          b'dev_change_flags'
          b'do_setlink'
          b'__rtnl_newlink'
          b'rtnl_newlink'
          b'rtnetlink_rcv_msg'
          b'netlink_rcv_skb'
          b'rtnetlink_rcv'
          b'netlink_unicast'
          b'netlink_sendmsg'
          b'____sys_sendmsg'
          b'___sys_sendmsg'
          b'__sys_sendmsg'
          b'__x64_sys_sendmsg'
          b'do_syscall_64'
          b'entry_SYSCALL_64_after_hwframe'
          -                ip (277)
              126554955
      
      Since nexthops are always deleted under RTNL, synchronize_net() can be
      used instead. It will call synchronize_rcu_expedited() which only blocks
      for several microseconds as opposed to multiple milliseconds like
      synchronize_rcu().
      
      With this patch deletion of 16k nexthops takes less than a second:
      
      # time -p ip link set dev dummy10 down
      real 0.12
      user 0.00
      sys 0.04
      
      Tested with fib_nexthops.sh which includes torture tests that prompted
      the initial change:
      
      # ./fib_nexthops.sh
      ...
      Tests passed: 134
      Tests failed:   0
      
      Fixes: 90f33bff ("nexthops: don't modify published nexthop groups")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Link: https://lore.kernel.org/r/20201016172914.643282-1-idosch@idosch.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      df6afe2f
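The choice between the two grace-period primitives described in the commit above can be illustrated with a small user-space model. This is only a sketch: every `model_*` name and the `rtnl_held` flag are hypothetical stand-ins, not kernel APIs, and the counters merely record which primitive would have been called. It assumes, as the commit message states, that synchronize_net() uses synchronize_rcu_expedited() when RTNL is held.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel symbols. */
static bool rtnl_held;               /* models "RTNL is held by the caller" */
static unsigned long normal_gps;     /* count of normal grace-period waits */
static unsigned long expedited_gps;  /* count of expedited grace-period waits */

static void model_synchronize_rcu(void) { normal_gps++; }
static void model_synchronize_rcu_expedited(void) { expedited_gps++; }

/* Models the behaviour the commit message relies on: with RTNL held,
 * take the expedited path; otherwise fall back to a normal wait. */
static void model_synchronize_net(void)
{
	if (rtnl_held)
		model_synchronize_rcu_expedited();
	else
		model_synchronize_rcu();
}

/* Models removing `count` nexthops from groups, with one grace-period
 * wait per removal, as in the offcputime trace above. */
static void model_flush_nexthops(unsigned int count, bool use_synchronize_net)
{
	for (unsigned int i = 0; i < count; i++) {
		if (use_synchronize_net)
			model_synchronize_net();
		else
			model_synchronize_rcu();
	}
}
```

As a rough cross-check of the numbers in the commit message, and assuming "16k" means 16,384 nexthops, 131.03 s spread over 16,384 removals is about 8 ms per removal, which is consistent with one multi-millisecond synchronize_rcu() wait per nexthop.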
  6. 16 Sep, 2020 2 commits
  7. 27 Aug, 2020 5 commits
  8. 23 Aug, 2020 1 commit
    • net: nexthop: don't allow empty NHA_GROUP · eeaac363
      Committed by Nikolay Aleksandrov
      Currently the nexthop code will use an empty NHA_GROUP attribute, but it
      requires at least 1 entry in order to function properly. Otherwise we
      end up dereferencing null or random pointers all over the place because no
      nh_grp_entry members are allocated; the nexthop code relies on having at
      least the first member present. An empty NHA_GROUP doesn't make any sense,
      so just disallow it.
      Also add a WARN_ON for any future users of nexthop_create_group().
      
       BUG: kernel NULL pointer dereference, address: 0000000000000080
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP
       CPU: 0 PID: 558 Comm: ip Not tainted 5.9.0-rc1+ #93
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
       RIP: 0010:fib_check_nexthop+0x4a/0xaa
       Code: 0f 84 83 00 00 00 48 c7 02 80 03 f7 81 c3 40 80 fe fe 75 12 b8 ea ff ff ff 48 85 d2 74 6b 48 c7 02 40 03 f7 81 c3 48 8b 40 10 <48> 8b 80 80 00 00 00 eb 36 80 78 1a 00 74 12 b8 ea ff ff ff 48 85
       RSP: 0018:ffff88807983ba00 EFLAGS: 00010213
       RAX: 0000000000000000 RBX: ffff88807983bc00 RCX: 0000000000000000
       RDX: ffff88807983bc00 RSI: 0000000000000000 RDI: ffff88807bdd0a80
       RBP: ffff88807983baf8 R08: 0000000000000dc0 R09: 000000000000040a
       R10: 0000000000000000 R11: ffff88807bdd0ae8 R12: 0000000000000000
       R13: 0000000000000000 R14: ffff88807bea3100 R15: 0000000000000001
       FS:  00007f10db393700(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000080 CR3: 000000007bd0f004 CR4: 00000000003706f0
       Call Trace:
        fib_create_info+0x64d/0xaf7
        fib_table_insert+0xf6/0x581
        ? __vma_adjust+0x3b6/0x4d4
        inet_rtm_newroute+0x56/0x70
        rtnetlink_rcv_msg+0x1e3/0x20d
        ? rtnl_calcit.isra.0+0xb8/0xb8
        netlink_rcv_skb+0x5b/0xac
        netlink_unicast+0xfa/0x17b
        netlink_sendmsg+0x334/0x353
        sock_sendmsg_nosec+0xf/0x3f
        ____sys_sendmsg+0x1a0/0x1fc
        ? copy_msghdr_from_user+0x4c/0x61
        ___sys_sendmsg+0x63/0x84
        ? handle_mm_fault+0xa39/0x11b5
        ? sockfd_lookup_light+0x72/0x9a
        __sys_sendmsg+0x50/0x6e
        do_syscall_64+0x54/0xbe
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7f10dacc0bb7
       Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 8b 05 9a 4b 2b 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 f2 2a 00 f7 d8 64 89 02 48
       RSP: 002b:00007ffcbe628bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 00007ffcbe628f80 RCX: 00007f10dacc0bb7
       RDX: 0000000000000000 RSI: 00007ffcbe628c60 RDI: 0000000000000003
       RBP: 000000005f41099c R08: 0000000000000001 R09: 0000000000000008
       R10: 00000000000005e9 R11: 0000000000000246 R12: 0000000000000000
       R13: 0000000000000000 R14: 00007ffcbe628d70 R15: 0000563a86c6e440
       Modules linked in:
       CR2: 0000000000000080
      
      CC: David Ahern <dsahern@gmail.com>
      Fixes: 430a0491 ("nexthop: Add support for nexthop groups")
      Reported-by: syzbot+a61aa19b0c14c8770bd9@syzkaller.appspotmail.com
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eeaac363
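The rule introduced by the commit above (reject an empty NHA_GROUP) can be sketched as a length check on the attribute payload. This is a user-space illustration, not the kernel's code: `struct grp_entry` is a hypothetical mirror of one group entry, `group_attr_len_ok` is an invented name, and the additional "whole number of entries" check is an assumption about what a sensible validator would also enforce.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one NHA_GROUP entry (layout is an assumption). */
struct grp_entry {
	uint32_t id;      /* nexthop id */
	uint8_t  weight;  /* weight of this nexthop in the group */
	uint8_t  resvd1;
	uint16_t resvd2;
};

/* Returns true if a payload of `len` bytes holds at least one entry and
 * nothing but whole entries. A zero-length payload is rejected, which is
 * the case the commit disallows. */
static bool group_attr_len_ok(size_t len)
{
	if (len == 0)
		return false;
	return len % sizeof(struct grp_entry) == 0;
}
```

Rejecting the empty case at validation time means later code can rely on the first group member being present, which is the invariant the commit message describes.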
  9. 11 Jun, 2020 1 commit
    • nexthop: Fix fdb labeling for groups · ce9ac056
      Committed by David Ahern
      fdb nexthops are marked with a flag. For standalone nexthops, a flag was
      added to the nh_info struct. For groups that flag was added to struct
      nexthop when it should have been added to the group information. Fix
      by removing the flag from the nexthop struct and adding a flag to nh_group
      that mirrors nh_info and is really only a caching of the individual types.
      Add a helper, nexthop_is_fdb, for use by the vxlan code and fixup the
      internal code to use the flag from either nh_info or nh_group.
      
      v2
      - propagate fdb_nh in remove_nh_grp_entry
      
      Fixes: 38428d68 ("nexthop: support for fdb ecmp nexthops")
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce9ac056
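The helper described in the commit above reads the fdb flag from whichever structure backs the nexthop: the group information for groups, the per-nexthop info otherwise. A minimal user-space sketch of that dispatch follows. The `*_model` types and `model_nexthop_is_fdb` are hypothetical simplifications; the real kernel structures carry many more fields and are accessed under RCU, which this sketch omits.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for nh_info and nh_group. */
struct nh_info_model  { bool fdb_nh; };
struct nh_group_model { bool fdb_nh; };  /* caches the members' fdb type */

struct nexthop_model {
	bool is_group;
	union {
		struct nh_info_model  *info;  /* valid when !is_group */
		struct nh_group_model *grp;   /* valid when is_group */
	} u;
};

/* Sketch of a nexthop_is_fdb-style helper: the flag lives in the group
 * information for groups and in the info struct for standalone nexthops. */
static bool model_nexthop_is_fdb(const struct nexthop_model *nh)
{
	return nh->is_group ? nh->u.grp->fdb_nh : nh->u.info->fdb_nh;
}
```

Keeping the flag out of the top-level nexthop struct leaves a single place to look for each kind of nexthop, so callers such as the vxlan code mentioned in the commit do not need to know which kind they hold.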
  10. 02 Jun, 2020 1 commit
  11. 28 May, 2020 1 commit
  12. 27 May, 2020 1 commit