1. 24 2月, 2020 1 次提交
  2. 21 2月, 2020 1 次提交
    • K
      net: core: Distribute switch variables for initialization · 161d1792
      Kees Cook 提交于
      Variables declared in a switch statement before any case statements
      cannot be automatically initialized with compiler instrumentation (as
      they are not part of any execution flow). With GCC's proposed automatic
      stack variable initialization feature, this triggers a warning (and they
      don't get initialized). Clang's automatic stack variable initialization
      (via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
      doesn't initialize such variables[1]. Note that these warnings (or silent
      skipping) happen before the dead-store elimination optimization phase,
      so even when the automatic initializations are later elided in favor of
      direct initializations, the warnings remain.
      
      To avoid these problems, move such variables into the "case" where
      they're used or lift them up into the main function body.
      
      net/core/skbuff.c: In function ‘skb_checksum_setup_ip’:
      net/core/skbuff.c:4809:7: warning: statement will never be executed [-Wswitch-unreachable]
       4809 |   int err;
            |       ^~~
      
      [1] https://bugs.llvm.org/show_bug.cgi?id=44916Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      161d1792
  3. 19 2月, 2020 1 次提交
  4. 17 2月, 2020 5 次提交
    • R
      skbuff: remove stale bit mask comments · 8955b435
      Randy Dunlap 提交于
      Remove stale comments since this flag is no longer a bit mask
      but is a bit field.
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8955b435
    • T
      net: export netdev_next_lower_dev_rcu() · 7151affe
      Taehee Yoo 提交于
      netdev_next_lower_dev_rcu() will be used to implement a function,
      which is to walk all lower interfaces.
      There are already functions that they walk their lower interface.
      (netdev_walk_all_lower_dev_rcu, netdev_walk_all_lower_dev()).
      But, there would be cases that couldn't be covered by given
      netdev_walk_all_lower_dev_{rcu}() function.
      So, some modules would want to implement own function,
      which is to walk all lower interfaces.
      
      In the next patch, netdev_next_lower_dev_rcu() will be used.
      In addition, this patch removes two unused prototypes in netdevice.h.
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7151affe
    • E
      net: add strict checks in netdev_name_node_alt_destroy() · e08ad805
      Eric Dumazet 提交于
      netdev_name_node_alt_destroy() does a lookup over all
      device names of a namespace.
      
      We need to make sure the name belongs to the device
      of interest, and that we do not destroy its primary
      name, since we rely on it being not deleted :
      dev->name_node would indeed point to freed memory.
      
      syzbot report was the following :
      
      BUG: KASAN: use-after-free in dev_net include/linux/netdevice.h:2206 [inline]
      BUG: KASAN: use-after-free in mld_force_mld_version net/ipv6/mcast.c:1172 [inline]
      BUG: KASAN: use-after-free in mld_in_v2_mode_only net/ipv6/mcast.c:1180 [inline]
      BUG: KASAN: use-after-free in mld_in_v1_mode+0x203/0x230 net/ipv6/mcast.c:1190
      Read of size 8 at addr ffff88809886c588 by task swapper/1/0
      
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:641
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
       dev_net include/linux/netdevice.h:2206 [inline]
       mld_force_mld_version net/ipv6/mcast.c:1172 [inline]
       mld_in_v2_mode_only net/ipv6/mcast.c:1180 [inline]
       mld_in_v1_mode+0x203/0x230 net/ipv6/mcast.c:1190
       mld_send_initial_cr net/ipv6/mcast.c:2083 [inline]
       mld_dad_timer_expire+0x24/0x230 net/ipv6/mcast.c:2118
       call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404
       expire_timers kernel/time/timer.c:1449 [inline]
       __run_timers kernel/time/timer.c:1773 [inline]
       __run_timers kernel/time/timer.c:1740 [inline]
       run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786
       __do_softirq+0x262/0x98c kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0x19b/0x1e0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:546 [inline]
       smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1146
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
       </IRQ>
      RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
      Code: 68 73 c5 f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 94 be 59 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 84 be 59 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 de 2a 74 f9 e8 09
      RSP: 0018:ffffc90000d3fd68 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
      RAX: 1ffffffff136761a RBX: ffff8880a99fc340 RCX: 0000000000000000
      RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffff8880a99fcbd4
      RBP: ffffc90000d3fd98 R08: ffff8880a99fc340 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
      R13: ffffffff8aa5a1c0 R14: 0000000000000000 R15: 0000000000000001
       arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:686
       default_idle_call+0x84/0xb0 kernel/sched/idle.c:94
       cpuidle_idle_call kernel/sched/idle.c:154 [inline]
       do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269
       cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361
       start_secondary+0x2f4/0x410 arch/x86/kernel/smpboot.c:264
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242
      
      Allocated by task 10229:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       __kasan_kmalloc mm/kasan/common.c:515 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:488
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:529
       __do_kmalloc_node mm/slab.c:3616 [inline]
       __kmalloc_node+0x4e/0x70 mm/slab.c:3623
       kmalloc_node include/linux/slab.h:578 [inline]
       kvmalloc_node+0x68/0x100 mm/util.c:574
       kvmalloc include/linux/mm.h:645 [inline]
       kvzalloc include/linux/mm.h:653 [inline]
       alloc_netdev_mqs+0x98/0xe40 net/core/dev.c:9797
       rtnl_create_link+0x22d/0xaf0 net/core/rtnetlink.c:3047
       __rtnl_newlink+0xf9f/0x1790 net/core/rtnetlink.c:3309
       rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377
       rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       __sys_sendto+0x262/0x380 net/socket.c:1998
       __do_compat_sys_socketcall net/compat.c:771 [inline]
       __se_compat_sys_socketcall net/compat.c:719 [inline]
       __ia32_compat_sys_socketcall+0x530/0x710 net/compat.c:719
       do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
       do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
       entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
      
      Freed by task 10229:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       kasan_set_free_info mm/kasan/common.c:337 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:476
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:485
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x10a/0x2c0 mm/slab.c:3757
       __netdev_name_node_alt_destroy+0x1ff/0x2a0 net/core/dev.c:322
       netdev_name_node_alt_destroy+0x57/0x80 net/core/dev.c:334
       rtnl_alt_ifname net/core/rtnetlink.c:3518 [inline]
       rtnl_linkprop.isra.0+0x575/0x6f0 net/core/rtnetlink.c:3567
       rtnl_dellinkprop+0x46/0x60 net/core/rtnetlink.c:3588
       rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       ____sys_sendmsg+0x753/0x880 net/socket.c:2343
       ___sys_sendmsg+0x100/0x170 net/socket.c:2397
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
       __compat_sys_sendmsg net/compat.c:642 [inline]
       __do_compat_sys_sendmsg net/compat.c:649 [inline]
       __se_compat_sys_sendmsg net/compat.c:646 [inline]
       __ia32_compat_sys_sendmsg+0x7a/0xb0 net/compat.c:646
       do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
       do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
       entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
      
      The buggy address belongs to the object at ffff88809886c000
       which belongs to the cache kmalloc-4k of size 4096
      The buggy address is located 1416 bytes inside of
       4096-byte region [ffff88809886c000, ffff88809886d000)
      The buggy address belongs to the page:
      page:ffffea0002621b00 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
      flags: 0xfffe0000010200(slab|head)
      raw: 00fffe0000010200 ffffea0002610d08 ffffea0002607608 ffff8880aa402000
      raw: 0000000000000000 ffff88809886c000 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88809886c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff88809886c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff88809886c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
       ffff88809886c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff88809886c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 36fbf1e5 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e08ad805
    • E
      net: rtnetlink: fix bugs in rtnl_alt_ifname() · 44bfa9c5
      Eric Dumazet 提交于
      Since IFLA_ALT_IFNAME is an NLA_STRING, we have no
      guarantee it is nul terminated.
      
      We should use nla_strdup() instead of kstrdup(), since this
      helper will make sure not accessing out-of-bounds data.
      
      BUG: KMSAN: uninit-value in strlen+0x5e/0xa0 lib/string.c:535
      CPU: 1 PID: 19157 Comm: syz-executor.5 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       strlen+0x5e/0xa0 lib/string.c:535
       kstrdup+0x7f/0x1a0 mm/util.c:59
       rtnl_alt_ifname net/core/rtnetlink.c:3495 [inline]
       rtnl_linkprop+0x85d/0xc00 net/core/rtnetlink.c:3553
       rtnl_newlinkprop+0x9d/0xb0 net/core/rtnetlink.c:3568
       rtnetlink_rcv_msg+0x1153/0x1570 net/core/rtnetlink.c:5424
       netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45b3b9
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ff1c7b1ac78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007ff1c7b1b6d4 RCX: 000000000045b3b9
      RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000003
      RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000009cb R14: 00000000004cb3dd R15: 000000000075bf2c
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2774 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382
       __kmalloc_reserve net/core/skbuff.c:141 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
       alloc_skb include/linux/skbuff.h:1049 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
       netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 36fbf1e5 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44bfa9c5
    • J
      net: fib_rules: Correctly set table field when table number exceeds 8 bits · 540e585a
      Jethro Beekman 提交于
      In 709772e6, RT_TABLE_COMPAT was added to
      allow legacy software to deal with routing table numbers >= 256, but the
      same change to FIB rule queries was overlooked.
      Signed-off-by: NJethro Beekman <jethro@fortanix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540e585a
  5. 14 2月, 2020 1 次提交
  6. 12 2月, 2020 1 次提交
    • T
      core: Don't skip generic XDP program execution for cloned SKBs · ad1e03b2
      Toke Høiland-Jørgensen 提交于
      The current generic XDP handler skips execution of XDP programs entirely if
      an SKB is marked as cloned. This leads to some surprising behaviour, as
      packets can end up being cloned in various ways, which will make an XDP
      program not see all the traffic on an interface.
      
      This was discovered by a simple test case where an XDP program that always
      returns XDP_DROP is installed on a veth device. When combining this with
      the Scapy packet sniffer (which uses an AF_PACKET) socket on the sending
      side, SKBs reliably end up in the cloned state, causing them to be passed
      through to the receiving interface instead of being dropped. A minimal
      reproducer script for this is included below.
      
      This patch fixed the issue by simply triggering the existing linearisation
      code for cloned SKBs instead of skipping the XDP program execution. This
      behaviour is in line with the behaviour of the native XDP implementation
      for the veth driver, which will reallocate and copy the SKB data if the SKB
      is marked as shared.
      
      Reproducer Python script (requires BCC and Scapy):
      
      from scapy.all import TCP, IP, Ether, sendp, sniff, AsyncSniffer, Raw, UDP
      from bcc import BPF
      import time, sys, subprocess, shlex
      
      SKB_MODE = (1 << 1)
      DRV_MODE = (1 << 2)
      PYTHON=sys.executable
      
      def client():
          time.sleep(2)
          # Sniffing on the sender causes skb_cloned() to be set
          s = AsyncSniffer()
          s.start()
      
          for p in range(10):
              sendp(Ether(dst="aa:aa:aa:aa:aa:aa", src="cc:cc:cc:cc:cc:cc")/IP()/UDP()/Raw("Test"),
                    verbose=False)
              time.sleep(0.1)
      
          s.stop()
          return 0
      
      def server(mode):
          prog = BPF(text="int dummy_drop(struct xdp_md *ctx) {return XDP_DROP;}")
          func = prog.load_func("dummy_drop", BPF.XDP)
          prog.attach_xdp("a_to_b", func, mode)
      
          time.sleep(1)
      
          s = sniff(iface="a_to_b", count=10, timeout=15)
          if len(s):
              print(f"Got {len(s)} packets - should have gotten 0")
              return 1
          else:
              print("Got no packets - as expected")
              return 0
      
      if len(sys.argv) < 2:
          print(f"Usage: {sys.argv[0]} <skb|drv>")
          sys.exit(1)
      
      if sys.argv[1] == "client":
          sys.exit(client())
      elif sys.argv[1] == "server":
          mode = SKB_MODE if sys.argv[2] == 'skb' else DRV_MODE
          sys.exit(server(mode))
      else:
          try:
              mode = sys.argv[1]
              if mode not in ('skb', 'drv'):
                  print(f"Usage: {sys.argv[0]} <skb|drv>")
                  sys.exit(1)
              print(f"Running in {mode} mode")
      
              for cmd in [
                      'ip netns add netns_a',
                      'ip netns add netns_b',
                      'ip -n netns_a link add a_to_b type veth peer name b_to_a netns netns_b',
                      # Disable ipv6 to make sure there's no address autoconf traffic
                      'ip netns exec netns_a sysctl -qw net.ipv6.conf.a_to_b.disable_ipv6=1',
                      'ip netns exec netns_b sysctl -qw net.ipv6.conf.b_to_a.disable_ipv6=1',
                      'ip -n netns_a link set dev a_to_b address aa:aa:aa:aa:aa:aa',
                      'ip -n netns_b link set dev b_to_a address cc:cc:cc:cc:cc:cc',
                      'ip -n netns_a link set dev a_to_b up',
                      'ip -n netns_b link set dev b_to_a up']:
                  subprocess.check_call(shlex.split(cmd))
      
              server = subprocess.Popen(shlex.split(f"ip netns exec netns_a {PYTHON} {sys.argv[0]} server {mode}"))
              client = subprocess.Popen(shlex.split(f"ip netns exec netns_b {PYTHON} {sys.argv[0]} client"))
      
              client.wait()
              server.wait()
              sys.exit(server.returncode)
      
          finally:
              subprocess.run(shlex.split("ip netns delete netns_a"))
              subprocess.run(shlex.split("ip netns delete netns_b"))
      
      Fixes: d4455169 ("net: xdp: support xdp generic on virtual devices")
      Reported-by: NStepan Horacek <shoracek@redhat.com>
      Suggested-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad1e03b2
  7. 08 2月, 2020 5 次提交
    • M
      bpf: Improve bucket_log calculation logic · 88d6f130
      Martin KaFai Lau 提交于
      It was reported that the max_t, ilog2, and roundup_pow_of_two macros have
      exponential effects on the number of states in the sparse checker.
      
      This patch breaks them up by calculating the "nbuckets" first so that the
      "bucket_log" only needs to take ilog2().
      
      In addition, Linus mentioned:
      
        Patch looks good, but I'd like to point out that it's not just sparse.
      
        You can see it with a simple
      
          make net/core/bpf_sk_storage.i
          grep 'smap->bucket_log = ' net/core/bpf_sk_storage.i | wc
      
        and see the end result:
      
            1  365071 2686974
      
        That's one line (the assignment line) that is 2,686,974 characters in
        length.
      
        Now, sparse does happen to react particularly badly to that (I didn't
        look to why, but I suspect it's just that evaluating all the types
        that don't actually ever end up getting used ends up being much more
        expensive than it should be), but I bet it's not good for gcc either.
      
      Fixes: 6ac99e8f ("bpf: Introduce bpf sk local storage")
      Reported-by: NRandy Dunlap <rdunlap@infradead.org>
      Reported-by: NLuc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: NLuc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Link: https://lore.kernel.org/bpf/20200207081810.3918919-1-kafai@fb.com
      88d6f130
    • J
      bpf, sockhash: Synchronize_rcu before free'ing map · 0b2dc839
      Jakub Sitnicki 提交于
      We need to have a synchronize_rcu before free'ing the sockhash because any
      outstanding psock references will have a pointer to the map and when they
      use it, this could trigger a use after free.
      
      This is a sister fix for sockhash, following commit 2bb90e5c ("bpf:
      sockmap, synchronize_rcu before free'ing map") which addressed sockmap,
      which comes from a manual audit.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200206111652.694507-3-jakub@cloudflare.com
      0b2dc839
    • J
      bpf, sockmap: Don't sleep while holding RCU lock on tear-down · db6a5018
      Jakub Sitnicki 提交于
      rcu_read_lock is needed to protect access to psock inside sock_map_unref
      when tearing down the map. However, we can't afford to sleep in lock_sock
      while in RCU read-side critical section. Grab the RCU lock only after we
      have locked the socket.
      
      This fixes RCU warnings triggerable on a VM with 1 vCPU when free'ing a
      sockmap/sockhash that contains at least one socket:
      
      | =============================
      | WARNING: suspicious RCU usage
      | 5.5.0-04005-g8fc91b97 #450 Not tainted
      | -----------------------------
      | include/linux/rcupdate.h:272 Illegal context switch in RCU read-side critical section!
      |
      | other info that might help us debug this:
      |
      |
      | rcu_scheduler_active = 2, debug_locks = 1
      | 4 locks held by kworker/0:1/62:
      |  #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #2: ffffffff82065d20 (rcu_read_lock){....}, at: sock_map_free+0x5/0x170
      |  #3: ffff8881368c5df8 (&stab->lock){+...}, at: sock_map_free+0x64/0x170
      |
      | stack backtrace:
      | CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04005-g8fc91b97 #450
      | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      | Workqueue: events bpf_map_free_deferred
      | Call Trace:
      |  dump_stack+0x71/0xa0
      |  ___might_sleep+0x105/0x190
      |  lock_sock_nested+0x28/0x90
      |  sock_map_free+0x95/0x170
      |  bpf_map_free_deferred+0x58/0x80
      |  process_one_work+0x260/0x5e0
      |  worker_thread+0x4d/0x3e0
      |  kthread+0x108/0x140
      |  ? process_one_work+0x5e0/0x5e0
      |  ? kthread_park+0x90/0x90
      |  ret_from_fork+0x3a/0x50
      
      | =============================
      | WARNING: suspicious RCU usage
      | 5.5.0-04005-g8fc91b97-dirty #452 Not tainted
      | -----------------------------
      | include/linux/rcupdate.h:272 Illegal context switch in RCU read-side critical section!
      |
      | other info that might help us debug this:
      |
      |
      | rcu_scheduler_active = 2, debug_locks = 1
      | 4 locks held by kworker/0:1/62:
      |  #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0
      |  #2: ffffffff82065d20 (rcu_read_lock){....}, at: sock_hash_free+0x5/0x1d0
      |  #3: ffff888139966e00 (&htab->buckets[i].lock){+...}, at: sock_hash_free+0x92/0x1d0
      |
      | stack backtrace:
      | CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04005-g8fc91b97-dirty #452
      | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      | Workqueue: events bpf_map_free_deferred
      | Call Trace:
      |  dump_stack+0x71/0xa0
      |  ___might_sleep+0x105/0x190
      |  lock_sock_nested+0x28/0x90
      |  sock_hash_free+0xec/0x1d0
      |  bpf_map_free_deferred+0x58/0x80
      |  process_one_work+0x260/0x5e0
      |  worker_thread+0x4d/0x3e0
      |  kthread+0x108/0x140
      |  ? process_one_work+0x5e0/0x5e0
      |  ? kthread_park+0x90/0x90
      |  ret_from_fork+0x3a/0x50
      
      Fixes: 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear down")
      Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200206111652.694507-2-jakub@cloudflare.com
      db6a5018
    • L
      bpf, sockmap: Check update requirements after locking · 85b8ac01
      Lorenz Bauer 提交于
      It's currently possible to insert sockets in unexpected states into
      a sockmap, due to a TOCTTOU when updating the map from a syscall.
      sock_map_update_elem checks that sk->sk_state == TCP_ESTABLISHED,
      locks the socket and then calls sock_map_update_common. At this
      point, the socket may have transitioned into another state, and
      the earlier assumptions don't hold anymore. Crucially, it's
      conceivable (though very unlikely) that a socket has become unhashed.
      This breaks the sockmap's assumption that it will get a callback
      via sk->sk_prot->unhash.
      
      Fix this by checking the (fixed) sk_type and sk_protocol without the
      lock, followed by a locked check of sk_state.
      
      Unfortunately it's not possible to push the check down into
      sock_(map|hash)_update_common, since BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
      run before the socket has transitioned from TCP_SYN_RECV into
      TCP_ESTABLISHED.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: NLorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/20200207103713.28175-1-lmb@cloudflare.com
      85b8ac01
    • I
      drop_monitor: Do not cancel uninitialized work item · dfa7f709
      Ido Schimmel 提交于
      Drop monitor uses a work item that takes care of constructing and
      sending netlink notifications to user space. In case drop monitor never
      started to monitor, then the work item is uninitialized and not
      associated with a function.
      
      Therefore, a stop command from user space results in canceling an
      uninitialized work item which leads to the following warning [1].
      
      Fix this by not processing a stop command if drop monitor is not
      currently monitoring.
      
      [1]
      [   31.735402] ------------[ cut here ]------------
      [   31.736470] WARNING: CPU: 0 PID: 143 at kernel/workqueue.c:3032 __flush_work+0x89f/0x9f0
      ...
      [   31.738120] CPU: 0 PID: 143 Comm: dwdump Not tainted 5.5.0-custom-09491-g16d4077796b8 #727
      [   31.741968] RIP: 0010:__flush_work+0x89f/0x9f0
      ...
      [   31.760526] Call Trace:
      [   31.771689]  __cancel_work_timer+0x2a6/0x3b0
      [   31.776809]  net_dm_cmd_trace+0x300/0xef0
      [   31.777549]  genl_rcv_msg+0x5c6/0xd50
      [   31.781005]  netlink_rcv_skb+0x13b/0x3a0
      [   31.784114]  genl_rcv+0x29/0x40
      [   31.784720]  netlink_unicast+0x49f/0x6a0
      [   31.787148]  netlink_sendmsg+0x7cf/0xc80
      [   31.790426]  ____sys_sendmsg+0x620/0x770
      [   31.793458]  ___sys_sendmsg+0xfd/0x170
      [   31.802216]  __sys_sendmsg+0xdf/0x1a0
      [   31.806195]  do_syscall_64+0xa0/0x540
      [   31.806885]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 8e94c3bc ("drop_monitor: Allow user to start monitoring hardware drops")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfa7f709
  8. 05 2月, 2020 1 次提交
    • J
      devlink: report 0 after hitting end in region read · d5b90e99
      Jacob Keller 提交于
      commit fdd41ec2 ("devlink: Return right error code in case of errors
      for region read") modified the region read code to report errors
      properly in unexpected cases.
      
      In the case where the start_offset and ret_offset match, it unilaterally
      converted this into an error. This causes an issue for the "dump"
      version of the command. In this case, the devlink region dump will
      always report an invalid argument:
      
      000000000000ffd0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      000000000000ffe0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      devlink answers: Invalid argument
      000000000000fff0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      
      This occurs because the expected flow for the dump is to return 0 after
      there is no further data.
      
      The simplest fix would be to stop converting the error code to -EINVAL
      if start_offset == ret_offset. However, avoid unnecessary work by
      checking for when start_offset is larger than the region size and
      returning 0 upfront.
      
      Fixes: fdd41ec2 ("devlink: Return right error code in case of errors for region read")
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d5b90e99
  9. 04 2月, 2020 2 次提交
  10. 30 1月, 2020 2 次提交
  11. 27 1月, 2020 8 次提交
  12. 25 1月, 2020 2 次提交
  13. 24 1月, 2020 1 次提交
  14. 23 1月, 2020 4 次提交
    • E
      net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() · d836f5c6
      Eric Dumazet 提交于
      rtnl_create_link() needs to apply dev->min_mtu and dev->max_mtu
      checks that we apply in do_setlink()
      
      Otherwise malicious users can crash the kernel, for example after
      an integer overflow :
      
      BUG: KASAN: use-after-free in memset include/linux/string.h:365 [inline]
      BUG: KASAN: use-after-free in __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238
      Write of size 32 at addr ffff88819f20b9c0 by task swapper/0/0
      
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:639
       check_memory_region_inline mm/kasan/generic.c:185 [inline]
       check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192
       memset+0x24/0x40 mm/kasan/common.c:108
       memset include/linux/string.h:365 [inline]
       __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238
       alloc_skb include/linux/skbuff.h:1049 [inline]
       alloc_skb_with_frags+0x93/0x590 net/core/skbuff.c:5664
       sock_alloc_send_pskb+0x7ad/0x920 net/core/sock.c:2242
       sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2259
       mld_newpack+0x1d7/0x7f0 net/ipv6/mcast.c:1609
       add_grhead.isra.0+0x299/0x370 net/ipv6/mcast.c:1713
       add_grec+0x7db/0x10b0 net/ipv6/mcast.c:1844
       mld_send_cr net/ipv6/mcast.c:1970 [inline]
       mld_ifc_timer_expire+0x3d3/0x950 net/ipv6/mcast.c:2477
       call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404
       expire_timers kernel/time/timer.c:1449 [inline]
       __run_timers kernel/time/timer.c:1773 [inline]
       __run_timers kernel/time/timer.c:1740 [inline]
       run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786
       __do_softirq+0x262/0x98c kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0x19b/0x1e0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1137
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
       </IRQ>
      RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
      Code: 98 6b ea f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 44 1c 60 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 34 1c 60 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 4e 5d 9a f9 e8 79
      RSP: 0018:ffffffff89807ce8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffffffff13266ae RBX: ffffffff8987a1c0 RCX: 0000000000000000
      RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffffffff8987aa54
      RBP: ffffffff89807d18 R08: ffffffff8987a1c0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
      R13: ffffffff8a799980 R14: 0000000000000000 R15: 0000000000000000
       arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:690
       default_idle_call+0x84/0xb0 kernel/sched/idle.c:94
       cpuidle_idle_call kernel/sched/idle.c:154 [inline]
       do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269
       cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361
       rest_init+0x23b/0x371 init/main.c:451
       arch_call_rest_init+0xe/0x1b
       start_kernel+0x904/0x943 init/main.c:784
       x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490
       x86_64_start_kernel+0x77/0x7b arch/x86/kernel/head64.c:471
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242
      
      The buggy address belongs to the page:
      page:ffffea00067c82c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
      raw: 057ffe0000000000 ffffea00067c82c8 ffffea00067c82c8 0000000000000000
      raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88819f20b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff88819f20b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      >ffff88819f20b980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                                 ^
       ffff88819f20ba00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff88819f20ba80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d836f5c6
    • M
      bpf: Add BPF_FUNC_jiffies64 · 5576b991
      Martin KaFai Lau 提交于
      This patch adds a helper to read the 64bit jiffies.  It will be used
      in a later patch to implement the bpf_cubic.c.
      
      The helper is inlined for jit_requested and 64 BITS_PER_LONG
      as the map_gen_lookup().  Other cases could be considered together
      with map_gen_lookup() if needed.
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200122233646.903260-1-kafai@fb.com
      5576b991
    • M
      net: Fix packet reordering caused by GRO and listified RX cooperation · c8079432
      Maxim Mikityanskiy 提交于
      Commit 323ebb61 ("net: use listified RX for handling GRO_NORMAL
      skbs") introduces batching of GRO_NORMAL packets in napi_frags_finish,
      and commit 6570bc79 ("net: core: use listified Rx for GRO_NORMAL in
      napi_gro_receive()") adds the same to napi_skb_finish. However,
      dev_gro_receive (that is called just before napi_{frags,skb}_finish) can
      also pass skbs to the networking stack: e.g., when the GRO session is
      flushed, napi_gro_complete is called, which passes pp directly to
      netif_receive_skb_internal, skipping napi->rx_list. It means that the
      packet stored in pp will be handled by the stack earlier than the
      packets that arrived before, but are still waiting in napi->rx_list. It
      leads to TCP reorderings that can be observed in the TCPOFOQueue counter
      in netstat.
      
      This commit fixes the reordering issue by making napi_gro_complete also
      use napi->rx_list, so that all packets going through GRO will keep their
      order. In order to keep napi_gro_flush working properly, gro_normal_list
      calls are moved after the flush to clear napi->rx_list.
      
      iwlwifi calls napi_gro_flush directly and does the same thing that is
      done by gro_normal_list, so the same change is applied there:
      napi_gro_flush is moved to be before the flush of napi->rx_list.
      
      A few other drivers also use napi_gro_flush (brocade/bna/bnad.c,
      cortina/gemini.c, hisilicon/hns3/hns3_enet.c). The first two also use
      napi_complete_done afterwards, which performs the gro_normal_list flush,
      so they are fine. The latter calls napi_gro_receive right after
      napi_gro_flush, so it can end up with non-empty napi->rx_list anyway.
      
      Fixes: 323ebb61 ("net: use listified RX for handling GRO_NORMAL skbs")
      Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com>
      Cc: Alexander Lobakin <alobakin@dlink.ru>
      Cc: Edward Cree <ecree@solarflare.com>
      Acked-by: NAlexander Lobakin <alobakin@dlink.ru>
      Acked-by: NSaeed Mahameed <saeedm@mellanox.com>
      Acked-by: NEdward Cree <ecree@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8079432
    • J
      net, sk_msg: Don't check if sock is locked when tearing down psock · 58c8db92
      Jakub Sitnicki 提交于
      As John Fastabend reports [0], psock state tear-down can happen on receive
      path *after* unlocking the socket, if the only other psock user, that is
      sockmap or sockhash, releases its psock reference before tcp_bpf_recvmsg
      does so:
      
       tcp_bpf_recvmsg()
        psock = sk_psock_get(sk)                         <- refcnt 2
        lock_sock(sk);
        ...
                                        sock_map_free()  <- refcnt 1
        release_sock(sk)
        sk_psock_put()                                   <- refcnt 0
      
      Remove the lockdep check for socket lock in psock tear-down that got
      introduced in 7e81a353 ("bpf: Sockmap, ensure sock lock held during
      tear down").
      
      [0] https://lore.kernel.org/netdev/5e25dc995d7d_74082aaee6e465b441@john-XPS-13-9370.notmuch/
      
      Fixes: 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear down")
      Reported-by: syzbot+d73682fcf7fee6982fe3@syzkaller.appspotmail.com
      Suggested-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58c8db92
  15. 22 1月, 2020 1 次提交
  16. 21 1月, 2020 1 次提交
    • J
      net-sysfs: Fix reference count leak · cb626bf5
      Jouni Hogander 提交于
      Netdev_register_kobject is calling device_initialize. In case of error
      reference taken by device_initialize is not given up.
      
      Drivers are supposed to call free_netdev in case of error. In non-error
      case the last reference is given up there and device release sequence
      is triggered. In error case this reference is kept and the release
      sequence is never started.
      
      Fix this by setting reg_state as NETREG_UNREGISTERED if registering
      fails.
      
      This is the rootcause for couple of memory leaks reported by Syzkaller:
      
      BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
        comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
        backtrace:
          [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
          [<000000002340019b>] device_add+0x882/0x1750
          [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
          [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
          [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
          [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
          [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
          [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
          [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
          [<00000000984cabb9>] do_syscall_64+0x16f/0x580
          [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<00000000e6ca2d9f>] 0xffffffffffffffff
      
      BUG: memory leak
      unreferenced object 0xffff8880668ba588 (size 8):
        comm "kobject_set_nam", pid 286, jiffies 4294725297 (age 9.871s)
        hex dump (first 8 bytes):
          6e 72 30 00 cc be df 2b                          nr0....+
        backtrace:
          [<00000000a322332a>] __kmalloc_track_caller+0x16e/0x290
          [<00000000236fd26b>] kstrdup+0x3e/0x70
          [<00000000dd4a2815>] kstrdup_const+0x3e/0x50
          [<0000000049a377fc>] kvasprintf_const+0x10e/0x160
          [<00000000627fc711>] kobject_set_name_vargs+0x5b/0x140
          [<0000000019eeab06>] dev_set_name+0xc0/0xf0
          [<0000000069cb12bc>] netdev_register_kobject+0xc8/0x320
          [<00000000f2e83732>] register_netdevice+0xa1b/0xf00
          [<000000009e1f57cc>] __tun_chr_ioctl+0x20d5/0x3dd0
          [<000000009c560784>] tun_chr_ioctl+0x2f/0x40
          [<000000000d759e02>] do_vfs_ioctl+0x1c7/0x1510
          [<00000000351d7c31>] ksys_ioctl+0x99/0xb0
          [<000000008390040a>] __x64_sys_ioctl+0x78/0xb0
          [<0000000052d196b7>] do_syscall_64+0x16f/0x580
          [<0000000019af9236>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<00000000bc384531>] 0xffffffffffffffff
      
      v3 -> v4:
        Set reg_state to NETREG_UNREGISTERED if registering fails
      
      v2 -> v3:
      * Replaced BUG_ON with WARN_ON in free_netdev and netdev_release
      
      v1 -> v2:
      * Relying on driver calling free_netdev rather than calling
        put_device directly in error path
      
      Reported-by: syzbot+ad8ca40ecd77896d51e2@syzkaller.appspotmail.com
      Cc: David Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: NJouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cb626bf5
  17. 19 1月, 2020 3 次提交