1. 20 2月, 2017 3 次提交
  2. 19 2月, 2017 1 次提交
  3. 18 2月, 2017 3 次提交
    • D
      irda: Fix lockdep annotations in hashbin_delete(). · 4c03b862
      David S. Miller 提交于
      A nested lock depth was added to the hasbin_delete() code but it
      doesn't actually work some well and results in tons of lockdep splats.
      
      Fix the code instead to properly drop the lock around the operation
      and just keep peeking the head of the hashbin queue.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Tested-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c03b862
    • A
      dccp: fix freeing skb too early for IPV6_RECVPKTINFO · 5edabca9
      Andrey Konovalov 提交于
      In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet
      is forcibly freed via __kfree_skb in dccp_rcv_state_process if
      dccp_v6_conn_request successfully returns.
      
      However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb
      is saved to ireq->pktopts and the ref count for skb is incremented in
      dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed
      in dccp_rcv_state_process.
      
      Fix by calling consume_skb instead of doing goto discard and therefore
      calling __kfree_skb.
      
      Similar fixes for TCP:
      
      fb7e2399 [TCP]: skb is unexpectedly freed.
      0aea76d3 tcp: SYN packets are now
      simply consumed
      Signed-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5edabca9
    • A
      packet: Do not call fanout_release from atomic contexts · 2bd624b4
      Anoob Soman 提交于
      Commit 66644982 ("packet: call fanout_release, while UNREGISTERING a
      netdev"), unfortunately, introduced the following issues.
      
      1. calling mutex_lock(&fanout_mutex) (fanout_release()) from inside
      rcu_read-side critical section. rcu_read_lock disables preemption, most often,
      which prohibits calling sleeping functions.
      
      [  ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
      [  ]
      [  ] rcu_scheduler_active = 1, debug_locks = 0
      [  ] 4 locks held by ovs-vswitchd/1969:
      [  ]  #0:  (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40
      [  ]  #1:  (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
      [  ]  #2:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20
      [  ]  #3:  (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0
      [  ]
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110
      [  ]  [<ffffffff810a2da7>] ___might_sleep+0x57/0x210
      [  ]  [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
      [  ]  [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
      [  ]  [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30
      [  ]  [<ffffffff81186e88>] ? printk+0x4d/0x4f
      [  ]  [<ffffffff816106dd>] fanout_release+0x1d/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      2. calling mutex_lock(&fanout_mutex) inside spin_lock(&po->bind_lock).
      "sleeping function called from invalid context"
      
      [  ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
      [  ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
      [  ] INFO: lockdep is turned off.
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff810a2f52>] ___might_sleep+0x202/0x210
      [  ]  [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
      [  ]  [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
      [  ]  [<ffffffff816106dd>] fanout_release+0x1d/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      3. calling dev_remove_pack(&fanout->prot_hook), from inside
      spin_lock(&po->bind_lock) or rcu_read-side critical-section. dev_remove_pack()
      -> synchronize_net(), which might sleep.
      
      [  ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
      [  ] INFO: lockdep is turned off.
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff81186274>] __schedule_bug+0x64/0x73
      [  ]  [<ffffffff8162b8cb>] __schedule+0x6b/0xd10
      [  ]  [<ffffffff8162c5db>] schedule+0x6b/0x80
      [  ]  [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410
      [  ]  [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810
      [  ]  [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10
      [  ]  [<ffffffff8154eab5>] synchronize_net+0x35/0x50
      [  ]  [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20
      [  ]  [<ffffffff8161077e>] fanout_release+0xbe/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      4. fanout_release() races with calls from different CPU.
      
      To fix the above problems, remove the call to fanout_release() under
      rcu_read_lock(). Instead, call __dev_remove_pack(&fanout->prot_hook) and
      netdev_run_todo will be happy that &dev->ptype_specific list is empty. In order
      to achieve this, I moved dev_{add,remove}_pack() out of fanout_{add,release} to
      __fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure
      fanout->prot_hook is removed as well.
      
      Fixes: 66644982 ("packet: call fanout_release, while UNREGISTERING a netdev")
      Reported-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NAnoob Soman <anoob.soman@citrix.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2bd624b4
  4. 16 2月, 2017 2 次提交
    • D
      rhashtable: Revert nested table changes. · bf3f14d6
      David S. Miller 提交于
      This reverts commits:
      
      6a254780
      9dbbfb0a
      40137906
      
      It's too risky to put in this late in the release
      cycle.  We'll put these changes into the next merge
      window instead.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bf3f14d6
    • M
      net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification · 7627ae60
      Marcus Huewe 提交于
      When setting a neigh related sysctl parameter, we always send a
      NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
      executing
      
      	sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000
      
      a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.
      
      This is caused by commit 2a4501ae ("neigh: Send a
      notification when DELAY_PROBE_TIME changes"). According to the
      commit's description, it was intended to generate such an event
      when setting the "delay_first_probe_time" sysctl parameter.
      
      In order to fix this, only generate this event when actually
      setting the "delay_first_probe_time" sysctl parameter. This fix
      should not have any unintended side-effects, because all but one
      registered netevent callbacks check for other netevent event
      types (the registered callbacks were obtained by grepping for
      "register_netevent_notifier"). The only callback that uses the
      NETEVENT_DELAY_PROBE_TIME_UPDATE event is
      mlxsw_sp_router_netevent_event() (in
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
      of this event, it only accesses the DELAY_PROBE_TIME of the
      passed neigh_parms.
      
      Fixes: 2a4501ae ("neigh: Send a notification when DELAY_PROBE_TIME changes")
      Signed-off-by: NMarcus Huewe <suse-tux@gmx.de>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7627ae60
  5. 15 2月, 2017 5 次提交
  6. 14 2月, 2017 2 次提交
    • H
      tipc: Fix tipc_sk_reinit race conditions · 9dbbfb0a
      Herbert Xu 提交于
      There are two problems with the function tipc_sk_reinit.  Firstly
      it's doing a manual walk over an rhashtable.  This is broken as
      an rhashtable can be resized and if you manually walk over it
      during a resize then you may miss entries.
      
      Secondly it's missing memory barriers as previously the code used
      spinlocks which provide the barriers implicitly.
      
      This patch fixes both problems.
      
      Fixes: 07f6c4bc ("tipc: convert tipc reference table to...")
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9dbbfb0a
    • R
      NET: Fix /proc/net/arp for AX.25 · 4872e57c
      Ralf Baechle 提交于
      When sending ARP requests over AX.25 links the hwaddress in the neighbour
      cache are not getting initialized.  For such an incomplete arp entry
      ax2asc2 will generate an empty string resulting in /proc/net/arp output
      like the following:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      172.20.1.99      0x3         0x0              *        bpq0
      
      The missing field will confuse the procfs parsing of arp(8) resulting in
      incorrect output for the device such as the following:
      
      $ arp
      Address                  HWtype  HWaddress           Flags Mask            Iface
      gateway                  ether   52:54:00:00:5d:5f   C                     ens3
      172.20.1.99                      (incomplete)                              ens3
      
      This changes the content of /proc/net/arp to:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      172.20.1.99      0x3         0x0         *                     *        bpq0
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      
      To do so it change ax2asc to put the string "*" in buf for a NULL address
      argument.  Finally the HW address field is left aligned in a 17 character
      field (the length of an ethernet HW address in the usual hex notation) for
      readability.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4872e57c
  7. 13 2月, 2017 1 次提交
  8. 11 2月, 2017 1 次提交
  9. 10 2月, 2017 2 次提交
    • H
      igmp, mld: Fix memory leak in igmpv3/mld_del_delrec() · 9c8bb163
      Hangbin Liu 提交于
      In function igmpv3/mld_add_delrec() we allocate pmc and put it in
      idev->mc_tomb, so we should free it when we don't need it in del_delrec().
      But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.
      
      Fixes: 24803f38 ("igmp: do not remove igmp souce list info when ...")
      Fixes: 1666d49e ("mld: do not remove mld souce list info when ...")
      Reported-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9c8bb163
    • W
      kcm: fix 0-length case for kcm_sendmsg() · 98e3862c
      WANG Cong 提交于
      Dmitry reported a kernel warning:
      
       WARNING: CPU: 3 PID: 2936 at net/kcm/kcmsock.c:627
       kcm_write_msgs+0x12e3/0x1b90 net/kcm/kcmsock.c:627
       CPU: 3 PID: 2936 Comm: a.out Not tainted 4.10.0-rc6+ #209
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:15 [inline]
        dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
        panic+0x1fb/0x412 kernel/panic.c:179
        __warn+0x1c4/0x1e0 kernel/panic.c:539
        warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
        kcm_write_msgs+0x12e3/0x1b90 net/kcm/kcmsock.c:627
        kcm_sendmsg+0x163a/0x2200 net/kcm/kcmsock.c:1029
        sock_sendmsg_nosec net/socket.c:635 [inline]
        sock_sendmsg+0xca/0x110 net/socket.c:645
        sock_write_iter+0x326/0x600 net/socket.c:848
        new_sync_write fs/read_write.c:499 [inline]
        __vfs_write+0x483/0x740 fs/read_write.c:512
        vfs_write+0x187/0x530 fs/read_write.c:560
        SYSC_write fs/read_write.c:607 [inline]
        SyS_write+0xfb/0x230 fs/read_write.c:599
        entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      when calling syscall(__NR_write, sock2, 0x208aaf27ul, 0x0ul) on a KCM
      seqpacket socket. It appears that kcm_sendmsg() does not handle len==0
      case correctly, which causes an empty skb is allocated and queued.
      Fix this by skipping the skb allocation for len==0 case.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      98e3862c
  10. 09 2月, 2017 6 次提交
    • F
      net: dsa: Do not destroy invalid network devices · 382e1eea
      Florian Fainelli 提交于
      dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
      for the network device not being NULL before attempting to destroy it. We were
      not setting the slave network device as NULL if dsa_slave_create() failed, so
      we would later on be calling dsa_slave_destroy() on a now free'd and
      unitialized network device, causing crashes in dsa_slave_destroy().
      
      Fixes: 83c0afae ("net: dsa: Add new binding implementation")
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      382e1eea
    • W
      ping: fix a null pointer dereference · 73d2c667
      WANG Cong 提交于
      Andrey reported a kernel crash:
      
        general protection fault: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff880060048040 task.stack: ffff880069be8000
        RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
        RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
        RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206
        RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000
        RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2
        RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000
        R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0
        R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000
        FS:  00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0
        Call Trace:
         inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
         sock_sendmsg_nosec net/socket.c:635 [inline]
         sock_sendmsg+0xca/0x110 net/socket.c:645
         SYSC_sendto+0x660/0x810 net/socket.c:1687
         SyS_sendto+0x40/0x50 net/socket.c:1655
         entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      This is because we miss a check for NULL pointer for skb_peek() when
      the queue is empty. Other places already have the same check.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Tested-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      73d2c667
    • W
      packet: round up linear to header len · 57031eb7
      Willem de Bruijn 提交于
      Link layer protocols may unconditionally pull headers, as Ethernet
      does in eth_type_trans. Ensure that the entire link layer header
      always lies in the skb linear segment. tpacket_snd has such a check.
      Extend this to packet_snd.
      
      Variable length link layer headers complicate the computation
      somewhat. Here skb->len may be smaller than dev->hard_header_len.
      
      Round up the linear length to be at least as long as the smallest of
      the two.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57031eb7
    • W
      net: introduce device min_header_len · 217e6fa2
      Willem de Bruijn 提交于
      The stack must not pass packets to device drivers that are shorter
      than the minimum link layer header length.
      
      Previously, packet sockets would drop packets smaller than or equal
      to dev->hard_header_len, but this has false positives. Zero length
      payload is used over Ethernet. Other link layer protocols support
      variable length headers. Support for validation of these protocols
      removed the min length check for all protocols.
      
      Introduce an explicit dev->min_header_len parameter and drop all
      packets below this value. Initially, set it to non-zero only for
      Ethernet and loopback. Other protocols can follow in a patch to
      net-next.
      
      Fixes: 9ed988cd ("packet: validate variable length ll headers")
      Reported-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      217e6fa2
    • W
      sit: fix a double free on error path · d7426c69
      WANG Cong 提交于
      Dmitry reported a double free in sit_init_net():
      
        kernel BUG at mm/percpu.c:689!
        invalid opcode: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1
        Hardware name: Google Google Compute Engine/Google Compute Engine,
        BIOS Google 01/01/2011
        task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000
        RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
        RSP: 0018:ffff88017d1df488 EFLAGS: 00010046
        RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000
        RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94
        RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd
        R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80
        R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0
        FS:  00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0
        DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
        Call Trace:
         free_percpu+0x212/0x520 mm/percpu.c:1264
         ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
         sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
         ops_init+0x10a/0x530 net/core/net_namespace.c:115
         setup_net+0x2ed/0x690 net/core/net_namespace.c:291
         copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
         create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
         unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
         SYSC_unshare kernel/fork.c:2281 [inline]
         SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
         entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      This is because when tunnel->dst_cache init fails, we free dev->tstats
      once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
      redundant but its ndo_uinit() does not seem enough to clean up everything
      here. So avoid this by setting dev->tstats to NULL after the first free,
      at least for -net.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7426c69
    • M
      ipv6: addrconf: fix generation of new temporary addresses · a11a7f71
      Marcus Huewe 提交于
      Under some circumstances it is possible that no new temporary addresses
      will be generated.
      
      For instance, addrconf_prefix_rcv_add_addr() indirectly calls
      ipv6_create_tempaddr(), which creates a tentative temporary address and
      starts dad. Next, addrconf_prefix_rcv_add_addr() indirectly calls
      addrconf_verify_rtnl(). Now, assume that the previously created temporary
      address has the least preferred lifetime among all existing addresses and
      is still tentative (that is, dad is still running). Hence, the next run of
      addrconf_verify_rtnl() is performed when the preferred lifetime of the
      temporary address ends. If dad succeeds before the next run, the temporary
      address becomes deprecated during the next run, but no new temporary
      address is generated.
      
      In order to fix this, schedule the next addrconf_verify_rtnl() run slightly
      before the temporary address becomes deprecated, if dad succeeded.
      Signed-off-by: NMarcus Huewe <suse-tux@gmx.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a11a7f71
  11. 08 2月, 2017 3 次提交
  12. 07 2月, 2017 2 次提交
  13. 06 2月, 2017 6 次提交
  14. 05 2月, 2017 2 次提交
  15. 04 2月, 2017 1 次提交
    • E
      net: use a work queue to defer net_disable_timestamp() work · 5fa8bbda
      Eric Dumazet 提交于
      Dmitry reported a warning [1] showing that we were calling
      net_disable_timestamp() -> static_key_slow_dec() from a non
      process context.
      
      Grabbing a mutex while holding a spinlock or rcu_read_lock()
      is not allowed.
      
      As Cong suggested, we now use a work queue.
      
      It is possible netstamp_clear() exits while netstamp_needed_deferred
      is not zero, but it is probably not worth trying to do better than that.
      
      netstamp_needed_deferred atomic tracks the exact number of deferred
      decrements.
      
      [1]
      [ INFO: suspicious RCU usage. ]
      4.10.0-rc5+ #192 Not tainted
      -------------------------------
      ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
      critical section!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 0
      2 locks held by syz-executor14/23111:
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>] lock_sock
      include/net/sock.h:1454 [inline]
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>]
      rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
       #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>] nf_hook
      include/linux/netfilter.h:201 [inline]
       #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>]
      __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160
      
      stack backtrace:
      CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
       rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
       ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
       __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
       mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
       atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
       __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
       static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
       net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
       sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
       __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sock_wfree+0xae/0x120 net/core/sock.c:1645
       skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put include/net/inet_frag.h:133 [inline]
       nf_ct_frag6_gather+0x1106/0x3840
      net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook include/linux/netfilter.h:212 [inline]
       __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
       rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x600 net/socket.c:848
       do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
       do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
       vfs_writev+0x87/0xc0 fs/read_write.c:911
       do_writev+0x110/0x2c0 fs/read_write.c:944
       SYSC_writev fs/read_write.c:1017 [inline]
       SyS_writev+0x27/0x30 fs/read_write.c:1014
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x445559
      RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559
      RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005
      RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
      R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400
      BUG: sleeping function called from invalid context at
      kernel/locking/mutex.c:752
      in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
      INFO: lockdep is turned off.
      CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
       __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
       mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
       atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
       __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
       static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
       net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
       sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
       __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sock_wfree+0xae/0x120 net/core/sock.c:1645
       skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put include/net/inet_frag.h:133 [inline]
       nf_ct_frag6_gather+0x1106/0x3840
      net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook include/linux/netfilter.h:212 [inline]
       __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
       rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x600 net/socket.c:848
       do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
       do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
       vfs_writev+0x87/0xc0 fs/read_write.c:911
       do_writev+0x110/0x2c0 fs/read_write.c:944
       SYSC_writev fs/read_write.c:1017 [inline]
       SyS_writev+0x27/0x30 fs/read_write.c:1014
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x445559
      
      Fixes: b90e5794 ("net: dont call jump_label_dec from irq context")
      Suggested-by: NCong Wang <xiyou.wangcong@gmail.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5fa8bbda
新手
引导
客服 返回
顶部