1. 28 Sep, 2019 18 commits
  2. 27 Sep, 2019 22 commits
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 3c30819d
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-09-27
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix libbpf's BTF dumper to not skip anonymous enum definitions, from Andrii.
      
      2) Fix BTF verifier issues when handling the BTF of vmlinux, from Alexei.
      
      3) Fix nested calls into bpf_event_output() from TCP sockops BPF
         programs, from Allan.
      
      4) Fix NULL pointer dereference in AF_XDP's xsk map creation when
         allocation fails, from Jonathan.
      
      5) Remove unneeded 64 byte alignment requirement of the AF_XDP UMEM
         headroom, from Bjorn.
      
      6) Remove unused XDP_OPTIONS getsockopt() call which results in an error
         on older kernels, from Toke.
      
      7) Fix a client/server race in tcp_rtt BPF kselftest case, from Stanislav.
      
      8) Fix indentation issue in BTF's btf_enum_check_kflag_member(), from Colin.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c30819d
    • D
      Merge branch 'qdisc-destroy' · 5c7ff181
      David S. Miller committed
      Vlad Buslov says:
      
      ====================
      Fix Qdisc destroy issues caused by adding fine-grained locking to filter API
      
      TC filter API unlocking introduced several new fine-grained locks. The
      change caused sleeping-while-atomic BUGs in several Qdiscs that call cls
      APIs which now need to take a mutex while the sch tree spinlock is held.
      This series fixes the affected Qdiscs by ensuring that cls APIs that can
      now sleep are only called outside of the sch tree lock critical section.
      ====================
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5c7ff181
    • V
      net: sched: sch_sfb: don't call qdisc_put() while holding tree lock · e3ae1f96
      Vlad Buslov committed
      Recent changes that removed rtnl dependency from rules update path of tc
      also made tcf_block_put() function sleeping. This function is called from
      ops->destroy() of several Qdisc implementations, which in turn is called by
      qdisc_put(). Some Qdiscs call qdisc_put() while holding the sch tree spinlock,
      which results in a sleeping-while-atomic BUG.
      
      Steps to reproduce for sfb:
      
      tc qdisc add dev ens1f0 handle 1: root sfb
      tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
      tc qdisc change dev ens1f0 root handle 1: sfb
      
      Resulting dmesg:
      
      [ 7265.938717] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
      [ 7265.940152] in_atomic(): 1, irqs_disabled(): 0, pid: 28579, name: tc
      [ 7265.941455] INFO: lockdep is turned off.
      [ 7265.942744] CPU: 11 PID: 28579 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
      [ 7265.944065] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 7265.945396] Call Trace:
      [ 7265.946709]  dump_stack+0x85/0xc0
      [ 7265.947994]  ___might_sleep.cold+0xac/0xbc
      [ 7265.949282]  __mutex_lock+0x5b/0x960
      [ 7265.950543]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 7265.951803]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 7265.953022]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 7265.954248]  tcf_block_put_ext.part.0+0x21/0x50
      [ 7265.955478]  tcf_block_put+0x50/0x70
      [ 7265.956694]  sfq_destroy+0x15/0x50 [sch_sfq]
      [ 7265.957898]  qdisc_destroy+0x5f/0x160
      [ 7265.959099]  sfb_change+0x175/0x330 [sch_sfb]
      [ 7265.960304]  tc_modify_qdisc+0x324/0x840
      [ 7265.961503]  rtnetlink_rcv_msg+0x170/0x4b0
      [ 7265.962692]  ? netlink_deliver_tap+0x95/0x400
      [ 7265.963876]  ? rtnl_dellink+0x2d0/0x2d0
      [ 7265.965064]  netlink_rcv_skb+0x49/0x110
      [ 7265.966251]  netlink_unicast+0x171/0x200
      [ 7265.967427]  netlink_sendmsg+0x224/0x3f0
      [ 7265.968595]  sock_sendmsg+0x5e/0x60
      [ 7265.969753]  ___sys_sendmsg+0x2ae/0x330
      [ 7265.970916]  ? ___sys_recvmsg+0x159/0x1f0
      [ 7265.972074]  ? do_wp_page+0x9c/0x790
      [ 7265.973233]  ? __handle_mm_fault+0xcd3/0x19e0
      [ 7265.974407]  __sys_sendmsg+0x59/0xa0
      [ 7265.975591]  do_syscall_64+0x5c/0xb0
      [ 7265.976753]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 7265.977938] RIP: 0033:0x7f229069f7b8
      [ 7265.979117] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 7265.981681] RSP: 002b:00007ffd7ed2d158 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 7265.983001] RAX: ffffffffffffffda RBX: 000000005d813ca1 RCX: 00007f229069f7b8
      [ 7265.984336] RDX: 0000000000000000 RSI: 00007ffd7ed2d1c0 RDI: 0000000000000003
      [ 7265.985682] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000165c9a0
      [ 7265.987021] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [ 7265.988309] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
      
      In the sfb_change() function, use qdisc_purge_queue() instead of
      qdisc_tree_flush_backlog() to properly reset the old child Qdisc, and save
      a pointer to it in a local temporary variable. Put the reference to the
      Qdisc after the sch tree lock is released so that the potentially sleeping
      cls API is not called in an atomic section. This is safe because the Qdisc
      has already been reset by qdisc_purge_queue() inside the sch tree lock
      critical section.
      
      Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com
      Fixes: c266f64d ("net: sched: protect block state with mutex")
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e3ae1f96
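The ordering change described in this commit can be sketched in user-space C. This is a minimal model, not kernel code: `fake_qdisc`, `tree_lock()`, `fake_purge()` and `fake_put()` are hypothetical stand-ins for the Qdisc, the sch tree lock, `qdisc_purge_queue()` and `qdisc_put()`, and a counter takes the place of the kernel's sleeping-while-atomic check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a child Qdisc; not a kernel type. */
struct fake_qdisc {
    int refcnt;
    int backlog;     /* number of queued packets */
    bool destroyed;
};

static int atomic_depth;     /* > 0 while the fake tree lock is held */
static int sleep_violations; /* sleeping calls made while "atomic" */

static void tree_lock(void)   { atomic_depth++; }
static void tree_unlock(void) { atomic_depth--; }

/* Models a release path that may sleep. */
static void fake_put(struct fake_qdisc *q)
{
    if (atomic_depth > 0)
        sleep_violations++;  /* the kernel would report a BUG here */
    if (--q->refcnt == 0)
        q->destroyed = true;
}

/* Models resetting the child: drops queued packets, never sleeps. */
static void fake_purge(struct fake_qdisc *q)
{
    q->backlog = 0;
}

/* Old ordering: the reference is dropped inside the critical section. */
static void change_child_buggy(struct fake_qdisc **slot,
                               struct fake_qdisc *new_child)
{
    tree_lock();
    fake_purge(*slot);
    fake_put(*slot);         /* may sleep while the lock is held */
    *slot = new_child;
    tree_unlock();
}

/* New ordering: reset and detach under the lock, release afterwards. */
static void change_child_fixed(struct fake_qdisc **slot,
                               struct fake_qdisc *new_child)
{
    struct fake_qdisc *old;

    tree_lock();
    old = *slot;             /* remember the old child */
    fake_purge(old);         /* safe in atomic context */
    *slot = new_child;
    tree_unlock();

    fake_put(old);           /* may sleep; the lock is no longer held */
}
```

Calling `change_child_fixed()` leaves `sleep_violations` at zero, while `change_child_buggy()` increments it once per call.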
    • V
      net: sched: multiq: don't call qdisc_put() while holding tree lock · c2999f7f
      Vlad Buslov committed
      Recent changes that removed rtnl dependency from rules update path of tc
      also made tcf_block_put() function sleeping. This function is called from
      ops->destroy() of several Qdisc implementations, which in turn is called by
      qdisc_put(). Some Qdiscs call qdisc_put() while holding the sch tree spinlock,
      which results in a sleeping-while-atomic BUG.
      
      Steps to reproduce for multiq:
      
      tc qdisc add dev ens1f0 root handle 1: multiq
      tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
      ethtool -L ens1f0 combined 2
      tc qdisc change dev ens1f0 root handle 1: multiq
      
      Resulting dmesg:
      
      [ 5539.419344] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
      [ 5539.420945] in_atomic(): 1, irqs_disabled(): 0, pid: 27658, name: tc
      [ 5539.422435] INFO: lockdep is turned off.
      [ 5539.423904] CPU: 21 PID: 27658 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
      [ 5539.425400] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 5539.426911] Call Trace:
      [ 5539.428380]  dump_stack+0x85/0xc0
      [ 5539.429823]  ___might_sleep.cold+0xac/0xbc
      [ 5539.431262]  __mutex_lock+0x5b/0x960
      [ 5539.432682]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 5539.434103]  ? __nla_validate_parse+0x51/0x840
      [ 5539.435493]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 5539.436903]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 5539.438327]  tcf_block_put_ext.part.0+0x21/0x50
      [ 5539.439752]  tcf_block_put+0x50/0x70
      [ 5539.441165]  sfq_destroy+0x15/0x50 [sch_sfq]
      [ 5539.442570]  qdisc_destroy+0x5f/0x160
      [ 5539.444000]  multiq_tune+0x14a/0x420 [sch_multiq]
      [ 5539.445421]  tc_modify_qdisc+0x324/0x840
      [ 5539.446841]  rtnetlink_rcv_msg+0x170/0x4b0
      [ 5539.448269]  ? netlink_deliver_tap+0x95/0x400
      [ 5539.449691]  ? rtnl_dellink+0x2d0/0x2d0
      [ 5539.451116]  netlink_rcv_skb+0x49/0x110
      [ 5539.452522]  netlink_unicast+0x171/0x200
      [ 5539.453914]  netlink_sendmsg+0x224/0x3f0
      [ 5539.455304]  sock_sendmsg+0x5e/0x60
      [ 5539.456686]  ___sys_sendmsg+0x2ae/0x330
      [ 5539.458071]  ? ___sys_recvmsg+0x159/0x1f0
      [ 5539.459461]  ? do_wp_page+0x9c/0x790
      [ 5539.460846]  ? __handle_mm_fault+0xcd3/0x19e0
      [ 5539.462263]  __sys_sendmsg+0x59/0xa0
      [ 5539.463661]  do_syscall_64+0x5c/0xb0
      [ 5539.465044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 5539.466454] RIP: 0033:0x7f1fe08177b8
      [ 5539.467863] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 5539.470906] RSP: 002b:00007ffe812de5d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 5539.472483] RAX: ffffffffffffffda RBX: 000000005d8135e3 RCX: 00007f1fe08177b8
      [ 5539.474069] RDX: 0000000000000000 RSI: 00007ffe812de640 RDI: 0000000000000003
      [ 5539.475655] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000182e9b0
      [ 5539.477203] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [ 5539.478699] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
      
      Rearrange locking in multiq_tune() in the following ways:

      - In the loop that removes Qdiscs from disabled queues, call
        qdisc_purge_queue() instead of qdisc_tree_flush_backlog() on each Qdisc
        that is being destroyed. Save the Qdisc in a temporarily allocated array
        and call qdisc_put() on each element of the array after the sch tree lock
        is released. This is safe because the Qdiscs have already been reset by
        qdisc_purge_queue() inside the sch tree lock critical section.

      - Make the same change in the second loop, which initializes Qdiscs for
        newly enabled queues in multiq_tune(). Since the sch tree lock is obtained
        and released on each iteration of this loop, just call qdisc_put()
        directly outside of the critical section. Don't verify that the old Qdisc
        is not noop_qdisc before releasing the reference to it, because that check
        is already performed by the qdisc_put*() functions.
      
      Fixes: c266f64d ("net: sched: protect block state with mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2999f7f
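The first loop change in this commit (collect the removed children in a temporary array, release them after unlocking) might look roughly like the following user-space sketch. All names are hypothetical: `shrink_bands()` stands in for the relevant part of `multiq_tune()`, `lock_held` for the sch tree lock, and `fake_put()` for `qdisc_put()`. The sketch assumes `0 <= new_n`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a per-band child Qdisc. */
struct fake_qdisc {
    int refcnt;
    int backlog;
};

static int lock_held;        /* nonzero inside the fake critical section */
static int puts_under_lock;  /* releases that happened while locked */

/* Models a release that may sleep, so it must not run under the lock. */
static void fake_put(struct fake_qdisc *q)
{
    if (lock_held)
        puts_under_lock++;
    q->refcnt--;
}

/*
 * Reduce the number of active bands from old_n to new_n.
 * Children of disabled bands are reset and detached under the lock,
 * remembered in a temporary array, and released after unlocking.
 * Returns the number of children released, or -1 if allocation fails.
 */
static int shrink_bands(struct fake_qdisc **bands, int old_n, int new_n)
{
    struct fake_qdisc **removed;
    int n_removed = 0;
    int i;

    if (new_n >= old_n)
        return 0;

    /* Allocate before taking the lock: allocation may sleep too. */
    removed = calloc((size_t)(old_n - new_n), sizeof(*removed));
    if (!removed)
        return -1;

    lock_held = 1;
    for (i = new_n; i < old_n; i++) {
        if (bands[i]) {
            bands[i]->backlog = 0;           /* reset under the lock */
            removed[n_removed++] = bands[i]; /* defer the release */
            bands[i] = NULL;
        }
    }
    lock_held = 0;

    for (i = 0; i < n_removed; i++)
        fake_put(removed[i]);                /* outside the lock */

    free(removed);
    return n_removed;
}
```

Allocating the array before entering the critical section is a design choice of this sketch; it keeps every potentially blocking call outside the locked region.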
    • V
      net: sched: sch_htb: don't call qdisc_put() while holding tree lock · 4ce70b4a
      Vlad Buslov committed
      Recent changes that removed rtnl dependency from rules update path of tc
      also made tcf_block_put() function sleeping. This function is called from
      ops->destroy() of several Qdisc implementations, which in turn is called by
      qdisc_put(). Some Qdiscs call qdisc_put() while holding the sch tree spinlock,
      which results in a sleeping-while-atomic BUG.
      
      Steps to reproduce for htb:
      
      tc qdisc add dev ens1f0 root handle 1: htb default 12
      tc class add dev ens1f0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
      tc qdisc add dev ens1f0 parent 1:1 handle 40: sfq perturb 10
      tc class add dev ens1f0 parent 1:1 classid 1:2 htb rate 100kbps ceil 100kbps
      
      Resulting dmesg:
      
      [ 4791.148551] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
      [ 4791.151354] in_atomic(): 1, irqs_disabled(): 0, pid: 27273, name: tc
      [ 4791.152805] INFO: lockdep is turned off.
      [ 4791.153605] CPU: 19 PID: 27273 Comm: tc Tainted: G        W         5.3.0-rc8+ #721
      [ 4791.154336] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [ 4791.155075] Call Trace:
      [ 4791.155803]  dump_stack+0x85/0xc0
      [ 4791.156529]  ___might_sleep.cold+0xac/0xbc
      [ 4791.157251]  __mutex_lock+0x5b/0x960
      [ 4791.157966]  ? console_unlock+0x363/0x5d0
      [ 4791.158676]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 4791.159395]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 4791.160103]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
      [ 4791.160815]  tcf_block_put_ext.part.0+0x21/0x50
      [ 4791.161530]  tcf_block_put+0x50/0x70
      [ 4791.162233]  sfq_destroy+0x15/0x50 [sch_sfq]
      [ 4791.162936]  qdisc_destroy+0x5f/0x160
      [ 4791.163642]  htb_change_class.cold+0x5df/0x69d [sch_htb]
      [ 4791.164505]  tc_ctl_tclass+0x19d/0x480
      [ 4791.165360]  rtnetlink_rcv_msg+0x170/0x4b0
      [ 4791.166191]  ? netlink_deliver_tap+0x95/0x400
      [ 4791.166907]  ? rtnl_dellink+0x2d0/0x2d0
      [ 4791.167625]  netlink_rcv_skb+0x49/0x110
      [ 4791.168345]  netlink_unicast+0x171/0x200
      [ 4791.169058]  netlink_sendmsg+0x224/0x3f0
      [ 4791.169771]  sock_sendmsg+0x5e/0x60
      [ 4791.170475]  ___sys_sendmsg+0x2ae/0x330
      [ 4791.171183]  ? ___sys_recvmsg+0x159/0x1f0
      [ 4791.171894]  ? do_wp_page+0x9c/0x790
      [ 4791.172595]  ? __handle_mm_fault+0xcd3/0x19e0
      [ 4791.173309]  __sys_sendmsg+0x59/0xa0
      [ 4791.174024]  do_syscall_64+0x5c/0xb0
      [ 4791.174725]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 4791.175435] RIP: 0033:0x7f0aa41497b8
      [ 4791.176129] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
      [ 4791.177532] RSP: 002b:00007fff4e37d588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 4791.178243] RAX: ffffffffffffffda RBX: 000000005d8132f7 RCX: 00007f0aa41497b8
      [ 4791.178947] RDX: 0000000000000000 RSI: 00007fff4e37d5f0 RDI: 0000000000000003
      [ 4791.179662] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000020149a0
      [ 4791.180382] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
      [ 4791.181100] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
      
      In the htb_change_class() function, save parent->leaf.q to a local temporary
      variable and put the reference to it after the sch tree lock is released, so
      that the potentially sleeping cls API is not called in an atomic section.
      This is safe because the Qdisc has already been reset by qdisc_purge_queue()
      inside the sch tree lock critical section.
      
      Fixes: c266f64d ("net: sched: protect block state with mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ce70b4a
    • K
      net/rds: Check laddr_check before calling it · 05733434
      Ka-Cheong Poon committed
      In rds_bind(), laddr_check is called without first checking whether it is
      NULL. In addition, rs_transport should be reset if rds_add_bound() fails.
      
      Fixes: c5c1a030 ("net/rds: An rds_sock is added too early to the hash table")
      Reported-by: syzbot+fae39afd2101a17ec624@syzkaller.appspotmail.com
      Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      05733434
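The two points in this commit message (guard an optional callback before calling it, and undo a tentative assignment when a later step fails) can be illustrated with a small user-space sketch. Everything here is hypothetical: `fake_transport`, `fake_sock`, `fake_add_bound()` and `fake_bind()` only mirror the shape of the description, and the error codes are chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical transport with an optional address-check callback. */
struct fake_transport {
    int (*laddr_check)(unsigned int addr); /* may be NULL; 0 means OK */
};

/* Hypothetical socket that remembers its transport once bound. */
struct fake_sock {
    const struct fake_transport *transport;
    unsigned int bound_addr;
};

/* Example callback that accepts every address. */
static int accept_any(unsigned int addr)
{
    (void)addr;
    return 0;
}

/* Models the step that can fail after the transport was chosen.
 * Address 0 is treated as already in use, purely for the demo. */
static int fake_add_bound(struct fake_sock *s, unsigned int addr)
{
    if (addr == 0)
        return -EADDRINUSE;
    s->bound_addr = addr;
    return 0;
}

static int fake_bind(struct fake_sock *s, const struct fake_transport *t,
                     unsigned int addr)
{
    int ret;

    /* Guard: never call through a NULL function pointer. */
    if (!t->laddr_check || t->laddr_check(addr) != 0)
        return -ENOPROTOOPT;

    s->transport = t;
    ret = fake_add_bound(s, addr);
    if (ret)
        s->transport = NULL; /* roll back so no stale transport remains */
    return ret;
}
```

With a transport whose `laddr_check` is NULL, `fake_bind()` returns an error instead of crashing, and a failed `fake_add_bound()` leaves `s->transport` cleared.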
    • D
      Merge branch 'SO_PRIORITY' · 4e1e83be
      David S. Miller committed
      Eric Dumazet says:
      
      ====================
      tcp: provide correct skb->priority
      
      The SO_PRIORITY socket option requests that TCP egress packets
      carry a user-provided value.

      TCP manages to send most packets with the requested value,
      notably in the TCP_ESTABLISHED state, but fails to do so for
      a few packets.
      
      These packets are control packets sent on behalf
      of SYN_RECV or TIME_WAIT states.
      
      Note that testing this with packetdrill is a bit
      of a hassle, since packetdrill cannot verify the priority
      of egress packets other than through indirect observations,
      for example by using sch_prio on its tunnel device.
      
      The bad skb priorities cause problems for GCP,
      as this field is one of the keys used in routing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e1e83be
    • E
      tcp: honor SO_PRIORITY in TIME_WAIT state · f6c0f5d2
      Eric Dumazet committed
      ctl packets sent on behalf of TIME_WAIT sockets currently
      have a zero skb->priority, which can cause various problems.
      
      In this patch we:
      
      - add a tw_priority field in struct inet_timewait_sock.
      
      - populate it from sk->sk_priority when a TIME_WAIT socket is created.
      
      - For IPv4, change ip_send_unicast_reply() and its two
        callers to propagate tw_priority correctly.
        ip_send_unicast_reply() no longer changes sk->sk_priority.
      
      - For IPv6, make sure TIME_WAIT sockets pass their tw_priority
        field to tcp_v6_send_response() and tcp_v6_send_ack().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6c0f5d2
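The idea of carrying the priority across the transition to TIME_WAIT can be shown with a tiny user-space model. The struct and function names below are hypothetical simplifications; only the field names `sk_priority` and `tw_priority` come from the commit message.

```c
#include <stdint.h>

/* Simplified stand-ins; the real kernel structures are far larger. */
struct fake_sock          { uint32_t sk_priority; };
struct fake_timewait_sock { uint32_t tw_priority; };
struct fake_skb           { uint32_t priority; };

/* When the full socket is replaced by a TIME_WAIT entry, copy the
 * priority, because the full socket is gone afterwards. */
static void fake_enter_time_wait(const struct fake_sock *sk,
                                 struct fake_timewait_sock *tw)
{
    tw->tw_priority = sk->sk_priority;
}

/* Control packets sent on behalf of the TIME_WAIT entry take their
 * priority from the saved copy instead of defaulting to zero. */
static void fake_send_tw_reply(const struct fake_timewait_sock *tw,
                               struct fake_skb *reply)
{
    reply->priority = tw->tw_priority;
}
```

After `fake_enter_time_wait()`, later changes to (or destruction of) the original socket no longer affect the priority used for replies.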
    • E
      ipv6: tcp: provide sk->sk_priority to ctl packets · e9a5dcee
      Eric Dumazet committed
      We can populate skb->priority for some ctl packets
      instead of always using zero.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9a5dcee
    • E
      ipv6: add priority parameter to ip6_xmit() · 4f6570d7
      Eric Dumazet committed
      Currently, ip6_xmit() sets skb->priority based on sk->sk_priority.

      This is not desirable for TCP since TCP shares the same ctl socket
      for a given netns. We want to be able to send RST or ACK packets
      with a non-zero skb->priority.
      
      This patch has no functional change.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f6570d7
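Why an explicit parameter helps when one control socket is shared can be seen in a small user-space sketch. The functions `fake_xmit_old()` and `fake_xmit()` are hypothetical and do not reproduce the real `ip6_xmit()` signature.

```c
#include <stdint.h>

struct fake_sock { uint32_t sk_priority; };
struct fake_skb  { uint32_t priority; };

/* Before: the priority is always read from the sending socket. With a
 * shared control socket, every control packet gets the same value. */
static void fake_xmit_old(const struct fake_sock *sk, struct fake_skb *skb)
{
    skb->priority = sk->sk_priority;
}

/* After: the caller passes the priority explicitly. Existing callers
 * pass sk->sk_priority, so their behavior does not change. */
static void fake_xmit(const struct fake_sock *sk, struct fake_skb *skb,
                      uint32_t priority)
{
    (void)sk; /* still available for other per-socket settings */
    skb->priority = priority;
}
```

A regular caller writes `fake_xmit(sk, skb, sk->sk_priority)` and sees the same result as before, while a caller using the shared control socket can pass the priority of the connection it is answering for.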
    • A
      bpf: Fix bpf_event_output re-entry issue · 768fb61f
      Allan Zhang committed
      A BPF_PROG_TYPE_SOCK_OPS program can re-enter bpf_event_output() because it
      can be called from both atomic and non-atomic contexts, since we don't have
      bpf_prog_active to prevent it from happening.
      
      This patch enables 3 levels of nesting to support normal, irq and nmi
      context.
      
      We can easily reproduce the issue by running netperf in crr mode with 100
      flows and 10 threads from the netperf client side.
      
      Here is the whole stack dump:
      
      [  515.228898] WARNING: CPU: 20 PID: 14686 at kernel/trace/bpf_trace.c:549 bpf_event_output+0x1f9/0x220
      [  515.228903] CPU: 20 PID: 14686 Comm: tcp_crr Tainted: G        W        4.15.0-smp-fixpanic #44
      [  515.228904] Hardware name: Intel TBG,ICH10/Ikaria_QC_1b, BIOS 1.22.0 06/04/2018
      [  515.228905] RIP: 0010:bpf_event_output+0x1f9/0x220
      [  515.228906] RSP: 0018:ffff9a57ffc03938 EFLAGS: 00010246
      [  515.228907] RAX: 0000000000000012 RBX: 0000000000000001 RCX: 0000000000000000
      [  515.228907] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff836b0f80
      [  515.228908] RBP: ffff9a57ffc039c8 R08: 0000000000000004 R09: 0000000000000012
      [  515.228908] R10: ffff9a57ffc1de40 R11: 0000000000000000 R12: 0000000000000002
      [  515.228909] R13: ffff9a57e13bae00 R14: 00000000ffffffff R15: ffff9a57ffc1e2c0
      [  515.228910] FS:  00007f5a3e6ec700(0000) GS:ffff9a57ffc00000(0000) knlGS:0000000000000000
      [  515.228910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  515.228911] CR2: 0000537082664fff CR3: 000000061fed6002 CR4: 00000000000226f0
      [  515.228911] Call Trace:
      [  515.228913]  <IRQ>
      [  515.228919]  [<ffffffff82c6c6cb>] bpf_sockopt_event_output+0x3b/0x50
      [  515.228923]  [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10
      [  515.228927]  [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100
      [  515.228930]  [<ffffffff82cf90a5>] ? tcp_init_transfer+0x125/0x150
      [  515.228933]  [<ffffffff82cf9159>] ? tcp_finish_connect+0x89/0x110
      [  515.228936]  [<ffffffff82cf98e4>] ? tcp_rcv_state_process+0x704/0x1010
      [  515.228939]  [<ffffffff82c6e263>] ? sk_filter_trim_cap+0x53/0x2a0
      [  515.228942]  [<ffffffff82d90d1f>] ? tcp_v6_inbound_md5_hash+0x6f/0x1d0
      [  515.228945]  [<ffffffff82d92160>] ? tcp_v6_do_rcv+0x1c0/0x460
      [  515.228947]  [<ffffffff82d93558>] ? tcp_v6_rcv+0x9f8/0xb30
      [  515.228951]  [<ffffffff82d737c0>] ? ip6_route_input+0x190/0x220
      [  515.228955]  [<ffffffff82d5f7ad>] ? ip6_protocol_deliver_rcu+0x6d/0x450
      [  515.228958]  [<ffffffff82d60246>] ? ip6_rcv_finish+0xb6/0x170
      [  515.228961]  [<ffffffff82d5fb90>] ? ip6_protocol_deliver_rcu+0x450/0x450
      [  515.228963]  [<ffffffff82d60361>] ? ipv6_rcv+0x61/0xe0
      [  515.228966]  [<ffffffff82d60190>] ? ipv6_list_rcv+0x330/0x330
      [  515.228969]  [<ffffffff82c4976b>] ? __netif_receive_skb_one_core+0x5b/0xa0
      [  515.228972]  [<ffffffff82c497d1>] ? __netif_receive_skb+0x21/0x70
      [  515.228975]  [<ffffffff82c4a8d2>] ? process_backlog+0xb2/0x150
      [  515.228978]  [<ffffffff82c4aadf>] ? net_rx_action+0x16f/0x410
      [  515.228982]  [<ffffffff830000dd>] ? __do_softirq+0xdd/0x305
      [  515.228986]  [<ffffffff8252cfdc>] ? irq_exit+0x9c/0xb0
      [  515.228989]  [<ffffffff82e02de5>] ? smp_call_function_single_interrupt+0x65/0x120
      [  515.228991]  [<ffffffff82e020e1>] ? call_function_single_interrupt+0x81/0x90
      [  515.228992]  </IRQ>
      [  515.228996]  [<ffffffff82a11ff0>] ? io_serial_in+0x20/0x20
      [  515.229000]  [<ffffffff8259c040>] ? console_unlock+0x230/0x490
      [  515.229003]  [<ffffffff8259cbaa>] ? vprintk_emit+0x26a/0x2a0
      [  515.229006]  [<ffffffff8259cbff>] ? vprintk_default+0x1f/0x30
      [  515.229008]  [<ffffffff8259d9f5>] ? vprintk_func+0x35/0x70
      [  515.229011]  [<ffffffff8259d4bb>] ? printk+0x50/0x66
      [  515.229013]  [<ffffffff82637637>] ? bpf_event_output+0xb7/0x220
      [  515.229016]  [<ffffffff82c6c6cb>] ? bpf_sockopt_event_output+0x3b/0x50
      [  515.229019]  [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10
      [  515.229023]  [<ffffffff82c29e87>] ? release_sock+0x97/0xb0
      [  515.229026]  [<ffffffff82ce9d6a>] ? tcp_recvmsg+0x31a/0xda0
      [  515.229029]  [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100
      [  515.229032]  [<ffffffff82ce77c1>] ? tcp_set_state+0x191/0x1b0
      [  515.229035]  [<ffffffff82ced10e>] ? tcp_disconnect+0x2e/0x600
      [  515.229038]  [<ffffffff82cecbbb>] ? tcp_close+0x3eb/0x460
      [  515.229040]  [<ffffffff82d21082>] ? inet_release+0x42/0x70
      [  515.229043]  [<ffffffff82d58809>] ? inet6_release+0x39/0x50
      [  515.229046]  [<ffffffff82c1f32d>] ? __sock_release+0x4d/0xd0
      [  515.229049]  [<ffffffff82c1f3e5>] ? sock_close+0x15/0x20
      [  515.229052]  [<ffffffff8273b517>] ? __fput+0xe7/0x1f0
      [  515.229055]  [<ffffffff8273b66e>] ? ____fput+0xe/0x10
      [  515.229058]  [<ffffffff82547bf2>] ? task_work_run+0x82/0xb0
      [  515.229061]  [<ffffffff824086df>] ? exit_to_usermode_loop+0x7e/0x11f
      [  515.229064]  [<ffffffff82408171>] ? do_syscall_64+0x111/0x130
      [  515.229067]  [<ffffffff82e0007c>] ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: a5a3a828 ("bpf: add perf event notificaton support for sock_ops")
      Signed-off-by: Allan Zhang <allanzhang@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20190925234312.94063-2-allanzhang@google.com
      768fb61f
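A bounded nesting scheme like the one this commit describes (three levels, one scratch buffer per level) can be sketched in single-threaded user-space C. This is only a model under stated assumptions: the real code is per-CPU and the re-entry comes from interrupts, whereas here `reenter()` simulates it by recursion. All names are hypothetical.

```c
#include <errno.h>

#define MAX_NEST 3 /* models normal, irq and nmi context */

struct scratch { char data[64]; };

static struct scratch scratch_bufs[MAX_NEST]; /* one buffer per level */
static int nest_level;                        /* current nesting depth */

typedef int (*event_fn)(struct scratch *buf, void *ctx);

/* Run fn with a scratch buffer that no outer invocation is using.
 * Returns -EBUSY when nesting goes deeper than MAX_NEST. */
static int event_output(event_fn fn, void *ctx)
{
    int level = ++nest_level;
    int ret;

    if (level > MAX_NEST) {
        ret = -EBUSY;
        goto out;
    }
    ret = fn(&scratch_bufs[level - 1], ctx);
out:
    nest_level--; /* always undo the increment, even on error */
    return ret;
}

/* Demo callback: writes to its buffer, then re-enters event_output()
 * until *remaining reaches zero, simulating an interrupt that arrives
 * while the buffer is in use. */
static int reenter(struct scratch *buf, void *ctx)
{
    int *remaining = ctx;

    buf->data[0] = (char)*remaining;
    if (*remaining > 0) {
        (*remaining)--;
        return event_output(reenter, remaining);
    }
    return 0;
}
```

With `remaining` set to 2 the call chain reaches depth 3 and succeeds; with 3 it would need a fourth buffer, so the innermost call fails with `-EBUSY` instead of overwriting a buffer that is still in use.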
    • A
      net: dsa: qca8k: Fix port enable for CPU port · 2b6fd3ea
      Andrew Lunn committed
      The CPU port does not have a PHY connected to it, so calling
      phy_support_asym_pause() results in an Oops. As with other DSA
      drivers, add a guard checking that the port is a user port.
      Reported-by: Michal Vokáč <michal.vokac@ysoft.com>
      Fixes: 0394a63a ("net: dsa: enable and disable all ports")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Tested-by: Michal Vokáč <michal.vokac@ysoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2b6fd3ea
    • E
      sch_netem: fix rcu splat in netem_enqueue() · 159d2c7d
      Eric Dumazet committed
      qdisc_root() use from netem_enqueue() triggers a lockdep warning.
      
      __dev_queue_xmit() uses rcu_read_lock_bh() which is
      not equivalent to rcu_read_lock() + local_bh_disable() as far
      as lockdep is concerned.
      
      WARNING: suspicious RCU usage
      5.3.0-rc7+ #0 Not tainted
      -----------------------------
      include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      3 locks held by syz-executor427/8855:
       #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline]
       #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214
       #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804
       #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
       #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline]
       #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838
      
      stack backtrace:
      CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357
       qdisc_root include/net/sch_generic.h:492 [inline]
       netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479
       __dev_xmit_skb net/core/dev.c:3527 [inline]
       __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838
       dev_queue_xmit+0x18/0x20 net/core/dev.c:3902
       neigh_hh_output include/net/neighbour.h:500 [inline]
       neigh_output include/net/neighbour.h:509 [inline]
       ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:308 [inline]
       __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290
       ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125
       ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555
       udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887
       udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174
       inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:657
       ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
       __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
       __do_sys_sendmmsg net/socket.c:2442 [inline]
       __se_sys_sendmmsg net/socket.c:2439 [inline]
       __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
       do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      159d2c7d
    • E
      kcm: disable preemption in kcm_parse_func_strparser() · 0355d6c1
      Eric Dumazet committed
      After commit a2c11b03 ("kcm: use BPF_PROG_RUN")
      syzbot easily triggers the warning in cant_sleep().
      
      As explained in commit 6cab5e90 ("bpf: run bpf programs
      with preemption disabled") we need to disable preemption before
      running bpf programs.
      
      BUG: assuming atomic context at net/kcm/kcmsock.c:382
      in_atomic(): 0, irqs_disabled(): 0, pid: 7, name: kworker/u4:0
      3 locks held by kworker/u4:0/7:
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: __write_once_size include/linux/compiler.h:226 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: set_work_data kernel/workqueue.c:620 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
       #0: ffff888216726128 ((wq_completion)kstrp){+.+.}, at: process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
       #1: ffff8880a989fdc0 ((work_completion)(&strp->work)){+.+.}, at: process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
       #2: ffff888098998d10 (sk_lock-AF_INET){+.+.}, at: lock_sock include/net/sock.h:1522 [inline]
       #2: ffff888098998d10 (sk_lock-AF_INET){+.+.}, at: strp_sock_lock+0x2e/0x40 net/strparser/strparser.c:440
      CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.3.0+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: kstrp strp_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       __cant_sleep kernel/sched/core.c:6826 [inline]
       __cant_sleep.cold+0xa4/0xbc kernel/sched/core.c:6803
       kcm_parse_func_strparser+0x54/0x200 net/kcm/kcmsock.c:382
       __strp_recv+0x5dc/0x1b20 net/strparser/strparser.c:221
       strp_recv+0xcf/0x10b net/strparser/strparser.c:343
       tcp_read_sock+0x285/0xa00 net/ipv4/tcp.c:1639
       strp_read_sock+0x14d/0x200 net/strparser/strparser.c:366
       do_strp_work net/strparser/strparser.c:414 [inline]
       strp_work+0xe3/0x130 net/strparser/strparser.c:423
       process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
      
      Fixes: a2c11b03 ("kcm: use BPF_PROG_RUN")
      Fixes: 6cab5e90 ("bpf: run bpf programs with preemption disabled")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0355d6c1
    • D
      net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse() · 23104218
      Dan Carpenter committed
      The "gmac->phy_mode" variable is an enum and in this context GCC will
      treat it as an unsigned int so the error handling will never be
      triggered.
      
      Fixes: b1c17215 ("stmmac: add ipq806x glue layer")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23104218
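The signedness problem described in this commit can be avoided with a pattern like the one below. This is a generic sketch and not necessarily the change made in the commit: the enum, `fake_get_phy_mode()` and `fake_probe()` are hypothetical. It keeps the return value in an `int` until the error check is done, so the comparison with zero is always signed.

```c
#include <errno.h>

/* Hypothetical enum with no negative enumerators; a compiler may give
 * it an unsigned underlying type. */
enum fake_phy_mode {
    FAKE_PHY_MODE_NA,
    FAKE_PHY_MODE_MII,
    FAKE_PHY_MODE_RGMII,
};

struct fake_priv { enum fake_phy_mode phy_mode; };

/* Hypothetical lookup: returns a mode (>= 0) or a negative errno. */
static int fake_get_phy_mode(int have_property)
{
    return have_property ? FAKE_PHY_MODE_RGMII : -ENODEV;
}

/*
 * Problematic form (shown only as a comment):
 *
 *     priv->phy_mode = fake_get_phy_mode(...);
 *     if (priv->phy_mode < 0)   // never true if the enum is unsigned
 *         return priv->phy_mode;
 *
 * Safer form: test the int first, then convert to the enum.
 */
static int fake_probe(struct fake_priv *priv, int have_property)
{
    int ret = fake_get_phy_mode(have_property);

    if (ret < 0)
        return ret; /* error path is reachable */

    priv->phy_mode = (enum fake_phy_mode)ret;
    return 0;
}
```

On failure, `fake_probe()` returns the negative error code and leaves `priv->phy_mode` untouched.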
    • D
      net: nixge: Fix a signedness bug in nixge_probe() · 1a4b62a0
      Dan Carpenter committed
      The "priv->phy_mode" is an enum and in this context GCC will treat it
      as an unsigned int so it can never be less than zero.
      
      Fixes: 492caffa ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a4b62a0
    • D
      of: mdio: Fix a signedness bug in of_phy_get_and_connect() · d7eb6512
      Dan Carpenter committed
      The "iface" variable is an enum and in this context GCC treats it as
      an unsigned int so the error handling is never triggered.
      
      Fixes: b7862412 ("of_mdio: Abstract a general interface for phy connect")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7eb6512
    • D
      net: axienet: fix a signedness bug in probe · 73e211e1
      Dan Carpenter committed
      The "lp->phy_mode" is an enum but in this context GCC treats it as an
      unsigned int so the error handling is never triggered.
      
      Fixes: ee06b172 ("net: axienet: add support for standard phy-mode binding")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      73e211e1
    • D
      net: stmmac: dwmac-meson8b: Fix signedness bug in probe · f1021051
      Dan Carpenter committed
      The "dwmac->phy_mode" is an enum and in this context GCC treats it as
      an unsigned int so the error handling is never triggered.
      
      Fixes: 566e8251 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f1021051
    • D
      net: socionext: Fix a signedness bug in ave_probe() · 7f9e88e6
      Dan Carpenter committed
      The "phy_mode" variable is an enum and in this context GCC treats it as
      an unsigned int so the error handling is never triggered.
      
      Fixes: 4c270b55 ("net: ethernet: socionext: add AVE ethernet driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f9e88e6
    • D
      enetc: Fix a signedness bug in enetc_of_get_phy() · ced81eb8
      Dan Carpenter committed
      The "priv->if_mode" is type phy_interface_t which is an enum.  In this
      context GCC will treat the enum as an unsigned int so this error
      handling is never triggered.
      
      Fixes: d4fd0404 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ced81eb8
    • D
      net: netsec: Fix signedness bug in netsec_probe() · bd55f8dd
      Dan Carpenter committed
      The "priv->phy_interface" variable is an enum and in this context GCC
      will treat it as an unsigned int so the error handling is never
      triggered.
      
      Fixes: 533dd11a ("net: socionext: Add Synquacer NetSec driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd55f8dd