1. 10 1月, 2020 2 次提交
  2. 09 1月, 2020 17 次提交
    • T
      tipc: fix wrong connect() return code · 9546a0b7
      Tuong Lien 提交于
      The current 'tipc_wait_for_connect()' function does a wait-loop for the
      condition 'sk->sk_state != TIPC_CONNECTING' to conclude if the socket
      connecting has done. However, when the condition is met, it returns '0'
      even in the case the connecting is actually failed, the socket state is
      set to 'TIPC_DISCONNECTING' (e.g. when the server socket has closed..).
      This results in a wrong return code for the 'connect()' call from user,
      making it believe that the connection is established and go ahead with
      building, sending a message, etc. but finally failed e.g. '-EPIPE'.
      
      This commit fixes the issue by changing the wait condition to the
      'tipc_sk_connected(sk)', so the function will return '0' only when the
      connection is really established. Otherwise, either the socket 'sk_err'
      if any or '-ETIMEDOUT'/'-EINTR' will be returned correspondingly.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NTuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9546a0b7
    • T
      tipc: fix link overflow issue at socket shutdown · 49afb806
      Tuong Lien 提交于
      When a socket is suddenly shutdown or released, it will reject all the
      unreceived messages in its receive queue. This applies to a connected
      socket too, whereas there is only one 'FIN' message required to be sent
      back to its peer in this case.
      
      In case there are many messages in the queue and/or some connections
      with such messages are shutdown at the same time, the link layer will
      easily get overflowed at the 'TIPC_SYSTEM_IMPORTANCE' backlog level
      because of the message rejections. As a result, the link will be taken
      down. Moreover, immediately when the link is re-established, the socket
      layer can continue to reject the messages and the same issue happens...
      
      The commit refactors the '__tipc_shutdown()' function to only send one
      'FIN' in the situation mentioned above. For the connectionless case, it
      is unavoidable but usually there is no rejections for such socket
      messages because they are 'dest-droppable' by default.
      
      In addition, the new code makes the other socket states clear
      (e.g.'TIPC_LISTEN') and treats as a separate case to avoid misbehaving.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NTuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      49afb806
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · b73a6561
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Missing netns context in arp_tables, from Florian Westphal.
      
      2) Underflow in flowtable reference counter, from wenxu.
      
      3) Fix incorrect ethernet destination address in flowtable offload,
         from wenxu.
      
      4) Check for status of neighbour entry, from wenxu.
      
      5) Fix NAT port mangling, from wenxu.
      
      6) Unbind callbacks from destroy path to cleanup hardware properly
         on flowtable removal.
      
      7) Fix missing casting statistics timestamp, add nf_flowtable_time_stamp
         and use it.
      
      8) NULL pointer exception when timeout argument is null in conntrack
         dccp and sctp protocol helpers, from Florian Westphal.
      
      9) Possible nul-dereference in ipset with IPSET_ATTR_LINENO, also from
         Florian.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b73a6561
    • F
      netfilter: ipset: avoid null deref when IPSET_ATTR_LINENO is present · 22dad713
      Florian Westphal 提交于
      The set uadt functions assume lineno is never NULL, but it is in
      case of ip_set_utest().
      
      syzkaller managed to generate a netlink message that calls this with
      LINENO attr present:
      
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      RIP: 0010:hash_mac4_uadt+0x1bc/0x470 net/netfilter/ipset/ip_set_hash_mac.c:104
      Call Trace:
       ip_set_utest+0x55b/0x890 net/netfilter/ipset/ip_set_core.c:1867
       nfnetlink_rcv_msg+0xcf2/0xfb0 net/netfilter/nfnetlink.c:229
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       nfnetlink_rcv+0x1ba/0x460 net/netfilter/nfnetlink.c:563
      
      pass a dummy lineno storage, its easier than patching all set
      implementations.
      
      This seems to be a day-0 bug.
      
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Reported-by: syzbot+34bd2369d38707f3f4a7@syzkaller.appspotmail.com
      Fixes: a7b4f989 ("netfilter: ipset: IP set core support")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      22dad713
    • F
      netfilter: conntrack: dccp, sctp: handle null timeout argument · 1d9a7acd
      Florian Westphal 提交于
      The timeout pointer can be NULL which means we should modify the
      per-nets timeout instead.
      
      All do this, except sctp and dccp which instead give:
      
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      net/netfilter/nf_conntrack_proto_dccp.c:682
       ctnl_timeout_parse_policy+0x150/0x1d0 net/netfilter/nfnetlink_cttimeout.c:67
       cttimeout_default_set+0x150/0x1c0 net/netfilter/nfnetlink_cttimeout.c:368
       nfnetlink_rcv_msg+0xcf2/0xfb0 net/netfilter/nfnetlink.c:229
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
      
      Reported-by: syzbot+46a4ad33f345d1dd346e@syzkaller.appspotmail.com
      Fixes: c779e849 ("netfilter: conntrack: remove get_timeout() indirection")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1d9a7acd
    • A
      atm: eni: fix uninitialized variable warning · 30780d08
      Arnd Bergmann 提交于
      With -O3, gcc has found an actual unintialized variable stored
      into an mmio register in two instances:
      
      drivers/atm/eni.c: In function 'discard':
      drivers/atm/eni.c:465:13: error: 'dma[1]' is used uninitialized in this function [-Werror=uninitialized]
         writel(dma[i*2+1],eni_dev->rx_dma+dma_wr*8+4);
                   ^
      drivers/atm/eni.c:465:13: error: 'dma[3]' is used uninitialized in this function [-Werror=uninitialized]
      
      Change the code to always write zeroes instead.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30780d08
    • E
      macvlan: do not assume mac_header is set in macvlan_broadcast() · 96cc4b69
      Eric Dumazet 提交于
      Use of eth_hdr() in tx path is error prone.
      
      Many drivers call skb_reset_mac_header() before using it,
      but others do not.
      
      Commit 6d1ccff6 ("net: reset mac header in dev_start_xmit()")
      attempted to fix this generically, but commit d346a3fa
      ("packet: introduce PACKET_QDISC_BYPASS socket option") brought
      back the macvlan bug.
      
      Lets add a new helper, so that tx paths no longer have
      to call skb_reset_mac_header() only to get a pointer
      to skb->data.
      
      Hopefully we will be able to revert 6d1ccff6
      ("net: reset mac header in dev_start_xmit()") and save few cycles
      in transmit fast path.
      
      BUG: KASAN: use-after-free in __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
      BUG: KASAN: use-after-free in mc_hash drivers/net/macvlan.c:251 [inline]
      BUG: KASAN: use-after-free in macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
      Read of size 4 at addr ffff8880a4932401 by task syz-executor947/9579
      
      CPU: 0 PID: 9579 Comm: syz-executor947 Not tainted 5.5.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:639
       __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:145
       __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
       mc_hash drivers/net/macvlan.c:251 [inline]
       macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
       macvlan_queue_xmit drivers/net/macvlan.c:520 [inline]
       macvlan_start_xmit+0x402/0x77f drivers/net/macvlan.c:559
       __netdev_start_xmit include/linux/netdevice.h:4447 [inline]
       netdev_start_xmit include/linux/netdevice.h:4461 [inline]
       dev_direct_xmit+0x419/0x630 net/core/dev.c:4079
       packet_direct_xmit+0x1a9/0x250 net/packet/af_packet.c:240
       packet_snd net/packet/af_packet.c:2966 [inline]
       packet_sendmsg+0x260d/0x6220 net/packet/af_packet.c:2991
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:659
       __sys_sendto+0x262/0x380 net/socket.c:1985
       __do_sys_sendto net/socket.c:1997 [inline]
       __se_sys_sendto net/socket.c:1993 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1993
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x442639
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffc13549e08 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442639
      RDX: 000000000000000e RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000403bb0 R14: 0000000000000000 R15: 0000000000000000
      
      Allocated by task 9389:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       __kasan_kmalloc mm/kasan/common.c:513 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527
       __do_kmalloc mm/slab.c:3656 [inline]
       __kmalloc+0x163/0x770 mm/slab.c:3665
       kmalloc include/linux/slab.h:561 [inline]
       tomoyo_realpath_from_path+0xc5/0x660 security/tomoyo/realpath.c:252
       tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
       tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
       tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
       security_inode_getattr+0xf2/0x150 security/security.c:1222
       vfs_getattr+0x25/0x70 fs/stat.c:115
       vfs_statx_fd+0x71/0xc0 fs/stat.c:145
       vfs_fstat include/linux/fs.h:3265 [inline]
       __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
       __se_sys_newfstat fs/stat.c:375 [inline]
       __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 9389:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       kasan_set_free_info mm/kasan/common.c:335 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x10a/0x2c0 mm/slab.c:3757
       tomoyo_realpath_from_path+0x1a7/0x660 security/tomoyo/realpath.c:289
       tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
       tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
       tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
       security_inode_getattr+0xf2/0x150 security/security.c:1222
       vfs_getattr+0x25/0x70 fs/stat.c:115
       vfs_statx_fd+0x71/0xc0 fs/stat.c:145
       vfs_fstat include/linux/fs.h:3265 [inline]
       __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
       __se_sys_newfstat fs/stat.c:375 [inline]
       __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8880a4932000
       which belongs to the cache kmalloc-4k of size 4096
      The buggy address is located 1025 bytes inside of
       4096-byte region [ffff8880a4932000, ffff8880a4933000)
      The buggy address belongs to the page:
      page:ffffea0002924c80 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
      raw: 00fffe0000010200 ffffea0002846208 ffffea00028f3888 ffff8880aa402000
      raw: 0000000000000000 ffff8880a4932000 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880a4932300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4932380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880a4932400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
       ffff8880a4932480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4932500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: b863ceb7 ("[NET]: Add macvlan driver")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      96cc4b69
    • D
      Merge branch 'net-ungraft-prio' · 2f806c2a
      David S. Miller 提交于
      Petr Machata says:
      
      ====================
      When ungrafting from PRIO, replace child with FIFO
      
      When a child Qdisc is removed from one of the PRIO Qdisc's bands, it is
      replaced unconditionally by a NOOP qdisc. As a result, any traffic hitting
      that band gets dropped. That is incorrect--no Qdisc was explicitly added
      when PRIO was created, and after removal, none should have to be added
      either.
      
      In patch #2, this problem is fixed for PRIO by first attempting to create a
      default Qdisc and only falling back to noop when that fails. This pattern
      of attempting to create an invisible FIFO, using NOOP only as a fallback,
      is also seen in some other Qdiscs.
      
      The only driver currently offloading PRIO (and thus presumably the only one
      impacted by this) is mlxsw. Therefore patch #1 extends mlxsw to handle the
      replacement by an invisible FIFO gracefully.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f806c2a
    • P
      net: sch_prio: When ungrafting, replace with FIFO · 240ce7f6
      Petr Machata 提交于
      When a child Qdisc is removed from one of the PRIO Qdisc's bands, it is
      replaced unconditionally by a NOOP qdisc. As a result, any traffic hitting
      that band gets dropped. That is incorrect--no Qdisc was explicitly added
      when PRIO was created, and after removal, none should have to be added
      either.
      
      Fix PRIO by first attempting to create a default Qdisc and only falling
      back to noop when that fails. This pattern of attempting to create an
      invisible FIFO, using NOOP only as a fallback, is also seen in other
      Qdiscs.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NPetr Machata <petrm@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      240ce7f6
    • P
      mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFO · 3971a535
      Petr Machata 提交于
      The following patch will change PRIO to replace a removed Qdisc with an
      invisible FIFO, instead of NOOP. mlxsw will see this replacement due to the
      graft message that is generated. But because FIFO does not issue its own
      REPLACE message, when the graft operation takes place, the Qdisc that mlxsw
      tracks under the indicated band is still the old one. The child
      handle (0:0) therefore does not match, and mlxsw rejects the graft
      operation, which leads to an extack message:
      
          Warning: Offloading graft operation failed.
      
      Fix by ignoring the invisible children in the PRIO graft handler. The
      DESTROY message of the removed Qdisc is going to follow shortly and handle
      the removal.
      
      Fixes: 32dc5efc ("mlxsw: spectrum: qdiscs: prio: Handle graft command")
      Signed-off-by: NPetr Machata <petrm@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3971a535
    • N
      MAINTAINERS: Remove myself as co-maintainer for qcom-ethqos · cb6f74a1
      Niklas Cassel 提交于
      As I am no longer with Linaro, I no longer have access to documentation
      for this IP. The Linaro email will start bouncing soon.
      
      Vinod is fully capable to maintain this driver by himself, therefore
      remove myself as co-maintainer for qcom-ethqos.
      Signed-off-by: NNiklas Cassel <niklas.cassel@wdc.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cb6f74a1
    • E
      gtp: fix bad unlock balance in gtp_encap_enable_socket · 90d72256
      Eric Dumazet 提交于
      WARNING: bad unlock balance detected!
      5.5.0-rc5-syzkaller #0 Not tainted
      -------------------------------------
      syz-executor921/9688 is trying to release lock (sk_lock-AF_INET6) at:
      [<ffffffff84bf8506>] gtp_encap_enable_socket+0x146/0x400 drivers/net/gtp.c:830
      but there are no more locks to release!
      
      other info that might help us debug this:
      2 locks held by syz-executor921/9688:
       #0: ffffffff8a4d8840 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:72 [inline]
       #0: ffffffff8a4d8840 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x405/0xaf0 net/core/rtnetlink.c:5421
       #1: ffff88809304b560 (slock-AF_INET6){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
       #1: ffff88809304b560 (slock-AF_INET6){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2951
      
      stack backtrace:
      CPU: 0 PID: 9688 Comm: syz-executor921 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_unlock_imbalance_bug kernel/locking/lockdep.c:4008 [inline]
       print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3984
       __lock_release kernel/locking/lockdep.c:4242 [inline]
       lock_release+0x5f2/0x960 kernel/locking/lockdep.c:4503
       sock_release_ownership include/net/sock.h:1496 [inline]
       release_sock+0x17c/0x1c0 net/core/sock.c:2961
       gtp_encap_enable_socket+0x146/0x400 drivers/net/gtp.c:830
       gtp_encap_enable drivers/net/gtp.c:852 [inline]
       gtp_newlink+0x9fc/0xc60 drivers/net/gtp.c:666
       __rtnl_newlink+0x109e/0x1790 net/core/rtnetlink.c:3305
       rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3363
       rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5424
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x58c/0x7d0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:659
       ____sys_sendmsg+0x753/0x880 net/socket.c:2330
       ___sys_sendmsg+0x100/0x170 net/socket.c:2384
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg net/socket.c:2424 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2424
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x445d49
      Code: e8 bc b7 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f8019074db8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000006dac38 RCX: 0000000000445d49
      RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
      RBP: 00000000006dac30 R08: 0000000000000004 R09: 0000000000000000
      R10: 0000000000000008 R11: 0000000000000246 R12: 00000000006dac3c
      R13: 00007ffea687f6bf R14: 00007f80190759c0 R15: 20c49ba5e353f7cf
      
      Fixes: e198987e ("gtp: fix suspicious RCU usage")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90d72256
    • E
      pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM · d9e15a27
      Eric Dumazet 提交于
      As diagnosed by Florian :
      
      If TCA_FQ_QUANTUM is set to 0x80000000, fq_deueue()
      can loop forever in :
      
      if (f->credit <= 0) {
        f->credit += q->quantum;
        goto begin;
      }
      
      ... because f->credit is either 0 or -2147483648.
      
      Let's limit TCA_FQ_QUANTUM to no more than 1 << 20 :
      This max value should limit risks of breaking user setups
      while fixing this bug.
      
      Fixes: afe4fd06 ("pkt_sched: fq: Fair Queue packet scheduler")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Diagnosed-by: NFlorian Westphal <fw@strlen.de>
      Reported-by: syzbot+dc9071cc5a85950bdfce@syzkaller.appspotmail.com
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9e15a27
    • M
      tipc: remove meaningless assignment in Makefile · b969fee1
      Masahiro Yamada 提交于
      There is no module named tipc_diag.
      
      The assignment to tipc_diag-y has no effect.
      Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b969fee1
    • M
      tipc: do not add socket.o to tipc-y twice · ea04b445
      Masahiro Yamada 提交于
      net/tipc/Makefile adds socket.o twice.
      
      tipc-y	+= addr.o bcast.o bearer.o \
                 core.o link.o discover.o msg.o  \
                 name_distr.o  subscr.o monitor.o name_table.o net.o  \
                 netlink.o netlink_compat.o node.o socket.o eth_media.o \
                                                   ^^^^^^^^
                 topsrv.o socket.o group.o trace.o
                          ^^^^^^^^
      Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ea04b445
    • C
      net: stmmac: dwmac-sun8i: Allow all RGMII modes · f1239d8a
      Chen-Yu Tsai 提交于
      Allow all the RGMII modes to be used. This would allow us to represent
      the hardware better in the device tree with RGMII_ID where in most
      cases the PHY's internal delay for both RX and TX are used.
      
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Signed-off-by: NChen-Yu Tsai <wens@csie.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f1239d8a
    • C
      net: stmmac: dwmac-sunxi: Allow all RGMII modes · 52cc73e5
      Chen-Yu Tsai 提交于
      Allow all the RGMII modes to be used. This would allow us to represent
      the hardware better in the device tree with RGMII_ID where in most
      cases the PHY's internal delay for both RX and TX are used.
      
      Fixes: af0bd4e9 ("net: stmmac: sunxi platform extensions for GMAC in Allwinner A20 SoC's")
      Signed-off-by: NChen-Yu Tsai <wens@csie.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52cc73e5
  3. 08 1月, 2020 7 次提交
  4. 07 1月, 2020 14 次提交
    • D
      Merge tag 'mlx5-fixes-2020-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · c101fffc
      David S. Miller 提交于
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2020-01-06
      
      This series introduces some fixes to mlx5 driver.
      
      Please pull and let me know if there is any problem.
      
      For -stable v5.3
       ('net/mlx5: Move devlink registration before interfaces load')
      
      For -stable v5.4
       ('net/mlx5e: Fix hairpin RSS table size')
       ('net/mlx5: DR, Init lists that are used in rule's member')
       ('net/mlx5e: Always print health reporter message to dmesg')
       ('net/mlx5: DR, No need for atomic refcount for internal SW steering resources')
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c101fffc
    • E
      net/mlx5: DR, Init lists that are used in rule's member · df55c558
      Erez Shitrit 提交于
      Whenever adding new member of rule object we attach it to 2 lists,
      These 2 lists should be initialized first.
      
      Fixes: 41d07074 ("net/mlx5: DR, Expose steering rule functionality")
      Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      df55c558
    • E
      net/mlx5e: Fix hairpin RSS table size · 6412bb39
      Eli Cohen 提交于
      Set hairpin table size to the corret size, based on the groups that
      would be created in it. Groups are laid out on the table such that a
      group occupies a range of entries in the table. This implies that the
      group ranges should have correspondence to the table they are laid upon.
      
      The patch cited below  made group 1's size to grow hence causing
      overflow of group range laid on the table.
      
      Fixes: a795d8db ("net/mlx5e: Support RSS for IP-in-IP and IPv6 tunneled packets")
      Signed-off-by: NEli Cohen <eli@mellanox.com>
      Signed-off-by: NMark Bloch <markb@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      6412bb39
    • Y
      net/mlx5: DR, No need for atomic refcount for internal SW steering resources · 4ce380ca
      Yevgeny Kliteynik 提交于
      No need for an atomic refcounter for the STE and hashtables.
      These are internal SW steering resources and they are always
      under domain mutex.
      
      This also fixes the following refcount error:
        refcount_t: addition on 0; use-after-free.
        WARNING: CPU: 9 PID: 3527 at lib/refcount.c:25 refcount_warn_saturate+0x81/0xe0
        Call Trace:
         dr_table_init_nic+0x10d/0x110 [mlx5_core]
         mlx5dr_table_create+0xb4/0x230 [mlx5_core]
         mlx5_cmd_dr_create_flow_table+0x39/0x120 [mlx5_core]
         __mlx5_create_flow_table+0x221/0x5f0 [mlx5_core]
         esw_create_offloads_fdb_tables+0x180/0x5a0 [mlx5_core]
         ...
      
      Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities")
      Signed-off-by: NYevgeny Kliteynik <kliteyn@mellanox.com>
      Reviewed-by: NAlex Vesker <valex@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      4ce380ca
    • P
      Revert "net/mlx5: Support lockless FTE read lookups" · 1f0593e7
      Parav Pandit 提交于
      This reverts commit 7dee607e.
      
      During cleanup path, FTE's parent node group is removed which is
      referenced by the FTE while freeing the FTE.
      Hence FTE's lockless read lookup optimization done in cited commit is
      not possible at the moment.
      
      Hence, revert the commit.
      
      This avoid below KAZAN call trace.
      
      [  110.390896] BUG: KASAN: use-after-free in find_root.isra.14+0x56/0x60
      [mlx5_core]
      [  110.391048] Read of size 4 at addr ffff888c19e6d220 by task
      swapper/12/0
      
      [  110.391219] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 5.5.0-rc1+
      [  110.391222] Hardware name: HP ProLiant DL380p Gen8, BIOS P70
      08/02/2014
      [  110.391225] Call Trace:
      [  110.391229]  <IRQ>
      [  110.391246]  dump_stack+0x95/0xd5
      [  110.391307]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
      [  110.391320]  print_address_description.constprop.5+0x20/0x320
      [  110.391379]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
      [  110.391435]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
      [  110.391441]  __kasan_report+0x149/0x18c
      [  110.391499]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
      [  110.391504]  kasan_report+0x12/0x20
      [  110.391511]  __asan_report_load4_noabort+0x14/0x20
      [  110.391567]  find_root.isra.14+0x56/0x60 [mlx5_core]
      [  110.391625]  del_sw_fte_rcu+0x4a/0x100 [mlx5_core]
      [  110.391633]  rcu_core+0x404/0x1950
      [  110.391640]  ? rcu_accelerate_cbs_unlocked+0x100/0x100
      [  110.391649]  ? run_rebalance_domains+0x201/0x280
      [  110.391654]  rcu_core_si+0xe/0x10
      [  110.391661]  __do_softirq+0x181/0x66c
      [  110.391670]  irq_exit+0x12c/0x150
      [  110.391675]  smp_apic_timer_interrupt+0xf0/0x370
      [  110.391681]  apic_timer_interrupt+0xf/0x20
      [  110.391684]  </IRQ>
      [  110.391695] RIP: 0010:cpuidle_enter_state+0xfa/0xba0
      [  110.391703] Code: 3d c3 9b b5 50 e8 56 75 6e fe 48 89 45 c8 0f 1f 44
      00 00 31 ff e8 a6 94 6e fe 45 84 ff 0f 85 f6 02 00 00 fb 66 0f 1f 44 00
      00 <45> 85 f6 0f 88 db 06 00 00 4d 63 fe 4b 8d 04 7f 49 8d 04 87 49 8d
      [  110.391706] RSP: 0018:ffff888c23a6fce8 EFLAGS: 00000246 ORIG_RAX:
      ffffffffffffff13
      [  110.391712] RAX: dffffc0000000000 RBX: ffffe8ffff7002f8 RCX:
      000000000000001f
      [  110.391715] RDX: 1ffff11184ee6cb5 RSI: 0000000040277d83 RDI:
      ffff888c277365a8
      [  110.391718] RBP: ffff888c23a6fd40 R08: 0000000000000002 R09:
      0000000000035280
      [  110.391721] R10: ffff888c23a6fc80 R11: ffffed11847485d0 R12:
      ffffffffb1017740
      [  110.391723] R13: 0000000000000003 R14: 0000000000000003 R15:
      0000000000000000
      [  110.391732]  ? cpuidle_enter_state+0xea/0xba0
      [  110.391738]  cpuidle_enter+0x4f/0xa0
      [  110.391747]  call_cpuidle+0x6d/0xc0
      [  110.391752]  do_idle+0x360/0x430
      [  110.391758]  ? arch_cpu_idle_exit+0x40/0x40
      [  110.391765]  ? complete+0x67/0x80
      [  110.391771]  cpu_startup_entry+0x1d/0x20
      [  110.391779]  start_secondary+0x2f3/0x3c0
      [  110.391784]  ? set_cpu_sibling_map+0x2500/0x2500
      [  110.391795]  secondary_startup_64+0xa4/0xb0
      
      [  110.391841] Allocated by task 290:
      [  110.391917]  save_stack+0x21/0x90
      [  110.391921]  __kasan_kmalloc.constprop.8+0xa7/0xd0
      [  110.391925]  kasan_kmalloc+0x9/0x10
      [  110.391929]  kmem_cache_alloc_trace+0xf6/0x270
      [  110.391987]  create_root_ns.isra.36+0x58/0x260 [mlx5_core]
      [  110.392044]  mlx5_init_fs+0x5fd/0x1ee0 [mlx5_core]
      [  110.392092]  mlx5_load_one+0xc7a/0x3860 [mlx5_core]
      [  110.392139]  init_one+0x6ff/0xf90 [mlx5_core]
      [  110.392145]  local_pci_probe+0xde/0x190
      [  110.392150]  work_for_cpu_fn+0x56/0xa0
      [  110.392153]  process_one_work+0x678/0x1140
      [  110.392157]  worker_thread+0x573/0xba0
      [  110.392162]  kthread+0x341/0x400
      [  110.392166]  ret_from_fork+0x1f/0x40
      
      [  110.392218] Freed by task 2742:
      [  110.392288]  save_stack+0x21/0x90
      [  110.392292]  __kasan_slab_free+0x137/0x190
      [  110.392296]  kasan_slab_free+0xe/0x10
      [  110.392299]  kfree+0x94/0x250
      [  110.392357]  tree_put_node+0x257/0x360 [mlx5_core]
      [  110.392413]  tree_remove_node+0x63/0xb0 [mlx5_core]
      [  110.392469]  clean_tree+0x199/0x240 [mlx5_core]
      [  110.392525]  mlx5_cleanup_fs+0x76/0x580 [mlx5_core]
      [  110.392572]  mlx5_unload+0x22/0xc0 [mlx5_core]
      [  110.392619]  mlx5_unload_one+0x99/0x260 [mlx5_core]
      [  110.392666]  remove_one+0x61/0x160 [mlx5_core]
      [  110.392671]  pci_device_remove+0x10b/0x2c0
      [  110.392677]  device_release_driver_internal+0x1e4/0x490
      [  110.392681]  device_driver_detach+0x36/0x40
      [  110.392685]  unbind_store+0x147/0x200
      [  110.392688]  drv_attr_store+0x6f/0xb0
      [  110.392693]  sysfs_kf_write+0x127/0x1d0
      [  110.392697]  kernfs_fop_write+0x296/0x420
      [  110.392702]  __vfs_write+0x66/0x110
      [  110.392707]  vfs_write+0x1a0/0x500
      [  110.392711]  ksys_write+0x164/0x250
      [  110.392715]  __x64_sys_write+0x73/0xb0
      [  110.392720]  do_syscall_64+0x9f/0x3a0
      [  110.392725]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 7dee607e ("net/mlx5: Support lockless FTE read lookups")
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      1f0593e7
    • M
      net/mlx5: Move devlink registration before interfaces load · a6f3b623
      Michael Guralnik 提交于
      Register devlink before interfaces are added.
      This will allow interfaces to use devlink while initalizing. For example,
      call mlx5_is_roce_enabled.
      
      Fixes: aba25279 ("net/mlx5e: Add TX reporter support")
      Signed-off-by: NMichael Guralnik <michaelgur@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      a6f3b623
    • E
      net/mlx5e: Always print health reporter message to dmesg · 99cda454
      Eran Ben Elisha 提交于
      In case a reporter exists, error message is logged only to the devlink
      tracer. The devlink tracer is a visibility utility only, which user can
      choose not to monitor.
      After cited patch, 3rd party monitoring tools that tracks these error
      message will no longer find them in dmesg, causing a regression.
      
      With this patch, error messages are also logged into the dmesg.
      
      Fixes: c50de4af ("net/mlx5e: Generalize tx reporter's functionality")
      Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      99cda454
    • D
      net/mlx5e: Avoid duplicating rule destinations · 554fe75c
      Dmytro Linkin 提交于
      Following scenario easily break driver logic and crash the kernel:
      1. Add rule with mirred actions to same device.
      2. Delete this rule.
      In described scenario rule is not added to database and on deletion
      driver access invalid entry.
      Example:
      
       $ tc filter add dev ens1f0_0 ingress protocol ip prio 1 \
             flower skip_sw \
             action mirred egress mirror dev ens1f0_1 pipe \
             action mirred egress redirect dev ens1f0_1
       $ tc filter del dev ens1f0_0 ingress protocol ip prio 1
      
      Dmesg output:
      
      [  376.634396] mlx5_core 0000:82:00.0: mlx5_cmd_check:756:(pid 3439): DESTROY_FLOW_GROUP(0x934) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x563e2f)
      [  376.654983] mlx5_core 0000:82:00.0: del_hw_flow_group:567:(pid 3439): flow steering can't destroy fg 89 of ft 3145728
      [  376.673433] kasan: CONFIG_KASAN_INLINE enabled
      [  376.683769] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  376.695229] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
      [  376.705069] CPU: 7 PID: 3439 Comm: tc Not tainted 5.4.0-rc5+ #76
      [  376.714959] Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0a 08/12/2016
      [  376.726371] RIP: 0010:mlx5_del_flow_rules+0x105/0x960 [mlx5_core]
      [  376.735817] Code: 01 00 00 00 48 83 eb 08 e8 28 d9 ff ff 4c 39 e3 75 d8 4c 8d bd c0 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 84 04 00 00 48 8d 7d 28 8b 9 d
      [  376.761261] RSP: 0018:ffff888847c56db8 EFLAGS: 00010202
      [  376.770054] RAX: dffffc0000000000 RBX: ffff8888582a6da0 RCX: ffff888847c56d60
      [  376.780743] RDX: 0000000000000058 RSI: 0000000000000008 RDI: 0000000000000282
      [  376.791328] RBP: 0000000000000000 R08: fffffbfff0c60ea6 R09: fffffbfff0c60ea6
      [  376.802050] R10: fffffbfff0c60ea5 R11: ffffffff8630752f R12: ffff8888582a6da0
      [  376.812798] R13: dffffc0000000000 R14: ffff8888582a6da0 R15: 00000000000002c0
      [  376.823445] FS:  00007f675f9a8840(0000) GS:ffff88886d200000(0000) knlGS:0000000000000000
      [  376.834971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  376.844179] CR2: 00000000007d9640 CR3: 00000007d3f26003 CR4: 00000000001606e0
      [  376.854843] Call Trace:
      [  376.868542]  __mlx5_eswitch_del_rule+0x49/0x300 [mlx5_core]
      [  376.877735]  mlx5e_tc_del_fdb_flow+0x6ec/0x9e0 [mlx5_core]
      [  376.921549]  mlx5e_flow_put+0x2b/0x50 [mlx5_core]
      [  376.929813]  mlx5e_delete_flower+0x5b6/0xbd0 [mlx5_core]
      [  376.973030]  tc_setup_cb_reoffload+0x29/0xc0
      [  376.980619]  fl_reoffload+0x50a/0x770 [cls_flower]
      [  377.015087]  tcf_block_playback_offloads+0xbd/0x250
      [  377.033400]  tcf_block_setup+0x1b2/0xc60
      [  377.057247]  tcf_block_offload_cmd+0x195/0x240
      [  377.098826]  tcf_block_offload_unbind+0xe7/0x180
      [  377.107056]  __tcf_block_put+0xe5/0x400
      [  377.114528]  ingress_destroy+0x3d/0x60 [sch_ingress]
      [  377.122894]  qdisc_destroy+0xf1/0x5a0
      [  377.129993]  qdisc_graft+0xa3d/0xe50
      [  377.151227]  tc_get_qdisc+0x48e/0xa20
      [  377.165167]  rtnetlink_rcv_msg+0x35d/0x8d0
      [  377.199528]  netlink_rcv_skb+0x11e/0x340
      [  377.219638]  netlink_unicast+0x408/0x5b0
      [  377.239913]  netlink_sendmsg+0x71b/0xb30
      [  377.267505]  sock_sendmsg+0xb1/0xf0
      [  377.273801]  ___sys_sendmsg+0x635/0x900
      [  377.312784]  __sys_sendmsg+0xd3/0x170
      [  377.338693]  do_syscall_64+0x95/0x460
      [  377.344833]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  377.352321] RIP: 0033:0x7f675e58e090
      
      To avoid this, for every mirred action check if output device was
      already processed. If so - drop rule with EOPNOTSUPP error.
      Signed-off-by: NDmytro Linkin <dmitrolin@mellanox.com>
      Reviewed-by: NRoi Dayan <roid@mellanox.com>
      Reviewed-by: NVlad Buslov <vladbu@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      554fe75c
    • D
      bpf: Fix passing modified ctx to ld/abs/ind instruction · 6d4f151a
      Daniel Borkmann 提交于
      Anatoly has been fuzzing with kBdysch harness and reported a KASAN
      slab oob in one of the outcomes:
      
        [...]
        [   77.359642] BUG: KASAN: slab-out-of-bounds in bpf_skb_load_helper_8_no_cache+0x71/0x130
        [   77.360463] Read of size 4 at addr ffff8880679bac68 by task bpf/406
        [   77.361119]
        [   77.361289] CPU: 2 PID: 406 Comm: bpf Not tainted 5.5.0-rc2-xfstests-00157-g2187f215 #1
        [   77.362134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
        [   77.362984] Call Trace:
        [   77.363249]  dump_stack+0x97/0xe0
        [   77.363603]  print_address_description.constprop.0+0x1d/0x220
        [   77.364251]  ? bpf_skb_load_helper_8_no_cache+0x71/0x130
        [   77.365030]  ? bpf_skb_load_helper_8_no_cache+0x71/0x130
        [   77.365860]  __kasan_report.cold+0x37/0x7b
        [   77.366365]  ? bpf_skb_load_helper_8_no_cache+0x71/0x130
        [   77.366940]  kasan_report+0xe/0x20
        [   77.367295]  bpf_skb_load_helper_8_no_cache+0x71/0x130
        [   77.367821]  ? bpf_skb_load_helper_8+0xf0/0xf0
        [   77.368278]  ? mark_lock+0xa3/0x9b0
        [   77.368641]  ? kvm_sched_clock_read+0x14/0x30
        [   77.369096]  ? sched_clock+0x5/0x10
        [   77.369460]  ? sched_clock_cpu+0x18/0x110
        [   77.369876]  ? bpf_skb_load_helper_8+0xf0/0xf0
        [   77.370330]  ___bpf_prog_run+0x16c0/0x28f0
        [   77.370755]  __bpf_prog_run32+0x83/0xc0
        [   77.371153]  ? __bpf_prog_run64+0xc0/0xc0
        [   77.371568]  ? match_held_lock+0x1b/0x230
        [   77.371984]  ? rcu_read_lock_held+0xa1/0xb0
        [   77.372416]  ? rcu_is_watching+0x34/0x50
        [   77.372826]  sk_filter_trim_cap+0x17c/0x4d0
        [   77.373259]  ? sock_kzfree_s+0x40/0x40
        [   77.373648]  ? __get_filter+0x150/0x150
        [   77.374059]  ? skb_copy_datagram_from_iter+0x80/0x280
        [   77.374581]  ? do_raw_spin_unlock+0xa5/0x140
        [   77.375025]  unix_dgram_sendmsg+0x33a/0xa70
        [   77.375459]  ? do_raw_spin_lock+0x1d0/0x1d0
        [   77.375893]  ? unix_peer_get+0xa0/0xa0
        [   77.376287]  ? __fget_light+0xa4/0xf0
        [   77.376670]  __sys_sendto+0x265/0x280
        [   77.377056]  ? __ia32_sys_getpeername+0x50/0x50
        [   77.377523]  ? lock_downgrade+0x350/0x350
        [   77.377940]  ? __sys_setsockopt+0x2a6/0x2c0
        [   77.378374]  ? sock_read_iter+0x240/0x240
        [   77.378789]  ? __sys_socketpair+0x22a/0x300
        [   77.379221]  ? __ia32_sys_socket+0x50/0x50
        [   77.379649]  ? mark_held_locks+0x1d/0x90
        [   77.380059]  ? trace_hardirqs_on_thunk+0x1a/0x1c
        [   77.380536]  __x64_sys_sendto+0x74/0x90
        [   77.380938]  do_syscall_64+0x68/0x2a0
        [   77.381324]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [   77.381878] RIP: 0033:0x44c070
        [...]
      
      After further debugging, turns out while in case of other helper functions
      we disallow passing modified ctx, the special case of ld/abs/ind instruction
      which has similar semantics (except r6 being the ctx argument) is missing
      such check. Modified ctx is impossible here as bpf_skb_load_helper_8_no_cache()
      and others are expecting skb fields in original position, hence, add
      check_ctx_reg() to reject any modified ctx. Issue was first introduced back
      in f1174f77 ("bpf/verifier: rework value tracking").
      
      Fixes: f1174f77 ("bpf/verifier: rework value tracking")
      Reported-by: NAnatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200106215157.3553-1-daniel@iogearbox.net
      6d4f151a
    • D
      Merge branch 'atlantic-bugfixes' · d76063c5
      David S. Miller 提交于
      Igor Russkikh says:
      
      ====================
      Aquantia/Marvell atlantic bugfixes 2020/01
      
      Here is a set of recently discovered bugfixes,
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d76063c5
    • I
      net: atlantic: remove duplicate entries · b585f860
      Igor Russkikh 提交于
      Function entries were duplicated accidentally, removing the dups.
      
      Fixes: ea4b4d7f ("net: atlantic: loopback tests via private flags")
      Signed-off-by: NIgor Russkikh <irusskikh@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b585f860
    • I
      net: atlantic: loopback configuration in improper place · 883daa18
      Igor Russkikh 提交于
      Initial loopback configuration should be called earlier, before
      starting traffic on HW blocks. Otherwise depending on race conditions
      it could be kept disabled.
      
      Fixes: ea4b4d7f ("net: atlantic: loopback tests via private flags")
      Signed-off-by: NIgor Russkikh <irusskikh@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      883daa18
    • I
      net: atlantic: broken link status on old fw · ac70957e
      Igor Russkikh 提交于
      Last code/checkpatch cleanup did a copy paste error where code from
      firmware 3 API logic was moved to firmware 1 logic.
      
      This resulted in FW1.x users would never see the link state as active.
      
      Fixes: 7b0c342f ("net: atlantic: code style cleanup")
      Signed-off-by: NIgor Russkikh <irusskikh@marvell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac70957e
    • R
      bpf: cgroup: prevent out-of-order release of cgroup bpf · e10360f8
      Roman Gushchin 提交于
      Before commit 4bfc0bb2 ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
      cgroup bpf structures were released with
      corresponding cgroup structures. It guaranteed the hierarchical order
      of destruction: children were always first. It preserved attached
      programs from being released before their propagated copies.
      
      But with cgroup auto-detachment there are no such guarantees anymore:
      cgroup bpf is released as soon as the cgroup is offline and there are
      no live associated sockets. It means that an attached program can be
      detached and released, while its propagated copy is still living
      in the cgroup subtree. This will obviously lead to an use-after-free
      bug.
      
      To reproduce the issue the following script can be used:
      
        #!/bin/bash
      
        CGROOT=/sys/fs/cgroup
      
        mkdir -p ${CGROOT}/A ${CGROOT}/B ${CGROOT}/A/C
        sleep 1
      
        ./test_cgrp2_attach ${CGROOT}/A egress &
        A_PID=$!
        ./test_cgrp2_attach ${CGROOT}/B egress &
        B_PID=$!
      
        echo $$ > ${CGROOT}/A/C/cgroup.procs
        iperf -s &
        S_PID=$!
        iperf -c localhost -t 100 &
        C_PID=$!
      
        sleep 1
      
        echo $$ > ${CGROOT}/B/cgroup.procs
        echo ${S_PID} > ${CGROOT}/B/cgroup.procs
        echo ${C_PID} > ${CGROOT}/B/cgroup.procs
      
        sleep 1
      
        rmdir ${CGROOT}/A/C
        rmdir ${CGROOT}/A
      
        sleep 1
      
        kill -9 ${S_PID} ${C_PID} ${A_PID} ${B_PID}
      
      On the unpatched kernel the following stacktrace can be obtained:
      
      [   33.619799] BUG: unable to handle page fault for address: ffffbdb4801ab002
      [   33.620677] #PF: supervisor read access in kernel mode
      [   33.621293] #PF: error_code(0x0000) - not-present page
      [   33.622754] Oops: 0000 [#1] SMP NOPTI
      [   33.623202] CPU: 0 PID: 601 Comm: iperf Not tainted 5.5.0-rc2+ #23
      [   33.625545] RIP: 0010:__cgroup_bpf_run_filter_skb+0x29f/0x3d0
      [   33.635809] Call Trace:
      [   33.636118]  ? __cgroup_bpf_run_filter_skb+0x2bf/0x3d0
      [   33.636728]  ? __switch_to_asm+0x40/0x70
      [   33.637196]  ip_finish_output+0x68/0xa0
      [   33.637654]  ip_output+0x76/0xf0
      [   33.638046]  ? __ip_finish_output+0x1c0/0x1c0
      [   33.638576]  __ip_queue_xmit+0x157/0x410
      [   33.639049]  __tcp_transmit_skb+0x535/0xaf0
      [   33.639557]  tcp_write_xmit+0x378/0x1190
      [   33.640049]  ? _copy_from_iter_full+0x8d/0x260
      [   33.640592]  tcp_sendmsg_locked+0x2a2/0xdc0
      [   33.641098]  ? sock_has_perm+0x10/0xa0
      [   33.641574]  tcp_sendmsg+0x28/0x40
      [   33.641985]  sock_sendmsg+0x57/0x60
      [   33.642411]  sock_write_iter+0x97/0x100
      [   33.642876]  new_sync_write+0x1b6/0x1d0
      [   33.643339]  vfs_write+0xb6/0x1a0
      [   33.643752]  ksys_write+0xa7/0xe0
      [   33.644156]  do_syscall_64+0x5b/0x1b0
      [   33.644605]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fix this by grabbing a reference to the bpf structure of each ancestor
      on the initialization of the cgroup bpf structure, and dropping the
      reference at the end of releasing the cgroup bpf structure.
      
      This will restore the hierarchical order of cgroup bpf releasing,
      without adding any operations on hot paths.
      
      Thanks to Josef Bacik for the debugging and the initial analysis of
      the problem.
      
      Fixes: 4bfc0bb2 ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
      Reported-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NRoman Gushchin <guro@fb.com>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      e10360f8