1. 27 11月, 2021 12 次提交
    • K
      af_unix: Replace the big lock with small locks. · afd20b92
      Kuniyuki Iwashima 提交于
      The hash table of AF_UNIX sockets is protected by the single lock.  This
      patch replaces it with per-hash locks.
      
      The effect is noticeable when we handle multiple sockets simultaneously.
      Here is a test result on an EC2 c5.24xlarge instance.  It shows latency
      (under 10us only) in unix_insert_unbound_socket() while 64 CPUs creating
      1024 sockets for each in parallel.
      
        Without this patch:
      
           nsec          : count     distribution
              0          : 179      |                                        |
              500        : 3021     |*********                               |
              1000       : 6271     |*******************                     |
              1500       : 6318     |*******************                     |
              2000       : 5828     |*****************                       |
              2500       : 5124     |***************                         |
              3000       : 4426     |*************                           |
              3500       : 3672     |***********                             |
              4000       : 3138     |*********                               |
              4500       : 2811     |********                                |
              5000       : 2384     |*******                                 |
              5500       : 2023     |******                                  |
              6000       : 1954     |*****                                   |
              6500       : 1737     |*****                                   |
              7000       : 1749     |*****                                   |
              7500       : 1520     |****                                    |
              8000       : 1469     |****                                    |
              8500       : 1394     |****                                    |
              9000       : 1232     |***                                     |
              9500       : 1138     |***                                     |
              10000      : 994      |***                                     |
      
        With this patch:
      
           nsec          : count     distribution
              0          : 1634     |****                                    |
              500        : 13170    |****************************************|
              1000       : 13156    |*************************************** |
              1500       : 9010     |***************************             |
              2000       : 6363     |*******************                     |
              2500       : 4443     |*************                           |
              3000       : 3240     |*********                               |
              3500       : 2549     |*******                                 |
              4000       : 1872     |*****                                   |
              4500       : 1504     |****                                    |
              5000       : 1247     |***                                     |
              5500       : 1035     |***                                     |
              6000       : 889      |**                                      |
              6500       : 744      |**                                      |
              7000       : 634      |*                                       |
              7500       : 498      |*                                       |
              8000       : 433      |*                                       |
              8500       : 355      |*                                       |
              9000       : 336      |*                                       |
              9500       : 284      |                                        |
              10000      : 243      |                                        |
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      afd20b92
    • K
      af_unix: Save hash in sk_hash. · e6b4b873
      Kuniyuki Iwashima 提交于
      To replace unix_table_lock with per-hash locks in the next patch, we need
      to save a hash in each socket because /proc/net/unix or BPF prog iterate
      sockets while holding a hash table lock and release it later in a different
      function.
      
      Currently, we store a real/pseudo hash in struct unix_address.  However, we
      do not allocate it to unbound sockets, nor should we do just for that.  For
      this purpose, we can use sk_hash.  Then, we no longer use the hash field in
      struct unix_address and can remove it.
      
      Also, this patch does
        - rename unix_insert_socket() to unix_insert_unbound_socket()
        - remove the redundant list argument from __unix_insert_socket() and
           unix_insert_unbound_socket()
        - use 'unsigned int' instead of 'unsigned' in __unix_set_addr_hash()
        - remove 'inline' from unix_remove_socket() and
           unix_insert_unbound_socket().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      e6b4b873
    • K
      af_unix: Add helpers to calculate hashes. · f452be49
      Kuniyuki Iwashima 提交于
      This patch adds three helper functions that calculate hashes for unbound
      sockets and bound sockets with BSD/abstract addresses.
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      f452be49
    • K
      af_unix: Remove UNIX_ABSTRACT() macro and test sun_path[0] instead. · 5ce7ab49
      Kuniyuki Iwashima 提交于
      In BSD and abstract address cases, we store sockets in the hash table with
      keys between 0 and UNIX_HASH_SIZE - 1.  However, the hash saved in a socket
      varies depending on its address type; sockets with BSD addresses always
      have UNIX_HASH_SIZE in their unix_sk(sk)->addr->hash.
      
      This is just for the UNIX_ABSTRACT() macro used to check the address type.
      The difference of the saved hashes comes from the first byte of the address
      in the first place.  So, we can test it directly.
      
      Then we can keep a real hash in each socket and replace unix_table_lock
      with per-hash locks in the later patch.
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      5ce7ab49
    • K
      af_unix: Allocate unix_address in unix_bind_(bsd|abstract)(). · 12f21c49
      Kuniyuki Iwashima 提交于
      To terminate address with '\0' in unix_bind_bsd(), we add
      unix_create_addr() and call it in unix_bind_bsd() and unix_bind_abstract().
      
      Also, unix_bind_abstract() does not return -EEXIST.  Only
      kern_path_create() and vfs_mknod() in unix_bind_bsd() can return it,
      so we move the last error check in unix_bind() to unix_bind_bsd().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      12f21c49
    • K
      af_unix: Remove unix_mkname(). · 5c32a3ed
      Kuniyuki Iwashima 提交于
      This patch removes unix_mkname() and postpones calculating a hash to
      unix_bind_abstract().  Some BSD stuffs still remain in unix_bind()
      though, the next patch packs them into unix_bind_bsd().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      5c32a3ed
    • K
      af_unix: Copy unix_mkname() into unix_find_(bsd|abstract)(). · d2d8c9fd
      Kuniyuki Iwashima 提交于
      We should not call unix_mkname() before unix_find_other() and instead do
      the same thing where necessary based on the address type:
      
        - terminating the address with '\0' in unix_find_bsd()
        - calculating the hash in unix_find_abstract().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      d2d8c9fd
    • K
      af_unix: Cut unix_validate_addr() out of unix_mkname(). · b8a58aa6
      Kuniyuki Iwashima 提交于
      unix_mkname() tests socket address length and family and does some
      processing based on the address type.  It is called in the early stage,
      and therefore some instructions are redundant and can end up in vain.
      
      The address length/family tests are done twice in unix_bind().  Also, the
      address type is rechecked later in unix_bind() and unix_find_other(), where
      we can do the same processing.  Moreover, in the BSD address case, the hash
      is set to 0 but never used and confusing.
      
      This patch moves the address tests out of unix_mkname(), and the following
      patches move the other part into appropriate places and remove
      unix_mkname() finally.
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b8a58aa6
    • K
      af_unix: Return an error as a pointer in unix_find_other(). · aed26f55
      Kuniyuki Iwashima 提交于
      We can return an error as a pointer and need not pass an additional
      argument to unix_find_other().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      aed26f55
    • K
      af_unix: Factorise unix_find_other() based on address types. · fa39ef0e
      Kuniyuki Iwashima 提交于
      As done in the commit fa42d910 ("unix_bind(): take BSD and abstract
      address cases into new helpers"), this patch moves BSD and abstract address
      cases from unix_find_other() into unix_find_bsd() and unix_find_abstract().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      fa39ef0e
    • K
      af_unix: Pass struct sock to unix_autobind(). · f7ed31f4
      Kuniyuki Iwashima 提交于
      We do not use struct socket in unix_autobind() and pass struct sock to
      unix_bind_bsd() and unix_bind_abstract().  Let's pass it to unix_autobind()
      as well.
      
      Also, this patch fixes these errors by checkpatch.pl.
      
        ERROR: do not use assignment in if condition
        #1795: FILE: net/unix/af_unix.c:1795:
        +	if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr
      
        CHECK: Logical continuations should be on the previous line
        #1796: FILE: net/unix/af_unix.c:1796:
        +	if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr
        +	    && (err = unix_autobind(sock)) != 0)
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      f7ed31f4
    • K
      af_unix: Use offsetof() instead of sizeof(). · 755662ce
      Kuniyuki Iwashima 提交于
      The length of the AF_UNIX socket address contains an offset to the member
      sun_path of struct sockaddr_un.
      
      Currently, the preceding member is just sun_family, and its type is
      sa_family_t and resolved to short.  Therefore, the offset is represented by
      sizeof(short).  However, it is not clear and fragile to changes in struct
      sockaddr_storage or sockaddr_un.
      
      This commit makes it clear and robust by rewriting sizeof() with
      offsetof().
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      755662ce
  2. 20 11月, 2021 1 次提交
  3. 16 11月, 2021 1 次提交
  4. 27 10月, 2021 1 次提交
  5. 12 10月, 2021 1 次提交
  6. 06 10月, 2021 1 次提交
  7. 30 9月, 2021 1 次提交
    • E
      af_unix: fix races in sk_peer_pid and sk_peer_cred accesses · 35306eb2
      Eric Dumazet 提交于
      Jann Horn reported that SO_PEERCRED and SO_PEERGROUPS implementations
      are racy, as af_unix can concurrently change sk_peer_pid and sk_peer_cred.
      
      In order to fix this issue, this patch adds a new spinlock that needs
      to be used whenever these fields are read or written.
      
      Jann also pointed out that l2cap_sock_get_peer_pid_cb() is currently
      reading sk->sk_peer_pid which makes no sense, as this field
      is only possibly set by AF_UNIX sockets.
      We will have to clean this in a separate patch.
      This could be done by reverting b48596d1 "Bluetooth: L2CAP: Add get_peer_pid callback"
      or implementing what was truly expected.
      
      Fixes: 109f6e39 ("af_unix: Allow SO_PEERCRED to work across namespaces.")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NJann Horn <jannh@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35306eb2
  8. 28 9月, 2021 1 次提交
    • K
      af_unix: Return errno instead of NULL in unix_create1(). · f4bd73b5
      Kuniyuki Iwashima 提交于
      unix_create1() returns NULL on error, and the callers assume that it never
      fails for reasons other than out of memory.  So, the callers always return
      -ENOMEM when unix_create1() fails.
      
      However, it also returns NULL when the number of af_unix sockets exceeds
      twice the limit controlled by sysctl: fs.file-max.  In this case, the
      callers should return -ENFILE like alloc_empty_file().
      
      This patch changes unix_create1() to return the correct error value instead
      of NULL on error.
      
      Out of curiosity, the assumption has been wrong since 1999 due to this
      change introduced in 2.2.4 [0].
      
        diff -u --recursive --new-file v2.2.3/linux/net/unix/af_unix.c linux/net/unix/af_unix.c
        --- v2.2.3/linux/net/unix/af_unix.c	Tue Jan 19 11:32:53 1999
        +++ linux/net/unix/af_unix.c	Sun Mar 21 07:22:00 1999
        @@ -388,6 +413,9 @@
         {
         	struct sock *sk;
      
        +	if (atomic_read(&unix_nr_socks) >= 2*max_files)
        +		return NULL;
        +
         	MOD_INC_USE_COUNT;
         	sk = sk_alloc(PF_UNIX, GFP_KERNEL, 1);
         	if (!sk) {
      
      [0]: https://cdn.kernel.org/pub/linux/kernel/v2.2/patch-2.2.4.gz
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4bd73b5
  9. 09 9月, 2021 1 次提交
    • E
      net/af_unix: fix a data-race in unix_dgram_poll · 04f08eb4
      Eric Dumazet 提交于
      syzbot reported another data-race in af_unix [1]
      
      Lets change __skb_insert() to use WRITE_ONCE() when changing
      skb head qlen.
      
      Also, change unix_dgram_poll() to use lockless version
      of unix_recvq_full()
      
      It is verry possible we can switch all/most unix_recvq_full()
      to the lockless version, this will be done in a future kernel version.
      
      [1] HEAD commit: 8596e589
      
      BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll
      
      write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
       __skb_insert include/linux/skbuff.h:1938 [inline]
       __skb_queue_before include/linux/skbuff.h:2043 [inline]
       __skb_queue_tail include/linux/skbuff.h:2076 [inline]
       skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
       unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
       sock_sendmsg_nosec net/socket.c:703 [inline]
       sock_sendmsg net/socket.c:723 [inline]
       ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
       ___sys_sendmsg net/socket.c:2446 [inline]
       __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
       __do_sys_sendmmsg net/socket.c:2561 [inline]
       __se_sys_sendmmsg net/socket.c:2558 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
       skb_queue_len include/linux/skbuff.h:1869 [inline]
       unix_recvq_full net/unix/af_unix.c:194 [inline]
       unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
       sock_poll+0x23e/0x260 net/socket.c:1288
       vfs_poll include/linux/poll.h:90 [inline]
       ep_item_poll fs/eventpoll.c:846 [inline]
       ep_send_events fs/eventpoll.c:1683 [inline]
       ep_poll fs/eventpoll.c:1798 [inline]
       do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
       __do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
       __se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
       __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x0000001b -> 0x00000001
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G        W         5.14.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 86b18aaa ("skbuff: fix a data race in skb_queue_len()")
      Cc: Qian Cai <cai@lca.pw>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04f08eb4
  10. 31 8月, 2021 1 次提交
    • E
      af_unix: fix potential NULL deref in unix_dgram_connect() · dc56ad70
      Eric Dumazet 提交于
      syzbot was able to trigger NULL deref in unix_dgram_connect() [1]
      
      This happens in
      
      	if (unix_peer(sk))
      		sk->sk_state = other->sk_state = TCP_ESTABLISHED; // crash because @other is NULL
      
      Because locks have been dropped, unix_peer() might be non NULL,
      while @other is NULL (AF_UNSPEC case)
      
      We need to move code around, so that we no longer access
      unix_peer() and sk_state while locks have been released.
      
      [1]
      general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
      CPU: 0 PID: 10341 Comm: syz-executor239 Not tainted 5.14.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:unix_dgram_connect+0x32a/0xc60 net/unix/af_unix.c:1226
      Code: 00 00 45 31 ed 49 83 bc 24 f8 05 00 00 00 74 69 e8 eb 5b a6 f9 48 8d 7d 12 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 e0 07 00 00
      RSP: 0018:ffffc9000a89fcd8 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000
      RDX: 0000000000000002 RSI: ffffffff87cf4ef5 RDI: 0000000000000012
      RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88802e1917c3
      R10: ffffffff87cf4eba R11: 0000000000000001 R12: ffff88802e191740
      R13: 0000000000000000 R14: ffff88802e191d38 R15: ffff88802e1917c0
      FS:  00007f3eb0052700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004787d0 CR3: 0000000029c0a000 CR4: 00000000001506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __sys_connect_file+0x155/0x1a0 net/socket.c:1890
       __sys_connect+0x161/0x190 net/socket.c:1907
       __do_sys_connect net/socket.c:1917 [inline]
       __se_sys_connect net/socket.c:1914 [inline]
       __x64_sys_connect+0x6f/0xb0 net/socket.c:1914
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x446a89
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 a1 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f3eb0052208 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 00000000004cc4d8 RCX: 0000000000446a89
      RDX: 000000000000006e RSI: 0000000020000180 RDI: 0000000000000003
      RBP: 00000000004cc4d0 R08: 00007f3eb0052700 R09: 0000000000000000
      R10: 00007f3eb0052700 R11: 0000000000000246 R12: 00000000004cc4dc
      R13: 00007ffd791e79cf R14: 00007f3eb0052300 R15: 0000000000022000
      Modules linked in:
      ---[ end trace 4eb809357514968c ]---
      RIP: 0010:unix_dgram_connect+0x32a/0xc60 net/unix/af_unix.c:1226
      Code: 00 00 45 31 ed 49 83 bc 24 f8 05 00 00 00 74 69 e8 eb 5b a6 f9 48 8d 7d 12 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 e0 07 00 00
      RSP: 0018:ffffc9000a89fcd8 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000
      RDX: 0000000000000002 RSI: ffffffff87cf4ef5 RDI: 0000000000000012
      RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88802e1917c3
      R10: ffffffff87cf4eba R11: 0000000000000001 R12: ffff88802e191740
      R13: 0000000000000000 R14: ffff88802e191d38 R15: ffff88802e1917c0
      FS:  00007f3eb0052700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffd791fe960 CR3: 0000000029c0a000 CR4: 00000000001506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 83301b53 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Cong Wang <cong.wang@bytedance.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dc56ad70
  11. 23 8月, 2021 1 次提交
    • J
      af_unix: Fix NULL pointer bug in unix_shutdown · d359902d
      Jiang Wang 提交于
      Commit 94531cfc ("af_unix: Add unix_stream_proto for sockmap")
      introduced a bug for af_unix SEQPACKET type. In unix_shutdown, the
      unhash function will call prot->unhash(), which is NULL for SEQPACKET.
      And kernel will panic. On ARM32, it will show following messages: (it
      likely affects x86 too).
      
      Fix the bug by checking the prot->unhash is NULL or not first.
      
      Kernel log:
      <--- cut here ---
       Unable to handle kernel NULL pointer dereference at virtual address
      00000000
       pgd = 2fba1ffb
       *pgd=00000000
       Internal error: Oops: 80000005 [#1] PREEMPT SMP THUMB2
       Modules linked in:
       CPU: 1 PID: 1999 Comm: falkon Tainted: G        W
      5.14.0-rc5-01175-g94531cfc-dirty #9240
       Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
       PC is at 0x0
       LR is at unix_shutdown+0x81/0x1a8
       pc : [<00000000>]    lr : [<c08f3311>]    psr: 600f0013
       sp : e45aff70  ip : e463a3c0  fp : beb54f04
       r10: 00000125  r9 : e45ae000  r8 : c4a56664
       r7 : 00000001  r6 : c4a56464  r5 : 00000001  r4 : c4a56400
       r3 : 00000000  r2 : c5a6b180  r1 : 00000000  r0 : c4a56400
       Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
       Control: 50c5387d  Table: 05aa804a  DAC: 00000051
       Register r0 information: slab PING start c4a56400 pointer offset 0
       Register r1 information: NULL pointer
       Register r2 information: slab task_struct start c5a6b180 pointer offset 0
       Register r3 information: NULL pointer
       Register r4 information: slab PING start c4a56400 pointer offset 0
       Register r5 information: non-paged memory
       Register r6 information: slab PING start c4a56400 pointer offset 100
       Register r7 information: non-paged memory
       Register r8 information: slab PING start c4a56400 pointer offset 612
       Register r9 information: non-slab/vmalloc memory
       Register r10 information: non-paged memory
       Register r11 information: non-paged memory
       Register r12 information: slab filp start e463a3c0 pointer offset 0
       Process falkon (pid: 1999, stack limit = 0x9ec48895)
       Stack: (0xe45aff70 to 0xe45b0000)
       ff60:                                     e45ae000 c5f26a00 00000000 00000125
       ff80: c0100264 c07f7fa3 beb54f04 fffffff7 00000001 e6f3fc0e b5e5e9ec beb54ec4
       ffa0: b5da0ccc c010024b b5e5e9ec beb54ec4 0000000f 00000000 00000000 beb54ebc
       ffc0: b5e5e9ec beb54ec4 b5da0ccc 00000125 beb54f58 00785238 beb5529c beb54f04
       ffe0: b5da1e24 beb54eac b301385c b62b6ee8 600f0030 0000000f 00000000 00000000
       [<c08f3311>] (unix_shutdown) from [<c07f7fa3>] (__sys_shutdown+0x2f/0x50)
       [<c07f7fa3>] (__sys_shutdown) from [<c010024b>]
      (__sys_trace_return+0x1/0x16)
       Exception stack(0xe45affa8 to 0xe45afff0)
      
      Fixes: 94531cfc ("af_unix: Add unix_stream_proto for sockmap")
      Reported-by: NDmitry Osipenko <digetx@gmail.com>
      Signed-off-by: NJiang Wang <jiang.wang@bytedance.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Tested-by: NDmitry Osipenko <digetx@gmail.com>
      Acked-by: NKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/bpf/20210821180738.1151155-1-jiang.wang@bytedance.com
      d359902d
  12. 17 8月, 2021 2 次提交
  13. 16 8月, 2021 1 次提交
  14. 15 8月, 2021 1 次提交
  15. 14 8月, 2021 1 次提交
  16. 04 8月, 2021 1 次提交
  17. 29 7月, 2021 1 次提交
    • M
      af_unix: fix garbage collect vs MSG_PEEK · cbcf0112
      Miklos Szeredi 提交于
      unix_gc() assumes that candidate sockets can never gain an external
      reference (i.e.  be installed into an fd) while the unix_gc_lock is
      held.  Except for MSG_PEEK this is guaranteed by modifying inflight
      count under the unix_gc_lock.
      
      MSG_PEEK does not touch any variable protected by unix_gc_lock (file
      count is not), yet it needs to be serialized with garbage collection.
      Do this by locking/unlocking unix_gc_lock:
      
       1) increment file count
      
       2) lock/unlock barrier to make sure incremented file count is visible
          to garbage collection
      
       3) install file into fd
      
      This is a lock barrier (unlike smp_mb()) that ensures that garbage
      collection is run completely before or completely after the barrier.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cbcf0112
  18. 16 7月, 2021 5 次提交
  19. 30 6月, 2021 1 次提交
  20. 22 6月, 2021 5 次提交