1. 29 Aug 2019 (6 commits)
    • rxrpc: Fix local refcounting · 6d471741
      Committed by David Howells
      [ Upstream commit 68553f1a6f746bf860bce3eb42d78c26a717d9c0 ]
      
      Fix rxrpc_unuse_local() to handle a NULL local pointer as it can be called
      on an unbound socket on which rx->local is not yet set.
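
      A minimal sketch of the shape of the guard (illustrative only; the body
      comment stands in for the existing teardown logic, which is unchanged):

      	void rxrpc_unuse_local(struct rxrpc_local *local)
      	{
      		/* An unbound socket never set rx->local, so tolerate NULL here. */
      		if (!local)
      			return;

      		/* ... existing "drop an active user" handling continues ... */
      	}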
      
      The following reproducer (includes omitted):
      
      	int main(void)
      	{
      		socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
      		return 0;
      	}
      
      causes the following oops to occur:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000010
      	...
      	RIP: 0010:rxrpc_unuse_local+0x8/0x1b
      	...
      	Call Trace:
      	 rxrpc_release+0x2b5/0x338
      	 __sock_release+0x37/0xa1
      	 sock_close+0x14/0x17
      	 __fput+0x115/0x1e9
      	 task_work_run+0x72/0x98
      	 do_exit+0x51b/0xa7a
      	 ? __context_tracking_exit+0x4e/0x10e
      	 do_group_exit+0xab/0xab
      	 __x64_sys_exit_group+0x14/0x17
      	 do_syscall_64+0x89/0x1d4
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Reported-by: syzbot+20dee719a2e090427b5f@syzkaller.appspotmail.com
      Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeffrey Altman <jaltman@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rxrpc: Fix local endpoint replacement · ce3f9e19
      Committed by David Howells
      [ Upstream commit b00df840fb4004b7087940ac5f68801562d0d2de ]
      
      When a local endpoint (struct rxrpc_local) ceases to be in use by any
      AF_RXRPC sockets, it starts the process of being destroyed, but this
      doesn't cause it to be removed from the namespace endpoint list immediately
      as tearing it down isn't trivial and can't be done in softirq context, so
      it gets deferred.
      
      If a new socket comes along that wants to bind to the same endpoint, a new
      rxrpc_local object will be allocated and rxrpc_lookup_local() will use
      list_replace() to substitute the new one for the old.
      
      Then, when the dying object gets to rxrpc_local_destroyer(), it is removed
      unconditionally from whatever list it is on by calling list_del_init().
      
      However, list_replace() doesn't reset the pointers in the replaced
      list_head and so the list_del_init() will likely corrupt the local
      endpoints list.
      
      Fix this by using list_replace_init() instead.
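
      For reference, the semantic difference (variable names here are
      illustrative, not the patch's own): list_replace() leaves the replaced
      entry's pointers still aimed into the list, whereas list_replace_init()
      reinitialises them, so a later list_del_init() becomes a harmless no-op:

      	list_replace_init(&old_local->link, &new_local->link);
      	/* old_local->link is now an empty, self-pointing list_head, so the
      	 * deferred rxrpc_local_destroyer() can call list_del_init() on it
      	 * without corrupting the namespace endpoint list.
      	 */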
      
      Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting")
      Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rxrpc: Fix read-after-free in rxrpc_queue_local() · a05354cb
      Committed by David Howells
      commit 06d9532fa6b34f12a6d75711162d47c17c1add72 upstream.
      
      rxrpc_queue_local() attempts to queue the local endpoint it is given and
      then, if successful, prints a trace line.  The trace line includes the
      current usage count - but we're not allowed to look at the local endpoint
      at this point as we passed our ref on it to the workqueue.
      
      Fix this by reading the usage count before queuing the work item.
      
      Also fix the reading of local->debug_id for trace lines, which must be done
      with the same consideration as reading the usage count.
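
      In other words, the pattern becomes read-then-queue rather than
      queue-then-read; roughly (a sketch only, the tracepoint arguments are
      indicative rather than copied from the patch):

      	unsigned int debug_id = local->debug_id;
      	int n = atomic_read(&local->usage);

      	if (rxrpc_queue_work(&local->processor))
      		trace_rxrpc_local(debug_id, rxrpc_local_queued, n, NULL);
      	/* After a successful queue, our ref belongs to the work item and
      	 * "local" must not be dereferenced again here.
      	 */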
      
      Fixes: 09d2bf59 ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
      Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rxrpc: Fix local endpoint refcounting · f28023c4
      Committed by David Howells
      commit 730c5fd42c1e3652a065448fd235cb9fafb2bd10 upstream.
      
      The object lifetime management on the rxrpc_local struct is broken in that
      the rxrpc_local_processor() function is expected to clean up and remove an
      object - but it may get requeued by packets coming in on the backing UDP
      socket once it starts running.
      
      This may result in the assertion in rxrpc_local_rcu() firing because the
      memory has been scheduled for RCU destruction whilst still queued:
      
      	rxrpc: Assertion failed
      	------------[ cut here ]------------
      	kernel BUG at net/rxrpc/local_object.c:468!
      
      Note that if the processor comes around before the RCU free function, it
      will just do nothing because ->dead is true.
      
      Fix this by adding a separate refcount to count active users of the
      endpoint that causes the endpoint to be destroyed when it reaches 0.
      
      The original refcount can then be used to refcount objects through the work
      processor and cause the memory to be rcu freed when that reaches 0.
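
      Conceptually the endpoint then carries two counters (a sketch of the
      scheme; the field names are assumed from the description above):

      	struct rxrpc_local {
      		atomic_t	usage;		/* Object refs: work item, lookups, RCU free */
      		atomic_t	active_users;	/* Sockets actively using this endpoint */
      		/* ... remaining members unchanged ... */
      	};

      active_users reaching 0 triggers teardown of the endpoint; usage reaching
      0 triggers the RCU free of the memory.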
      
      Fixes: 4f95dd78 ("rxrpc: Rework local endpoint management")
      Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet · 4db2043e
      Committed by David Howells
      [ Upstream commit c69565ee6681e151e2bb80502930a16e04b553d1 ]
      
      Fix the fact that a notification isn't sent to the recvmsg side to indicate
      a call failed when sendmsg() fails to transmit a DATA packet with the error
      ENETUNREACH, EHOSTUNREACH or ECONNREFUSED.
      
      Without this notification, the afs client just sits there waiting for the
      call to complete in some manner (which it's not now going to do), which
      also pins the rxrpc call in place.
      
      This can be seen if the client has a scope-level IPv6 address, but not a
      global-level IPv6 address, and we try and transmit an operation to a
      server's IPv6 address.
      
      Looking in /proc/net/rxrpc/calls shows completed calls just sat there with
      an abort code of RX_USER_ABORT and an error code of -ENETUNREACH.
      
      Fixes: c54e43d7 ("rxrpc: Fix missing start of call timeout")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rxrpc: Fix potential deadlock · 0d68fbc2
      Committed by David Howells
      [ Upstream commit 60034d3d146b11922ab1db613bce062dddc0327a ]
      
      There is a potential deadlock in rxrpc_peer_keepalive_dispatch() whereby
      rxrpc_put_peer() is called with the peer_hash_lock held, but if it reduces
      the peer's refcount to 0, rxrpc_put_peer() calls __rxrpc_put_peer() - which
      then tries to take the already-held lock.
      
      Fix this by providing a version of rxrpc_put_peer() that can be called in
      situations where the lock is already held.
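
      The usual shape of such a fix is a *_locked variant that drops the ref
      without retaking peer_hash_lock; a sketch (not the exact diff):

      	/* Caller must already hold rxnet->peer_hash_lock. */
      	void rxrpc_put_peer_locked(struct rxrpc_peer *peer)
      	{
      		if (atomic_dec_and_test(&peer->usage)) {
      			hash_del_rcu(&peer->hash_link);
      			list_del_init(&peer->keepalive_link);
      			kfree_rcu(peer, rcu);
      		}
      	}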
      
      The bug may produce the following lockdep report:
      
      ============================================
      WARNING: possible recursive locking detected
      5.2.0-next-20190718 #41 Not tainted
      --------------------------------------------
      kworker/0:3/21678 is trying to acquire lock:
      00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at: spin_lock_bh
      /./include/linux/spinlock.h:343 [inline]
      00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
      __rxrpc_put_peer /net/rxrpc/peer_object.c:415 [inline]
      00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
      rxrpc_put_peer+0x2d3/0x6a0 /net/rxrpc/peer_object.c:435
      
      but task is already holding lock:
      00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at: spin_lock_bh
      /./include/linux/spinlock.h:343 [inline]
      00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
      rxrpc_peer_keepalive_dispatch /net/rxrpc/peer_event.c:378 [inline]
      00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
      rxrpc_peer_keepalive_worker+0x6b3/0xd02 /net/rxrpc/peer_event.c:430
      
      Fixes: 330bdcfa ("rxrpc: Fix the keepalive generator [ver #2]")
      Reported-by: syzbot+72af434e4b3417318f84@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. 28 Jul 2019 (1 commit)
    • rxrpc: Fix send on a connected, but unbound socket · bfa79135
      Committed by David Howells
      [ Upstream commit e835ada07091f40dcfb1bc735082bd0a7c005e59 ]
      
      If sendmsg() or sendmmsg() is called on a connected socket that hasn't had
      bind() called on it, then an oops will occur when the kernel tries to
      connect the call because no local endpoint has been allocated.
      
      Fix this by implicitly binding the socket if it is in the
      RXRPC_CLIENT_UNBOUND state, just like it does for the RXRPC_UNBOUND state.
      
      Further, the state should be transitioned to RXRPC_CLIENT_BOUND after this
      to prevent further attempts to bind it.
      
      This can be tested with:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <sys/socket.h>
      	#include <arpa/inet.h>
      	#include <linux/rxrpc.h>
      	static const unsigned char inet6_addr[16] = {
      		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 0xac, 0x14, 0x14, 0xaa
      	};
      	int main(void)
      	{
      		struct sockaddr_rxrpc srx;
      		struct cmsghdr *cm;
      		struct msghdr msg;
      		unsigned char control[16];
      		int fd;
      		memset(&srx, 0, sizeof(srx));
      		srx.srx_family = 0x21;
      		srx.srx_service = 0;
      		srx.transport_type = AF_INET;
      		srx.transport_len = 0x1c;
      		srx.transport.sin6.sin6_family = AF_INET6;
      		srx.transport.sin6.sin6_port = htons(0x4e22);
      		srx.transport.sin6.sin6_flowinfo = htons(0x4e22);
      		srx.transport.sin6.sin6_scope_id = htons(0xaa3b);
      		memcpy(&srx.transport.sin6.sin6_addr, inet6_addr, 16);
      		cm = (struct cmsghdr *)control;
      		cm->cmsg_len	= CMSG_LEN(sizeof(unsigned long));
      		cm->cmsg_level	= SOL_RXRPC;
      		cm->cmsg_type	= RXRPC_USER_CALL_ID;
      		*(unsigned long *)CMSG_DATA(cm) = 0;
      		msg.msg_name = NULL;
      		msg.msg_namelen = 0;
      		msg.msg_iov = NULL;
      		msg.msg_iovlen = 0;
      		msg.msg_control = control;
      		msg.msg_controllen = cm->cmsg_len;
      		msg.msg_flags = 0;
      		fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
      		connect(fd, (struct sockaddr *)&srx, sizeof(srx));
      		sendmsg(fd, &msg, 0);
      		return 0;
      	}
      
      Leading to the following oops:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000018
      	#PF: supervisor read access in kernel mode
      	#PF: error_code(0x0000) - not-present page
      	...
      	RIP: 0010:rxrpc_connect_call+0x42/0xa01
      	...
      	Call Trace:
      	 ? mark_held_locks+0x47/0x59
      	 ? __local_bh_enable_ip+0xb6/0xba
      	 rxrpc_new_client_call+0x3b1/0x762
      	 ? rxrpc_do_sendmsg+0x3c0/0x92e
      	 rxrpc_do_sendmsg+0x3c0/0x92e
      	 rxrpc_sendmsg+0x16b/0x1b5
      	 sock_sendmsg+0x2d/0x39
      	 ___sys_sendmsg+0x1a4/0x22a
      	 ? release_sock+0x19/0x9e
      	 ? reacquire_held_locks+0x136/0x160
      	 ? release_sock+0x19/0x9e
      	 ? find_held_lock+0x2b/0x6e
      	 ? __lock_acquire+0x268/0xf73
      	 ? rxrpc_connect+0xdd/0xe4
      	 ? __local_bh_enable_ip+0xb6/0xba
      	 __sys_sendmsg+0x5e/0x94
      	 do_syscall_64+0x7d/0x1bf
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 2341e077 ("rxrpc: Simplify connect() implementation and simplify sendmsg() op")
      Reported-by: syzbot+7966f2a0b2c7da8939b4@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 05 May 2019 (1 commit)
    • rxrpc: Fix net namespace cleanup · fdd36abd
      Committed by David Howells
      [ Upstream commit b13023421b5179413421333f602850914f6a7ad8 ]
      
      In rxrpc_destroy_all_calls(), there are two phases: (1) make sure the
      ->calls list is empty, emitting error messages if not, and (2) wait for the
      RCU cleanup to happen on outstanding calls (ie. ->nr_calls becomes 0).
      
      To avoid taking the call_lock, the function prechecks ->calls and if empty,
      it returns to avoid taking the lock - this is wrong, however: it still
      needs to go and do the second phase and wait for ->nr_calls to become 0.
      
      Without this, the rxrpc_net struct may get deallocated before we get to the
      RCU cleanup for the last calls.  This can lead to:
      
        Slab corruption (Not tainted): kmalloc-16k start=ffff88802b178000, len=16384
        050: 6b 6b 6b 6b 6b 6b 6b 6b 61 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkakkkkkkk
      
      Note the "61" at offset 0x58.  This corresponds to the ->nr_calls member of
      struct rxrpc_net (which is >9k in size, and thus allocated out of the 16k
      slab).
      
      Fix this by flipping the condition on the if-statement, putting the locked
      section inside the if-body and dropping the return from there.  The
      function will then always go on to wait for the RCU cleanup on outstanding
      calls.
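
      The resulting control flow is essentially (a sketch of the tail of
      rxrpc_destroy_all_calls(), reconstructed from the description above):

      	if (!list_empty(&rxnet->calls)) {
      		write_lock(&rxnet->call_lock);
      		/* ... report any calls still outstanding ... */
      		write_unlock(&rxnet->call_lock);
      	}

      	/* Always fall through to phase two. */
      	atomic_dec(&rxnet->nr_calls);
      	wait_var_event(&rxnet->nr_calls, !atomic_read(&rxnet->nr_calls));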
      
      Fixes: 2baec2c3 ("rxrpc: Support network namespacing")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. 02 May 2019 (1 commit)
    • rxrpc: fix race condition in rxrpc_input_packet() · 920ecc72
      Committed by Eric Dumazet
      commit 032be5f19a94de51093851757089133dcc1e92aa upstream.
      
      After commit 5271953c ("rxrpc: Use the UDP encap_rcv hook"),
      rxrpc_input_packet() is directly called from lockless UDP receive
      path, under rcu_read_lock() protection.
      
      It must therefore use RCU rules:

      - udp_sk->sk_user_data can be cleared at any point in this function.
        rcu_dereference_sk_user_data() is what we need here.

      - Also, since sk_user_data might have been set in rxrpc_open_socket(),
        we must observe a proper RCU grace period before kfree(local) in
        rxrpc_lookup_local().  (Both points are sketched below.)
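
      A rough sketch of both points (not the literal diff):

      	/* (1) In rxrpc_input_packet(), running under rcu_read_lock(): */
      	local = rcu_dereference_sk_user_data(udp_sk);
      	if (unlikely(!local))
      		return 0;	/* Drop the packet; the socket is going away. */

      	/* (2) On the error path of rxrpc_lookup_local(), defer the free
      	 *     past an RCU grace period (call_rcu()/kfree_rcu()) instead of
      	 *     calling kfree(local) directly.
      	 */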
      
      v4: @local can be NULL in rxrpc_lookup_local() as reported by kbuild test robot <lkp@intel.com>
          and Julia Lawall <julia.lawall@lip6.fr>, thanks!

      v3, v2: addressed David Howells' feedback, thanks!
      
      syzbot reported :
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 19236 Comm: syz-executor703 Not tainted 5.1.0-rc6 #79
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__lock_acquire+0xbef/0x3fb0 kernel/locking/lockdep.c:3573
      Code: 00 0f 85 a5 1f 00 00 48 81 c4 10 01 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 4a 21 00 00 49 81 7d 00 20 54 9c 89 0f 84 cf f4
      RSP: 0018:ffff88809d7aef58 EFLAGS: 00010002
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000026 RSI: 0000000000000000 RDI: 0000000000000001
      RBP: ffff88809d7af090 R08: 0000000000000001 R09: 0000000000000001
      R10: ffffed1015d05bc7 R11: ffff888089428600 R12: 0000000000000000
      R13: 0000000000000130 R14: 0000000000000001 R15: 0000000000000001
      FS:  00007f059044d700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004b6040 CR3: 00000000955ca000 CR4: 00000000001406f0
      Call Trace:
       lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4211
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:152
       skb_queue_tail+0x26/0x150 net/core/skbuff.c:2972
       rxrpc_reject_packet net/rxrpc/input.c:1126 [inline]
       rxrpc_input_packet+0x4a0/0x5536 net/rxrpc/input.c:1414
       udp_queue_rcv_one_skb+0xaf2/0x1780 net/ipv4/udp.c:2011
       udp_queue_rcv_skb+0x128/0x730 net/ipv4/udp.c:2085
       udp_unicast_rcv_skb.isra.0+0xb9/0x360 net/ipv4/udp.c:2245
       __udp4_lib_rcv+0x701/0x2ca0 net/ipv4/udp.c:2301
       udp_rcv+0x22/0x30 net/ipv4/udp.c:2482
       ip_protocol_deliver_rcu+0x60/0x8f0 net/ipv4/ip_input.c:208
       ip_local_deliver_finish+0x23b/0x390 net/ipv4/ip_input.c:234
       NF_HOOK include/linux/netfilter.h:289 [inline]
       NF_HOOK include/linux/netfilter.h:283 [inline]
       ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:255
       dst_input include/net/dst.h:450 [inline]
       ip_rcv_finish+0x1e1/0x300 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:289 [inline]
       NF_HOOK include/linux/netfilter.h:283 [inline]
       ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0x115/0x1a0 net/core/dev.c:4987
       __netif_receive_skb+0x2c/0x1c0 net/core/dev.c:5099
       netif_receive_skb_internal+0x117/0x660 net/core/dev.c:5202
       napi_frags_finish net/core/dev.c:5769 [inline]
       napi_gro_frags+0xade/0xd10 net/core/dev.c:5843
       tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027
       call_write_iter include/linux/fs.h:1866 [inline]
       do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681
       do_iter_write fs/read_write.c:957 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:938
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002
       do_writev+0x15e/0x370 fs/read_write.c:1037
       __do_sys_writev fs/read_write.c:1110 [inline]
       __se_sys_writev fs/read_write.c:1107 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1107
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 5271953c ("rxrpc: Use the UDP encap_rcv hook")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. 20 Apr 2019 (1 commit)
    • rxrpc: Fix client call connect/disconnect race · 11582064
      Committed by David Howells
      [ Upstream commit 930c9f9125c85b5134b3e711bc252ecc094708e3 ]
      
      rxrpc_disconnect_client_call() reads the call's connection ID protocol
      value (call->cid) as part of that function's variable declarations.  This
      is bad because it's not inside the locked section and so may race with
      someone granting use of the channel to the call.
      
      This manifests as an assertion failure (see below) where the call in the
      presumed channel (0 because call->cid wasn't set when we read it) doesn't
      match the call attached to the channel we were actually granted (if 1, 2 or
      3).
      
      Fix this by moving the read and dependent calculations inside of the
      channel_lock section.  Also, only set the channel number and pointer
      variables if cid is not zero (ie. unset).
      
      This problem can be induced by injecting an occasional error in
      rxrpc_wait_for_channel() before the call to schedule().
      
      Make two further changes also:
      
       (1) Add a trace for wait failure in rxrpc_connect_call().
      
       (2) Drop channel_lock before BUG'ing in the case of the assertion failure.
      
      The failure causes a trace akin to the following:
      
      rxrpc: Assertion failed - 18446612685268945920(0xffff8880beab8c00) == 18446612685268621312(0xffff8880bea69800) is false
      ------------[ cut here ]------------
      kernel BUG at net/rxrpc/conn_client.c:824!
      ...
      RIP: 0010:rxrpc_disconnect_client_call+0x2bf/0x99d
      ...
      Call Trace:
       rxrpc_connect_call+0x902/0x9b3
       ? wake_up_q+0x54/0x54
       rxrpc_new_client_call+0x3a0/0x751
       ? rxrpc_kernel_begin_call+0x141/0x1bc
       ? afs_alloc_call+0x1b5/0x1b5
       rxrpc_kernel_begin_call+0x141/0x1bc
       afs_make_call+0x20c/0x525
       ? afs_alloc_call+0x1b5/0x1b5
       ? __lock_is_held+0x40/0x71
       ? lockdep_init_map+0xaf/0x193
       ? lockdep_init_map+0xaf/0x193
       ? __lock_is_held+0x40/0x71
       ? yfs_fs_fetch_data+0x33b/0x34a
       yfs_fs_fetch_data+0x33b/0x34a
       afs_fetch_data+0xdc/0x3b7
       afs_read_dir+0x52d/0x97f
       afs_dir_iterate+0xa0/0x661
       ? iterate_dir+0x63/0x141
       iterate_dir+0xa2/0x141
       ksys_getdents64+0x9f/0x11b
       ? filldir+0x111/0x111
       ? do_syscall_64+0x3e/0x1a0
       __x64_sys_getdents64+0x16/0x19
       do_syscall_64+0x7d/0x1a0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 45025bce ("rxrpc: Improve management and caching of client connection objects")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  6. 19 Mar 2019 (1 commit)
  7. 13 Feb 2019 (1 commit)
    • rxrpc: bad unlock balance in rxrpc_recvmsg · 21b8697e
      Committed by Eric Dumazet
      [ Upstream commit 6dce3c20ac429e7a651d728e375853370c796e8d ]
      
      When either the "goto wait_interrupted;" or the "goto wait_error;" path
      is taken, the socket lock has already been released.
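
      The fix is essentially to stop releasing the socket lock a second time on
      those paths; a sketch of the tail of rxrpc_recvmsg() (the label names are
      assumptions based on the description):

      	wait_interrupted:
      		ret = sock_intr_errno(timeo);
      	wait_error:
      		finish_wait(sk_sleep(&rx->sk), &wait);
      		call = NULL;
      		goto error_trace;	/* Jump past the release_sock() that the
      					 * normal error path performs: the lock
      					 * was already dropped before waiting.
      					 */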
      
      This patch fixes following syzbot splat :
      
      WARNING: bad unlock balance detected!
      5.0.0-rc4+ #59 Not tainted
      -------------------------------------
      syz-executor223/8256 is trying to release lock (sk_lock-AF_RXRPC) at:
      [<ffffffff86651353>] rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
      but there are no more locks to release!
      
      other info that might help us debug this:
      1 lock held by syz-executor223/8256:
       #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline]
       #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2798
      
      stack backtrace:
      CPU: 1 PID: 8256 Comm: syz-executor223 Not tainted 5.0.0-rc4+ #59
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_unlock_imbalance_bug kernel/locking/lockdep.c:3391 [inline]
       print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3368
       __lock_release kernel/locking/lockdep.c:3601 [inline]
       lock_release+0x67e/0xa00 kernel/locking/lockdep.c:3860
       sock_release_ownership include/net/sock.h:1471 [inline]
       release_sock+0x183/0x1c0 net/core/sock.c:2808
       rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
       sock_recvmsg_nosec net/socket.c:794 [inline]
       sock_recvmsg net/socket.c:801 [inline]
       sock_recvmsg+0xd0/0x110 net/socket.c:797
       __sys_recvfrom+0x1ff/0x350 net/socket.c:1845
       __do_sys_recvfrom net/socket.c:1863 [inline]
       __se_sys_recvfrom net/socket.c:1859 [inline]
       __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:1859
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x446379
      Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fe5da89fd98 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
      RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446379
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
      R13: 0000000000000000 R14: 0000000000000000 R15: 20c49ba5e353f7cf
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Howells <dhowells@redhat.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. 23 Nov 2018 (1 commit)
    • rxrpc: Fix lockup due to no error backoff after ack transmit error · 36b05750
      Committed by David Howells
      [ Upstream commit c7e86acfcee30794dc99a0759924bf7b9d43f1ca ]
      
      If the network becomes (partially) unavailable, say by disabling IPv6, the
      background ACK transmission routine can get itself into a tizzy by
      proposing immediate ACK retransmission.  Since we're in the call event
      processor, that happens immediately without returning to the workqueue
      manager.
      
      The condition should clear after a while when either the network comes back
      or the call times out.
      
      Fix this by:
      
       (1) When re-proposing an ACK on failed Tx, don't schedule it immediately.
           This will allow a certain amount of time to elapse before we try
           again.
      
       (2) Enforce a return to the workqueue manager after a certain number of
           iterations of the call processing loop.
      
       (3) Add a backoff delay that increases the delay on deferred ACKs by a
           jiffy per failed transmission to a limit of HZ.  The backoff delay is
           cleared on a successful return from kernel_sendmsg().
      
       (4) Cancel calls immediately if the opening sendmsg fails.  The layer
           above can arrange retransmission or rotate to another server.
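
      Point (3) above amounts to something like the following (a sketch; the
      field name ack_backoff is assumed for illustration):

      	if (ret < 0) {
      		if (call->ack_backoff < HZ)
      			call->ack_backoff++;	/* One more jiffy next time. */
      		ack_at = jiffies + call->ack_backoff;
      	} else {
      		call->ack_backoff = 0;		/* Successful kernel_sendmsg(). */
      	}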
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. 04 Nov 2018 (1 commit)
  10. 16 Oct 2018 (4 commits)
  11. 09 Oct 2018 (3 commits)
    • rxrpc: Fix the packet reception routine · c1e15b49
      Committed by David Howells
      The rxrpc_input_packet() function and its call tree were built around the
      assumption that the data_ready() handler, called from UDP to inform a
      kernel service that there is data to be had, was non-reentrant.  This
      meant that certain locking could be dispensed with.
      
      This, however, turns out not to be the case with a multi-queue network card
      that can deliver packets to multiple cpus simultaneously.  Each of those
      cpus can be in the rxrpc_input_packet() function at the same time.
      
      Fix by adding or changing some structure members:
      
       (1) Add peer->rtt_input_lock to serialise access to the RTT buffer.
      
       (2) Make conn->service_id into a 32-bit variable so that it can be
           cmpxchg'd on all arches.
      
       (3) Add call->input_lock to serialise access to the Rx/Tx state.  Note
           that although the Rx and Tx states are (almost) entirely separate,
           there's no point completing the separation and having separate locks
           since it's a bi-phasal RPC protocol rather than a bi-direction
           streaming protocol.  Data transmission and data reception do not take
           place simultaneously on any particular call.
      
      and making the following functional changes:
      
       (1) In rxrpc_input_data(), hold call->input_lock around the core to
           prevent simultaneous producing of packets into the Rx ring and
           updating of tracking state for a particular call.
      
       (2) In rxrpc_input_ping_response(), only read call->ping_serial once, and
           check it before checking RXRPC_CALL_PINGING as that's a cheaper test.
           The bit test and bit clear can then be combined.  No further locking
           is needed here.
      
       (3) In rxrpc_input_ack(), take call->input_lock after we've parsed much of
           the ACK packet.  The superseded ACK check is then done both before and
           after the lock is taken.
      
           The handing of ackinfo data is split, parsing before the lock is taken
           and processing with it held.  This is keyed on rxMTU being non-zero.
      
           Congestion management is also done within the locked section.
      
       (4) In rxrpc_input_ackall(), take call->input_lock around the Tx window
           rotation.  The ACKALL packet carries no information and is only really
           useful after all packets have been transmitted since it's imprecise.
      
       (5) In rxrpc_input_implicit_end_call(), we use rx->incoming_lock to
           prevent calls being simultaneously implicitly ended on two cpus and
           also to prevent any races with incoming call setup.
      
       (6) In rxrpc_input_packet(), use cmpxchg() to effect the service upgrade
           on a connection.  It is only permitted to happen once for a
           connection.
      
       (7) In rxrpc_new_incoming_call(), we have to recheck the routing inside
           rx->incoming_lock to see if someone else set up the call, connection
           or peer whilst we were getting there.  We can't trust the values from
           the earlier routing check unless we pin refs on them - which we want
           to avoid.
      
           Further, we need to allow for an incoming call to have its state
           changed on another CPU between us making it live and us adjusting it
           because the conn is now in the RXRPC_CONN_SERVICE state.
      
       (8) In rxrpc_peer_add_rtt(), take peer->rtt_input_lock around the access
           to the RTT buffer.  Don't need to lock around setting peer->rtt.
      
      For reference, the inventory of state-accessing or state-altering functions
      used by the packet input procedure is:
      
      > rxrpc_input_packet()
        * PACKET CHECKING
      
        * ROUTING
          > rxrpc_post_packet_to_local()
          > rxrpc_find_connection_rcu() - uses RCU
            > rxrpc_lookup_peer_rcu() - uses RCU
            > rxrpc_find_service_conn_rcu() - uses RCU
            > idr_find() - uses RCU
      
        * CONNECTION-LEVEL PROCESSING
          - Service upgrade
            - Can only happen once per conn
            ! Changed to use cmpxchg
          > rxrpc_post_packet_to_conn()
          - Setting conn->hi_serial
            - Probably safe not using locks
            - Maybe use cmpxchg
      
        * CALL-LEVEL PROCESSING
          > Old-call checking
            > rxrpc_input_implicit_end_call()
              > rxrpc_call_completed()
      	> rxrpc_queue_call()
      	! Need to take rx->incoming_lock
      	> __rxrpc_disconnect_call()
      	> rxrpc_notify_socket()
          > rxrpc_new_incoming_call()
            - Uses rx->incoming_lock for the entire process
              - Might be able to drop this earlier in favour of the call lock
            > rxrpc_incoming_call()
            	! Conflicts with rxrpc_input_implicit_end_call()
          > rxrpc_send_ping()
            - Don't need locks to check rtt state
            > rxrpc_propose_ACK
      
        * PACKET DISTRIBUTION
          > rxrpc_input_call_packet()
            > rxrpc_input_data()
      	* QUEUE DATA PACKET ON CALL
      	> rxrpc_reduce_call_timer()
      	  - Uses timer_reduce()
      	! Needs call->input_lock()
      	> rxrpc_receiving_reply()
      	  ! Needs locking around ack state
      	  > rxrpc_rotate_tx_window()
      	  > rxrpc_end_tx_phase()
      	> rxrpc_proto_abort()
      	> rxrpc_input_dup_data()
      	- Fills the Rx buffer
      	- rxrpc_propose_ACK()
      	- rxrpc_notify_socket()
      
            > rxrpc_input_ack()
      	* APPLY ACK PACKET TO CALL AND DISCARD PACKET
      	> rxrpc_input_ping_response()
      	  - Probably doesn't need any extra locking
      	  ! Need READ_ONCE() on call->ping_serial
      	  > rxrpc_input_check_for_lost_ack()
      	    - Takes call->lock to consult Tx buffer
      	  > rxrpc_peer_add_rtt()
      	    ! Needs to take a lock (peer->rtt_input_lock)
      	    ! Could perhaps manage with cmpxchg() and xadd() instead
      	> rxrpc_input_requested_ack
      	  - Consults Tx buffer
      	    ! Probably needs a lock
      	  > rxrpc_peer_add_rtt()
      	> rxrpc_propose_ack()
      	> rxrpc_input_ackinfo()
      	  - Changes call->tx_winsize
      	    ! Use cmpxchg to handle change
      	    ! Should perhaps track serial number
      	  - Uses peer->lock to record MTU specification changes
      	> rxrpc_proto_abort()
      	! Need to take call->input_lock
      	> rxrpc_rotate_tx_window()
      	> rxrpc_end_tx_phase()
      	> rxrpc_input_soft_acks()
      	- Consults the Tx buffer
      	> rxrpc_congestion_management()
      	  - Modifies the Tx annotations
      	  ! Needs call->input_lock()
      	  > rxrpc_queue_call()
      
            > rxrpc_input_abort()
      	* APPLY ABORT PACKET TO CALL AND DISCARD PACKET
      	> rxrpc_set_call_completion()
      	> rxrpc_notify_socket()
      
            > rxrpc_input_ackall()
      	* APPLY ACKALL PACKET TO CALL AND DISCARD PACKET
      	! Need to take call->input_lock
      	> rxrpc_rotate_tx_window()
      	> rxrpc_end_tx_phase()
      
          > rxrpc_reject_packet()
      
      There are some functions used by the above that queue the packet, after
      which the procedure is terminated:
      
       - rxrpc_post_packet_to_local()
         - local->event_queue is an sk_buff_head
         - local->processor is a work_struct
       - rxrpc_post_packet_to_conn()
         - conn->rx_queue is an sk_buff_head
         - conn->processor is a work_struct
       - rxrpc_reject_packet()
         - local->reject_queue is an sk_buff_head
         - local->processor is a work_struct
      
      And some that offload processing to process context:
      
       - rxrpc_notify_socket()
         - Uses RCU lock
         - Uses call->notify_lock to call call->notify_rx
         - Uses call->recvmsg_lock to queue recvmsg side
       - rxrpc_queue_call()
         - call->processor is a work_struct
       - rxrpc_propose_ACK()
         - Uses call->lock to wrap __rxrpc_propose_ACK()
      
      And a bunch that complete a call, all of which use call->state_lock to
      protect the call state:
      
       - rxrpc_call_completed()
       - rxrpc_set_call_completion()
       - rxrpc_abort_call()
       - rxrpc_proto_abort()
         - Also uses rxrpc_queue_call()
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix connection-level abort handling · 64753092
      Committed by David Howells
      Fix connection-level abort handling to cache the abort and error codes
      properly so that a new incoming call can be properly aborted if it races
      with the parent connection being aborted by another CPU.
      
      The abort_code and error parameters can then be dropped from
      rxrpc_abort_calls().
      
      Fixes: f5c17aae ("rxrpc: Calls should only have one terminal state")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Only take the rwind and mtu values from latest ACK · 298bc15b
      Committed by David Howells
      Move the out-of-order and duplicate ACK packet check to before the call to
      rxrpc_input_ackinfo() so that the receive window size and MTU size are only
      checked in the latest ACK packet and don't regress.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
  12. 08 Oct 2018 (4 commits)
    • rxrpc: Carry call state out of locked section in rxrpc_rotate_tx_window() · dfe99522
      Committed by David Howells
      Carry the call state out of the locked section in rxrpc_rotate_tx_window()
      rather than sampling it afterwards.  This is only used to select tracepoint
      data, but could have changed by the time we do the tracepoint.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window() · c479d5f2
      Committed by David Howells
      We should only call the function to end a call's Tx phase if we rotated the
      marked-last packet out of the transmission buffer.
      
      Make rxrpc_rotate_tx_window() return an indication of whether it just
      rotated the packet marked as the last out of the transmit buffer, carrying
      the information out of the locked section in that function.
      
      We can then check the return value instead of examining RXRPC_CALL_TX_LAST.
      
      Fixes: 70790dbe ("rxrpc: Pass the last Tx packet marker in the annotation buffer")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Don't need to take the RCU read lock in the packet receiver · bfd28211
      Committed by David Howells
      We don't need to take the RCU read lock in the rxrpc packet receive
      function because it's held further up the stack in the IP input routine
      around the UDP receive routines.
      
      Fix this by dropping the RCU read lock calls from rxrpc_input_packet().
      This simplifies the code.
      
      Fixes: 70790dbe ("rxrpc: Pass the last Tx packet marker in the annotation buffer")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Use the UDP encap_rcv hook · 5271953c
      Committed by David Howells
      Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception
      in which a packet is placed onto the UDP receive queue and then immediately
      removed again by rxrpc.  Going via the queue in this manner seems like it
      should be unnecessary.
      
      This does, however, require the invention of a value to place in encap_type
      as that's one of the conditions to switch packets out to the encap_rcv
      hook.  Possibly the value doesn't actually matter for anything other than
      sockopts on the UDP socket, which aren't accessible outside of rxrpc
      anyway.
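
      Wiring this up on the transport socket looks roughly like the following
      (a sketch; UDP_ENCAP_RXRPC stands for the invented encap_type value
      mentioned above):

      	udp_sk(usk)->encap_type = UDP_ENCAP_RXRPC;
      	udp_sk(usk)->encap_rcv  = rxrpc_input_packet;	/* Called in place of queueing. */
      	udp_encap_enable();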
      
      This seems to cut a bit of time out of the time elapsed between each
      sk_buff being timestamped and turning up in rxrpc (the final number in the
      following trace excerpts).  I measured this by making the rxrpc_rx_packet
      trace point print the time elapsed between the skb being timestamped and
      the current time (in ns), e.g.:
      
      	... 424.278721: rxrpc_rx_packet: ...  ACK 25026
      
      So doing a 512MiB DIO read from my test server, with an unmodified kernel:
      
      	N       min     max     sum		mean    stddev
      	27605   2626    7581    7.83992e+07     2840.04 181.029
      
      and with the patch applied:
      
      	N       min     max     sum		mean    stddev
      	27547   1895    12165   6.77461e+07     2459.29 255.02
      Signed-off-by: David Howells <dhowells@redhat.com>
  13. 05 Oct 2018 (2 commits)
    • rxrpc: Fix the data_ready handler · 2cfa2271
      Committed by David Howells
      Fix the rxrpc_data_ready() function to pick up all packets and to not miss
      any.  There are two problems:
      
       (1) The sk_data_ready pointer on the UDP socket is set *after* it is
           bound.  This means that it's open for business before we're ready to
           dequeue packets and there's a tiny window exists in which a packet can
           sneak onto the receive queue, but we never know about it.
      
           Fix this by setting the pointers on the socket prior to binding it.
      
       (2) skb_recv_udp() will return an error (such as ENETUNREACH) if there was
           an error on the transmission side, even though we set the
           sk_error_report hook.  Because rxrpc_data_ready() returns immediately
           in such a case, it never actually removes its packet from the receive
           queue.
      
           Fix this by abstracting out the UDP dequeuing and checksumming into a
           separate function that keeps hammering on skb_recv_udp() until it
           returns -EAGAIN, passing the packets extracted to the remainder of the
           function.
      
      and two potential problems:
      
       (3) It might be possible in some circumstances or in the future for
           packets to be being added to the UDP receive queue whilst rxrpc is
           running consuming them, so the data_ready() handler might get called
           less often than once per packet.
      
           Allow for this by fully draining the queue on each call as (2).
      
       (4) If a packet fails the checksum check, the code currently returns after
           discarding the packet without checking for more.
      
           Allow for this by fully draining the queue on each call as (2).
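
      The drain loop described in (2)-(4) has roughly this shape (a sketch; the
      hand-off helper named here is hypothetical):

      	/* Keep pulling packets until the receive queue is definitely empty. */
      	for (;;) {
      		skb = skb_recv_udp(udp_sk, 0, 1, &ret);
      		if (!skb) {
      			if (ret == -EAGAIN)
      				return;		/* Queue drained. */
      			continue;		/* e.g. ICMP-reported error: keep draining. */
      		}
      		rxrpc_handle_one_packet(local, skb);	/* Hypothetical hand-off. */
      	}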
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
    • rxrpc: Fix some missed refs to init_net · 5e33a23b
      Committed by David Howells
      Fix some refs to init_net that should've been changed to the appropriate
      network namespace.
      
      Fixes: 2baec2c3 ("rxrpc: Support network namespacing")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
  14. 28 Sep 2018 (7 commits)
    • rxrpc: Fix error distribution · f3344303
      Committed by David Howells
      Fix error distribution by immediately delivering the errors to all the
      affected calls rather than deferring them to a worker thread.  The problem
      with the latter is that retries and the like can happen in the meantime,
      when we want to stop them sooner.
      
      To this end:
      
       (1) Stop the error distributor from removing calls from the error_targets
           list so that peer->lock isn't needed to synchronise against other adds
           and removals.
      
       (2) Require the peer's error_targets list to be accessed with RCU, thereby
           avoiding the need to take peer->lock over distribution.
      
       (3) Don't attempt to affect a call's state if it is already marked complete.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socket · 37a675e7
      Committed by David Howells
      It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on
      IP_RECVERR, so neither local errors nor ICMP-transported remote errors from
      IPv4 peer addresses are returned to the AF_RXRPC protocol.
      
      Make the sockopt setting code in rxrpc_open_socket() fall through from the
      AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in
      the AF_INET6 case.
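
      In code terms, something like this in rxrpc_open_socket() (a sketch using
      kernel_setsockopt() as it existed at the time):

      	int opt = 1;

      	switch (local->srx.transport.family) {
      	case AF_INET6:
      		kernel_setsockopt(local->socket, SOL_IPV6, IPV6_RECVERR,
      				  (char *)&opt, sizeof(opt));
      		/* Fall through: the IPv6 socket can carry IPv4 traffic too. */
      	case AF_INET:
      		kernel_setsockopt(local->socket, SOL_IP, IP_RECVERR,
      				  (char *)&opt, sizeof(opt));
      		break;
      	}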
      
      Fixes: f2aeed3a ("rxrpc: Fix error reception on AF_INET6 sockets")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Make service call handling more robust · 0099dc58
      Committed by David Howells
      Make the following changes to improve the robustness of the code that sets
      up a new service call:
      
       (1) Cache the rxrpc_sock struct obtained in rxrpc_data_ready() to do a
           service ID check and pass that along to rxrpc_new_incoming_call().
           This means that I can remove the check from rxrpc_new_incoming_call()
           without the need to worry about the socket attached to the local
           endpoint getting replaced - which would invalidate the check.
      
       (2) Cache the rxrpc_peer struct, thereby allowing the peer search to be
           done once.  The peer is passed to rxrpc_new_incoming_call(), thereby
           saving the need to repeat the search.
      
           This also reduces the possibility of rxrpc_publish_service_conn()
           BUG()'ing due to the detection of a duplicate connection, despite the
           initial search done by rxrpc_find_connection_rcu() having turned up
           nothing.
      
           This BUG() shouldn't ever get hit since rxrpc_data_ready() *should* be
           non-reentrant and the result of the initial search should still hold
           true, but it has proven possible to hit.
      
           I *think* this may be due to __rxrpc_lookup_peer_rcu() cutting short
           the iteration over the hash table if it finds a matching peer with a
           zero usage count, but I don't know for sure since it's only ever been
           hit once that I know of.
      
           Another possibility is that a bug in rxrpc_data_ready() that checked
           the wrong byte in the header for the RXRPC_CLIENT_INITIATED flag
           might've let through a packet that caused a spurious and invalid call
           to be set up.  That is addressed in another patch.
      
       (3) Fix __rxrpc_lookup_peer_rcu() to skip peer records that have a zero
           usage count rather than stopping and returning not found, just in case
           there's another peer record behind it in the bucket.
      
       (4) Don't search the peer records in rxrpc_alloc_incoming_call(), but
           rather either use the peer cached in (2) or, if one wasn't found,
           preemptively install a new one.
      
      Fixes: 8496af50 ("rxrpc: Use RCU to access a peer's service connection tree")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Improve up-front incoming packet checking · 403fc2a1
      Committed by David Howells
      Do more up-front checking on incoming packets to weed out invalid ones and
      also ones aimed at services that we don't support.
      
      Whilst we're at it, replace the clearing of call and skew if we don't find
      a connection with just initialising the variables to zero at the top of the
      function.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Emit BUSY packets when supposed to rather than ABORTs · ece64fec
      Committed by David Howells
      In the input path, a received sk_buff can be marked for rejection by
      setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data
      (such as an abort code) in skb->priority.  The rejection is handled by
      queueing the sk_buff up for dealing with in process context.  The output
      code reads the mark and priority and, theoretically, generates an
      appropriate response packet.
      
      However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT
      message with a random abort code is generated (since skb->priority wasn't
      set to anything).
      
      Fix this by outputting the appropriate sort of packet.
      
      Also, whilst we're at it, most of the marks are no longer used, so remove
      them and rename the remaining two to something more obvious.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix RTT gathering · b604dd98
      Committed by David Howells
      Fix RTT information gathering in AF_RXRPC by the following means:
      
       (1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.
      
       (2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready()
           collects it, set it at that point.
      
       (3) Allow ACKs to be requested on the last packet of a client call, but
           not a service call.  We need to be careful lest we undo:
      
      	bf7d620a
      	Author: David Howells <dhowells@redhat.com>
      	Date:   Thu Oct 6 08:11:51 2016 +0100
      	rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase
      
           but that only really applies to service calls that we're handling,
           since the client side gets to send the final ACK (or not).
      
       (4) When about to transmit an ACK or DATA packet, record the Tx timestamp
           before only; don't update the timestamp afterwards.
      
       (5) Switch the ordering between recording the serial and recording the
           timestamp to always set the serial number first.  The serial number
           shouldn't be seen referenced by an ACK packet until we've transmitted
           the packet bearing it - so in the Rx path, we don't need the timestamp
           until we've checked the serial number.
      
      Fixes: cf1a6474 ("rxrpc: Add per-peer RTT tracker")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix checks as to whether we should set up a new call · dc71db34
      Committed by David Howells
      There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED
      flag in the packet type field rather than in the packet flags field.
      
      Fix this by creating a pair of helper functions to check whether the packet
      is going to the client or to the server and use them generally.
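
      The helpers are along these lines (a sketch; the exact names in the patch
      may differ):

      	static inline bool rxrpc_to_server(const struct rxrpc_skb_priv *sp)
      	{
      		/* Consult the packet *flags*, not the packet type. */
      		return sp->hdr.flags & RXRPC_CLIENT_INITIATED;
      	}

      	static inline bool rxrpc_to_client(const struct rxrpc_skb_priv *sp)
      	{
      		return !rxrpc_to_server(sp);
      	}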
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
  15. 27 Sep 2018 (1 commit)
  16. 12 Aug 2018 (1 commit)
  17. 09 Aug 2018 (1 commit)
    • rxrpc: Fix the keepalive generator [ver #2] · 330bdcfa
      Committed by David Howells
      AF_RXRPC has a keepalive message generator that generates a message for a
      peer ~20s after the last transmission to that peer to keep firewall ports
      open.  The implementation is incorrect in the following ways:
      
       (1) It mixes up ktime_t and time64_t types.
      
       (2) It uses ktime_get_real(), the output of which may jump forward or
           backward due to adjustments to the time of day.
      
       (3) If the current time jumps forward too much or jumps backwards, the
           generator function will crank the base of the time ring round one slot
           at a time (ie. a 1s period) until it catches up, spewing out VERSION
           packets as it goes.
      
      Fix the problem by:
      
       (1) Only using time64_t.  There's no need for sub-second resolution.
      
       (2) Use ktime_get_seconds() rather than ktime_get_real() so that time
           isn't perceived to go backwards.
      
       (3) Simplifying rxrpc_peer_keepalive_worker() by splitting it into two
           parts:
      
           (a) The "worker" function that manages the buckets and the timer.
      
           (b) The "dispatch" function that takes the pending peers and
           	 potentially transmits a keepalive packet before putting them back
           	 in the ring into the slot appropriate to the revised last-Tx time.
      
       (4) Taking everything that's pending out of the ring and splicing it into
           a temporary collector list for processing.
      
           In the case that there's been a significant jump forward, the ring
           gets entirely emptied and then the time base can be warped forward
           before the peers are processed.
      
           The warping can't happen if the ring isn't empty because the slot a
           peer is in is keepalive-time dependent, relative to the base time.
      
       (5) Limit the number of iterations of the bucket array when scanning it.
      
       (6) Set the timer to skip any empty slots as there's no point waking up if
           there's nothing to do yet.
      
      This can be triggered by an incoming call from a server after a reboot with
      AF_RXRPC and AFS built into the kernel causing a peer record to be set up
      before userspace is started.  The system clock is then adjusted by
      userspace, thereby potentially causing the keepalive generator to have a
      meltdown - which leads to a message like:
      
      	watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:23]
      	...
      	Workqueue: krxrpcd rxrpc_peer_keepalive_worker
      	EIP: lock_acquire+0x69/0x80
      	...
      	Call Trace:
      	 ? rxrpc_peer_keepalive_worker+0x5e/0x350
      	 ? _raw_spin_lock_bh+0x29/0x60
      	 ? rxrpc_peer_keepalive_worker+0x5e/0x350
      	 ? rxrpc_peer_keepalive_worker+0x5e/0x350
      	 ? __lock_acquire+0x3d3/0x870
      	 ? process_one_work+0x110/0x340
      	 ? process_one_work+0x166/0x340
      	 ? process_one_work+0x110/0x340
      	 ? worker_thread+0x39/0x3c0
      	 ? kthread+0xdb/0x110
      	 ? cancel_delayed_work+0x90/0x90
      	 ? kthread_stop+0x70/0x70
      	 ? ret_from_fork+0x19/0x24
      
      Fixes: ace45bec ("rxrpc: Fix firewall route keepalive")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 04 Aug 2018 (2 commits)
  19. 03 Aug 2018 (1 commit)