1. 08 10月, 2019 1 次提交
  2. 29 8月, 2019 1 次提交
    • D
      rxrpc: Fix read-after-free in rxrpc_queue_local() · a05354cb
      David Howells 提交于
      commit 06d9532fa6b34f12a6d75711162d47c17c1add72 upstream.
      
      rxrpc_queue_local() attempts to queue the local endpoint it is given and
      then, if successful, prints a trace line.  The trace line includes the
      current usage count - but we're not allowed to look at the local endpoint
      at this point as we passed our ref on it to the workqueue.
      
      Fix this by reading the usage count before queuing the work item.
      
      Also fix the reading of local->debug_id for trace lines, which must be done
      with the same consideration as reading the usage count.
      
      Fixes: 09d2bf59 ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
      Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a05354cb
  3. 26 7月, 2019 1 次提交
    • D
      rxrpc: Fix oops in tracepoint · 0f2f2ceb
      David Howells 提交于
      [ Upstream commit 99f0eae653b2db64917d0b58099eb51e300b311d ]
      
      If the rxrpc_eproto tracepoint is enabled, an oops will be cause by the
      trace line that rxrpc_extract_header() tries to emit when a protocol error
      occurs (typically because the packet is short) because the call argument is
      NULL.
      
      Fix this by using ?: to assume 0 as the debug_id if call is NULL.
      
      This can then be induced by:
      
      	echo -e '\0\0\0\0\0\0\0\0' | ncat -4u --send-only <addr> 20001
      
      where addr has the following program running on it:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <unistd.h>
      	#include <sys/socket.h>
      	#include <arpa/inet.h>
      	#include <linux/rxrpc.h>
      	int main(void)
      	{
      		struct sockaddr_rxrpc srx;
      		int fd;
      		memset(&srx, 0, sizeof(srx));
      		srx.srx_family			= AF_RXRPC;
      		srx.srx_service			= 0;
      		srx.transport_type		= AF_INET;
      		srx.transport_len		= sizeof(srx.transport.sin);
      		srx.transport.sin.sin_family	= AF_INET;
      		srx.transport.sin.sin_port	= htons(0x4e21);
      		fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET6);
      		bind(fd, (struct sockaddr *)&srx, sizeof(srx));
      		sleep(20);
      		return 0;
      	}
      
      It results in the following oops.
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000340
      	#PF: supervisor read access in kernel mode
      	#PF: error_code(0x0000) - not-present page
      	...
      	RIP: 0010:trace_event_raw_event_rxrpc_rx_eproto+0x47/0xac
      	...
      	Call Trace:
      	 <IRQ>
      	 rxrpc_extract_header+0x86/0x171
      	 ? rcu_read_lock_sched_held+0x5d/0x63
      	 ? rxrpc_new_skb+0xd4/0x109
      	 rxrpc_input_packet+0xef/0x14fc
      	 ? rxrpc_input_data+0x986/0x986
      	 udp_queue_rcv_one_skb+0xbf/0x3d0
      	 udp_unicast_rcv_skb.isra.8+0x64/0x71
      	 ip_protocol_deliver_rcu+0xe4/0x1b4
      	 ip_local_deliver+0xf0/0x154
      	 __netif_receive_skb_one_core+0x50/0x6c
      	 netif_receive_skb_internal+0x26b/0x2e9
      	 napi_gro_receive+0xf8/0x1da
      	 rtl8169_poll+0x303/0x4c4
      	 net_rx_action+0x10e/0x333
      	 __do_softirq+0x1a5/0x38f
      	 irq_exit+0x54/0xc4
      	 do_IRQ+0xda/0xf8
      	 common_interrupt+0xf/0xf
      	 </IRQ>
      	 ...
      	 ? cpuidle_enter_state+0x23c/0x34d
      	 cpuidle_enter+0x2a/0x36
      	 do_idle+0x163/0x1ea
      	 cpu_startup_entry+0x1d/0x1f
      	 start_secondary+0x157/0x172
      	 secondary_startup_64+0xa4/0xb0
      
      Fixes: a25e21f0 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      0f2f2ceb
  4. 20 4月, 2019 1 次提交
    • D
      rxrpc: Fix client call connect/disconnect race · 11582064
      David Howells 提交于
      [ Upstream commit 930c9f9125c85b5134b3e711bc252ecc094708e3 ]
      
      rxrpc_disconnect_client_call() reads the call's connection ID protocol
      value (call->cid) as part of that function's variable declarations.  This
      is bad because it's not inside the locked section and so may race with
      someone granting use of the channel to the call.
      
      This manifests as an assertion failure (see below) where the call in the
      presumed channel (0 because call->cid wasn't set when we read it) doesn't
      match the call attached to the channel we were actually granted (if 1, 2 or
      3).
      
      Fix this by moving the read and dependent calculations inside of the
      channel_lock section.  Also, only set the channel number and pointer
      variables if cid is not zero (ie. unset).
      
      This problem can be induced by injecting an occasional error in
      rxrpc_wait_for_channel() before the call to schedule().
      
      Make two further changes also:
      
       (1) Add a trace for wait failure in rxrpc_connect_call().
      
       (2) Drop channel_lock before BUG'ing in the case of the assertion failure.
      
      The failure causes a trace akin to the following:
      
      rxrpc: Assertion failed - 18446612685268945920(0xffff8880beab8c00) == 18446612685268621312(0xffff8880bea69800) is false
      ------------[ cut here ]------------
      kernel BUG at net/rxrpc/conn_client.c:824!
      ...
      RIP: 0010:rxrpc_disconnect_client_call+0x2bf/0x99d
      ...
      Call Trace:
       rxrpc_connect_call+0x902/0x9b3
       ? wake_up_q+0x54/0x54
       rxrpc_new_client_call+0x3a0/0x751
       ? rxrpc_kernel_begin_call+0x141/0x1bc
       ? afs_alloc_call+0x1b5/0x1b5
       rxrpc_kernel_begin_call+0x141/0x1bc
       afs_make_call+0x20c/0x525
       ? afs_alloc_call+0x1b5/0x1b5
       ? __lock_is_held+0x40/0x71
       ? lockdep_init_map+0xaf/0x193
       ? lockdep_init_map+0xaf/0x193
       ? __lock_is_held+0x40/0x71
       ? yfs_fs_fetch_data+0x33b/0x34a
       yfs_fs_fetch_data+0x33b/0x34a
       afs_fetch_data+0xdc/0x3b7
       afs_read_dir+0x52d/0x97f
       afs_dir_iterate+0xa0/0x661
       ? iterate_dir+0x63/0x141
       iterate_dir+0xa2/0x141
       ksys_getdents64+0x9f/0x11b
       ? filldir+0x111/0x111
       ? do_syscall_64+0x3e/0x1a0
       __x64_sys_getdents64+0x16/0x19
       do_syscall_64+0x7d/0x1a0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 45025bce ("rxrpc: Improve management and caching of client connection objects")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      11582064
  5. 09 10月, 2018 1 次提交
  6. 28 9月, 2018 1 次提交
    • D
      rxrpc: Fix error distribution · f3344303
      David Howells 提交于
      Fix error distribution by immediately delivering the errors to all the
      affected calls rather than deferring them to a worker thread.  The problem
      with the latter is that retries and things can happen in the meantime when we
      want to stop that sooner.
      
      To this end:
      
       (1) Stop the error distributor from removing calls from the error_targets
           list so that peer->lock isn't needed to synchronise against other adds
           and removals.
      
       (2) Require the peer's error_targets list to be accessed with RCU, thereby
           avoiding the need to take peer->lock over distribution.
      
       (3) Don't attempt to affect a call's state if it is already marked complete.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f3344303
  7. 01 8月, 2018 3 次提交
  8. 05 6月, 2018 1 次提交
    • D
      rxrpc: Fix handling of call quietly cancelled out on server · 1a025028
      David Howells 提交于
      Sometimes an in-progress call will stop responding on the fileserver when
      the fileserver quietly cancels the call with an internally marked abort
      (RX_CALL_DEAD), without sending an ABORT to the client.
      
      This causes the client's call to eventually expire from lack of incoming
      packets directed its way, which currently leads to it being cancelled
      locally with ETIME.  Note that it's not currently clear as to why this
      happens as it's really hard to reproduce.
      
      The rotation policy implement by kAFS, however, doesn't differentiate
      between ETIME meaning we didn't get any response from the server and ETIME
      meaning the call got cancelled mid-flow.  The latter leads to an oops when
      fetching data as the rotation partially resets the afs_read descriptor,
      which can result in a cleared page pointer being dereferenced because that
      page has already been filled.
      
      Handle this by the following means:
      
       (1) Set a flag on a call when we receive a packet for it.
      
       (2) Store the highest packet serial number so far received for a call
           (bearing in mind this may wrap).
      
       (3) If, when the "not received anything recently" timeout expires on a
           call, we've received at least one packet for a call and the connection
           as a whole has received packets more recently than that call, then
           cancel the call locally with ECONNRESET rather than ETIME.
      
           This indicates that the call was definitely in progress on the server.
      
       (4) In kAFS, if the rotation algorithm sees ECONNRESET rather than ETIME,
           don't try the next server, but rather abort the call.
      
           This avoids the oops as we don't try to reuse the afs_read struct.
           Rather, as-yet ungotten pages will be reread at a later data.
      
      Also:
      
       (5) Add an rxrpc tracepoint to log detection of the call being reset.
      
      Without this, I occasionally see an oops like the following:
      
          general protection fault: 0000 [#1] SMP PTI
          ...
          RIP: 0010:_copy_to_iter+0x204/0x310
          RSP: 0018:ffff8800cae0f828 EFLAGS: 00010206
          RAX: 0000000000000560 RBX: 0000000000000560 RCX: 0000000000000560
          RDX: ffff8800cae0f968 RSI: ffff8800d58b3312 RDI: 0005080000000000
          RBP: ffff8800cae0f968 R08: 0000000000000560 R09: ffff8800ca00f400
          R10: ffff8800c36f28d4 R11: 00000000000008c4 R12: ffff8800cae0f958
          R13: 0000000000000560 R14: ffff8800d58b3312 R15: 0000000000000560
          FS:  00007fdaef108080(0000) GS:ffff8800ca680000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fb28a8fa000 CR3: 00000000d2a76002 CR4: 00000000001606e0
          Call Trace:
           skb_copy_datagram_iter+0x14e/0x289
           rxrpc_recvmsg_data.isra.0+0x6f3/0xf68
           ? trace_buffer_unlock_commit_regs+0x4f/0x89
           rxrpc_kernel_recv_data+0x149/0x421
           afs_extract_data+0x1e0/0x798
           ? afs_wait_for_call_to_complete+0xc9/0x52e
           afs_deliver_fs_fetch_data+0x33a/0x5ab
           afs_deliver_to_call+0x1ee/0x5e0
           ? afs_wait_for_call_to_complete+0xc9/0x52e
           afs_wait_for_call_to_complete+0x12b/0x52e
           ? wake_up_q+0x54/0x54
           afs_make_call+0x287/0x462
           ? afs_fs_fetch_data+0x3e6/0x3ed
           ? rcu_read_lock_sched_held+0x5d/0x63
           afs_fs_fetch_data+0x3e6/0x3ed
           afs_fetch_data+0xbb/0x14a
           afs_readpages+0x317/0x40d
           __do_page_cache_readahead+0x203/0x2ba
           ? ondemand_readahead+0x3a7/0x3c1
           ondemand_readahead+0x3a7/0x3c1
           generic_file_buffered_read+0x18b/0x62f
           __vfs_read+0xdb/0xfe
           vfs_read+0xb2/0x137
           ksys_read+0x50/0x8c
           do_syscall_64+0x7d/0x1a0
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Note the weird value in RDI which is a result of trying to kmap() a NULL
      page pointer.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1a025028
  9. 11 5月, 2018 2 次提交
  10. 31 3月, 2018 2 次提交
  11. 28 3月, 2018 3 次提交
    • D
      rxrpc: Trace call completion · 1bae5d22
      David Howells 提交于
      Add a tracepoint to track rxrpc calls moving into the completed state and
      to log the completion type and the recorded error value and abort code.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      1bae5d22
    • D
      rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0
      David Howells 提交于
      In rxrpc and afs, use the debug_ids that are monotonically allocated to
      various objects as they're allocated rather than pointers as kernel
      pointers are now hashed making them less useful.  Further, the debug ids
      aren't reused anywhere nearly as quickly.
      
      In addition, allow kernel services that use rxrpc, such as afs, to take
      numbers from the rxrpc counter, assign them to their own call struct and
      pass them in to rxrpc for both client and service calls so that the trace
      lines for each will have the same ID tag.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      a25e21f0
    • D
      rxrpc: Trace resend · 827efed6
      David Howells 提交于
      Add a tracepoint to trace packet resend events and to dump the Tx
      annotation buffer for added illumination.
      Signed-off-by: NDavid Howells <dhowells@rdhat.com>
      827efed6
  12. 24 11月, 2017 4 次提交
    • D
      rxrpc: Fix service endpoint expiry · f859ab61
      David Howells 提交于
      RxRPC service endpoints expire like they're supposed to by the following
      means:
      
       (1) Mark dead rxrpc_net structs (with ->live) rather than twiddling the
           global service conn timeout, otherwise the first rxrpc_net struct to
           die will cause connections on all others to expire immediately from
           then on.
      
       (2) Mark local service endpoints for which the socket has been closed
           (->service_closed) so that the expiration timeout can be much
           shortened for service and client connections going through that
           endpoint.
      
       (3) rxrpc_put_service_conn() needs to schedule the reaper when the usage
           count reaches 1, not 0, as idle conns have a 1 count.
      
       (4) The accumulator for the earliest time we might want to schedule for
           should be initialised to jiffies + MAX_JIFFY_OFFSET, not ULONG_MAX as
           the comparison functions use signed arithmetic.
      
       (5) Simplify the expiration handling, adding the expiration value to the
           idle timestamp each time rather than keeping track of the time in the
           past before which the idle timestamp must go to be expired.  This is
           much easier to read.
      
       (6) Ignore the timeouts if the net namespace is dead.
      
       (7) Restart the service reaper work item rather the client reaper.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f859ab61
    • D
      rxrpc: Add keepalive for a call · 415f44e4
      David Howells 提交于
      We need to transmit a packet every so often to act as a keepalive for the
      peer (which has a timeout from the last time it received a packet) and also
      to prevent any intervening firewalls from closing the route.
      
      Do this by resetting a timer every time we transmit a packet.  If the timer
      ever expires, we transmit a PING ACK packet and thereby also elicit a PING
      RESPONSE ACK from the other side - which prevents our last-rx timeout from
      expiring.
      
      The timer is set to 1/6 of the last-rx timeout so that we can detect the
      other side going away if it misses 6 replies in a row.
      
      This is particularly necessary for servers where the processing of the
      service function may take a significant amount of time.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      415f44e4
    • D
      rxrpc: Add a timeout for detecting lost ACKs/lost DATA · bd1fdf8c
      David Howells 提交于
      Add an extra timeout that is set/updated when we send a DATA packet that
      has the request-ack flag set.  This allows us to detect if we don't get an
      ACK in response to the latest flagged packet.
      
      The ACK packet is adjudged to have been lost if it doesn't turn up within
      2*RTT of the transmission.
      
      If the timeout occurs, we schedule the sending of a PING ACK to find out
      the state of the other side.  If a new DATA packet is ready to go sooner,
      we cancel the sending of the ping and set the request-ack flag on that
      instead.
      
      If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
      we had at the time of the ping transmission, we adjudge all the DATA
      packets sent between the response tx_top and the ping-time tx_top to have
      been lost and retransmit immediately.
      
      Rather than sending a PING ACK, we could just pick a DATA packet and
      speculatively retransmit that with request-ack set.  It should result in
      either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu the
      a PING-RESPONSE ACK mentioned above.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      bd1fdf8c
    • D
      rxrpc: Fix call timeouts · a158bdd3
      David Howells 提交于
      Fix the rxrpc call expiration timeouts and make them settable from
      userspace.  By analogy with other rx implementations, there should be three
      timeouts:
      
       (1) "Normal timeout"
      
           This is set for all calls and is triggered if we haven't received any
           packets from the peer in a while.  It is measured from the last time
           we received any packet on that call.  This is not reset by any
           connection packets (such as CHALLENGE/RESPONSE packets).
      
           If a service operation takes a long time, the server should generate
           PING ACKs at a duration that's substantially less than the normal
           timeout so is to keep both sides alive.  This is set at 1/6 of normal
           timeout.
      
       (2) "Idle timeout"
      
           This is set only for a service call and is triggered if we stop
           receiving the DATA packets that comprise the request data.  It is
           measured from the last time we received a DATA packet.
      
       (3) "Hard timeout"
      
           This can be set for a call and specified the maximum lifetime of that
           call.  It should not be specified by default.  Some operations (such
           as volume transfer) take a long time.
      
      Allow userspace to set/change the timeouts on a call with sendmsg, using a
      control message:
      
      	RXRPC_SET_CALL_TIMEOUTS
      
      The data to the message is a number of 32-bit words, not all of which need
      be given:
      
      	u32 hard_timeout;	/* sec from first packet */
      	u32 idle_timeout;	/* msec from packet Rx */
      	u32 normal_timeout;	/* msec from data Rx */
      
      This can be set in combination with any other sendmsg() that affects a
      call.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      a158bdd3
  13. 05 6月, 2017 1 次提交
    • D
      rxrpc: Add service upgrade support for client connections · 4e255721
      David Howells 提交于
      Make it possible for a client to use AuriStor's service upgrade facility.
      
      The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
      the first sendmsg() of a call.  This takes no parameters.
      
      When recvmsg() starts returning data from the call, the service ID field in
      the returned msg_name will reflect the result of the upgrade attempt.  If
      the upgrade was ignored, srx_service will match what was set in the
      sendmsg(); if the upgrade happened the srx_service will be altered to
      indicate the service the server upgraded to.
      
      Note that:
      
       (1) The choice of upgrade service is up to the server
      
       (2) Further client calls to the same server that would share a connection
           are blocked if an upgrade probe is in progress.
      
       (3) This should only be used to probe the service.  Clients should then
           use the returned service ID in all subsequent communications with that
           server (and not set the upgrade).  Note that the kernel will not
           retain this information should the connection expire from its cache.
      
       (4) If a server that supports upgrading is replaced by one that doesn't,
           whilst a connection is live, and if the replacement is running, say,
           OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
           server will not respond to packets sent to the upgraded connection.
      
           At this point, calls will time out and the server must be reprobed.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4e255721
  14. 06 4月, 2017 4 次提交
    • D
      rxrpc: Trace client call connection · 89ca6948
      David Howells 提交于
      Add a tracepoint (rxrpc_connect_call) to log the combination of rxrpc_call
      pointer, afs_call pointer/user data and wire call parameters to make it
      easier to match the tracebuffer contents to captured network packets.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      89ca6948
    • D
      rxrpc: Trace changes in a call's receive window size · 740586d2
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_rwind_change) to log changes in a call's receive
      window size as imposed by the peer through an ACK packet.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      740586d2
    • D
      rxrpc: Trace received aborts · 005ede28
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_abort) to record received aborts.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      005ede28
    • D
      rxrpc: Trace protocol errors in received packets · fb46f6ee
      David Howells 提交于
      Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received
      packets.  The following changes are made:
      
       (1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a
           call and mark the call aborted.  This is wrapped by
           rxrpc_abort_eproto() that makes the why string usable in trace.
      
       (2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error
           generation points, replacing rxrpc_abort_call() with the latter.
      
       (3) Only send an abort packet in rxkad_verify_packet*() if we actually
           managed to abort the call.
      
      Note that a trace event is also emitted if a kernel user (e.g. afs) tries
      to send data through a call when it's not in the transmission phase, though
      it's not technically a receive event.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      fb46f6ee
  15. 02 3月, 2017 1 次提交
    • D
      rxrpc: Fix deadlock between call creation and sendmsg/recvmsg · 540b1c48
      David Howells 提交于
      All the routines by which rxrpc is accessed from the outside are serialised
      by means of the socket lock (sendmsg, recvmsg, bind,
      rxrpc_kernel_begin_call(), ...) and this presents a problem:
      
       (1) If a number of calls on the same socket are in the process of
           connection to the same peer, a maximum of four concurrent live calls
           are permitted before further calls need to wait for a slot.
      
       (2) If a call is waiting for a slot, it is deep inside sendmsg() or
           rxrpc_kernel_begin_call() and the entry function is holding the socket
           lock.
      
       (3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented
           from servicing the other calls as they need to take the socket lock to
           do so.
      
       (4) The socket is stuck until a call is aborted and makes its slot
           available to the waiter.
      
      Fix this by:
      
       (1) Provide each call with a mutex ('user_mutex') that arbitrates access
           by the users of rxrpc separately for each specific call.
      
       (2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as
           they've got a call and taken its mutex.
      
           Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is
           set but someone else has the lock.  Should I instead only return
           EWOULDBLOCK if there's nothing currently to be done on a socket, and
           sleep in this particular instance because there is something to be
           done, but we appear to be blocked by the interrupt handler doing its
           ping?
      
       (3) Make rxrpc_new_client_call() unlock the socket after allocating a new
           call, locking its user mutex and adding it to the socket's call tree.
           The call is returned locked so that sendmsg() can add data to it
           immediately.
      
           From the moment the call is in the socket tree, it is subject to
           access by sendmsg() and recvmsg() - even if it isn't connected yet.
      
       (4) Lock new service calls in the UDP data_ready handler (in
           rxrpc_new_incoming_call()) because they may already be in the socket's
           tree and the data_ready handler makes them live immediately if a user
           ID has already been preassigned.
      
           Note that the new call is locked before any notifications are sent
           that it is live, so doing mutex_trylock() *ought* to always succeed.
           Userspace is prevented from doing sendmsg() on calls that are in a
           too-early state in rxrpc_do_sendmsg().
      
       (5) Make rxrpc_new_incoming_call() return the call with the user mutex
           held so that a ping can be scheduled immediately under it.
      
           Note that it might be worth moving the ping call into
           rxrpc_new_incoming_call() and then we can drop the mutex there.
      
       (6) Make rxrpc_accept_call() take the lock on the call it is accepting and
           release the socket after adding the call to the socket's tree.  This
           is slightly tricky as we've dequeued the call by that point and have
           to requeue it.
      
           Note that requeuing emits a trace event.
      
       (7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the
           new mutex immediately and don't bother with the socket mutex at all.
      
      This patch has the nice bonus that calls on the same socket are now to some
      extent parallelisable.
      
      Note that we might want to move rxrpc_service_prealloc() calls out from the
      socket lock and give it its own lock, so that we don't hang progress in
      other calls because we're waiting for the allocator.
      
      We probably also want to avoid calling rxrpc_notify_socket() from within
      the socket lock (rxrpc_accept_call()).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.c.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540b1c48
  16. 05 1月, 2017 2 次提交
    • D
      rxrpc: Add some more tracing · b1d9f7fd
      David Howells 提交于
      Add the following extra tracing information:
      
       (1) Modify the rxrpc_transmit tracepoint to record the Tx window size as
           this is varied by the slow-start algorithm.
      
       (2) Modify the rxrpc_rx_ack tracepoint to record more information from
           received ACK packets.
      
       (3) Add an rxrpc_rx_data tracepoint to record the information in DATA
           packets.
      
       (4) Add an rxrpc_disconnect_call tracepoint to record call disconnection,
           including the reason the call was disconnected.
      
       (5) Add an rxrpc_improper_term tracepoint to record implicit termination
           of a call by a client either by starting a new call on a particular
           connection channel without first transmitting the final ACK for the
           previous call.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      b1d9f7fd
    • D
      rxrpc: Fix handling of enums-to-string translation in tracing · b54a134a
      David Howells 提交于
      Fix the way enum values are translated into strings in AF_RXRPC
      tracepoints.  The problem with just doing a lookup in a normal flat array
      of strings or chars is that external tracing infrastructure can't find it.
      Rather, TRACE_DEFINE_ENUM must be used.
      
      Also sort the enums and string tables to make it easier to keep them in
      order so that a future patch to __print_symbolic() can be optimised to try
      a direct lookup into the table first before iterating over it.
      
      A couple of _proto() macro calls are removed because they refered to tables
      that got moved to the tracing infrastructure.  The relevant data can be
      found by way of tracing.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      b54a134a
  17. 30 9月, 2016 3 次提交
  18. 25 9月, 2016 1 次提交
    • D
      rxrpc: Implement slow-start · 57494343
      David Howells 提交于
      Implement RxRPC slow-start, which is similar to RFC 5681 for TCP.  A
      tracepoint is added to log the state of the congestion management algorithm
      and the decisions it makes.
      
      Notes:
      
       (1) Since we send fixed-size DATA packets (apart from the final packet in
           each phase), counters and calculations are in terms of packets rather
           than bytes.
      
       (2) The ACK packet carries the equivalent of TCP SACK.
      
       (3) The FLIGHT_SIZE calculation in RFC 5681 doesn't seem particularly
           suited to SACK of a small number of packets.  It seems that, almost
           inevitably, by the time three 'duplicate' ACKs have been seen, we have
           narrowed the loss down to one or two missing packets, and the
           FLIGHT_SIZE calculation ends up as 2.
      
       (4) In rxrpc_resend(), if there was no data that apparently needed
           retransmission, we transmit a PING ACK to ask the peer to tell us what
           its Rx window state is.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      57494343
  19. 23 9月, 2016 5 次提交
  20. 22 9月, 2016 1 次提交
    • D
      rxrpc: Add per-peer RTT tracker · cf1a6474
      David Howells 提交于
      Add a function to track the average RTT for a peer.  Sources of RTT data
      will be added in subsequent patches.
      
      The RTT data will be useful in the future for determining resend timeouts
      and for handling the slow-start part of the Rx protocol.
      
      Also add a pair of tracepoints, one to log transmissions to elicit a
      response for RTT purposes and one to log responses that contribute RTT
      data.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cf1a6474
  21. 17 9月, 2016 1 次提交
    • D
      rxrpc: Improve skb tracing · 71f3ca40
      David Howells 提交于
      Improve sk_buff tracing within AF_RXRPC by the following means:
      
       (1) Use an enum to note the event type rather than plain integers and use
           an array of event names rather than a big multi ?: list.
      
       (2) Distinguish Rx from Tx packets and account them separately.  This
           requires the call phase to be tracked so that we know what we might
           find in rxtx_buffer[].
      
       (3) Add a parameter to rxrpc_{new,see,get,free}_skb() to indicate the
           event type.
      
       (4) A pair of 'rotate' events are added to indicate packets that are about
           to be rotated out of the Rx and Tx windows.
      
       (5) A pair of 'lost' events are added, along with rxrpc_lose_skb() for
           packet loss injection recording.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
       
      71f3ca40