- 08 10月, 2018 2 次提交
-
-
由 David Howells 提交于
We don't need to take the RCU read lock in the rxrpc packet receive function because it's held further up the stack in the IP input routine around the UDP receive routines. Fix this by dropping the RCU read lock calls from rxrpc_input_packet(). This simplifies the code. Fixes: 70790dbe ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception in which a packet is placed onto the UDP receive queue and then immediately removed again by rxrpc. Going via the queue in this manner seems like it should be unnecessary. This does, however, require the invention of a value to place in encap_type as that's one of the conditions to switch packets out to the encap_rcv hook. Possibly the value doesn't actually matter for anything other than sockopts on the UDP socket, which aren't accessible outside of rxrpc anyway. This seems to cut a bit of time out of the time elapsed between each sk_buff being timestamped and turning up in rxrpc (the final number in the following trace excerpts). I measured this by making the rxrpc_rx_packet trace point print the time elapsed between the skb being timestamped and the current time (in ns), e.g.: ... 424.278721: rxrpc_rx_packet: ... ACK 25026 So doing a 512MiB DIO read from my test server, with an unmodified kernel: N min max sum mean stddev 27605 2626 7581 7.83992e+07 2840.04 181.029 and with the patch applied: N min max sum mean stddev 27547 1895 12165 6.77461e+07 2459.29 255.02 Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 05 10月, 2018 2 次提交
-
-
由 David Howells 提交于
Fix the rxrpc_data_ready() function to pick up all packets and to not miss any. There are two problems: (1) The sk_data_ready pointer on the UDP socket is set *after* it is bound. This means that it's open for business before we're ready to dequeue packets and there's a tiny window exists in which a packet can sneak onto the receive queue, but we never know about it. Fix this by setting the pointers on the socket prior to binding it. (2) skb_recv_udp() will return an error (such as ENETUNREACH) if there was an error on the transmission side, even though we set the sk_error_report hook. Because rxrpc_data_ready() returns immediately in such a case, it never actually removes its packet from the receive queue. Fix this by abstracting out the UDP dequeuing and checksumming into a separate function that keeps hammering on skb_recv_udp() until it returns -EAGAIN, passing the packets extracted to the remainder of the function. and two potential problems: (3) It might be possible in some circumstances or in the future for packets to be being added to the UDP receive queue whilst rxrpc is running consuming them, so the data_ready() handler might get called less often than once per packet. Allow for this by fully draining the queue on each call as (2). (4) If a packet fails the checksum check, the code currently returns after discarding the packet without checking for more. Allow for this by fully draining the queue on each call as (2). Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NPaolo Abeni <pabeni@redhat.com>
-
由 David Howells 提交于
Fix some refs to init_net that should've been changed to the appropriate network namespace. Fixes: 2baec2c3 ("rxrpc: Support network namespacing") Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NPaolo Abeni <pabeni@redhat.com>
-
- 28 9月, 2018 5 次提交
-
-
由 David Howells 提交于
Make the following changes to improve the robustness of the code that sets up a new service call: (1) Cache the rxrpc_sock struct obtained in rxrpc_data_ready() to do a service ID check and pass that along to rxrpc_new_incoming_call(). This means that I can remove the check from rxrpc_new_incoming_call() without the need to worry about the socket attached to the local endpoint getting replaced - which would invalidate the check. (2) Cache the rxrpc_peer struct, thereby allowing the peer search to be done once. The peer is passed to rxrpc_new_incoming_call(), thereby saving the need to repeat the search. This also reduces the possibility of rxrpc_publish_service_conn() BUG()'ing due to the detection of a duplicate connection, despite the initial search done by rxrpc_find_connection_rcu() having turned up nothing. This BUG() shouldn't ever get hit since rxrpc_data_ready() *should* be non-reentrant and the result of the initial search should still hold true, but it has proven possible to hit. I *think* this may be due to __rxrpc_lookup_peer_rcu() cutting short the iteration over the hash table if it finds a matching peer with a zero usage count, but I don't know for sure since it's only ever been hit once that I know of. Another possibility is that a bug in rxrpc_data_ready() that checked the wrong byte in the header for the RXRPC_CLIENT_INITIATED flag might've let through a packet that caused a spurious and invalid call to be set up. That is addressed in another patch. (3) Fix __rxrpc_lookup_peer_rcu() to skip peer records that have a zero usage count rather than stopping and returning not found, just in case there's another peer record behind it in the bucket. (4) Don't search the peer records in rxrpc_alloc_incoming_call(), but rather either use the peer cached in (2) or, if one wasn't found, preemptively install a new one. Fixes: 8496af50 ("rxrpc: Use RCU to access a peer's service connection tree") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Do more up-front checking on incoming packets to weed out invalid ones and also ones aimed at services that we don't support. Whilst we're at it, replace the clearing of call and skew if we don't find a connection with just initialising the variables to zero at the top of the function. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
In the input path, a received sk_buff can be marked for rejection by setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data (such as an abort code) in skb->priority. The rejection is handled by queueing the sk_buff up for dealing with in process context. The output code reads the mark and priority and, theoretically, generates an appropriate response packet. However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT message with a random abort code is generated (since skb->priority wasn't set to anything). Fix this by outputting the appropriate sort of packet. Also, whilst we're at it, most of the marks are no longer used, so remove them and rename the remaining two to something more obvious. Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix RTT information gathering in AF_RXRPC by the following means: (1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS. (2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready() collects it, set it at that point. (3) Allow ACKs to be requested on the last packet of a client call, but not a service call. We need to be careful lest we undo: bf7d620a Author: David Howells <dhowells@redhat.com> Date: Thu Oct 6 08:11:51 2016 +0100 rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase but that only really applies to service calls that we're handling, since the client side gets to send the final ACK (or not). (4) When about to transmit an ACK or DATA packet, record the Tx timestamp before only; don't update the timestamp afterwards. (5) Switch the ordering between recording the serial and recording the timestamp to always set the serial number first. The serial number shouldn't be seen referenced by an ACK packet until we've transmitted the packet bearing it - so in the Rx path, we don't need the timestamp until we've checked the serial number. Fixes: cf1a6474 ("rxrpc: Add per-peer RTT tracker") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED flag in the packet type field rather than in the packet flags field. Fix this by creating a pair of helper functions to check whether the packet is going to the client or to the server and use them generally. Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 01 8月, 2018 2 次提交
-
-
由 David Howells 提交于
Trace notifications from the softirq side of the socket to the process-context side. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Trace successful packet transmission (kernel_sendmsg() succeeded, that is) in AF_RXRPC. We can share the enum that defines the transmission points with the trace_rxrpc_tx_fail() tracepoint, so rename its constants to be applicable to both. Also, save the internal call->debug_id in the rxrpc_channel struct so that it can be used in retransmission trace lines. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 05 6月, 2018 1 次提交
-
-
由 David Howells 提交于
Sometimes an in-progress call will stop responding on the fileserver when the fileserver quietly cancels the call with an internally marked abort (RX_CALL_DEAD), without sending an ABORT to the client. This causes the client's call to eventually expire from lack of incoming packets directed its way, which currently leads to it being cancelled locally with ETIME. Note that it's not currently clear as to why this happens as it's really hard to reproduce. The rotation policy implement by kAFS, however, doesn't differentiate between ETIME meaning we didn't get any response from the server and ETIME meaning the call got cancelled mid-flow. The latter leads to an oops when fetching data as the rotation partially resets the afs_read descriptor, which can result in a cleared page pointer being dereferenced because that page has already been filled. Handle this by the following means: (1) Set a flag on a call when we receive a packet for it. (2) Store the highest packet serial number so far received for a call (bearing in mind this may wrap). (3) If, when the "not received anything recently" timeout expires on a call, we've received at least one packet for a call and the connection as a whole has received packets more recently than that call, then cancel the call locally with ECONNRESET rather than ETIME. This indicates that the call was definitely in progress on the server. (4) In kAFS, if the rotation algorithm sees ECONNRESET rather than ETIME, don't try the next server, but rather abort the call. This avoids the oops as we don't try to reuse the afs_read struct. Rather, as-yet ungotten pages will be reread at a later data. Also: (5) Add an rxrpc tracepoint to log detection of the call being reset. Without this, I occasionally see an oops like the following: general protection fault: 0000 [#1] SMP PTI ... RIP: 0010:_copy_to_iter+0x204/0x310 RSP: 0018:ffff8800cae0f828 EFLAGS: 00010206 RAX: 0000000000000560 RBX: 0000000000000560 RCX: 0000000000000560 RDX: ffff8800cae0f968 RSI: ffff8800d58b3312 RDI: 0005080000000000 RBP: ffff8800cae0f968 R08: 0000000000000560 R09: ffff8800ca00f400 R10: ffff8800c36f28d4 R11: 00000000000008c4 R12: ffff8800cae0f958 R13: 0000000000000560 R14: ffff8800d58b3312 R15: 0000000000000560 FS: 00007fdaef108080(0000) GS:ffff8800ca680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb28a8fa000 CR3: 00000000d2a76002 CR4: 00000000001606e0 Call Trace: skb_copy_datagram_iter+0x14e/0x289 rxrpc_recvmsg_data.isra.0+0x6f3/0xf68 ? trace_buffer_unlock_commit_regs+0x4f/0x89 rxrpc_kernel_recv_data+0x149/0x421 afs_extract_data+0x1e0/0x798 ? afs_wait_for_call_to_complete+0xc9/0x52e afs_deliver_fs_fetch_data+0x33a/0x5ab afs_deliver_to_call+0x1ee/0x5e0 ? afs_wait_for_call_to_complete+0xc9/0x52e afs_wait_for_call_to_complete+0x12b/0x52e ? wake_up_q+0x54/0x54 afs_make_call+0x287/0x462 ? afs_fs_fetch_data+0x3e6/0x3ed ? rcu_read_lock_sched_held+0x5d/0x63 afs_fs_fetch_data+0x3e6/0x3ed afs_fetch_data+0xbb/0x14a afs_readpages+0x317/0x40d __do_page_cache_readahead+0x203/0x2ba ? ondemand_readahead+0x3a7/0x3c1 ondemand_readahead+0x3a7/0x3c1 generic_file_buffered_read+0x18b/0x62f __vfs_read+0xdb/0xfe vfs_read+0xb2/0x137 ksys_read+0x50/0x8c do_syscall_64+0x7d/0x1a0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Note the weird value in RDI which is a result of trying to kmap() a NULL page pointer. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 5月, 2018 1 次提交
-
-
由 David Howells 提交于
The expect_rx_by call timeout is supposed to be set when a call is started to indicate that we need to receive a packet by that point. This is currently put back every time we receive a packet, but it isn't started when we first send a packet. Without this, the call may wait forever if the server doesn't deign to reply. Fix this by setting the timeout upon a successful UDP sendmsg call for the first DATA packet. The timeout is initiated only for initial transmission and not for subsequent retries as we don't want the retry mechanism to extend the timeout indefinitely. Fixes: a158bdd3 ("rxrpc: Fix call timeouts") Reported-by: NMarc Dionne <marc.dionne@auristor.com> Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 04 4月, 2018 1 次提交
-
-
由 David Howells 提交于
By analogy with other Rx implementations, RxRPC packet types 9, 10 and 11 should just be discarded rather than being aborted like other undefined packet types. Reported-by: NJeffrey Altman <jaltman@auristor.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 3月, 2018 2 次提交
-
-
由 David Howells 提交于
If a call-level abort is received for the previous call to complete on a connection channel, then that abort is queued for the connection processor to handle. Unfortunately, the connection processor then assumes without checking that the abort is connection-level (ie. callNumber is 0) and distributes it over all active calls on that connection, thereby incorrectly aborting them. Fix this by discarding aborts aimed at a completed call. Further, discard all packets aimed at a call that's complete if there's currently an active call on a channel, since the DATA packets associated with the new call automatically terminate the old call. Fixes: 18bfeba5 ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor") Reported-by: NMarc Dionne <marc.dionne@auristor.com> Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix the firewall route keepalive part of AF_RXRPC which is currently function incorrectly by replying to VERSION REPLY packets from the server with VERSION REQUEST packets. Instead, send VERSION REPLY packets to the peers of service connections to act as keep-alives 20s after the latest packet was transmitted to that peer. Also, just discard VERSION REPLY packets rather than replying to them. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 28 3月, 2018 1 次提交
-
-
由 David Howells 提交于
In rxrpc and afs, use the debug_ids that are monotonically allocated to various objects as they're allocated rather than pointers as kernel pointers are now hashed making them less useful. Further, the debug ids aren't reused anywhere nearly as quickly. In addition, allow kernel services that use rxrpc, such as afs, to take numbers from the rxrpc counter, assign them to their own call struct and pass them in to rxrpc for both client and service calls so that the trace lines for each will have the same ID tag. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 29 11月, 2017 1 次提交
-
-
由 David Howells 提交于
Clean up some whitespace from rxrpc. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 24 11月, 2017 2 次提交
-
-
由 David Howells 提交于
Add an extra timeout that is set/updated when we send a DATA packet that has the request-ack flag set. This allows us to detect if we don't get an ACK in response to the latest flagged packet. The ACK packet is adjudged to have been lost if it doesn't turn up within 2*RTT of the transmission. If the timeout occurs, we schedule the sending of a PING ACK to find out the state of the other side. If a new DATA packet is ready to go sooner, we cancel the sending of the ping and set the request-ack flag on that instead. If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what we had at the time of the ping transmission, we adjudge all the DATA packets sent between the response tx_top and the ping-time tx_top to have been lost and retransmit immediately. Rather than sending a PING ACK, we could just pick a DATA packet and speculatively retransmit that with request-ack set. It should result in either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu the a PING-RESPONSE ACK mentioned above. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix the rxrpc call expiration timeouts and make them settable from userspace. By analogy with other rx implementations, there should be three timeouts: (1) "Normal timeout" This is set for all calls and is triggered if we haven't received any packets from the peer in a while. It is measured from the last time we received any packet on that call. This is not reset by any connection packets (such as CHALLENGE/RESPONSE packets). If a service operation takes a long time, the server should generate PING ACKs at a duration that's substantially less than the normal timeout so is to keep both sides alive. This is set at 1/6 of normal timeout. (2) "Idle timeout" This is set only for a service call and is triggered if we stop receiving the DATA packets that comprise the request data. It is measured from the last time we received a DATA packet. (3) "Hard timeout" This can be set for a call and specified the maximum lifetime of that call. It should not be specified by default. Some operations (such as volume transfer) take a long time. Allow userspace to set/change the timeouts on a call with sendmsg, using a control message: RXRPC_SET_CALL_TIMEOUTS The data to the message is a number of 32-bit words, not all of which need be given: u32 hard_timeout; /* sec from first packet */ u32 idle_timeout; /* msec from packet Rx */ u32 normal_timeout; /* msec from data Rx */ This can be set in combination with any other sendmsg() that affects a call. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 02 11月, 2017 1 次提交
-
-
由 David Howells 提交于
Fix call expiry handling in the following ways (1) If all the request data from a client call is acked, don't send a follow up IDLE ACK with firstPacket == 1 and previousPacket == 0 as this appears to fool some servers into thinking everything has been accepted. (2) Never send an abort back to the server once it has ACK'd all the request packets; rather just try to reuse the channel for the next call. The first request DATA packet of the next call on the same channel will implicitly ACK the entire reply of the dead call - even if we haven't transmitted it yet. (3) Don't send RX_CALL_TIMEOUT in an ABORT packet, librx uses abort codes to pass local errors to the caller in addition to remote errors, and this is meant to be local only. The following also need to be addressed in future patches: (4) Service calls should send PING ACKs as 'keep alives' if the server is still processing the call. (5) VERSION REPLY packets should be sent to the peers of service connections to act as keep-alives. This is used to keep firewall routes in place. The AFS CM should enable this. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 24 10月, 2017 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: NGustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2017 1 次提交
-
-
由 David Howells 提交于
Make it possible for a client to use AuriStor's service upgrade facility. The client does this by adding an RXRPC_UPGRADE_SERVICE control message to the first sendmsg() of a call. This takes no parameters. When recvmsg() starts returning data from the call, the service ID field in the returned msg_name will reflect the result of the upgrade attempt. If the upgrade was ignored, srx_service will match what was set in the sendmsg(); if the upgrade happened the srx_service will be altered to indicate the service the server upgraded to. Note that: (1) The choice of upgrade service is up to the server (2) Further client calls to the same server that would share a connection are blocked if an upgrade probe is in progress. (3) This should only be used to probe the service. Clients should then use the returned service ID in all subsequent communications with that server (and not set the upgrade). Note that the kernel will not retain this information should the connection expire from its cache. (4) If a server that supports upgrading is replaced by one that doesn't, whilst a connection is live, and if the replacement is running, say, OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement server will not respond to packets sent to the upgraded connection. At this point, calls will time out and the server must be reprobed. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 06 4月, 2017 4 次提交
-
-
由 David Howells 提交于
Add a tracepoint (rxrpc_rx_rwind_change) to log changes in a call's receive window size as imposed by the peer through an ACK packet. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Add a tracepoint (rxrpc_rx_abort) to record received aborts. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received packets. The following changes are made: (1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a call and mark the call aborted. This is wrapped by rxrpc_abort_eproto() that makes the why string usable in trace. (2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error generation points, replacing rxrpc_abort_call() with the latter. (3) Only send an abort packet in rxkad_verify_packet*() if we actually managed to abort the call. Note that a trace event is also emitted if a kernel user (e.g. afs) tries to send data through a call when it's not in the transmission phase, though it's not technically a receive event. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Use negative error codes in struct rxrpc_call::error because that's what the kernel normally deals with and to make the code consistent. We only turn them positive when transcribing into a cmsg for userspace recvmsg. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 11 3月, 2017 1 次提交
-
-
由 David Howells 提交于
The RxRPC ACK packet may contain an extension that includes the peer's current Rx window size for this call. We adjust the local Tx window size to match. However, the transmitter can stall if the receive window is reduced to 0 by the peer and then reopened. This is because the normal way that the transmitter is re-energised is by dropping something out of our Tx queue and thus making space. When a single gap is made, the transmitter is woken up. However, because there's nothing in the Tx queue at this point, this doesn't happen. To fix this, perform a wake_up() any time we see the peer's Rx window size increasing. The observable symptom is that calls start failing on ETIMEDOUT and the following: kAFS: SERVER DEAD state=-62 appears in dmesg. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 3月, 2017 1 次提交
-
-
由 David Howells 提交于
The call state may be changed at any time by the data-ready routine in response to received packets, so if the call state is to be read and acted upon several times in a function, READ_ONCE() must be used unless the call state lock is held. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 3月, 2017 1 次提交
-
-
由 David Howells 提交于
All the routines by which rxrpc is accessed from the outside are serialised by means of the socket lock (sendmsg, recvmsg, bind, rxrpc_kernel_begin_call(), ...) and this presents a problem: (1) If a number of calls on the same socket are in the process of connection to the same peer, a maximum of four concurrent live calls are permitted before further calls need to wait for a slot. (2) If a call is waiting for a slot, it is deep inside sendmsg() or rxrpc_kernel_begin_call() and the entry function is holding the socket lock. (3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented from servicing the other calls as they need to take the socket lock to do so. (4) The socket is stuck until a call is aborted and makes its slot available to the waiter. Fix this by: (1) Provide each call with a mutex ('user_mutex') that arbitrates access by the users of rxrpc separately for each specific call. (2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as they've got a call and taken its mutex. Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is set but someone else has the lock. Should I instead only return EWOULDBLOCK if there's nothing currently to be done on a socket, and sleep in this particular instance because there is something to be done, but we appear to be blocked by the interrupt handler doing its ping? (3) Make rxrpc_new_client_call() unlock the socket after allocating a new call, locking its user mutex and adding it to the socket's call tree. The call is returned locked so that sendmsg() can add data to it immediately. From the moment the call is in the socket tree, it is subject to access by sendmsg() and recvmsg() - even if it isn't connected yet. (4) Lock new service calls in the UDP data_ready handler (in rxrpc_new_incoming_call()) because they may already be in the socket's tree and the data_ready handler makes them live immediately if a user ID has already been preassigned. Note that the new call is locked before any notifications are sent that it is live, so doing mutex_trylock() *ought* to always succeed. Userspace is prevented from doing sendmsg() on calls that are in a too-early state in rxrpc_do_sendmsg(). (5) Make rxrpc_new_incoming_call() return the call with the user mutex held so that a ping can be scheduled immediately under it. Note that it might be worth moving the ping call into rxrpc_new_incoming_call() and then we can drop the mutex there. (6) Make rxrpc_accept_call() take the lock on the call it is accepting and release the socket after adding the call to the socket's tree. This is slightly tricky as we've dequeued the call by that point and have to requeue it. Note that requeuing emits a trace event. (7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the new mutex immediately and don't bother with the socket mutex at all. This patch has the nice bonus that calls on the same socket are now to some extent parallelisable. Note that we might want to move rxrpc_service_prealloc() calls out from the socket lock and give it its own lock, so that we don't hang progress in other calls because we're waiting for the allocator. We probably also want to avoid calling rxrpc_notify_socket() from within the socket lock (rxrpc_accept_call()). Signed-off-by: NDavid Howells <dhowells@redhat.com> Tested-by: NMarc Dionne <marc.c.dionne@auristor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 1月, 2017 2 次提交
-
-
由 David Howells 提交于
Add the following extra tracing information: (1) Modify the rxrpc_transmit tracepoint to record the Tx window size as this is varied by the slow-start algorithm. (2) Modify the rxrpc_rx_ack tracepoint to record more information from received ACK packets. (3) Add an rxrpc_rx_data tracepoint to record the information in DATA packets. (4) Add an rxrpc_disconnect_call tracepoint to record call disconnection, including the reason the call was disconnected. (5) Add an rxrpc_improper_term tracepoint to record implicit termination of a call by a client either by starting a new call on a particular connection channel without first transmitting the final ACK for the previous call. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix the way enum values are translated into strings in AF_RXRPC tracepoints. The problem with just doing a lookup in a normal flat array of strings or chars is that external tracing infrastructure can't find it. Rather, TRACE_DEFINE_ENUM must be used. Also sort the enums and string tables to make it easier to keep them in order so that a future patch to __print_symbolic() can be optimised to try a direct lookup into the table first before iterating over it. A couple of _proto() macro calls are removed because they refered to tables that got moved to the tracing infrastructure. The relevant data can be found by way of tracing. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 08 11月, 2016 1 次提交
-
-
由 Paolo Abeni 提交于
A new argument is added to __skb_recv_datagram to provide an explicit skb destructor, invoked under the receive queue lock. The UDP protocol uses such argument to perform memory reclaiming on dequeue, so that the UDP protocol does not set anymore skb->desctructor. Instead explicit memory reclaiming is performed at close() time and when skbs are removed from the receive queue. The in kernel UDP protocol users now need to call a skb_recv_udp() variant instead of skb_recv_datagram() to properly perform memory accounting on dequeue. Overall, this allows acquiring only once the receive queue lock on dequeue. Tested using pktgen with random src port, 64 bytes packet, wire-speed on a 10G link as sender and udp_sink as the receiver, using an l4 tuple rxhash to stress the contention, and one or more udp_sink instances with reuseport. nr sinks vanilla patched 1 440 560 3 2150 2300 6 3650 3800 9 4450 4600 12 6250 6450 v1 -> v2: - do rmem and allocated memory scheduling under the receive lock - do bulk scheduling in first_packet_length() and in udp_destruct_sock() - avoid the typdef for the dequeue callback Suggested-by: NEric Dumazet <edumazet@google.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 10月, 2016 3 次提交
-
-
由 David Howells 提交于
OpenAFS doesn't always correctly terminate client calls that it makes - this includes calls the OpenAFS servers make to the cache manager service. It should end the client call with either: (1) An ACK that has firstPacket set to one greater than the seq number of the reply DATA packet with the LAST_PACKET flag set (thereby hard-ACK'ing all packets). nAcks should be 0 and acks[] should be empty (ie. no soft-ACKs). (2) An ACKALL packet. OpenAFS, though, may send an ACK packet with firstPacket set to the last seq number or less and soft-ACKs listed for all packets up to and including the last DATA packet. The transmitter, however, is obliged to keep the call live and the soft-ACK'd DATA packets around until they're hard-ACK'd as the receiver is permitted to drop any merely soft-ACK'd packet and request retransmission by sending an ACK packet with a NACK in it. Further, OpenAFS will also terminate a client call by beginning the next client call on the same connection channel. This implicitly completes the previous call. This patch handles implicit ACK of a call on a channel by the reception of the first packet of the next call on that channel. If another call doesn't come along to implicitly ACK a call, then we have to time the call out. There are some bugs there that will be addressed in subsequent patches. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Separate the output of PING ACKs from the output of other sorts of ACK so that if we receive a PING ACK and schedule transmission of a PING RESPONSE ACK, the response doesn't get cancelled by a PING ACK we happen to be scheduling transmission of at the same time. If a PING RESPONSE gets lost, the other side might just sit there waiting for it and refuse to proceed otherwise. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
When a reply is deemed lost, we send a ping to find out the other end received all the request data packets we sent. This should be limited to client calls and we shouldn't do this on service calls. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 30 9月, 2016 4 次提交
-
-
由 David Howells 提交于
Keep that call timeouts as ktimes rather than jiffies so that they can be expressed as functions of RTT. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
The offset field in struct rxrpc_skb_priv is unnecessary as the value can always be calculated. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
When we receive an ACK from the peer that tells us what the peer's receive window (rwind) is, we should reduce ssthresh to rwind if rwind is smaller than ssthresh. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Switch to Congestion Avoidance mode at cwnd == ssthresh rather than relying on cwnd getting incremented beyond ssthresh and the window size, the mode being shifted and then cwnd being corrected. We need to make sure we switch into CA mode so that we stop marking every packet for ACK. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-