- 05 November 2020, 3 commits
-
-
Submitted by Paolo Abeni

When the TCP stack splits a packet on the write queue, the tail half currently loses the associated skb extensions, and will not carry the DSM on the wire. The above does not cause functional problems and is allowed by the RFC, but it interacts badly with GRO and RX coalescing, as possible candidates for aggregation will carry different TCP options. This change tries to improve the MPTCP behavior, propagating the skb extensions on split. Additionally, we must prevent the MPTCP stack from updating the mapping after the split occurs: that would both violate the RFC and fool the reader.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
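A minimal sketch of the propagation step, assuming the generic skb_ext_copy() helper from include/linux/skbuff.h; the wrapper function and variable names are illustrative, not the actual patch:

```c
#include <linux/skbuff.h>

/* Sketch: when 'skb' is split and 'buff' becomes the tail half, duplicate
 * the skb extensions (which carry the MPTCP DSS mapping) so the tail half
 * still advertises the same mapping on the wire.
 */
static void propagate_mptcp_ext_on_split(struct sk_buff *skb, struct sk_buff *buff)
{
	skb_ext_copy(buff, skb);	/* takes a reference on the extension area */
}
```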
-
Submitted by Florian Westphal

The function is short and won't sleep, so this can use the _fast version.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
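For context, the lock_sock_fast()/unlock_sock_fast() pair avoids the full socket-lock slow path when the critical section is short and never sleeps; a generic usage sketch (the surrounding helper is hypothetical):

```c
#include <net/tcp.h>

/* Hypothetical helper: read a peer-socket field inside a short,
 * non-sleeping critical section.
 */
static u32 peek_subflow_snd_wnd(struct sock *ssk)
{
	bool slow;
	u32 wnd;

	slow = lock_sock_fast(ssk);	/* may only disable BH if uncontended */
	wnd = tcp_sk(ssk)->snd_wnd;
	unlock_sock_fast(ssk, slow);

	return wnd;
}
```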
-
Submitted by Florian Westphal

In addition to tcp autotuning during read, tcp may also increase the receive buffer in tcp_clamp_window(). In this case, mptcp should adjust its receive buffer size as well so it can move all pending skbs from the subflow socket to the mptcp socket. At that point, tcp can have more skbs ready for processing than the mptcp receive buffer size allows. In the mptcp case, the receive window announced is based on the free space of the mptcp parent socket instead of the individual subflows. Following the subflow allows mptcp to grow its receive buffer. This is especially noticeable for loopback traffic, where two skbs are enough to fill the initial receive window. In mptcp_data_ready() we do not hold the mptcp socket lock, so modifying mptcp_sk->sk_rcvbuf is racy. Do it when moving skbs from the subflow to the mptcp socket instead; both sockets are locked in that case.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
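A minimal sketch of that adjustment, assuming the caller holds both the subflow (ssk) and msk (sk) socket locks; the helper name is made up:

```c
#include <net/sock.h>

/* Sketch: grow the MPTCP-level receive buffer to at least the subflow's,
 * so every skb pending on the subflow can be moved over.  The caller
 * holds both socket locks, so the plain comparison is safe here.
 */
static void sync_rcvbuf_from_subflow(struct sock *sk, const struct sock *ssk)
{
	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK) &&
	    ssk->sk_rcvbuf > sk->sk_rcvbuf)
		WRITE_ONCE(sk->sk_rcvbuf, ssk->sk_rcvbuf);
}
```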
-
- 30 October 2020, 1 commit
-
-
Submitted by Paolo Abeni

When moving the skbs from the subflow into the msk receive queue, we must schedule the required amount of memory there. Try to borrow the required memory from the subflow, if needed, so that we leverage the existing TCP heuristic.

Fixes: 6771bfd9 ("mptcp: update mptcp ack sequence from work queue")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/f6143a6193a083574f11b00dbf7b5ad151bc4ff4.1603810630.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
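A hedged sketch of the accounting idea, using the standard sk_rmem_schedule()/sk_forward_alloc machinery; the helper name and the exact borrowing policy are illustrative, not the actual patch:

```c
#include <net/sock.h>

/* Sketch: before queueing onto the msk an skb that was charged to the
 * subflow, account its memory against the msk; if the msk cannot schedule
 * it, borrow from the subflow's forward allocation instead of failing.
 */
static bool msk_rmem_schedule(struct sock *sk, struct sock *ssk,
			      struct sk_buff *skb)
{
	int amt = skb->truesize;

	if (sk_rmem_schedule(sk, skb, amt))
		return true;			/* msk already had room */

	if (ssk->sk_forward_alloc >= amt) {	/* borrow from the subflow */
		ssk->sk_forward_alloc -= amt;
		sk->sk_forward_alloc += amt;
		return true;
	}
	return false;
}
```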
-
- 11 October 2020, 1 commit
-
-
Submitted by Paolo Abeni

The msk can close MP_JOIN subflows if the initial handshake fails. Currently such subflows are kept alive in the conn_list until the msk itself is closed. Beyond the wasted memory, we could end up sending the DATA_FIN and the DATA_FIN ack on such a socket, even after a reset.

Fixes: 43b54c6e ("mptcp: Use full MPTCP-level disconnect state machine")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 09 October 2020, 1 commit
-
-
Submitted by Paolo Abeni

When recvmsg() and the workqueue race to dequeue the data pending on some subflow, and the current mapping for that subflow covers several skbs that have not yet reached the receiver, either the worker or recvmsg() can find a subflow with the data_avail flag set - since the current mapping is valid and in sequence - but no skbs in the receive queue - since the other entity just processed them. The above leads to an unbounded loop in __mptcp_move_skbs() and a subsequent hang of any task trying to acquire the msk socket lock. This change addresses the issue by stopping the __mptcp_move_skbs() loop as soon as we detect the above race (empty receive queue with data_avail set).

Reported-and-tested-by: syzbot+fcf8ca5817d6e92c6567@syzkaller.appspotmail.com
Fixes: ab174ad8 ("mptcp: move ooo skbs into msk out of order queue.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
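Conceptually the fix is an extra exit condition in the move loop; a simplified sketch only (the per-pass helper is hypothetical, and struct mptcp_sock lives in the private net/mptcp/protocol.h header):

```c
/* Hypothetical helper: move as many in-sequence skbs as possible from the
 * subflows into the msk receive queue and return how many were moved.
 */
static unsigned int move_skbs_one_pass(struct mptcp_sock *msk);

/* Sketch: a pass that moves nothing means a racing reader (recvmsg() vs.
 * the worker) already drained the subflow queues even though data_avail
 * is still set, so stop instead of spinning forever.
 */
static void move_skbs_sketch(struct mptcp_sock *msk)
{
	while (move_skbs_one_pass(msk) > 0)
		;
}
```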
-
- 06 October 2020, 1 commit
-
-
Submitted by Paolo Abeni

Currently we skip calling tcp_cleanup_rbuf() when packets are moved into the OoO queue or simply dropped. In both cases we still increment tp->copied_seq, and we should ask the TCP stack to check whether an ack needs to be sent.

Fixes: c76c6956 ("mptcp: call tcp_cleanup_rbuf on subflows")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
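The shape of the change, as a hedged sketch (tcp_cleanup_rbuf() is only callable from MPTCP once it is exported, see the 15 September 2020 entry below; the wrapper name is illustrative):

```c
#include <net/tcp.h>

/* Sketch: after advancing copied_seq for data that was dropped or moved
 * to the MPTCP out-of-order queue, let TCP decide whether an ACK is now
 * due (window update, delayed-ACK state, and so on).
 */
static void subflow_account_discarded(struct sock *ssk, unsigned int len)
{
	struct tcp_sock *tp = tcp_sk(ssk);

	WRITE_ONCE(tp->copied_seq, tp->copied_seq + len);
	tcp_cleanup_rbuf(ssk, len);
}
```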
-
- 30 September 2020, 1 commit
-
-
Submitted by Mat Martineau

The msk->ack_seq value is sometimes read without the msk lock held, so make proper use of READ_ONCE and WRITE_ONCE.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
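The pattern in isolation (the wrapper functions are illustrative, and struct mptcp_sock is the private msk definition from net/mptcp/protocol.h; only the READ_ONCE()/WRITE_ONCE() pairing is the point):

```c
/* Writer side: runs with the msk socket lock held. */
static void msk_advance_ack_seq(struct mptcp_sock *msk, u64 delta)
{
	WRITE_ONCE(msk->ack_seq, msk->ack_seq + delta);
}

/* Reader side: may run without the msk lock, so READ_ONCE() prevents the
 * compiler from tearing or re-loading the 64-bit value.
 */
static u64 msk_read_ack_seq(const struct mptcp_sock *msk)
{
	return READ_ONCE(msk->ack_seq);
}
```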
-
- 25 September 2020, 3 commits
-
-
Submitted by Geliang Tang

This patch adds a new helper named mptcp_destroy_common containing the code shared between mptcp_destroy() and mptcp_sock_destruct().

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Geliang Tang

This patch implements the logic to remove announced addresses and subflows in PM netlink. When the PM netlink removes an address, we traverse all the existing msk sockets to find the relevant ones. We add a new list named anno_list in mptcp_pm_data to record all the announced addresses. During the traversal, we check whether the address has been recorded; if it has, we trigger the RM_ADDR signal. We also check whether this address is in conn_list; if it is, we remove the subflow that is using this local address. Since we call mptcp_pm_free_anno_list in mptcp_destroy, we need to move __mptcp_init_sock before the mptcp_is_enabled check in mptcp_init_sock.

Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Geliang Tang

This patch adds the RM_ADDR option parsing logic: we parse the incoming options to check whether the rm_addr option was received, and call mptcp_pm_rm_addr_received to schedule PM work with a new status, named MPTCP_PM_RM_ADDR_RECEIVED. When the PM work picks up this status, it calls mptcp_pm_nl_rm_addr_received to handle it. In mptcp_pm_nl_rm_addr_received, we close the subflow matching the rm_id and update the PM counter.

Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 September 2020, 1 commit
-
-
Submitted by Ye Bin

Fixes the coccicheck warning:

net/mptcp/protocol.c:164:11-18: WARNING: Unsigned expression compared with zero: max_seq > 0

Fixes: ab174ad8 ("mptcp: move ooo skbs into msk out of order queue")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
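For context, an illustration of the bug class that warning points at (not the exact kernel code): a possibly negative value stored in an unsigned variable makes the subsequent sign test meaningless, so the signed intermediate has to be kept:

```c
#include <net/tcp.h>

/* Illustrative only: tcp_space() returns a signed int that can be
 * negative; widening it into a u64 first would make a "> 0" check
 * effectively always true, which is what coccicheck complains about.
 */
static u64 compute_max_seq(const struct sock *sk, u64 ack_seq)
{
	int space = tcp_space(sk);	/* keep the signed intermediate */

	return space > 0 ? ack_seq + space : ack_seq;
}
```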
-
- 15 September 2020, 8 commits
-
-
Submitted by Paolo Abeni

That is needed to let the subflows promptly announce when new space is available in the receive buffer. tcp_cleanup_rbuf() is currently a static function; drop the scope modifier and add a declaration in the TCP header.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Paolo Abeni

Update the scheduler to a less trivial heuristic: cache the last used subflow and try to send a reasonably long burst of data on it. When the burst or the subflow send space is exhausted, pick the subflow with the lowest ratio between write space and send buffer - that is, the subflow with the greatest relative amount of free space.

v1 -> v2:
- fix 32-bit build breakage due to 64-bit div
- fix checkpatch issues (uint64_t -> u64)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
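Reading "write space" here as the amount of data already queued for transmission (which is what makes "lowest ratio = most free space" consistent), a hedged sketch of the scoring, including div_u64() to keep the 64-bit division 32-bit-safe; the helper name is illustrative:

```c
#include <linux/math64.h>
#include <net/sock.h>

/* Sketch: score a subflow by how much of its send buffer is already
 * queued; the scheduler prefers the subflow with the LOWEST score,
 * i.e. the greatest relative amount of free space.  div_u64() keeps
 * the 64-bit division buildable on 32-bit architectures.
 */
static u64 subflow_queued_ratio(const struct sock *ssk)
{
	u64 wmem = READ_ONCE(ssk->sk_wmem_queued);
	u32 sndbuf = READ_ONCE(ssk->sk_sndbuf);

	if (!sndbuf)
		return U64_MAX;

	return div_u64(wmem << 32, sndbuf);
}
```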
-
Submitted by Paolo Abeni

Add a bunch of MPTCP MIBs related to MPTCP OoO data processing.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Paolo Abeni

Add an RB-tree to cope with OoO (at the MPTCP level) data. __mptcp_move_skb() inserts "future" data into the RB tree, eventually coalescing skbs as allowed by the MPTCP DSN. To simplify sequence accounting, move the DSN inside the cb. After successfully enqueuing in-sequence data, check if we can use any data from the RB tree. Additionally, move the data_fin check after spooling data from the OoO tree, otherwise we could miss shutdown events. The RB tree code is copied as verbatim as possible from tcp_data_queue_ofo(), with a few simplifications due to the fact that MPTCP doesn't need to cope with sacks. All bugs here are added by me.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
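A hedged sketch of the insertion keyed by the MPTCP data sequence number, without the coalescing and overlap handling; the cb-layout struct and accessor macro are illustrative stand-ins for what the commit stores in skb->cb:

```c
#include <linux/rbtree.h>
#include <linux/skbuff.h>

/* Illustrative stand-in for the DSN stashed in skb->cb. */
struct ooo_cb {
	u64 map_seq;
};
#define OOO_CB(skb)	((struct ooo_cb *)&((skb)->cb[0]))

/* Sketch: insert an out-of-order skb into an RB tree ordered by MPTCP DSN,
 * mirroring the shape of tcp_data_queue_ofo() but without SACK handling.
 */
static void ooo_queue_insert(struct rb_root *root, struct sk_buff *skb)
{
	struct rb_node **p = &root->rb_node, *parent = NULL;
	u64 seq = OOO_CB(skb)->map_seq;

	while (*p) {
		struct sk_buff *cur = rb_entry(*p, struct sk_buff, rbnode);

		parent = *p;
		if (seq < OOO_CB(cur)->map_seq)
			p = &parent->rb_left;
		else
			p = &parent->rb_right;
	}
	rb_link_node(&skb->rbnode, parent, p);
	rb_insert_color(&skb->rbnode, root);
}
```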
-
Submitted by Paolo Abeni

Factor out existing code; it will be re-used by the next patch.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Paolo Abeni

Let the msk sendbuf track the size of the largest subflow's send window, so that we ensure mptcp_sendmsg() does not exceed the MPTCP-level send window. The update is performed just before trying to send any data.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Paolo Abeni

This is a prerequisite to allow receiving data from multiple subflows without re-injection. Instead of dropping the OoO - "future" - data in subflow_check_data_avail(), call into __mptcp_move_skbs() and let the msk drop it. To avoid code duplication, factor out the mptcp_subflow_discard_data() helper. Note that __mptcp_move_skbs() can now find multiple subflows with data available (comprising to-be-discarded data), so it must update the byte counter incrementally.

v1 -> v2:
- fix checkpatch issues (unsigned -> unsigned int)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Paolo Abeni

Currently, when checking for the 'msk is writable' condition, we look at the individual subflows' write space. That works well while we send data via a single subflow, but will not as soon as we enable concurrent xmit on multiple subflows. With this change, the msk becomes writable when the following conditions hold:

- the socket has some free write space
- there is at least one subflow with free write space

Additionally, we need to set the NOSPACE bit on all subflows before blocking.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
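A hedged sketch of the combined check (mptcp_for_each_subflow(), mptcp_subflow_tcp_sock() and sk_stream_is_writeable() are existing kernel helpers from the private net/mptcp/protocol.h and net/sock.h; the wrapper itself is illustrative):

```c
/* Sketch: the msk is writable only if it has free write space itself AND
 * at least one subflow can currently accept more data.
 */
static bool msk_is_writable(struct mptcp_sock *msk)
{
	struct mptcp_subflow_context *subflow;

	if (!sk_stream_is_writeable((struct sock *)msk))
		return false;

	mptcp_for_each_subflow(msk, subflow) {
		if (sk_stream_is_writeable(mptcp_subflow_tcp_sock(subflow)))
			return true;
	}
	return false;
}
```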
-
- 01 September 2020, 1 commit
-
-
Submitted by YueHaibing

There is no caller in the tree any more.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 August 2020, 1 commit
-
-
Submitted by Florian Westphal

After the subflow lock is dropped, more wmem might have been made available. This fixes a deadlock in mptcp_connect.sh 'mmap' mode: wmem is exhausted, but as the mptcp socket holds on to already-acked data (for retransmit) no wakeup will occur. Using 'goto restart' calls mptcp_clean_una(sk), which will free pages that have been acked completely in the meantime.

Fixes: fb529e62 ("mptcp: break and restart in case mptcp sndbuf is full")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 August 2020, 1 commit
-
-
Submitted by Gustavo A. R. Silva

Replace the existing /* fall through */ comments and their variants with the new pseudo-keyword macro fallthrough [1]. Also, remove unnecessary fall-through markings where that is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
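A generic illustration of the conversion (not taken from the MPTCP code; the function and messages are made up):

```c
#include <linux/printk.h>
#include <net/tcp_states.h>

/* Illustration: the 'fallthrough' pseudo-keyword replaces the old
 * "fall through" comment and is understood by the compiler's
 * -Wimplicit-fallthrough checking.
 */
static void log_state(int state)
{
	switch (state) {
	case TCP_SYN_SENT:
		pr_debug("handshake in progress\n");
		fallthrough;	/* previously: a "fall through" comment */
	case TCP_ESTABLISHED:
		pr_debug("connection usable soon or now\n");
		break;
	}
}
```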
-
- 17 August 2020, 1 commit
-
-
Submitted by Florian Westphal

This fix wasn't correct: when this function is invoked from the retransmission worker, the iterator contains garbage and resetting it causes a crash. As the work queue should not be performance critical, also zero the msghdr struct.

Fixes: 35759383 ("mptcp: sendmsg: reset iter on error")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 15 August 2020, 1 commit
-
-
Submitted by Florian Westphal

Once we've copied data from the iterator, we need to revert it in case we end up not sending any data. This bug doesn't trigger with normal 'poll' based tests, because we only feed a small chunk of data to the kernel after poll indicated POLLOUT. With blocking IO and large writes this triggers, and the receiver ends up with less data than it should get.

Fixes: 72511aab ("mptcp: avoid blocking in tcp_sendpages")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
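The fix pattern, sketched with the standard iov_iter API (the wrapper name is illustrative):

```c
#include <linux/socket.h>
#include <linux/uio.h>

/* Sketch: if 'copied' bytes were pulled out of the user iterator but then
 * not transmitted (e.g. no wmem left), hand them back so that a later
 * retry re-reads the same data instead of silently skipping it.
 */
static void undo_partial_copy(struct msghdr *msg, size_t copied)
{
	if (copied)
		iov_iter_revert(&msg->msg_iter, copied);
}
```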
-
- 04 August 2020, 2 commits
-
-
Submitted by Paolo Abeni

In case of memory pressure, mptcp_sendmsg() may call sk_stream_wait_memory() after successfully xmitting some bytes. If the latter fails, we currently return the error code to user space, ignoring the successful xmit. Address the issue by always checking for the xmitted bytes before mptcp_sendmsg() completes.

Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Geliang Tang

Use mptcp_for_each_subflow in mptcp_stream_accept instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 29 July 2020, 9 commits
-
-
Submitted by Mat Martineau

The MPTCP socket's write_seq member can be read without the msk lock held, so use WRITE_ONCE() to store it.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

The MPTCP socket's write_seq member should be read with READ_ONCE() when the msk lock is not held.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

RFC 8684 appendix D describes the connection state machine for MPTCP. This patch implements the DATA_FIN / DATA_ACK exchanges and the MPTCP-level socket state changes described in that appendix, rather than simply sending DATA_FIN along with the TCP FIN when disconnecting subflows. DATA_FIN is now sent and acknowledged before shutting down the subflows. Received DATA_FIN information (if not part of a data packet) is written to the MPTCP socket when the incoming DSS option is parsed by the subflow, and the MPTCP worker is scheduled to process the flag. DATA_FIN received as part of a full DSS mapping will be handled when the mapping is processed. The DATA_FIN is acknowledged by the worker if the reader is caught up. If there is still data to be moved to the MPTCP-level queue, ack_seq will be incremented to account for the DATA_FIN when it reaches the end of the stream, and a DATA_ACK will be sent to the peer.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
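A hedged sketch of the sender-side transitions when our DATA_FIN is covered by the peer's DATA_ACK, following the appendix D diagram rather than the exact kernel function (the MPTCP socket reuses the TCP state names, and inet_sk_state_store() is the existing state-update helper):

```c
#include <net/inet_sock.h>
#include <net/tcp_states.h>

/* Sketch: MPTCP-level state change once the peer's DATA_ACK covers the
 * sequence number of our DATA_FIN (RFC 8684 appendix D, sender side).
 */
static void msk_data_fin_acked(struct sock *sk)
{
	switch (sk->sk_state) {
	case TCP_FIN_WAIT1:	/* we closed first, peer still open */
		inet_sk_state_store(sk, TCP_FIN_WAIT2);
		break;
	case TCP_CLOSING:	/* simultaneous close */
		inet_sk_state_store(sk, TCP_TIME_WAIT);
		break;
	case TCP_LAST_ACK:	/* peer closed first, now we are done */
		inet_sk_state_store(sk, TCP_CLOSE);
		break;
	}
}
```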
-
Submitted by Mat Martineau

After DATA_FIN has been sent, the peer will acknowledge it. An ack of the relevant MPTCP-level sequence number will update the MPTCP connection state appropriately.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

This will be used to transition to the appropriate state on close and to determine if a DATA_FIN needs to be sent for that state transition.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

Incoming DATA_FIN headers need to propagate the presence of the DATA_FIN bit and the associated sequence number to the MPTCP layer, even when arriving on a bare ACK that does not get added to the receive queue. Add structure members to store the DATA_FIN information and helpers to set and check those values.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

Since the DATA_FIN information is the same for every subflow, store it only in the mptcp_sock.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

mptcp_close() acquires the msk lock, so it clearly should not be held before the function is called.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Mat Martineau

An MPTCP socket where sending has been shut down should not attempt to send additional data, since DATA_FIN has already been sent.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 28 July 2020, 1 commit
-
-
Submitted by Matthieu Baerts

Non-blocking sockets used for outgoing connections were missing the inet info about the initial connection due to a typo there: the value of the "err" variable is negative in kernelspace. This fixes the creation of additional subflows where the remote port has to be reused if the other host didn't announce another one. This also fixes inet_diag showing blank info about MPTCP sockets created from non-blocking sockets doing a connect().

Fixes: 41be81a8 ("mptcp: fix unblocking connect()")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 July 2020, 2 commits
-
-
Submitted by Christoph Hellwig

Rework the remaining setsockopt code to pass a sockptr_t instead of a plain user pointer. This removes the last remaining set_fs(KERNEL_DS) outside of architecture-specific code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154]
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
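For context, a minimal sketch of how a setsockopt-style handler consumes a sockptr_t via copy_from_sockptr(), so the same path serves both user and in-kernel callers; the option handling itself is illustrative:

```c
#include <linux/errno.h>
#include <linux/sockptr.h>

/* Sketch: the handler no longer cares whether optval points to user or
 * kernel memory; copy_from_sockptr() dispatches appropriately.
 */
static int example_setsockopt(sockptr_t optval, unsigned int optlen, int *dst)
{
	int val;

	if (optlen < sizeof(val))
		return -EINVAL;
	if (copy_from_sockptr(&val, optval, sizeof(val)))
		return -EFAULT;

	*dst = val;
	return 0;
}
```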
-
Submitted by Christoph Hellwig

Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 July 2020, 1 commit
-
-
Submitted by Paolo Abeni

Currently, accepted msk sockets become established only after accept() returns the new sk to user space. As MP_JOIN requests are refused, per the RFC spec, on sockets that are not fully established, the above causes mp_join self-test instabilities. This change lets the msk enter the established status as soon as it receives the 3rd ack, and propagates the first subflow's fully-established status to the msk socket. Finally, we can change the subflow acceptance condition to take into account both the sock state and the msk fully-established flag.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-