提交 · e65eaded4cc4de6bf153def9dde6b25392d9a236 · openeuler / Kernel

05 3月, 2021 2 次提交

mptcp: free resources when the port number is mismatched · 9238e900

由 Geliang Tang 提交于 3月 04, 2021

When the port number is mismatched with the announced ones, use
'goto dispose_child' to free the resources instead of using 'goto out'.

This patch also moves the port number checking code in
subflow_syn_recv_sock before mptcp_finish_join, otherwise subflow_drop_ctx
will fail in dispose_child.

Fixes: 5bc56388 ("mptcp: add port number check for MP_JOIN")
Reported-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9238e900

mptcp: put subflow sock on connect error · f0715779

由 Florian Westphal 提交于 3月 04, 2021

mptcp_add_pending_subflow() performs a sock_hold() on the subflow,
then adds the subflow to the join list.

Without a sock_put the subflow sk won't be freed in case connect() fails.

unreferenced object 0xffff88810c03b100 (size 3000):
[..]
    sk_prot_alloc.isra.0+0x2f/0x110
    sk_alloc+0x5d/0xc20
    inet6_create+0x2b7/0xd30
    __sock_create+0x17f/0x410
    mptcp_subflow_create_socket+0xff/0x9c0
    __mptcp_subflow_connect+0x1da/0xaf0
    mptcp_pm_nl_work+0x6e0/0x1120
    mptcp_worker+0x508/0x9a0

Fixes: 5b950ff4 ("mptcp: link MPC subflow into msk only after accept")
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f0715779

23 2月, 2021 1 次提交

mptcp: do not wakeup listener for MPJ subflows · 52557dbc

由 Paolo Abeni 提交于 2月 19, 2021

MPJ subflows are not exposed as fds to user spaces. As such,
incoming MPJ subflows are removed from the accept queue by
tcp_check_req()/tcp_get_cookie_sock().

Later tcp_child_process() invokes subflow_data_ready() on the
parent socket regardless of the subflow kind, leading to poll
wakeups even if the later accept will block.

Address the issue by double-checking the queue state before
waking the user-space.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/164Reported-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

52557dbc

13 2月, 2021 2 次提交

mptcp: pass subflow socket to a few helpers · 6c714f1b

由 Florian Westphal 提交于 2月 12, 2021

Pass the first/initial subflow to the existing functions so they can
pass this on to the notification handler that is added later in the
series.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6c714f1b

mptcp: schedule worker when subflow is closed · 40947e13

由 Florian Westphal 提交于 2月 12, 2021

When remote side closes a subflow we should schedule the worker to
dispose of the subflow in a timely manner.

Otherwise, SF_CLOSED event won't be generated until the mptcp
socket itself is closing or local side is closing another subflow.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40947e13

12 2月, 2021 2 次提交

mptcp: init mptcp request socket earlier · d8b59efa

由 Paolo Abeni 提交于 2月 11, 2021

The mptcp subflow route_req() callback performs the subflow
req initialization after the route_req() check. If the latter
fails, mptcp-specific bits of the current request sockets
are left uninitialized.

The above causes bad things at req socket disposal time, when
the mptcp resources are cleared.

This change addresses the issue by splitting subflow_init_req()
into the actual initialization and the mptcp-specific checks.
The initialization is moved before any possibly failing check.
Reported-by: NChristoph Paasch <cpaasch@apple.com>
Fixes: 7ea851d1 ("tcp: merge 'init_req' and 'route_req' functions")
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d8b59efa

mptcp: deliver ssk errors to msk · 15cc1045

由 Paolo Abeni 提交于 2月 11, 2021

Currently all errors received on msk subflows are ignored.
We need to catch at least the errors on connect() and
on fallback sockets.

Use a custom sk_error_report callback at subflow level,
and do the real action under the msk socket lock - via
the usual sock_owned_by_user()/release_callback() schema.

Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks")
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

15cc1045

03 2月, 2021 5 次提交

mptcp: add the mibs for ADD_ADDR with port · 2fbdd9ea

由 Geliang Tang 提交于 2月 01, 2021

This patch adds the mibs for ADD_ADDR with port:

MPTCP_MIB_PORTADD for received ADD_ADDR suboption with a port number.

MPTCP_MIB_PORTSYNRX, MPTCP_MIB_PORTSYNACKRX, MPTCP_MIB_PORTACKRX, for
received MP_JOIN's SYN or SYN/ACK or ACK with a port number which is
different from the msk's port number.

MPTCP_MIB_MISMATCHPORTSYNRX and MPTCP_MIB_MISMATCHPORTACKRX, for
received SYN or ACK MP_JOIN with a mismatched port-number.
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

2fbdd9ea

mptcp: add port number check for MP_JOIN · 5bc56388

由 Geliang Tang 提交于 2月 01, 2021

This patch adds two new helpers, subflow_use_different_sport and
subflow_use_different_dport, to check whether the subflow's source or
destination port number is different from the msk's port number. When
receiving the MP_JOIN's SYN/SYNACK/ACK, we do these port number checks
and print out the different port numbers.

And furthermore, when receiving the MP_JOIN's SYN/ACK, we also use a new
helper mptcp_pm_sport_in_anno_list to check whether this port number is
announced. If it isn't, we need to abort this connection.

This patch also populates the local address's port field in
local_address.
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

5bc56388

mptcp: add a new helper subflow_req_create_thmac · ec20e143

由 Geliang Tang 提交于 2月 01, 2021

This patch adds a new helper named subflow_req_create_thmac, which is
extracted from subflow_token_join_request. It initializes subflow_req's
local_nonce and thmac fields, those are the more expensive to populate.
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

ec20e143

mptcp: drop unused skb in subflow_token_join_request · b5e2e42f

由 Geliang Tang 提交于 2月 01, 2021

This patch drops the unused parameter skb in subflow_token_join_request.
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

b5e2e42f

mptcp: create the listening socket for new port · 1729cf18

由 Geliang Tang 提交于 2月 01, 2021

This patch creates a listening socket when an address with a port-number
is added by PM netlink. Then binds the new port to the socket, and
listens for new connections.

When the address is removed or the addresses are flushed by PM netlink,
release the listening socket.
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

1729cf18

28 1月, 2021 1 次提交

mptcp: support MPJoin with IPv4 mapped in v6 sk · 50a13bc3

由 Matthieu Baerts 提交于 1月 25, 2021

With an IPv4 mapped in v6 socket, we were trying to call inet6_bind()
with an IPv4 address resulting in a -EINVAL error because the given
addr_len -- size of the address structure -- was too short.

We now make sure to use address structures for the same family as the
MPTCP socket for both the bind() and the connect(). It means we convert
v4 addresses to v4 mapped in v6 or the opposite if needed.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/122Co-developed-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

50a13bc3

23 1月, 2021 3 次提交

mptcp: implement delegated actions · b19bc294

由 Paolo Abeni 提交于 1月 20, 2021

On MPTCP-level ack reception, the packet scheduler
may select a subflow other then the current one.

Prior to this commit we rely on the workqueue to trigger
action on such subflow.

This changeset introduces an infrastructure that allows
any MPTCP subflow to schedule actions (MPTCP xmit) on
others subflows without resorting to (multiple) process
reschedule.

A dummy NAPI instance is used instead. When MPTCP needs to
trigger action an a different subflow, it enqueues the target
subflow on the NAPI backlog and schedule such instance as needed.

The dummy NAPI poll method walks the sockets backlog and tries
to acquire the (BH) socket lock on each of them. If the socket
is owned by the user space, the action will be completed by
the sock release cb, otherwise push is started.

This change leverages the delegated action infrastructure
to avoid invoking the MPTCP worker to spool the pending data,
when the packet scheduler picks a subflow other then the one
currently processing the incoming MPTCP-level ack.

Additionally we further refine the subflow selection
invoking the packet scheduler for each chunk of data
even inside __mptcp_subflow_push_pending().

v1 -> v2:
 - fix possible UaF at shutdown time, resetting sock ops
   after removing the ulp context
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

b19bc294

mptcp: re-enable sndbuf autotune · 5cf92bba

由 Paolo Abeni 提交于 1月 20, 2021

After commit 6e628cd3 ("mptcp: use mptcp release_cb for
delayed tasks"), MPTCP never sets the flag bit SOCK_NOSPACE
on its subflow. As a side effect, autotune never takes place,
as it happens inside tcp_new_space(), which in turn is called
only when the mentioned bit is set.

Let's sendmsg() set the subflows NOSPACE bit when looking for
more memory and use the subflow write_space callback to propagate
the snd buf update and wake-up the user-space.

Additionally, this allows dropping a bunch of duplicate code and
makes the SNDBUF_LIMITED chrono relevant again for MPTCP subflows.

Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks")
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

5cf92bba

mptcp: always graft subflow socket to parent · 866f26f2

由 Paolo Abeni 提交于 1月 20, 2021

Currently, incoming subflows link to the parent socket,
while outgoing ones link to a per subflow socket. The latter
is not really needed, except at the initial connect() time and
for the first subflow.

Always graft the outgoing subflow to the parent socket and
free the unneeded ones early.

This allows some code cleanup, reduces the amount of memory
used and will simplify the next patch
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

866f26f2

15 12月, 2020 2 次提交

mptcp: hold mptcp socket before calling tcp_done · ab82e996

由 Florian Westphal 提交于 12月 10, 2020

When processing options from tcp reset path its possible that
tcp_done(ssk) drops the last reference on the mptcp socket which
results in use-after-free.
Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

ab82e996

mptcp: attach subflow socket to parent cgroup · 3764b0c5

由 Nicolas Rybowski 提交于 12月 10, 2020

It has been observed that the kernel sockets created for the subflows
(except the first one) are not in the same cgroup as their parents.
That's because the additional subflows are created by kernel workers.

This is a problem with eBPF programs attached to the parent's
cgroup won't be executed for the children. But also with any other features
of CGroup linked to a sk.

This patch fixes this behaviour.

As the subflow sockets are created by the kernel, we can't use
'mem_cgroup_sk_alloc' because of the current context being the one of the
kworker. This is why we have to do low level memcg manipulation, if
required.
Suggested-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Suggested-by: NPaolo Abeni <pabeni@redhat.com>
Acked-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NNicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

3764b0c5

10 12月, 2020 2 次提交

mptcp: plug subflow context memory leak · 0597d0f8

由 Paolo Abeni 提交于 12月 09, 2020

When a MPTCP listener socket is closed with unaccepted
children pending, the ULP release callback will be invoked,
but nobody will call into __mptcp_close_ssk() on the
corresponding subflow.

As a consequence, at ULP release time, the 'disposable' flag
will be cleared and the subflow context memory will be leaked.

This change addresses the issue always freeing the context if
the subflow is still in the accept queue at ULP release time.

Additionally, this fixes an incorrect code reference in the
related comment.

Note: this fix leverages the changes introduced by the previous
commit.

Fixes: e16163b6 ("mptcp: refactor shutdown and close")
Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0597d0f8

mptcp: link MPC subflow into msk only after accept · 5b950ff4

由 Paolo Abeni 提交于 12月 09, 2020

Christoph reported the following splat:

WARNING: CPU: 0 PID: 4615 at net/ipv4/inet_connection_sock.c:1031 inet_csk_listen_stop+0x8e8/0xad0 net/ipv4/inet_connection_sock.c:1031
Modules linked in:
CPU: 0 PID: 4615 Comm: syz-executor.4 Not tainted 5.9.0 #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:inet_csk_listen_stop+0x8e8/0xad0 net/ipv4/inet_connection_sock.c:1031
Code: 03 00 00 00 e8 79 b2 3d ff e9 ad f9 ff ff e8 1f 76 ba fe be 02 00 00 00 4c 89 f7 e8 62 b2 3d ff e9 14 f9 ff ff e8 08 76 ba fe <0f> 0b e9 97 f8 ff ff e8 fc 75 ba fe be 03 00 00 00 4c 89 f7 e8 3f
RSP: 0018:ffffc900037f7948 EFLAGS: 00010293
RAX: ffff88810a349c80 RBX: ffff888114ee1b00 RCX: ffffffff827b14cd
RDX: 0000000000000000 RSI: ffffffff827b1c38 RDI: 0000000000000005
RBP: ffff88810a2a8000 R08: ffff88810a349c80 R09: fffff520006fef1f
R10: 0000000000000003 R11: fffff520006fef1e R12: ffff888114ee2d00
R13: dffffc0000000000 R14: 0000000000000001 R15: ffff888114ee1d68
FS:  00007f2ac1945700(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd44798bc0 CR3: 0000000109810002 CR4: 0000000000170ef0
Call Trace:
 __tcp_close+0xd86/0x1110 net/ipv4/tcp.c:2433
 __mptcp_close_ssk+0x256/0x430 net/mptcp/protocol.c:1761
 __mptcp_destroy_sock+0x49b/0x770 net/mptcp/protocol.c:2127
 mptcp_close+0x62d/0x910 net/mptcp/protocol.c:2184
 inet_release+0xe9/0x1f0 net/ipv4/af_inet.c:434
 __sock_release+0xd2/0x280 net/socket.c:596
 sock_close+0x15/0x20 net/socket.c:1277
 __fput+0x276/0x960 fs/file_table.c:281
 task_work_run+0x109/0x1d0 kernel/task_work.c:151
 get_signal+0xe8f/0x1d40 kernel/signal.c:2561
 arch_do_signal+0x88/0x1b60 arch/x86/kernel/signal.c:811
 exit_to_user_mode_loop kernel/entry/common.c:161 [inline]
 exit_to_user_mode_prepare+0x9b/0xf0 kernel/entry/common.c:191
 syscall_exit_to_user_mode+0x22/0x150 kernel/entry/common.c:266
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f2ac1254469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007f2ac1944dc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffbf RBX: 000000000069bf00 RCX: 00007f2ac1254469
RDX: 0000000000000000 RSI: 0000000000008982 RDI: 0000000000000003
RBP: 000000000069bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000069bf0c
R13: 00007ffeb53f178f R14: 00000000004668b0 R15: 0000000000000003

After commit 0397c6d8 ("mptcp: keep unaccepted MPC subflow into
join list"), the msk's workqueue and/or PM can touch the MPC
subflow - and acquire its socket lock - even if it's still unaccepted.

If the above event races with the relevant listener socket close, we
can end-up with the above splat.

This change addresses the issue delaying the MPC socket insertion
in conn_list at accept time - that is, partially reverting the
blamed commit.

We must additionally ensure that mptcp_pm_fully_established()
happens after accept() time, or the PM will not be able to
handle properly such event - conn_list could be empty otherwise.

In the receive path, we check the subflow list node to ensure
it is out of the listener queue. Be sure client subflows do
not match transiently such condition moving them into the join
list earlier at creation time.

Since we now have multiple mptcp_pm_fully_established() call sites
from different code-paths, said helper can now race with itself.
Use an additional PM status bit to avoid multiple notifications.
Reported-by: NChristoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/103
Fixes: 0397c6d8 ("mptcp: keep unaccepted MPC subflow into join list"),
Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b950ff4

04 12月, 2020 2 次提交

mptcp: emit tcp reset when a join request fails · 3ecfbe3e

由 Florian Westphal 提交于 11月 30, 2020

RFC 8684 says:
 If the token is unknown or the host wants to refuse subflow establishment
 (for example, due to a limit on the number of subflows it will permit),
 the receiver will send back a reset (RST) signal, analogous to an unknown
 port in TCP, containing an MP_TCPRST option (Section 3.6) with an
 "MPTCP specific error" reason code.

mptcp-next doesn't support MP_TCPRST yet, this can be added in another
change.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

3ecfbe3e

tcp: merge 'init_req' and 'route_req' functions · 7ea851d1

由 Florian Westphal 提交于 11月 30, 2020

The Multipath-TCP standard (RFC 8684) says that an MPTCP host should send
a TCP reset if the token in a MP_JOIN request is unknown.

At this time we don't do this, the 3whs completes and the 'new subflow'
is reset afterwards.  There are two ways to allow MPTCP to send the
reset.

1. override 'send_synack' callback and emit the rst from there.
   The drawback is that the request socket gets inserted into the
   listeners queue just to get removed again right away.

2. Send the reset from the 'route_req' function instead.
   This avoids the 'add&remove request socket', but route_req lacks the
   skb that is required to send the TCP reset.

Instead of just adding the skb to that function for MPTCP sake alone,
Paolo suggested to merge init_req and route_req functions.

This saves one indirection from syn processing path and provides the skb
to the merged function at the same time.

'send reset on unknown mptcp join token' is added in next patch.
Suggested-by: NPaolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

7ea851d1

01 12月, 2020 1 次提交

mptcp: use mptcp release_cb for delayed tasks · 6e628cd3

由 Paolo Abeni 提交于 11月 27, 2020

We have some tasks triggered by the subflow receive path
which require to access the msk socket status, specifically:
mptcp_clean_una() and mptcp_push_pending()

We have almost everything in place to defer to the msk
release_cb such tasks when the msk sock is owned.

Since the worker is no more used to clean the acked data,
for fallback sockets we need to explicitly flush them.

As an added bonus we can move the wake-up code in __mptcp_clean_una(),
simplify a lot mptcp_poll() and move the timer update under
the data lock.

The worker is now used only to process and send DATA_FIN
packets and do the mptcp-level retransmissions.
Acked-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

6e628cd3

28 11月, 2020 1 次提交

mptcp: fix NULL ptr dereference on bad MPJ · d3ab7885

由 Paolo Abeni 提交于 11月 26, 2020

If an msk listener receives an MPJ carrying an invalid token, it
will zero the request socket msk entry. That should later
cause fallback and subflow reset - as per RFC - at
subflow_syn_recv_sock() time due to failing hmac validation.

Since commit 4cf8b7e4 ("subflow: introduce and use
mptcp_can_accept_new_subflow()"), we unconditionally dereference
- in mptcp_can_accept_new_subflow - the subflow request msk
before performing hmac validation. In the above scenario we
hit a NULL ptr dereference.

Address the issue doing the hmac validation earlier.

Fixes: 4cf8b7e4 ("subflow: introduce and use mptcp_can_accept_new_subflow()")
Tested-by: NDavide Caratti <dcaratti@redhat.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/03b2cfa3ac80d8fc18272edc6442a9ddf0b1e34e.1606400227.git.pabeni@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

d3ab7885

21 11月, 2020 2 次提交

mptcp: refine MPTCP-level ack scheduling · ea4ca586

由 Paolo Abeni 提交于 11月 19, 2020

Send timely MPTCP-level ack is somewhat difficult when
the insertion into the msk receive level is performed
by the worker.

It needs TCP-level dup-ack to notify the MPTCP-level
ack_seq increase, as both the TCP-level ack seq and the
rcv window are unchanged.

We can actually avoid processing incoming data with the
worker, and let the subflow or recevmsg() send ack as needed.

When recvmsg() moves the skbs inside the msk receive queue,
the msk space is still unchanged, so tcp_cleanup_rbuf() could
end-up skipping TCP-level ack generation. Anyway, when
__mptcp_move_skbs() is invoked, a known amount of bytes is
going to be consumed soon: we update rcv wnd computation taking
them in account.

Additionally we need to explicitly trigger tcp_cleanup_rbuf()
when recvmsg() consumes a significant amount of the receive buffer.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

ea4ca586

mptcp: keep unaccepted MPC subflow into join list · 0397c6d8

由 Paolo Abeni 提交于 11月 19, 2020

This will simplify all operation dealing with subflows
before accept time (e.g. data fin processing, add_addr).

The join list is already flushed by mptcp_stream_accept()
before returning the newly created msk to the user space.

This also fixes an potential bug present into the old code:
conn_list was manipulated without helding the msk lock
in mptcp_stream_accept().
Tested-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

0397c6d8

20 11月, 2020 1 次提交

crypto: sha - split sha.h into sha1.h and sha2.h · a24d22b2

由 Eric Biggers 提交于 11月 12, 2020

Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.

This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure.  So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.

Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.

This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1.  It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.
Signed-off-by: NEric Biggers <ebiggers@google.com>
Acked-by: NArd Biesheuvel <ardb@kernel.org>
Acked-by: NJason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

a24d22b2

17 11月, 2020 2 次提交

mptcp: rework poll+nospace handling · 8edf0864

由 Florian Westphal 提交于 11月 16, 2020

MPTCP maintains a status bit, MPTCP_SEND_SPACE, that is set when at
least one subflow and the mptcp socket itself are writeable.

mptcp_poll returns EPOLLOUT if the bit is set.

mptcp_sendmsg makes sure MPTCP_SEND_SPACE gets cleared when last write
has used up all subflows or the mptcp socket wmem.

This reworks nospace handling as follows:

MPTCP_SEND_SPACE is replaced with MPTCP_NOSPACE, i.e. inverted meaning.
This bit is set when the mptcp socket is not writeable.
The mptcp-level ack path schedule will then schedule the mptcp worker
to allow it to free already-acked data (and reduce wmem usage).

This will then wake userspace processes that wait for a POLLOUT event.

sendmsg will set MPTCP_NOSPACE only when it has to wait for more
wmem (blocking I/O case).

poll path will set MPTCP_NOSPACE in case the mptcp socket is
not writeable.

Normal tcp-level notification (SOCK_NOSPACE) is only enabled
in case the subflow socket has no available wmem.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

8edf0864

mptcp: refactor shutdown and close · e16163b6

由 Paolo Abeni 提交于 11月 16, 2020

We must not close the subflows before all the MPTCP level
data, comprising the DATA_FIN has been acked at the MPTCP
level, otherwise we could be unable to retransmit as needed.

__mptcp_wr_shutdown() shutdown is responsible to check for the
correct status and close all subflows. Is called by the output
path after spooling any data and at shutdown/close time.

In a similar way, __mptcp_destroy_sock() is responsible to clean-up
the MPTCP level status, and is called when the msk transition
to TCP_CLOSE.

The protocol level close() does not force anymore the TCP_CLOSE
status, but orphan the msk socket and all the subflows.
Orphaned msk sockets are forciby closed after a timeout or
when all MPTCP-level data is acked.

There is a caveat about keeping the orphaned subflows around:
the TCP stack can asynchronusly call tcp_cleanup_ulp() on them via
tcp_close(). To prevent accessing freed memory on later MPTCP
level operations, the msk acquires a reference to each subflow
socket and prevent subflow_ulp_release() from releasing the
subflow context before __mptcp_destroy_sock().

The additional subflow references are released by __mptcp_done()
and the async ULP release is detected checking ULP ops. If such
field has been already cleared by the ULP release path, the
dangling context is freed directly by __mptcp_done().
Co-developed-by: NDavide Caratti <dcaratti@redhat.com>
Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

e16163b6

11 10月, 2020 2 次提交

mptcp: subflows garbage collection · 0e4f35d7

由 Paolo Abeni 提交于 10月 09, 2020

The msk can close MP_JOIN subflows if the initial handshake
fails. Currently such subflows are kept alive in the
conn_list until the msk itself is closed.

Beyond the wasted memory, we could end-up sending the
DATA_FIN and the DATA_FIN ack on such socket, even after a
reset.

Fixes: 43b54c6e ("mptcp: Use full MPTCP-level disconnect state machine")
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

0e4f35d7

mptcp: fix fallback for MP_JOIN subflows · d5824847

由 Paolo Abeni 提交于 10月 09, 2020

Additional/MP_JOIN subflows that do not pass some initial handshake
tests currently causes fallback to TCP. That is an RFC violation:
we should instead reset the subflow and leave the the msk untouched.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/91
Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

d5824847

09 10月, 2020 1 次提交

net: mptcp: make DACK4/DACK8 usage consistent among all subflows · 37198e93

由 Davide Caratti 提交于 10月 06, 2020

using packetdrill it's possible to observe the same MPTCP DSN being acked
by different subflows with DACK4 and DACK8. This is in contrast with what
specified in RFC8684 §3.3.2: if an MPTCP endpoint transmits a 64-bit wide
DSN, it MUST be acknowledged with a 64-bit wide DACK. Fix 'use_64bit_ack'
variable to make it a property of MPTCP sockets, not TCP subflows.

Fixes: a0c1d0ea ("mptcp: Use 32-bit DATA_ACK when possible")
Acked-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

37198e93

06 10月, 2020 1 次提交

mptcp: more DATA FIN fixes · 017512a0

由 Paolo Abeni 提交于 10月 05, 2020

Currently data fin on data packet are not handled properly:
the 'rcv_data_fin_seq' field is interpreted as the last
sequence number carrying a valid data, but for data fin
packet with valid maps we currently store map_seq + map_len,
that is, the next value.

The 'write_seq' fields carries instead the value subseguent
to the last valid byte, so in mptcp_write_data_fin() we
never detect correctly the last DSS map.

Fixes: 7279da61 ("mptcp: Use MPTCP-level flag for sending DATA_FIN")
Fixes: 1a49b2c2 ("mptcp: Handle incoming 32-bit DATA_FIN values")
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

017512a0

30 9月, 2020 1 次提交

mptcp: Handle incoming 32-bit DATA_FIN values · 1a49b2c2

由 Mat Martineau 提交于 9月 29, 2020

The peer may send a DATA_FIN mapping with either a 32-bit or 64-bit
sequence number. When a 32-bit sequence number is received for the
DATA_FIN, it must be expanded to 64 bits before comparing it to the
last acked sequence number. This expansion was missing.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/93
Fixes: 3721b9b6 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1a49b2c2

25 9月, 2020 2 次提交

mptcp: add mptcp_destroy_common helper · 5c8c1640

由 Geliang Tang 提交于 9月 24, 2020

This patch added a new helper named mptcp_destroy_common containing the
shared code between mptcp_destroy() and mptcp_sock_destruct().
Suggested-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c8c1640

mptcp: remove addr and subflow in PM netlink · b6c08380

由 Geliang Tang 提交于 9月 24, 2020

This patch implements the remove announced addr and subflow logic in PM
netlink.

When the PM netlink removes an address, we traverse all the existing msk
sockets to find the relevant sockets.

We add a new list named anno_list in mptcp_pm_data, to record all the
announced addrs. In the traversing, we check if it has been recorded.
If it has been, we trigger the RM_ADDR signal.

We also check if this address is in conn_list. If it is, we remove the
subflow which using this local address.

Since we call mptcp_pm_free_anno_list in mptcp_destroy, we need to move
__mptcp_init_sock before the mptcp_is_enabled check in mptcp_init_sock.
Suggested-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Suggested-by: NPaolo Abeni <pabeni@redhat.com>
Suggested-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b6c08380

24 9月, 2020 1 次提交

mptcp: Wake up MPTCP worker when DATA_FIN found on a TCP FIN packet · ef59b195

由 Mat Martineau 提交于 9月 21, 2020

When receiving a DATA_FIN MPTCP option on a TCP FIN packet, the DATA_FIN
information would be stored but the MPTCP worker did not get
scheduled. In turn, the MPTCP socket state would remain in
TCP_ESTABLISHED and no blocked operations would be awakened.

TCP FIN packets are seen by the MPTCP socket when moving skbs out of the
subflow receive queues, so schedule the MPTCP worker when a skb with
DATA_FIN but no data payload is moved from a subflow queue. Other cases
(DATA_FIN on a bare TCP ACK or on a packet with data payload) are
already handled.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/84
Fixes: 43b54c6e ("mptcp: Use full MPTCP-level disconnect state machine")
Acked-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ef59b195

18 9月, 2020 1 次提交

mptcp: fix integer overflow in mptcp_subflow_discard_data() · 1d39cd8c

由 Paolo Abeni 提交于 9月 17, 2020

Christoph reported an infinite loop in the subflow receive path
under stress condition.

If there are multiple subflows, each of them using a large send
buffer, the delta between the sequence number used by
MPTCP-level retransmission can and the current msk->ack_seq
can be greater than MAX_INT.

In the above scenario, when calling mptcp_subflow_discard_data(),
such delta will be truncated to int, and could result in a negative
number: no bytes will be dropped, and subflow_check_data_avail()
will try again to process the same packet, looping forever.

This change addresses the issue by expanding the 'limit' size to 64
bits, so that overflows are not possible anymore.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/87
Fixes: 6719331c ("mptcp: trigger msk processing even for OoO data")
Reported-and-tested-by: NChristoph Paasch <cpaasch@apple.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d39cd8c

15 9月, 2020 2 次提交

mptcp: call tcp_cleanup_rbuf on subflows · c76c6956

由 Paolo Abeni 提交于 9月 14, 2020

That is needed to let the subflows announce promptly when new
space is available in the receive buffer.

tcp_cleanup_rbuf() is currently a static function, drop the
scope modifier and add a declaration in the TCP header.
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c76c6956

mptcp: allow creating non-backup subflows · 4596a2c1

由 Paolo Abeni 提交于 9月 14, 2020

Currently the 'backup' attribute of local endpoint
is ignored. Let's use it for the MP_JOIN handshake
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4596a2c1

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功