- 17 11月, 2020 15 次提交
-
-
由 Paolo Abeni 提交于
When the worker moves some bytes from the OoO queue into the receive queue, the msk->ask_seq is updated, the MPTCP-level ack carrying that value needs to wait the next ingress packet, possibly slowing down or hanging the peer Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Florian Westphal 提交于
Before sending 'x' new bytes also check that the new snd_una would be within the permitted receive window. For every ACK that also contains a DSS ack, check whether its tcp-level receive window would advance the current mptcp window right edge and update it if so. Signed-off-by: NFlorian Westphal <fw@strlen.de> Co-developed-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Florian Westphal 提交于
MPTCP maintains a status bit, MPTCP_SEND_SPACE, that is set when at least one subflow and the mptcp socket itself are writeable. mptcp_poll returns EPOLLOUT if the bit is set. mptcp_sendmsg makes sure MPTCP_SEND_SPACE gets cleared when last write has used up all subflows or the mptcp socket wmem. This reworks nospace handling as follows: MPTCP_SEND_SPACE is replaced with MPTCP_NOSPACE, i.e. inverted meaning. This bit is set when the mptcp socket is not writeable. The mptcp-level ack path schedule will then schedule the mptcp worker to allow it to free already-acked data (and reduce wmem usage). This will then wake userspace processes that wait for a POLLOUT event. sendmsg will set MPTCP_NOSPACE only when it has to wait for more wmem (blocking I/O case). poll path will set MPTCP_NOSPACE in case the mptcp socket is not writeable. Normal tcp-level notification (SOCK_NOSPACE) is only enabled in case the subflow socket has no available wmem. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
After the previous patch we may end-up with unsent data in the write buffer. If such buffer is full, the writer will block for unlimited time. We need to trigger the MPTCP xmit path even for the subflow rx path, on MPTCP snd_una updates. Keep things simple and just schedule the work queue if needed. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
mptcp_sendmsg() is refactored so that first it copies the data provided from user space into the send queue, and then tries to spool the send queue via sendmsg_frag. There a subtle change in the mptcp level collapsing on consecutive data fragment: we now allow that only on unsent data. The latter don't need to deal with msghdr data anymore and can be simplified in a relevant way. snd_nxt and write_seq are now tracked independently. Overall this allows some relevant cleanup and will allow sending pending mptcp data on msk una update in later patch. Co-developed-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
We must not close the subflows before all the MPTCP level data, comprising the DATA_FIN has been acked at the MPTCP level, otherwise we could be unable to retransmit as needed. __mptcp_wr_shutdown() shutdown is responsible to check for the correct status and close all subflows. Is called by the output path after spooling any data and at shutdown/close time. In a similar way, __mptcp_destroy_sock() is responsible to clean-up the MPTCP level status, and is called when the msk transition to TCP_CLOSE. The protocol level close() does not force anymore the TCP_CLOSE status, but orphan the msk socket and all the subflows. Orphaned msk sockets are forciby closed after a timeout or when all MPTCP-level data is acked. There is a caveat about keeping the orphaned subflows around: the TCP stack can asynchronusly call tcp_cleanup_ulp() on them via tcp_close(). To prevent accessing freed memory on later MPTCP level operations, the msk acquires a reference to each subflow socket and prevent subflow_ulp_release() from releasing the subflow context before __mptcp_destroy_sock(). The additional subflow references are released by __mptcp_done() and the async ULP release is detected checking ULP ops. If such field has been already cleared by the ULP release path, the dangling context is freed directly by __mptcp_done(). Co-developed-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
Track the next MPTCP sequence number used on xmit, currently always equal to write_next. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
Preparation patch to track the data pending in the msk write queue. No functional change introduced here Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
The current argument list is pretty long and quite unreadable, move many of them into a specific struct. Later patches will add more stuff to such struct. Additionally drop the 'timeo' argument, now unused. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
remove some of code duplications an allow preventing rescheduling on close. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
unlocked version of protocol level close, will be used by MPTCP to allow decouple orphaning and subflow level close. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
mptcp_push_pending() is called even on orphaned msk (and orphaned subflows), if there is outstanding data at close() time. To cope with the above MPTCP needs to handle explicitly the allocation failure on xmit. The newly introduced do_tcp_sendfrag() allows that, just plug it. We can additionally drop a couple of sanity checks, duplicate in the TCP code. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Paolo Abeni 提交于
Will be needed by the next patch, as MPTCP needs to handle directly the error/memory-allocation-needed path. No functional changes intended. Additionally let MPTCP code access the tcp_remove_empty_skb() helper. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Francis Laniel 提交于
Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the new name of this function. Signed-off-by: NFrancis Laniel <laniel_francis@privacyrequired.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Francis Laniel 提交于
nla_strlcpy now returns -E2BIG if src was truncated when written to dst. It also returns this error value if dstsize is 0 or higher than INT_MAX. For example, if src is "foo\0" and dst is 3 bytes long, the result will be: 1. "foG" after memcpy (G means garbage). 2. "fo\0" after memset. 3. -E2BIG is returned because src was not completely written into dst. The callers of nla_strlcpy were modified to take into account this modification. Signed-off-by: NFrancis Laniel <laniel_francis@privacyrequired.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 15 11月, 2020 7 次提交
-
-
由 Lev Stipakov 提交于
Commit d3fd6548 ("net: core: add dev_sw_netstats_tx_add") has added function "dev_sw_netstats_tx_add()" to update net device per-cpu TX stats. Use this function instead of own code. While on it, remove xfrmi_get_stats64() and replace it with dev_get_tstats64(). Signed-off-by: NLev Stipakov <lev@openvpn.net> Reviewed-by: NHeiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20201113215939.147007-1-lev@openvpn.netSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Lev Stipakov 提交于
Commit d3fd6548 ("net: core: add dev_sw_netstats_tx_add") has added function "dev_sw_netstats_tx_add()" to update net device per-cpu TX stats. Use this function instead of own code. While on it, remove internal_get_stats() and replace it with dev_get_tstats64(). Signed-off-by: NLev Stipakov <lev@openvpn.net> Reviewed-by: NHeiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20201113215336.145998-1-lev@openvpn.netSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Shi 提交于
We don't use the parameter result actually, so better to remove it and skip a gcc warning for unused variable. Signed-off-by: NAlex Shi <alex.shi@linux.alibaba.com> Link: https://lore.kernel.org/r/1605239517-49707-1-git-send-email-alex.shi@linux.alibaba.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
These functions do not need to be exported. Signed-off-by: NEric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20201113113553.3411756-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
Both IPv4 and IPv6 needs it via a function pointer. Following patch will avoid the indirect call. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Oliver Herms 提交于
This patch adds an IPv4 routes encapsulation attribute to the result of netlink RTM_GETROUTE requests (e.g. ip route get 192.0.2.1). Signed-off-by: NOliver Herms <oliver.peter.herms@gmail.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20201113085517.GA1307262@twsSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Lukas Bulwahn 提交于
Commit bdb7cc64 ("ipv6: Count interface receive statistics on the ingress netdev") removed all callees for ipv6_skb_idev(). Hence, since then, ipv6_skb_idev() is unused and make CC=clang W=1 warns: net/ipv6/exthdrs.c:909:33: warning: unused function 'ipv6_skb_idev' [-Wunused-function] So, remove this unused function and a -Wunused-function warning. Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: NNathan Chancellor <natechancellor@gmail.com> Link: https://lore.kernel.org/r/20201113135012.32499-1-lukas.bulwahn@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 14 11月, 2020 3 次提交
-
-
由 Lorenzo Bianconi 提交于
Introduce the capability to batch page_pool ptr_ring refill since it is usually run inside the driver NAPI tx completion loop. Suggested-by: NJesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/bpf/08dd249c9522c001313f520796faa777c4089e1c.1605267335.git.lorenzo@kernel.org
-
由 Lorenzo Bianconi 提交于
XDP bulk APIs introduce a defer/flush mechanism to return pages belonging to the same xdp_mem_allocator object (identified via the mem.id field) in bulk to optimize I-cache and D-cache since xdp_return_frame is usually run inside the driver NAPI tx completion loop. The bulk queue size is set to 16 to be aligned to how XDP_REDIRECT bulking works. The bulk is flushed when it is full or when mem.id changes. xdp_frame_bulk is usually stored/allocated on the function call-stack to avoid locking penalties. Current implementation considers only page_pool memory model. Suggested-by: NJesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NIlias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/bpf/e190c03eac71b20c8407ae0fc2c399eda7835f49.1605267335.git.lorenzo@kernel.org
-
由 Wenlin Kang 提交于
Replace strncpy() with strscpy(), fixes the following warning: In function 'bearer_name_validate', inlined from 'tipc_enable_bearer' at net/tipc/bearer.c:246:7: net/tipc/bearer.c:141:2: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation] strncpy(name_copy, name, TIPC_MAX_BEARER_NAME); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: NWenlin Kang <wenlin.kang@windriver.com> Acked-by: NYing Xue <ying.xue@windriver.com> Link: https://lore.kernel.org/r/20201112093442.8132-1-wenlin.kang@windriver.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 13 11月, 2020 8 次提交
-
-
由 Martin KaFai Lau 提交于
This patch enables the FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage_(get|delete) helper, so those tracing programs can access the sk's bpf_local_storage and the later selftest will show some examples. The bpf_sk_storage is currently used in bpf-tcp-cc, tc, cg sockops...etc which is running either in softirq or task context. This patch adds bpf_sk_storage_get_tracing_proto and bpf_sk_storage_delete_tracing_proto. They will check in runtime that the helpers can only be called when serving softirq or running in a task context. That should enable most common tracing use cases on sk. During the load time, the new tracing_allowed() function will ensure the tracing prog using the bpf_sk_storage_(get|delete) helper is not tracing any bpf_sk_storage*() function itself. The sk is passed as "void *" when calling into bpf_local_storage. This patch only allows tracing a kernel function. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NSong Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201112211313.2587383-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
Rename some of the functions currently prefixed with sk_storage to bpf_sk_storage. That will make the next patch have fewer prefix check and also bring the bpf_sk_storage.c to a more consistent function naming. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NKP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20201112211307.2587021-1-kafai@fb.com
-
由 Martin KaFai Lau 提交于
sk_storage_charge() is the only user of omem_charge(). This patch simplifies it by folding omem_charge() into sk_storage_charge(). Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NSong Liu <songliubraving@fb.com> Acked-by: NKP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20201112211301.2586255-1-kafai@fb.com
-
由 Thierry Reding 提交于
When dumping the name and NTP servers advertised by DHCP, a blank line is emitted if either of the lists is empty. This can lead to confusing issues such as the blank line getting flagged as warning. This happens because the blank line is the result of pr_cont("\n") and that may see its level corrupted by some other driver concurrently writing to the console. Fix this by making sure that the terminating newline is only emitted if at least one entry in the lists was printed before. Reported-by: NJon Hunter <jonathanh@nvidia.com> Signed-off-by: NThierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20201110073757.1284594-1-thierry.reding@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Menglong Dong 提交于
The initialization for 'err' with '-ENOSYS' is redundant and can be removed, as it is updated soon and not used. Changes since v1: - Move the err declaration below struct sock *sk Signed-off-by: NMenglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/5faa01d5.1c69fb81.8451c.cb5b@mx.google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alexander Lobakin 提交于
udp{4,6}_lib_lookup_skb() use ip{,v6}_hdr() to get IP header of the packet. While it's probably OK for non-frag0 paths, this helpers will also point to junk on Fast/frag0 GRO when all headers are located in frags. As a result, sk/skb lookup may fail or give wrong results. To support both GRO modes, skb_gro_network_header() might be used. To not modify original functions, add private versions of udp{4,6}_lib_lookup_skb() only to perform correct sk lookups on GRO. Present since the introduction of "application-level" UDP GRO in 4.7-rc1. Misc: replace totally unneeded ternaries with plain ifs. Fixes: a6024562 ("udp: Add GRO functions to UDP socket") Suggested-by: NWillem de Bruijn <willemb@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alexander Lobakin 提交于
UDP GRO uses udp_hdr(skb) in its .gro_receive() callback. While it's probably OK for non-frag0 paths (when all headers or even the entire frame are already in skb head), this inline points to junk when using Fast GRO (napi_gro_frags() or napi_gro_receive() with only Ethernet header in skb head and all the rest in the frags) and breaks GRO packet compilation and the packet flow itself. To support both modes, skb_gro_header_fast() + skb_gro_header_slow() are typically used. UDP even has an inline helper that makes use of them, udp_gro_udphdr(). Use that instead of troublemaking udp_hdr() to get rid of the out-of-order delivers. Present since the introduction of plain UDP GRO in 5.0-rc1. Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NAlexander Lobakin <alobakin@pm.me> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Parav Pandit 提交于
Cited commit in fixes tag overwrites the port attributes for the registered port. Avoid such error by checking registered flag before setting attributes. Fixes: 71ad8d55 ("devlink: Replace devlink_port_attrs_set parameters with a struct") Signed-off-by: NParav Pandit <parav@nvidia.com> Reviewed-by: NJiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20201111034744.35554-1-parav@nvidia.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 12 11月, 2020 7 次提交
-
-
由 Geliang Tang 提交于
Fix the following Smatch complaint: net/mptcp/pm_netlink.c:213 mptcp_pm_add_timer() warn: variable dereferenced before check 'msk' (see line 208) net/mptcp/pm_netlink.c 207 struct mptcp_sock *msk = entry->sock; 208 struct sock *sk = (struct sock *)msk; 209 struct net *net = sock_net(sk); ^^ "msk" dereferenced here. 210 211 pr_debug("msk=%p", msk); 212 213 if (!msk) ^^^^ Too late. 214 return; 215 Fixes: 93f323b9 ("mptcp: add a new sysctl add_addr_timeout") Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NGeliang Tang <geliangtang@gmail.com> Link: https://lore.kernel.org/r/078a2ef5bdc4e3b2c25ef852461692001f426495.1604976945.git.geliangtang@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Ido Schimmel 提交于
Be more consistent about the way in which the nexthop flags are set and set them in one go. Suggested-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20201110102553.1924232-1-idosch@idosch.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vincent Bernat 提交于
The disable_policy and disable_xfrm are a per-interface sysctl to disable IPsec policy or encryption on an interface. However, while a "all" variant is exposed, it was a noop since it was never evaluated. We use the usual "or" logic for this kind of sysctls. Signed-off-by: NVincent Bernat <vincent@bernat.ch> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Loic Poulain 提交于
Distant QRTR nodes can be accessed via an other node that acts as a bridge. When the a QRTR endpoint associated to a bridge node is released, all the linked distant nodes should also be released. This patch fixes endpoint release by: - Submitting QRTR BYE message locally on behalf of all the nodes accessible through the endpoint. - Removing all the routable node IDs from radix tree pointing to the released node endpoint. Signed-off-by: NLoic Poulain <loic.poulain@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Loic Poulain 提交于
This will be requested for allocating control packet in atomic context. Signed-off-by: NLoic Poulain <loic.poulain@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Loic Poulain 提交于
In order to reach non-immediate remote node services that are accessed through an intermediate node, the route to the remote node needs to be saved. E.g for a [node1 <=> node2 <=> node3] network - node2 forwards node3 service to node1 - node1 must save node2 as route for reaching node3 Signed-off-by: NLoic Poulain <loic.poulain@linaro.org> Reviewed-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Loic Poulain 提交于
A remote endpoint (immediate neighbors node) can forward services from other nodes (non-immadiate), in that case ctrl packet node ID (offering distant service) can differ from the qrtr source node (forwarding the packet). Signed-off-by: NLoic Poulain <loic.poulain@linaro.org> Reviewed-by: NBjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: NManivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-