- 12 4月, 2021 4 次提交
-
-
由 Cong Wang 提交于
The last refcnt of the psock can be gone right after sock_map_remove_links(), so sk_psock_stop() could trigger a UAF. The reason why I placed sk_psock_stop() there is to avoid RCU read critical section, and more importantly, some callee of sock_map_remove_links() is supposed to be called with RCU read lock, we can not simply get rid of RCU read lock here. Therefore, the only choice we have is to grab an additional refcnt with sk_psock_get() and put it back after sk_psock_stop(). Fixes: 799aa7f9 ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Reported-by: syzbot+7b6548ae483d6f4c64ae@syzkaller.appspotmail.com Signed-off-by: NCong Wang <cong.wang@bytedance.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NJakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210408030556.45134-1-xiyou.wangcong@gmail.com
-
由 Cong Wang 提交于
Using sk_psock() to retrieve psock pointer from sock requires RCU read lock, but we already get psock pointer before calling ->psock_update_sk_prot() in both cases, so we can just pass it without bothering sk_psock(). Fixes: 8a59f9d1 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") Reported-by: syzbot+320a3bc8d80f478c37e4@syzkaller.appspotmail.com Signed-off-by: NCong Wang <cong.wang@bytedance.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+320a3bc8d80f478c37e4@syzkaller.appspotmail.com Reviewed-by: NJakub Sitnicki <jakub@cloudflare.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210407032111.33398-1-xiyou.wangcong@gmail.com
-
由 Daniel Borkmann 提交于
Synchronize tools/include/uapi/linux/bpf.h which was missing changes from various commits: - f3c45326 ("bpf: Document PROG_TEST_RUN limitations") - e5e35e75 ("bpf: BPF-helper for MTU checking add length input") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Joe Stringer 提交于
Per net/bpf/test_run.c, particular prog types have additional restrictions around the parameters that can be provided, so document these in the header. I didn't bother documenting the limitation on duration for raw tracepoints since that's an output parameter anyway. Tested with ./tools/testing/selftests/bpf/test_doc_build.sh. Suggested-by: NYonghong Song <yhs@fb.com> Signed-off-by: NJoe Stringer <joe@cilium.io> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYonghong Song <yhs@fb.com> Acked-by: NLorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210410174549.816482-1-joe@cilium.io
-
- 09 4月, 2021 10 次提交
-
-
由 Andrii Nakryiko 提交于
Yauheni Kaliuta says: ==================== A set of fixes for selftests to make them working on systems with PAGE_SIZE > 4K + cleanup (version) and ringbuf_multi extention. --- v3->v4: - zero initialize BPF programs' static variables; - add bpf_map__inner_map to libbpf.map in alphabetical order; - add bpf_map__set_inner_map_fd test to ringbuf_multi; v2->v3: - reorder: move version removing patch first to keep main patches in one group; - rename "selftests/bpf: pass page size from userspace in sockopt_sk" as suggested; - convert sockopt_sk test to use ASSERT macros; - set page size from userspace - split patches to pairs userspace/bpf. It's easier to check that every conversion works as expected; v1->v2: - add missed 'selftests/bpf: test_progs/sockopt_sk: Convert to use BPF skeleton' ==================== Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
-
由 Yauheni Kaliuta 提交于
Test map__set_inner_map_fd() interaction with map-in-map initialization. Use hashmap of maps just to make it different to existing array of maps. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-9-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
Set bpf table sizes dynamically according to the runtime page size value. Do not switch to ASSERT macros, keep CHECK, for consistency with the rest of the test. Can be a separate cleanup patch. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-8-yauheni.kaliuta@redhat.com
-
由 Andrii Nakryiko 提交于
The API gives access to inner map for map in map types (array or hash of map). It will be used to dynamically set max_entries in it. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-7-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
Replace hardcoded 4096 with runtime value in the userspace part of the test and set bpf table sizes dynamically according to the value. Do not switch to ASSERT macros, keep CHECK, for consistency with the rest of the test. Can be a separate cleanup patch. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-6-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
Replace hardcoded 4096 with runtime value in the userspace part of the test and set bpf table sizes dynamically according to the value. Do not switch to ASSERT macros, keep CHECK, for consistency with the rest of the test. Can be a separate cleanup patch. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-5-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
Use ASSERT to check result but keep CHECK where format was used to report error. Use bpf_map__set_max_entries() to set map size dynamically from userspace according to page size. Zero-initialize the variable in bpf prog, otherwise it will cause problems on some versions of Clang. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-4-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
Since there is no convenient way for bpf program to get PAGE_SIZE from inside of the kernel, pass the value from userspace. Zero-initialize the variable in bpf prog, otherwise it will cause problems on some versions of Clang. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-3-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
Switch the test to use BPF skeleton to save some boilerplate and make it easy to access bpf program bss segment. The latter will be used to pass PAGE_SIZE from userspace since there is no convenient way for bpf program to get it from inside of the kernel. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-2-yauheni.kaliuta@redhat.com
-
由 Yauheni Kaliuta 提交于
As pointed by Andrii Nakryiko, _version is useless now, remove it. Signed-off-by: NYauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-1-yauheni.kaliuta@redhat.com
-
- 07 4月, 2021 2 次提交
-
-
由 Muhammad Usama Anjum 提交于
bpf_preload_lock is already defined with DEFINE_MUTEX(). There is no need to initialize it again. Remove the extraneous initialization. Signed-off-by: NMuhammad Usama Anjum <musamaanjum@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210405194904.GA148013@LEGION
-
由 Cong Wang 提交于
These comments in udp_bpf_update_proto() are copied from the original TCP code and apparently do not apply to UDP. Just remove them. Reported-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NCong Wang <cong.wang@bytedance.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210403052715.13854-1-xiyou.wangcong@gmail.com
-
- 05 4月, 2021 1 次提交
-
-
由 Hengqi Chen 提交于
Add missing ')' for KERNEL_VERSION macro. Signed-off-by: NHengqi Chen <hengqi.chen@gmail.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210405040119.802188-1-hengqi.chen@gmail.com
-
- 04 4月, 2021 1 次提交
-
-
由 Martin KaFai Lau 提交于
The tracing test and the recent kfunc call test require CONFIG_DYNAMIC_FTRACE. This patch adds it to the config file. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210403002921.3419721-1-kafai@fb.com
-
- 03 4月, 2021 22 次提交
-
-
由 Yang Yingliang 提交于
Remove redundant semi-colon in finalize_btf_ext(). Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210402012634.1965453-1-yangyingliang@huawei.com
-
由 Wan Jiabing 提交于
struct btf_type is declared twice. One is declared at 35th line. The below one is not needed, hence remove the duplicate. Signed-off-by: NWan Jiabing <wanjiabing@vivo.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NSong Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210401072037.995849-1-wanjiabing@vivo.com
-
由 Wan Jiabing 提交于
struct bpf_prog is declared twice. There is one declaration which is independent on the macro at 18th line. So the below one is not needed though. Remove the duplicate. Signed-off-by: NWan Jiabing <wanjiabing@vivo.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NSong Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210401064637.993327-1-wanjiabing@vivo.com
-
由 He Fengqing 提交于
'stack' parameter is not used in ___bpf_prog_run() after f696b8f4 ("bpf: split bpf core interpreter"), the base address have been set to FP reg. So consequently remove it. Signed-off-by: NHe Fengqing <hefengqing@huawei.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NSong Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210331075135.3850782-1-hefengqing@huawei.com
-
由 John Fastabend 提交于
With a relatively recent clang master branch test_map skips a section, libbpf: elf: skipping unrecognized data section(5) .rodata.str1.1 the cause is some pointless strings from bpf_printks in the BPF program loaded during testing. After just removing the prints to fix above error Daniel points out the program is a bit pointless and could be simply the empty program returning SK_PASS. Here we do just that and return simply SK_PASS. This program is used with test_maps selftests to test insert/remove of a program into the sockmap and sockhash maps. Its not testing actual functionality of the TCP sockmap programs, these are tested from test_sockmap. So we shouldn't lose in test coverage and fix above warnings. This original test was added before test_sockmap existed and has been copied around ever since, clean it up now. Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NSong Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/161731595664.74613.1603087410166945302.stgit@john-XPS-13-9370
-
由 Eric Dumazet 提交于
Group all the often used fields in the first cache line, to reduce cache line misses. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Order fields to increase locality for most used protocols. udplite and icmp are moved at the end. Same for proc_net_devsnmp6 which is not used in fast path. This potentially saves one cache line miss for typical TCP/UDP over IPv4/IPv6. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
If the "type_a->nfcid_len" is too large then it would lead to memory corruption in pn533_target_found_type_a() when we do: memcpy(nfc_tgt->nfcid1, tgt_type_a->nfcid_data, nfc_tgt->nfcid1_len); Fixes: c3b1e1e8 ("NFC: Export NFCID1 from pn533") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Ioana Ciornei says: ==================== dpaa2-eth: add rx copybreak support DMA unmapping, allocating a new buffer and DMA mapping it back on the refill path is really not that efficient. Proper buffer recycling (page pool, flipping the page and using the other half) cannot be done for DPAA2 since it's not a ring based controller but it rather deals with multiple queues which all get their buffers from the same buffer pool on Rx. To circumvent these limitations, add support for Rx copybreak in dpaa2-eth. Below you can find a summary of the tests that were run to end up with the default rx copybreak value of 512. A bit about the setup - a LS2088A SoC, 8 x Cortex A72 @ 1.8GHz, IPfwd zero loss test @ 20Gbit/s throughput. I tested multiple frame sizes to get an idea where is the break even point. Here are 2 sets of results, (1) is the baseline and (2) is just allocating a new skb for all frames sizes received (as if the copybreak was even to the MTU). All numbers are in Mpps. 64 128 256 512 640 768 896 (1) 3.23 3.23 3.24 3.21 3.1 2.76 2.71 (2) 3.95 3.88 3.79 3.62 3.3 3.02 2.65 It seems that even for 512 bytes frame sizes it's comfortably better when allocating a new skb. After that, we see diminishing rewards or even worse. Changes in v2: - properly marked dpaa2_eth_copybreak as static ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ioana Ciornei 提交于
It's useful, especially for debugging purposes, to have the Rx copybreak value changeable at runtime. Export it as an ethtool tunable. Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ioana Ciornei 提交于
DMA unmapping, allocating a new buffer and DMA mapping it back on the refill path is really not that efficient. Proper buffer recycling (page pool, flipping the page and using the other half) cannot be done for DPAA2 since it's not a ring based controller but it rather deals with multiple queues which all get their buffers from the same buffer pool on Rx. To circumvent these limitations, add support for Rx copybreak. For small sized packets instead of creating a skb around the buffer in which the frame was received, allocate a new sk buffer altogether, copy the contents of the frame and release the initial page back into the buffer pool. Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ioana Ciornei 提交于
Rename the dpaa2_eth_xdp_release_buf function into dpaa2_eth_recycle_buf since in the next patches we'll be using the same recycle mechanism for the normal stack path beside for XDP_DROP. Also, rename the array which holds the buffers to be recycled so that it does not have any reference to XDP. Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Mat Martineau says: ==================== MPTCP: Miscellaneous changes Here is a collection of patches from the MPTCP tree: Patches 1 and 2 add some helpful MIB counters for connection information. Patch 3 cleans up some unnecessary checks. Patch 4 is a new feature, support for the MP_TCPRST option. This option is used when resetting one subflow within a MPTCP connection, and provides a reason code that the recipient can use when deciding how to adapt to the lost subflow. Patches 5-7 update the existing MPTCP selftests to improve timeout handling and to share better information when tests fail. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matthieu Baerts 提交于
Very occasionally, MPTCP selftests fail. Yeah, I saw that at least once! Here we provide more details in case of errors with mptcp_join.sh script like it was done with mptcp_connect.sh, see commit 767389c8 ("selftests: mptcp: dump more info on errors") Suggested-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matthieu Baerts 提交于
Not to be impacted by packets sent between sub-tests. Signed-off-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matthieu Baerts 提交于
'mptcp_connect' already has a timeout for poll() but in some cases, it is not enough. With "timeout" tool, we will force the command to fail if it doesn't finish on time. Thanks to that, the script will continue and display details about the current state before marking the test as failed. Displaying this state is very important to be able to understand the issue. Best to have our CI reporting the issue than just "the test hanged". Note that in mptcp_connect.sh, we were using a long timeout to validate the fact we cannot create a socket if a sysctl is set. We don't need this timeout. In diag.sh, we want to send signals to mptcp_connect instances that have been started in the netns. But we cannot send this signal to 'timeout' otherwise that will stop the timeout and messages telling us SIGUSR1 has been received will be printed. Instead of trying to find the right PID and storing them in an array, we can simply use the output of 'ip netns pids' which is all the PIDs we want to send signal to. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/160Signed-off-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
The MPTCP reset option allows to carry a mptcp-specific error code that provides more information on the nature of a connection reset. Reset option data received gets stored in the subflow context so it can be sent to userspace via the 'subflow closed' netlink event. When a subflow is closed, the desired error code that should be sent to the peer is also placed in the subflow context structure. If a reset is sent before subflow establishment could complete, e.g. on HMAC failure during an MP_JOIN operation, the mptcp skb extension is used to store the reset information. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
Currently we explicitly check for the first subflow being NULL in a couple of places, even if we don't need any special actions in such scenario. Just drop the unneeded checks, to avoid confusion. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
We are not currently tracking the active MPTCP connection attempts. Let's add the related counters. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
If the MPTCP protocol is unable to create a new token, the socket fallback to plain TCP, let's keep track of such events via a specific MIB. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Shannon Nelson says: ==================== ionic: add PTP and hw clock support This patchset adds support for accessing the DSC hardware clock and for offloading PTP timestamping. Tx packet timestamping happens through a separate Tx queue set up with expanded completion descriptors that can report the timestamp. Rx timestamping can happen either on all queues, or on a separate timestamping queue when specific filtering is requested. Again, the timestamps are reported with the expanded completion descriptors. The timestamping offload ability is advertised but not enabled until an OS service asks for it. At that time the driver's queues are reconfigured to use the different completion descriptors and the private processing queues as needed. Reading the raw clock value comes through a new pair of values in the device info registers in BAR0. These high and low values are interpreted with help from new clock mask, mult, and shift values in the device identity information. First we add the ability to detect new queue features, then the handling of the new descriptor sizes. After adding the new interface structures, we start adding the support code, saving the advertising to the stack for last. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shannon Nelson 提交于
Let the network stack know we've got support for timestamping the packets. Signed-off-by: NAllen Hubbe <allenbh@pensando.io> Signed-off-by: NShannon Nelson <snelson@pensando.io> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-