- 29 6月, 2022 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. This code was transformed with the help of Coccinelle: (linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch) @@ identifier S, member, array; type T1, T2; @@ struct S { ... T1 member; T2 array[ - 0 ]; }; -fstrict-flex-arrays=3 is coming and we need to land these changes to prevent issues like these in the short future: ../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source] strcpy(de3->name, "."); ^ Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If this breaks anything, we can use a union with a new member name. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78Build-tested-by: Nkernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%25lkp@intel.com/ Acked-by: Dan Williams <dan.j.williams@intel.com> # For ndctl.h Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 24 5月, 2022 5 次提交
-
-
由 Joanne Koong 提交于
This patch adds a new helper function void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len); which returns a pointer to the underlying data of a dynptr. *len* must be a statically known value. The bpf program may access the returned data slice as a normal buffer (eg can do direct reads and writes), since the verifier associates the length with the returned pointer, and enforces that no out of bounds accesses occur. Signed-off-by: NJoanne Koong <joannelkoong@gmail.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-6-joannelkoong@gmail.com
-
由 Joanne Koong 提交于
This patch adds two helper functions, bpf_dynptr_read and bpf_dynptr_write: long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset); long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len); The dynptr passed into these functions must be valid dynptrs that have been initialized. Signed-off-by: NJoanne Koong <joannelkoong@gmail.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523210712.3641569-5-joannelkoong@gmail.com
-
由 Joanne Koong 提交于
Currently, our only way of writing dynamically-sized data into a ring buffer is through bpf_ringbuf_output but this incurs an extra memcpy cost. bpf_ringbuf_reserve + bpf_ringbuf_commit avoids this extra memcpy, but it can only safely support reservation sizes that are statically known since the verifier cannot guarantee that the bpf program won’t access memory outside the reserved space. The bpf_dynptr abstraction allows for dynamically-sized ring buffer reservations without the extra memcpy. There are 3 new APIs: long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr); void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags); void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags); These closely follow the functionalities of the original ringbuf APIs. For example, all ringbuffer dynptrs that have been reserved must be either submitted or discarded before the program exits. Signed-off-by: NJoanne Koong <joannelkoong@gmail.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NDavid Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com
-
由 Joanne Koong 提交于
This patch adds a new api bpf_dynptr_from_mem: long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr); which initializes a dynptr to point to a bpf program's local memory. For now only local memory that is of reg type PTR_TO_MAP_VALUE is supported. Signed-off-by: NJoanne Koong <joannelkoong@gmail.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523210712.3641569-3-joannelkoong@gmail.com
-
由 Joanne Koong 提交于
This patch adds the bulk of the verifier work for supporting dynamic pointers (dynptrs) in bpf. A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure defined internally as: struct bpf_dynptr_kern { void *data; u32 size; u32 offset; } __aligned(8); The upper 8 bits of *size* is reserved (it contains extra metadata about read-only status and dynptr type). Consequently, a dynptr only supports memory less than 16 MB. There are different types of dynptrs (eg malloc, ringbuf, ...). In this patchset, the most basic one, dynptrs to a bpf program's local memory, is added. For now only local memory that is of reg type PTR_TO_MAP_VALUE is supported. In the verifier, dynptr state information will be tracked in stack slots. When the program passes in an uninitialized dynptr (ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots corresponding to the frame pointer where the dynptr resides at are marked STACK_DYNPTR. For helper functions that take in initialized dynptrs (eg bpf_dynptr_read + bpf_dynptr_write which are added later in this patchset), the verifier enforces that the dynptr has been initialized properly by checking that their corresponding stack slots have been marked as STACK_DYNPTR. The 6th patch in this patchset adds test cases that the verifier should successfully reject, such as for example attempting to use a dynptr after doing a direct write into it inside the bpf program. Signed-off-by: NJoanne Koong <joannelkoong@gmail.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NDavid Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com
-
- 21 5月, 2022 1 次提交
-
-
由 Geliang Tang 提交于
This patch implements a new struct bpf_func_proto, named bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP, and a new helper bpf_skc_to_mptcp_sock(), which invokes another new helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct mptcp_sock from a given subflow socket. v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c, remove build check for CONFIG_BPF_JIT v5: Drop EXPORT_SYMBOL (Martin) Co-developed-by: NNicolas Rybowski <nicolas.rybowski@tessares.net> Co-developed-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NNicolas Rybowski <nicolas.rybowski@tessares.net> Signed-off-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NGeliang Tang <geliang.tang@suse.com> Signed-off-by: NMat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com
-
- 16 5月, 2022 1 次提交
-
-
由 Eric Dumazet 提交于
New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS are used to report to user-space the device TSO limits. ip -d link sh dev eth1 ... tso_max_size 65536 tso_max_segs 65535 Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NAlexander Duyck <alexanderduyck@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 5月, 2022 1 次提交
-
-
由 Feng Zhou 提交于
Add new ebpf helpers bpf_map_lookup_percpu_elem. The implementation method is relatively simple, refer to the implementation method of map_lookup_elem of percpu map, increase the parameters of cpu, and obtain it according to the specified cpu. Signed-off-by: NFeng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 11 5月, 2022 3 次提交
-
-
由 Kui-Feng Lee 提交于
Pass a cookie along with BPF_LINK_CREATE requests. Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie. The cookie of a bpf_tracing_link is available by calling bpf_get_attach_cookie when running the BPF program of the attached link. The value of a cookie will be set at bpf_tramp_run_ctx by the trampoline of the link. Signed-off-by: NKui-Feng Lee <kuifeng@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com
-
由 Kui-Feng Lee 提交于
Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect struct bpf_tramp_link(s) for a trampoline. struct bpf_tramp_link extends bpf_link to act as a linked list node. arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to collects all bpf_tramp_link(s) that a trampoline should call. Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links instead of bpf_tramp_progs. Signed-off-by: NKui-Feng Lee <kuifeng@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com
-
由 Kaixi Fan 提交于
Add tunnel source ip field in "struct bpf_tunnel_key". Add related code to set and get tunnel source field. Signed-off-by: NKaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 09 5月, 2022 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To pick the changes in: d495f942 ("KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YnE5BIweGmCkpOTN@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 4月, 2022 1 次提交
-
-
由 Kumar Kartikeya Dwivedi 提交于
Extending the code in previous commits, introduce referenced kptr support, which needs to be tagged using 'kptr_ref' tag instead. Unlike unreferenced kptr, referenced kptr have a lot more restrictions. In addition to the type matching, only a newly introduced bpf_kptr_xchg helper is allowed to modify the map value at that offset. This transfers the referenced pointer being stored into the map, releasing the references state for the program, and returning the old value and creating new reference state for the returned pointer. Similar to unreferenced pointer case, return value for this case will also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer must either be eventually released by calling the corresponding release function, otherwise it must be transferred into another map. It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear the value, and obtain the old value if any. BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future commit will permit using BPF_LDX for such pointers, but attempt at making it safe, since the lifetime of object won't be guaranteed. There are valid reasons to enforce the restriction of permitting only bpf_kptr_xchg to operate on referenced kptr. The pointer value must be consistent in face of concurrent modification, and any prior values contained in the map must also be released before a new one is moved into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg returns the old value, which the verifier would require the user to either free or move into another map, and releases the reference held for the pointer being moved in. In the future, direct BPF_XCHG instruction may also be permitted to work like bpf_kptr_xchg helper. Note that process_kptr_func doesn't have to call check_helper_mem_access, since we already disallow rdonly/wronly flags for map, which is what check_map_access_type checks, and we already ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc, so check_map_access is also not required. Signed-off-by: NKumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
-
- 09 4月, 2022 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To get the changes in: b04d910a ("vdpa: support exposing the count of vqs to userspace") a61280dd ("vdpa: support exposing the config size to userspace") Silencing this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h' diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h $ diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h --- tools/include/uapi/linux/vhost.h 2021-07-15 16:17:01.840818309 -0300 +++ include/uapi/linux/vhost.h 2022-04-02 18:55:05.702522387 -0300 @@ -150,4 +150,11 @@ /* Get the valid iova range */ #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \ struct vhost_vdpa_iova_range) + +/* Get the config size */ +#define VHOST_VDPA_GET_CONFIG_SIZE _IOR(VHOST_VIRTIO, 0x79, __u32) + +/* Get the count of all virtqueues */ +#define VHOST_VDPA_GET_VQS_COUNT _IOR(VHOST_VIRTIO, 0x80, __u32) + #endif $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after $ diff -u before after --- before 2022-04-04 14:52:25.036375145 -0300 +++ after 2022-04-04 14:52:31.906549976 -0300 @@ -38,4 +38,6 @@ [0x73] = "VDPA_GET_CONFIG", [0x76] = "VDPA_GET_VRING_NUM", [0x78] = "VDPA_GET_IOVA_RANGE", + [0x79] = "VDPA_GET_CONFIG_SIZE", + [0x80] = "VDPA_GET_VQS_COUNT", }; $ Cc: Longpeng <longpeng2@huawei.com> Cc: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/lkml/YksxoFcOARk%2Fldev@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 4月, 2022 1 次提交
-
-
由 Haiyue Wang 提交于
The commit 8fd88691 ("bpf: Add BTF_KIND_FLOAT to uapi") has extended the BTF kind bitfield from 4 to 5 bits, correct the comment. Signed-off-by: NHaiyue Wang <haiyue.wang@intel.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com
-
- 02 4月, 2022 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To pick the changes in: 6d849191 ("KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2") ef11c946 ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access") e9e9feeb ("KVM: s390: Add optional storage key checking to MEMOP IOCTL") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Cc: Oliver Upton <oupton@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YkSCOWHQdir1lhdJ@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 3月, 2022 1 次提交
-
-
由 Geliang Tang 提交于
Commit ee2a0988 missed updating the comments for helper bpf_get_stack in tools/include/uapi/linux/bpf.h. Sync it. Fixes: ee2a0988 ("bpf: Adjust BPF stack helper functions to accommodate skip > 0") Signed-off-by: NGeliang Tang <geliang.tang@suse.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ce54617746b7ed5e9ba3b844e55e74cb8a60e0b5.1648110794.git.geliang.tang@suse.com
-
- 18 3月, 2022 2 次提交
-
-
由 Jiri Olsa 提交于
Adding support to call bpf_get_attach_cookie helper from kprobe programs attached with kprobe multi link. The cookie is provided by array of u64 values, where each value is paired with provided function address or symbol with the same array index. When cookie array is provided it's sorted together with addresses (check bpf_kprobe_multi_cookie_swap). This way we can find cookie based on the address in bpf_get_attach_cookie helper. Suggested-by: NAndrii Nakryiko <andrii@kernel.org> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org
-
由 Jiri Olsa 提交于
Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe program through fprobe API. The fprobe API allows to attach probe on multiple functions at once very fast, because it works on top of ftrace. On the other hand this limits the probe point to the function entry or return. The kprobe program gets the same pt_regs input ctx as when it's attached through the perf API. Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attachment kprobe to multiple function with new link. User provides array of addresses or symbols with count to attach the kprobe program to. The new link_create uapi interface looks like: struct { __u32 flags; __u32 cnt; __aligned_u64 syms; __aligned_u64 addrs; } kprobe_multi; The flags field allows single BPF_TRACE_KPROBE_MULTI bit to create return multi kprobe. Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org
-
- 11 3月, 2022 3 次提交
-
-
由 Roberto Sassu 提交于
ima_file_hash() has been modified to calculate the measurement of a file on demand, if it has not been already performed by IMA or the measurement is not fresh. For compatibility reasons, ima_inode_hash() remains unchanged. Keep the same approach in eBPF and introduce the new helper bpf_ima_file_hash() to take advantage of the modified behavior of ima_file_hash(). Signed-off-by: NRoberto Sassu <roberto.sassu@huawei.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220302111404.193900-4-roberto.sassu@huawei.com
-
由 Hengqi Chen 提交于
Fix the descriptions of the return values of helper bpf_current_task_under_cgroup(). Fixes: c6b5fb86 ("bpf: add documentation for eBPF helpers (42-50)") Signed-off-by: NHengqi Chen <hengqi.chen@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220310155335.1278783-1-hengqi.chen@gmail.com
-
由 Martin KaFai Lau 提交于
This patch is to simplify the uapi bpf.h regarding to the tstamp type and use a similar way as the kernel to describe the value stored in __sk_buff->tstamp. My earlier thought was to avoid describing the semantic and clock base for the rcv timestamp until there is more clarity on the use case, so the __sk_buff->delivery_time_type naming instead of __sk_buff->tstamp_type. With some thoughts, it can reuse the UNSPEC naming. This patch first removes BPF_SKB_DELIVERY_TIME_NONE and also rename BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC and BPF_SKB_DELIVERY_TIME_MONO to BPF_SKB_TSTAMP_DELIVERY_MONO. The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same: __sk_buff->tstamp has delivery time in mono clock base. BPF_SKB_TSTAMP_UNSPEC means __sk_buff->tstamp has the (rcv) tstamp at ingress and the delivery time at egress. At egress, the clock base could be found from skb->sk->sk_clockid. __sk_buff->tstamp == 0 naturally means NONE, so NONE is not needed. With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress, the __sk_buff->delivery_time_type is also renamed to __sk_buff->tstamp_type which was also suggested in the earlier discussion: https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/ The above will then make __sk_buff->tstamp and __sk_buff->tstamp_type the same as its kernel skb->tstamp and skb->mono_delivery_time counter part. The internal kernel function bpf_skb_convert_dtime_type_read() is then renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified with the BPF_SKB_DELIVERY_TIME_NONE gone. A BPF_ALU32_IMM(BPF_AND) insn is also saved by using BPF_JMP32_IMM(BPF_JSET). The bpf helper bpf_skb_set_delivery_time() is also renamed to bpf_skb_set_tstamp(). The arg name is changed from dtime to tstamp also. It only allows setting tstamp 0 for BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later if there is use case to change mono delivery time to non mono. prog->delivery_time_access is also renamed to prog->tstamp_type_access. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com
-
- 10 3月, 2022 1 次提交
-
-
由 Toke Høiland-Jørgensen 提交于
This adds support for running XDP programs through BPF_PROG_RUN in a mode that enables live packet processing of the resulting frames. Previous uses of BPF_PROG_RUN for XDP returned the XDP program return code and the modified packet data to userspace, which is useful for unit testing of XDP programs. The existing BPF_PROG_RUN for XDP allows userspace to set the ingress ifindex and RXQ number as part of the context object being passed to the kernel. This patch reuses that code, but adds a new mode with different semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES flag. When running BPF_PROG_RUN in this mode, the XDP program return codes will be honoured: returning XDP_PASS will result in the frame being injected into the networking stack as if it came from the selected networking interface, while returning XDP_TX and XDP_REDIRECT will result in the frame being transmitted out that interface. XDP_TX is translated into an XDP_REDIRECT operation to the same interface, since the real XDP_TX action is only possible from within the network drivers themselves, not from the process context where BPF_PROG_RUN is executed. Internally, this new mode of operation creates a page pool instance while setting up the test run, and feeds pages from that into the XDP program. The setup cost of this is amortised over the number of repetitions specified by userspace. To support the performance testing use case, we further optimise the setup step so that all pages in the pool are pre-initialised with the packet data, and pre-computed context and xdp_frame objects stored at the start of each page. This makes it possible to entirely avoid touching the page content on each XDP program invocation, and enables sending up to 9 Mpps/core on my test box. Because the data pages are recycled by the page pool, and the test runner doesn't re-initialise them for each run, subsequent invocations of the XDP program will see the packet data in the state it was after the last time it ran on that particular page. This means that an XDP program that modifies the packet before redirecting it has to be careful about which assumptions it makes about the packet content, but that is only an issue for the most naively written programs. Enabling the new flag is only allowed when not setting ctx_out and data_out in the test specification, since using it means frames will be redirected somewhere else, so they can't be returned. Signed-off-by: NToke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NMartin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com
-
- 03 3月, 2022 1 次提交
-
-
由 Martin KaFai Lau 提交于
* __sk_buff->delivery_time_type: This patch adds __sk_buff->delivery_time_type. It tells if the delivery_time is stored in __sk_buff->tstamp or not. It will be most useful for ingress to tell if the __sk_buff->tstamp has the (rcv) timestamp or delivery_time. If delivery_time_type is 0 (BPF_SKB_DELIVERY_TIME_NONE), it has the (rcv) timestamp. Two non-zero types are defined for the delivery_time_type, BPF_SKB_DELIVERY_TIME_MONO and BPF_SKB_DELIVERY_TIME_UNSPEC. For UNSPEC, it can only happen in egress because only mono delivery_time can be forwarded to ingress now. The clock of UNSPEC delivery_time can be deduced from the skb->sk->sk_clockid which is how the sch_etf doing it also. * Provide forwarded delivery_time to tc-bpf@ingress: With the help of the new delivery_time_type, the tc-bpf has a way to tell if the __sk_buff->tstamp has the (rcv) timestamp or the delivery_time. During bpf load time, the verifier will learn if the bpf prog has accessed the new __sk_buff->delivery_time_type. If it does, it means the tc-bpf@ingress is expecting the skb->tstamp could have the delivery_time. The kernel will then read the skb->tstamp as-is during bpf insn rewrite without checking the skb->mono_delivery_time. This is done by adding a new prog->delivery_time_access bit. The same goes for writing skb->tstamp. * bpf_skb_set_delivery_time(): The bpf_skb_set_delivery_time() helper is added to allow setting both delivery_time and the delivery_time_type at the same time. If the tc-bpf does not need to change the delivery_time_type, it can directly write to the __sk_buff->tstamp as the existing tc-bpf has already been doing. It will be most useful at ingress to change the __sk_buff->tstamp from the (rcv) timestamp to a mono delivery_time and then bpf_redirect_*(). bpf only has mono clock helper (bpf_ktime_get_ns), and the current known use case is the mono EDT for fq, and only mono delivery time can be kept during forward now, so bpf_skb_set_delivery_time() only supports setting BPF_SKB_DELIVERY_TIME_MONO. It can be extended later when use cases come up and the forwarding path also supports other clock bases. Signed-off-by: NMartin KaFai Lau <kafai@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 3月, 2022 1 次提交
-
-
由 Anshuman Khandual 提交于
This expands generic branch type classification by adding two more entries there in i.e irq and exception return. Also updates the x86 implementation to process X86_BR_IRET and X86_BR_IRQ records as appropriate. This changes branch types reported to user space on x86 platform but it should not be a problem. The possible scenarios and impacts are enumerated here. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com
-
- 25 2月, 2022 1 次提交
-
-
由 David Dunn 提交于
Add a new capability, KVM_CAP_PMU_CAPABILITY, that takes a bitmask of settings/features to allow userspace to configure PMU virtualization on a per-VM basis. For now, support a single flag, KVM_PMU_CAP_DISABLE, to allow disabling PMU virtualization for a VM even when KVM is configured with enable_pmu=true a module level. To keep KVM simple, disallow changing VM's PMU configuration after vCPUs have been created. Signed-off-by: NDavid Dunn <daviddunn@google.com> Message-Id: <20220223225743.2703915-2-daviddunn@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 22 2月, 2022 1 次提交
-
-
由 Nicholas Piggin 提交于
Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL resource mode to 3 with the H_SET_MODE hypercall. This capability differs between processor types and KVM types (PR, HV, Nested HV), and affects guest-visible behaviour. QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and use the KVM CAP if available to determine KVM support[2]. Reviewed-by: NFabiano Rosas <farosas@linux.ibm.com> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 21 2月, 2022 1 次提交
-
-
由 Hangbin Liu 提交于
This patch add a new bonding option ns_ip6_target, which correspond to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS request to determine the health of the link. For other related options like the validation, we still use arp_validate, and will change to ns_validate later. Note: the sysfs configuration support was removed based on https://lore.kernel.org/netdev/8863.1645071997@famineSigned-off-by: NHangbin Liu <liuhangbin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 2月, 2022 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To pick the trivial change in: ddecd228 ("perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures") Just adds a comment. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 2月, 2022 1 次提交
-
-
由 Jakub Sitnicki 提交于
Extend the context access tests for sk_lookup prog to cover the surprising case of a 4-byte load from the remote_port field, where the expected value is actually shifted by 16 bits. Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220209184333.654927-3-jakub@cloudflare.com
-
- 06 2月, 2022 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To pick the changes in: f6c6804c ("kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/Yf+4k5Fs5Q3HdSG9@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 2月, 2022 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To pick the changes in: 9a10064f ("mm: add a field to store names for private anonymous memory") That don't result in any changes in tooling: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after $ This actually adds a new prctl arg, but it has to be dealt with differently, as it is not in sequence with the other arguments. Just silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Colin Cross <ccross@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 01 2月, 2022 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To pick the trivial change in: cb1c4aba ("perf: Add new macros for mem_hops field") Just comment source code alignment. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/YflPKLhu2AtHmPov@kernel.org/Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jakub Sitnicki 提交于
Add coverage to the verifier tests and tests for reading bpf_sock fields to ensure that 32-bit, 16-bit, and 8-bit loads from dst_port field are allowed only at intended offsets and produce expected values. While 16-bit and 8-bit access to dst_port field is straight-forward, 32-bit wide loads need be allowed and produce a zero-padded 16-bit value for backward compatibility. Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220130115518.213259-3-jakub@cloudflare.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 28 1月, 2022 1 次提交
-
-
由 Paolo Bonzini 提交于
Provide coverage for the new API. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 1月, 2022 1 次提交
-
-
由 Sean Young 提交于
The lirc.h file is an old copy of lirc.h from the kernel sources. It is out of date, and the bpf lirc tests don't need a new copy anyway. As long as /usr/include/linux/lirc.h is from kernel v5.2 or newer, the tests will compile fine. Signed-off-by: NSean Young <sean@mess.org> Reviewed-by: NShuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20220124153028.394409-1-sean@mess.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 25 1月, 2022 1 次提交
-
-
由 Kenny Yu 提交于
This adds a helper for bpf programs to read the memory of other tasks. As an example use case at Meta, we are using a bpf task iterator program and this new helper to print C++ async stack traces for all threads of a given process. Signed-off-by: NKenny Yu <kennyyu@fb.com> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220124185403.468466-3-kennyyu@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 22 1月, 2022 2 次提交
-
-
由 Lorenzo Bianconi 提交于
Similar to skb_header_pointer, introduce bpf_xdp_pointer utility routine to return a pointer to a given position in the xdp_buff if the requested area (offset + len) is contained in a contiguous memory area otherwise it will be copied in a bounce buffer provided by the caller. Similar to the tc counterpart, introduce the two following xdp helpers: - bpf_xdp_load_bytes - bpf_xdp_store_bytes Reviewed-by: NEelco Chaudron <echaudro@redhat.com> Acked-by: NToke Hoiland-Jorgensen <toke@redhat.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Lorenzo Bianconi 提交于
Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer total size (linear and paged area) Acked-by: NToke Hoiland-Jorgensen <toke@redhat.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-