Commits · 94dfc73e7cf4a31da66b8843f0b9283ddd6b8381 · openeuler / Kernel

29 Jun, 2022 1 commit

treewide: uapi: Replace zero-length arrays with flexible-array members · 94dfc73e

Authored by Gustavo A. R. Silva on Apr 06, 2022

There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].

This code was transformed with the help of Coccinelle:
(linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)

@@
identifier S, member, array;
type T1, T2;
@@

struct S {
  ...
  T1 member;
  T2 array[
- 0
  ];
};

-fstrict-flex-arrays=3 is coming and we need to land these changes
to prevent issues like these in the near future:

../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0,
but the source string has length 2 (including NUL byte) [-Wfortify-source]
		strcpy(de3->name, ".");
		^

Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If
this breaks anything, we can use a union with a new member name.
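
As a rough illustration of the pattern this commit moves to, here is a minimal userspace sketch of a structure with a flexible array member and an allocation sized for its trailing data. The names `name_entry` and `name_entry_new` are hypothetical and do not come from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with a dynamically sized trailing buffer. */
struct name_entry {
    unsigned int len;   /* number of bytes stored in name[] */
    char name[];        /* flexible array member: must be the last member */
};

/* Allocate an entry large enough to hold a copy of src, including its NUL. */
static struct name_entry *name_entry_new(const char *src)
{
    size_t n;
    struct name_entry *e;

    assert(src != NULL);
    n = strlen(src) + 1;
    e = malloc(sizeof(*e) + n);   /* header plus trailing bytes */
    if (!e)
        return NULL;
    e->len = (unsigned int)n;
    memcpy(e->name, src, n);
    return e;
}
```

Because the array is declared with `[]` rather than `[0]`, the compiler treats its size as unknown instead of zero, which is what avoids the "destination buffer has size 0" diagnostic quoted above.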

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays

Link: https://github.com/KSPP/linux/issues/78
Build-tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%25lkp@intel.com/
Acked-by: Dan Williams <dan.j.williams@intel.com> # For ndctl.h
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

24 May, 2022 5 commits

bpf: Add dynptr data slices · 34d4ef57

Authored by Joanne Koong on May 23, 2022

This patch adds a new helper function

void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len);

which returns a pointer to the underlying data of a dynptr. *len*
must be a statically known value. The bpf program may access the returned
data slice as a normal buffer (eg can do direct reads and writes), since
the verifier associates the length with the returned pointer, and
enforces that no out of bounds accesses occur.
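
The bounds rule behind a data slice can be sketched in plain userspace C. This is only a model: `struct slice_dynptr` and `slice_dynptr_data` are hypothetical names, the layout is not the kernel's, and in the kernel the check is enforced by the verifier rather than by code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-in for a dynptr; not the kernel's layout. */
struct slice_dynptr {
    uint8_t *data;
    uint32_t size;
};

/*
 * Return a pointer to len bytes starting at offset, or NULL if the
 * requested slice does not fit inside the backing memory.
 */
static void *slice_dynptr_data(const struct slice_dynptr *p,
                               uint32_t offset, uint32_t len)
{
    assert(p != NULL);
    /* 64-bit arithmetic so that offset + len cannot wrap around. */
    if ((uint64_t)offset + len > p->size)
        return NULL;
    return p->data + offset;
}
```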
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-6-joannelkoong@gmail.com

bpf: Add bpf_dynptr_read and bpf_dynptr_write · 13bbbfbe

Authored by Joanne Koong on May 23, 2022

This patch adds two helper functions, bpf_dynptr_read and
bpf_dynptr_write:

long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset);

long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len);

The dynptrs passed into these functions must be valid dynptrs that have
been initialized.
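
To make the argument order of the two helpers concrete, the following is a userspace model of bounds-checked read and write through a dynptr-like handle. Everything here is an assumption made for illustration: `struct rw_dynptr`, the `rw_dynptr_*` functions, and the `-1` error value are hypothetical and do not reflect the kernel implementation or its error codes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified userspace stand-in for an initialized dynptr. */
struct rw_dynptr {
    uint8_t *data;   /* NULL models an uninitialized dynptr */
    uint32_t size;
};

/* A range is valid if the handle is initialized and the range fits. */
static int rw_range_ok(const struct rw_dynptr *p, uint32_t offset, uint32_t len)
{
    assert(p != NULL);
    return p->data != NULL && (uint64_t)offset + len <= p->size;
}

/* Copy len bytes out of src, starting at offset, into dst. */
static long rw_dynptr_read(void *dst, uint32_t len,
                           const struct rw_dynptr *src, uint32_t offset)
{
    if (!rw_range_ok(src, offset, len))
        return -1;   /* placeholder error value */
    memcpy(dst, src->data + offset, len);
    return 0;
}

/* Copy len bytes from src into dst, starting at offset. */
static long rw_dynptr_write(struct rw_dynptr *dst, uint32_t offset,
                            const void *src, uint32_t len)
{
    if (!rw_range_ok(dst, offset, len))
        return -1;   /* placeholder error value */
    memcpy(dst->data + offset, src, len);
    return 0;
}
```

The parameter order mirrors the signatures quoted above: the read takes the destination buffer first, while the write takes the dynptr first.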
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-5-joannelkoong@gmail.com

bpf: Dynptr support for ring buffers · bc34dee6

Authored by Joanne Koong on May 23, 2022

Currently, our only way of writing dynamically-sized data into a ring
buffer is through bpf_ringbuf_output but this incurs an extra memcpy
cost. bpf_ringbuf_reserve + bpf_ringbuf_commit avoids this extra
memcpy, but it can only safely support reservation sizes that are
statically known since the verifier cannot guarantee that the bpf
program won’t access memory outside the reserved space.

The bpf_dynptr abstraction allows for dynamically-sized ring buffer
reservations without the extra memcpy.

There are 3 new APIs:

long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr);
void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags);
void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags);

These closely follow the functionalities of the original ringbuf APIs.
For example, all ringbuffer dynptrs that have been reserved must be
either submitted or discarded before the program exits.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com

bpf: Add bpf_dynptr_from_mem for local dynptrs · 263ae152

Authored by Joanne Koong on May 23, 2022

This patch adds a new API, bpf_dynptr_from_mem:

long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr);

which initializes a dynptr to point to a bpf program's local memory. For now
only local memory that is of reg type PTR_TO_MAP_VALUE is supported.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-3-joannelkoong@gmail.com

bpf: Add verifier support for dynptrs · 97e03f52

Authored by Joanne Koong on May 23, 2022

This patch adds the bulk of the verifier work for supporting dynamic
pointers (dynptrs) in bpf.

A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure
defined internally as:

struct bpf_dynptr_kern {
    void *data;
    u32 size;
    u32 offset;
} __aligned(8);

The upper 8 bits of *size* are reserved (they contain extra metadata about
read-only status and dynptr type). Consequently, a dynptr only supports
memory less than 16 MB.
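
The 16 MB limit follows from the arithmetic: with 8 of the 32 bits reserved, 24 bits remain for the size, so the largest representable value is 2^24 - 1 bytes. A minimal sketch of that packing is shown below. The macro and function names are hypothetical, and the metadata is treated as one opaque 8-bit value because the text above does not spell out how those bits are subdivided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the source only says the upper 8 bits are reserved. */
#define MODEL_SIZE_BITS 24u
#define MODEL_SIZE_MASK ((1u << MODEL_SIZE_BITS) - 1)   /* 0x00ffffff */

/* Pack an 8-bit metadata value and a size into one 32-bit word. */
static int model_pack(uint32_t size, uint8_t meta, uint32_t *out)
{
    assert(out != NULL);
    if (size > MODEL_SIZE_MASK)
        return -1;   /* 16 MiB or more does not fit in 24 bits */
    *out = ((uint32_t)meta << MODEL_SIZE_BITS) | size;
    return 0;
}

/* Extract the size from the lower 24 bits. */
static uint32_t model_size(uint32_t word)
{
    return word & MODEL_SIZE_MASK;
}

/* Extract the metadata from the upper 8 bits. */
static uint8_t model_meta(uint32_t word)
{
    return (uint8_t)(word >> MODEL_SIZE_BITS);
}
```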

There are different types of dynptrs (eg malloc, ringbuf, ...). In this
patchset, the most basic one, dynptrs to a bpf program's local memory,
is added. For now only local memory that is of reg type PTR_TO_MAP_VALUE
is supported.

In the verifier, dynptr state information is tracked in stack
slots. When the program passes in an uninitialized dynptr
(ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots where the dynptr
resides, addressed relative to the frame pointer, are marked
STACK_DYNPTR. For helper functions that take in initialized dynptrs (e.g.
bpf_dynptr_read + bpf_dynptr_write, which are added later in this
patchset), the verifier enforces that the dynptr has been initialized
properly by checking that its corresponding stack slots have been
marked as STACK_DYNPTR.

The 6th patch in this patchset adds test cases that the verifier should
successfully reject, such as attempting to use a dynptr
after doing a direct write into it inside the bpf program.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com

21 May, 2022 1 commit

bpf: Add bpf_skc_to_mptcp_sock_proto · 3bc253c2

Authored by Geliang Tang on May 19, 2022

This patch implements a new struct bpf_func_proto, named
bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP,
and a new helper bpf_skc_to_mptcp_sock(), which invokes another new
helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct
mptcp_sock from a given subflow socket.

v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c,
remove build check for CONFIG_BPF_JIT
v5: Drop EXPORT_SYMBOL (Martin)
Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com

16 May, 2022 1 commit

net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes · 89527be8

Authored by Eric Dumazet on May 13, 2022

New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS
are used to report to user-space the device TSO limits.

ip -d link sh dev eth1
...
   tso_max_size 65536 tso_max_segs 65535
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 May, 2022 1 commit

bpf: add bpf_map_lookup_percpu_elem for percpu map · 07343110

Authored by Feng Zhou on May 11, 2022

Add a new eBPF helper, bpf_map_lookup_percpu_elem.

The implementation is relatively simple: it follows the implementation
of map_lookup_elem for percpu maps, adds a cpu parameter, and
returns the value for the specified cpu.
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

11 May, 2022 3 commits

bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm. · 2fcc8241

Authored by Kui-Feng Lee on May 10, 2022

Pass a cookie along with BPF_LINK_CREATE requests.

Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie.
The cookie of a bpf_tracing_link is available by calling
bpf_get_attach_cookie when running the BPF program of the attached
link.

The value of a cookie will be set at bpf_tramp_run_ctx by the
trampoline of the link.
Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com

bpf, x86: Generate trampolines from bpf_tramp_links · f7e0beaf

Authored by Kui-Feng Lee on May 10, 2022

Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect
struct bpf_tramp_link(s) for a trampoline.  struct bpf_tramp_link
extends bpf_link to act as a linked list node.

arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to
collect all bpf_tramp_link(s) that a trampoline should call.

Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links
instead of bpf_tramp_progs.
Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com

bpf: Add source ip in "struct bpf_tunnel_key" · 26101f5a

Authored by Kaixi Fan on Apr 30, 2022

Add tunnel source ip field in "struct bpf_tunnel_key". Add related code
to set and get tunnel source field.
Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com>
Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

09 May, 2022 1 commit

tools headers UAPI: Sync linux/kvm.h with the kernel sources · 474e76c4

Authored by Arnaldo Carvalho de Melo on May 09, 2021

To pick the changes in:

  d495f942 ("KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/YnE5BIweGmCkpOTN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

26 Apr, 2022 1 commit

bpf: Allow storing referenced kptr in map · c0a5a21c

Authored by Kumar Kartikeya Dwivedi on Apr 25, 2022

Extending the code in previous commits, introduce referenced kptr
support, which needs to be tagged using 'kptr_ref' tag instead. Unlike
unreferenced kptr, referenced kptr have a lot more restrictions. In
addition to the type matching, only a newly introduced bpf_kptr_xchg
helper is allowed to modify the map value at that offset. This transfers
the referenced pointer being stored into the map, releasing the
references state for the program, and returning the old value and
creating new reference state for the returned pointer.

Similar to the unreferenced pointer case, the return value for this case will
also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
must either be eventually released by calling the corresponding release
function, or be transferred into another map.

It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
the value, and obtain the old value if any.

BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future
commit will permit using BPF_LDX for such pointers, while attempting to
make it safe, since the lifetime of the object won't be guaranteed.

There are valid reasons to enforce the restriction of permitting only
bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
consistent in face of concurrent modification, and any prior values
contained in the map must also be released before a new one is moved
into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
returns the old value, which the verifier would require the user to
either free or move into another map, and releases the reference held
for the pointer being moved in.
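
The ownership-transfer idea can be approximated in userspace with a C11 atomic exchange. This is an analogy only, not the kernel's bpf_kptr_xchg implementation. `struct object`, `struct slot`, and `slot_xchg` are hypothetical names, and nothing here enforces the release the way the verifier would.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct object { int id; };

/* A map-value-like slot that owns at most one object at a time. */
struct slot {
    _Atomic(struct object *) ptr;
};

/*
 * Move new_obj into the slot and hand the previous occupant back to the
 * caller, who now owns it and must free it or store it elsewhere.
 * Passing NULL clears the slot.
 */
static struct object *slot_xchg(struct slot *s, struct object *new_obj)
{
    assert(s != NULL);
    return atomic_exchange(&s->ptr, new_obj);
}
```

Because the swap is a single atomic operation, two concurrent callers can never both receive the same old pointer, which is the consistency property the paragraph above describes.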

In the future, direct BPF_XCHG instruction may also be permitted to work
like bpf_kptr_xchg helper.

Note that process_kptr_func doesn't have to call
check_helper_mem_access, since we already disallow rdonly/wronly flags
for map, which is what check_map_access_type checks, and we already
ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
so check_map_access is also not required.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com

09 Apr, 2022 1 commit

tools include UAPI: Sync linux/vhost.h with the kernel sources · 940442de

Authored by Arnaldo Carvalho de Melo on Apr 14, 2020

To get the changes in:

  b04d910a ("vdpa: support exposing the count of vqs to userspace")
  a61280dd ("vdpa: support exposing the config size to userspace")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
  diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h

  $ diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
  --- tools/include/uapi/linux/vhost.h	2021-07-15 16:17:01.840818309 -0300
  +++ include/uapi/linux/vhost.h	2022-04-02 18:55:05.702522387 -0300
  @@ -150,4 +150,11 @@
   /* Get the valid iova range */
   #define VHOST_VDPA_GET_IOVA_RANGE	_IOR(VHOST_VIRTIO, 0x78, \
   					     struct vhost_vdpa_iova_range)
  +
  +/* Get the config size */
  +#define VHOST_VDPA_GET_CONFIG_SIZE	_IOR(VHOST_VIRTIO, 0x79, __u32)
  +
  +/* Get the count of all virtqueues */
  +#define VHOST_VDPA_GET_VQS_COUNT	_IOR(VHOST_VIRTIO, 0x80, __u32)
  +
   #endif
  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
  $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
  $ diff -u before after
  --- before	2022-04-04 14:52:25.036375145 -0300
  +++ after	2022-04-04 14:52:31.906549976 -0300
  @@ -38,4 +38,6 @@
   	[0x73] = "VDPA_GET_CONFIG",
   	[0x76] = "VDPA_GET_VRING_NUM",
   	[0x78] = "VDPA_GET_IOVA_RANGE",
  +	[0x79] = "VDPA_GET_CONFIG_SIZE",
  +	[0x80] = "VDPA_GET_VQS_COUNT",
   };
  $
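
The table in the diff above is indexed by the ioctl command number. A minimal Linux-only sketch of that lookup follows, assuming the vhost ioctl type value 0xAF and using local `DEMO_*` copies of the two new definitions rather than including <linux/vhost.h>. The names `demo_names` and `demo_ioctl_name` are hypothetical and are not part of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Local copies for illustration; real code would include <linux/vhost.h>. */
#define DEMO_VHOST_VIRTIO 0xAF
#define DEMO_VDPA_GET_CONFIG_SIZE _IOR(DEMO_VHOST_VIRTIO, 0x79, uint32_t)
#define DEMO_VDPA_GET_VQS_COUNT   _IOR(DEMO_VHOST_VIRTIO, 0x80, uint32_t)

/* Sparse name table indexed by command number, as in the generated output. */
static const char *const demo_names[] = {
    [0x79] = "VDPA_GET_CONFIG_SIZE",
    [0x80] = "VDPA_GET_VQS_COUNT",
};
#define DEMO_NAMES_LEN (sizeof(demo_names) / sizeof(demo_names[0]))

/* Map an ioctl command to its name, or NULL if it is not in the table. */
static const char *demo_ioctl_name(unsigned int cmd)
{
    unsigned int nr = _IOC_NR(cmd);

    assert(DEMO_NAMES_LEN > 0x80);   /* table must cover the highest nr */
    if (_IOC_TYPE(cmd) != DEMO_VHOST_VIRTIO || nr >= DEMO_NAMES_LEN)
        return NULL;
    return demo_names[nr];           /* NULL for gaps in the table */
}
```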

Cc: Longpeng <longpeng2@huawei.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/lkml/YksxoFcOARk%2Fldev@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

04 Apr, 2022 1 commit

bpf: Correct the comment for BTF kind bitfield · 66df0fdb

Authored by Haiyue Wang on Apr 03, 2022

Commit 8fd88691 ("bpf: Add BTF_KIND_FLOAT to uapi") extended
the BTF kind bitfield from 4 to 5 bits; correct the comment accordingly.
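
A short sketch of why the fifth bit matters, assuming the kind occupies bits 24 to 28 of the 32-bit info word and that the float kind has the value 16. The `DEMO_*` macros are local mirrors written for this example, not the uapi definitions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of the accessors, renamed to mark them as illustrative. */
#define DEMO_BTF_INFO_KIND(info) (((info) >> 24) & 0x1f)   /* 5 bits: 24..28 */
#define DEMO_BTF_INFO_VLEN(info) ((info) & 0xffff)         /* bits 0..15 */
#define DEMO_BTF_KIND_FLOAT 16

/* Build an info word from a kind and a vlen. */
static uint32_t demo_btf_info(uint32_t kind, uint32_t vlen)
{
    assert(kind <= 0x1f && vlen <= 0xffff);
    return (kind << 24) | vlen;
}
```

With only a 4-bit mask (`0x0f`), a kind of 16 would decode as 0, which is why the wider field was needed once kind 16 existed.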
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com

02 Apr, 2022 1 commit

tools headers UAPI: Sync linux/kvm.h with the kernel sources · 7ceda0cf

Authored by Arnaldo Carvalho de Melo on May 09, 2021

To pick the changes in:

  6d849191 ("KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2")
  ef11c946 ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access")
  e9e9feeb ("KVM: s390: Add optional storage key checking to MEMOP IOCTL")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/YkSCOWHQdir1lhdJ@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

29 Mar, 2022 1 commit

bpf: Sync comments for bpf_get_stack · 98870605

Authored by Geliang Tang on Mar 24, 2022

Commit ee2a0988 missed updating the comments for helper bpf_get_stack
in tools/include/uapi/linux/bpf.h. Sync it.

Fixes: ee2a0988 ("bpf: Adjust BPF stack helper functions to accommodate skip > 0")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/ce54617746b7ed5e9ba3b844e55e74cb8a60e0b5.1648110794.git.geliang.tang@suse.com

18 Mar, 2022 2 commits

bpf: Add cookie support to programs attached with kprobe multi link · ca74823c

Authored by Jiri Olsa on Mar 16, 2022

Adding support to call bpf_get_attach_cookie helper from
kprobe programs attached with kprobe multi link.

The cookies are provided as an array of u64 values, where each
value is paired with the provided function address or symbol
at the same array index.

When a cookie array is provided, it is sorted together with the
addresses (see bpf_kprobe_multi_cookie_swap). This way
we can find the cookie based on the address in the
bpf_get_attach_cookie helper.
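
The "sort together, then look up by address" idea can be sketched in userspace as follows. This uses an insertion sort for brevity and is not the kernel's code. `sort_paired` and `find_cookie` are hypothetical names, and returning 0 for a missing address is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sort addrs in ascending order, applying every move to cookies as well so
 * that cookies[i] stays paired with addrs[i].
 */
static void sort_paired(uint64_t *addrs, uint64_t *cookies, size_t cnt)
{
    assert(cnt == 0 || (addrs != NULL && cookies != NULL));
    for (size_t i = 1; i < cnt; i++) {
        uint64_t a = addrs[i], c = cookies[i];
        size_t j = i;

        while (j > 0 && addrs[j - 1] > a) {
            addrs[j] = addrs[j - 1];
            cookies[j] = cookies[j - 1];
            j--;
        }
        addrs[j] = a;
        cookies[j] = c;
    }
}

/* Binary-search the sorted addrs; return the paired cookie, or 0 if absent. */
static uint64_t find_cookie(const uint64_t *addrs, const uint64_t *cookies,
                            size_t cnt, uint64_t addr)
{
    size_t lo = 0, hi = cnt;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addrs[mid] == addr)
            return cookies[mid];
        if (addrs[mid] < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

Keeping the two arrays in lockstep is what lets the lookup use the index found in the address array directly on the cookie array.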
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org

bpf: Add multi kprobe link · 0dcac272

Authored by Jiri Olsa on Mar 16, 2022

Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches a kprobe
program through the fprobe API.

The fprobe API allows attaching a probe to multiple functions at once
very quickly, because it works on top of ftrace. On the other hand, this
limits the probe point to the function entry or return.

The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.

Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attaching a
kprobe to multiple functions with the new link.

The user provides an array of addresses or symbols, with a count, to attach the
kprobe program to. The new link_create uapi interface looks like:

  struct {
          __u32           flags;
          __u32           cnt;
          __aligned_u64   syms;
          __aligned_u64   addrs;
  } kprobe_multi;

The flags field allows a single BPF_TRACE_KPROBE_MULTI bit to create
a return multi kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org

11 Mar, 2022 3 commits

bpf-lsm: Introduce new helper bpf_ima_file_hash() · 174b1694

Authored by Roberto Sassu on Mar 02, 2022

ima_file_hash() has been modified to calculate the measurement of a file on
demand, if it has not been already performed by IMA or the measurement is
not fresh. For compatibility reasons, ima_inode_hash() remains unchanged.

Keep the same approach in eBPF and introduce the new helper
bpf_ima_file_hash() to take advantage of the modified behavior of
ima_file_hash().
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-4-roberto.sassu@huawei.com

bpf: Fix comment for helper bpf_current_task_under_cgroup() · 58617014

Authored by Hengqi Chen on Mar 10, 2022

Fix the descriptions of the return values of helper bpf_current_task_under_cgroup().

Fixes: c6b5fb86 ("bpf: add documentation for eBPF helpers (42-50)")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220310155335.1278783-1-hengqi.chen@gmail.com

bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/ · 9bb984f2

Authored by Martin KaFai Lau on Mar 09, 2022

This patch simplifies the uapi bpf.h regarding the tstamp type
and uses a similar way as the kernel to describe the value stored
in __sk_buff->tstamp.

My earlier thought was to avoid describing the semantic and
clock base for the rcv timestamp until there is more clarity
on the use case, hence the __sk_buff->delivery_time_type naming instead
of __sk_buff->tstamp_type.

On further thought, it can reuse the UNSPEC naming.  This patch first
removes BPF_SKB_DELIVERY_TIME_NONE and also

renames BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC
and     BPF_SKB_DELIVERY_TIME_MONO   to BPF_SKB_TSTAMP_DELIVERY_MONO.

The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same:
__sk_buff->tstamp has delivery time in mono clock base.

BPF_SKB_TSTAMP_UNSPEC means __sk_buff->tstamp has the (rcv)
tstamp at ingress and the delivery time at egress.  At egress,
the clock base could be found from skb->sk->sk_clockid.
__sk_buff->tstamp == 0 naturally means NONE, so NONE is not needed.

With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress,
the __sk_buff->delivery_time_type is also renamed to __sk_buff->tstamp_type
which was also suggested in the earlier discussion:
https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/

The above will then make __sk_buff->tstamp and __sk_buff->tstamp_type
the same as their kernel skb->tstamp and skb->mono_delivery_time
counterparts.

The internal kernel function bpf_skb_convert_dtime_type_read() is then
renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified
with the BPF_SKB_DELIVERY_TIME_NONE gone.  A BPF_ALU32_IMM(BPF_AND)
insn is also saved by using BPF_JMP32_IMM(BPF_JSET).

The bpf helper bpf_skb_set_delivery_time() is also renamed to
bpf_skb_set_tstamp().  The arg name is also changed from dtime
to tstamp.  It only allows setting tstamp 0 for
BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later
if there is a use case to change mono delivery time to
non mono.

prog->delivery_time_access is also renamed to prog->tstamp_type_access.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com

10 Mar, 2022 1 commit

bpf: Add "live packet" mode for XDP in BPF_PROG_RUN · b530e9e1

Authored by Toke Høiland-Jørgensen on Mar 09, 2022

This adds support for running XDP programs through BPF_PROG_RUN in a mode
that enables live packet processing of the resulting frames. Previous uses
of BPF_PROG_RUN for XDP returned the XDP program return code and the
modified packet data to userspace, which is useful for unit testing of XDP
programs.

The existing BPF_PROG_RUN for XDP allows userspace to set the ingress
ifindex and RXQ number as part of the context object being passed to the
kernel. This patch reuses that code, but adds a new mode with different
semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES
flag.

When running BPF_PROG_RUN in this mode, the XDP program return codes will
be honoured: returning XDP_PASS will result in the frame being injected
into the networking stack as if it came from the selected networking
interface, while returning XDP_TX and XDP_REDIRECT will result in the frame
being transmitted out that interface. XDP_TX is translated into an
XDP_REDIRECT operation to the same interface, since the real XDP_TX action
is only possible from within the network drivers themselves, not from the
process context where BPF_PROG_RUN is executed.

Internally, this new mode of operation creates a page pool instance while
setting up the test run, and feeds pages from that into the XDP program.
The setup cost of this is amortised over the number of repetitions
specified by userspace.

To support the performance testing use case, we further optimise the setup
step so that all pages in the pool are pre-initialised with the packet
data, and pre-computed context and xdp_frame objects stored at the start of
each page. This makes it possible to entirely avoid touching the page
content on each XDP program invocation, and enables sending up to 9
Mpps/core on my test box.

Because the data pages are recycled by the page pool, and the test runner
doesn't re-initialise them for each run, subsequent invocations of the XDP
program will see the packet data in the state it was after the last time it
ran on that particular page. This means that an XDP program that modifies
the packet before redirecting it has to be careful about which assumptions
it makes about the packet content, but that is only an issue for the most
naively written programs.

Enabling the new flag is only allowed when not setting ctx_out and data_out
in the test specification, since using it means frames will be redirected
somewhere else, so they can't be returned.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com

03 Mar, 2022 1 commit

bpf: Add __sk_buff->delivery_time_type and bpf_skb_set_skb_delivery_time() · 8d21ec0e

Authored by Martin KaFai Lau on Mar 02, 2022

* __sk_buff->delivery_time_type:
This patch adds __sk_buff->delivery_time_type.  It tells if the
delivery_time is stored in __sk_buff->tstamp or not.

It will be most useful for ingress to tell if the __sk_buff->tstamp
has the (rcv) timestamp or delivery_time.  If delivery_time_type
is 0 (BPF_SKB_DELIVERY_TIME_NONE), it has the (rcv) timestamp.

Two non-zero types are defined for the delivery_time_type,
BPF_SKB_DELIVERY_TIME_MONO and BPF_SKB_DELIVERY_TIME_UNSPEC.  UNSPEC
can only happen in egress because only mono delivery_time can be
forwarded to ingress now.  The clock of UNSPEC delivery_time
can be deduced from the skb->sk->sk_clockid, which is how
sch_etf does it as well.

* Provide forwarded delivery_time to tc-bpf@ingress:
With the help of the new delivery_time_type, the tc-bpf has a way
to tell if the __sk_buff->tstamp has the (rcv) timestamp or
the delivery_time.  During bpf load time, the verifier will learn if
the bpf prog has accessed the new __sk_buff->delivery_time_type.
If it does, it means the tc-bpf@ingress is expecting the
skb->tstamp could have the delivery_time.  The kernel will then
read the skb->tstamp as-is during bpf insn rewrite without
checking the skb->mono_delivery_time.  This is done by adding a
new prog->delivery_time_access bit.  The same goes for
writing skb->tstamp.

* bpf_skb_set_delivery_time():
The bpf_skb_set_delivery_time() helper is added to allow setting both
delivery_time and the delivery_time_type at the same time.  If the
tc-bpf does not need to change the delivery_time_type, it can directly
write to the __sk_buff->tstamp as the existing tc-bpf has already been
doing.  It will be most useful at ingress to change the
__sk_buff->tstamp from the (rcv) timestamp to
a mono delivery_time and then bpf_redirect_*().

bpf only has mono clock helper (bpf_ktime_get_ns), and
the current known use case is the mono EDT for fq, and
only mono delivery time can be kept during forward now,
so bpf_skb_set_delivery_time() only supports setting
BPF_SKB_DELIVERY_TIME_MONO.  It can be extended later when use cases
come up and the forwarding path also supports other clock bases.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Mar, 2022 1 commit

perf: Add irq and exception return branch types · cedd3614

Authored by Anshuman Khandual on Feb 24, 2022

This expands generic branch type classification by adding two more entries,
i.e. irq and exception return. It also updates the x86 implementation
to process X86_BR_IRET and X86_BR_IRQ records as appropriate. This changes
the branch types reported to user space on the x86 platform, but it should not be a
problem. The possible scenarios and impacts are enumerated here.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com

25 Feb, 2022 1 commit

KVM: x86: Provide per VM capability for disabling PMU virtualization · ba7bb663

Authored by David Dunn on Feb 23, 2022

Add a new capability, KVM_CAP_PMU_CAPABILITY, that takes a bitmask of
settings/features to allow userspace to configure PMU virtualization on
a per-VM basis.  For now, support a single flag, KVM_PMU_CAP_DISABLE,
to allow disabling PMU virtualization for a VM even when KVM is configured
with enable_pmu=true at a module level.

To keep KVM simple, disallow changing VM's PMU configuration after vCPUs
have been created.
Signed-off-by: David Dunn <daviddunn@google.com>
Message-Id: <20220223225743.2703915-2-daviddunn@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

22 Feb, 2022 1 commit

KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3 · 93b71801

Authored by Nicholas Piggin on Feb 22, 2022

Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL
resource mode to 3 with the H_SET_MODE hypercall. This capability
differs between processor types and KVM types (PR, HV, Nested HV), and
affects guest-visible behaviour.

QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and
use the KVM CAP if available to determine KVM support[2].
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

21 Feb, 2022 1 commit

bonding: add new option ns_ip6_target · 129e3c1b

Authored by Hangbin Liu on Feb 21, 2022

This patch adds a new bonding option, ns_ip6_target, which corresponds
to arp_ip_target. With this we set IPv6 targets and send IPv6 NS
requests to determine the health of the link.

For other related options like the validation, we still use
arp_validate, and will change to ns_validate later.

Note: the sysfs configuration support was removed based on
https://lore.kernel.org/netdev/8863.1645071997@famine
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Feb, 2022 1 commit

tools headers UAPI: Sync linux/perf_event.h with the kernel sources · 714b8b71

Authored by Arnaldo Carvalho de Melo on May 21, 2021

To pick the trivial change in:

ddecd228 ("perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures")

Just adds a comment.

This silences this perf build warning:

Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h

Cc: Marco Elver <elver@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

714b8b71

10 Feb, 2022 1 commit

selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup · 2ed0dc59

Committed by Jakub Sitnicki on Feb 09, 2022

Extend the context access tests for sk_lookup prog to cover the surprising
case of a 4-byte load from the remote_port field, where the expected value
is actually shifted by 16 bits.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220209184333.654927-3-jakub@cloudflare.com

2ed0dc59
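
The "shifted by 16 bits" remark in the remote_port commit above (2ed0dc59) can be illustrated in user space. The following is only a minimal sketch: it assumes the four bytes at the field's offset hold a 16-bit port in network byte order followed by 16 bits of zero padding. `struct port_slot` and the helper names are hypothetical and are not taken from the kernel headers or the selftest.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 4 bytes at the remote_port offset:
 * a 16-bit port in network byte order followed by 16 bits of zero
 * padding (an assumed layout, not copied from the UAPI header). */
struct port_slot {
    uint16_t remote_port;
    uint16_t pad;
};

/* Build a slot the way the assumed layout stores a host-order port. */
static struct port_slot make_slot(uint16_t host_port)
{
    struct port_slot slot = { .remote_port = htons(host_port), .pad = 0 };

    /* The sketch only makes sense if the slot is exactly 4 bytes. */
    assert(sizeof(slot) == sizeof(uint32_t));
    return slot;
}

/* Emulate a 4-byte load that starts at the remote_port field. */
static uint32_t load_u32_at_port(const struct port_slot *slot)
{
    uint32_t value;

    memcpy(&value, slot, sizeof(value));
    return value;
}

/* Value a 4-byte load should yield: the port sits in the upper
 * 16 bits of the network-order word, hence the shift by 16. */
static uint32_t expected_u32(uint16_t host_port)
{
    return htonl((uint32_t)host_port << 16);
}
```

Under these assumptions, comparing a 4-byte load against `htonl(port)` would fail, while comparing against `htonl(port << 16)` succeeds on both little-endian and big-endian hosts.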

06 Feb, 2022 1 commit

tools headers UAPI: Sync linux/kvm.h with the kernel sources · b7b9825f

Committed by Arnaldo Carvalho de Melo on May 09, 2021

To pick the changes in:

  f6c6804c ("kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/Yf+4k5Fs5Q3HdSG9@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

b7b9825f

02 Feb, 2022 1 commit

tools headers UAPI: Sync linux/prctl.h with the kernel sources · fc45e658

Committed by Arnaldo Carvalho de Melo on Feb 11, 2021

To pick the changes in:

  9a10064f ("mm: add a field to store names for private anonymous memory")

That doesn't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  $

This actually adds a new prctl arg, but it has to be dealt with
differently, as it is not in sequence with the other arguments.

Just silences this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
  diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Colin Cross <ccross@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

fc45e658

01 Feb, 2022 2 commits

tools headers UAPI: Sync linux/perf_event.h with the kernel sources · 88443d3f

Committed by Arnaldo Carvalho de Melo on May 21, 2021

To pick the trivial change in:

cb1c4aba ("perf: Add new macros for mem_hops field")

Just aligns source code comments.

This silences this perf build warning:

Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/YflPKLhu2AtHmPov@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

88443d3f

selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads · 8f50f16f

Committed by Jakub Sitnicki on Jan 30, 2022

Add coverage to the verifier tests and tests for reading bpf_sock fields to
ensure that 32-bit, 16-bit, and 8-bit loads from dst_port field are allowed
only at intended offsets and produce expected values.

While 16-bit and 8-bit access to the dst_port field is straightforward, 32-bit
wide loads need to be allowed and produce a zero-padded 16-bit value for
backward compatibility.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20220130115518.213259-3-jakub@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

8f50f16f

28 Jan, 2022 1 commit

selftests: kvm: check dynamic bits against KVM_X86_XCOMP_GUEST_SUPP · b19c99b9

Committed by Paolo Bonzini on Jan 26, 2022

Provide coverage for the new API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b19c99b9

26 Jan, 2022 1 commit

tools headers UAPI: remove stale lirc.h · e2bcbd77

Committed by Sean Young on Jan 24, 2022

The lirc.h file is an old copy of lirc.h from the kernel sources. It is
out of date, and the bpf lirc tests don't need a new copy anyway. As
long as /usr/include/linux/lirc.h is from kernel v5.2 or newer, the tests
will compile fine.
Signed-off-by: Sean Young <sean@mess.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220124153028.394409-1-sean@mess.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

e2bcbd77

25 Jan, 2022 1 commit

bpf: Add bpf_copy_from_user_task() helper · 376040e4

Committed by Kenny Yu on Jan 24, 2022

This adds a helper for bpf programs to read the memory of other
tasks.

As an example use case at Meta, we are using a bpf task iterator program
and this new helper to print C++ async stack traces for all threads of
a given process.
Signed-off-by: Kenny Yu <kennyyu@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124185403.468466-3-kennyyu@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

376040e4

22 Jan, 2022 2 commits

net: xdp: introduce bpf_xdp_pointer utility routine · 3f364222

Committed by Lorenzo Bianconi on Jan 21, 2022

Similar to skb_header_pointer, introduce the bpf_xdp_pointer utility routine
to return a pointer to a given position in the xdp_buff if the requested
area (offset + len) is contained in a contiguous memory area; otherwise it
will be copied into a bounce buffer provided by the caller.
Similar to the tc counterpart, introduce the two following xdp helpers:
- bpf_xdp_load_bytes
- bpf_xdp_store_bytes
Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

3f364222
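
The "direct pointer if contiguous, otherwise copy into a bounce buffer" behaviour described in the bpf_xdp_pointer commit above (3f364222) can be sketched in plain user-space C. This is not the kernel implementation: `struct seg_buf` and `seg_pointer()` are hypothetical names, and the buffer is simplified to one linear head plus a single fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-segment buffer: a linear head plus one fragment. */
struct seg_buf {
    const unsigned char *head;
    size_t head_len;
    const unsigned char *frag;
    size_t frag_len;
};

/* Return a pointer to len bytes starting at off.
 * - Range inside one segment: pointer straight into that segment.
 * - Range straddling head and frag: bytes are copied into the
 *   caller-provided bounce buffer, which is then returned.
 * - Range out of bounds: NULL. */
static const void *seg_pointer(const struct seg_buf *b, size_t off,
                               size_t len, void *bounce)
{
    size_t total, in_head;

    assert(b != NULL);
    total = b->head_len + b->frag_len;
    if (off > total || len > total - off)
        return NULL;

    if (off + len <= b->head_len)   /* fully inside the head */
        return b->head + off;
    if (off >= b->head_len)         /* fully inside the fragment */
        return b->frag + (off - b->head_len);

    /* Straddles the boundary: assemble a contiguous copy. */
    in_head = b->head_len - off;
    memcpy(bounce, b->head + off, in_head);
    memcpy((unsigned char *)bounce + in_head, b->frag, len - in_head);
    return bounce;
}
```

A caller can compare the returned pointer with its bounce buffer to tell whether the data was copied, which matters when a store has to be written back to the original segments.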

bpf: introduce bpf_xdp_get_buff_len helper · 0165cc81

Committed by Lorenzo Bianconi on Jan 21, 2022

Introduce the bpf_xdp_get_buff_len helper in order to return the xdp buffer
total size (linear and paged area).
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

0165cc81

openeuler / Kernel synced successfully nearly 2 years ago
