  1. 15 Nov 2022, 1 commit
  2. 27 Sep 2022, 1 commit
    • libbpf: Fix the case of running as non-root with capabilities · 6a4ab886
      Authored by Jon Doron
      When running rootless with special capabilities like
      FOWNER / DAC_OVERRIDE / DAC_READ_SEARCH, the "access" API will not
      properly check whether a file is really accessible or not.
      
      From the access(2) man page:
      "
      The check is done using the calling process's real UID and GID, rather
      than the effective IDs as is done when actually attempting an operation
      (e.g., open(2)) on the file.  Similarly, for the root user, the check
      uses the set of permitted capabilities  rather than the set of effective
      capabilities; ***and for non-root users, the check uses an empty set of
      capabilities.***
      "
      
      What that means is that for non-root users the access API will not
      properly validate whether the process really has permission to access
      a file.
      
      To resolve this, this patch replaces all the access() API calls with
      faccessat() using the AT_EACCESS flag.
      Signed-off-by: Jon Doron <jond@wiz.io>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220925070431.1313680-1-arilou@gmail.com
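      
      For illustration, a minimal sketch of the substitution described above
      (the can_read() helper and its path argument are hypothetical, not part
      of the patch):
      
      #include <fcntl.h>
      #include <unistd.h>
      
      /* Hypothetical helper showing the before/after of this patch. */
      static int can_read(const char *path)
      {
              /* Before: access() checks the real UID/GID and, for non-root
               * callers, an empty capability set, so FOWNER / DAC_OVERRIDE /
               * DAC_READ_SEARCH are ignored:
               *
               *     return access(path, R_OK) == 0;
               */
      
              /* After: AT_EACCESS makes the check use the effective IDs and
               * capabilities, matching what open(2) would actually enforce. */
              return faccessat(AT_FDCWD, path, R_OK, AT_EACCESS) == 0;
      }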
  3. 24 Sep 2022, 1 commit
  4. 22 Sep 2022, 2 commits
    • bpf: Add libbpf logic for user-space ring buffer · b66ccae0
      Authored by David Vernet
      Now that all of the logic is in place in the kernel to support user-space
      produced ring buffers, we can add the user-space logic to libbpf. This
      patch therefore adds the following public symbols to libbpf:
      
      struct user_ring_buffer *
      user_ring_buffer__new(int map_fd,
      		      const struct user_ring_buffer_opts *opts);
      void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
      void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb,
                                               __u32 size, int timeout_ms);
      void user_ring_buffer__submit(struct user_ring_buffer *rb, void *sample);
      void user_ring_buffer__discard(struct user_ring_buffer *rb, void *sample);
      void user_ring_buffer__free(struct user_ring_buffer *rb);
      
      A user-space producer must first create a struct user_ring_buffer * object
      with user_ring_buffer__new(), and can then reserve samples in the
      ring buffer using one of the following two symbols:
      
      void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
      void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb,
                                               __u32 size, int timeout_ms);
      
      With user_ring_buffer__reserve(), a pointer to a region of 'size' bytes
      in the ring buffer will be returned if sufficient space is available in
      the buffer. user_ring_buffer__reserve_blocking() provides similar
      semantics, but will block for up to 'timeout_ms' in epoll_wait if there
      is insufficient space in the buffer. This function has the guarantee
      from the kernel that it will receive at least one event notification
      per invocation of bpf_ringbuf_drain(), provided that at least one
      sample is drained and the BPF program did not pass the
      BPF_RB_NO_WAKEUP flag to bpf_ringbuf_drain().
      
      Once a sample is reserved, it must either be committed to the ring buffer
      with user_ring_buffer__submit(), or discarded with
      user_ring_buffer__discard().
      Signed-off-by: David Vernet <void@manifault.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220920000100.477320-4-void@manifault.com
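      
      As an illustration, a minimal producer sketch built on the API above
      (map_fd is assumed to refer to a BPF_MAP_TYPE_USER_RINGBUF map, and
      struct my_sample is a hypothetical sample layout):
      
      #include <bpf/libbpf.h>
      
      struct my_sample { int value; };  /* hypothetical sample layout */
      
      static int produce_one(int map_fd)
      {
              struct user_ring_buffer *rb;
              struct my_sample *s;
      
              rb = user_ring_buffer__new(map_fd, NULL /* default opts */);
              if (!rb)
                      return -1;
      
              /* Returns NULL (errno == ENOSPC) if the buffer is full. */
              s = user_ring_buffer__reserve(rb, sizeof(*s));
              if (!s) {
                      user_ring_buffer__free(rb);
                      return -1;
              }
              s->value = 42;
      
              /* Commit the sample to the kernel; to abandon it instead, call
               * user_ring_buffer__discard(rb, s). */
              user_ring_buffer__submit(rb, s);
      
              user_ring_buffer__free(rb);
              return 0;
      }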
    • bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type · 583c1f42
      Authored by David Vernet
      We want to support a ringbuf map type where samples are published from
      user-space, to be consumed by BPF programs. BPF currently supports a
      kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF
      map type.  We'll need to define a new map type for user-space -> kernel,
      as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply
      to a user-space producer ring buffer, and we'll want to add one or
      more helper functions that would not apply for a kernel-producer
      ring buffer.
      
      This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
      definition. The map type is useless in its current form, as there is no
      way to access or use it for anything until we add one or more BPF
      helpers. A follow-on patch will therefore add a new helper function
      that allows BPF programs to run callbacks on samples that are published
      to the ring buffer.
      Signed-off-by: David Vernet <void@manifault.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com
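      
      On the BPF side such a map is declared like any other BTF-defined map;
      a minimal sketch (the map name and size are arbitrary):
      
      #include <linux/bpf.h>
      #include <bpf/bpf_helpers.h>
      
      /* A user-space -> kernel ring buffer; max_entries is the buffer size
       * in bytes and must be a power-of-2 multiple of the page size. */
      struct {
              __uint(type, BPF_MAP_TYPE_USER_RINGBUF);
              __uint(max_entries, 256 * 1024);
      } user_ringbuf SEC(".maps");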
  5. 17 Sep 2022, 1 commit
  6. 18 Aug 2022, 4 commits
  7. 16 Aug 2022, 1 commit
  8. 12 Aug 2022, 1 commit
  9. 11 Aug 2022, 1 commit
  10. 09 Aug 2022, 1 commit
  11. 08 Aug 2022, 1 commit
  12. 05 Aug 2022, 1 commit
  13. 29 Jul 2022, 1 commit
    • libbpf: Support PPC in arch_specific_syscall_pfx · 64893e83
      Authored by Daniel Müller
      Commit 708ac5be ("libbpf: add ksyscall/kretsyscall sections support
      for syscall kprobes") added the arch_specific_syscall_pfx() function,
      which returns a string representing the architecture in use. As it turns
      out this function is currently not aware of Power PC, where NULL is
      returned. That's being flagged by the libbpf CI system, which builds for
      ppc64le and the compiler sees a NULL pointer being passed in to a %s
      format string.
      With this change we add representations for two more architectures, for
      Power PC and Power PC 64, and also adjust the string format logic to
      handle NULL pointers gracefully, in an attempt to prevent similar issues
      with other architectures in the future.
      
      Fixes: 708ac5be ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes")
      Signed-off-by: Daniel Müller <deso@posteo.net>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net
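      
      A hedged sketch of the NULL-tolerant formatting described above (this
      is not libbpf's exact code; the helper declaration and buffer handling
      are illustrative):
      
      #include <stdio.h>
      
      /* Returns an arch prefix such as "x64" or "powerpc", or NULL for
       * architectures the table doesn't know about. */
      extern const char *arch_specific_syscall_pfx(void);
      
      static void syscall_fn_name(char *buf, size_t sz, const char *syscall)
      {
              const char *pfx = arch_specific_syscall_pfx();
      
              /* Never pass NULL to %s; fall back to an empty prefix. */
              snprintf(buf, sz, "__%s_sys_%s", pfx ? pfx : "", syscall);
      }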
  14. 20 Jul 2022, 5 commits
    • libbpf: make RINGBUF map size adjustments more eagerly · 597fbc46
      Authored by Andrii Nakryiko
      Make libbpf adjust the RINGBUF map size (rounding it up to the closest
      power of 2 that is a multiple of page_size) more eagerly: during the
      open phase when initializing the map, and on explicit calls to
      bpf_map__set_max_entries().
      
      Such an approach allows the user to check the actual size of the BPF
      ringbuf even before it's created in the kernel, and also prevents
      various edge-case scenarios where the BPF ringbuf size could get out
      of sync with what it would be in the kernel. One of them (reported in
      [0]) occurs during an attempt to pin/reuse a BPF ringbuf.
      
      Move adjust_ringbuf_sz() helper closer to its first actual use. The
      implementation of the helper is unchanged.
      
      Also make the detection of whether a bpf_object is already loaded more
      robust by checking obj->loaded explicitly, given that map->fd can be
      < 0 even when the bpf_object is already loaded, due to the ability to
      disable map creation with bpf_map__set_autocreate(map, false).
      
        [0] Closes: https://github.com/libbpf/libbpf/pull/530
      
      Fixes: 0087a681 ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20220715230952.2219271-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
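      
      A hedged sketch of the rounding rule described above (libbpf's real
      adjust_ringbuf_sz() also guards against overflow):
      
      #include <unistd.h>
      
      /* Round sz up to the smallest power-of-2 multiple of the page size,
       * which is what the kernel requires of a RINGBUF map's max_entries. */
      static size_t round_up_ringbuf_sz(size_t sz)
      {
              size_t page_sz = sysconf(_SC_PAGE_SIZE);
              size_t mul;
      
              for (mul = 1; mul * page_sz < sz; mul <<= 1)
                      ;
              /* page_sz is itself a power of 2, so the result is one too. */
              return mul * page_sz;
      }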
    • libbpf: fallback to tracefs mount point if debugfs is not mounted · a1ac9fd6
      Authored by Andrii Nakryiko
      Teach libbpf to fall back to the tracefs mount point
      (/sys/kernel/tracing) if debugfs (/sys/kernel/debug/tracing) isn't
      mounted.
      Acked-by: Yonghong Song <yhs@fb.com>
      Suggested-by: Connor O'Brien <connoro@google.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
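      
      A hedged sketch of the fallback order (paths are from the commit
      message; libbpf's actual detection logic may differ):
      
      #include <unistd.h>
      
      static const char *tracefs_mount_point(void)
      {
              /* Prefer the traditional debugfs location... */
              if (access("/sys/kernel/debug/tracing", F_OK) == 0)
                      return "/sys/kernel/debug/tracing";
              /* ...and fall back to the plain tracefs mount point. */
              return "/sys/kernel/tracing";
      }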
    • libbpf: add ksyscall/kretsyscall sections support for syscall kprobes · 708ac5be
      Authored by Andrii Nakryiko
      Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding
      kretsyscall variants (for return kprobes) to allow users to kprobe
      syscall functions in kernel. These special sections allow to ignore
      complexities and differences between kernel versions and host
      architectures when it comes to syscall wrapper and corresponding
      __<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on
      whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf
      itself doesn't rely on /proc/config.gz for detecting this, see
      BPF_KSYSCALL patch for how it's done internally).
      
      Combined with the use of the BPF_KSYSCALL() macro, this allows users
      to just specify the intended syscall name and the expected input
      arguments and leave dealing with all the variations to libbpf.
      
      In addition to SEC("ksyscall+") and SEC("kretsyscall+"), add the
      bpf_program__attach_ksyscall() API, which allows specifying the
      syscall name at runtime and providing the associated BPF cookie value.
      
      At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do
      not handle all the calling convention quirks for mmap(), clone() and
      compat syscalls. They also only attach to "native" syscall interfaces:
      if the host system supports compat syscalls or defines 32-bit syscalls
      in a 64-bit kernel, such syscall interfaces won't be attached to by
      libbpf.
      
      These limitations may or may not change in the future. Therefore it is
      recommended to use SEC("kprobe") for these syscalls, or whenever
      working with compat and 32-bit interfaces is required.
      Tested-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
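      
      For example, a minimal sketch of a BPF program using these sections
      (the traced syscall and program names are arbitrary; vmlinux.h is
      assumed to be generated as usual):
      
      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>
      #include <bpf/bpf_tracing.h>
      
      char LICENSE[] SEC("license") = "GPL";
      
      /* libbpf resolves this to __<arch>_sys_close / __se_sys_close etc. */
      SEC("ksyscall/close")
      int BPF_KSYSCALL(close_entry, int fd)
      {
              bpf_printk("close() called with fd %d", fd);
              return 0;
      }
      
      SEC("kretsyscall/close")
      int BPF_KRETPROBE(close_exit, long ret)
      {
              bpf_printk("close() returned %ld", ret);
              return 0;
      }
      
      At runtime the same attachment can be made with
      bpf_program__attach_ksyscall(prog, "close", NULL), optionally passing
      a bpf_ksyscall_opts with a BPF cookie instead of NULL.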
    • libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL · 6f5d467d
      Authored by Andrii Nakryiko
      Improve BPF_KPROBE_SYSCALL (and rename it to the shorter BPF_KSYSCALL,
      to match libbpf's SEC("ksyscall") section name added in the next
      patch) to use a __kconfig variable to determine how to properly fetch
      syscall arguments.
      
      Instead of relying on hard-coded knowledge of whether the kernel's
      architecture uses a syscall wrapper or not (which only reflects the
      latest kernel versions, is not necessarily true for older kernels,
      and won't necessarily hold for later kernel versions on some
      particular host architecture), determine this at runtime by attempting
      to create a perf_event (with a fallback to kprobe event creation
      through tracefs on legacy kernels, just like the kprobe attachment
      code does) for the kernel function that would correspond to the bpf()
      syscall on a system that has CONFIG_ARCH_HAS_SYSCALL_WRAPPER set
      (e.g., for x86-64 it would try '__x64_sys_bpf').
      
      If the host kernel uses a syscall wrapper, the syscall kernel
      function's first argument is a pointer to a struct pt_regs that in
      turn contains the syscall arguments. In that case we need to use
      bpf_probe_read_kernel() to fetch the actual arguments (which we do
      through the BPF_CORE_READ() macro) from the inner pt_regs.
      
      But if the kernel doesn't use the syscall wrapper approach, input
      arguments can be read from struct pt_regs directly, with no probe
      reading.
      
      All this feature detection is done without requiring /proc/config.gz
      existence and parsing, and the BPF-side helper code uses the newly
      added LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to stay in
      sync with libbpf's user-side feature detection.
      
      The BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs
      that define the syscall function explicitly (e.g.,
      SEC("kprobe/__x64_sys_bpf")) and with the SEC("ksyscall") programs
      added in the next patch (which are the same kprobe programs with the
      added benefit of libbpf determining the correct kernel function name
      automatically).
      
      Kretprobe and kretsyscall (added in the next patch) programs don't
      need BPF_KSYSCALL, as they don't provide access to input arguments.
      The normal BPF_KRETPROBE is completely sufficient and is recommended.
      Tested-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
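      
      To illustrate what the macro hides, a hedged sketch of fetching the
      first argument by hand in the wrapper case (x86-64 names;
      BPF_KSYSCALL does the equivalent automatically):
      
      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>
      #include <bpf/bpf_tracing.h>
      
      char LICENSE[] SEC("license") = "GPL";
      
      /* With CONFIG_ARCH_HAS_SYSCALL_WRAPPER, the kprobe's first parameter
       * is a pointer to an inner struct pt_regs holding the real syscall
       * arguments, which must be probe-read. */
      SEC("kprobe/__x64_sys_bpf")
      int BPF_KPROBE(sys_bpf_wrapped, struct pt_regs *regs)
      {
              /* PT_REGS_PARM1_CORE() probe-reads via BPF_CORE_READ(). */
              int cmd = PT_REGS_PARM1_CORE(regs);
      
              bpf_printk("bpf(cmd=%d)", cmd);
              return 0;
      }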
    • libbpf: generalize virtual __kconfig externs and use it for USDT · 55d00c37
      Authored by Andrii Nakryiko
      Libbpf currently supports a single virtual __kconfig extern:
      LINUX_KERNEL_VERSION. LINUX_KERNEL_VERSION doesn't come from
      /proc/config.gz and is instead filled out by libbpf itself.
      
      This patch generalizes this approach to support more such virtual
      __kconfig externs. One such extern, added in this patch, is
      LINUX_HAS_BPF_COOKIE, which is used by the BPF-side USDT support code
      in usdt.bpf.h instead of a CO-RE-based enum detection approach for
      detecting the bpf_get_attach_cookie() BPF helper. This allows removing
      an otherwise unneeded CO-RE dependency and keeps the user-space and
      BPF-side parts of libbpf's USDT support strictly in sync in terms of
      their feature detection.
      
      We'll use a similar approach for syscall wrapper detection for the
      BPF_KSYSCALL() BPF-side macro in a follow-up patch.
      
      Generally, libbpf currently reserves the CONFIG_ prefix for Kconfig
      values and LINUX_ for virtual libbpf-backed externs. In the future we
      might extend the set of supported prefixes. This can be done without
      any breaking changes, as currently any __kconfig extern with an
      unrecognized name is rejected.
      
      For LINUX_xxx externs we support the normal "weak rule": if libbpf
      doesn't recognize a given LINUX_xxx extern but that extern is marked
      as __weak, it is not rejected and defaults to zero. This follows the
      CONFIG_xxx handling logic and allows BPF applications to
      opportunistically use newer libbpf virtual externs without
      unnecessarily breaking on older libbpf versions.
      Tested-by: Alan Maguire <alan.maguire@oracle.com>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
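      
      On the BPF side such an extern is declared and used like any other
      __kconfig value; a hedged sketch (the exact declaration in usdt.bpf.h
      may differ in its qualifiers):
      
      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>
      
      /* Filled out by libbpf at load time; __weak makes it default to zero
       * on older libbpf versions that don't recognize the name. */
      extern bool LINUX_HAS_BPF_COOKIE __kconfig __weak;
      
      static __always_inline __u64 attach_cookie(void *ctx)
      {
              if (LINUX_HAS_BPF_COOKIE)
                      return bpf_get_attach_cookie(ctx);
              return 0; /* fallback on kernels without BPF cookie support */
      }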
  15. 16 Jul 2022, 1 commit
    • libbpf: perfbuf: Add API to get the ring buffer · 9ff5efde
      Authored by Jon Doron
      Add support for writing a custom event reader, by exposing the ring
      buffer.
      
      With the new API perf_buffer__buffer() you will get access to the
      raw mmap()'ed per-CPU underlying memory of the ring buffer.
      
      This region contains both the perf buffer data and the header
      (struct perf_event_mmap_page), which manages the ring buffer
      state (head/tail positions; when accessing the head/tail positions
      it's important to take SMP ordering into consideration).
      With this type of low-level access one can implement different types
      of consumers; here are a few simple examples where this API helps:
      
      1. perf_event_read_simple allocates using malloc; perhaps you want
         to handle the wrap-around in some other way.
      2. Since the perf buffer is per-CPU, the order of the events is not
         guaranteed. For example:
         Given 3 events where each event has a timestamp t0 < t1 < t2,
         and the events are spread over more than 1 CPU, we can end
         up with the following state in the ring buf:
         CPU[0] => [t0, t2]
         CPU[1] => [t1]
         When you consume the events from CPU[0], you can know that a t1 is
         missing (assuming there are no drops, and your event data
         contains a sequential index).
         So one can simply do the following: for CPU[0], store the
         addresses of t0 and t2 in an array (without moving the tail, so
         the data does not perish), then move on to CPU[1] and add the
         address of t1 to the same array.
         You end up with something like:
         void *arr[] = {&t0, &t1, &t2}; now you can consume it in order
         and move the tails as you process.
      3. Assuming there are multiple CPUs and we want to start draining the
         messages from them, we can "pick" which one to start with
         according to the remaining free space in the ring buffer.
      Signed-off-by: Jon Doron <jond@wiz.io>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com
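      
      A hedged sketch of a consumer peeking at one CPU's raw ring via the
      new API (pb and buf_idx are assumed to come from the usual perf_buffer
      setup; the acquire load stands in for the SMP-safe head read mentioned
      above):
      
      #include <bpf/libbpf.h>
      #include <linux/perf_event.h>
      
      /* Returns non-zero if the ring identified by buf_idx has unread data,
       * without consuming anything. */
      static int ring_has_data(struct perf_buffer *pb, int buf_idx)
      {
              struct perf_event_mmap_page *header;
              void *base;
              size_t sz;
              __u64 head, tail;
              int err;
      
              err = perf_buffer__buffer(pb, buf_idx, &base, &sz);
              if (err)
                      return err;
      
              header = base;
              /* data_head must be read with acquire semantics (SMP). */
              head = __atomic_load_n(&header->data_head, __ATOMIC_ACQUIRE);
              tail = header->data_tail; /* only the consumer writes the tail */
      
              return head != tail;
      }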
  16. 14 Jul 2022, 2 commits
  17. 06 Jul 2022, 4 commits
  18. 30 Jun 2022, 1 commit
  19. 29 Jun 2022, 9 commits
  20. 25 Jun 2022, 1 commit