1. 19 Mar 2021, 2 commits
    • libbpf: Add BPF static linker APIs · faf6ed32
      Committed by Andrii Nakryiko
      Introduce BPF static linker APIs to libbpf. The BPF static linker performs
      static linking of multiple BPF object files into a single combined object
      file, preserving all the BPF programs, maps, global variables, etc.
      
      Data sections (.bss, .data, .rodata, .maps, maps, etc.) with the same name
      are concatenated together; code sections are concatenated as well. All the
      symbols and ELF relocations are also concatenated in their respective ELF
      sections and are adjusted to match the new object file layout.
      
      Static variables and functions are handled correctly as well: BPF
      instruction offsets are adjusted to reflect the new variable/function
      offsets within the combined ELF section. Such relocations reference
      STT_SECTION symbols, and that property stays intact.
      
      Data sections in different files can have different alignment requirements, so
      that is taken care of as well, adjusting sizes and offsets as necessary to
      satisfy both old and new alignment requirements.
      
      DWARF data sections are currently stripped out, as is the LLVM_ADDRSIG
      section, which is ignored by libbpf in bpf_object__open() anyway. So, in
      a way, the BPF static linker is an analogue of `llvm-strip -g`, which is
      a nice property, especially if the resulting .o file is then used to
      generate a BPF skeleton.
      
      Original string sections are ignored; instead, we construct our own set of
      unique strings using the libbpf-internal `struct strset` API.
      
      To reduce the size of the patch, all the .BTF and .BTF.ext processing was
      moved into a separate patch.
      
      The high-level API consists of just four functions (a usage sketch follows
      the list):
        - bpf_linker__new() creates an instance of BPF static linker. It accepts
          output filename and (currently empty) options struct;
        - bpf_linker__add_file() takes input filename and appends it to the already
          processed ELF data; it can be called multiple times, one for each BPF
          ELF object file that needs to be linked in;
        - bpf_linker__finalize() needs to be called to dump the final ELF contents
          into the output file specified when the bpf_linker was created; after
          bpf_linker__finalize() is called, no more bpf_linker__add_file() or
          bpf_linker__finalize() calls are allowed and they will return an error;
        - regardless of whether bpf_linker__finalize() was called or not,
          bpf_linker__free() will free up all the used resources.
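
      Putting the four calls together, a minimal linking sketch might look like
      this (input/output filenames are hypothetical; error handling abbreviated):

      struct bpf_linker *linker = bpf_linker__new("combined.bpf.o", NULL);
      int err = bpf_linker__add_file(linker, "prog1.bpf.o");
      if (!err)
              err = bpf_linker__add_file(linker, "prog2.bpf.o");
      if (!err)
              err = bpf_linker__finalize(linker); /* writes out combined.bpf.o */
      bpf_linker__free(linker); /* safe whether or not finalize succeeded */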
      
      Currently, the BPF static linker doesn't resolve cross-object file
      references (extern variables and/or functions). This will be added in
      a follow-up patch set.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org
    • libbpf: Add generic BTF type shallow copy API · 9af44bc5
      Committed by Andrii Nakryiko
      Add the btf__add_type() API that performs a shallow copy of a given BTF
      type from the source BTF into the destination BTF. All the information and
      type IDs are preserved, but all the strings encountered are added into the
      destination BTF and the corresponding offsets are rewritten. BTF type IDs
      are assumed to be correct, or such that they will be (somehow) adjusted
      afterwards.
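
      A sketch of a single-type copy (the source type ID of 5 is arbitrary):

      const struct btf_type *t = btf__type_by_id(src_btf, 5);
      int new_id = btf__add_type(dst_btf, src_btf, t); /* new ID or negative error */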
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org
  2. 05 Mar 2021, 1 commit
  3. 15 Dec 2020, 1 commit
  4. 04 Dec 2020, 2 commits
  5. 06 Nov 2020, 1 commit
    • libbpf: Implement basic split BTF support · ba451366
      Committed by Andrii Nakryiko
      Support split BTF operation, in which one BTF (base BTF) provides a basic
      set of types and strings, while another one (split BTF) builds on top of
      the base's types and strings and adds its own new types and strings. From
      an API standpoint, the fact that split BTF is built on top of base BTF is
      transparent.
      
      Type numbering is transparent. If the base BTF had last type ID #N, then
      all types in the split BTF start at type ID N+1. Any type in split BTF can
      reference base BTF types, but not vice versa. Programmatic construction of
      a split BTF on top of a base BTF is supported: one can create an empty
      split BTF with btf__new_empty_split() and pass base BTF as an input, or
      pass raw binary data to btf__new_split(), or use btf__parse_xxx_split()
      variants to get the initial set of split types/strings from an ELF file
      with a .BTF section.
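
      A construction sketch (the base object's file name is hypothetical):

      struct btf *base = btf__parse_elf("base.bpf.o", NULL);
      struct btf *split = btf__new_empty_split(base);
      int id = btf__add_int(split, "int", 4, BTF_INT_SIGNED); /* gets ID N+1 */
      btf__add_ptr(split, 1); /* split types may reference base type IDs */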
      
      String offsets are similarly transparent and are a logical continuation of
      the base BTF's strings. When building BTF programmatically and adding a new
      string (explicitly with btf__add_str() or implicitly through appending new
      types/members), the string to be added is first looked up in the base
      BTF's string section and re-used if it's there. If not, it will be looked
      up and/or added to the split BTF string section. Similarly to type IDs,
      types in split BTF can refer to strings from base BTF absolutely
      transparently (but not vice versa, of course, because base BTF doesn't
      "know" about the existence of split BTF).
      
      The internal type index is slightly adjusted to be zero-indexed, ignoring
      the fake [0] VOID type. This allows handling split/base BTF type lookups
      transparently by using the btf->start_id type ID offset, which is always 1
      for base/non-split BTF and equals btf__get_nr_types(base_btf) + 1 for the
      split BTF.
      
      BTF deduplication is not yet supported for split BTF; support for it will
      be added in a separate patch.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20201105043402.2530976-5-andrii@kernel.org
  6. 30 Sep 2020, 2 commits
  7. 29 Sep 2020, 5 commits
    • libbpf: Add btf__str_by_offset() as a more generic variant of name_by_offset · f86ed050
      Committed by Andrii Nakryiko
      BTF strings are used not just for names; they can be arbitrary strings
      used for CO-RE relocations, line/func infos, etc. Thus the
      "name_by_offset" terminology is too specific and might be misleading.
      Instead, introduce the btf__str_by_offset() API, which uses generic string
      terminology.
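
      A one-line sketch (off is an assumed valid offset into the string section):

      const char *s = btf__str_by_offset(btf, off); /* NULL if offset is invalid */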
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200929020533.711288-3-andriin@fb.com
    • libbpf: Add BTF writing APIs · 4a3b33f8
      Committed by Andrii Nakryiko
      Add APIs for appending new BTF types at the end of a BTF object.
      
      Each BTF kind has one API of the form btf__add_<kind>(). For kinds that
      have a variable number of additional items (struct/union, enum,
      func_proto, datasec), an additional API is provided to emit each such
      item. E.g., for emitting a struct, one would use the following sequence of
      API calls (fleshed out here into a concrete sketch):

      int int_id = btf__add_int(btf, "int", 4, BTF_INT_SIGNED); /* member type */
      btf__add_struct(btf, "point", 8);         /* 8-byte struct */
      btf__add_field(btf, "x", int_id, 0, 0);   /* member at bit offset 0 */
      btf__add_field(btf, "y", int_id, 32, 0);  /* member at bit offset 32 */
      
      Each btf__add_field() call ensures that the last BTF type is of STRUCT or
      UNION kind and automatically increments that type's vlen field.
      
      All the strings are provided as C strings (const char *), not as string
      offsets. This significantly improves the usability of the BTF writer APIs.
      All such strings are automatically appended to the string section, or an
      existing string is re-used if it was added previously.
      
      Each API attempts to do all the reasonable validations, like enforcing
      non-empty names for entities with required names, proper value bounds, various
      bit offset restrictions, etc.
      
      Type ID validation is minimal because it's possible to emit a type that
      refers to a type that will be emitted later, so libbpf has no way to
      enforce such cases. The user must be careful to properly emit all the
      necessary types and specify type IDs that will be valid in the final
      generated BTF.
      
      Each of the btf__add_<kind>() APIs returns the new type ID on success or
      a negative value on error. APIs like btf__add_field() that emit additional
      items return zero on success and a negative value on error.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200929020533.711288-2-andriin@fb.com
    • libbpf: Add btf__new_empty() to create an empty BTF object · a871b043
      Committed by Andrii Nakryiko
      Add the ability to create an empty BTF object from scratch. This is going
      to be used by pahole for BTF encoding, and also by selftests for
      convenient creation of BTF objects.
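
      The API itself is a single constructor (a one-line sketch):

      struct btf *btf = btf__new_empty(); /* freshly allocated BTF, no types yet */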
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com
    • libbpf: Allow modification of BTF and add btf__add_str API · 919d2b1d
      Committed by Andrii Nakryiko
      Allow the internal BTF representation to switch from the default read-only
      mode, in which raw BTF data is a single non-modifiable block of memory
      with the BTF header, types, and strings laid out sequentially and
      contiguously in memory, into a writable representation with types and
      strings data split out into separate memory regions that can be
      dynamically expanded.
      
      Such a writable internal representation is transparent to users of libbpf
      APIs, but allows appending new types and strings at the end of the BTF,
      which is a typical use case when generating BTF programmatically. All the
      basic guarantees of BTF type and string layout are preserved, i.e., the
      user can get a `struct btf_type *` pointer and read it directly. Such
      btf_type pointers might be invalidated if the BTF is modified, so some
      care is required in such mixed read/write scenarios.
      
      The switch from the read-only to the writable configuration happens
      automatically the first time the user attempts to modify the BTF by adding
      a new type or a new string. It is still possible to get raw BTF data,
      which is a single piece of memory that can be persisted in an ELF section
      or into a file as raw BTF. Such raw data memory is also still owned by the
      BTF and will be freed either when the BTF object is freed or when another
      modification to the BTF happens, as any modification invalidates the BTF
      raw representation.
      
      This patch adds the first two BTF manipulation APIs: btf__add_str(), which
      adds arbitrary strings to the BTF string section, and btf__find_str(),
      which finds the offset of an existing string but doesn't add it if it's
      missing. All the added strings are automatically deduplicated. This is
      achieved by maintaining an additional string lookup index for all unique
      strings. This index is built when the BTF is switched to modifiable mode.
      If at that time the BTF string section contained duplicate strings, they
      are not de-duplicated. This is done specifically to not modify the
      existing content of the BTF (types, their string offsets, etc.), which can
      cause confusion and is an especially important property if there is
      a struct btf_ext associated with the struct btf. By following this
      "imperfect deduplication" process, btf_ext is kept consistent and correct.
      If deduplication of strings is necessary, it can be forced by doing BTF
      deduplication, at which point all the strings will be eagerly deduplicated
      and all string offsets, both in struct btf and struct btf_ext, will be
      updated.
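
      A sketch of the two new calls (the string literal is arbitrary):

      int off = btf__add_str(btf, "my_str");   /* offset of new or existing string */
      int off2 = btf__find_str(btf, "my_str"); /* same offset, or negative if absent */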
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com
    • libbpf: Support test run of raw tracepoint programs · 88f7fe72
      Committed by Song Liu
      Add bpf_prog_test_run_opts() with support for the new fields in
      bpf_attr.test, namely flags and cpu. Also extend the _opts operations to
      support outputs via opts.
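
      A sketch of a test run pinned to one CPU (prog_fd is an assumed loaded
      program FD; BPF_F_TEST_RUN_ON_CPU comes from the kernel side of this
      series):

      DECLARE_LIBBPF_OPTS(bpf_test_run_opts, topts,
              .flags = BPF_F_TEST_RUN_ON_CPU,
              .cpu = 1,
      );
      int err = bpf_prog_test_run_opts(prog_fd, &topts);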
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200925205432.1777-3-songliubraving@fb.com
  8. 16 Sep 2020, 1 commit
  9. 04 Sep 2020, 1 commit
  10. 01 Sep 2020, 1 commit
  11. 22 Aug 2020, 1 commit
    • libbpf: Add perf_buffer APIs for better integration with outside epoll loop · dca5612f
      Committed by Andrii Nakryiko
      Add a set of APIs to the perf_buffer manager to allow applications to
      integrate perf buffer polling into existing epoll-based infrastructure.
      One example is applications that already use libevent and want to plug in
      perf_buffer polling instead of relying on perf_buffer__poll() and wasting
      an extra thread to do it. But perf_buffer is still extremely useful for
      setting up and consuming perf buffer rings even in such use cases.
      
      So to accommodate such new use cases, add three new APIs (a short
      integration sketch follows the list):
        - perf_buffer__buffer_cnt() returns the number of per-CPU buffers
          maintained by a given instance of the perf_buffer manager;
        - perf_buffer__buffer_fd() returns the FD of the perf_event
          corresponding to a specified per-CPU buffer; this FD can then be
          polled independently;
        - perf_buffer__consume_buffer() consumes data from a single per-CPU
          buffer, identified by its slot index.
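
      A minimal integration sketch (epoll_fd is an assumed application-owned
      epoll instance; error handling omitted):

      size_t i, n = perf_buffer__buffer_cnt(pb);
      for (i = 0; i < n; i++) {
              struct epoll_event ev = { .events = EPOLLIN, .data.u32 = i };
              epoll_ctl(epoll_fd, EPOLL_CTL_ADD, perf_buffer__buffer_fd(pb, i), &ev);
      }
      /* when the application's epoll loop reports EPOLLIN for slot i: */
      perf_buffer__consume_buffer(pb, i);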
      
      To support a simpler, but less efficient, way to integrate perf_buffer
      into external polling logic, also expose the underlying epoll FD through
      the perf_buffer__epoll_fd() API. It will need to be followed by
      perf_buffer__poll(), wasting an extra syscall, or perf_buffer__consume(),
      wasting CPU to iterate over buffers with no data. But it could be simpler
      and more convenient for some cases.
      
      These APIs allow for great flexibility, but do not sacrifice the general
      usability of perf_buffer.
      
      Also exercise and check new APIs in perf_buffer selftest.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Link: https://lore.kernel.org/bpf/20200821165927.849538-1-andriin@fb.com
  12. 14 Aug 2020, 1 commit
    • libbpf: Handle BTF pointer sizes more carefully · 44ad23df
      Committed by Andrii Nakryiko
      With libbpf and BTF it is pretty common to have libbpf built for one
      architecture while the BTF information was generated for a different
      architecture (typically, but not always, BPF). In such a case, the size of
      a pointer might differ between architectures. libbpf previously always
      assumed that the BTF pointer size equals the native architecture pointer
      size, but that breaks when libbpf is built as a 32-bit library while the
      BTF is for a 64-bit architecture.
      
      To solve this, add a heuristic to determine the pointer size by searching
      for a `long` or `unsigned long` integer type and using its size as the
      pointer size. Also allow overriding the pointer size with the new
      btf__set_pointer_size() API, for cases where the application knows which
      pointer size should be used. A user application can check what libbpf
      "guessed" by looking at the result of btf__pointer_size(). If it's not 0,
      then libbpf successfully determined the pointer size; otherwise the native
      arch pointer size will be used.
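
      A sketch of the override and query calls (8 bytes matches 64-bit targets):

      int err = btf__set_pointer_size(btf, 8); /* only 4 or 8 are accepted */
      size_t psz = btf__pointer_size(btf);     /* 0 means "not determined" */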
      
      For cases where BTF is parsed from ELF file, use ELF's class (32-bit or
      64-bit) to determine pointer size.
      
      Fixes: 8a138aed ("bpf: btf: Add BTF support to libbpf")
      Fixes: 351131b5 ("libbpf: add btf_dump API for BTF-to-C conversion")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200813204945.1020225-5-andriin@fb.com
  13. 03 Aug 2020, 1 commit
  14. 02 Aug 2020, 1 commit
  15. 26 Jul 2020, 1 commit
  16. 18 Jul 2020, 1 commit
  17. 09 Jul 2020, 1 commit
  18. 29 Jun 2020, 1 commit
    • libbpf: Support disabling auto-loading BPF programs · d9297581
      Committed by Andrii Nakryiko
      Currently, bpf_object__load() (and, by extension, the skeleton's load)
      will always attempt to prepare, relocate, and load into the kernel every
      single BPF program found inside the BPF object file. This is often
      convenient and the right thing to do, and it is what users expect.
      
      But there are plenty of cases (especially with BPF development constantly
      picking up the pace) where a BPF application is intended to work with old
      kernels that have a potentially reduced set of features, while on kernels
      supporting extra features it would like to take full advantage of them by
      employing an extra BPF program. This could be a choice of fentry/fexit
      over kprobe/kretprobe, if the kernel is recent enough and is built with
      BTF. Or a BPF program might provide an optimized bpf_iter-based solution
      that user-space might want to use whenever available. And so on.
      
      With libbpf and BPF CO-RE in particular, it's advantageous not to have to
      maintain two separate BPF object files to achieve this. So to enable such
      use cases, this patch adds the ability to request that chosen BPF programs
      not be auto-loaded. In such a case, libbpf won't attempt to perform
      relocations (which might fail due to an old kernel), won't try to resolve
      BTF types for BTF-aware (tp_btf/fentry/fexit/etc) program types (because
      BTF might not be present), and so on. The skeleton will also automatically
      skip the auto-attachment step for such non-loaded BPF programs.
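
      With a skeleton, this might look as follows (a sketch; the skeleton type,
      program name, and feature probe are all hypothetical):

      struct my_bpf *skel = my_bpf__open();
      if (!app_kernel_has_fentry()) /* the application's own feature probe */
              bpf_program__set_autoload(skel->progs.use_fentry, false);
      int err = my_bpf__load(skel); /* use_fentry is then skipped entirely */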
      
      Overall, this feature simplifies development and deployment of real-world
      BPF applications with complicated compatibility requirements.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200625232629.3444003-2-andriin@fb.com
  19. 23 Jun 2020, 1 commit
    • libbpf: Add a bunch of attribute getters/setters for map definitions · 1bdb6c9a
      Committed by Andrii Nakryiko
      Add a bunch of getters for various aspects of a BPF map. Some of these
      attributes (e.g., key_size, value_size, type, etc.) are available right
      now in struct bpf_map_def, but this patch adds getters allowing them to be
      fetched individually. The bpf_map_def approach isn't very scalable when
      ABI stability requirements are taken into account. It's much easier to
      extend libbpf and add support for new features when each aspect of a BPF
      map has a separate getter/setter.
      
      Getters follow the common naming convention of not explicitly having "get"
      in the name: bpf_map__type() returns the map type, bpf_map__key_size()
      returns the key_size. Setters, though, explicitly have "set" in their
      names: bpf_map__set_type(), bpf_map__set_key_size().
      
      This patch ensures we now have a getter and a setter for the following map
      attributes (a short sketch follows the list):
        - type;
        - max_entries;
        - map_flags;
        - numa_node;
        - key_size;
        - value_size;
        - ifindex.
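
      Combined use on a not-yet-loaded map might look like this (a sketch; the
      map pointer comes from, e.g., bpf_object__find_map_by_name()):

      bpf_map__set_type(map, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
      bpf_map__set_key_size(map, sizeof(int));
      bpf_map__set_value_size(map, sizeof(int));
      bpf_map__set_max_entries(map, 0); /* 0 is allowed; sized at creation time */
      __u32 n = bpf_map__max_entries(map);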
      
      bpf_map__resize() enforces an unnecessary restriction of max_entries > 0.
      It is unnecessary because libbpf actually supports zero max_entries for
      some cases (e.g., for a PERF_EVENT_ARRAY map) and treats it specially
      during map creation time. To allow setting max_entries=0, a new
      bpf_map__set_max_entries() setter is added. bpf_map__resize()'s behavior
      is preserved for backwards-compatibility reasons.
      
      A map ifindex getter is added as well. There is a setter already, but no
      corresponding getter, so fix this asymmetry as well. bpf_map__set_ifindex()
      itself is converted from a void function into an error-returning one,
      similar to other setters. The only error returned right now is -EBUSY, if
      the BPF map is already loaded and has a corresponding FD.
      
      One attribute with no ability to get/set it, or even to specify it
      declaratively, is numa_node. This patch fixes that gap, adding both
      a programmatic getter/setter and support for the numa_node field in
      BTF-defined maps.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com
  20. 18 Jun 2020, 1 commit
  21. 02 Jun 2020, 3 commits
    • libbpf: Add support for bpf_link-based netns attachment · d60d81ac
      Committed by Jakub Sitnicki
      Add bpf_program__attach_netns(), which uses the LINK_CREATE subcommand to
      create an FD-based kernel bpf_link for attach types tied to a network
      namespace, that is, BPF_FLOW_DISSECTOR for the moment.
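
      A one-line sketch (netns_fd is an assumed FD of an open network namespace,
      e.g. /proc/<pid>/ns/net):

      struct bpf_link *link = bpf_program__attach_netns(prog, netns_fd);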
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200531082846.2117903-7-jakub@cloudflare.com
    • libbpf: Add BPF ring buffer support · bf99c936
      Committed by Andrii Nakryiko
      Declaring and instantiating a BPF ring buffer doesn't require any changes
      to libbpf, as it's just another type of map. So using the existing
      BTF-defined maps syntax with __uint(type, BPF_MAP_TYPE_RINGBUF) and
      __uint(max_entries, <size-of-ring-buf>) is all that's necessary to create
      and use a BPF ring buffer.
      
      This patch adds a BPF ring buffer consumer to libbpf. It is very similar
      to the perf_buffer implementation in terms of API, but it also attempts to
      fix some minor problems and inconveniences with the existing perf_buffer
      API.
      
      ring_buffer supports both the single ring buffer use case (just using
      ring_buffer__new()) and adding more ring buffers, each with its own
      callback and context. This allows efficiently polling and consuming
      multiple, potentially completely independent, ring buffers using a single
      epoll instance.
      
      The latter is actually a problem in practice for applications that use
      multiple sets of perf buffers. They have to create multiple instances of
      struct perf_buffer and poll them independently or in a loop, each approach
      having its own problems (e.g., the inability to use a common poll
      timeout). struct ring_buffer eliminates this problem by aggregating many
      independent ring buffer instances under a single "ring buffer manager".
      
      Second, perf_buffer's callback can't return an error, so applications that
      need to stop polling due to an error in the data, or due to data
      signalling the end, have to use extra mechanisms to signal that polling
      has to stop. ring_buffer's callback can return an error, which is passed
      back to user code and can be acted upon appropriately.
      
      Two APIs allow consuming ring buffer data (a combined sketch follows the
      list):
        - ring_buffer__poll(), which will wait for a data availability
          notification and will consume data only from the reported ring
          buffer(s); this API allows efficient use of resources by reading data
          only when it becomes available;
        - ring_buffer__consume(), which will attempt to read new records
          regardless of the data availability notification sub-system; this API
          is useful for cases when the lowest latency is required, at the
          expense of burning CPU resources.
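
      A combined sketch (map_fd is the FD of an assumed BPF_MAP_TYPE_RINGBUF
      map; error handling abbreviated):

      static int handle_event(void *ctx, void *data, size_t size)
      {
              /* a negative return value stops polling and is propagated back */
              return 0;
      }

      struct ring_buffer *rb = ring_buffer__new(map_fd, handle_event, NULL, NULL);
      while (ring_buffer__poll(rb, 100 /* timeout, ms */) >= 0)
              ; /* handle_event() was invoked for each consumed record */
      ring_buffer__free(rb);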
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200529075424.3139988-3-andriin@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • libbpf: Add API to consume the perf ring buffer content · 272d51af
      Committed by Eelco Chaudron
      This new API, perf_buffer__consume(), can be used as follows:
      
      - When you have a perf ring where wakeup_events is higher than 1,
        and you have remaining data in the rings you would like to pull
        out on exit (or maybe based on a timeout).
      
      - For low latency cases where you burn a CPU that constantly polls
        the queues.
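
      In either case, the call itself is a single line (pb is an existing
      perf_buffer instance):

      perf_buffer__consume(pb); /* drain all rings, ignoring wakeup notifications */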
      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/159048487929.89441.7465713173442594608.stgit@ebuild
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  22. 10 May 2020, 1 commit
  23. 02 May 2020, 1 commit
  24. 29 Apr 2020, 1 commit
  25. 31 Mar 2020, 1 commit
  26. 30 Mar 2020, 2 commits
  27. 29 Mar 2020, 1 commit
  28. 03 Mar 2020, 1 commit
    • libbpf: Add bpf_link pinning/unpinning · c016b68e
      Committed by Andrii Nakryiko
      With the bpf_link abstraction now explicitly supported by the kernel, add
      a pinning/unpinning API for links. Also allow creating (opening)
      a bpf_link from a BPF FS file.
      
      This API allows having "ephemeral" FD-based BPF links (like raw tracepoint
      or fexit/freplace attachments) survive user process exit, by pinning them
      in a BPF FS, which is an important use case for long-running BPF programs.
      
      As part of this, expose the underlying FD for bpf_link. While legacy
      bpf_links might not have an FD associated with them (which is expressed as
      a bpf_link with fd=-1), the kernel's abstraction is based around FD-based
      usage, so match it closely. This, subsequently, allows a generic
      pinning/unpinning API for the generalized bpf_link. For some types of
      bpf_links the kernel might not support pinning, in which case
      bpf_link__pin() will return an error.
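
      A pin/re-open sketch (the pin path is hypothetical; /sys/fs/bpf is the
      usual BPF FS mount point):

      int err = bpf_link__pin(link, "/sys/fs/bpf/my_link"); /* survives exit */
      /* ... later, possibly from another process: */
      struct bpf_link *link2 = bpf_link__open("/sys/fs/bpf/my_link");
      int fd = bpf_link__fd(link2);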
      
      With the FD being part of the generic bpf_link, also get rid of
      bpf_link_fd in favor of using the vanilla bpf_link.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200303043159.323675-3-andriin@fb.com
  29. 21 Feb 2020, 2 commits