1. 25 May 2019 (6 commits)
  2. 23 May 2019 (1 commit)
  3. 17 May 2019 (1 commit)
  4. 16 May 2019 (1 commit)
  5. 13 May 2019 (1 commit)
    • libbpf: detect supported kernel BTF features and sanitize BTF · d7c4b398
      Committed by Andrii Nakryiko
      Depending on the versions of libbpf, Clang, and the kernel in use, it's
      possible to have valid BPF object files with valid BTF information that
      still won't load successfully, because Clang emits newer BTF features
      (e.g., BTF_KIND_FUNC, .BTF.ext's line_info/func_info, BTF_KIND_DATASEC,
      etc.) that are not yet supported by older kernels.
      
      This patch adds detection of supported BTF features and sanitizes the BPF
      object's BTF by substituting unsupported BTF kinds with supported ones
      that have a compatible layout:
        - BTF_KIND_FUNC -> BTF_KIND_TYPEDEF
        - BTF_KIND_FUNC_PROTO -> BTF_KIND_ENUM
        - BTF_KIND_VAR -> BTF_KIND_INT
        - BTF_KIND_DATASEC -> BTF_KIND_STRUCT
      
      Replacement is done in such a way as to preserve as much information as
      possible (names, sizes, etc.) without violating the kernel's validation
      rules.
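      
      As an illustrative aside, the core of such a substitution is just
      rewriting the kind bits of a type's info word. A minimal sketch of the
      FUNC case, assuming uapi headers new enough to define BTF_KIND_FUNC (not
      the actual libbpf code):
      
          #include <linux/btf.h>   /* struct btf_type, BTF_KIND_*, BTF_INFO_* */
      
          /* Downgrade an unsupported BTF_KIND_FUNC to BTF_KIND_TYPEDEF.  Both
           * kinds reference a single target type, so the name and type ref
           * survive; only the kind (and vlen) in the info word change. */
          static void sanitize_func_kind(struct btf_type *t)
          {
                  if (BTF_INFO_KIND(t->info) != BTF_KIND_FUNC)
                          return;
                  t->info = (__u32)BTF_KIND_TYPEDEF << 24;  /* kind bits 24-28, vlen 0 */
          }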
      
      v2->v3:
        - remove duplicate #defines from libbpf_util.h
      
      v1->v2:
        - add internal libbpf_internal.h w/ common stuff
        - switch SK storage BTF to use new libbpf__probe_raw_btf()
      Reported-by: Alexei Starovoitov <ast@fb.com>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      d7c4b398
  6. 06 May 2019 (1 commit)
  7. 05 May 2019 (4 commits)
  8. 28 April 2019 (1 commit)
  9. 27 April 2019 (1 commit)
  10. 26 April 2019 (4 commits)
    • libbpf: add binary to gitignore · 39391377
      Committed by Matteo Croce
      Some binaries are generated when building libbpf from tools/lib/bpf/,
      namely libbpf.so.0.0.2 and libbpf.so.0.
      Add them to the local .gitignore.
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      39391377
    • libbpf: fix samples/bpf build failure due to undefined UINT32_MAX · 32e621e5
      Committed by Daniel T. Lee
      Currently, building bpf samples will cause the following error.
      
          ./tools/lib/bpf/bpf.h:132:27: error: 'UINT32_MAX' undeclared here (not in a function) ..
           #define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
                                     ^
          ./samples/bpf/bpf_load.h:31:25: note: in expansion of macro 'BPF_LOG_BUF_SIZE'
           extern char bpf_log_buf[BPF_LOG_BUF_SIZE];
                                   ^~~~~~~~~~~~~~~~
      
      Since commit 4519efa6 ("libbpf: fix BPF_LOG_BUF_SIZE off-by-one error"),
      the hard-coded size of BPF_LOG_BUF_SIZE has been replaced with an
      expression based on UINT32_MAX, which is defined in the <stdint.h> header.
      
      Even with this change, the bpf selftests build fine, since they are
      compiled with clang, which pulls in headers (via -idirafter) from
      clang/6.0.0/include, where <stdint.h> is available.
      
          clang -I. -I./include/uapi -I../../../include/uapi -idirafter /usr/local/include -idirafter /usr/include \
          -idirafter /usr/lib/llvm-6.0/lib/clang/6.0.0/include -idirafter /usr/include/x86_64-linux-gnu \
          -Wno-compare-distinct-pointer-types -O2 -target bpf -emit-llvm -c progs/test_sysctl_prog.c -o - | \
          llc -march=bpf -mcpu=generic  -filetype=obj -o /linux/tools/testing/selftests/bpf/test_sysctl_prog.o
      
      But the bpf samples are compiled with GCC, which only picks up the
      headers explicitly included by the files it compiles. Since
      '#include <stdint.h>' is not present in tools/lib/bpf/bpf.h, building
      the bpf samples fails.
      
          gcc -Wp,-MD,./samples/bpf/.sockex3_user.o.d -Wall -Wmissing-prototypes -Wstrict-prototypes \
          -O2 -fomit-frame-pointer -std=gnu89 -I./usr/include -I./tools/lib/ -I./tools/testing/selftests/bpf/ \
          -I./tools/lib/ -I./tools/include -I./tools/perf -c -o ./samples/bpf/sockex3_user.o ./samples/bpf/sockex3_user.c;
      
      This commit adds '#include <stdint.h>' to tools/lib/bpf/bpf.h to fix this
      problem.
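      
      The shape of the fix, as an illustrative excerpt rather than a verbatim
      diff of tools/lib/bpf/bpf.h:
      
          /* tools/lib/bpf/bpf.h -- excerpt */
          #include <stdint.h>  /* UINT32_MAX; don't rely on other headers pulling it in */
      
          #define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
      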
      Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      32e621e5
    • bpf, libbpf: fix segfault in bpf_object__init_maps' pr_debug statement · 4f8827d2
      Committed by Daniel Borkmann
      Ran into this while testing: in bpf_object__init_maps(), data can be NULL
      when no maps section is present, so we simply cannot access data->d_size
      before the NULL test. Move the pr_debug() to where the access is safe.
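      
      A simplified sketch of the safe ordering (the message text and fields are
      assumptions, pr_debug() being libbpf's internal logging macro; this is
      not the exact libbpf code):
      
          /* data is the Elf_Data of the maps section and may be NULL when the
           * object carries no maps section at all. */
          if (!symbols || !data) {
                  pr_debug("%s doesn't contain a maps section\n", obj->path);
                  return 0;
          }
          /* only now is it safe to dereference data */
          pr_debug("maps in %s: %zd bytes\n", obj->path, (size_t)data->d_size);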
      
      Fixes: d859900c ("bpf, libbpf: support global data/bss/rodata sections")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      4f8827d2
    • bpf, libbpf: handle old kernels more gracefully wrt global data sections · 8837fe5d
      Committed by Daniel Borkmann
      Andrii reported a corner case where global static data is present in the
      BPF ELF file in the form of a .data/.bss/.rodata section, but without any
      relocations to it. Such programs could be loaded before commit
      d859900c ("bpf, libbpf: support global data/bss/rodata sections"),
      whereas afterwards loading would fail if the kernel lacks support.
      
      Add a probing mechanism which skips setting up libbpf's internal maps
      when kernel support is missing. If relocation entries into those sections
      are present, we abort the load attempt.
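      
      A hedged sketch of what such a probe can look like, built on raw
      instruction encoding and the (since-deprecated) bpf_create_map() /
      bpf_load_program() helpers; it assumes uapi headers that define
      BPF_PSEUDO_MAP_VALUE and differs in detail from libbpf's actual probe:
      
          #include <unistd.h>
          #include <linux/bpf.h>
          #include <bpf/bpf.h>
      
          /* Try to load a tiny program that writes straight into an array map
           * value through a BPF_PSEUDO_MAP_VALUE ldimm64.  If the kernel
           * rejects it, global data sections are unusable and the internal
           * maps are skipped (or the load aborted if relocations need them). */
          static int kernel_supports_global_data(void)
          {
                  struct bpf_insn insns[] = {
                          /* r1 = &map_value[0] + 16 (ldimm64 spans two insns) */
                          { .code = BPF_LD | BPF_DW | BPF_IMM, .dst_reg = BPF_REG_1,
                            .src_reg = BPF_PSEUDO_MAP_VALUE },
                          { .imm = 16 },
                          /* *(u64 *)(r1 + 0) = 42 */
                          { .code = BPF_ST | BPF_MEM | BPF_DW, .dst_reg = BPF_REG_1,
                            .imm = 42 },
                          /* r0 = 0; exit */
                          { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
                          { .code = BPF_JMP | BPF_EXIT },
                  };
                  int map_fd, prog_fd, ret = 0;
      
                  map_fd = bpf_create_map(BPF_MAP_TYPE_ARRAY, sizeof(int), 32, 1, 0);
                  if (map_fd < 0)
                          return 0;
                  insns[0].imm = map_fd;
      
                  prog_fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER, insns,
                                             sizeof(insns) / sizeof(insns[0]),
                                             "GPL", 0, NULL, 0);
                  if (prog_fd >= 0) {
                          ret = 1;
                          close(prog_fd);
                  }
                  close(map_fd);
                  return ret;
          }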
      
      Fixes: d859900c ("bpf, libbpf: support global data/bss/rodata sections")
      Reported-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      8837fe5d
  11. 20 April 2019 (1 commit)
  12. 19 April 2019 (1 commit)
  13. 17 April 2019 (5 commits)
    • libbpf: optimize barrier for XDP socket rings · 2c5935f1
      Committed by Magnus Karlsson
      The full memory barrier in the XDP socket rings on the consumer side
      between the load of the data and the store of the consumer ring is
      there to protect the store from being executed before the load of the
      data. If this was allowed to happen, the producer might overwrite the
      data field with a new entry before the consumer got the chance to read
      it.
      
      On x86, stores are guaranteed not to be reordered with older loads, so a
      full memory barrier is not needed here; a compile-time barrier would be
      enough. This patch introduces a new primitive in libbpf_util.h that
      implements a new barrier type (libbpf_smp_rwmb) preventing stores from
      being reordered with older loads. It is then used in the XDP socket ring
      access code in libbpf to improve performance.
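      
      A hedged sketch of what such a primitive looks like; the exact names and
      per-architecture details in libbpf_util.h may differ:
      
          /* Keep stores from being reordered with older loads.  On x86 the
           * hardware already guarantees this, so a compiler barrier suffices;
           * other architectures fall back to a full memory barrier. */
          #if defined(__x86_64__) || defined(__i386__)
          # define libbpf_smp_rwmb()      __asm__ __volatile__("" : : : "memory")
          #else
          # define libbpf_smp_rwmb()      __sync_synchronize()
          #endif
      
      In the ring access code it sits between reading a descriptor and
      publishing the new consumer index.
      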
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      2c5935f1
    • libbpf: remove dependency on barrier.h in xsk.h · b7e3a280
      Committed by Magnus Karlsson
      The use of smp_rmb() and smp_wmb() creates a Linux header dependency on
      barrier.h that is unnecessary for most users. This patch implements the
      two small defines that are needed from barrier.h. As a bonus, the new
      implementations are faster than the default ones, which fall back to
      sfence and lfence on x86, while only a compiler barrier is needed in our
      case, just as when the same ring access code is compiled in the kernel.
      
      Fixes: 1cad0788 ("libbpf: add support for using AF_XDP sockets")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      b7e3a280
    • libbpf: remove likely/unlikely in xsk.h · a06d7296
      Committed by Magnus Karlsson
      This patch removes the use of likely and unlikely in xsk.h since they
      create a dependency on Linux headers as reported by several
      users. There have also been reports that the use of these decreases
      performance as the compiler puts the code on two different cache lines
      instead of on a single one. All in all, I think we are better off
      without them.
      
      Fixes: 1cad0788 ("libbpf: add support for using AF_XDP sockets")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      a06d7296
    • libbpf: fix XDP socket ring buffer memory ordering · d5e63fdd
      Committed by Magnus Karlsson
      The ring buffer code of XDP sockets is missing a memory barrier on the
      consumer side between the load of the data and the write that signals
      that it is ok for the producer to put new data into the buffer. On
      architectures that do not guarantee that stores are not reordered with
      older loads, the producer might put data into the ring before the
      consumer had the chance to read it. As IA does guarantee this ordering,
      it would only need a compiler barrier here, but there are no primitives
      in barrier.h for this specific case (preventing writes from being
      reordered with older reads), so I had to add an smp_mb() here, which
      will translate into a run-time sync operation on IA.
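      
      In code terms, the consumer-side release then looks roughly like this
      (a sketch; the field names follow xsk.h of that time and smp_mb() comes
      from the tools barrier.h, so treat the details as assumptions):
      
          static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons,
                                                    size_t nb)
          {
                  /* Make sure the data has been read before telling the
                   * producer it may reuse the entries, i.e. before the
                   * store of the consumer index below. */
                  smp_mb();
                  *cons->consumer += nb;
          }
      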
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      d5e63fdd
    • libbpf: fix printf formatter for ptrdiff_t argument · e1d1dc46
      Committed by Andrii Nakryiko
      Using %ld to print a value of type ptrdiff_t is not portable between
      32-bit and 64-bit arches. This causes compilation errors for libbpf on
      32-bit platforms (discovered as part of an effort to integrate libbpf
      into systemd [0]). The proper format specifier is %td, which this patch
      uses.
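      
      A minimal standalone example of the portable conversion specifier:
      
          #include <stdio.h>
          #include <stddef.h>
      
          int main(void)
          {
                  char buf[64];
                  ptrdiff_t off = &buf[13] - &buf[0];
      
                  /* %td matches ptrdiff_t on both 32-bit and 64-bit targets;
                   * %ld only happens to work where ptrdiff_t is long. */
                  printf("offset = %td\n", off);
                  return 0;
          }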
      
      v1->v2:
        - add Reported-by
        - provide more context on how this issue was discovered
      
      [0] https://github.com/systemd/systemd/pull/12151
      Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      e1d1dc46
  14. 16 April 2019 (1 commit)
  15. 13 April 2019 (1 commit)
  16. 11 April 2019 (2 commits)
  17. 10 April 2019 (5 commits)
    • libbpf: fix crash in XDP socket part with new larger BPF_LOG_BUF_SIZE · 50bd645b
      Committed by Magnus Karlsson
      In commit da11b417 ("libbpf: teach libbpf about log_level bit 2"),
      BPF_LOG_BUF_SIZE was increased to 16M. The XDP socket part of libbpf
      allocated the log_buf on the stack, which is not going to work with the
      new 16M buffer size. Change the code so it uses a 16K buffer instead.
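      
      A hedged sketch of the shape of the fix (the function name, license
      string and exact call in xsk.c are assumptions):
      
          #include <stdio.h>
          #include <linux/bpf.h>
          #include <bpf/bpf.h>
      
          #define XSK_PROG_LOG_BUF_SIZE 16384  /* 16K fits on the stack, 16M does not */
      
          static int load_xdp_prog(struct bpf_insn *insns, size_t insns_cnt)
          {
                  char log_buf[XSK_PROG_LOG_BUF_SIZE];
                  int prog_fd;
      
                  prog_fd = bpf_load_program(BPF_PROG_TYPE_XDP, insns, insns_cnt,
                                             "GPL", 0, log_buf, sizeof(log_buf));
                  if (prog_fd < 0)
                          fprintf(stderr, "BPF log:\n%s\n", log_buf);
                  return prog_fd;
          }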
      
      Fixes: da11b417 ("libbpf: teach libbpf about log_level bit 2")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      50bd645b
    • bpf, bpftool: fix a few ubsan warnings · 69a0f9ec
      Committed by Yonghong Song
      The issue is reported at https://github.com/libbpf/libbpf/issues/28.
      
      Basically, per the C standard, for
        void *memcpy(void *dest, const void *src, size_t n)
      the result is undefined if "dest" or "src" is NULL, regardless of whether
      "n" is 0 or not. clang's ubsan reported three such instances in bpf.c
      with the following pattern:
        memcpy(dest, 0, 0).
      
      In practice, no known compiler causes issues when the copy size is 0,
      but let us still fix the issue to silence the ubsan warnings.
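      
      One straightforward way to avoid the undefined call, shown as a sketch
      rather than the exact bpf.c change:
      
          #include <string.h>
      
          /* Only forward to memcpy() when there is something to copy, so a
           * NULL src/dst paired with n == 0 never reaches the library call. */
          static inline void memcpy_if_any(void *dst, const void *src, size_t n)
          {
                  if (n)
                          memcpy(dst, src, n);
          }
      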
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      69a0f9ec
    • bpf, libbpf: add support for BTF Var and DataSec · 1713d68b
      Committed by Daniel Borkmann
      This adds libbpf support for the BTF Var and DataSec kinds. The main
      point here is that libbpf needs to do some preparatory work before the
      whole BTF object can be loaded into the kernel, that is, fixing up the
      DataSec size, which is taken from the ELF section size, and the
      non-static variable offsets, which need to be taken from the ELF symbol
      table.
      
      Upstream LLVM doesn't fix these up since, at the time of BTF emission,
      it is too early in the compilation process and this information isn't
      available yet; hence the loader needs to take care of it.
      
      Note, deduplication handling has not been in the scope of this work
      and needs to be addressed in a future commit.
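      
      A hedged sketch of the DataSec fixup idea; lookup_var_offset() is a
      hypothetical helper standing in for the ELF-side lookup, and the real
      libbpf logic is more involved:
      
          #include <linux/btf.h>
          #include <bpf/btf.h>
      
          static void fixup_datasec(struct btf *btf, __u32 type_id, __u32 elf_sec_size)
          {
                  struct btf_type *t = (struct btf_type *)btf__type_by_id(btf, type_id);
                  struct btf_var_secinfo *vsi;
                  __u16 i, vlen;
      
                  if (!t || BTF_INFO_KIND(t->info) != BTF_KIND_DATASEC)
                          return;
      
                  t->size = elf_sec_size;            /* size from the ELF section */
                  vsi = (struct btf_var_secinfo *)(t + 1);
                  vlen = BTF_INFO_VLEN(t->info);
                  for (i = 0; i < vlen; i++, vsi++)  /* offsets from the ELF side */
                          vsi->offset = lookup_var_offset(btf, vsi->type);  /* hypothetical */
          }
      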
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://reviews.llvm.org/D59441
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      1713d68b
    • bpf, libbpf: support global data/bss/rodata sections · d859900c
      Committed by Daniel Borkmann
      This work adds BPF loader support for global data sections to libbpf.
      This allows writing BPF programs in a more natural, C-like way, by being
      able to define global variables and const data.
      
      Back at LPC 2018 [0] we presented a first prototype which
      implemented support for global data sections by extending BPF
      syscall where union bpf_attr would get additional memory/size
      pair for each section passed during prog load in order to later
      add this base address into the ldimm64 instruction along with
      the user provided offset when accessing a variable. Consensus
      from LPC was that for proper upstream support, it would be
      more desirable to use maps instead of bpf_attr extension as
      this would allow for introspection of these sections as well
      as potential live updates of their content. This work follows
      this path by taking the following steps from loader side:
      
       1) In bpf_object__elf_collect() step we pick up ".data",
          ".rodata", and ".bss" section information.
      
       2) If present, in bpf_object__init_internal_map() we add
          maps to the obj's map array that corresponds to each
          of the present sections. Given section size and access
          properties can differ, a single entry array map is
          created with value size that is corresponding to the
          ELF section size of .data, .bss or .rodata. These
          internal maps are integrated into the normal map
          handling of libbpf such that when user traverses all
          obj maps, they can be differentiated from user-created
          ones via bpf_map__is_internal(). In later steps when
          we actually create these maps in the kernel via
          bpf_object__create_maps(), then for .data and .rodata
          sections their content is copied into the map through
          bpf_map_update_elem(). For .bss this is not necessary
          since array map is already zero-initialized by default.
          Additionally, for .rodata the map is frozen as read-only
          after setup, such that neither from program nor syscall
          side writes would be possible.
      
       3) In bpf_program__collect_reloc() step, we record the
          corresponding map, insn index, and relocation type for
          the global data.
      
       4) And last but not least in the actual relocation step in
          bpf_program__relocate(), we mark the ldimm64 instruction
          with src_reg = BPF_PSEUDO_MAP_VALUE where in the first
          imm field the map's file descriptor is stored as similarly
          done as in BPF_PSEUDO_MAP_FD, and in the second imm field
          (as ldimm64 is 2-insn wide) we store the access offset
          into the section. Given these maps have only single element
          ldimm64's off remains zero in both parts.
      
       5) On kernel side, this special marked BPF_PSEUDO_MAP_VALUE
          load will then store the actual target address in order
          to have a 'map-lookup'-free access. That is, the actual
          map value base address + offset. The destination register
          in the verifier will then be marked as PTR_TO_MAP_VALUE,
          containing the fixed offset as reg->off and backing BPF
          map as reg->map_ptr. Meaning, it's treated as any other
          normal map value from verification side, only with
          efficient, direct value access instead of actual call to
          map lookup helper as in the typical case.
      
      Currently, only support for static global variables has been added, and
      libbpf rejects non-static global variables from loading. This restriction
      can be lifted once we have proper semantics for how BPF will treat
      multi-object BPF loads. On the BTF side, libbpf will set the value type
      id to the types corresponding to the ".bss", ".data" and ".rodata" names,
      which LLVM emits without the object name prefix. The key type will be
      left as zero, thus making use of the key-less BTF option in array maps.
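      
      For illustration, a minimal sketch of what this enables on the BPF C
      side; the section name and helper header path are the usual ones for
      that time but should be treated as assumptions, not part of the commit:
      
          #include <linux/bpf.h>
          #include "bpf_helpers.h"   /* SEC() macro */
      
          static __u32 pkt_count = 1;           /* non-zero initializer -> .data   */
          static const __u32 sample_rate = 64;  /* const                -> .rodata */
          static __u32 scratch[4];              /* zero-initialized     -> .bss    */
      
          SEC("classifier")
          int count_packets(struct __sk_buff *skb)
          {
                  pkt_count++;
                  if (pkt_count % sample_rate == 0)
                          scratch[0] = pkt_count;
                  return 0;
          }
      
          char _license[] SEC("license") = "GPL";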
      
      Simple example dump of a program using global vars in each
      section:
      
        # bpftool prog
        [...]
        6784: sched_cls  name load_static_dat  tag a7e1291567277844  gpl
              loaded_at 2019-03-11T15:39:34+0000  uid 0
              xlated 1776B  jited 993B  memlock 4096B  map_ids 2238,2237,2235,2236,2239,2240
      
        # bpftool map show id 2237
        2237: array  name test_glo.bss  flags 0x0
              key 4B  value 64B  max_entries 1  memlock 4096B
        # bpftool map show id 2235
        2235: array  name test_glo.data  flags 0x0
              key 4B  value 64B  max_entries 1  memlock 4096B
        # bpftool map show id 2236
        2236: array  name test_glo.rodata  flags 0x80
              key 4B  value 96B  max_entries 1  memlock 4096B
      
        # bpftool prog dump xlated id 6784
        int load_static_data(struct __sk_buff * skb):
        ; int load_static_data(struct __sk_buff *skb)
           0: (b7) r6 = 0
        ; test_reloc(number, 0, &num0);
           1: (63) *(u32 *)(r10 -4) = r6
           2: (bf) r2 = r10
        ; int load_static_data(struct __sk_buff *skb)
           3: (07) r2 += -4
        ; test_reloc(number, 0, &num0);
           4: (18) r1 = map[id:2238]
           6: (18) r3 = map[id:2237][0]+0    <-- direct addr in .bss area
           8: (b7) r4 = 0
           9: (85) call array_map_update_elem#100464
          10: (b7) r1 = 1
        ; test_reloc(number, 1, &num1);
        [...]
        ; test_reloc(string, 2, str2);
         120: (18) r8 = map[id:2237][0]+16   <-- same here at offset +16
         122: (18) r1 = map[id:2239]
         124: (18) r3 = map[id:2237][0]+16
         126: (b7) r4 = 0
         127: (85) call array_map_update_elem#100464
         128: (b7) r1 = 120
        ; str1[5] = 'x';
         129: (73) *(u8 *)(r9 +5) = r1
        ; test_reloc(string, 3, str1);
         130: (b7) r1 = 3
         131: (63) *(u32 *)(r10 -4) = r1
         132: (b7) r9 = 3
         133: (bf) r2 = r10
        ; int load_static_data(struct __sk_buff *skb)
         134: (07) r2 += -4
        ; test_reloc(string, 3, str1);
         135: (18) r1 = map[id:2239]
         137: (18) r3 = map[id:2235][0]+16   <-- direct addr in .data area
         139: (b7) r4 = 0
         140: (85) call array_map_update_elem#100464
         141: (b7) r1 = 111
        ; __builtin_memcpy(&str2[2], "hello", sizeof("hello"));
         142: (73) *(u8 *)(r8 +6) = r1       <-- further access based on .bss data
         143: (b7) r1 = 108
         144: (73) *(u8 *)(r8 +5) = r1
        [...]
      
      For the Cilium use case in particular, this enables migrating
      configuration constants from the Cilium daemon's generated header defines
      into global data sections, such that expensive runtime recompilations
      with LLVM can be avoided altogether. Instead, the ELF file effectively
      becomes a "template": it is compiled only once (!), and the Cilium daemon
      will then rewrite relevant configuration data in the ELF's .data or
      .rodata sections directly instead of recompiling the program. The updated
      ELF is then loaded into the kernel and atomically replaces the existing
      program in the networking datapath. More info in [0].
      
      Based upon recent fix in LLVM, commit c0db6b6bd444 ("[BPF] Don't fail
      for static variables").
      
        [0] LPC 2018, BPF track, "ELF relocation for static data in BPF",
            http://vger.kernel.org/lpc-bpf2018.html#session-3
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      d859900c
    • bpf, libbpf: refactor relocation handling · f8c7a4d4
      Committed by Joe Stringer
      Adjust the code for relocations slightly with no functional changes,
      so that upcoming patches that will introduce support for relocations
      into the .data, .rodata and .bss sections can be added independent
      of these changes.
      Signed-off-by: Joe Stringer <joe@wand.net.nz>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      f8c7a4d4
  18. 07 April 2019 (1 commit)
  19. 04 April 2019 (1 commit)
    • libbpf: teach libbpf about log_level bit 2 · da11b417
      Committed by Alexei Starovoitov
      Allow bpf_prog_load_xattr() to specify log_level for program loading.
      
      Teach libbpf to accept log_level with bit 2 set.
      
      Increase the default BPF_LOG_BUF_SIZE from 256k to 16M. There is no
      downside to increasing it to the maximum allowed by old kernels. The
      existing 256k limit caused ENOSPC errors, and users were not able to see
      the verifier error, which is printed at the end of the verifier log.
      
      If ENOSPC is hit, double the verifier log buffer and try again to capture
      the verifier error.
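      
      A hedged sketch of that retry idea, using the bpf_load_program_xattr()
      helper of that era (libbpf's internal handling differs in detail):
      
          #include <errno.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <bpf/bpf.h>
      
          static int load_with_growing_log(const struct bpf_load_program_attr *attr)
          {
                  size_t log_sz = 1 << 18;   /* start small, grow on demand */
                  char *log_buf;
                  int fd;
      
                  for (;;) {
                          log_buf = malloc(log_sz);
                          if (!log_buf)
                                  return -ENOMEM;
                          fd = bpf_load_program_xattr(attr, log_buf, log_sz);
                          if (fd < 0 && errno == ENOSPC) {
                                  /* log was truncated: double it and retry so the
                                   * final verifier error line is not lost */
                                  free(log_buf);
                                  log_sz *= 2;
                                  continue;
                          }
                          if (fd < 0)
                                  fprintf(stderr, "%s\n", log_buf);
                          free(log_buf);
                          return fd;
                  }
          }
      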
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      da11b417
  20. 29 March 2019 (1 commit)